The Important/Neglected/Tractable framework needs to be applied with care

By Robert_Wiblin @ 2016-01-24T15:10 (+23)

The Important/Neglected/Tractable (INT) framework has become quite popular over the last two years. I believe it was first used by GiveWell for their cause and intervention investigations. Since then Giving What We Can and 80,000 Hours have adopted it with the addition of 'personal fit' as a further criterion, and it has been featured prominently in Will MacAskill's book Doing Good Better.

This qualitative framework is an alternative to straight cost-effectiveness calculations when they are impractical. Briefly the arguments in favour of scoring a cause or charity using a multiple-criteria framework are:

It is more robust to a major error in one part of your research.
It forces you to look at something from multiple different angles and seek support from 'many weak arguments' rather than just 'one strong argument'.
In practice it leads to faster research progress than trying to come up with a persuasive cost-effectiveness estimate when raw information is scarce.

I favour continued use of this framework. But I have also noticed some traps it's easy to fall into when using it.

These criteria are 'heuristics' that are designed to point in the direction of something being cost-effective. Unfortunately, using the common definitions of these words they blur heavily into one another. For example:

A cause can remain important because it's neglected.
A cause can remain tractable because it's neglected.
A cause can be tractable because it's important.

Sometimes I have seen projects evaluated using this framework dramatically overweight one observation, because it can fit into multiple criteria.

Naturally, a cause that's highly neglected will score well on neglectedness. It can then go on to score very well on importance because its neglectedness means the problem remains serious. Finally, it can also score well on tractability, because the most promising approaches are not yet being taken, because it's neglected.

Most of the case in favour of the cause then boils down to it being neglected, rather than just one third, as originally intended.

A way around this problem is to rate each criteria, while 'controlling for' the others.

For example, rather than ask 'how important is this cause?', instead ask 'how important would it be if nobody were doing anything about it?', or 'how important is this cause relative to others that are equally as neglected?'. Rather than ask 'how tractable is this cause on the margin?', instead ask 'how tractable would it be if nobody were working on it?', or 'how tractable is it relative to other causes that are equally neglected?'.

Those questions would be inappropriate if we weren't considering neglectedness separately, but fortunately we are.

If you do make this adjustment, make sure to do it consistently across all of the options you are comparing.

Another pitfall is to look at the wrong thing when evaluating 'Importance' (sometimes also called 'Scale'). Imagine you were looking into 'unconditional cash transfers' distributed by GiveDirectly. What should you look at to see if it's solving an 'Important' problem? Two natural options are:

The total number of people living in extreme poverty who could benefit from cash transfers.
The severity of the poverty of GiveDirectly's recipients.

Which should you use? It depends what you think you are funding.

When GiveDirectly was first developing 'unconditional cash transfers', carefully studying their impact, and popularising the approach, the most important measure was the first. If their cash transfers proved to be a huge success, then the number of people who could benefit from the innovation was the 1-2 billion people with extremely low incomes.

But today a lot of this research has been done, and many or most people in development are aware of the arguments for cash transfers. Scaling up GiveDirectly further is only going to have modest learning and demonstration effects.

For that reason, I now think the most suitable criteria for evaluating the importance of the problem GiveDirectly is solving, relative to others, is the second: how desperately poor are their recipients relative to other beneficiaries we could help?

To generalise:

If someone is inventing something new, what you want to know is the size of the total problem that invention could contribute to solving.
If someone is scaling up something that already exists, what you want to know is the severity of the problem is that it solves per recipient (or, alternatively per dollar that is spent on it).

If you are trying to compare inventing new things against scaling up things that already exists, be prepared to face serious challenges with any framework I know of.

undefined @ 2016-01-24T19:13 (+11)

A way around this problem is to rate each criteria, while 'controlling for' the others.

For example, rather than ask 'how important is this cause?', instead ask 'how important would it be if nobody were doing anything about it?', or 'how important is this cause relative to others that are equally as neglected?'. Rather than ask 'how tractable is this cause on the margin?', instead ask 'how tractable would it be if nobody were working on it?', or 'how tractable is it relative to other causes that are equally neglected?'.

Those questions would be inappropriate if we weren't considering neglectedness separately, but fortunately we are.

This strikes me as a weird way to go about it. You're essentially taking your best estimate about how effective a cause is, breaking out its effectiveness into three separate factors, and then re-combining those three factors to estimate its effectiveness. You're going to lose information at each step here, and end up back where you started.

It seems to me that the importance/tractability/neglectedness framework is more useful when you have a good sense of a cause's overall importance but you don't know about its importance on margin--factoring in tractability and neglectedness helps you understand the value of marginal investment.

Paul_Lang @ 2023-09-04T10:41 (+5)

I am wondering if 80k should publicly speak of the PINT instead of the INT framework (with P for personal fit). This is because I get the impression that the INT framework contributes to generating a reputation hierarchy of cause areas to work on; and many (young) EAs tend to over-emphasize reputation over personal fit, which basically sets them up for failure a couple of years down the line. Putting the "P" in there might help to avoid this.

undefined @ 2016-01-24T15:28 (+1)

Good post. I agree that this framework, though somewhat useful, sometimes seems to be taken as more precise than it is. I've also thought about how it could be improved, but without notable success so far (one problem is that in order to become more precise, you need more criteria, which threatens to make the framework too cumbersome). I also thought about the "controlling for other factors"-idea, which I think people should use.