Rightsizing rigor in operational M&E?
By Lev Heller @ 2025-09-04T14:51 (+8)
We all generally understand the importance of high rigor for determining which interventions are effective. The trouble is that the work that goes into an RCT doesn't scale: we can't repeat one in every single place we apply a program, so we're stuck assuming its findings hold across implementations.
At their core, though, RCTs are about comparing a control group and an intervention group while controlling for confounding factors. There's a sliding scale trading effort for rigor, and comparison tests with less statistical integrity still have operational decision-making value. What I'd like to do is identify a point that maximizes the scalability of the analysis while still giving enough insight to serve as an indicator that catches 80%+ of failed implementations. What might that look like?
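One way to make "catch 80%+ of failed cases" concrete is statistical power: if a failed implementation shows up as some effect size in the outcome data, power analysis tells you how many observations per arm a simple two-sample comparison needs to detect it 80% of the time. Here's a minimal sketch; the effect size of 0.4 is a hypothetical I picked for illustration, not something from real program data.

```python
# Rough power calculation: for a simple two-sample comparison to flag
# 80%+ of genuinely failed implementations (power = 0.8), how many
# observations per arm do we need?
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Assumption: a "failed" implementation shows a standardized effect of
# 0.4 (Cohen's d) relative to a working one. Smaller effects need more data.
n_per_arm = analysis.solve_power(effect_size=0.4, alpha=0.05, power=0.8)
print(f"Observations needed per arm: {n_per_arm:.0f}")  # ~100 per arm
```

The encouraging part is that this is far cheaper than a full RCT's requirements, because the question is only "did this rollout fail?" rather than "what is the precise treatment effect?"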
(I'm aware data collection is inherently effort-intensive. I'm working on that angle in parallel, but if I do manage to get scalable access to outcome-indicator data, I want to pin down the best way to use it. Right now I'm thinking live monitoring of a simple trend comparison between the intervention cohort and a control group, but I'm not a statistician, so I'm open to feedback on this.)
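To make the trend-comparison idea concrete, here's a rough sketch of what I have in mind. Everything in it is a placeholder assumption: the window size, the flagging threshold, the independence of observations, and the simulated data.

```python
# Minimal sketch of the live-monitoring idea: compare a rolling mean of
# the outcome in each cohort and raise a flag when the intervention arm
# falls outside a rough confidence band around the control arm.
import numpy as np

def trend_flag(intervention, control, window=30, z=2.0):
    """Return True if the recent intervention trend looks worse than control.

    intervention, control: 1-D arrays of outcome observations over time.
    window: number of most recent observations to compare.
    z: standard errors of divergence to tolerate before flagging.
    """
    a = np.asarray(intervention[-window:], dtype=float)
    b = np.asarray(control[-window:], dtype=float)
    diff = a.mean() - b.mean()
    # Standard error of the difference in means (rough; assumes
    # independent observations -- clustering would widen this in practice).
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return diff < -z * se  # flag only when intervention underperforms

# Hypothetical usage with simulated data where the intervention degrades.
rng = np.random.default_rng(0)
control_obs = rng.normal(1.0, 0.5, size=200)
intervention_obs = rng.normal(0.4, 0.5, size=200)  # a "failing" rollout
print(trend_flag(intervention_obs, control_obs))  # True -> investigate
```

The point isn't the specific test; it's that a check this simple can run continuously on whatever outcome indicators are available and escalate to a human rather than certify anything. Whether this is statistically sound enough to hit the 80%+ detection target is exactly the kind of feedback I'm looking for.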