A Brief History of Evidence

By DavidNash @ 2025-07-21T11:30 (+9)

This post explores efforts to improve lives globally using evidence. (Part 2 of a seven-week series covering global development.)

Systematic observation and rigorous testing, which played a key role in the scientific revolution and the development of clinical trials in medicine, have increasingly influenced how we understand and address other global challenges.

In recent decades, global development has seen a shift towards using evidence to inform action, including the increasing use of methodologies like randomised controlled trials (RCTs). This shift has generated successes, such as cost-effective interventions to combat infectious diseases, but questions remain about how far evidence can be generalised and how to implement solutions at scale.


The Evidence Revolution

Science

The pursuit of evidence as a cornerstone of understanding and progress has deep historical roots. The scientific revolution of the 16th and 17th centuries marked a fundamental shift away from reliance on tradition and authority towards systematic observation, experimentation and the formulation of testable hypotheses. It hinged on the earlier invention of the printing press by Johannes Gutenberg in the 15th century. This allowed scholarly works to be more widely read and enabled people to build upon previous knowledge, rather than starting from scratch. This acceleration in knowledge dissemination paved the way for evidence-based thinking.

Key moments:

These intellectual transformations laid the groundwork for evidence-based approaches in other domains, including a gradual evolution within academia. While the scientific revolution was driven by individual scientists, scientific societies and wealthy patrons largely outside the traditional university system[1], the growing prestige and practical successes of science eventually spurred universities to incorporate scientific disciplines. Over time, universities increasingly adopted peer review and empirical methodologies as key parts of academic inquiry.


Medicine

Building upon the foundations laid by the Scientific Revolution, the medical field underwent its own transformative period. It saw a shift from relying on traditional remedies and anecdotal observations towards a scientific understanding of disease and treatment. As the advent of germ theory allowed diseases to be understood through a biological lens, the need for rigorous evaluation of medical interventions became increasingly apparent.

The advances in understanding the causes of disease, coupled with a growing emphasis on systematic evaluation of treatments, transformed the practice of medicine.

The Evidence-Based Medicine Pyramid

Increasingly, cost-effectiveness analysis (CEA) was adopted to evaluate the relative value of different interventions, informing resource allocation decisions by comparing costs to health outcomes achieved. CEA often uses standardised metrics like Quality-Adjusted Life Years[2], Disability-Adjusted Life Years[3], and Well-being Adjusted Life Years[4] to quantify the health and well-being benefits of interventions, allowing for comparisons across different diseases and programs.
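As a rough illustration of how these metrics feed into a comparison, the sketch below computes QALYs as quality-weighted life-years, DALYs as years of life lost plus years lived with disability, and a simple cost per QALY gained. The class name, function names and all figures are hypothetical and chosen only for illustration. Real analyses typically add discounting, uncertainty ranges and a more careful choice of comparator.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    """Hypothetical record for one intervention (illustrative only)."""
    name: str
    cost: float          # cost per person, in a single consistent currency
    qalys_gained: float  # QALYs gained per person versus the comparator


def qalys(years_and_weights):
    """Sum life-years weighted by quality of life.

    Each item is (years, weight), where weight runs from 0 to 1
    and 1 means a year in perfect health.
    """
    total = 0.0
    for years, weight in years_and_weights:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("quality weight must be between 0 and 1")
        total += years * weight
    return total


def dalys(years_of_life_lost, years_lived_with_disability):
    """DALYs = YLL + YLD, i.e. lost years of healthy life."""
    return years_of_life_lost + years_lived_with_disability


def cost_per_qaly(intervention):
    """Cost per QALY gained; lower means more cost-effective."""
    if intervention.qalys_gained <= 0:
        raise ValueError("intervention must gain a positive number of QALYs")
    return intervention.cost / intervention.qalys_gained


if __name__ == "__main__":
    # Made-up example: ten years at quality 0.8 with treatment,
    # versus ten years at quality 0.6 without it.
    gained = qalys([(10, 0.8)]) - qalys([(10, 0.6)])
    example = Intervention("hypothetical treatment", cost=1000.0, qalys_gained=gained)
    print(f"{example.name}: {cost_per_qaly(example):.0f} per QALY gained")
```

Ranking interventions by a ratio like this is what allows comparisons across different diseases and programs, although the ranking is only as good as the cost and outcome estimates behind it.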

The data needed to perform rigorous CEAs often come from well-designed studies, and one of the key tools developed and used across multiple fields to generate this evidence is the randomised controlled trial (RCT).


Randomised Controlled Trials

A randomised controlled trial (RCT) is an experiment designed to control for factors outside of direct experimental control. In an RCT, participants are randomly allocated to different treatment groups. This random allocation helps to ensure that the groups are statistically comparable, even for characteristics that researchers haven't identified or cannot directly manage. When well-designed, properly conducted and involving a sufficient number of participants, an RCT can provide a robust comparison of the treatments being studied, mitigating the influence of confounding factors.
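The random allocation step can be sketched in a few lines. This is a minimal example of simple randomisation into equal-sized arms followed by a difference in mean outcomes; the function names, arm labels and seed parameter are assumptions made for illustration. Real trials often use stratified or cluster randomisation and report confidence intervals rather than a single point estimate.

```python
import random
from statistics import mean


def randomise(participants, arms=("treatment", "control"), seed=None):
    """Shuffle participants and deal them into arms of (near-)equal size.

    Because assignment depends only on the shuffle, it cannot be
    correlated with any participant characteristic, observed or not.
    """
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    allocation = {arm: [] for arm in arms}
    for index, person in enumerate(shuffled):
        allocation[arms[index % len(arms)]].append(person)
    return allocation


def difference_in_means(outcomes, allocation):
    """Estimate the effect as mean treated outcome minus mean control outcome."""
    treated = [outcomes[p] for p in allocation["treatment"]]
    control = [outcomes[p] for p in allocation["control"]]
    return mean(treated) - mean(control)


if __name__ == "__main__":
    people = [f"person_{i}" for i in range(200)]
    allocation = randomise(people, seed=42)
    print({arm: len(members) for arm, members in allocation.items()})
```

With enough participants, chance differences between the arms shrink, which is why sample size matters as much as the randomisation itself.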

In 2004, the Center for Global Development convened the Evaluation Gap Working Group. The group was asked to investigate why rigorous impact evaluations of social development programs, whether financed directly by governments or supported by international aid, were relatively rare.

3ie - Trends in impact evaluation: Did we ever learn? (3 minutes)

Trends in Impact Evaluation - 3ie

Abhijit V. Banerjee and Esther Duflo's 2011 book 'Poor Economics' advocated an evidence-based approach to tackling global poverty, challenging grand theories and market-based solutions. The book proposed understanding the decisions and circumstances of the poor through rigorous RCTs and argued that small, well-targeted interventions could lead to significant progress.

As the need for rigorous evidence in global development gained recognition, a range of influential organisations emerged or shifted their priorities. The Abdul Latif Jameel Poverty Action Lab (J-PAL) spearheaded the movement by focusing on generating research through hundreds of RCTs globally, partnering with implementing organisations to conduct real-world experiments and influencing billions in development spending through its findings.

Funders like USAID and the Gates Foundation, traditionally focused on large-scale interventions, began allocating significant resources to RCTs and other rigorous evaluations.

This shift was driven by a desire for greater accountability and a feeling that traditional development approaches had yielded limited or uncertain results.

However, the increased emphasis on RCTs also sparked debates about the limitations of this methodology.


Critiques of RCTs (and impact evaluation generally)

The use of RCTs and broader impact evaluation tools in global development is not without detractors. A range of concerns have been raised, including questions about generalisability, scaling effectiveness, ethical considerations and whether micro-level interventions can address macro development challenges.

Stephanie Wykstra - It’s difficult to test whether poverty relief actually works. Do randomised controlled trials provide a scientific measure? (8 minutes)


Limited Generalisability


The Micro vs. Macro Debate


Research Vulnerabilities


Practical Implementation Challenges

Scaling Challenges

Scaling successful interventions beyond controlled trials introduces more challenges:

Research initiatives like the Yale Research Initiative on Innovation and Scale (Y-RISE) are investigating these challenges to understand how to preserve impact while navigating the transition from controlled studies to real-world implementation.


Resource Constraints

RCTs are typically expensive and time-consuming to implement properly, which can limit their feasibility in resource-constrained settings. The high costs of data collection, analysis and monitoring can divert resources from program implementation, raising questions about opportunity costs in development spending.


Methodological Limitations

Hawthorne Effect

Subjects behaving differently simply because they know they're being studied can artificially inflate intervention effects. For example, households that receive regular monitoring visits as part of an RCT may change their behaviour in ways that wouldn't occur in a scaled program without such intensive observation.


Time Horizons

RCTs typically measure short-term effects, but scaled interventions may produce different outcomes over longer timeframes, creating challenges for predicting long-term impact. Many important development outcomes unfold over years or decades, beyond the typical timeframe of an RCT.

This is especially relevant in countries that are changing economically, demographically and culturally in a short amount of time.


Evolution of the Field

While RCTs in development economics face legitimate criticisms, practitioners have evolved their methods to address these critiques.

Tim Ogden - RCTs in Development Economics, Their Critics and Their Evolution

The fundamental disagreements about RCTs stem from different theories of change regarding:

Using the ‘Hype Cycle’ framework, Ogden argues that RCTs have matured past inflated expectations and are now in a more productive ‘slope of enlightenment’ phase, with practitioners implementing more sophisticated designs, addressing causal mechanisms, and building institutions to support evidence-based policymaking.

"Hype Cycle" framework


Conclusion

While RCTs have made valuable contributions to development economics and policy, understanding their limitations is critical for appropriate application. The critiques outlined above highlight potential value in methodological pluralism. Rather than relying exclusively on RCTs, the field is moving toward a more nuanced understanding of when they are most useful and how they can be complemented by other research approaches.

Building on this broader understanding of evaluation methodologies, it's useful to examine the progress and ongoing challenges in specific areas of global development. One of the most fundamental of these is global health, where significant strides have been made in improving life expectancy and reducing mortality, but substantial issues persist.

 

I've split this post into two parts; you can find the second part here.

  1. ^

    Universities prioritised established curricula and traditional disciplines like theology, law and (observation-based) medicine

  2. ^

    A QALY is a measure of health outcome that combines the length of life with its quality (or well-being). One QALY represents one year of life in perfect health. QALYs are used to assess the benefit of medical interventions by estimating the number of additional years of life gained due to the intervention, weighted by the quality of those years. A year lived in less-than-perfect health receives a value between 0 and 1. QALYs are often used in cost-effectiveness analysis to compare the value of different medical treatments or public health interventions

  3. ^

    A DALY is a measure of overall disease burden, expressed as the number of years lost due to ill-health, disability or early death. One DALY represents one lost year of healthy life. DALYs are calculated by summing the years of life lost (YLL) due to premature mortality and the years lived with disability (YLD) due to prevalent cases of the disease or health condition in a population. DALYs allow for comparisons of the burden of different diseases and risk factors, and are used to prioritise health interventions and research efforts

  4. ^

    A WALY is a newer metric that extends the concept of QALYs by incorporating subjective well-being as a key dimension. It aims to capture not only the health-related quality of life but also the broader aspects of an individual's well-being, such as happiness, life satisfaction, and social connections. WALYs are calculated by weighting years of life by a well-being score, reflecting an individual's overall sense of flourishing. This metric is particularly relevant for evaluating interventions that aim to improve mental health, social support, and overall life satisfaction, in addition to physical health

  5. ^

    “The cases were as similar as I could have them. They all in general had putrid gums, the spots and lassitude, with weakness of their knees. They lay together in one place, being a proper apartment of the sick in the fore-hold; and had one diet common to all”


SummaryBot @ 2025-07-21T16:06 (+1)

Executive summary: This post offers a historical overview of how evidence—from the scientific revolution to modern randomized controlled trials (RCTs)—has come to shape global development practices, highlighting both the power and the limitations of evidence-based approaches, especially in scaling interventions and addressing systemic challenges; it serves as part two of a reflective and educational seven-part series.

Key points:

  1. Historical foundations of evidence-based thinking: The scientific revolution, driven by figures like Copernicus, Galileo, Bacon, and Newton, emphasized observation, experimentation, and the systematic sharing of knowledge, laying the groundwork for later evidence-based practices in fields like medicine and development.
  2. Transformation of medicine through science: The transition from anecdotal remedies to germ theory, vaccines, and public health infrastructure marked a major shift toward evidence-driven healthcare, later bolstered by statistical methods and cost-effectiveness analysis (CEA) using metrics like QALYs and DALYs.
  3. Rise of RCTs in global development: Inspired by successes in medicine and agriculture, RCTs gained prominence in economics and development, exemplified by the work of Banerjee and Duflo and institutions like J-PAL, promoting a “small interventions, rigorous testing” paradigm.
  4. Critiques and limitations of RCTs: Concerns include limited generalisability across contexts, ethical and practical challenges in scaling, and the narrow focus on micro-level interventions over structural causes of poverty. Research by Wykstra, Vivalt, and others highlights variability in outcomes and potential biases.
  5. Challenges of scaling and sustainability: Implementing interventions beyond controlled trials often faces logistical, motivational, political, and economic hurdles that can erode effectiveness, prompting research initiatives like Y-RISE to study real-world transitions.
  6. Toward methodological pluralism: While RCTs remain valuable, the field is maturing into a more nuanced phase that embraces diverse methods, better causal inference, and institutional capacity-building, moving past “hype” to more balanced, context-sensitive evidence use.

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.