Using Mixed Methods in Monitoring and Evaluation : Experiences from International Development

This paper provides an overview of the various ways in which mixing qualitative and quantitative methods could add value to monitoring and evaluating development projects. In particular it examines how qualitative methods could address some of the limitations of randomized trials and other quantitative impact evaluation methods; it also explores the importance of examining "process" in addition to "impact", distinguishing design from implementation failures, and the value of mixed methods in the real-time monitoring of projects. It concludes by suggesting topics for future research -- including the use of mixed methods in constructing counterfactuals, and in conducting reasonable evaluations within severe time and budget constraints.

Citation

“Bamberger, Michael; Rao, Vijayendra; Woolcock, Michael. 2010. Using Mixed Methods in Monitoring and Evaluation : Experiences from International Development. Policy Research working paper ; no. WPS 5245. © World Bank. http://hdl.handle.net/10986/3732 License: CC BY 3.0 IGO.”

Other publications in this report series

The Economic Value of Weather Forecasts: A Quantitative Systematic Literature Review
(Washington, DC: World Bank, 2025-09-10) Farkas, Hannah; Linsenmeier, Manuel; Talevi, Marta; Avner, Paolo; Jafino, Bramka Arga; Sidibe, Moussa
This study systematically reviews the literature that quantifies the economic benefits of weather observations and forecasts in four weather-dependent economic sectors: agriculture, energy, transport, and disaster-risk management. The review covers 175 peer-reviewed journal articles and 15 policy reports. Findings show that the literature is concentrated in high-income countries and most studies use theoretical models, followed by observational and then experimental research designs. Forecast horizons studied, meteorological variables and services, and monetization techniques vary markedly by sector. Estimated benefits even within specific subsectors span several orders of magnitude and broad uncertainty ranges. An econometric meta-analysis suggests that theoretical studies and studies in richer countries tend to report significantly larger values. Barriers that hinder value realization are identified on both the provider and user sides, with inadequate relevance, weak dissemination, and limited ability to act recurring across sectors. Policy reports rely heavily on back-of-the-envelope or recursive benefit-transfer estimates, rather than on the methods and results of the peer-reviewed literature, revealing a science-to-policy gap. These findings suggest substantial socioeconomic potential of hydrometeorological services around the world, but also knowledge gaps that require more valuation studies focusing on low- and middle-income countries, addressing provider- and user-side barriers and employing rigorous empirical valuation methods to complement and validate theoretical models.
The Macroeconomic Implications of Climate Change Impacts and Adaptation Options
(Washington, DC: World Bank, 2025-05-29) Abalo, Kodzovi; Boehlert, Brent; Bui, Thanh; Burns, Andrew; Castillo, Diego; Chewpreecha, Unnada; Haider, Alexander; Hallegatte, Stephane; Jooste, Charl; McIsaac, Florent; Ruberl, Heather; Smet, Kim; Strzepek, Ken
Estimating the macroeconomic implications of climate change impacts and adaptation options is a topic of intense research. This paper presents a framework in the World Bank's macrostructural model to assess climate-related damages. This approach has been used in many Country Climate and Development Reports, a World Bank diagnostic that identifies priorities to ensure continued development in spite of climate change and climate policy objectives. The methodology captures a set of impact channels through which climate change affects the economy by (1) connecting a set of biophysical models to the macroeconomic model and (2) exploring a set of development and climate scenarios. The paper summarizes the results for five countries, highlighting the sources and magnitudes of their vulnerability --- with estimated gross domestic product losses in 2050 exceeding 10 percent of gross domestic product in some countries and scenarios, although only a small set of impact channels is included. The paper also presents estimates of the macroeconomic gains from sector-level adaptation interventions, considering their upfront costs and avoided climate impacts and finding significant net gross domestic product gains from adaptation opportunities identified in the Country Climate and Development Reports. Finally, the paper discusses the limits of current modeling approaches, and their complementarity with empirical approaches based on historical data series. The integrated modeling approach proposed in this paper can inform policymakers as they make proactive decisions on climate change adaptation and resilience.
Labor Demand in the Age of Generative AI: Early Evidence from the U.S. Job Posting Data
(Washington, DC: World Bank, 2025-11-18) Liu, Yan; Wang, He; Yu, Shu
This paper examines the causal impact of generative artificial intelligence on U.S. labor demand using online job posting data. Exploiting ChatGPT’s release in November 2022 as an exogenous shock, the paper applies difference-in-differences and event study designs to estimate the job displacement effects of generative artificial intelligence. The identification strategy compares labor demand for occupations with high versus low artificial intelligence substitution vulnerability following ChatGPT’s launch, conditioning on similar generative artificial intelligence exposure levels to isolate substitution effects from complementary uses. The analysis uses 285 million job postings collected by Lightcast from the first quarter of 2018 to the second quarter of 2025. The findings show that the number of postings for occupations with above-median artificial intelligence substitution scores fell by an average of 12 percent relative to those with below-median scores. The effect increased from 6 percent in the first year after the launch to 18 percent by the third year. Losses were particularly acute for entry-level positions that require neither advanced degrees (18 percent) nor extensive experience (20 percent), as well as those in administrative support (40 percent) and professional services (30 percent). Although generative artificial intelligence generates new occupations and enhances productivity, which may increase labor demand, early evidence suggests that some occupations may be less likely to be complemented by generative artificial intelligence than others.
The Lasting Effects of Working while in School
(Washington, DC: World Bank, 2025-08-18) Ferrando, Mery; Katzkowicz, Noemi; Le Barbanchon, Thomas; Ubfal, Diego
This paper provides the first experimental evidence on the long-term effects of work-study programs, leveraging a randomized lottery design from a national program in Uruguay. Participation leads to a persistent 11 percent increase in formal labor earnings, observable seven years after the program. Effects are stronger for youth who participate during pivotal educational transitions and are larger for vulnerable youth and men, while remaining positive for women and non-vulnerable youth. The program is highly cost-effective, with average impacts exceeding those of job training programs and comparable to early childhood investments.
It’s Not (Just) the Tariffs: Rethinking Non-Tariff Measures in a Fragmented Global Economy
(Washington, DC: World Bank, 2025-10-22) Taglioni, Daria; Kee, Hiau Looi
As tariffs have declined, non-tariff measures (NTMs) have become central to trade policy, especially in high-income countries and regulated sectors like food and green technologies. Although NTMs may serve legitimate goals, they could also sort countries and firms into or out of markets based on compliance capacity and differences in product mix. Documenting recent advances in the estimation of ad valorem equivalents (AVEs), this paper uncovers new patterns of use and exposure of NTMs. High-income countries rely more heavily on NTMs relative to tariffs, while low- and middle-income countries face steeper AVEs on their exports. Firm-level evidence shows that NTMs disproportionately affect smaller firms, leading to market exit and concentration. Poorly designed NTMs can harm productivity and welfare, while coordinated, capacity-aware use can deliver inclusive outcomes. Policy design, transparency, and diagnostics must evolve to reflect the growing role—and risks—of NTMs in a fragmented global trade landscape.

Collections

Policy Research Working Papers

Related items

Integrating Qualitative Methods into Investment Climate Impact Evaluations
(World Bank Group, Washington, DC, 2014-12) Mendoza Alcantara, Alejandra; Woolcock, Michael
Incorporating qualitative methods into the evaluation of development programs has become increasingly popular in recent years, both for the distinctive insights such approaches can bring in their own right and because of their capacity to complement the strengths -- and where necessary correct some of the weaknesses -- of quantitative approaches. Some initial work deploying mixed methods has been undertaken in the assessment of investment climate reforms, but considerable room for expansion exists. This paper summarizes some of the key principles and practices underpinning mixed methods evaluations in development, highlights some notable examples of how such work has been conducted (and the particular contributions it has made), and offers some guidelines for those seeking to increase the sophistication and utility of qualitative methods in the evaluation of investment climate reforms.
Evaluating the Impact of Development Projects on Poverty : A Handbook for Practitioners
(Washington, DC: World Bank, 2000-05) Baker, Judy L.
Very little is known about the actual impact of projects on the poor. Many are reluctant to carry out impact evaluations because they are deemed expensive, time consuming, and technically complex, and because the findings can be politically sensitive. Yet a rigorous evaluation can be powerful in assessing the appropriateness and effectiveness of programs. Evaluating impact is particularly critical in developing countries, where resources are scarce and every dollar spent should aim to maximize its impact on poverty reduction. This handbook seeks to provide project managers and policy analysts with the tools needed for evaluating project impact. It is aimed at readers with a general knowledge of statistics. Chapter 1 presents an overview of concepts and methods, Chapter 2 discusses key steps and related issues to consider in implementation, Chapter 3 illustrates various analytical techniques through a case study, and Chapter 4 includes a discussion of lessons that have been reviewed for this handbook. The case studies, included in Annex I, were selected from a range of evaluations carried out by the Bank, other donor agencies, research institutions, and private consulting firms. Also included in the annexes are samples of the main components that would be necessary in planning any impact evaluation - sample terms of reference, a budget, impact indicators, a log frame, and a matrix of analysis.
Integrating Quantitative and Qualitative Research in Development Projects
(Washington, DC: World Bank, 2000-06) Bamberger, Michael
"Much of the early work on poverty was highly quantitative:... It became increasingly clear, however, that while numbers are essential for policy and monitoring purposes, it is also important to understand people's perception of poverty and their mechanisms for coping with poverty and other situations of extreme economic and social stress." Researchers have recognized over the past few years that quantitative analysis of the incidence and trends in poverty, while essential for national economic development planning, must be complemented by qualitative methods that help planners and managers understand the cultural, social, political, and institutional context within which projects are designed and implemented. This report is based on a two-day workshop held in June 1998, where outside research specialists and World Bank staff discussed the importance of integrating these research methods. The participants reviewed experiences in the use of mixed-method approaches in Bank research and project design. This report is a result of those discussions. The report examines the need for integrated research approaches in social and economic development, presents case studies of integrated approaches in practice, and discusses lessons learned. Part I describes the evolution of interest in, and the potential benefits of, integrated research; Part II presents case studies on how integrated approaches have been used in poverty analysis, education, health, and water supply and sanitation; and Part III discusses lessons learned with respect to the use of integrated approaches and assesses the benefits that can be achieved.
Handbook on Impact Evaluation : Quantitative Methods and Practices
(World Bank, 2010) Khandker, Shahidur R.; Koolwal, Gayatri B.; Samad, Hussain A.
This book reviews quantitative methods and models of impact evaluation. The formal literature on impact evaluation methods and practices is large, with a few useful overviews. Yet there is a need to put the theory into practice in a hands-on fashion for practitioners. This book also details challenges and goals in other realms of evaluation, including monitoring and evaluation (M&E), operational evaluation, and mixed-methods approaches combining quantitative and qualitative analyses. This book is organized as follows. Chapter two reviews the basic issues pertaining to an evaluation of an intervention to reach certain targets and goals. It distinguishes impact evaluation from related concepts such as M&E, operational evaluation, qualitative versus quantitative evaluation, and ex-ante versus ex post impact evaluation. Chapter three focuses on the experimental design of an impact evaluation, discussing its strengths and shortcomings. Various non-experimental methods exist as well, each of which is discussed in turn in chapters four to seven. Chapter four examines matching methods, including the propensity score matching technique. Chapter five deals with double-difference methods in the context of panel data, which relax some of the assumptions on the potential sources of selection bias. Chapter six reviews the instrumental variable method, which further relaxes assumptions on self-selection. Chapter seven examines regression discontinuity and pipeline methods, which exploit the design of the program itself as potential sources of identification of program impacts. Chapter eight presents a discussion of how distributional impacts of programs can be measured, including new techniques related to quantile regression. Chapter nine discusses structural approaches to program evaluation, including economic models that can lay the groundwork for estimating direct and indirect effects of a program. Finally, chapter ten discusses the strengths and weaknesses of experimental and non-experimental methods and also highlights the usefulness of impact evaluation tools in policy making.
Reconstructing Baseline Data for Impact Evaluation and Results Measurement
(World Bank, Washington, DC, 2010-11) Bamberger, Michael
Many international development agencies and some national governments base future budget planning and policy decisions on a systematic assessment of the projects and programs in which they have already invested. Results are assessed through Mid-Term Reviews (MTRs), Implementation Completion Reports (ICRs), or through more rigorous impact evaluations (IE), all of which require the collection of baseline data before the project or program begins. The baseline is compared with the MTR, ICR, or the posttest IE measurement to estimate changes in the indicators used to measure performance, outcomes, or impacts. However, it is often the case that a baseline study is not conducted, seriously limiting the possibility of producing a rigorous assessment of project outcomes and impacts. This note discusses the reasons why baseline studies are often not conducted, even when they are included in the project design and funds have been approved, and describes strategies that can be used to 'reconstruct' baseline data at a later stage in the project or program cycle.

Users also downloaded

Governance Matters IV : Governance Indicators for 1996-2004
(World Bank, Washington, DC, 2005-06) Kaufmann, Daniel; Kraay, Aart; Mastruzzi, Massimo
The authors present the latest update of their aggregate governance indicators, together with new analysis of several issues related to the use of these measures. The governance indicators measure the following six dimensions of governance: (1) voice and accountability; (2) political instability and violence; (3) government effectiveness; (4) regulatory quality; (5) rule of law, and (6) control of corruption. They cover 209 countries and territories for 1996, 1998, 2000, 2002, and 2004. They are based on several hundred individual variables measuring perceptions of governance, drawn from 37 separate data sources constructed by 31 organizations. The authors present estimates of the six dimensions of governance for each period, as well as margins of error capturing the range of likely values for each country. These margins of error are not unique to perceptions-based measures of governance, but are an important feature of all efforts to measure governance, including objective indicators. In fact, the authors give examples of how individual objective measures provide an incomplete picture of even the quite particular dimensions of governance that they are intended to measure. The authors also analyze in detail changes over time in their estimates of governance; provide a framework for assessing the statistical significance of changes in governance; and suggest a simple rule of thumb for identifying statistically significant changes in country governance over time. The ability to identify significant changes in governance over time is much higher for aggregate indicators than for any individual indicator. While the authors find that the quality of governance in a number of countries has changed significantly (in both directions), they also provide evidence suggesting that there are no trends, for better or worse, in global averages of governance. Finally, they interpret the strong observed correlation between income and governance, and argue against recent efforts to apply a discount to governance performance in low-income countries.
Governance Matters III : Governance Indicators for 1996-2002
(World Bank, Washington, DC, 2003-08) Kaufmann, Daniel; Kraay, Aart; Mastruzzi, Massimo
The authors present estimates of six dimensions of governance covering 199 countries and territories for four time periods: 1996, 1998, 2000, and 2002. These indicators are based on several hundred individual variables measuring perceptions of governance, drawn from 25 separate data sources constructed by 18 different organizations. The authors assign these individual measures of governance to categories capturing key dimensions of governance and use an unobserved components model to construct six aggregate governance indicators in each of the four periods. They present the point estimates of the dimensions of governance as well as the margins of error for each country for the four periods. The governance indicators reported here are an update and expansion of previous research work on indicators initiated in 1998 (Kaufmann, Kraay, and Zoido-Lobatón 1999a,b and 2002). The authors also address various methodological issues, including the interpretation and use of the data given the estimated margins of error.
Governance Matters VIII : Aggregate and Individual Governance Indicators 1996–2008
(2009-06-01) Kaufmann, Daniel; Kraay, Aart; Mastruzzi, Massimo
This paper reports on the 2009 update of the Worldwide Governance Indicators (WGI) research project, covering 212 countries and territories and measuring six dimensions of governance between 1996 and 2008: Voice and Accountability, Political Stability and Absence of Violence/Terrorism, Government Effectiveness, Regulatory Quality, Rule of Law, and Control of Corruption. These aggregate indicators are based on hundreds of specific and disaggregated individual variables measuring various dimensions of governance, taken from 35 data sources provided by 33 different organizations. The data reflect the views on governance of public sector, private sector and NGO experts, as well as thousands of citizen and firm survey respondents worldwide. The authors also explicitly report the margins of error accompanying each country estimate. These reflect the inherent difficulties in measuring governance using any kind of data. They find that even after taking margins of error into account, the WGI permit meaningful cross-country comparisons as well as monitoring progress over time. The aggregate indicators, together with the disaggregated underlying indicators, are available at www.govindicators.org.
Breaking the Conflict Trap : Civil War and Development Policy
(Washington, DC: World Bank and Oxford University Press, 2003) Collier, Paul; Elliott, V. L.; Hegre, Håvard; Hoeffler, Anke; Reynal-Querol, Marta; Sambanis, Nicholas
Most wars are now civil wars. Even though international wars attract enormous global attention, they have become infrequent and brief. Civil wars usually attract less attention, but they have become increasingly common and typically go on for years. This report argues that civil war is now an important issue for development. War retards development, but conversely, development retards war. This double causation gives rise to virtuous and vicious circles. Where development succeeds, countries become progressively safer from violent conflict, making subsequent development easier. Where development fails, countries are at high risk of becoming caught in a conflict trap in which war wrecks the economy and increases the risk of further war. The global incidence of civil war is high because the international community has done little to avert it. Inertia is rooted in two beliefs: that we can safely 'let them fight it out among themselves' and that 'nothing can be done' because civil war is driven by ancestral ethnic and religious hatreds. The purpose of this report is to challenge these beliefs.
Design Thinking for Social Innovation
(2010-07) Brown, Tim; Wyatt, Jocelyn
Designers have traditionally focused on enhancing the look and functionality of products.
