64386
The World Bank PREMnotes
May 2011, Number 9
From the Poverty Reduction and Economic Management Network
Special Series

Combining Quantitative and Qualitative Methods for Program Monitoring and Evaluation: Why Are Mixed-Method Designs Best?

Michelle Adato

Despite significant methodological advances, much program evaluation and monitoring data are of limited utility because of an over-reliance on quantitative methods alone. While surveys provide generalizable findings on what outcomes or impacts have or have not occurred, qualitative methods are better able to identify the underlying explanations for these outcomes and impacts, and therefore enable more effective responses. Qualitative methods also inform survey design, identify social and institutional drivers and impacts that are hard to quantify, uncover unanticipated issues, and trace impact pathways. When used together, quantitative and qualitative approaches provide more coherent, reliable, and useful conclusions than either does on its own. This note identifies key elements of good mixed-method design and provides examples of these principles applied in several countries.

Over the last decade, development programs in Latin America, Asia, and Africa have increasingly undertaken rigorous impact evaluation. Despite advances, much evaluation and program monitoring data have limited utility because of an over-reliance on quantitative methods alone. While surveys provide essential data on whether or not changes have occurred as a result of a program, qualitative methods identify the underlying explanations for why we do or do not observe these changes. Survey methods will tell us, for example, the rate of change in attended hospital births, while qualitative methods will explain why some women now go to hospitals to give birth while others will not, despite a program designed to encourage their attendance. Qualitative methods also improve survey design, identify social and institutional impacts that are hard to quantify, and uncover unanticipated processes or outcomes. Mixed-method approaches are necessary because whether development programs work as intended depends not only on how efficiently resources and knowledge are transferred, but also on complex economic and social dynamics in households, communities, and institutions. These dynamics cannot be disentangled through surveys alone. This note provides guidance on how to combine quantitative and qualitative methods for monitoring and evaluation (M&E) to maximize the ability to assess program performance and to interpret and act on that information. This note also includes examples of different mixed-method designs used in Haiti, South Africa, Nicaragua, Turkey, and Zimbabwe.

What Do Mixed-Method Evaluation Designs Offer?

Quantitative methods provide uniform measures of project outputs and impacts, for example, the number of farmers trained or vaccines administered, or changes in income, crop yields, school enrollment, or child stunting. Representative sample sizes ensure that findings are generalizable to a wider population. Econometric analysis further enables inferences of causality and relationships between impacts and explanatory variables.

Quantitative methods perform less well in explaining these results, particularly when explanations involve issues that are hard to quantify but are often fundamental to understanding program results, such as beliefs and perceptions, social relationships, administrative bottlenecks, or institutional dynamics. Qualitative methods do better at capturing these issues because they use more flexible questions, ask for open-ended responses, thoroughly explore the topic, and promote rapport between researchers and research subjects, which results in more candid responses. Observation methods independently confirm or contradict what people say. There is, however, a trade-off between depth and breadth, and smaller sample sizes in qualitative studies mean that findings are rarely statistically representative of a broad population. Quantitative and qualitative evaluation methods compensate for each other’s weaknesses, and each approach provides more value when used in a mixed-method design, providing information and conclusions that are more coherent, reliable, and useful than those from single-method studies.

Box 1 provides examples of issues that tend to be best addressed by either a quantitative or qualitative approach. Note, however, that the categories represent relative strengths for emphasis, not a dichotomy. Each topic can potentially be addressed in different ways by quantitative and qualitative methods, yielding different types of information. Furthermore, surveys can include questions with open-ended responses, and qualitative data can be quantified.

[Box 1: Examples of Issues Normally Studied through Quantitative and Qualitative Methods. Columns: Quantitative; Qualitative.]

Most importantly, the approaches work in complementary ways to address a given issue. In the evaluation of the Child Support Grant in South Africa, for example, qualitative methods identified the full range of adolescent high-risk behavior and its economic and social drivers. These were turned into survey questions and responses carefully tailored to this program context. Furthermore, the focus groups tested a key assumption used to construct the survey’s control group, tested respondent stratification and recruitment strategies, and provided data that would later help with interpretation of survey data.

Qualitative methods explain process. Quantitative methods can determine, for example, whether dissemination through farmers’ organizations leads to increased adoption of an agricultural technology. Qualitative methods will tell us about the social and political relationships that explain why different types of farmers join, and about formal and informal practices, factors that are necessary for understanding whether such organizations are likely to generate the intended outcomes under different circumstances. Such process issues “can be crucial to understanding impact, as opposed to simply measuring it” (Rao and Woolcock 2003). The study of process is also an important component of program monitoring. Studying what actually occurs during program implementation can determine whether failure to achieve intended outcomes or impact results from design failure or implementation failure (Bamberger, Rao, and Woolcock 2010).

Box 2 presents the wide variety of ways in which quantitative and qualitative methods support each other. Triangulation is a central function of mixed-method M&E designs: comparing quantitative and qualitative datasets to see how each one confirms, challenges, or explains the other. In the evaluation of the conditional cash transfer (CCT) program in Turkey, a survey was used to measure program impacts on attendance rates at school and health check-ups, while the qualitative research collected the full range of economic, political, and sociocultural explanations for attendance and lack of attendance at both. The survey found, for example, that the CCT program raised secondary school enrollment for girls by 10.7 percent, but enrollment rates were still very low at the secondary level: 38.2 percent for secondary school girls nationally, and lower in some regions. The qualitative study found that sociocultural beliefs and practices, and especially gender issues, frequently overpowered the financial incentive of the transfer. The issues hindering the success of the CCT program included the belief that women’s primary roles are as wives and mothers, the perceived lack of benefits of education, and fear of girls’ sexuality and male advances bringing harm to family reputation and honor, exacerbated by inadequate transportation and by school locations that were perceived to put girls at further risk. The findings explained why the CCT was successful in some contexts and not others, and showed the need for complementary interventions (Adato 2008).

[Box 2: Complementarities between Qualitative and Quantitative Methods.]

Single-method studies do not have this ability to explain. In the evaluation of a CCT for nutrition in Brazil, the survey found a small negative effect of the program on children’s weight gain. The evaluators speculated, based on anecdotal information, that mothers might have deliberately kept their children underfed due to a mistaken belief that they would lose benefits if their children gained weight (Morris and others 2004). Qualitative methods could have been used to test whether this explanation, or a different one, was likely to be correct. In another example, from a study of domestic violence, a survey found a strong correlation between domestic violence and female sterilization. This would have been difficult to explain without qualitative research, which found that husbands became more suspicious of their wives’ fidelity due to reduced risk of pregnancy, thereby increasing the risk of violence (Rao and Woolcock 2003). Surveys may be able to further test the generalizability of some of these findings, but qualitative work is necessary to identify these pathways.

Choosing among Methods

The most common qualitative methods used for M&E are focus groups, participatory appraisal, beneficiary/nonbeneficiary interviews, key informant interviews, and observation. These can be used in rapid appraisals, spending a day to several days in a locality or program delivery setting, or as part of extended case studies and ethnographic studies, spending several weeks or months in one such location. Monitoring systems tend to use these methods through shorter data collection exercises at regular intervals. Which methods are selected depends in part on M&E budgets and time frames, but also on the purpose, the stage of M&E at which they are used, and the types of issues to be investigated.

Focus groups and participatory appraisal tend to be best suited for broad identification of issues and preferences. They are frequently used at early stages of evaluation design to inform the design of surveys, though they can also be used at later stages to provide data on how well a program is functioning and why. A main advantage of focus groups is that a large number of people can be included in the study in a relatively short period of time, maximizing the diversity of experiences and opinions identified while minimizing costs. Another advantage is that a group discussion can stimulate recollection and debate. Participatory appraisal methods, which combine visual exercises with discussions, provide more control and benefits to participants. The main limitations of focus groups are that there is relatively little time to establish rapport and trust or to investigate issues in depth, and that it is difficult to link the information to other datasets. People may also be less willing (though not always) to discuss sensitive topics in groups, such as domestic violence or HIV (human immunodeficiency virus), particularly with respect to their own experience. Finally, minority opinions or those of the less powerful may not be revealed: careful disaggregation of focus groups by categories such as gender, wealth, age, or ethnicity is essential for reducing this risk. Still, focus groups are useful methods for rapid, low-cost identification of issues and for assessing beneficiary or service provider perceptions and experiences.

In-depth interviews and observations can be used at any stage of M&E to identify issues early on and to gather data once a program is underway. These methods allow the field-worker to pursue a topic until it is well understood. People may be more willing to respond candidly in individual interviews, and observations enable independent confirmation. These data can then be triangulated and analyzed in relation to other individual and contextual data. In evaluation, ethnographic case study methods are sometimes used, where field-workers live for a period of time (for example, three to six months) in program communities. These methods permit the most reliable picture of program processes and impacts by providing the time to establish strong rapport and trust with program stakeholders, conduct iterative sets of interviews, and observe household, community, and program interactions and key activities over time. These methods are, however, more time and resource intensive than focus group methods, and sample sizes are normally smaller.

Key informant interviews are essential for M&E, gathering the knowledge of program officials and staff, service delivery professionals, community leaders, business owners, contractors, and other stakeholders. Key informant interviews provide information and analysis based on day-to-day observations of the program. Another mixed-method approach, particularly valuable for operations or process M&E, is systematic observation of service delivery, which combines quantitative instruments to record and rate observed conditions and practices with qualitative interviews and observations.

Key Issues in Mixed-Method Designs

Sequencing of methods. Although sequencing can be done in various ways, a best practice evaluation design might look like this: The evaluation starts with qualitative methods to identify key issues and gather information to inform survey design. This is followed by the baseline survey. The survey data are used to design and select the sample for a new stage of qualitative research and to identify issues for investigation, such as findings that need explanation. Following this qualitative study, an evaluation survey mirrors the baseline but adds some new questions identified in the qualitative study. A new phase of qualitative research then examines impacts and investigates survey findings. Depending on the evaluation design, resources, and needs, additional rounds may follow. For program monitoring, a subset of quantitative indicators and qualitative data can be collected at regular intervals, maintaining common indicators but also adapting indicators based on new findings. Many governments invest substantial resources in monitoring systems that collect large quantities of quantitative data that reflect expected outputs. These systems typically do not explain the reasons for good or poor performance, which limits the ability to respond. Complementary use of qualitative methods in monitoring systems can help provide these explanations and identify unanticipated issues and outcomes.

Site and household selection. While some qualitative studies involve a convenience sample of locations or households, a more rigorous approach uses survey data to stratify qualitative samples. The qualitative samples would then reflect characteristics of the quantitative sample. Within these stratified categories, households or individuals are often selected purposively, to ensure inclusion of households across the distribution; if a random sample is used, the sample should be large enough to capture this distribution. In an evaluation of an agroforestry intervention in Kenya, for example, survey data were used to select a sample of households for qualitative case studies that captured Luo and Luhya ethnic groups, male- and female-headed households, richer and poorer farmers, and agroforestry early adopters, late adopters, nonadopters, and disadopters (Place, Adato, and Hebinck 2007). In evaluations of CCT programs in Turkey, Nicaragua, and El Salvador, survey data were used to stratify the qualitative sample between households where children performed well and poorly on the key education and health indicators targeted by the program. In this way, the qualitative research could investigate the conditions and characteristics that explained this different performance, with and without the program (Adato 2008).

Data analysis and integration. Many M&E systems that collect quantitative and qualitative data fail to take advantage of their synergies. For example, data from alternating rounds of surveys and qualitative research are not always used to inform the questions for the next round. Even more common is the failure to triangulate and integrate the findings at the analysis stage, which loses much of the principal analytic power of mixed-method designs. Under the typical time pressure to complete evaluation or monitoring reports, data are analyzed and reported separately. It is critical that data integration becomes a priority, and that the time and resources needed for data integration at the analysis stage are included in the budget.

What Do We Learn? Findings from Mixed-Method M&E

Examples of mixed-method M&E from four countries are outlined in this section, illustrating the different purposes, designs, and methods discussed above.

Monitoring a food-assisted maternal and child health and nutrition program in Haiti

An operations research approach was used for the M&E system for World Vision’s food-assisted maternal and child health and nutrition program in Haiti (Loechl and others 2005). The objectives were to assess the implementation of service delivery, identify constraints to effective operation, and implement corrective actions. The quantitative methods used were structured observations and interviews at program delivery points. The qualitative methods were semistructured interviews with stakeholders and focus group discussions with the program staff. The service delivery points were: Rally Posts, where targeting, health education and services, and growth monitoring and promotion took place; Mothers’ Clubs, where smaller groups of participants gathered to discuss health and nutrition topics; and Food Distribution Points, where beneficiaries received monthly food rations. Selected findings from the service delivery points include:

Operations at the Rally Posts. These were found to be operating as planned; however, problems identified included crowding, a high participant/staff ratio, long waiting times, bottlenecks at registration, and the lack of supplies and transport for staff. Improvements were needed in the general education sessions and the communication between health staff and caregivers. Measurement errors were also identified in weighing and plotting children’s weight on the growth chart; this was a critical area because the growth charts were used for targeting children for recuperative action.

Mothers’ Clubs. These were found to be highly popular among health staff and beneficiaries. A new behavior change and communication strategy and new materials and techniques had been recently developed to improve infant and young child feeding practices. The mixed-method approach enabled an objective assessment of the technical content of the sessions and of health staff’s facilitation and teaching skills, which were found to have improved. However, ensuring the intended composition of the clubs was identified as an ongoing challenge, and continued supervision and retraining of the staff were recommended.

Food Distribution Points (FDPs). Observations of the FDPs identified excessive crowding and long waiting times, delays in arrival of the food and staff, and the reasons for these problems: bad road conditions, limited transport facilities, and fuel scarcity. Exit interviews revealed that a large proportion of beneficiaries did not receive the amounts of food commodities they were entitled to. The sharing of food commodities with relatives, neighbors, and others was reported to be widespread. This was determined to be inevitable, and it was recommended that an additional indirect ration be provided to cover this, and that the program continue to emphasize the use of fortified commodities with micronutrient content targeted to beneficiaries, especially young children.

After implementation of the recommendations, a new round of operations research was conducted to monitor the corrective measures and document improvements in the program.

Evaluation of the conditional cash transfer program in Nicaragua

The evaluation of the CCT in Nicaragua (Adato 2008) involved baseline and follow-up panel surveys with 1,359 households, conducted in 42 administrative units (comarcas) with and without the program. Survey data were later used to stratify households by high and low performance in health and education indicators, with a qualitative sample drawn from each category. In total, 120 households were included in the qualitative study in Nicaragua. Field-workers lived in the study communities for approximately four months, conducting interviews about program experiences and impacts and people’s attitudes and behavior, and observing meal preparation; health and hygiene practices; shopping; beneficiary and community gatherings; health service delivery; health and nutrition education; and other program activities. Some of the benefits of this mixed-method design are outlined below:

Targeting. The survey found that the program was well targeted, with undercoverage rates of 3 to 10 percent. The qualitative research found, however, that people saw themselves as “all poor” and did not understand why households were selected into or out of the program, resulting in several types of stress and tension in the communities. This led to recommendations to improve program communications and to provide some limited benefits to nonbeneficiary households.

Iron supplements. The survey found a large increase in the percentage of children receiving iron supplements: from under 25 percent to nearly 80 percent. However, it found no impact on the high anemia rates in this population. In initial interviews in the qualitative research, mothers said that they gave the supplements to their children. However, over time, the case study methods revealed a different picture: mothers were picking up the supplements but not giving them to their children because of the perception that iron negatively affected children’s stomachs and teeth.

“Stuffing” children before weighing. In the first phase of the program, if children twice fell below an established rate of weight gain, benefits could be suspended. Although this policy was dropped, the study found that many beneficiaries did not know this, and that to avoid what they believed would be a loss of their benefits, some mothers were stuffing their children with food and liquids on the day or days leading up to the weighing. This revealed important information about poorly conceived incentives, as well as the impact of inadequate communications.

Program impacts on gender relations. Concerns have been raised that giving cash transfers to women could cause tensions with their male partners, possibly contributing to domestic violence. The qualitative research was able to explore this delicate topic, and found that men largely supported women receiving the benefit, because they saw the CCT program as being for children and believed that women would spend the cash more wisely. Furthermore, the new resources in households helped to ease tensions. The research also found that the program’s discourse on women’s empowerment and women’s receipt of the cash increased their self-confidence and gave them some new autonomy in certain spending decisions.

Evaluation of high yielding maize varieties in Zimbabwe

This evaluation (Bourdillon and others 2007) used data from a panel survey conducted in resettlement areas in 1983–84, in 1987, and annually from 1992 to 2000. The surveys contained extensive information on agricultural and nonfarm activities, expenditure, assets, and other impacts. Qualitative household case studies, focus groups, and key informant interviews were conducted during a six-month period of fieldwork in 2001. Findings include:

Gender relations and intrahousehold resource control. Despite a modest reduction in household-level poverty, benefits to men from high yielding varieties (HYVs) of maize undermined women’s control of resources. Whereas men operated within the public commercial markets for HYV maize, women preferred the open-pollinated varieties (OPVs), which HYV maize had displaced, because OPV seeds and maize were marketed through informal networks where women operated. Women also did not have access to credit for the commercial fertilizer that HYV maize, but not OPVs, required. Although money from the sale of HYV maize was called “family money,” the qualitative research revealed that “family money” was really the household head’s money, kept in his bank account. This “family money” was often invested in cattle (traditionally male property), and in one case study a woman explained her fear that if her husband died, his relatives would take the cattle away and she would be left with nothing.

The significance of age for extension approaches. The qualitative research revealed generational differences in how farmers value knowledge. Young people trusted the knowledge of the national extension service officers, viewing them as trained and experienced. In contrast, older people trusted their own experiences and demonstration units. Cultural values and beliefs attributed wisdom to age, and older men especially found it hard to admit to limitations in their knowledge, preferring their own “practical” knowledge to what they saw as “theoretical” knowledge.

Culture and magic. In the case study communities, there was widespread belief in the effect of witchcraft on crop performance. People frequently attributed magical powers to those who achieved unusually high yields, and poor yields to theft of crops through witchcraft. In one resettlement area, people would not show interest in the crops of others, because observing how others grew their crops could arouse suspicions of witchcraft. In another area, there was a widespread belief that implements or animals lent to other farmers could be returned bewitched. This has important implications for farmer-to-farmer methods of dissemination and extension, and for the expectation that farmers “learn from each other.”

Final Remarks

Although mixed methods are widely used by governments and international agencies, there are a number of reasons why it is still common to find single-method approaches. First, the high cost of survey research means that decisions are often made to allocate an entire evaluation budget to a single approach. Second, timelines are often perceived as too tight for iterative rounds of data collection. Third, researchers are usually trained in one approach, quantitative or qualitative, and do not sufficiently understand or appreciate the methods and value of the other. However, mixed-method research designs can be adapted to fit a given set of conditions, and the benefits are likely to far exceed the costs. Still, it is important to recognize that the open-ended nature of qualitative research methods requires a considerable degree of skill on the part of field researchers to obtain quality data, and that sufficient resources are needed to ensure a strong research design, a sample size large enough to capture heterogeneity, adequate time for fieldwork, and the systematic analysis and integration of data. If both quantitative and qualitative research are undertaken with rigor, then mixed-method M&E will result in a far better understanding of program results than either approach alone. This level of understanding is critical to provide effective feedback that will improve performance and enable programs to meet their goals.

References

Adato, M. 2008. “Combining Survey and Ethnographic Methods to Improve Evaluation of Conditional Cash Transfer Programs.” International Journal of Multiple Research Approaches 2 (2): 222–36.

Bamberger, M., V. Rao, and M. Woolcock. 2010. “Using Mixed Methods in Monitoring and Evaluation: Experiences from International Development.” World Bank Policy Research Working Paper 5245, Washington, DC.

Bourdillon, M. F. C., P. Hebinck, and J. Hoddinott, with B. Kinsey, J. Marondo, N. Mudege, and T. Owens. 2007. “Assessing the Impact of High-Yield Varieties of Maize in Resettlement Areas of Zimbabwe.” In Agricultural Research, Livelihoods and Poverty: Studies of Economic and Social Impacts in Six Countries, ed. M. Adato and R. Meinzen-Dick, 198–237. Baltimore, MD: Johns Hopkins University Press.

Loechl, C., M. T. Ruel, G. Pelto, and P. Menon. 2005. “The Use of Operations Research as a Tool for Monitoring and Managing Food-Assisted Maternal/Child Health and Nutrition (MCHN) Programs: An Example from Haiti.” FCND Discussion Paper 187, International Food Policy Research Institute, Washington, DC.

Morris, S., P. Olinto, R. Flores, E. Nilson, and A. Figueiró. 2004. “Conditional Cash Transfers Are Associated with a Small Reduction in Weight Gain of Preschool Children in Northeast Brazil.” Journal of Nutrition 134 (9): 2336–41.

Place, F., M. Adato, and P. Hebinck. 2007. “Understanding Rural Poverty and Investment in Agriculture: An Assessment of Integrated Quantitative and Qualitative Research in Western Kenya.” World Development 35 (2): 312–25.

Rao, V., and M. Woolcock. 2003. “Integrating Qualitative and Quantitative Approaches in Program Evaluation.” In The Impact of Economic Policies on Poverty and Income Distribution: Evaluation Tools and Techniques, ed. F. Bourguignon and L. A. Pereira da Silva, 165–90. Washington, DC: World Bank.

For Further Reading

Adato, M., R. Meinzen-Dick, P. Hazell, and L. Haddad. 2007. “Studying Poverty Impact Using Livelihoods Analysis and Quantitative Methods: Conceptual Frameworks and Research Methods.” In Agricultural Research, Livelihoods and Poverty: Studies of Economic and Social Impacts in Six Countries, ed. M. Adato and R. Meinzen-Dick, 20–55. Baltimore, MD: Johns Hopkins University Press.

Kanbur, R. (editor). 2003. Q-Squared: Qualitative and Quantitative Methods of Poverty Appraisal. New Delhi: Permanent Black.

Maluccio, J., M. Adato, and E. Skoufias. 2010. “Combining Quantitative and Qualitative Research Methods for the Evaluation of Conditional Cash Transfer Programs.” In Conditional Cash Transfers in Latin America, ed. M. Adato and J. Hoddinott, 26–52. Baltimore, MD: Johns Hopkins University Press.

Plano Clark, V. L., and J. W. Creswell (editors). 2008. The Mixed-Methods Reader. Thousand Oaks: Sage Publications.

Acknowledgment

For their comments, the author thanks Laura Rawlings (Lead Social Protection Specialist, HDNSP), and the following members of the Poverty Reduction and Equity Group: Philipp Krause (Consultant), Gladys Lopez-Acevedo (Senior Economist), Keith Mackay (Consultant), and Jaime Saavedra (Acting Sector Director). The views expressed in this note are those of the author. To access other notes in this series, visit www.worldbank.org/poverty/nutsandbolts.

About the Author

Michelle Adato is Director of Social and Gender Assessment for the Millennium Challenge Corporation of the U.S. government, which she joined in late 2010. At the time that she wrote this note, she was a Senior Research Fellow at the International Food Policy Research Institute (IFPRI). For eight years, she was co-leader of IFPRI’s Global and Regional Program on Large-Scale Human Capital Interventions, which specialized in program evaluations using quantitative and qualitative methods, and she is the co-editor of two books based on multicountry impact assessments. She has a Ph.D. in Development Sociology from Cornell University and an M.P.A. from Harvard University’s John F. Kennedy School of Government.

This note series is intended to summarize good practices and key policy findings on PREM-related topics. The views expressed in the notes are those of the authors and do not necessarily reflect those of the World Bank. PREMnotes are widely distributed to Bank staff and are also available on the PREM Web site (http://www.worldbank.org/prem). If you are interested in writing a PREMnote, email your idea to Madjiguene Seck at mseck@worldbank.org. For additional copies of this PREMnote please contact the PREM Advisory Service at x87736. This series is for both external and internal dissemination.
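Illustrative sketch: site and household selection

The note's discussion of site and household selection describes using survey data to stratify the qualitative sample, for example by program participation and by high or low performance on key indicators. The sketch below shows one way that step might look in practice. It is only an illustration under stated assumptions: the field names (household_id, in_program, performance), the synthetic survey rows, and the helper functions are hypothetical and do not come from any of the evaluations cited in this note. It also draws households at random within each stratum, whereas the note observes that selection within strata is often purposive.

```python
import random
from collections import defaultdict


def stratify_households(survey_rows, strata_keys):
    """Group survey households by their values on the stratifying variables."""
    strata = defaultdict(list)
    for row in survey_rows:
        strata[tuple(row[k] for k in strata_keys)].append(row)
    return dict(strata)


def draw_qualitative_sample(strata, per_stratum, seed=0):
    """Draw up to per_stratum households at random from every stratum.

    A fixed seed keeps the draw reproducible. Purposive selection, as is
    common in practice, would replace the random draw below.
    """
    rng = random.Random(seed)
    sample = {}
    for key in sorted(strata):
        rows = strata[key]
        sample[key] = rng.sample(rows, min(per_stratum, len(rows)))
    return sample


# Hypothetical survey extract: 40 households with made-up attributes.
survey = [
    {
        "household_id": i,
        "in_program": i % 2 == 0,
        "performance": "high" if i % 3 == 0 else "low",
    }
    for i in range(1, 41)
]

strata = stratify_households(survey, ["in_program", "performance"])
sample = draw_qualitative_sample(strata, per_stratum=3)

for key, households in sample.items():
    ids = [h["household_id"] for h in households]
    print(key, ids)
```

Because every stratum contributes households, the qualitative sample covers program and non-program households with both high and low performance, which is what allows the qualitative work to investigate why performance differs. Keeping household_id in the sample is also what makes it possible to link qualitative findings back to the survey records at the analysis stage.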