Microfinance Poverty Assessment Tool

The Microfinance Poverty Assessment Tool was developed as a much-needed tool to increase transparency on the depth of outreach of microfinance institutions (MFIs). It is intended to assist donors and investors to integrate a poverty focus into their appraisals and funding of financial institutions through a more precise understanding of the clients served by these institutions. Used in conjunction with an institutional appraisal of financial sustainability, governance, management, staff, and systems, a poverty assessment allows for a more holistic understanding of an MFI. The Microfinance Poverty Assessment Tool provides accurate data on the poverty levels of MFI clients relative to people living in the same community. It uses a more standardized, globally applicable, and rigorous set of indicators than those used by conventional microfinance targeting tools. The tool employs principal component analysis to construct a multidimensional poverty index that allows the poverty outreach of MFIs to be compared within and across countries. Originally field tested in four countries on three continents, it has subsequently been applied by microfinance donors and MFI networks in numerous other countries. Although the Microfinance Poverty Assessment Tool was designed for microfinance, the tool can be used to measure the poverty levels of clients of other development programs. In terms of cost and reliability, the tool provides far more detailed and statistically accurate data than that offered by low-cost methodologies such as Rapid Rural Appraisal, Participatory Appraisal, or Housing Index methodologies, while avoiding the high cost and extensive time requirements of a detailed household expenditure survey.

Using principal component analysis to estimate a poverty index
Figure 9.9 Case study example of average relative poverty scores disaggregated by survey area and client status
Figure 9.10 Case study example of average poverty household scores disaggregated by MFI program type and client status
Figure 9.11 Constructing poverty groups
Figure 9.12 SPSS "Rank Cases" dialogue box
Figure 9.13 SPSS "Rank Cases: Types" dialogue box
Figure 9.14 SPSS "Compute Variable" and "Compute Variable: If Cases" dialogue boxes
Aggregating from F2 data file: Family structure for adults (aged 15 and above)
Table 7.2 Aggregating from F3 data file: Family structure for children (aged 0 to 14)
Table 7.3 Aggregating from F4 data file: Value of selected household assets
Table 7.4 Computation and output variables at the household level
Table 8.1 Example of cross tabulation of client status by principal occupation of adults in household
Table 8.2 Example of SPSS output table for chi-square test of cross tabulation
Table 8.3 Sample tests of significance between client status and occupation at cluster level
Table 8.4 Example of SPSS output table for independent t-test on samples
Table 8.5 Example of independent samples t-test
Table 9.1 Example of an SPSS correlation output table
Table 9.2 Template for recording ranked indicators by level of association with benchmark poverty indicator
Table 9.3 Example of an SPSS component matrix
Table 9.4 Example of SPSS component matrix with additional variables
Table 9.5 Example of SPSS explained common variance table
Table 9.6 Example of an SPSS communalities table
Table 9.7 KMO-Bartlett test
Table 9.8 Example of descriptive statistics for middle tercile of poverty index
Table 9.9 Example of cross tabulation of "type of latrine" and poverty group
Table 9.10 Chi-square tests of type of cross tabulation

The Consultative Group to Assist the Poor (CGAP) is committed to the twin objectives of increasing the financial sustainability of MFIs and deepening their poverty focus, that is, increasing their outreach and impact on the lives of poorer people. As part of this commitment, CGAP has continually endeavored to provide tools that allow for greater transparency on the performance of microfinance institutions (MFIs) in meeting these objectives.
To date, the focus on transparency in microfinance has centered primarily on financial performance. The Microfinance Poverty Assessment Tool was developed as a much-needed tool to improve transparency on the depth of MFI poverty outreach. The tool is intended for use by donors and MFI evaluators as a practical, accurate, and relatively simple means of assessing the extent to which MFI programs reach the poor. The methodology outlined in this guide is relatively easy to implement in a short time and at minimal cost to a donor organization, both key criteria for the development of the tool.
In addition, the tool supports the comparison of poverty outreach among MFIs and across countries. The methodology is applicable to all MFIs, regardless of their location, client structure, or outreach approach. When used in conjunction with the CGAP Format for Appraisal of Microfinance Institutions (1999), the Microfinance Poverty Assessment Tool provides a straightforward means of gauging the likelihood that an MFI can reach poor clients while relying predominantly on commercial funding.
The poverty assessment methodology was originally field tested in four case studies in Asia, Africa, and Latin America conducted in 1999. Since that time, the tool has been applied successfully in a number of other countries, including Bolivia, Mali, Mexico, Nepal, and South Africa. The cumulative experience gained from these studies provided insight into how to standardize the tool while maintaining its adaptability to local conditions. The Microfinance Poverty Assessment Tool encourages donors to integrate a poverty focus in their appraisal and funding of MFIs. CGAP strongly believes that the future of the microfinance industry lies in moving beyond the poverty-sustainability polemic in favor of pushing microfinance forward on both the poverty outreach and sustainability frontiers. There is great scope to creatively improve on both without sacrificing either. By making the depth of MFI outreach more transparent, the Microfinance Poverty Assessment Tool can help guide the industry to support a broader range of MFIs more effectively.
The microfinance industry promotes the dual objectives of sustainability of services and outreach to the very poor. When deciding to fund specific microfinance institutions (MFIs), donors and other social investors in the sector consider both objectives, but their relative importance varies among funders. Furthermore, many practitioners, donors, and experts perceive a trade-off between financial sustainability and depth of outreach, although the exact nature of this trade-off is not well understood.
In recent years, several tools have emerged to assist donors in their assessment of the institutional performance of MFIs. One example is the CGAP Format for Appraisal of Microfinance Institutions (hereafter, CGAP Appraisal Format), which contains practical guidelines and indicators for measuring MFI performance on a range of issues, including governance, management and leadership, mission and plans, systems, operations, human resource management, products, portfolio quality, and financial analysis. Analysis of these institutional features allows an appraisal to consider an institution's potential for viability and/or sustainability. At the same time, the proliferation of tools such as the CGAP Appraisal Format has encouraged transparency and the development of standards for financial sustainability in microfinance.
Currently, no rigorous tool exists to measure the poverty level of MFI clients. In order to gain more transparency on the depth of poverty outreach, CGAP collaborated with the International Food Policy Research Institute (IFPRI) to design and test a simple, low-cost operational tool to measure the poverty level of MFI clients relative to nonclients. This tool is a companion piece to the CGAP Appraisal Format; donors should not use the poverty assessment tool without also conducting a larger institutional appraisal.
The concept of poverty is complex and strongly influenced by local cultural and socioeconomic conditions. The poverty assessment approach presented in this manual supports a flexible definition of poverty that can be adapted to fit local perceptions and conditions of poverty.
The tool is intended neither as a means to target new clients nor to assess the impact of microfinance services on the lives of existing clients. It may provide a useful means to verify, both for the donor and the MFI, the extent to which an existing strategy results in poor clients

Intended users
Donors are the intended beneficiaries of this poverty assessment tool, but they are unlikely to be the actual implementers of the tool. Although the manual presents as simply as possible the techniques involved in conducting a poverty assessment, an assessment is best handled by a team of research professionals with expertise in survey methodology and statistical analysis. In almost all countries, knowledgeable and reliable research institutes regularly conduct studies at the level of detail presented in this manual. By documenting all steps of the survey design, data collection, and analysis, as well as the interpretation and reporting of results, this manual provides a clear-cut guide for the experienced researcher to conduct a poverty assessment.
Donors will want to read through the manual to understand the level of effort and time frame required, the likely costs associated with an assessment, and the level of expertise needed in a contracting institute. (Chapter 2 provides specific guidelines for contracting individuals or institutions to conduct the assessment.) The assessment is intended to be conducted independently of the MFI whose clients are being surveyed. However, the manual does indicate the types of information support required from the MFI. Donors will also want to review the results of an assessment to anticipate how the quantitative measurement of poverty outreach can best be integrated into additional appraisal methods.
The tool is not meant for direct use by an MFI. Not only is the required level of specialized knowledge unlikely to be found among MFI staff, but direct field testing by an MFI could greatly bias household responses. The results of an assessment will certainly interest MFIs, which may have ideas on how to use the results for their own purposes.
However, the assessment tool is not specifically intended to guide MFIs in applying assessment results to their future program development. Any decision on how to use assessment results is left solely to the MFI and the donor.
The tool is also not an appropriate means of targeting new MFI clients. It can complement targeting tools by providing a statistically rigorous, objective assessment of how effectively various targeting methodologies perform.

Manual layout
This manual is divided into five parts. Part I, Overview (chapter 1), describes the methodology of the poverty assessment tool and the level of detail needed to successfully implement the survey. Part II, Planning and Organizing the Assessment (chapter 2), guides donors in contracting the project to a qualified institution or individual.
Part III, Collecting Survey Data (chapters 3-5), provides guidelines and instructions for collecting survey data. Chapter 3 guides users to develop a sampling frame and conduct the actual sampling of households. Chapter 4 outlines how to customize a standardized questionnaire to fit the specific local conditions where an MFI operates. Chapter 5 presents guidelines for organizing and training the survey field team.
Part IV, Analyzing the Data (chapters 6-9), focuses on managing and analyzing the data using the Statistical Package for the Social Sciences (SPSS) software. Chapter 6 guides users in managing the survey data once it is collected, including how to enter, structure, link, and clean data. Chapter 7 summarizes SPSS techniques for preparing data for analysis. Chapter 8 gives an overview of the data analysis techniques used to describe socioeconomic similarities and differences between survey households and how to use SPSS to implement these techniques. Chapter 9 provides an overview of the statistical procedures and principal component analysis used to create the poverty index, describing each step in detail.
Part V, Interpreting the Results (chapter 10), explains how donors can interpret results of the data analysis to form conclusions. Donors are strongly urged to read chapters 1, 2, and 10 in detail and to browse through the remaining chapters.

Study parameters and choice of an indicator-based methodology
microfinance clients. To be effective and practical, the tool needed several features. First, the methodology should be simple enough to remain operational. Second, the methodology should permit comparisons between different MFIs and, if possible, across countries. Third, the tool should not be costly to implement and should have a minimum turnaround time without unduly sacrificing the credibility of results.
Consideration of these parameters led to the adoption of the indicator-based method. This method involves (i) identifying a range of indicators that reflect powerfully on poverty levels and for which credible information can be quickly and inexpensively obtained; (ii) designing a survey methodology that facilitates the collection of information on these indicators from households living in the operational area of an MFI; and (iii) formulating a single summary index that combines information from the range of indicators and facilitates poverty comparisons between client and nonclient households.
Approaches based on intensive household expenditure surveys were ruled out not only because they were too expensive and time-consuming to implement, but because they necessitated advanced skills in statistical data analysis. On the other hand, participatory or rapid assessment techniques were ruled out mainly because they did not easily allow for objective comparisons between MFIs. A brief discussion of these alternative approaches is provided in appendix 1.

Methodological steps
The development of this indicator-based poverty assessment tool followed the methodological steps described below.

Multiple dimensions of poverty and their implications
Because of the multifaceted nature of poverty, reliance on any one dimension or any one type of indicator was not recommended. Indicators for this poverty assessment tool were, moreover, selected to capture common characteristics of poverty rather than to describe the causes of poverty. Three groups of indicators were used to capture different dimensions of poverty in developing the generic questionnaire (see appendix 2 for a detailed list).
Indicator Group 1. These indicators express the means to achieve welfare. These reflect the earning potential of households and relate to human capital (family size, education, occupation, etc.), asset ownership, and social capital of the household.
Indicator Group 2. These indicators relate to the fulfillment of basic needs, including health status and access to health services, food, shelter, and clothing.
Indicator Group 3. These indicators relate to other aspects of welfare, such as security, social status, and the environment.
In many cases, a single indicator may not even be fully reliable to describe one particular dimension of poverty. For example, collecting information on TV ownership is not likely to shed complete light on a household's access to consumer assets in general, and needs to be supplemented by other indicators on ownership of kitchen appliances and/or other electronic assets such as radios or electric fans.

Selection criteria for indicators
An exhaustive list of indicators was first obtained through a literature review. A subset of indicators was then selected for the generic questionnaire, based on the following criteria:
• nationally valid (can be used in different local contexts, urban versus rural)
• not too sensitive (can be asked openly)
• practical (can be observed as well as asked)
• high-quality (indicator is sensitive in discriminating different poverty levels)
• reliable (low risk of falsification or error; also possible to verify)
• simple (direct and easy to answer versus computed information)
• time-efficient (can be answered rapidly)
• universal (can be used in different countries)
The indicators selected for the generic questionnaire covered the following areas:
• quality of housing (e.g., walls, roofs, access to water)
• wealth (e.g., type, number, and value of assets)
• human capital (e.g., level of school education and occupation of household members)
• food security and vulnerability (e.g., hunger episodes in last 30 days and last 12 months, types of food eaten in last two days)
• household expenditures on clothing and footwear (poverty benchmark)

Purpose of field testing
The questionnaire was field tested in each of the four case studies with three objectives in mind.
Objective 1: to further select and/or reduce the number of indicators to be included in the recommended final questionnaire. This objective was reached by (i) identifying indicators that were tightly related to poverty levels in each case study, (ii) identifying indicators that could be commonly used across the four countries (i.e., those that were sufficiently robust to reflect conditions in diverse socioeconomic and cultural contexts), (iii) identifying indicators suitable for capturing local specificities and evaluating their importance in an overall assessment, (iv) cataloguing the problems and strengths of the survey tool and related analysis resulting from the case-study tests in different country and MFI settings, and (v) critically evaluating the methodology by sharing the results with MFIs and other stakeholders.
Objective 2: to test and standardize the methodology used to integrate different indicators into a poverty index that would allow for comparisons between MFIs and countries.
Objective 3: to document all procedures involved in objectives 1 and 2 in a user-friendly manual to support future independent assessments.

Indicators chosen for questionnaire
Table 1.1 lists the indicators included in the final recommended questionnaire (see appendix 3). Their selection was based on the ease and accuracy with which information on them could be elicited in a typical household survey, and how well they correlated with the benchmark poverty indicator: per capita expenditure on clothing and footwear. The latter expenditure was chosen as the benchmark indicator since it bears a stable and highly linear relationship to total consumption expenditure, itself a comprehensive measure of welfare at the household level.
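The screening of candidate indicators against the benchmark can be sketched as follows. This is a minimal illustration in Python rather than the SPSS procedure the manual relies on. The indicator names and household values are invented for the example, and Pearson correlation is assumed as the measure of association.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def rank_indicators(indicators, benchmark):
    """Return (name, r) pairs, strongest association with the benchmark first."""
    scored = [(name, pearson(values, benchmark))
              for name, values in indicators.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical data for six households (illustrative values only).
clothing_exp_per_capita = [12.0, 30.0, 18.0, 45.0, 9.0, 27.0]
candidate_indicators = {
    "rooms_per_person": [0.3, 0.8, 0.5, 1.2, 0.2, 0.7],
    "years_schooling_head": [2, 8, 3, 10, 1, 9],
    "household_size": [8, 4, 6, 3, 9, 5],
}

ranking = rank_indicators(candidate_indicators, clothing_exp_per_capita)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

Ranking by the absolute value of the coefficient keeps indicators that move against the benchmark, such as household size in this invented data set, alongside those that move with it.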
The following indicators were rejected:
Indicators using child-specific information. Not all households have children, hence using child-related information precluded some households from comparative analysis.
Indicators of social capital. This is an evolving area of investigation and measurable, comparable indicators were not easily found.
Subjective responses. Self-assessment of poverty was considered unreliable for use in comparisons.
Health-related information. Eliciting health-related information requires longer recall periods and more intensive and specialized training of interviewers. In the absence of training provided by health specialists (which is expensive), responses can be highly subjective and misleading.
The standard questionnaire contains a set of recommended core indicators that can be adapted to local conditions. The adaptation required will depend on local perceptions of poverty and how these perceptions are integrated into the questionnaire. In all case studies, minor changes were made to the standard questionnaire to ensure local relevancy; in several cases, a few additional location-specific indicators were added.

Methodology overview
The use of multiple indicators enables a more complete description of poverty, but it also complicates the task of drawing comparisons. The wide array of indicators has to be summarized in a logical way, underlining the importance of combining information from different indicators into a single index. The creation of an index requires finding a method of weighting that can be meaningfully applied to different indicators so as to reach an overall conclusion. The case studies used the method of principal component analysis to accomplish this task.

Using principal component analysis to develop the poverty index
Principal component (PC) analysis isolates and measures the poverty component embedded in various poverty indicators to create a household-specific poverty score or index. Relative poverty comparisons are then made between client and nonclient households based on this index. PC analysis extracts underlying components from a set of information provided by the indicators. In this poverty assessment tool, information collected from the questionnaires makes up the "indicators," and the underlying component that is isolated and measured is "poverty." The choice and form of indicators used in measuring relative poverty is driven by requirements of the PC method. In particular, only indicators that can measure a progressive change in welfare are appropriate.
In the example presented in figure 1.1, poverty and demographic characteristics constitute the two underlying components affecting the level of all indicators. Because the indicators are determined by these common underlying components, they are likely to be related to each other. PC analysis uses this information (the co-movement among indicators) to isolate and quantify the underlying common components. PC analysis is also used to compute a series of weights that mark each indicator's relative contribution to the overall poverty component. Using these weights, a household-specific poverty index (or poverty score) can be computed based on the indicator values of each household.
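The weighting step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the manual's SPSS procedure. The indicator values are invented, every indicator is assumed to be coded so that a higher value means greater welfare, and power iteration stands in for a full PC routine only to keep the sketch free of external libraries. Statistical packages may also rescale the resulting scores differently.

```python
from math import sqrt

def standardize(column):
    """Rescale a list of values to mean 0 and standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation_matrix(z_columns):
    """Correlation matrix of already-standardized indicator columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]

def first_component(matrix, iterations=500):
    """Leading eigenvector of a symmetric matrix, found by power iteration."""
    k = len(matrix)
    vec = [1.0 / sqrt(k)] * k  # positive start keeps the weights positive here
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

def poverty_scores(indicator_columns):
    """Return (indicator weights, household scores) from the first component."""
    z = [standardize(col) for col in indicator_columns]
    weights = first_component(correlation_matrix(z))
    households = len(z[0])
    scores = [sum(w * col[h] for w, col in zip(weights, z))
              for h in range(households)]
    return weights, scores

# Hypothetical indicators for six households (illustrative values only).
columns = [
    [0.3, 0.8, 0.5, 1.2, 0.2, 0.7],  # rooms per person
    [2, 8, 3, 10, 1, 9],             # years of schooling of household head
    [1, 3, 2, 4, 1, 3],              # number of consumer assets owned
]
weights, scores = poverty_scores(columns)
print([round(w, 2) for w in weights])
print([round(s, 2) for s in scores])
```

Because the three invented indicators are all positively correlated, the weights come out with the same sign, which is the consistency check the text describes for identifying the poverty component.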
The indicators in the case studies were specially chosen to correlate well with poverty, including those that had significant correlation with per capita clothing and footwear expenditure, the benchmark indicator. Hence the poverty component is expected to account for most movement in the indicators and is the "strongest" of all the components. Further, the poverty component is identified based on the size and consistent signs of the indicators that contribute to the index. For example, education level should contribute positively, not negatively, to wealth. Figure 1.2 gives an example of the distribution of a poverty index across households using data from MFI B, one of the MFIs that participated in the original field testing of the assessment tool. The greater the value of the score, the relatively wealthier the household.

Using the poverty index
Each poverty assessment includes a random sample of 300 nonclient households and 200 client households. To use the poverty index for making comparisons, the nonclient sample is first sorted in ascending order according to its index score. Once sorted, nonclient households are divided into terciles based on their poverty-index score: the top third of the nonclient households are grouped into the "higher"-ranked group, followed by the "middle"-ranked group, and finally, the "lowest"-ranked group. Since there are 300 nonclients, each group contains 100 households.
The cutoff scores for each tercile define the limits of each poverty group. Client households are then categorized into the same three groups based on their household scores. Figure 1.3 illustrates the use of cutoff scores to create poverty terciles from nonclient households. The cutoff scores of -0.70 and +0.21 were calculated from the case study example shown in figure 1.2.
If the pattern of poverty among client households matches that of nonclient households, client households will divide equally among the three poverty groupings in the same way as the nonclient households, with 33 percent falling into each group. Any deviation from this equal proportion signals a difference between the client and nonclient populations. For instance, if 60 percent of client households fall into the first tercile, or lowest poverty category, the MFI reaches a disproportionately high number of very poor clients relative to the general population.
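The tercile construction and the client comparison can be sketched as follows. The scores are invented and the samples are far smaller than the 300 nonclient and 200 client households the tool calls for. Placing each cutoff midway between the two neighboring nonclient scores is an assumption made for this sketch, and the nonclient sample size is assumed to be divisible by three.

```python
def tercile_cutoffs(nonclient_scores):
    """Cutoffs that split the sorted nonclient sample into three equal groups."""
    ordered = sorted(nonclient_scores)
    third = len(ordered) // 3
    lower = (ordered[third - 1] + ordered[third]) / 2
    upper = (ordered[2 * third - 1] + ordered[2 * third]) / 2
    return lower, upper

def poverty_group(score, cutoffs):
    """Assign a household to a poverty group using the nonclient cutoffs."""
    lower, upper = cutoffs
    if score < lower:
        return "lowest"
    if score < upper:
        return "middle"
    return "higher"

def group_shares(scores, cutoffs):
    """Percentage of households falling into each poverty group."""
    counts = {"lowest": 0, "middle": 0, "higher": 0}
    for score in scores:
        counts[poverty_group(score, cutoffs)] += 1
    return {group: 100.0 * n / len(scores) for group, n in counts.items()}

# Hypothetical index scores (illustrative only): 9 nonclients, 6 clients.
nonclients = [-1.4, -0.9, -0.8, -0.3, 0.0, 0.1, 0.4, 0.9, 1.6]
clients = [-1.2, -1.0, -0.85, -0.5, 0.2, 1.1]

cutoffs = tercile_cutoffs(nonclients)
nonclient_shares = group_shares(nonclients, cutoffs)
client_shares = group_shares(clients, cutoffs)
print(cutoffs)
print(client_shares)
```

In this invented sample, half of the client households fall below the lower cutoff, compared with one-third of nonclients by construction.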

Relative versus absolute poverty
The poverty index provides a tool to calibrate relative poverty, that is, the extent to which a household is worse off or better off compared to other households. It does not by itself provide information on the absolute level of poverty, the actual level of deprivation of the "lowest" category of households or the level of affluence of the "highest" category. A good sense of the absolute level of poverty among clients and nonclients can be derived by noting and comparing the values of individual indicators (see chapter 7). Another assessment of absolute poverty can be derived from comparing welfare indicators at the national level, such as per capita real incomes or the Human Development Index of the United Nations Development Programme (UNDP). Results from the analysis of the poverty index can then be juxtaposed with regional- and national-level indicators to make final inferences, as illustrated in the following section and described in detail in chapter 10.

Interpreting results
The outcome of a poverty outreach assessment can be somewhat threatening to an MFI, particularly if it jeopardizes its likelihood of attracting donor support. Interpreting results of the poverty assessment involves reviewing quantitative findings within the context of the institutional and environmental setting of each MFI. The organizational mission and strategies of many MFIs do not focus exclusively on outreach to the poor. Some face geographical, political, and other external constraints that limit their effectiveness in attracting poor clients. The stage of institutional development of an MFI and conditionalities imposed by outsiders may also influence its capacity to focus on the poor. Finally, an MFI may be changing its practices or supporting different types of programs that place varying emphasis on targeting the poor. All of these aspects need to be considered when analyzing the poverty outreach performance of an MFI.

Selected results of test case studies
The quantitative results of an assessment are best summarized by examining the proportion of client households falling into the three poverty groups. The results of the four original case studies used to test the methodology in 1999 are summarized below.
MFI A. Figure 1.4 presents the three poverty groups by client and nonclient households. The distribution of MFI A clients across the poverty groups closely mirrors the distribution of nonclients, indicating that MFI A serves a clientele that is quite similar to the general population in its operational area. This result is consistent with both the stated objective of MFI A to reach micro, small, and medium enterprises, and the diversity of financial products offered by the MFI.
MFI B. Figure 1.5 shows that the poorest households are underrepresented among MFI B clients. However, about one-half of its clients fall into the two poorest categories. This result is noteworthy, considering the institution's mission, which is not exclusively poverty oriented (it is to reach only women in business); the focus of its product (financing businesses upon submission of a business plan); and the lack of overt targeting.
MFI C. About half of MFI C clients belong to the higher-ranked poverty group, while they are underrepresented in the lowest poverty group (figure 1.6). This result reflects the fact that MFI C membership is share based and open to all individuals. However, poverty outreach is significantly higher when considering only clients belonging to the new program for women. Nearly one-half (45.2 percent) of clients of the women's program belonged to the lowest-ranked group, with only 19 percent belonging to the higher-ranked group.
MFI D. Figure 1.7 indicates quite clearly that the poorest groups are strongly overrepresented and less poor households underrepresented among MFI D clients. This result is not only consistent with the explicit aim of MFI D to serve the poorest households in its operational area, but also indicates considerable success in its targeting practices.

Overall comparative results
A comprehensive assessment of an MFI must include an evaluation of how its poverty-outreach record reconciles with its mission and program objectives. As the case studies themselves show, the MFIs differ in terms of geography, stated mission, type of market niche sought, preference for a specific type of institutional culture, and a host of other factors. Ignoring these considerations or providing incomplete information on institutional details fails to tell a complete story, meaning that the poverty assessment methodology can be easily misused. Chapter 10 provides guidelines on how to report findings from a balanced perspective. A basis for making overall comparisons of quantitative results across MFIs and countries is discussed below. Table 1.2 presents three measures that facilitate comparisons between MFIs. Measure 1 is the percentage of client households that belong to the lowest tercile of ranked households. This measure reflects the extent to which the poorest households are represented in the client population.
A similar measure, measure 2, indicates the percentage of client households that belong to the highest-ranked group. This measure reflects the extent to which less-poor households are represented in the client population. A value above 33 percent indicates that, in comparison to the nonclient population, a greater proportion of client households falls into the higher-ranked poverty group.
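Measures 1 and 2 reduce to simple percentages once client households have been assigned to poverty groups. The sketch below assumes hypothetical client counts for two invented institutions, "MFI X" and "MFI Y". These are not the case-study MFIs, and the figures are not taken from table 1.2.

```python
NONCLIENT_SHARE = 100.0 / 3  # by construction, one-third of nonclients per group

def outreach_measures(counts):
    """Return (measure 1, measure 2) in percent from client counts per group."""
    total = sum(counts.values())
    measure_1 = 100.0 * counts["lowest"] / total  # clients in the lowest tercile
    measure_2 = 100.0 * counts["higher"] / total  # clients in the higher-ranked tercile
    return measure_1, measure_2

def compare(measure):
    """Describe a measure relative to the one-third nonclient share."""
    if measure > NONCLIENT_SHARE:
        return "overrepresented"
    if measure < NONCLIENT_SHARE:
        return "underrepresented"
    return "proportional"

# Hypothetical counts for 200 client households per institution.
hypothetical_mfis = {
    "MFI X": {"lowest": 120, "middle": 50, "higher": 30},
    "MFI Y": {"lowest": 40, "middle": 60, "higher": 100},
}

results = {name: outreach_measures(c) for name, c in hypothetical_mfis.items()}
for name, (m1, m2) in results.items():
    print(f"{name}: measure 1 = {m1:.1f}% ({compare(m1)}), "
          f"measure 2 = {m2:.1f}% ({compare(m2)})")
```

Reading the two measures side by side shows at a glance whether an institution's clients skew toward the poorest or the least poor households in its operational area.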
While measures 1 and 2 provide relative poverty comparisons in the operational area of an MFI, this information must be supplemented by local and regional information that relates the general poverty level within the operational area to that found at national or provincial levels. When it is available and of good quality, existing secondary data can provide a useful means to estimate absolute poverty levels within the operational area. When not available, interviews with an expert panel can be used to develop a relative measure of how the poverty level of an MFI operational area compares to national poverty levels. These methods are discussed in more detail in chapter 10.
Finally, country-level information using the Human Development Index (HDI) computed by the UNDP can indicate how overall poverty levels within a country compare to those of other countries.
All of the countries in which the case-study MFIs were located fell below the all-developing-countries HDI average (see table 1.2). The human development index for the African country in which MFI B is located, for example, is only 75 percent of the average HDI for all developing countries taken together. Therefore, even the higher-ranked clients of MFI B are likely to be very poor according to international standards.
The two measures in combination provide transparency by indicating the extent of general poverty within the operational area of an MFI and the extent to which the institution is reaching the poorest within this area. The methodology leaves the responsibility for making the final conclusion to the reader.

Summary
As the following chapters describe, the stages of developing a poverty index and using the index to assess the relative poverty of MFI clients are:
Stage 1: Using random sampling methods, choose a survey sample that will support results representative of the MFI client and nonclient populations.
Stage 2: Develop a formalized questionnaire by adapting a standardized template to fit local conditions.

Planning and Organizing the Assessment
Guidelines for contracting the assessment
A well-executed study assessing the depth of outreach of an MFI requires a clear understanding of the characteristics of the poor in a given area, as well as familiarity with how qualitative and quantitative aspects of poverty can be captured in the form of specific indicators. Also required is a team of social science professionals who can develop a sampling frame, implement a household-level survey, analyze statistical data, and report findings in a professional manner.
Selection of qualified local researchers is critical to the success of a poverty assessment. The ideal local researchers must have at least several years' experience in conducting statistical socioeconomic household surveys in the operational area of the MFI and in supervising data entry, data cleaning, and tabulation. They should also be familiar with how qualitative techniques can be applied to establish local definitions of poverty. In addition, researchers should have a track record of successfully completing research projects on time and within budget.
Researchers may be asked for proof of their past experience in conducting data analysis for publication. Finally, both researchers and the institutions for which they work should be accepted within the areas surveyed and not be subject to a conflict of interest due to political, ethnic, or religious affiliation. Researchers should be prepared to work independently of any parties interested in biasing the outcome of an assessment.
The choice of a local researcher, whether with an institution or as an individual consultant, should be based on the experience, availability, and cost of the principal individual; this person's participation should be tied contractually to the actual project assignment.
The research agreement between the donor and the local researcher fixes the term of the agreement, the responsibilities of the contracting parties, the responsibilities of the researcher and field team, the scope of work, payments, reports and delivery schedule, copyright and ownership of the work, and cases of dispute and termination. Both sides of the intended agreement need to be familiar with the institutional procedures and constraints of the other party.

In communicating the procedures for channeling funds, donors should take care to provide details on any special forms, contacts, or billing requirements. Donors should also ascertain how funds will be channeled locally and determine the amount of time the contracting institution will need to process them. Many field operations are unnecessarily delayed because donors do not consider the length of time required for funds to filter through local institutions or assume that local institutions will be able to incur expenses using their own resources.

Responsibilities of the researcher
Survey design, data analysis, and preparation of a final report are estimated to involve approximately four to six weeks of effort for a trained, experienced researcher working with a cleaned electronic data set. The actual cost associated with this component of the study will depend on the fees charged by the researcher, but most will fall in the range of US$2,000-US$4,000. This work will include the following major tasks:
• coordination with field survey team to adapt and test the questionnaire
• estimation of the poverty index and calculation of regional and national poverty measures, following the methodology presented in the manual, plus the preparation of all statistical tables
• qualitative and quantitative assessment of poverty levels in survey areas in relation to national averages
• meeting with MFI staff to present results and gather feedback for any needed changes
• preparation of the final report

Contractual issues
• Is the researcher's level of involvement explicitly specified in the contract?
• Does each party know the other's contracting practices and institutional constraints?
• Has the budget been altered to reflect local costs and wage rates?
• Does the contract specify how progress is to be monitored and funds released?
• Are the expected deliverables well specified and feasible within the budget and time frame?

Sequencing project payments
Payments to the local researcher can be sequenced against specific stages in the scope of work, as suggested in the brief outline below.
First installment of funds paid to researcher. This amount represents approximately one-third of the field operations budget and enables the following stages of work to be completed:
1. The researcher compiles information and data to set up a sampling frame for the selection of MFI branch offices, identifies experts, and gathers data to assess poverty levels regionally by month 1.
2. The researcher assesses local definitions of poverty, adapts the standard questionnaire to fit local conditions, and, if deemed necessary, identifies up to five additional local indicators.
3. During a meeting with the contracting party scheduled during month 1, the researcher finalizes the questionnaire, trains enumerators, and randomly selects survey clusters. Agreement is reached on how to randomly select clients and nonclients.
Second installment of funds delivered. This amount represents the remaining two-thirds of the field operations budget and permits the following stages of work to occur:
4. The field team implements the survey by the end of month 2.
5. While data is collected, the researcher conducts expert interviews or analyzes secondary poverty data to compare general poverty levels in survey areas to national levels. He or she then calculates a regional measure of poverty.
6. The researcher finalizes the cleaned data near the beginning of month 3 and delivers it to the contracting party.
7. The researcher analyzes the data and computes a composite poverty index by the end of month 3.
8. By month 4, the researcher writes a draft report containing the data and index described in steps 6 and 7, together with comparative regional and national poverty measures.
9. The researcher organizes and participates in a seminar at which the methodology and results of the poverty assessment are presented and a draft report is circulated.
10. The researcher submits a final report to the contracting party by the end of month 4.
Final payment. Final payment of the researcher's fee and any overhead charges is contingent on delivery and acceptance of the final report.

Determining the required time frame
The implementation period refers to the time period beginning with the decision to conduct the poverty assessment and ending with the completion and distribution of final results. A time frame for starting and completing the study, including the sequencing of various activities, needs to be established early in the process. The time required and the amount of overlap between activities should be carefully estimated; researchers should be careful not to cut corners to save time. Field operations are best scheduled to avoid major national or religious holidays, periods of bad weather, or heavy workloads. Figure 2.1 provides a list of activities and estimated time frames for an MFI poverty assessment. The time allocation estimates are based on actual times used to test the tool in the four test case studies. It is estimated that an assessment can be completed in approximately four months, excluding delays associated with holidays, weather, or other reasons for postponement. Contractors may need to allow for additional time, depending on the season and circumstances at hand.
In general, field operations are the most expensive stage of an assessment. Good planning and time management contribute greatly to holding down costs. The quality of field survey implementation, moreover, can make or break a study. The five aspects of successful field operations are schedule, budget, personnel, logistical support, and performance measurement.

Allocating the poverty assessment budget
Estimating the budget needed for the field survey requires careful scrutiny of how the field survey will be implemented. The allocation of the budget closely follows the schedule of activities, as shown in figure 2.2.

Effective budgeting and cost control require a detailed breakdown of major cost categories that correspond to specific survey activities. The major expenses incurred in an MFI field survey will be personnel wages and per diems, transportation, fuel and related costs, and reproduction of questionnaires. Additional expenses may include office and computer rental, plus telephone and other communication costs. Survey personnel need to review the budget regularly to ensure that cost estimates remain in line with actual field progress. A small contingency fund is needed to cover unforeseen expenses.
The actual cost of implementing a field survey will vary depending on the country or region in which it is conducted and on the rates charged by the contracting researcher or institution. In the test case studies, actual field costs ranged from a low of US$4,000 to a high of US$16,000. An additional researcher fee for the data analysis and report preparation will average between US$2,000 and US$4,000. It is estimated that the minimum cost for the field survey and data analysis will be near US$10,000, with costs approaching US$15,000 if transportation costs and local wages are relatively high. A sample budget worksheet and summary budget for a rural MFI poverty assessment can be found in tables 2.1A and 2.1B, respectively.

Personnel, logistical, and performance issues affecting field implementation
Skilled personnel who are well trained and motivated can strongly influence the success of the field operation. A project manager will take overall responsibility for planning and implementing the field survey. In the initial phase, he or she will verify that all field staff are adhering to the sampling frame, appropriately conducting the random selection of clients, and applying the random walk as intended. He or she will monitor progress towards completing the survey and verify that interviewers and supervisors are following the questionnaires consistently during interviews and filling in the forms correctly and completely. The project manager will also monitor the team's progress in staying on schedule and within budget as the field work progresses.
The manager may also be the primary researcher for the project or may work with the researcher or researchers in coordinating the field survey. Ideally, the project manager will have previous survey experience and a good track record for successfully managing resources and personnel.
In addition to the project manager, the study requires at least six to eight interviewers and one field supervisor (if one interview team is planned for fieldwork) or two supervisors (if the interviewers will be split into two teams). It is recommended that supervisors manage between three and four interviewers. See box 2.2 for an example of an actual field implementation team and corresponding schedule.

Box 2.2 Field implementation in Kenyan case study
The field personnel for the survey in Kenya consisted of one manager, two field supervisors, six interviewers, and two drivers. The field staff was split into two teams. Each team traveled to a different survey site. By dividing responsibility for the survey, the teams used one day to travel to survey sites, find accommodation, randomly select new clients at the site, locate the homes of sampled clients to schedule interviews, and determine the boundaries and direction for the random-walk sampling of nonclient households. Once settled, the interviewers were able to interview an average of six to seven households per day. Counting the day of preparation, interviewers were targeted to complete an average of five interviews per day. A higher target was considered unfeasible; a lower one was considered avoidable with good organization. Time in the field for both survey teams totaled 22 days, with interviewers working Saturdays but not Sundays and avoiding interviews at night.
While the survey team was active in the field, several data-entry specialists began to enter data as soon as interviews from the first two survey sites were completed. In this way, the data entry was complete one week after the survey team finished the field work. The short overlap permitted the interviewers to help clean the data.

Field supervisors are responsible for coordinating the daily activities of the interviewers, including arranging movement to and from interviews and transport from one survey site to the next. Supervisors also take responsibility for ensuring that the questionnaires are filled out correctly and completely and that the information contained in them is accurate before leaving each survey area. Field supervisors check the work of each interviewer on a daily basis to minimize the number of errors and missing values. Supervisors also conduct repeated random spot checks to verify the accuracy of data by partially repeating a household interview without the interviewer being present.
Field supervisors report regularly to the project manager on progress, costs incurred, and any irregularities in the field. Supervisors should have prior experience in conducting quantitative surveys; good supervisors have strong leadership skills and are assertive in supervising interviewers to ensure that high-quality data is collected.
Interviewers with prior field-survey experience are also desirable, but just as important are individuals with strong communication skills who can carry out interviews in a confident and relaxed manner while maintaining their train of thought. All interviewers require thorough training that includes in-depth review of the questionnaire to understand its intent and repeated practice in posing the questions in the local language.
In many cases, personnel involved in field operations may be the same as those who later participate in the data analysis. In past cases, field supervisors also worked as research assistants, and interviewers entered and cleaned data. In these cases, the individuals involved had previous experience doing both types of tasks.
Training methods are discussed in detail in chapter 5. The project manager, supervisors, and interviewers need to participate in the training to ensure that they all share a common understanding of how to use the questionnaire.
Well-planned logistical support (coordinated transportation, communications, field supplies, and contingency plans for disruptions) also greatly enhances the quality of field implementation. Logistical support needs to be carefully planned at all stages of the survey, especially where operations take place in remote locations with limited infrastructure. Vehicles should be large enough to carry the field team and supplies, and sturdy enough to withstand road conditions in survey areas. The estimated time needed to move from site to site should be based on a careful review of distances and road conditions. Communication methods and emergency plans also should be identified beforehand. Access to petrol stations, food, and accommodation also needs to be determined. Finally, those planning the logistics should consider local customs and political circumstances to avoid an unfriendly welcome or hasty exit.
Performance measures for the field implementation are a means of controlling activities and meeting objectives. These measures need to be carefully thought through and well defined to ensure that they are not misunderstood and do not create unintended incentives. The most common measurement is the number of complete interviews to be finished each day. The actual interview is estimated to take only 20 minutes. However, locating households, making introductions, and departing smoothly can easily double the time needed. An average target of five interviews per day for each interviewer is recommended, although some adjustment may be needed to reflect survey conditions. More interviews per day could compromise interview quality; fewer per day could increase field costs. Expenditure limits are also an effective means of measuring performance.
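As a rough planning aid, the daily interview target can be turned into an estimate of interviewing days. The sketch below assumes a survey of 500 households and the recommended average of five interviews per interviewer per day; the function name is hypothetical, and the estimate deliberately ignores travel between sites, rest days, and preparation time, all of which lengthen actual time in the field.

```python
import math


def interviewing_days(total_interviews, interviewers, per_day=5):
    """Estimate the working days needed to complete all interviews."""
    if interviewers <= 0 or per_day <= 0:
        raise ValueError("interviewers and per_day must be positive")
    # Round up: a partly used day still counts as a day in the field.
    return math.ceil(total_interviews / (interviewers * per_day))


# Assumed survey of 500 households with teams of six or eight interviewers.
days_with_six = interviewing_days(500, 6)
days_with_eight = interviewing_days(500, 8)
print(days_with_six, days_with_eight)
```

Comparing the two results against the daily cost of keeping a team in the field gives a first indication of whether additional interviewers are worth their wages and per diems.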

Collecting Survey Data
The poverty assessment tool is based on a comparison of the relative poverty levels of new MFI client households and nonclient households. The tool compares new clients to nonclients only in areas where the MFI currently operates. At a later stage, this comparison is expanded to relate the MFI operational area to general poverty levels in the region and country as a whole (see chapter 10). For these comparisons to be valid, it is essential that the researcher follow a well-structured sample design and accurately document the survey locations according to local government demarcations. This chapter guides users in how to choose a representative group of new MFI client and nonclient households within the institution's operational area. Through random sampling, the results for sampled households will hold true for the entire population of MFI client and nonclient households living within the operational area.
Prior to developing a sampling process for the study, the research team should first map the organizational structure of the MFI to determine the organizational and geographic breakdowns already existing within its operational area. Figure 3.1 shows the common geographic levels used by MFIs to organize their field staff.
Once the geographic organization of the MFI is identified, the researcher follows a series of steps to ensure that the final set of households surveyed represents a random sample of all possible households that could have been interviewed in the operational area of the MFI. These steps are described in detail below.
Chapter 3: Developing the Sample Design

Step 1: Define the population and sampling unit
While some characteristics of poverty can be measured at the individual level, such as a person's income or the assets only he or she owns, much of an individual's wealth is shared with and influenced by the household in which that individual lives. Assessing the relative poverty of an individual without considering the conditions of the entire household provides a distorted view of his or her poverty. The poverty assessment tool thus measures the relative poverty of the household rather than that of any single member in the household.

Household as the basic sampling unit
The household approach has the disadvantage of being unable to account for an uneven distribution of wealth within a household. MFIs that target disadvantaged household members may have a stronger poverty outreach than is indicated by a poverty assessment, particularly if the targeted members have limited access to and control a disproportionately small share of household resources. This possibility should be factored in when interpreting the assessment results, as described in chapter 10.

Client households. New clients are sampled on the assumption that their living standards have not yet been affected by MFI participation. Every attempt should be made to capture as new a sample of clients as possible. Clients who have newly joined the MFI but have not yet received a loan would be the ideal group from which to draw the sample. When it is not possible to survey such a group, a general rule would be to define a new client as someone who joined the MFI and received a first loan only within the past three months.
• Adopt filtering criteria that respond to the situation at hand. In specific cases, new-client selection criteria may prove too stringent: MFIs may not know how long a client has been with the MFI, or which new clients have already received loans. In addition, an MFI may accept new clients only on a yearly basis, as is the case for many agriculture-based lending schemes. In general, if the sampling process is not able to rule out new clients who have already received a loan, the questionnaire will require additional precautions to control for the possible effects of this loan. This issue is discussed further in chapter 5.
• If necessary, eliminate new clients who have joined older groups. It may be necessary to introduce a further restriction on new-client households to exclude individuals who recently joined client groups in existence longer than three months. This filter is often necessary because information about the number of new individuals in older groups is either unreliable or unavailable at a central location. In addition, surveying new members in older groups, who are likely to be few in number but spread over large areas, may prove too costly in terms of logistics.
• Check that all household members meet the criteria. Client households may have more than one member currently active in the MFI, provided that none of these members has been a client for longer than three months. In addition, the household should not have any member who was once an MFI client but is no longer active.
Nonclient households. The sampling of nonclient households also requires that no household members be current or past clients of the MFI being assessed. Both clients and nonclients can be active participants in other MFIs and still be eligible for the sample.
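The eligibility rules above can be expressed as a simple screening check. The sketch below is only illustrative: the `Member` record and its fields are hypothetical, and it assumes that the MFI can report, for each household member, how many months ago he or she joined and whether he or she is still active.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Member:
    # Months since joining the assessed MFI; None if never a client of it.
    months_as_client: Optional[float] = None
    # True if the member is currently an active client of the assessed MFI.
    active: bool = False


def is_new_client_household(members: List[Member], max_months: float = 3) -> bool:
    """At least one client; every client is active and joined recently."""
    clients = [m for m in members if m.months_as_client is not None]
    if not clients:
        return False
    return all(m.active and m.months_as_client <= max_months for m in clients)


def is_nonclient_household(members: List[Member]) -> bool:
    """No member is a current or past client of the assessed MFI."""
    return all(m.months_as_client is None for m in members)


# Hypothetical households used only to exercise the rules.
new_household = [Member(2, True), Member()]
mixed_household = [Member(2, True), Member(10, True)]
lapsed_household = [Member(1, True), Member(2, False)]
nonclient_household = [Member(), Member()]
print(is_new_client_household(new_household), is_nonclient_household(nonclient_household))
```

Membership in other MFIs is deliberately not recorded here, since it does not affect eligibility for either group.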

Determining a feasible survey area
Determine the operational area of the MFI. In addition to knowing which households to sample, the sampling area must also be determined. The operational area is the geographic area in which the MFI operates. The operational area may be best divided according to existing MFI regions or branch offices. These areas may in turn be subdivided into areas of coverage by individual field agents. Note the ways in which the MFI breaks its operational area into sub-units. Also note how these breakdowns compare to those used by local government offices and identify ways in which the two delineations can be aligned. If the MFI offers more than one type of program to its clients, the operational area of the MFI may also be divided by type of program. These programs may or may not overlap geographically.
Identify any problem areas that may not be feasible to survey. In general, the research team will need to determine a standardized rule or set of rules for filtering out unfeasible survey areas and then follow this rule consistently. However, the process of eliminating areas from consideration needs to be carefully scrutinized to avoid any unintentional introduction of bias.
Before setting any rules to limit the feasible survey area, the MFI needs to be asked whether the research team's proposed rules would result in a possible selection bias. The MFI itself may also have reasons for excluding operational areas from the survey. In some cases, these reasons need to be respected, such as the likelihood that the survey could create local animosity towards the MFI. In other cases, these reasons may obscure motives that the MFI does not wish to make clear. In summary, determining the feasible area for the survey requires good information and careful judgment to avoid bias (see box 3.1).
Document any potential source of bias from limiting the feasible area. Exclusion of unfeasible areas may introduce a bias if the areas excluded are likely to be either below or above the poverty levels found in the remaining areas. In a recent assessment, densely populated urban areas were excluded because these households would not have been willing to participate in the survey. Their exclusion was noted and factored in when assessing the regional coverage of the MFI. Exclusion of some areas should not introduce bias in local results, since MFI clients and nonclients in these areas are equally excluded. Bias may also be introduced if excluded areas receive a different set of MFI services or are subject to different targeting methods than the survey areas. In cases where services differ between areas, sampling methods will need to distribute the random selection of households proportionally across the two programs.

Box 3.1 Determining the survey area
In Kenya, a coastal area was eliminated from the operational area because of its remoteness from the rest of the program. This exclusion did not introduce client/nonclient bias because the types of services and delivery mechanisms were the same as those employed in the rest of the operational area of the MFI. Any differences in poverty levels between the regions were later accounted for through the regional analysis.
The MFI surveyed in Madagascar operated two different programs in the same geographical area, but only one specifically targeted poor women. When determining the survey area, the coverage of each program was considered to avoid any unintended bias.

Step 2: Construct the MFI-based sampling frame
The sampling frame refers to lists of the number and distribution of new MFI clients within the operational area. These lists can be used to determine which localities (and the number of households in each) will be surveyed. The number of new clients can be structured according to the geographic breakdowns described in figure 3.1.
In most cases, the research team need not compile an actual list of all qualifying new-client households. Instead, information on the distribution of new clients within the operational area of the MFI can be used to randomly select a handful of smaller geographic areas, making it necessary to prepare new-client lists only for these areas. Ideally, the MFI will provide information on the number of new clients in each region or branch down to the number of new clients located in each field agent area. If appropriate, information on the number of new clients in each program type may also be needed.
Determining the locations of the actual survey will require the use of several sampling techniques, as described below.

Cluster sampling for new MFI clients
Cluster sampling is a sampling technique that randomly reduces the number of MFI localities to be surveyed. It is a useful technique not only because it eliminates the need to make extensive lists of households, but because it systematically limits the number of locations in which the survey will be conducted, thereby reducing field costs.
Cluster sampling requires that the entire feasible area for the survey be divided into non-overlapping clusters, allowing a subset of these clusters to be randomly chosen for the actual sampling of households. Deciding how to form clusters will largely be determined by MFI geographic delineations. As a rule, the more clusters that can be identified, the better. A minimum of 10 clusters is recommended, but a number closer to 30 is preferred. It is recommended that only five to six clusters be selected from this group for actual sampling of households.
As already mentioned, area clusters are best formed on the basis of geographic divisions already created by the MFI. These are usually sub-
For very large MFIs, it may also be necessary to first randomly select a subset of the operational regions of the institution. This can be done by assigning each region or branch a number and conducting a random sample according to the relative size of each region (this technique is described in detail in the section below entitled "Step 4"). Figure 3.2 illustrates the geographic levels of the MFI used in cluster sampling.

Determining required clustering stages
Most MFIs are sufficiently large and dispersed to require at least a two-stage cluster approach. In addition to a random selection of approximately five to six geographic clusters (field-agent territories), a second random sampling is done within each area to select a set of client households. Random sampling within a cluster usually requires a list of the number of new clients residing in the area and a random selection method to sample households from this list. Box 3.2 summarizes the steps used in cluster sampling.
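A two-stage selection of this kind can be sketched in a few lines. The example below assumes simple equal-probability selection of clusters in the first stage (it does not weight clusters by size) and a simple random draw of households in the second. The function name, cluster names, and household identifiers are hypothetical.

```python
import random


def two_stage_sample(clusters, n_clusters, n_households, seed=None):
    """Select clusters at random, then households at random within each.

    clusters maps a cluster name to its list of new-client household IDs.
    """
    rng = random.Random(seed)
    # Stage 1: choose a subset of clusters (for example, field-agent territories).
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: choose households from the new-client list of each chosen cluster.
    return {
        name: rng.sample(clusters[name], min(n_households, len(clusters[name])))
        for name in chosen
    }


# Hypothetical frame: 12 field-agent areas with 60 new-client households each.
frame = {
    f"agent_{i:02d}": [f"hh_{i:02d}_{j:03d}" for j in range(60)]
    for i in range(12)
}
sample = two_stage_sample(frame, n_clusters=5, n_households=40, seed=1)
for name, households in sample.items():
    print(name, len(households))
```

Fixing the seed makes the selection reproducible, which can help when the sampling procedure has to be documented or audited afterwards.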

In some cases, a three-stage cluster may be appropriate, particularly when MFI clients represent groups of individuals. Here random selection of client groups within each randomly selected geographic cluster would be followed by a random selection of members within that group. Figure 3.3 summarizes the decisions used to determine the stages of cluster sampling to randomly sample new-client households.
Step 3: Determine appropriate sample size
Calculating sample sizes is one of the most technically demanding aspects of survey design. On a practical level, sample size is partly determined by the time and resources available for the survey. On a technical level, four parameters inform the decision on sample size: (i) the desired precision of the survey, (ii) the probability distribution of the variable that the survey seeks to measure in the population, (iii) the choice of sampling design (i.e., single random sampling or multi-stage random sampling), and (iv) the number of variables (in this case, poverty indicators) that the survey seeks to capture.
Without prior knowledge of the distribution of poverty indicators among clients, a rule-of-thumb approach must be applied to determine sample size. As a default, this manual recommends a sample size of at least 500 households and a 2-to-3 ratio of clients to nonclients in all survey clusters (for example, 200 clients to 300 nonclients). The larger sample size for nonclients captures the presumably larger variance among nonclients with respect to any poverty indicator than exists among clients. Due to MFI targeting practices and the self-selection of MFI clients, the client group is likely to be less heterogeneous (have less variance) than the nonclient group.

Box 3.2 Steps used in cluster sampling
Step 1: Randomly sample a subset of MFI branches or, if needed, regions for larger MFIs.
Step 2: Randomly sample a subset of clusters from a list of all MFI clusters in each region.
Step 3: Within each selected cluster, develop a list of all new MFI client households.
Step 4: Choose a random-sampling technique to select client households to be interviewed.
Step 5: If clients constitute groups of individuals, randomly sample groups within each selected cluster and then randomly sample a subset of members of each group.

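The default sample size and the 2-to-3 ratio from step 3 translate directly into per-cluster targets. The sketch below assumes that the same number of households is interviewed in every sampled cluster and rounds up where the division is not exact; the function name is hypothetical.

```python
import math


def allocate_sample(total=500, client_parts=2, nonclient_parts=3, n_clusters=5):
    """Split a total sample by the client-to-nonclient ratio and by cluster."""
    parts = client_parts + nonclient_parts
    clients = total * client_parts // parts
    nonclients = total - clients
    return {
        "clients": clients,
        "nonclients": nonclients,
        # Round up so that the overall target is met or slightly exceeded.
        "clients_per_cluster": math.ceil(clients / n_clusters),
        "nonclients_per_cluster": math.ceil(nonclients / n_clusters),
    }


five_clusters = allocate_sample()
three_clusters = allocate_sample(n_clusters=3)
print(five_clusters)
print(three_clusters)
```

With five clusters, this gives 40 clients and 60 nonclients per cluster, the same split reported for the Nicaraguan survey in box 3.3.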
Step 4: Distribute the samples proportionally
The main objective of the sampling process is to ensure that the selected sample represents the population of all new MFI clients. Equal-probability sampling is one means of ensuring that each new MFI client has an equal chance of being selected. This type of sampling can be applied in two ways: probability-proportionate-to-size sampling (PPS) and equal-proportion sampling (EPS). The type of sampling technique used will largely depend on whether the survey team can determine the number of new MFI clients in each cluster. However, if the number of new clients in each cluster cannot be known without visiting each field office, then the EPS method may be more reasonable logistically. PPS is generally preferred over EPS because it permits equal numbers of clients to be surveyed in each sampled cluster, whereas EPS requires that the number of clients in each survey locality be proportional to the total number of new clients located in all selected clusters.
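The proportional allocation that EPS calls for can be illustrated as follows. This is a minimal sketch under the assumption that the clusters have already been selected and that the number of new clients in each becomes known once the field offices are visited; the function name and the cluster counts are hypothetical.

```python
def eps_allocation(new_clients_by_cluster, total_clients):
    """Allocate client interviews in proportion to each cluster's new clients."""
    total_new = sum(new_clients_by_cluster.values())
    if total_new == 0:
        raise ValueError("selected clusters contain no new clients")
    # Rounding can leave the sum one or two interviews off the target.
    return {
        cluster: round(total_clients * count / total_new)
        for cluster, count in new_clients_by_cluster.items()
    }


# Hypothetical counts of new clients in three selected clusters.
selected = {"cluster_a": 100, "cluster_b": 300, "cluster_c": 600}
allocation = eps_allocation(selected, total_clients=200)
print(allocation)
```

The uneven workloads across clusters in the result show why PPS, with its equal number of interviews per cluster, tends to be simpler to organize in the field.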

Probability-proportionate-to-size sampling (PPS)
PPS ensures the equal-chance selection of households, but requires that the number of new clients in each cluster be known before households are randomly selected (see box 3.3). Of the two methods, PPS is the easier to implement in the field because the number of households surveyed is the same in each survey locality.
PPS is carried out in two stages. In the first stage, each cluster is assigned a chance of selection proportionate to the number of new MFI clients it contains, with the result that larger clusters have a better chance of selection than smaller ones.
In the second stage, the same number of MFI new-client households is selected from each selected cluster. Table 3.1 illustrates the steps involved in weighting the clusters proportionally to the number of new clients in each cluster. Assuming only three clusters will be used for sampling, these are selected randomly using a random-number chart, where the range of numbers from 1 to 100 is assigned to each cluster according to the share of new clients each cluster holds: 1 to 16 for cluster 1, 17 to 37 for cluster 2, and so on. If the random numbers selected are, for example, 5, 18, and 60, then clusters 1, 2, and 4 are selected. An equal number of MFI new clients (67) is then drawn from each cluster.
A number of software programs offer functions to generate random-number tables, such as the one shown in table 3.2.

Box 3.3 Example of PPS sampling
In Nicaragua, new clients were randomly sampled from two large geographic areas on the basis of the number of new members in each area. Twenty percent of new clients were located in the northern region. Of the five office areas randomly sampled, one was drawn from the north and four were drawn from the south. Because the number of new clients was known for each area, the PPS method was used to determine the number of clients interviewed in each sampled office area. The list of clients was determined by the credit agent at the office level. An equal number of clients (40) were randomly selected from each of the five branches. Sixty (60) nonclients were randomly sampled using the random-walk method at each survey site.
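The first PPS stage, assigning number ranges from 1 to 100 by share of new clients and mapping random draws to clusters, can be sketched as follows. This is a minimal illustration, not the manual's procedure: the function name is hypothetical, and only the shares for clusters 1 and 2 (16 and 21 percent) come from the text; the shares for clusters 3 to 5 are invented so that the ranges sum to 100.

```python
import bisect
from itertools import accumulate

def pps_select(shares_pct, draws):
    """Map each random draw (1-100) to the cluster whose range contains it."""
    upper_bounds = list(accumulate(shares_pct))   # e.g. [16, 37, 55, 80, 100]
    if upper_bounds[-1] != 100:
        raise ValueError("cluster shares must sum to 100")
    # bisect_left gives the index of the first upper bound >= draw,
    # so a draw of 16 falls in cluster 1 and a draw of 17 in cluster 2.
    return [bisect.bisect_left(upper_bounds, d) + 1 for d in draws]

# Clusters 1 and 2 follow the text; clusters 3-5 are hypothetical shares.
shares = [16, 21, 18, 25, 20]
print(pps_select(shares, [5, 18, 60]))   # [1, 2, 4]
```

If two draws land in the same cluster, a survey team would presumably draw again; the sketch does not handle that case.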

Equal-proportion sampling (EPS)
EPS attaches to each cluster an equal chance of selection regardless of size, but then distributes the numbers of clients interviewed in each cluster according to the share of new MFI clients in that cluster to the total number of new clients in the selected clusters. In this way, if new-client information is not centralized, evaluators need only determine the number of new clients in those clusters that are randomly selected. The evaluation team will eventually be required to visit branch offices to collect new-client information for that cluster.
The number of households interviewed in each cluster will differ when using the EPS method. Bigger clusters will have more MFI house-


EPS method applied to client groups
In many MFIs, clients are members of financial groups. The individuals in these groups are each counted as a new client. However, when distributing samples within a cluster, evaluators will want to draw a sample of groups from which to randomly select households for interviewing. If the PPS method is used for sampling, the same number of households can be chosen from each group, regardless of its membership size. However, if the EPS method is used, the number of individuals interviewed in each group needs to be adjusted according to the ratio of new clients represented in that group to the total number of new members in the cluster. Table 3.4 gives an example of how the number of members from each group is determined. In the example, assume that 46 interviews from cluster 1 are to be taken from 5 randomly sampled financial groups. The sample size for each group is adjusted proportionally according to the ratio of the membership of each group to total membership in the cluster.
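The proportional adjustment for groups under EPS can be sketched in code. Table 3.4 is not reproduced here, so the group membership figures below are hypothetical; only the total of 46 interviews across 5 groups comes from the text. The rounding rule (largest remainders) is also an assumption, chosen so that the group samples add up exactly to the cluster total.

```python
def allocate_proportionally(group_sizes, total_interviews):
    """Distribute interviews across groups in proportion to membership."""
    total_members = sum(group_sizes)
    exact = [total_interviews * size / total_members for size in group_sizes]
    counts = [int(x) for x in exact]              # round down first
    shortfall = total_interviews - sum(counts)
    # Hand the remaining interviews to the groups with the largest
    # fractional remainders so the total is met exactly.
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:shortfall]:
        counts[i] += 1
    return counts

# Hypothetical membership of 5 sampled groups in cluster 1.
group_sizes = [30, 25, 20, 15, 10]
print(allocate_proportionally(group_sizes, 46))   # [14, 11, 9, 7, 5]
```

The same function could be applied one level up, to distribute a cluster's share of the client sample when EPS is used across clusters.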
Step 5: Select the actual sample
The discussion so far has guided a researcher to develop a sample frame using several levels of geographic clusters, which systematically reduces the number of new-client households from which to choose a random sample. It has also guided a researcher to determine the number of new-client households that will be randomly sampled in each cluster so that each such household in the feasible survey area has an equal chance of being selected. Once these two techniques are applied, the actual random selection of new-client households can proceed. The selection of actual client and nonclient households is done during the course of the survey. This is generally also the best time to verify the accuracy of survey sampling information with MFI field staff.

Random sampling within clusters
The research team will need to define how to randomly select clients within each cluster. A relatively straightforward method of selection is systematic sampling, where draws are made at fixed intervals through a list of the sample population, starting from a random unit. This method requires that a list be made of all new clients or client groups within a selected cluster.
For example, suppose the survey team needs a sample of 10 from a list of 150 clients or client groups. The sampling interval is 15 (150 divided by 10). A starting number is randomly selected between 1 and 15 and, beginning with that client or client group on the list, every 15th one is selected. If 5 were the randomly selected number, then the sample would be composed of clients 5, 20, 35, 50, 65, 80, 95, 110, 125, and 140. A second method is simple random sampling. Using this method, all new clients are assigned identification numbers and then selected either by drawing slips of paper from a hat or by using a random-number table.
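The systematic draw in this example can be reproduced with a short sketch. The function name and parameters are hypothetical; the sketch assumes the list size is an exact multiple of the sample size, as in the example of 10 from 150.

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return 1-based list positions chosen at a fixed interval."""
    interval = population_size // sample_size      # 150 // 10 = 15
    if start is None:
        start = rng.randint(1, interval)           # random start in 1..interval
    return [start + k * interval for k in range(sample_size)]

print(systematic_sample(150, 10, start=5))
# [5, 20, 35, 50, 65, 80, 95, 110, 125, 140]
```

Leaving `start` unset draws the starting position at random, which is how the method would be used in practice.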
In addition to randomly sampling households to interview, the survey team should also prepare a second list of randomly sampled households to place on a reserve list in the event that a sampled household does not qualify for an interview or is unable to be interviewed. As a rule of thumb, the reserve list should contain an additional 4 reserve names for each 10 sampled names. Once the survey is underway, the first name on the reserve list is taken to replace the first sampled household falling out of the survey. All additional replacements are made in the order in which they appear on the reserve list.
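One way to prepare the sample and reserve list together is sketched below. It is an assumption of this sketch that both lists are drawn in a single pass from the same client list so that no household appears twice; the function name is hypothetical.

```python
import math
import random

def draw_sample_with_reserve(client_ids, sample_size, seed=None):
    """Draw a main sample plus 4 reserve names for every 10 sampled names."""
    rng = random.Random(seed)
    reserve_size = math.ceil(sample_size * 4 / 10)
    drawn = rng.sample(client_ids, sample_size + reserve_size)
    # The first names form the sample; the rest form the ordered reserve list.
    return drawn[:sample_size], drawn[sample_size:]

sample, reserve = draw_sample_with_reserve(list(range(1, 151)), 10, seed=1)
print(len(sample), len(reserve))   # 10 4
# Replacements are taken in list order: reserve[0] first, then reserve[1], ...
```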

Random sampling of nonclient households: The random walk
Sampling nonclient households would be a time-consuming exercise if an accurate list of all households had to be created within each survey area.

A researcher can avoid this task by employing a two-stage technique called the EPI Cluster Survey Design method, or EPI method. Although the method may be less precise than sampling from a true population list, its greater efficiency is an appropriate trade-off for the loss of precision. In contrast to the sampling frame for client households, the EPI method requires no preparatory work other than defining the boundaries of each survey site. The random selection of nonclient households is done at the same time that the survey is conducted.
The EPI method, developed by the United Nations Children's Fund to monitor the immunization of children within large areas, can be easily adapted to fit the MFI situation. The method is used within the local community or subdivision where sampled MFI clients live. The boundaries of the area can be set by asking MFI clients to identify landmarks in all directions that establish an outer perimeter of where client households are located.
As illustrated in figure 3.4, the survey team identifies a central point in this area from which to divide the area into quarters (the area may be divided into more than four sections if it is a very large zone). From this central point, the survey team selects a random direction by spinning a bottle or pen on a flat surface and noting the direction in which it points. The interviewer selects only households lying in this direction or quarter. The next interviewer makes a second spin to select a second direction to follow.
In cases where clients are scattered across a large geographic area, it may be necessary to divide the area into quarters as indicated above, but instead of randomly selecting a direction from the center, several quarters are randomly selected. From a new central point within each quarter, the interviewer then randomly spins for a direction along which households will be randomly selected.
Figure 3.4 Quartiles of a survey area
[Figure: survey area divided around a central point into Quarter 1, Quarter 2, Quarter 3, and Quarter 4]
Care should be taken in determining the walking route to avoid unintended bias. Ideally, the interviewer walks in a straight line rather than following a road or street. It is possible that households located on a road are different from those located elsewhere. Adjustments can also be made for areas that do not easily fit into a grid pattern. For instance, if a locality is oriented along a single road, then the road itself can be divided into sections and one or more of these sections is randomly selected for the walk. Finally, care should be taken that no particular sections of a locality are systematically excluded, either by eliminating them as too difficult or by selecting too small an interval number so that households on the periphery are missed.
Depending on the density of households and the approximate area in which MFI clients reside, the survey team determines an interval number for selecting (sampling) and interviewing nonclient households. In dense urban areas, an appropriate interval may be 10, so that every 10th dwelling counted from the center along the randomly selected direction would be sampled and interviewed. For rural areas, a much smaller interval number may be more appropriate. Interviewers may need reminding that households do not necessarily live in separate homesteads, but also in housing complexes. Within a single building, a random process should also be defined for selecting which households to interview. Several households may be located within the same building, and renting and squatter households are also counted. Box 3.4 outlines the random-walk process.


Box 3.4 The random walk
Step 1: Approximate the village or locality boundaries of sampled MFI new-client households and draw a rough map.
Step 2: Determine a central point and assess the density of households.
Step 3: Divide the area into quarters.
Step 4: Randomly select one or more directions by spinning a pen or bottle to determine the quarter to be sampled. If households are dispersed, count households within a quarter; if concentrated, narrow the count to a particular route within the quarter. Do NOT select only households located along a road.
Step 5: Follow the direction identified for the random walk and select households at intervals of a predetermined number based on population density (for example, every fifth household).

Step 6: Replace dropout households by sampling the very next household.
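The random-walk steps can be simulated to check an interval choice before fieldwork. The sketch below rests on several assumptions that are not in the manual: the spin is modeled as a compass bearing in whole degrees, quarters are numbered clockwise from the northeast, dwellings are represented as a list ordered along the walking route, and counting resumes from the replacement household after a dropout. All names are hypothetical.

```python
import random

def spin_for_quarter(rng):
    """Model the bottle spin as a bearing (0-359) and return its quarter (1-4)."""
    bearing = rng.randrange(360)
    return bearing, bearing // 90 + 1

def random_walk(dwellings, interval, needed, qualifies):
    """Select every `interval`-th dwelling along the route (Step 5).

    If a selected household drops out, the very next household is taken
    instead (Step 6).
    """
    selected = []
    i = interval - 1                      # index of the first dwelling to try
    while i < len(dwellings) and len(selected) < needed:
        j = i
        while j < len(dwellings) and not qualifies(dwellings[j]):
            j += 1                        # move on to the very next household
        if j < len(dwellings):
            selected.append(dwellings[j])
        i = j + interval                  # assumed: resume from the replacement
    return selected

bearing, quarter = spin_for_quarter(random.Random(7))
route = list(range(1, 51))                # 50 dwellings along the chosen route
picked = random_walk(route, interval=5, needed=4,
                     qualifies=lambda d: d != 10)   # dwelling 10 drops out
print(picked)   # [5, 11, 16, 21]
```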

Describing each survey site
In the course of surveying nonclient households, interviewers should make notes on their general impressions of housing quality, level of infrastructure, population density, and any other notable characteristics of the neighborhood, town, or rural area being surveyed. Interviewers should also take note of how the survey area differs from other areas in the same general locality.
The following questions indicate the specific information that should be recorded about the survey site:
• Is the site primarily urban, semi-urban, or rural?
• How far is it to a major urban area?
• What is the quality of the main roads serving the area?
• Does the site have piped public water?
• Does the site have electricity?
• Do residents have the possibility of collecting firewood?
• What ethnic, religious, or caste groups are located at the site?
• What are the major sources of employment around the site?
• What is the topography and climate?
• If rural, what agricultural crops are grown in the area and what is the current season?

Box 3.5 Summary of steps for developing sample survey design
Step 1: Define the population and sampling unit. The "population" refers to all clients and nonclients who reside within the operational area of the MFI. The "basic sampling unit" is the household of new clients and nonclients.
Step 2: Construct the MFI-based sampling frame. The sampling frame refers to how information on the number of new MFI clients within the operational area of the MFI is used to determine which localities will be surveyed.
Step 3: Determine the appropriate sample size. The minimum sample size is 500 households, of which 200 are new MFI clients and 300 are nonclients, or a 2-to-3 ratio of clients to nonclients.
Step 4: Distribute the sample proportionally. Proportional sampling refers to techniques that structure the selection of households so that each has an equal chance of being selected.
Step 5: Select the actual sample.

Chapter 4
Adapting the Poverty Assessment Questionnaire to the Local Setting

This manual provides a well-tested list of questions that have been worded, coded, and ordered in a questionnaire format to produce consistent, measurable results. (See appendix 3 for a copy of the recommended questionnaire.) The core questions identified for the survey should be included and maintained in their general form under all circumstances. Used in combination, these questions build indicators that are later used to calculate a poverty index. The ways in which responses are grouped, sequenced, and measured are designed to support subsequent analysis of the survey data.

Identifying local definitions of poverty
The recommended questionnaire requires some customization to fit local conditions. The research team is responsible for adapting the standardized questionnaire form to fit the national and, sometimes, localized setting. Researchers can begin the customization process by first assessing local perceptions of poverty to see how well these aspects are addressed in the standard questionnaire. Local poverty definitions can be discovered informally through discussions with area residents or MFI field staff and clients. Straightforward questions can help to uncover how the local population distinguishes between the poorest, less poor, and nonpoor within their communities. Some examples of the types of questions that can be posed are:
• How would you describe the poorest people or families in your neighborhood or village?
• What would be different for a person or family that is a bit better off but still poor?
• How would you describe someone in your community who is not poor, someone that is doing rather well?
In one test case, discussions with residents uncovered a local perception of the poorest as being those who could not afford burial insurance, those a bit better off as having community insurance, and the well-off as having a formal insurance policy.
Good poverty indicators measure changing conditions at different levels of welfare within a locality. Ideally, they capture increasing levels of well-being as household wealth increases. They do not rely on "yes/no" or "have/don't have" responses. These types of indicators cannot be used in principal component analysis. Instead, indicators that measure a quantity, value, or frequency should be used.
In addition, indicator responses capturing qualitative information need to be structured so that the first category of response indicates the lowest level of well-being and the last category of response indicates the highest level of well-being. No indicator should contain a "don't know" or "not applicable" response option.
Documenting local perceptions of poverty will help to interpret assessment results within the local context. In most cases, many of the characteristics describing poverty within a locality will already be covered in the questionnaire. However, the wording of the question or choice of responses may not reflect local conditions, so that small changes are required. To avoid distortions that could weaken the reliability and validity of a question and its underlying indicator, those tasked to adapt the questionnaire can benefit greatly from an overview of the intended purpose of each section within the standardized questionnaire and a short summary of possible adaptations.
The remainder of the chapter outlines the objectives and issues associated with each section of the questionnaire and describes how each section can be adapted without altering its underlying intent. The sections entitled "The survey form" and "Customizing the questionnaire" in this chapter provide guidelines on how to construct additional local indicators of poverty that do not appear in the standard questionnaire.

Introducing the study and screening households
Ideally, introductory information is written ahead of time so that interviewers can introduce themselves and the reasons for the interview precisely and accurately to the household. The following instructions indicate the kinds of information provided to household respondents.

How to introduce the study
Step 1. Identify yourselves. Households will be more cooperative if they know who is conducting the study. An important point to mention is that the survey team is not directly employed by the MFI. This disclosure will eliminate a potential source of bias if the household thinks its answers may affect its access to services from the MFI.
If the survey team is associated with a well-known university or other institution, this may encourage cooperation and should be highlighted.
On the other hand, if the organization is associated with certain parts of government, particularly local government, or a political party or ethnic group, many households may be reluctant to provide truthful information about their relative wealth or poverty. Downplaying these aspects may be prudent.
Step 2. Show letters of introduction and endorsement. Most countries expect that outsiders will first seek permission from local leaders before approaching households in a given locality. In addition to introducing the survey, these courtesy visits also can provide an opportunity to collect important information about the community being surveyed. In some cases, a letter of introduction from MFI headquarters to MFI clients and from local authorities to nonclients can reassure households and further facilitate introductions.
Step 3. Inform households of your purpose. Most households will not fully understand the methodology used for this study. However, many will quickly fathom the overall purpose: to determine if the living standards of new MFI clients differ from those of nonclients living in the same area and, if so, in what ways. Interviewers are discouraged from explaining further that the study seeks to determine whether the households of MFI clients are relatively poorer or wealthier. This information could influence the way that questions are answered by the households and thereby introduce a major source of error in the results.
Step 4. Explain why the household has been selected. Households also appreciate knowing that they have been selected for an interview on the basis of a random process. Those making introductions can draw analogies to such methods as pulling names from a hat to explain exactly what this means.
Step 5. Assure respondents of confidentiality. In many countries, fear of crime or traditional beliefs may also inhibit many households from sharing private information. Introductions by the survey team should incorporate clear statements about the neutrality of the study team and the confidentiality of information collected for the study. The study team should guarantee and subsequently follow through on their guarantee that no outside body will access the data for purposes other than those intended.

Screening households for applicability
Not all households qualify for participation in the study. After making introductions and before beginning the interview, the interviewer must verify that the household qualifies either as a new-client household or as a nonclient household. A household identified as having a member who is a new MFI client can still be disqualified for three reasons:
• Someone else in the household is also a client of the MFI and has been a client for longer than six months, or less if the time restriction for new clients is set for fewer months.
• Someone in the household was, but no longer is, a client of the MFI.
• Someone in the household has already received two or more loans from the MFI.
If a sampled MFI client household is disqualified, it is replaced with the next household named on the reserve list. A household sampled as a nonclient household can also be disqualified for similar reasons:
• Someone in the household is a client of the MFI.
• Someone in the household was, but no longer is, a client of the MFI.
If a household of either type is disqualified, the interviewer should thank the members of the household for their time and terminate the interview. In the event that a nonclient household is disqualified, it is replaced with the next household in the same direction.
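The screening rules above can be written as a simple checklist, which may help when training interviewers or validating entered data. The sketch is illustrative: the `Member` record and its field names are hypothetical, and the six-month threshold is a parameter because the manual allows a shorter period.

```python
from dataclasses import dataclass

@dataclass
class Member:
    is_current_client: bool = False
    months_as_client: int = 0
    is_former_client: bool = False
    loans_received: int = 0

def qualifies_as_new_client_household(members, new_client_months=6):
    """Apply the three disqualification rules for new-client households."""
    for m in members:
        if m.is_current_client and m.months_as_client > new_client_months:
            return False          # a member is a longer-standing client
        if m.is_former_client:
            return False          # a member was, but no longer is, a client
        if m.loans_received >= 2:
            return False          # a member already received two or more loans
    # At least one member must actually be a (new) client.
    return any(m.is_current_client for m in members)

def qualifies_as_nonclient_household(members):
    """A nonclient household has no current or former client of the MFI."""
    return not any(m.is_current_client or m.is_former_client for m in members)

new_client = Member(is_current_client=True, months_as_client=3, loans_received=1)
old_client = Member(is_current_client=True, months_as_client=14, loans_received=3)
print(qualifies_as_new_client_household([new_client, Member()]))    # True
print(qualifies_as_new_client_household([new_client, old_client]))  # False
print(qualifies_as_nonclient_household([Member(), Member()]))       # True
```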

Type of respondent and preferred interview venue
In addition to verifying that the household is an appropriate new MFI client or nonclient household, it is also important to determine who within the household responds to the questions. Ideally, both the head of household and the spouse of the head of household will respond. In many cases, if this is not possible, having either of these persons respond is the next best choice. The location of the interview will also partially determine results for several key indicators. Interviews ideally take place in the respondent's home, where the quality of housing and extent of durable assets can be observed.

The survey form
The following sections specifically identify where the questionnaire will need to be adapted. Some changes will require altering the actual questionnaire form; others will require that a sheet of notes be developed to provide definitions of question categories and terminology. Field staff will use these notes as a reference during the actual household survey.

Section A: Documenting households through identification information
Purpose. Table 4.1 shows section A of the questionnaire. Successful field surveys require adequate identification to distinguish information from different households. Coding of all households, client groups, localities, and clusters is required to identify households in later stages of the analysis.
Issues. The range of numbers used by each interviewer for each household needs to be pre-specified. To eliminate any risk of overlap, it is recommended that each type of identification variable be assigned a range of identity code numbers with a beginning and end point. Evaluators should assign identification code ranges so that they are easily understood. The types of codes that can be used are as follows:
Item A2: Locality codes. Codes that link survey sites to government administrative localities are used to relate survey data to secondary data collected from other sources.
Define: Assign numeric codes to each administrative locality in which survey sites are located and list on the sheet of notes.
Item A3: MFI cluster codes. Each questionnaire can also be partially identified by the MFI survey area in which it is located. A name and code number for each of these areas should be determined and written on questionnaires before interviews. The likely number of survey areas will be 5 to 6.
Define: Assign numeric codes to each survey area and list on the sheet of notes.
Item A4: Group codes. If clients are organized into groups, the name and an associated code number for the group become important identifiers.
Define: Assign numeric codes to each group of clients surveyed and list on the sheet of notes.
Item A6: Household identification codes. In this study, the key means of identifying each household is through the assignment of a unique identity code. Given the sample size of 500, a household identity number can be three digits. These numbers can be assigned before the interview, or written on the questionnaire at the time of the interview. Non-overlapping household code ranges need to be assigned for each survey site. For instance, the first survey site could be assigned a range of 100 to 199; site two, 200 to 299; and so on.
Define: Assign ranges of household identification codes for each survey site and list on the sheet of notes.
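Non-overlapping household code ranges can be generated rather than assigned by hand. The sketch below assumes blocks of 100 codes per site starting at 100, as in the example for item A6; the function names and site labels are hypothetical.

```python
def assign_household_code_ranges(site_names, block_size=100, first_code=100):
    """Give each survey site its own block of three-digit household codes."""
    ranges = {}
    for index, site in enumerate(site_names):
        start = first_code + index * block_size
        if start + block_size - 1 > 999:
            raise ValueError("too many sites for three-digit codes")
        ranges[site] = range(start, start + block_size)
    return ranges

def site_for_code(ranges, household_code):
    """Look up which survey site a household code belongs to."""
    for site, codes in ranges.items():
        if household_code in codes:
            return site
    raise KeyError(f"code {household_code} is outside all assigned ranges")

ranges = assign_household_code_ranges(["site 1", "site 2", "site 3"])
print(ranges["site 1"][0], ranges["site 1"][-1])   # 100 199
print(site_for_code(ranges, 245))                  # site 2
```

A lookup of this kind can also flag miscoded questionnaires during data entry, since any code outside the assigned ranges raises an error.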
Item A11: Interviewer codes. Finally, to control for data errors and monitor interview performance, the questionnaire records the name and code of each interviewer as well as that of the person who has proofed the questionnaire in the field.
All coding associated with identifying households, members, areas, and groups should be summarized on the sheet of notes that is given to each interviewer to use as a reference. Table 4.2 shows the ranges of different types of identification codes used in an actual poverty assessment survey.


Section B: Family structure
Purpose. Characteristics such as the number, age, health, education, and occupation of household members represent indicators of the household's resources in the form of human capital. The purpose of this section is to quantify the key aspects of the household's investments in human capital. Specific objectives include determining the composition of household members based on the survey definition of a household and recording selected poverty-related aspects of each individual member of a household.
Issues. Definitions of household vary widely. Working consistently from a standard definition is crucial for good measurement. For the purposes of this poverty assessment method, a household is defined as a group of individuals who live under the same roof and regularly share meals and expenses together. A family does not necessarily constitute a household, as it can include members who live away from home or who are closely related but do not cook together and pool resources to cover expenses. All household members should be screened to ensure that they fulfill all criteria for a household member. In some cases, the definition of a household may require inclusion of a spouse who works away from the home but contributes regularly to expenses and does not support any other household. Other family members living away from home are not counted unless they are children of the head of household attending a boarding school and the household supports them fully.

Define: In the case where a husband lives away from the home most of the time, but contributes regularly to the household upkeep and supports no other household, the head of household would be the wife who remains at home. This household would be considered "female-headed" and the husband would be included as spouse.
All qualified household members must be included in either section B1 or B2 of the survey form, shown in tables 4.3 and 4.4, and names as well as identification codes must be assigned. (These codes will be used again later in the questionnaire.) All columns in these tables represent poverty-sensitive aspects of individuals and, as categories, should not be changed. However, determining the appropriate wording for categories or responses may require some changes.
Household adults. Section B1 of the questionnaire is shown in table 4.3.
Adult ID code. Each member of the household receives a separate identification number. This number will be used consistently throughout the questionnaire.
Age. For older members, exact ages are sometimes not known. The respondent can be asked to estimate the approximate age of some adult members if determining the exact age is likely to be time consuming.
Maximum level of schooling. This indicator can use coded, sequenced responses to measure progressively higher levels of completion. The categories of response are listed after "(D)" in the table key.
Adapt: Identify the appropriate levels of educational advancement and list them progressively in terms of completion.
Can write. This refers to the ability to read and write, regardless of the language involved (all local languages apply).
Main occupation. This refers to the type of activity done most often by the household member on a daily basis. If individuals do more than one type of work, record the type that takes up the most time per day. If individuals spend the largest part of their day not working, it is critical to record this using one of the codes for not working. Categories of responses are listed after "(F)" below the table.
Amount of loans borrowed. This provides information on the extent to which the household has taken advantage of services from the MFI being assessed. Loans from all other MFIs are not included. This information will later help identify which households may have already benefited from MFI participation.
Clothing and footwear expenses. Household expenditures on footwear and clothing can reflect the relative poverty or wealth of a household in many cultures. Accurate measurements are critical in ensuring the reliability of the variable because this indicator will be treated as the benchmark poverty indicator. Expenditures are limited to those made by verified household members (not extended family members living and eating elsewhere) and do not include gifts to the household. Items given by one family member to another are also not counted as an expense. The amount of expenditure is the amount paid for the item at the time of purchase.
The time period covered is the past year; most respondents will need to be provided reference points (a sequence of notable holidays; time of year, such as Christmas; a family event; or start of school year).
with the spouse permanently present in the household, 3 - married with the spouse migrant, 4 - widow or widower, 5 - divorced or separated
(B) 1 - spouse, 2 - son or daughter, 3 - father or mother, 4 - grandchild, 5 - grandparents, 6 - other relative, 7 - other nonrelative
(C) 1 - male, 2 - female
(D) 1 - less than primary 6, 2 - some primary, 3 - completed primary 6, 4 - attended technical school, 5 - attended secondary, 6 - completed secondary, 7 - attended college or university
(E) 0 - no, 1 - yes
(F) 1 - self-employed in agriculture, 2 - self-employed in nonfarm enterprise, 3 - student, 4 - casual worker, 5 - salaried worker, 6 - domestic worker, 7 - unemployed, looking for a job, 8 - unwilling to work or retired, 9 - not able to work (handicapped)
(G) 0 - no, 1 - yes
(H) In order to get an accurate recall, one should preferably ask about clothes and footwear expenses for each adult in the presence of the spouse of the head of household. If the clothes were sewn at home, provide costs of all materials (thread, fabric, buttons, needle).
Tailoring charges should be included. If items are made in the home, the costs of all materials used (for example, buttons, thread, and cloth) should be estimated for each person. The respondent should be encouraged to ask other household members if he or she is uncertain of the items and amounts.
Children under age 15. Section B.2 of the questionnaire is shown in table 4.4. Characteristics related to children in the household are important indicators of relative poverty for many households. However, because many surveyed households have no children, the survey questions are limited to documenting the number, age, and clothing expenditures on each child. The amount of expenditure for clothing and footwear is calculated in the same way as described in the previous two paragraphs.
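The expenditure rules above (verified household members only, no gifts, amount paid at the time of purchase) can be sketched as a small aggregation. This is an illustrative sketch only: the record layout, field names, and member IDs are hypothetical and are not part of the questionnaire.

```python
from dataclasses import dataclass


@dataclass
class ClothingPurchase:
    member_id: int          # hypothetical ID of the person the item was bought for
    amount_paid: float      # amount paid at the time of purchase, local currency
    is_gift: bool = False   # gifts, including items given by one family member to another


def household_clothing_expenditure(purchases, verified_member_ids):
    """Sum past-year clothing and footwear purchases for verified members only."""
    return sum(
        p.amount_paid
        for p in purchases
        if p.member_id in verified_member_ids and not p.is_gift
    )


purchases = [
    ClothingPurchase(member_id=1, amount_paid=40.0),
    ClothingPurchase(member_id=2, amount_paid=25.0),
    ClothingPurchase(member_id=2, amount_paid=10.0, is_gift=True),  # excluded: gift
    ClothingPurchase(member_id=9, amount_paid=30.0),  # excluded: not a verified member
]
total = household_clothing_expenditure(purchases, verified_member_ids={1, 2, 3})
print(total)  # 65.0
```

Keeping one record per purchase, rather than one total per household, makes it easier to apply the exclusions consistently for adults and children alike.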
Adapting the Poverty Assessment Questionnaire to the Local Setting 55

Section C: Food-related indicators
Purpose. Table 4.5 shows section C of the questionnaire. Household eating patterns are strong indicators of relative poverty and vulnerability. Eating patterns can be affected by the relative poverty of a household in several ways.
First, poorer households tend to consume food on a less regular basis than wealthier households and may eat lesser quantities per person. In some cases, poorer households may skip meals or eat smaller quantities at meals, either during particular seasons or on a more regular basis. Second, poorer households tend to consume more of less costly foods and less of more costly foods. Third, poorer households are often less able to purchase staple foods in larger quantities at more favorable per-unit prices, or less able to maintain stocks of either homegrown or purchased staples.
The specific objectives of measuring these aspects of food security are to:
• Document the quantity and frequency or regularity of food served by the household on a routine basis in ways that distinguish differences in well-being.
• Identify and document consumption of specific foods that signal the spending power of the household.
• Identify the degree to which households are able to purchase in bulk and maintain stocks of staple foods.
Issues. The potential for bias in measuring food consumption is high and several steps are needed to limit the chance of error. First, the recall period for recording food consumption must be kept short. Few individuals can remember what was eaten more than a week in the past. Second, food consumption patterns can be drastically altered during special events so that the occurrence of these events must be controlled for in the questionnaire. All surveyed households noting special events in the past few days are thus asked to recall the period before the event.
Third, the wording of food-related questions must be precisely stated and rigorously followed. Whether a meal is served or prepared can be interpreted differently. Some households cook only once per day but prepare enough food to serve at two meals.
Table 4.5 Section C of survey questionnaire

Section C. Food-Related Indicators
(Both the head of household and his or her spouse should be present when answering this section.)

C1. Did any special event occur in the last two days (for example, family event, guests invited)? (0) no (1) yes
C2. If no, how many meals were served to the household members during the last 2 days? (If yes, how many meals were served to the household members during the 2 days preceding the special event?)
C3. Were there any special events in the last seven days (for example, family event, guests invited)? (0) no (1) yes (If "Yes," the "last seven days" in C4 and C5 should refer to the week preceding the special event.)
C4. During the last seven days, for how many days were the following foods served in a main meal eaten by the household?
Luxury food    Number of days served
(1) daily (2) twice a week (3) weekly (4) fortnightly (5)

Item C1: Special event. The first question in this section screens for special events that may have disrupted normal eating habits during the past two days. If these occurred, respondents are asked to respond regarding eating practices before the special event. In the case of a special event, the respondent skips the next question and resumes the interview (C3). The same screening for special events is repeated for questions related to consumption of luxury and inferior foods (C4).
Item C2: Number of meals served to household members. This is usually a reflection of local eating habits. If households usually serve three meals each day, the expected number would be three. If a morning meal is unusual, the expected number may be only two per day. It is crucial, however, that all interviewers use the same interpretation of what constitutes a "meal."
Define: Standardize definition of a meal and add to the sheet of notes.
In one case study, the indicator was revised to measure the number of meals cooked in the past two days. Nearly all households reported cooking only one meal per day, which was traditional in the locality. The indicator as defined was thus unable to distinguish differences in well-being.
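The special-event screening in items C1 and C3 amounts to shifting the recall period so that it ends just before the event. A minimal sketch of that rule follows. It assumes that the event date is recorded and that the recall period excludes the interview day itself; neither detail is specified in the questionnaire, and the function name is hypothetical.

```python
from datetime import date, timedelta


def recall_window(interview_date, days, special_event_date=None):
    """Return (first_day, last_day) of the recall period, both inclusive.

    Without a special event, the window covers the `days` days before the
    interview. With one, it covers the `days` days preceding the event.
    """
    anchor = special_event_date if special_event_date is not None else interview_date
    return anchor - timedelta(days=days), anchor - timedelta(days=1)


# C2: no special event, two-day recall before an interview on 10 March.
meals_window = recall_window(date(2024, 3, 10), days=2)

# C4: a special event on 8 March shifts the seven-day recall to 1-7 March.
foods_window = recall_window(date(2024, 3, 10), days=7, special_event_date=date(2024, 3, 8))

print(meals_window)
print(foods_window)
```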
Item C4: Luxury foods. The questionnaire requires that, for each survey, three foods be identified that are locally considered of high quality and relatively expensive for the average household, and that the frequency in days that each of these foods was served be recorded. Luxury foods are very specific to local climates and customs. Usually, luxury foods cannot be consumed regularly by all households, but are more frequently consumed in wealthier households than in poorer households. Their consumption is also not restricted to special religious periods or cultural traditions.
Meat, eggs, or dairy products, and some processed foods or sweets, can act as luxury foods in many parts of the world. In some cases, rice in a nonrice-growing region or wheat products in a nonwheat-growing region can be treated as luxury foods. Sometimes what is considered a luxury food changes by season. For example, rice may be considered a luxury food during the maize-dominated agricultural season, but may lose its luxury status during the immediate post-harvest rice season when the rice price falls. Seasonality in price fluctuations should therefore be taken into account when determining what food groups are to be considered as luxury foods at the time of the interview.
Nicaragua: beef, poultry, cheese
Item C5: Inferior food. An inferior food is the opposite of a luxury food. A clear signal that a food is inferior is when many households in a given area tend to avoid its consumption if they can afford an alternative. An inferior food is usually a cheap substitute for a standard staple, or a cheap item or dish to be served with a standard staple. Cassava is considered an inferior food by many households in rice-growing areas, where rice is the preferred staple.
Adapt: Identify the food that can best be regarded as inferior in all survey areas.
Item C6: Number of days when there was not enough to eat in past month. This question is intended to measure short-term or seasonal food shortages within the household. It can be explained to the household as any condition where meals were skipped either because of a shortage of food or because members did not eat as much as they needed to feel full (i.e., they went hungry for part or all of the day).
In one case study, the question was revised to ask only the number of times the household went to bed hungry. Locally, the evening meal was considered the most important, thus poorer households often went hungry during the day but went to bed full. The indicator did not adequately differentiate between levels of poverty.
Item C7: Number of months with at least one day not enough to eat in past year. This question is intended to measure longer-term food shortages within the household and is used to balance any seasonal bias that may have entered into the previous question. If the household experienced food shortages in a previous season, this question will tend to capture it. Because the recall period is long, the interviewer may need to probe the household to recall particular seasons when food prices tended to rise during the past year.
Item C8: Purchases of staple foods. This question measures how often households purchase staple foods. Lower-income households tend to purchase smaller quantities more frequently, despite the higher associated cost, because of limited cash availability.
Adapt: Identify three staple or storable foods that are regularly consumed in the local area. Order responses from highest to lowest frequency, as in table 4.8.

Section D: Dwelling-related indicators
Purpose. The quality of housing is partially determined by the relative poverty of a household. Indicators of dwelling quality include not only the size of the house, but also the durability of materials used in its construction and the extent to which it is kept in good condition. Finally, indicators of the facilities associated with the housing, such as toilet facilities and access to drinking water, can also measure aspects of its quality. Specific measurement objectives are to assess the quality and size of the household dwelling relative to others within the local area, as well as the quality of facilities available to and used by the household. Table 4.10 shows this section of the questionnaire.
Issues. Specific characteristics of household dwellings will vary by locality and culture. In some cultures, household dwellings may be numerous but located within one central compound. In urban areas, dwellings may consist of rented rooms within a single structure. It is therefore essential to adjust questions in this section to reflect the circumstances of survey households. The dwelling is defined as all enclosed living spaces used by the family on a routine basis. Building structures used primarily for storage or livestock are not considered part of the dwelling.
A second issue is the bias that occurs in areas where the level of local infrastructure precludes a household from accessing certain amenities. For example, even wealthy households will not use electricity in an area where electricity is not available. This potential bias will be balanced out in the study through area analysis and does not need to be accounted for in the questionnaire.
Knowing the ownership status of the dwelling can greatly assist in interpreting other information about the household and characteristics of its dwelling. Whether the house is built on squatter land can be a strong indicator of its general insecurity and vulnerability.
What type of toilet facility is available? (1) bush, field, or no facility (2) shared pit toilet (3) own pit toilet (4) shared, ventilated, improved pit latrine (5) own improved latrine (6)
In Kenya, the definition of a "room" included not only the rooms located within the household's main building, but all detached living quarters used by individual members.
Item D2: Type of roofing material. This question requires that the common types of materials used in roofing be identified, the categories for choices defined and sequenced in order from lowest to highest quality, durability, or cost. Where households have more than one dwelling within a compound, roofing material refers only to that on the primary structure.
Adapt: Determine categories for types of roofing and sequence them from lowest to highest quality using code numbers.
In India, roofs made of impermanent materials were most common among poorer households while roofs made of cement or tiles were most prevalent among less-poor households. In Kenya, metal sheets roofed nearly all houses regardless of the poverty level of the household.
Items D3-D4: Type of exterior walls and floors. The choices of materials used for exterior walls and floors will differ by locality. However, the sequence of choices should reflect an improvement in quality, determined either by durability or cost. The highest code number should list the highest-quality building materials. Again, type of exterior walls and floors refers only to those on the primary dwelling structure.
Adapt: Determine groupings for types of walls (and floors) and sequence categories (and code numbers) from lowest to highest quality.
Because the coastal areas of Madagascar are prone to tropical storms, most houses are built with light, cheap materials so that they can be easily rebuilt. The types of walls and roofing are thus not good indicators of differences in poverty levels among households.
The definition of what constitutes a room needs to be specified to fit local conditions.
Item D5: Condition of dwelling structure. This question relies on the interviewer's subjective assessment and assumes that the interviewer is able to view the dwelling structure. To make the measurement consistent, interviewers should have a common understanding of what constitutes "dilapidated," "in need of repairs," and "in good condition." The condition should not depend on the dwelling size or quality of materials used, since these have already been measured.
Define: Standardize the interpretation of three levels of structural condition of the household dwelling and add to the sheet of notes.
In Kenya, "dilapidated" was interpreted to mean the dwelling was structurally unsound, "in need of repairs" meant that parts of the dwelling were obviously in need of repair, and "good condition" meant that no obvious repairs were needed.
Item D6: Electricity supply. The availability and means of delivery for electricity differ by location. The choices of response for this question may need to be altered to better reflect local circumstances. The ordering of choices should range from lower to higher access and cost. Responses should also be recorded where no electricity is available.
Adapt: Determine the category of choices for how households access electricity and sequence the response choices (and code numbers) from lowest to highest cost.
In India, only 20 percent of the poorest households had an electricity connection compared with 80 percent of the least poor.
Item D7: Cooking fuel. The type of primary cooking fuel will reflect location-specific conditions. Bias introduced by area differences in fuel use is addressed during the analysis stage. The choices of fuel types can be standard for all survey areas and should be ordered to reflect the lowest to highest cost of fuel.
Adapt: Determine types of cooking fuels and sequence categories (and code numbers) from lowest to highest cost.
In rural Kenya, a significant indicator of a household's relative poverty was whether the household purchased or collected fuel for cooking.
Item D8: Source of drinking water. The source of drinking water will be determined by local conditions. In general, drinking water sourced from open bodies of water, including open wells, is of lower quality than drinking water accessed through closed systems, either public or private.
Adapt: Determine the main sources of drinking water and sequence categories (and code numbers) from lowest to highest quality.
In Madagascar and Kenya, poorer households were more likely to use open sources for drinking water.
Item D9: Type of latrine. The type of latrine is a component of housing and can be partially determined by the relative poverty of the household. The categories of responses will need to reflect local practices, but the choices of structure should be ordered from lowest to highest quality or cost.
Adapt: Determine the main types of latrines and sequence categories (and code numbers) from lowest to highest quality.
In India, even the majority of the least poor lack a latrine within the homestead; however, the likelihood of having one is much higher for these households (28 percent compared with less than 1 percent of the poorest households).
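The "Adapt" instructions for items D2 to D9 all follow the same pattern: list the locally relevant categories from lowest to highest quality or cost, then number them in that order. A minimal sketch of that step is shown below, using the first five toilet-facility categories from the questionnaire as sample input. The helper name is hypothetical, and the ordering itself remains a judgment made by the survey team.

```python
def build_code_map(categories_low_to_high):
    """Assign codes 1..n to categories listed from lowest to highest quality."""
    if len(set(categories_low_to_high)) != len(categories_low_to_high):
        raise ValueError("categories must not repeat")
    return {label: code for code, label in enumerate(categories_low_to_high, start=1)}


toilet_codes = build_code_map([
    "bush, field, or no facility",
    "shared pit toilet",
    "own pit toilet",
    "shared, ventilated, improved pit latrine",
    "own improved latrine",
])
print(toilet_codes["own pit toilet"])  # 3
```

Building the codes from an ordered list, rather than typing the numbers by hand, keeps the numbering sequential when categories are added or removed during local adaptation.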

Section E: Other asset-based indicators
Purpose. Accumulation of assets is strongly influenced by household income levels. Poorer households use income to meet basic needs and have little extra to invest in durable assets. Measuring the value of certain types of consumer durable assets can signal differences in relative poverty, so that a complete valuation of all household assets is not necessary. Specific objectives related to this section are to record the number of selected consumer assets owned by the household by asset type, and to assess the current market value of these selected assets. Table 4.11 shows section E of the questionnaire.
Issues. Asking households to itemize and value their durable assets can be a sensitive topic. The list of assets is limited, as much as possible, to observable assets. The list also excludes items that are part of a business owned by a household member where the asset can be considered inventory. Inventory can be items that were either purchased with the intent to be resold or used to make products to be sold.
If a household owns a radio or refrigerator that is located at the business but is not for sale, this may be counted as a household asset. Finally, the value of an asset is the money the household could receive from selling it at the current time. Interviewers may need to probe to establish an accurate value by asking how old the item is and whether it is in good working order.
In the case where new clients may have already received a loan from the MFI, all questions related to assets need to be screened to eliminate any purchases that were made as a result of the loan. This is best done by asking what items were purchased from the MFI loan and excluding them.
Item E1: Size of landholdings. Land ownership is a good indicator of wealth in many developing countries. The amount of land owned by a household refers to the size (acres, hectares, or other measure) that is owned by all household members. The land may be broken into subcategories to represent differences in its use or relative value. If households are likely to know the value of their land at current market rates, and are likely to state the amount accurately, the value of landholdings can also be asked.

Adapt: Determine the appropriate types of landholdings to measure (agricultural or nonagricultural, cultivated or uncultivated), with standardized definitions for each (see
In many areas where land is owned communally, questions measuring ownership or access to land may not be a good means of differentiating levels of poverty.
Item E2: Value of livestock assets (nos. 1-4). In many countries animals can be important assets for the household. Sometimes, however, households may be reluctant to number or value their animals. Also, areas that are primarily urban are less likely to have large animal assets.
Adapt: Determine which, if any, animals can indicate relative household wealth and include them in table E2. Include up to four categories.
Item E2: Value of transportation-related assets (nos. 5-9). In many countries, ownership of means of transportation can delineate differences in relative poverty. Bikes, motorcycles, and other motorized vehicles vary in degree of ownership from country to country. People in mountainous regions may own fewer bicycles; people in urbanized areas, relatively more.
Adapt: Screen the list of transportation assets to include only those used in the areas surveyed.
Few households owned vehicles or bicycles in Kenya, but the latter were an indicator of relative wealth in India.
Item E2: Value of appliances and electronics (nos. 10-16). All major appliances and electronics are considered good indicators for differentiating relative poverty levels. Not all will be appropriate for every survey area, so the list should be amended to reflect only what is realistic.
Adapt: Screen the list of appliances to include only those found in the areas surveyed.
In India, ownership of electric fans signaled relative wealth. In the Kenyan highlands, few households owned fans, but many owned televisions. In Nicaragua, most surveyed households owned at least one television and its value was a significant determinant of the household's relative wealth.
In a country where many appliances can be purchased on credit, the value of assets can be discounted by the amount of credit owed on them to better differentiate the buying power of different households.
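The valuation rules in this section (current resale value, no business inventory, no items bought with the MFI loan, and an optional discount for credit still owed) can be combined in one calculation. The sketch below is illustrative only: the field names are hypothetical, and flooring an item's net value at zero is an assumption not stated in the manual.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    resale_value: float                 # what the household could get by selling it now
    credit_owed: float = 0.0            # outstanding credit on the item, if any
    is_inventory: bool = False          # bought for resale or to make goods for sale
    bought_with_mfi_loan: bool = False  # purchased as a result of the MFI loan


def net_asset_value(assets):
    """Total value of countable household assets, net of credit owed."""
    total = 0.0
    for asset in assets:
        if asset.is_inventory or asset.bought_with_mfi_loan:
            continue  # excluded from the household's asset list
        # Assumption: an item never contributes a negative value.
        total += max(asset.resale_value - asset.credit_owed, 0.0)
    return total


assets = [
    Asset("radio", resale_value=20.0),
    Asset("television", resale_value=150.0, credit_owed=50.0),
    Asset("sewing machine", resale_value=80.0, bought_with_mfi_loan=True),
    Asset("shop stock", resale_value=200.0, is_inventory=True),
]
print(net_asset_value(assets))  # 120.0
```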

Customizing the questionnaire
The questionnaire for this study was tested in four different countries and found effective in measuring the relative poverty of households. However, the standardized form of the questionnaire will require certain adaptations to fit local circumstances. Customizing the standardized questionnaire will be required for two reasons. First, in adapting the response categories to standardized questions, evaluators will need to reword some sentences. Second, the researcher may want to include location-specific indicators that capture a very specific local aspect of poverty. A significant local indicator in Nicaragua, for example, is remittances from overseas. In either case, additions of local indicators must be made very sparingly.
This tool limits the researcher to no more than five additional indicators to capture location-specific poverty measures. One means of limiting the choice of indicators is to only consider adding those characteristics mentioned by local residents but not covered in the standard questionnaire.
In addition, for indicators to be used in principal component analysis, responses must be either numerical (i.e., represent a quantity or value) or have coded response categories that assign a "1" to the lowest category of well-being and sequence all responses in ascending order to reflect progressive improvement in well-being. No poverty indicators at the household level should record "yes/no" or "have/don't have" responses and no indicators should include response codes for "don't know" or "not applicable."
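Some of these coding requirements can be checked mechanically before fieldwork. The sketch below is a hypothetical helper, not part of the tool: it checks that codes run from 1 upward without gaps and that no "don't know" or "not applicable" category is present. Whether the categories really ascend in well-being cannot be checked by a program and still has to be reviewed by the survey team.

```python
EXCLUDED_LABELS = {"don't know", "not applicable"}


def check_coded_indicator(codes):
    """Return a list of problems for a {code: label} mapping; empty means none found."""
    problems = []
    keys = sorted(codes)
    if keys != list(range(1, len(keys) + 1)):
        problems.append("codes must run from 1 to n with no gaps")
    labels = {label.strip().lower() for label in codes.values()}
    if labels & EXCLUDED_LABELS:
        problems.append("remove 'don't know' and 'not applicable' categories")
    if labels == {"yes", "no"}:
        problems.append("yes/no responses should not be used as household poverty indicators")
    return problems


dwelling_condition = {1: "dilapidated", 2: "in need of repairs", 3: "in good condition"}
has_radio = {0: "no", 1: "yes"}

print(check_coded_indicator(dwelling_condition))
print(check_coded_indicator(has_radio))
```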

Guidelines for writing well-worded questions
A questionnaire is a standardized form for collecting data from respondents for the purpose of measurement and quantitative analysis. It contains questions that are to be asked in the same way to all respondents, with answers recorded using standardized sets of response categories.
Questionnaire design is more of an art form than a scientific process. The quality of the questionnaire depends on skill and judgment, a clear concept of the information needed, how this information will be used, and an awareness of possible sensitivities of respondents. Good questionnaires are often developed in stages and involve extensive pretesting. Characteristics of good questions are listed below.
• The wording of a question determines whether the researcher and the respondent interpret the meaning of the question in the same way. No single wording of a question is correct. Instead it is important to understand clearly what effect a particular wording can have on the response.
• The words used in a question should be familiar to the interviewer and the respondent. They should also correspond to local word usage and practice. Words used in questions should not be ambiguous, i.e., not have more than one possible meaning.
• Leading questions tend to suggest an expected answer to the respondent. "Did you eat three meals today?" is more leading than "How many meals did you eat today?" Likewise, a question that introduces bias includes words or phrases that indicate approval or disapproval. "Can you afford a telephone?" introduces more bias than "Do you have a telephone?"
• To avoid implicit alternatives, questions should state clearly all relevant alternatives to a question unless for some reason this is not appropriate. Also, in expressing alternatives, the order of presentation can affect responses, since those listed last tend to be chosen.
• To avoid unnecessary estimates, phrase questions in a way that allows the researcher to make calculations later. Asking the household how much sugar it consumed in the past week is easier and more likely to be accurate than asking how much it consumed in the past year.
• A double-barreled question is worded in such a way that respondents are required to give two answers, when a place is available for only one response. For instance, asking in the same question how much rice and sugar the family consumed in the past week would be difficult to answer and interpret.
• Finally, maintain a clear frame of reference. Keep questions specific to the household being interviewed. Instead of asking whether households in the area had enough food to eat in the past month, which elicits an opinion or subjective assessment, it is more accurate to ask only whether the household being interviewed had enough to eat.
The respondent must be able and willing to answer the question in the way it is posed. The respondent may have trouble answering some of the questions either because he or she is uninformed or because he or she is forgetful. In this case, it may be necessary to involve other household members in determining the correct answer.
Unwillingness to answer a question can show up in two ways: the respondent either refuses to provide an answer or purposely provides a wrong answer. The likelihood of this occurring can be reduced if respondents have a positive perception of the interviewer and consider the information to be needed for a legitimate purpose. Providing a good introduction and not rushing the interview can help to relax the respondent.

Pre-coding the questionnaire
The standardized questionnaire developed for this manual is formatted to provide pre-coded responses to qualitative questions. Coding refers to assigning numbers to each response or category of responses for a given question. Qualitative questions on the questionnaire form are closed, that is, categories of responses are identified and a number for each is given on the questionnaire. Codes for different responses or categories of responses have been structured so that all possible responses can be categorized into one of the predetermined choices. This methodology ensures that no overlap exists between response categories and codes. In order to better support data analysis, categories of responses are sequenced from lowest to highest quality or cost. These principles should be followed when survey questions are adapted to fit local conditions.
Pre-coding questionnaires greatly accelerates the process of entering data into a computer and analyzing it with statistical procedures. Computer-based statistical analysis requires codes in numeric form. The questionnaire provides a code box for each question. The list of response categories and their associated codes for text questions are located after or below the actual question.
If the coding of a question needs to be adapted for local conditions, the following rules must be adhered to at all times:
• Coded responses are numbered and presented sequentially.
• Coded responses are located as close to the question as possible, and placement is consistent throughout the questionnaire (always at right, or always at left).
• Code boxes for each question are easily distinguishable so that interviewers are not confused as to which one to use for a given question.
• Coded responses are all-inclusive, so that the interviewer will not need to write in a response that does not fit the categories provided.
• Codes for "other," "don't know," or "not applicable" are not included.
• Quantitative responses are usually not coded; recording the actual age of an individual is preferable to assigning codes to age groups and recording these.
• "Yes" or "no" responses are coded with "no" as 0 and "yes" as 1.
• The number of code boxes provided for each question matches the highest number of places possible for an answer. For instance, a coded question with more than nine response codes requires two boxes.
• The choice of codes for table-formatted questions is located below the table, but always on the same page.
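The rule on the number of code boxes reduces to counting the digits of the highest possible code. A minimal sketch, with a hypothetical helper name:

```python
def code_boxes_needed(codes):
    """Number of single-digit boxes needed to write the highest code."""
    return len(str(max(codes)))


print(code_boxes_needed(range(1, 10)))  # codes 1-9: one box
print(code_boxes_needed(range(1, 11)))  # codes 1-10: two boxes
print(code_boxes_needed([0, 1]))        # no/yes coded 0/1: one box
```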
Coded responses need to be clearly defined for the interviewer. In addition to a brief description of each code choice on the questionnaire, a separate sheet of notes is usually needed to define exactly what each category includes. This separate sheet provides the interviewer with local translations for terminology and the meaning of categories of responses in local languages. The definitions established for what constitutes a household, or how a room is defined, can also be written out. The sheet of notes should order information in the way it is presented in the questionnaire and give the reference number of the relevant question. Pre-coding is the responsibility of those adapting the standard questionnaire to local conditions and is best done at the same time each question is adapted. Identifying meaningful, all-encompassing yet nonoverlapping response categories often involves considerable reflection. The task should not be left to junior members of the evaluation team.
Chapter 5: Training the Field Survey Team

Interviewer training is best done by the same individuals who revise and finalize the questionnaire, and, ideally, also undertake the data analysis. This method avoids any confusion regarding the intent of questions or how they are worded and translated.
Interviewer training can follow a progressive format aimed at building skills step by step. Normally, interview training takes a minimum of two to three days, with an additional two days required for pretesting. The major determinant of training length is the amount of field experience already possessed by both the interviewers and field supervisors.

Format: Brainstorming and roundtable discussion
Materials: Written summary information, flip charts to jot comments, notebooks for participants
Time: Full day

Interviewers are critical to collecting quality data. They are the individuals who must convincingly present the study to respondents, guarantee that the wording of the questions is followed, and clarify issues for confused or reluctant respondents. To do their jobs well, interviewers need a solid overview of: (i) the purpose of the study, (ii) the sampling frame to be used for identifying households, (iii) the field operations plan, (iv) the role of the interviewer and principles of good interviewing, and (v) potential sources of error. An informal presentation with opportunities to ask questions is a good format for conducting this stage of training. Examples of how to build interviewer understanding of each subject in the survey are provided below.

Discuss the purpose of the study
This session can be started by summarizing the purpose of the study and describing the roles of various organizations involved in it. Once interviewers have a general idea of the study, trainers can ensure that they gain a more thorough understanding of the purpose and importance of the field survey by asking probing questions; these questions will prompt them to consider why and how the study is being conducted. Chapter 1 of this manual describes the purpose of the study.
The following are possible questions for spurring discussion:
Q: What do we mean by poverty?
A: Most countries have very clear definitions of poverty. Calculating the poverty level of a household is very complicated. Poverty is usually measured in absolute terms, providing a measurement of income or expenditures in current terms to determine whether the household is poor. Collectively, the poverty line in a country is the cutoff annual income below which households are considered poor.

Q: What is the difference between absolute poverty and relative poverty?
A: This poverty assessment does not measure actual household poverty levels because it would exceed the scope of the study. Instead, the study compares the poverty level of a household with other households living in the same area. The interviewers in the study do this by asking questions that indicate the relative wealth or poverty of a household.
Q: What kinds of household characteristics provide clues to the relative poverty level of a household?
A: Far too many household characteristics exist to ask about them all. Many different questions were tested and it was concluded that the most informative questions were those related to household resources, either in the form of assets, type of occupation, or level of education. Other characteristics relate to how well a household can meet its daily needs, so questions are asked about the quality and adequacy of food, water, clothing, and housing. Questions are focused on characteristics that can be observed, are not too sensitive for the respondent, do not involve too much calculation, and can be answered accurately.

A: MFIs may want to have many of their clients perceived as poor, since this will result in a more favorable image of the MFI among donors and the local population. A few MFIs may advise their clients to underestimate their assets, the quality of their food, or other factors evaluated in a poverty assessment. The best way to guard against this possibility is to avoid showing MFI staff the survey questions in detail.

Discuss the sampling frame used for identifying households
In many surveys, interviewers tend to avoid extreme cases of poor and wealthy households. Discuss with the interviewers how and where the poorest of the poor live. If these individuals are homeless, discuss how they as households can be identified. Discuss how and where the wealthy live and how these households can be included in the sampling process.
Interviewers need to appreciate why interviewing the actual households sampled for the survey is so important. Client households are sampled from lists that are usually drawn up at the time of the actual survey. Although supervisors working with MFI field agents will be responsible for developing the actual lists, interviewers will be responsible for selecting client households as ordered on the list. Interviewers must also choose replacements only from the reserve lists. Discuss how these lists are created and how to decide when a replacement is needed. Describe the plan for how interviewers will communicate with the supervisor to identify a replacement client household, if needed.
Sampling nonclient households requires the interviewers to understand thoroughly the random-walk process, in case they need to find these households without a supervisor. Present the material on the random walk in chapter 3 (see page 42) as simply as possible. Test their understanding by asking how the sampling might be done in different settings and what problems they are likely to face. Talk about how these problems can be solved.
Choice of houses to count. When a direction has been chosen at random, a house will be selected out of a predetermined interval number. The interval number is fixed according to the size of the area and the number of households that will be surveyed so that there is broad coverage of the area. Generally, 1 house out of 5 can be chosen in a sparsely populated area, and 1 out of 10 to 15 in larger or more densely populated areas. All houses must be carefully counted, even shanties or temporary structures; these are likely poorer households. Buildings that are not residential houses (e.g., churches, schools, mosques, city halls) are not counted. On the street, houses are counted alternately on the left and right side; where an intersection occurs, the enumerator will go alternately to the right and left side.
Multiple households in a house. When two or three households live in the same building, a number is given to each household and one of them is selected at random (for example, the numbers are written on small pieces of paper and one is selected at random). When a building has a large number of households, each household is given a number and several households can be selected. For example, if 1 house out of 10 is selected and there are 30 households in the building, 3 households are selected.
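The counting rules above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes that the house selected is the last one counted in each interval (the 10th, 20th, and so on, for an interval of 10), which is consistent with the replacement rule but is not stated explicitly in the text.

```python
import random

def select_houses(counted_houses, interval):
    """Return every `interval`-th house from the residential houses counted
    along the walk (assumption: the last house of each interval is taken)."""
    return counted_houses[interval - 1::interval]

def select_households_in_building(household_numbers, interval, rng=random):
    """Draw households at random from one building.

    A building with only a few households yields one household; a large
    building yields one household per full interval (30 households at an
    interval of 10 gives 3 households).
    """
    n_to_select = max(1, len(household_numbers) // interval)
    return rng.sample(household_numbers, n_to_select)

# 30 residential houses, numbered in the order they were counted
houses = list(range(1, 31))
print(select_houses(houses, 10))  # the 10th, 20th, and 30th houses
```

Non-residential buildings are assumed to have been left out of `counted_houses` before the function is called, as the counting rule requires.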
Random walk in a city. For a small town (less than 4 to 5 kilometers from one side to the other), the enumerator must identify a central point in the town (such as a market place, the intersection of main roads, city hall, or a main building) where the random walk will begin.
If the city is large, only the locality or ward where MFI-sampled clients are concentrated is surveyed. Ideally, clients will live close by so that boundaries can be determined. If no clear boundary can be defined, a rough map can be drawn with the help of local MFI clients or staff. The map must show natural boundaries (such as rivers, main roads, or parks) so that different parts can be identified. A number is given to each area and some are selected at random for the survey. For example, 3 to 4 areas can be selected out of 10 areas in the city. In each area selected, a central point must be identified to begin the random walk.
Random walk in a rural area. Distances are generally far larger in rural areas and official maps are not usually available in developing countries. In this case, the administrative area that will be surveyed can be divided into several zones on the basis of natural boundaries. Several zones are selected at random. For the choice of houses, when no street is clearly marked, the interviewer must follow the direction taken at random.

Replacement households.
With the random-walk technique, the replacement nonclient household is the next household (n + 1). For example, when 1 house out of 10 is selected and the household chosen doesn't want to answer the questionnaire, the 11th house is chosen. When households are absent, the interviewer should avoid taking a replacement household immediately. Household members may only be at work or momentarily absent. In this case, the enumerator should try to come back when the household members are present (during the evening or weekend). If absent households are always replaced, the sample could underrepresent working households.
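The replacement rule can be written as a small decision function. This is only a sketch: the status labels and the `max_visits` limit are assumptions introduced for illustration, since the text says to return to absent households but does not fix a number of visits.

```python
def next_action(status, position, visits_made, max_visits=2):
    """Decide what to do after visiting the house at `position` in the count.

    status: "completed", "refused", or "absent" (hypothetical labels).
    max_visits: assumed limit on visits to an absent household.
    """
    if status == "completed":
        return ("done", position)
    if status == "refused":
        # A refusal is replaced by the next house (n + 1).
        return ("replace", position + 1)
    if status == "absent" and visits_made < max_visits:
        # Do not replace immediately; come back in the evening or weekend.
        return ("revisit", position)
    return ("replace", position + 1)

print(next_action("refused", 10, 1))  # ('replace', 11)
print(next_action("absent", 10, 1))   # ('revisit', 10)
```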
Finally, discuss with the interviewers and supervisors how to describe the area being surveyed in the terms listed at the end of chapter 3. Define the means of distinguishing between rural, semi-rural, and urban areas.

Present the field implementation plan
Interviewers will have many questions about the kind of support they will have in the field. They will want to know about vehicles, what kind of accommodation they can expect, when they will work, what kinds of materials will be provided, what expenses will be covered by per diem, and what measures will be taken if they become ill or injured. Review the schedule you have set for implementation of the field survey and the job responsibilities of each person involved in the field survey.

Define role of the interviewer and review principles of good interviewing
The following are important points to highlight about the interviewer's role and the art of good interviewing.
Be open-minded about the timing of interviews for the survey. Interviewers should be willing to schedule interviews at a time when respondents are able to meet with them. They should also be forthcoming about the amount of time required for the survey (about 20 minutes).
Be rehearsed to the point that questions can be asked using the precise wording written, while maintaining a relaxed and conversational tone of voice. The questions should also flow from one to the next without breaks while the interviewer finds his or her place or reads through the next question.

Know the questions well enough to add clarification and encouragement if the respondent is confused or hesitant to answer. If an answer does not sound confident, interviewers should gently probe to verify that the answer is well thought through.
Maintain a pleasant and clean appearance and behave politely. This means not eating during interviews, showing respect to elders, and dressing modestly. Interviewers should not accept food or gifts from a household unless it would be considered extremely rude in the locality to refuse.
Maintain a neutral stance on the questions being asked, on the survey's purpose, and on the MFI being assessed.

Discuss major sources of error in the field and how to control for these errors
Inevitably, field staff will encounter difficulties in implementing the survey as planned. How they handle these difficulties, however, can greatly reduce the likelihood of error in the data set. Some common sources of errors in the field are discussed below.
Sample selection errors. Sample selection errors can come from several sources. First, there may be a temptation to exclude certain clients or locations for reasons that do not follow the sampling frame. MFI staff may have reasons for excluding groups whose participation would bias the survey results. Within an area, interviewers may purposely skip less accessible client households or may tend to select better-built dwellings while discounting the poorest accommodations of nonclient households. All of these practices can be major sources of error.
Nonresponse errors. Nonresponse errors usually occur when households are not at home or they refuse to participate. To avoid these errors, interviewers need to revisit such households at a later time or, if the household has refused, select the next household on a reserve list for clients or the next household directly following a nonclient household.
Interviewing errors. Interviewers who conduct interviews in an awkward, tiring, or offensive manner can jeopardize the quality and extent of cooperation of the respondent. To avoid these errors, interviewers should know all of the questions thoroughly, so that they ask the questions exactly as worded. They should also know the sequencing of the questions and maintain this order at all times.
Interviewers can improve responses by helping respondents to understand the questions. These probing techniques help to motivate the respondent and also focus him or her on the specific information being asked. Finally, if any changes need to be made to the questionnaire, interviewers need to keep track of these; all interviewers then need to agree to enact the changes uniformly and simultaneously. A cap on the maximum number of questionnaires completed per day may be considered to ensure that questionnaires are not completed hastily.

Stage 2: Understand content of the questionnaire
Format:
Informal discussion with questions and answers
Materials: Copy of revised questionnaire for all interviewers, flip charts to jot comments, extra paper for trainees and trainers to take notes
Time: Half-day

Before training interviewers, all local adaptations to the questionnaire should be drafted by project staff and the edited version of the questionnaire used for training. In training staff to use the questionnaire, make sure they understand its content and how responses should be recorded. It is recommended that project staff:
• review all questions to identify their purpose
• clarify any definitions that have been adapted for local conditions
• differentiate between response choices
• describe how to record answers using codes
• explain the ordering of questions
• develop a code sheet to complement the questionnaire

In general, this stage of training is best kept informal with plenty of opportunity for questions. Any confusion or skepticism on the part of interviewers over the wording of questions, the nature of the response groups, or the flow of questions can signal problems with the questionnaire. Encourage evaluators to be forthcoming about how improvements can be made. As each section of the questionnaire is covered, ask participants to take notes and record on a flip chart special points regarding how to ask specific questions and how to interpret and distinguish between coded question categories. The sheet of notes prepared for interviewers should also have written text for them to introduce the study, together with a listing of the eligibility criteria for client and nonclient households.

Box 5.1 Interviewer reference sheet
The reference sheet of notes provided to interviewers should contain explanations of the items listed below.

Selecting households:
• instructions on random sampling
• instructions on the random walk

Screening households:
• how to introduce oneself and the study
• how to screen clients for eligibility
• how to screen nonclients for eligibility

Key definitions of:
• "a household"
• how to calculate the value of clothing and footwear
• a "meal" and "not enough to eat"
• a "dwelling"
• how to define three categories of "condition of dwelling"
• how to estimate the size of landholdings and distinguish between types of landholdings
• how to guide respondents to establish the "resale value" of assets

Format:
Small groups to translate, large group discussion to review translations
Materials: Extra copies of the questionnaire, sheets to make notes
Time: Half-day

Many MFIs operate in areas where more than one language is spoken. In most cases, respondents will be better able to understand the questions if their local language is used in the survey interview. Successful implementation of a field survey in more than one language requires working through the wording for each question in each language, with interviewers making certain that all responses consistently measure the same thing. Translation may not necessarily require that the translated version be a written one. If interviewers are highly skilled and truly bilingual, and can read English proficiently enough to translate accurately at the time of the interview, a standardized verbal translation may be adequate.

The format for this stage of training first requires breaking interviewers into small groups according to their language skills. If their local-language skills are inadequate, it may be necessary to use outside expertise for language translation. Each small group will review the appropriate wording for each question and response in each local language. Notes on words chosen can be recorded on extra sheets of paper. Once a draft translation is complete, a round-table review of each translated version by all survey personnel can check for consistency.
Once translations are agreed upon, key words for each question can be listed on a separate sheet of paper and their translation into other languages shown. This is one way to avoid having to translate the entire document. These sheets can be used as a reference during actual interviews.

Format:
Small groups of three interviewers to rotate roles of interviewer, respondent, and observer
Materials: Copies of the questionnaire and translation notes
Time: Half-day

Good training programs provide extensive opportunities to practice interviewing. Practice helps interviewers to: (i) monitor for consistency across languages, (ii) build familiarity with the exact wording and flow of questions, (iii) practice complete and accurate coding of questionnaires, and (iv) build confidence in their interviewing skills. The format for this stage of training can be varied. Role-playing by pairs of interviewers and a respondent is effective, both with and without an observer. Switching roles and partners can add variety to the needed repetitions of practice. All participants should be strongly encouraged to provide constructive feedback to their colleagues on how to improve their skills.

Format:
Travel to nonselected field site, conduct test interviews in groups with two interviewers and one observer, discuss any problem areas and needed changes
Materials: At least three questionnaires for each enumerator
Time: Full day

Pretesting a questionnaire in the field usually involves all levels of the survey team. The pretest is standard practice for finding weak points in survey questions and errors in the logistical plan, as well as identifying the need for additional field staff training. Pretesting thus provides an opportunity to make corrections before doing the actual survey. For questionnaire designers, it is a chance to see if the questions are worded appropriately and interpreted consistently. Pretesting will uncover at least a few unforeseen responses that do not fit into existing response categories in the questionnaire. These can either be used to add to the list of existing codes or to revise the definition of one or more of the established codes. The pretest also cues project managers and field supervisors on the time commitment and resources required to locate and interview respondents.
The process of pretesting involves more than finding households to ask questions. A pretest survey site should be selected where nonsampled MFI clients are located. The pretest should include an opportunity for field teams to practice random sampling at the individual client and nonclient level. It should also include a visit to local leaders to experience their reaction to the proposed interviews.
The pretest is an important training tool for the interviewers, who practice sampling methods and gain confidence in conducting interviews. Each interviewer should have enough time to conduct at least two supervised interviews, preferably three, if time permits. Supervisors monitor the interviews and completed questionnaires from each interviewer and give feedback on how he or she can improve performance.

In addition to individual feedback, interviewers will benefit from an opportunity to share their experiences in a group format, ask questions, and make suggestions for improvements. Interviewers will likely have questions related to the random-walk method of sampling nonclient households. These questions need to be addressed and the rules for implementing the technique reviewed.


Analyzing Survey Data
Transforming written information from a questionnaire into a structured electronic database involves several meticulously executed stages of work. First, electronic variables must be defined and data types determined. Second, the data must be entered into spreadsheet files, with each file clearly linked to all others. Once data are entered, they must be cleaned of errors in order to assure accurate data analysis. Successful data management requires specialized skills in computerized spreadsheet software. The standard data-entry templates provided in appendix 5 of this manual can be adapted to reflect the customized questionnaire for any given country; the person carrying out the adjustments should, however, have an understanding of how the data eventually will be used.
It is recommended that an experienced data analyst be tasked with adapting the data-entry files and that this same person supervise the individuals who enter the data. Data entry does not require specialized computer skills, but the people responsible for this task should have good typing skills and some background in managing files on a personal computer. Poor typing usually translates into more time to complete the process and more data-entry errors. Data cleaning incorporates techniques best handled by an experienced data analyst with a background in statistics. Data-entry people can then make the actual corrections.
The following sections describe in detail how data for this survey are entered into well-defined files and then cleaned of errors. This manual assumes that data analysis will use SPSS software and therefore provides guidance in data entry and data cleaning for this software.

Structuring data files
Entering all questionnaire data into the same spreadsheet file would make eventual analysis of the data inefficient and disorderly. To avoid this problem, several different spreadsheet files are defined and specific variables entered into each. Four separate files are needed to manage the data contained in the questionnaire, as described below.

Managing the Survey Data
Chapter 6

F1: Household-level data file.
This file records all data collected at the household level (all sentence-type questions with only one response per household, or all responses except those recorded in tables B1, B2, and E2 of the questionnaire).

F2: Adult data file.
This file records all data collected at the adult-member level (table B1 in the questionnaire).

F3: Child data file.
This file records all data collected at the child-member level (table B2 in the questionnaire).

F4: Asset value data file.
This file records all data collected at the individual asset level (table E2 in the questionnaire).
As noted previously, templates for these four files are provided in appendix 5; these templates can either be edited to reflect changes to the questionnaire or used as guides to create new data files from scratch.

Linking files within a relational database
Successful data management requires that all data pertaining to a particular household respondent be recorded under the identity code for that household. This is achieved by creating spreadsheet files where each row in a spreadsheet is treated as a "case" and contains information only for one household. The row always begins with the unique identity code assigned to the household. Household data can be recorded in separate files if more than one row in each spreadsheet contains information on the same household. This situation exists for questions pertaining to household members and assets. Each case within a spreadsheet file must be uniquely identified by the household identity code and an additional identity code to further identify all additional data specific to a single household. For example, if a household contains more than one adult, then the household code and a unique code for each adult are used to record information about that adult.
Linking files through overlapping case identity codes is essential to support data analysis. The researcher will need to pool information contained in individual files to conduct core analysis. Unique identity codes provide the means of linking these files together. (Additional codes, such as those for the survey area, will also help to categorize households; however, these will not be unique to each case.) The unique identity codes for the four files are as follows:
• F1: Household identification code
• F2: Household identification code + adult identification code
• F3: Household identification code + child identification code
• F4: Household identification code + asset identification code
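To make the linking idea concrete, the following Python sketch checks that each case is uniquely identified by its identity codes and then pools adult-level records (F2) into the household-level file (F1). The field names (`hh_id`, `adult_id`, `client`, `age`) and the sample rows are hypothetical and do not come from the appendix 5 templates; in SPSS the same pooling would be done through its own file-merging procedures.

```python
from collections import defaultdict

# Hypothetical rows; field names are illustrative only.
f1_households = [
    {"hh_id": 101, "client": 1},
    {"hh_id": 102, "client": 0},
]
f2_adults = [
    {"hh_id": 101, "adult_id": 1, "age": 34},
    {"hh_id": 101, "adult_id": 2, "age": 31},
    {"hh_id": 102, "adult_id": 1, "age": 52},
]

def duplicate_keys(rows, key_fields):
    """Return identity codes that appear on more than one row."""
    seen, duplicates = set(), []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates

def attach_adult_counts(households, adults):
    """Link F2 to F1 through the household code and add an adult count."""
    counts = defaultdict(int)
    for adult in adults:
        counts[adult["hh_id"]] += 1
    return [dict(hh, n_adults=counts[hh["hh_id"]]) for hh in households]

print(duplicate_keys(f2_adults, ["hh_id", "adult_id"]))  # [] when keys are unique
print(attach_adult_counts(f1_households, f2_adults))
```

The composite key (household code plus member code) is what allows several rows for the same household to coexist in F2 without ambiguity.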

General organization of SPSS
The SPSS software program, a commercially available off-the-shelf package, is a Windows-based program and can be operated largely from its menu system, complete with toolbars and icons.

Main menu bar
The SPSS main menu bar is displayed across the top of the opening window when the user first starts the program. The main menu bar is shown in figure 6.2.
It is strongly recommended that each user go through the complete online tutorial provided with the SPSS software before proceeding with this manual. The rest of the information pertaining to SPSS in this manual focuses on specific applications used to support the Microfinance Poverty Assessment Tool.
Data can be entered into the computer using a wide range of software packages. The most common means of entering smaller data sets, such as the one for this survey, is a spreadsheet program such as Microsoft Excel or a relational database program such as Microsoft Access. Many of the instructions and clarifications provided in the following section will apply to these or other spreadsheet software programs. Regardless of the data-entry program used, however, analysis will eventually require SPSS or SAS, two statistical packages that support principal component analysis.

SPSS views
In addition to its main menu, SPSS also has three windows for displaying information about data. The "Data View" window displays the actual data in spreadsheet form, as shown in figure 6.2. The "Variable View" window (figure 6.3) shows information on variable definitions. The variable view of a data file is used to add and delete variables or to change the characteristics of variables. In this view, each row summarizes information about a single variable and each column lists a characteristic of that variable. In both data and variable views, users can add, change, and delete information contained in the file.
The third type of window is the "Output View" window (figure 6.4), which actually displays the contents of a separate file. As procedures are run in SPSS, the results are automatically displayed in an output view file. In output view, the window is divided into two parts. The left section contains an outline of the output contained in the file and the right section contains the tables and charts created by the user; the outline in the left section can be used to locate and move to different output contained in the file. In figure 6.4, the small arrow shown to the left of the word "title" corresponds to the title displayed in the right section and marked by a large arrow. Users can use the scroll bar to browse the results or double click on the icon to move to a particular output.

Figure 6.2 SPSS main menu ("Data View" window)
As new procedures are run, the resulting tables and graphs are added sequentially to the output view file. The output file can be saved and reopened by assigning a name (the extension ".spo" is automatically added whenever files are saved). The tables contained within SPSS output files are transferable to most word processing programs and can easily be added directly into a summary report.

Preparation of data-entry forms and file documentation
To enter data, a variable name is needed for each type of data collected on the questionnaire. Each column in the spreadsheet is labeled with the name of one of these variables; these variable names, or labels, are usually sequenced according to the order in which they appear on the questionnaire.
Most spreadsheet and statistical software programs automate the variable creation process so that the definition of the variable can be entered at the same time that the variable label is created. The variable definition not only specifies the meaning of the variable label, it also designates the format of the variable data, lists code numbers, and provides a key for what each code represents for all pre-coded responses.
Variable type refers to whether data are numbers, dates, currencies, or strings. Most variable types in this questionnaire are numeric so that they can be analyzed using statistical procedures. All data containing letters or a mixture of letters and numbers are categorized as strings.
Variable and value labels are created in the "Variable View" window. A variable name must begin with a letter and cannot exceed eight characters. In addition to a variable name, a variable definition and value labels representing different response codes also need to be specified in the "Value Labels" dialogue box (figure 6.5). This dialogue box is accessed from the "Variable View" window by clicking on the right end of the values cell for that variable.
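The two naming rules stated above can be checked before variables are created. The sketch below tests only those two rules (first character is a letter, at most eight characters); SPSS may impose further restrictions that are not covered here. The variable names and the value-label codes shown are hypothetical examples, not codes from the questionnaire.

```python
def is_valid_variable_name(name, max_length=8):
    """Check the two rules given in the text: starts with a letter and
    does not exceed `max_length` characters."""
    return 0 < len(name) <= max_length and name[0].isalpha()

# Hypothetical value labels for one pre-coded variable.
value_labels = {
    "watersrc": {1: "lowest quality", 2: "medium quality", 3: "highest quality"},
}

for name in ["hhid", "watersrc", "1stadult", "clothingexpense"]:
    print(name, is_valid_variable_name(name))
```

Keeping the value labels in one place, as in the `value_labels` dictionary, mirrors the role of the "Value Labels" dialogue box: each code number is paired with the response it represents.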
Variable measurement records the way in which data are measured for each variable. SPSS defines data as nominal, scale, or ordinal. The terms are shown in the far right column of figure 6.3.

Figure 6.5 SPSS "Value Labels" dialogue box
Variables are analyzed according to the type of measurement used. Variables created from the questionnaire constitute three types of measurement:

Nominal data, where the number codes represent labels for categories of responses. These codes tend to identify and classify information. Examples from the questionnaire include marital status, gender, and location codes. The code numbers do not represent a systematic sequence that adheres to an underlying scale.

Ordinal data, where the sequence of number codes for a variable reflects an ordered relationship. Code responses for ordinal data are assumed to measure points along an underlying continuous function that may specify gradations in quality or cost. Examples of ordinal data are education levels of adults (least education to most education), or the quality of drinking water and toilet facilities (lowest quality to highest quality).

Ratio-scaled data, which require an absolute zero point. These data represent an actual unit of measurement, such as value, quantity, size, weight, or distance. Expenditures on clothing and footwear are an example of a scale variable. Ratio data do not require coding; the actual measurement can be recorded on the questionnaire. Whenever possible, quantifiable variables are recorded as ratio data, since this type of data permits more rigorous statistical testing.

A list of all variable definitions and value labels can be summarized in SPSS by selecting the Utilities menu and the File Info option. Figure 6.4 shows an example of this list. The printout of this list shows all variable names, their definitions, type of measurement, and all value labels for nominal and ordinal data. The list should be carefully proofed and edited before any data are entered. Data entry cannot begin until the data-entry files exactly mirror the contents of the adapted questionnaire. This means that all variable names and definitions, and all code numbering and labels, correspond to adaptations made to the questionnaire.

Entering the data
Before actual data entry begins, all data-entry personnel should practice entering actual questionnaire data. During this trial stage, each data-entry person has the opportunity to practice how to edit, save, and reopen the data files. Data can be entered twice to check for consistency and accuracy.
Data entry is best achieved by following systematic procedures to minimize data-entry errors. In general, the data-entry person should enter all data from a questionnaire before moving to the next questionnaire. An exception to this rule can be made if the data entry is to be organized by file rather than by questionnaire. In both cases, the data-entry person should make notes on each questionnaire to indicate who entered the data and which part of the questionnaire was entered.
Missing values on questionnaires require special procedures for accurate recording. In this assessment methodology all missing values will be recorded in the same way: by leaving the cell blank. This procedure is required by principal component analysis, which replaces missing values with indicator averages. Missing values can be of two types. First, in some cases, no answer is required from some households and the interviewer has purposely left the answer blank. This is not a true missing value and in the data-entry file, the cell for this variable and case should be left blank. In SPSS, a decimal point will appear in the middle of the cell to mark it as empty. The computer will treat this cell as a "SYSMIS," or system missing.
In other cases, a value is called for but none provided. This can be considered a missing value and efforts should be made to determine the correct value that belongs in the cell. If none can be found, the cell is also left blank and treated as a "SYSMIS." Under no circumstance should the number "0" be used to record a missing value, as "0" constitutes a reasonable response for many quantitative variables. Ideally, missing values should also be referred back to the interviewer to see if he or she can recall the correct responses.
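The difference between a blank cell and a recorded zero can be shown with a short Python sketch. It assumes cells arrive as text (for example, from an exported spreadsheet) and uses `None` as a stand-in for the SPSS system-missing value; the function names are hypothetical. The second function illustrates replacing missing values with the indicator average, as described above for principal component analysis.

```python
def parse_cell(text):
    """Convert one data-entry cell to a number, keeping blanks as missing."""
    text = text.strip()
    if text == "":
        return None          # blank cell: missing, never recorded as 0
    return float(text)

def replace_missing_with_mean(values):
    """Replace missing values with the average of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, nothing to average
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

cells = ["0", "", "4"]
values = [parse_cell(c) for c in cells]
print(values)                             # [0.0, None, 4.0]
print(replace_missing_with_mean(values))  # [0.0, 2.0, 4.0]
```

Had the blank been entered as "0", the average would have been pulled down to 1.33 and the household would have been treated as reporting a true zero.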

Making electronic backups
Entering data is a time-consuming process. Once entered, data will be cleaned and further prepared for analysis, all of which takes time. Successful data management entails backing up all files regularly and storing copies in several different locations for safekeeping. Once data are entered and cleaned, master copies of these data files should also be saved and safeguarded. Subsequent alterations to these original versions should be saved and safeguarded under different file names.

Cleaning the data
An effective means of detecting errors with minimum effort is to enter all data twice, each time into a different file. Cases can then be compared between the two files to see where differences occur. Because data entry is estimated to take no more than three to four days, entering data twice can be more cost-effective than searching for errors at later stages.
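The comparison of two independently entered files can be sketched as follows. This is an illustrative Python example under assumed data structures (each file held as a dictionary keyed by household code); the function name `compare_entries` and the sample values are hypothetical.

```python
def compare_entries(first, second):
    """List (case_id, variable, value_1, value_2) for every cell that differs."""
    diffs = []
    for case_id in sorted(set(first) | set(second)):
        a = first.get(case_id, {})
        b = second.get(case_id, {})
        for var in sorted(set(a) | set(b)):
            if a.get(var) != b.get(var):
                diffs.append((case_id, var, a.get(var), b.get(var)))
    return diffs

# Two hypothetical entry passes over the same questionnaires.
entry_1 = {101: {"c2": 6, "d6": 1}, 102: {"c2": 4, "d6": 0}}
entry_2 = {101: {"c2": 6, "d6": 1}, 102: {"c2": 5, "d6": 0}}

differences = compare_entries(entry_1, entry_2)
```

Each reported difference points to a questionnaire that should be pulled and checked by hand, since the comparison alone cannot say which of the two entries is correct.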
Data-entry errors can occur in several ways. First, codes that do not exist in the original questionnaire may be entered for a variable. Second, the person entering the data may key in values for a variable incorrectly. Data errors can also occur in the questionnaire at the time of the interview and remain undetected both in the field and by the person entering data. These errors can include values that are inappropriate for the question or that contradict information captured by other variables. Finally, data may be missing for variables in some questionnaires either because the respondent failed to answer the question or the interviewer failed to record the answer.

Data cleaning procedures
Data cleaning consists of a series of procedures that locate the various types of errors described above and guide the cleaners in how to make corrections where appropriate. These procedures are briefly outlined below.
Wild codes. The data set should be cleaned of all codes that do not exist for a particular variable. One method for identifying such "wild codes" is to test for frequencies on each indicator and compare the value codes for each to the answer in the original questionnaire. When wild codes are found and data assigned to them, the original questionnaire must be used to re-enter the correct code value. The SPSS procedure for frequency testing is described in the section below entitled "Using SPSS procedures to clean data."

Consistency checks. Checks on the logical patterns of answers can also be used to find data errors. Consistency checks can be done in several ways within SPSS. One method is first to filter the data set only for cases responding in a certain way, then to run a frequency test on a second variable to check for inconsistencies. For example, households that indicate they had no food shortages in the past year (item C7) would not indicate they did not have enough to eat in the past month (item C6). If the data are clean, all households answering "0" to C7 will also show a "0" response to C6. A frequency test of responses to C6 on all households answering zero to C7 would uncover any inconsistent responses.
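Both checks can be expressed in a few lines of code. The Python sketch below is illustrative only: the set of valid codes for C6 (0 through 3) and the sample households are assumptions, not values taken from the questionnaire.

```python
from collections import Counter

def wild_codes(rows, var, valid_codes):
    """Frequency count of recorded values of `var` that are not valid codes."""
    counts = Counter(r[var] for r in rows if r[var] is not None)
    return {code: n for code, n in counts.items() if code not in valid_codes}

def inconsistent_cases(rows):
    """Households with C7 = 0 (no shortage in the year) but a nonzero C6."""
    return [r["hhid"] for r in rows if r["c7"] == 0 and r["c6"] not in (0, None)]

# Hypothetical household records.
rows = [
    {"hhid": 101, "c6": 0, "c7": 0},
    {"hhid": 102, "c6": 2, "c7": 0},  # inconsistent with C7 = 0
    {"hhid": 103, "c6": 1, "c7": 3},
    {"hhid": 104, "c6": 9, "c7": 1},  # 9 is not an assumed valid code
]

wild = wild_codes(rows, "c6", valid_codes={0, 1, 2, 3})
flagged = inconsistent_cases(rows)
```

The household codes returned by either check identify the questionnaires to re-examine.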
Another method for checking inconsistencies across variables with only a few categories of responses is to run cross tabulations where responses for one variable are cross-checked in tabular form against responses for another variable. An example would be to check that households who cook with electricity (item D7) also report having access to an electricity supply (item D6). The SPSS routine for running a frequency test is discussed in the next section. Procedures for running a cross tabulation are described in chapter 8.
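A cross tabulation used as a consistency check can be sketched as below. The Python example assumes a coding of 1 = yes and 0 = no for both items, which is a hypothetical simplification of the questionnaire codes.

```python
from collections import Counter

def crosstab(rows, row_var, col_var):
    """Count cases for each combination of two categorical variables."""
    return Counter((r[row_var], r[col_var]) for r in rows)

# Hypothetical records: d7 = cooks with electricity, d6 = has electricity supply.
rows = [
    {"hhid": 101, "d6": 1, "d7": 1},
    {"hhid": 102, "d6": 0, "d7": 0},
    {"hhid": 103, "d6": 0, "d7": 1},  # cooks with electricity but reports no supply
    {"hhid": 104, "d6": 1, "d7": 0},
]

table = crosstab(rows, "d7", "d6")
# The cell (d7 = 1, d6 = 0) should be empty in clean data.
suspect = [r["hhid"] for r in rows if r["d7"] == 1 and r["d6"] == 0]
```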

Extreme-case check.
In some cases, responses to a question can seem highly improbable either because they are extreme when compared with responses given by other households, or because they seem improbable given other responses from the same household. In one case-study survey, a household with few assets, limited food supplies, a poor diet, and low expenditures on food and clothing was found to hold land assets worth nearly $500,000. Not only was the value much higher than all other households in the survey, it also seemed inconsistent with the household's other responses. It was found that the data-entry person had typed too many zeros in the landholding variable cell. Extreme cases can be identified through several techniques in SPSS. Perhaps the easiest is creating a "box plot" of variable responses, as described later in this chapter.

Correcting data errors
Procedures for correcting data depend on the source of the error. In most cases, the source cannot be determined without checking the actual questionnaire. If the response written on the questionnaire is different from the number entered in the file, then the error is from data entry. The error is corrected by changing the number in the file to that shown on the questionnaire.
If the response shown on the questionnaire is the same as that entered in the file, two scenarios are possible. First, the number may not be an error, but simply an unusual response. To verify this, look through the household's responses to other questions to see if the response is plausible for that household. If the response seems to make sense, then leave the response unchanged. The second possibility is that the response seems unreasonable, in which case the cell is emptied of the false response and treated as a SYSMIS, or system missing response. A decimal point will automatically appear in the empty cell.

Using SPSS procedures to clean data
Three common SPSS procedures for cleaning data are frequencies, descriptives, and box plots. In order to locate the case or cases with data errors, one must first select subsets of cases.

Locating cases with data errors
Selecting subsets of data files is a useful technique for data cleaning. Generally it is used to restrict analysis to only a specific group of cases. In data cleaning, it is used to locate the case or cases containing errors.
Selecting subsets of cases is easily done using SPSS menus. Begin by going to the main Data menu and selecting the option Select Cases. This opens the "Select Cases" dialogue box (figure 6.6). The default for selecting cases is set to include all cases ("Select: All cases"). To change the default, click on "If condition is satisfied," as shown in figure 6.6, making the "If" button available for selection. Click on the "If" button to open the "Select Cases: If" dialogue box (figure 6.7). In this second dialogue box, scroll down to find and highlight variables from the list provided at the left. Click on the arrow button to move variable names into the box at the right; select from the operator and number keys the components needed to build an equation setting the rule for selecting cases. Common operators used in rules are >, <, =, and ~= (not equal).
For example, "mfi = 1" selects the cases in the first MFI survey area. Click on "Continue" to return to the previous menu (figure 6.6) and check that "Filtered" is selected (the circle next to it is filled in) under "Unselected Cases Are." Click on "OK" to run the selection command. Note that clicking on "Deleted" under "Unselected Cases Are" removes from the data file all cases that have been deselected. Do not select the deleted option unless you have an unusual reason for doing so.
In this example, once the number of cases is limited to MFI cluster 1, the frequency procedure is used to check that no localities or group codes appear outside the range assigned to that area. SPSS returns to the "Select Cases: If" dialogue box, which displays at the far left of the data file a diagonal slash through the case number of deselected cases. To deselect all cases, return to the "Select Cases" dialogue box and click on "All Cases," then select "Continue" in the "Select Cases: If" dialogue box.

Figure 6.6 SPSS "Select Cases" dialogue box

Frequencies
The frequency function in SPSS can be used to determine the frequencies of different responses for a variable. To compute frequencies in SPSS, click on Analyze in the main menu, then Descriptive Statistics, then Frequencies. This will open the "Frequencies" dialogue box (figure 6.8).
Click on the variables to be analyzed from the list at the left, then click on the arrow key to move them to the "Variable(s)" box at the right. The frequency procedure can be further specified to present results as a chart (see the example in figure 6.9) or to produce specific statistical summaries. The bottom of the window contains the following buttons that can be used to refine the frequency procedure:
• "Statistics" accesses a dialogue box to select types of descriptive statistics
• "Charts" accesses a dialogue box to chart frequency distributions
• "Format" accesses a dialogue box to set the presentation format of results

To find out which case contains the missing value, use the "Select Cases: If" dialogue box (accessed from the "Select Cases" dialogue box) and set the "If" condition to "missing [cookfuel]." Run the frequency test for the household identification code to locate case codes that have missing values. Then use the Go to Case option in the Data menu to locate each case. Once the case is reviewed, locate the original questionnaire to see if the correct response is provided.
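The same two steps, a frequency count followed by a search for the cases with a missing value, can be sketched in Python. The variable name `cookfuel` comes from the text; the household codes and response values are hypothetical.

```python
from collections import Counter

# Hypothetical household records; None marks a blank (system-missing) cell.
rows = [
    {"hhid": 101, "cookfuel": 2},
    {"hhid": 102, "cookfuel": None},
    {"hhid": 103, "cookfuel": 2},
    {"hhid": 104, "cookfuel": 1},
]

# Step 1: frequency of each response, including the missing category.
freq = Counter(r["cookfuel"] for r in rows)

# Step 2: household codes of the cases where the value is missing.
missing_ids = [r["hhid"] for r in rows if r["cookfuel"] is None]
```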
Frequency charts are useful for visually inspecting the distribution of responses for a single variable, which can help identify outliers in the data. Figure 6.9 charts the distribution of per person expenditure on clothing and footwear; it also lists the mean and standard deviation for the variable. Because the variable has a large number of different values, the data were graphed in range segments.
The chart shows that the distribution is slightly skewed to the right, with two responses much larger than all others. These may or may not repre-

Descriptives
The descriptive function not only calculates almost all the statistics provided by the frequency function, it also provides a compact table of statistics. Descriptive tables are made by clicking on the Descriptive Statistics option in the Analyze menu, then Descriptives. Table 6.2 is an example of a descriptive table that shows an unusual outcome: a household spent nothing on clothing and footwear for an entire year. Inspection of the questionnaire and discussions with the field supervisor determined that the value was not erroneous, simply unusual.
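A compact descriptive summary of this kind can be approximated with the Python standard library. The sketch below is illustrative; the expenditure values are hypothetical and the statistics chosen are an assumption about which columns a descriptive table would show.

```python
import statistics

def describe(values):
    """Summary statistics over non-missing values only."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "minimum": min(valid),
        "maximum": max(valid),
        "mean": statistics.mean(valid),
        "std_dev": statistics.stdev(valid),  # sample standard deviation
    }

# Hypothetical yearly clothing and footwear expenditure, one missing value.
clothing = [40, 0, 25, None, 35]
summary = describe(clothing)
```

A minimum of zero, as in this example, is the kind of unusual outcome that should be checked against the questionnaire before it is accepted or removed.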


Box plots
In addition to frequency and descriptive tables, data can be explored through the use of box plots. Instead of plotting actual values, the box plot shows the median, 25th percentile, the 75th percentile, and values that are far removed from the rest. In figure 6.10, the thick line near the middle of each box represents the median for each group of cases. The box area represents the range in which 50 percent of all cases in each group fall. The box plot includes two categories of cases with outlying values. Cases with values more than 3 box-lengths from the upper or lower edge of the box are called "extreme values," and are marked with an asterisk (*). Cases of values between 1.5 and 3 box-lengths from either edge of the box are called "outliers" and are marked with the letter "O." Box plots can be used to clean data by identifying extreme outliers. Box plots are also useful for comparing the distribution of values among several groups. The box plot in figure 6.10 graphs the number of adults and children in each house by grouping data according to client status. As the graph shows, for both clients and nonclients the median family size is roughly five, with only three outliers and no extreme values detected. These results would not indicate a data-error problem.
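The box-length rule for outliers and extreme values can be sketched numerically. The Python example below is a rough illustration: it uses the "inclusive" quartile method from the standard library, which is an assumption and may differ slightly from the quartile definition SPSS applies, and the land values are hypothetical.

```python
import statistics

def classify_outliers(values):
    """Flag values beyond 1.5 and 3 box-lengths from the edges of the box."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    box = q3 - q1  # box length (interquartile range)
    result = {"outliers": [], "extremes": []}
    for v in values:
        distance = max(q1 - v, v - q3, 0)  # distance outside the box, 0 if inside
        if distance > 3 * box:
            result["extremes"].append(v)
        elif distance > 1.5 * box:
            result["outliers"].append(v)
    return result

# Hypothetical land values, including one entry with too many zeros.
land_values = [200, 300, 350, 400, 450, 500, 600, 1200, 500000]
flags = classify_outliers(land_values)
```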
Box plots and other graphs are created by selecting the Boxplot option from the Graphs menu. Choose "Simple" as the graph style and "Summary for groups of cases." This will open the "Define Simple Boxplot" dialogue box shown in figure 6.11. Choose the variable to be plotted and the groupings to distinguish cases.

Suggested data-cleaning routines
Once errors are located, they are corrected in the original data files. This section outlines specific tasks to follow when cleaning data for each of the four types of data files. The list is not exhaustive; the tasks are simply presented as examples of how data can be cleaned. Additional checks will certainly be needed that apply these concepts to test for errors in different variables.

Household data file (F1)
Conduct frequency tests on all variables with a limited number of response values. These include all variables with categories of defined responses. Only variables measuring an actual number need scrutiny (see items C2, C4, C5, C9, and D1 in the questionnaire).
Verify that no household has reported an unusually low or high number of meals in the past two days (item C2). Use a frequency table or box plot.

Adult data file (F2)
The adult file repeats the client status variable from A6. Verify that the households listed as MFI clients are actually clients of the MFI. This can be done by checking that all households listed as MFI clients also report at least one member as a client in table B1 in the questionnaire. The reverse procedure can be used to check that households listed as nonclients are actually nonclients. This can be done by selecting only cases of nonclients and creating a frequency table for adults who are members.
Create frequency tables on all variables with a limited number of response values. These include all variables except age and expenditures on clothing and footwear. Create descriptive tables or box plots to check for outliers for these two variables. Any cases where age is recorded as under 15 would indicate an error. Ages over 100 would also be questionable. Clothing expenditures well above the range of most households should also be checked to verify that these households also indicate a higher level of wealth in their responses on food consumption, housing, and asset ownership.
Verify that each household head has been correctly identified. No individual with an ID code of "1" (for head of household) should have a response under the variable for "relation to head," and no household should have more than one member with an ID code of "1." Choose the Select Cases option under the Data menu to filter only cases with an individual ID code of "1," or head of household, then create a frequency table using the variable "relation to head."
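The two head-of-household checks can be sketched as follows. This Python example is illustrative; the field names `member_id` and `relation` and the sample records are hypothetical stand-ins for the variables in the adult file.

```python
from collections import Counter

# Hypothetical adult records; relation is None when left blank.
adults = [
    {"hhid": 101, "member_id": 1, "relation": None},
    {"hhid": 101, "member_id": 2, "relation": 2},
    {"hhid": 102, "member_id": 1, "relation": 3},     # head with a relation code
    {"hhid": 103, "member_id": 1, "relation": None},
    {"hhid": 103, "member_id": 1, "relation": None},  # second head in same household
]

# Filter only cases with an individual ID code of 1 (head of household).
heads = [a for a in adults if a["member_id"] == 1]

# Check 1: no head should have a response under "relation to head".
heads_with_relation = [a["hhid"] for a in heads if a["relation"] is not None]

# Check 2: no household should have more than one head.
head_counts = Counter(a["hhid"] for a in heads)
multiple_heads = [hh for hh, n in head_counts.items() if n > 1]
```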

Child data file (F3)
Create descriptive tables or box plots to check for outliers for the two variables of age and expenditures on clothing and footwear. Any cases where age is recorded as over 14 would indicate an error. If found, select cases where "age > 14," and run a frequency test for the household ID code. As before, check that any unusually high expenditure levels on children's clothing and footwear coincide with higher expenditures for other children and adults in the same household.

Asset data file (F4)
Ownership of assets reported by households is likely to vary considerably among households. Errors occur, in part, from households distorting information on the number and value of their assets. Errors also occur from inclusion of assets that may not be completely owned by the household (for example, either purchased through credit or rented).
Data-cleaning procedures for assets would thus screen for unusual combinations of information. For example, a household owning four

Working with Data in SPSS
Chapter 7

Specific SPSS skills and techniques are required to prepare data for analysis once it has been cleaned. The data contained in the four separate files described in chapter 6 are first combined into a single file, an expanded version of the F1 household file. This is achieved in SPSS by using the procedure for aggregating data, followed by the procedure for merging files. Data recorded about adults, children, and assets are then used to create new, aggregated variables that record information at the household level. Once all data for the analysis is contained in a single file, several new variables are calculated from existing ones. The SPSS function for transforming data is used for this task.

Methods for aggregating data to generate new variables in SPSS
The data files for adult members of households (F2), child members of households (F3), and the summary of individual assets (F4) all contain data that needs to be aggregated at the household level. This process is required to create household-level variables that can be analyzed with other variables already existing at the household level. For example, if the object is to know the number of adults in each household who can write, this information can be created from the adult file by counting the number of adults who answer yes to "can write" in each household. However, the result is a number that is measured at the household level and therefore no longer fits in the adult file. The aggregation function calculates the new variable and the merge function transfers the new variable to the household file.
Aggregating data from the individual or asset level to the household level requires several steps. First, the type of variable to be created from each is defined and an SPSS function is set to calculate it. Second, the newly created variable is saved in a temporary file. Finally, the temporary file is merged with the household file by matching the household codes for each case.

SPSS aggregate data function
The aggregation process in SPSS requires that aggregate variables measured at the household level be temporarily saved in new files. The aggregate data function is an option in the Data menu. In the "Aggregate Data" dialogue box (figure 7.1), click on the variable name for household code in the list at the left and then click on the arrow key to move it to the box at the right labeled "Break Variable(s)." The "break variable" is the level at which data will be aggregated. All aggregations will use the household code as the break variable and all newly formed variables will be at the household level.
Move the cursor to the same variable list and highlight the variable to be aggregated. Use the lower arrow key to move this variable name to the box labeled "Aggregate Variable(s)." The "Name & Label" and "Function" buttons are now available. Click on "Function" to open the "Aggregate Data: Aggregate Function" dialogue box (figure 7.2).
Figure 7.1 SPSS "Aggregate Data" dialogue box

In the "Aggregate Function" dialogue box (figure 7.2), specify how to calculate each aggregated variable. Select "Sum of values" if a total number is needed for each household, "Mean of values" for the average, or "Percentage above" or "Percentage below" for a specific percentage cutoff. (The latter two options require that a cutoff value for the percentage be entered. The alternatives "Percentage inside" or "Percentage outside" require that a range of percentage values be entered.) Select "Number of cases" for variables where the number of occurrences within each household needs to be counted. In the example in table 7.1, the function "Mean of values" was selected to calculate the average age of adults in the household.
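The idea of a break variable combined with an aggregate function can be sketched in Python. The example is a simplified illustration, not the SPSS implementation; the helper `aggregate`, the field names, and the adult records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def aggregate(rows, break_var, source_var, func):
    """Group non-missing values of source_var by break_var and apply func."""
    groups = defaultdict(list)
    for r in rows:
        if r[source_var] is not None:
            groups[r[break_var]].append(r[source_var])
    return {key: func(vals) for key, vals in groups.items()}

# Hypothetical adult-level records for two households.
adults = [
    {"hhid": 101, "age": 40, "can_write": 1, "clothing": 30},
    {"hhid": 101, "age": 30, "can_write": 0, "clothing": 20},
    {"hhid": 102, "age": 55, "can_write": 1, "clothing": 15},
]

mean_age = aggregate(adults, "hhid", "age", mean)            # "Mean of values"
total_clothing = aggregate(adults, "hhid", "clothing", sum)  # "Sum of values"
num_write = aggregate(                                       # count of "yes" answers
    adults, "hhid", "can_write", lambda vals: sum(1 for v in vals if v == 1)
)
```

Each result holds one value per household code, which is why the output belongs in a household-level file rather than in the adult file.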

Aggregating old variables to generate new variables
Variables can be aggregated by a wide range of functions. The common methods used in this study record the value for each household case as one of the following: (i) mean, the average of all individual or asset cases, (ii) sum, the total of all individual or asset cases, or (iii) count, the number of occurrences of a particular response or condition. Individual-level indicators to be aggregated at the household level are listed in tables 7.1 and 7.2. Asset-level indicators to be aggregated at the household level are listed in table 7.3.
It is easy to make errors in the aggregation process if the steps involved are not well thought through. The far left column lists the original variable used to create an aggregation. These are placed in the "Aggregate Variables" box of figure 7.1. The middle column defines and names the output indicator that will be created for each household case. The new indicators will be saved in temporary output files. The far right column describes the procedures to follow in SPSS.

Figure 7.2 SPSS "Aggregate Data: Aggregate Function" dialogue box

Aggregating data on assets first requires computing new variables that represent the sum of the total value of assets. The following three variables need to be computed:
• total value of livestock = sum of asset codes 1 through 4
• total value of transportation-related assets = sum of asset codes 5 through 9
• total value of appliances and electronics = sum of asset codes 10 through 15

No aggregation is needed to create output indicators such as the value of televisions; however, the transfer of data follows the same procedure. To be certain that the variable created can be recalled at a later time, click on "Name & Label" at the bottom of the "Aggregate Data" dialogue box. In the "Variable Name and Label" dialogue box (figure 7.3), fill in an identifying name and label for the variable being created.
| Original variable | Output indicator | SPSS procedure |
| --- | --- | --- |
| … (asset code = 1, 2, 3, 4) | … (VALANIML) | … Select Cases) in which the asset code is < 5, then select "Sum of values" from the "Aggregate Function" dialogue box |
| Value of individual transport assets (asset code = 5, 6, 7, 8, 9) | Total value of transportation assets (VALTRANS) | From the Data menu, select cases in which the asset code is > 4 and < 10, then select "Sum of values" from the "Aggregate Function" dialogue box |
| Value of individual appliances and electronics | Total value of appliances and electronics (VALAPPLI) | From the Data menu, select cases in which the asset code is > 9, then select "Sum of values" from the "Aggregate Function" dialogue box |
| Value of televisions owned (asset code = 10) | Value of televisions owned (VALTVS) | From the Data menu, select cases in which the asset code is = 10, then select "Sum of values" from the "Aggregate Function" dialogue box |
| Value of radios owned (asset code = 15) | Value of radios owned (VALRADIO) | From the Data menu, select cases in which the asset code is = 15, then select "Sum of values" from the "Aggregate Function" dialogue box |
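The select-then-sum pattern used for the asset indicators can be sketched in Python. The asset code ranges follow the text (1 through 4, 5 through 9, 10 through 15, and 10 for televisions); the helper `total_value`, the field names, and the asset records are hypothetical.

```python
from collections import defaultdict

def total_value(assets, codes):
    """Sum asset values per household, restricted to the given asset codes."""
    totals = defaultdict(float)
    for a in assets:
        if a["code"] in codes and a["value"] is not None:
            totals[a["hhid"]] += a["value"]
    return dict(totals)

# Hypothetical asset-level records.
assets = [
    {"hhid": 101, "code": 2, "value": 150.0},
    {"hhid": 101, "code": 4, "value": 50.0},
    {"hhid": 101, "code": 10, "value": 80.0},
    {"hhid": 102, "code": 6, "value": 200.0},
    {"hhid": 102, "code": 15, "value": 10.0},
]

valaniml = total_value(assets, range(1, 5))    # livestock: codes 1 through 4
valtrans = total_value(assets, range(5, 10))   # transport: codes 5 through 9
valappli = total_value(assets, range(10, 16))  # appliances: codes 10 through 15
valtvs = total_value(assets, [10])             # televisions only
```

Households that own nothing in a category do not appear in that result at all, which is why these indicators show up as missing after merging and are later recoded to zero.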

Saving output as new files
In most cases, more than one individual variable can be aggregated at a time and the resulting output indicators saved in the same output file. However, if the aggregation procedure requires that only a subset of cases be selected to complete the aggregation, then separate output files are needed for each aggregation using the "Select If" procedure. Each output file requires a unique name. The entire aggregation process will result in the formation of nearly a dozen temporary output files. Use file names that will enable users to remember the contents of each. Saving the output for each group of aggregated variables requires making a new file. In the "Aggregate Data" dialogue box (figure 7.1), click on the small circle next to "Create new data file" and then click on the "File" button. This displays the "Output File Specification" dialogue box (figure 7.4), where a name for the file containing the new variables can be entered. Use a name that will be easily recognizable at a later time.

Merging files
In the previous section, guidelines were given for creating many new temporary files containing variables of aggregated data. In SPSS the procedure for merging files is used to combine variables from two different files. Merging different variables for the same cases requires that both files share a common variable (the household code) with unique values for each case and be sorted so that the shared variable is listed in the same sequence in both files (for example, smallest to largest ID code).

Files can be sorted by selecting Sort Cases from the Data menu (figure 7.5). Select the household ID code variable and move it to the "Sort by" box, then click on "Ascending," followed by "OK." Save the sorted file.
Once the household file and temporary aggregated data file are sorted by household ID code in ascending order, select Merge Files from the Data menu, then click on "Add Variables." The "Add Variables from: Read File" dialogue box opens. Select the household file to which you want to add the new variables. Click on "Open," and the "Add Variables from…" dialogue box opens (figure 7.6).
SPSS automatically identifies the common variables, which always include the household ID code. Check the box "Match cases on key variables in sorted files," then click on the ID code and move it to the "Key Variables" box by clicking on the arrow button next to it. Click on "OK." Check that the variables have been correctly merged and then save the new file under a different name.
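Matching cases on a key variable can be sketched in Python as below. This is an illustration of the concept rather than of the SPSS procedure; the helper `add_variables` and the sample data are hypothetical.

```python
def add_variables(households, aggregated, new_var):
    """Return household rows sorted by ID with one aggregated variable added."""
    merged = []
    for hh in sorted(households, key=lambda r: r["hhid"]):
        row = dict(hh)  # copy, so the original household file is unchanged
        row[new_var] = aggregated.get(hh["hhid"])  # None if no match was found
        merged.append(row)
    return merged

# Hypothetical household file (unsorted) and aggregated output keyed by household ID.
households = [{"hhid": 102, "mfi": 1}, {"hhid": 101, "mfi": 1}, {"hhid": 103, "mfi": 2}]
numwrite = {101: 1, 102: 2}

merged = add_variables(households, numwrite, "numwrite")
```

Household 103 has no matching aggregated case, so its new variable is missing; this mirrors the missing values that the recoding step later replaces with zero.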
To create a complete file at the household level, all variables listed in tables 7.1, 7.2, and 7.3 should be aggregated and merged with household-level data.

Transforming variables to recode data
A variable can be recoded to create a new variable or to replace the variable that already exists. Recoding is sometimes required to conduct computations. Recoding "SYSMIS" values into "0" will prevent a case from being excluded from a computation or analysis. For instance, to add the responses from C2 to the responses from C3, all "SYSMIS" values can first be changed to "0" and then added to create a new variable measuring the number of meals eaten by all households.
From the Transform menu, select Recode, then Into Same Variables. Specify the variables to be recoded in the dialogue box. A selection rule can be specified by clicking on "If" and following the menu prompts to specify the rule. The "Recode into Same Variables" dialogue box (figure 7.7) allows the operator to reassign values of existing variables or to collapse ranges of existing values into new values. Recoding into the same variable can also be used to transform missing codes into another value.
To recode into the same variable, click on "Old and New Values" and use the displayed dialogue box (figure 7.8) to indicate which old values are to be changed and what their new values will be. After selecting old and new values, click on "Continue," then "OK" to run the recoding procedure. In figure 7.8, the value "SYSMIS" is changed to "0."

Only a few specific variables will need to be recoded, all of which result from the aggregation of data at the household (F1) level. For each aggregated indicator listed below, recode the old value of "SYSMIS" ("system missing") to the new value of "0," making the changes into the same variable, as follows:
• from the aggregation of adult indicators listed in table 7.1, recode "SYSMIS" to "0" for EDUC1, EDUC2, etc., NUMWRITE, OCCUP1, OCCUP2, etc., NUMUNEMP, NUMCLIENT, ADUEXPEN, and FHH
• from the aggregation of child indicators, recode "SYSMIS" to "0" for NUMCHILD and KIDEXPEN
• all asset value indicators are then aggregated to the household file (listed in the middle column of

Data procedures for computing new variables
Early in the analysis, the researcher will need to compute new variables from existing variables. Table 7.4 shows which computations are needed for creating new variables in the household file.
To compute new variables in SPSS, select Transform from the main menu, then Compute. The "Compute Variable" dialogue box (figure 7.9) opens. Click the "Target Variable" box and type the name of the new variable you are computing. Then click in the "Numeric Expression" box and type the variables to be used in computing the new variable. The formula can either be typed in or compiled by clicking on variables from the variable list, followed by clicking the arrow button, then clicking the "Functions" button. Once the variable is created, open the "Variable Label" dialogue box by clicking on "Type & Label" and enter a variable definition and any value labels, if appropriate.
In some cases, computing new variables may require that a condition be applied to filter the values of the existing variable or variables to be used in forming the new variable. SPSS has a special dialogue box (figure 7.10) for this purpose, which can be accessed by clicking on the "If" button shown in figure 7.9. In the displayed dialogue box, a rule for selecting or excluding specific variable cases can be written. When this is completed, click on "Continue," then "OK" to run the computation.
Special precaution should be taken when computing variables from other variables containing missing values (SYSMIS). Any number added to a SYSMIS will result in a SYSMIS. To avoid this problem, exclude these cases from the calculation through an "If" statement using the expression NE MISSING[VARIABLE NAME] (figure 7.10).
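The effect of missing values on a computed variable, and the two ways of handling it described in this chapter, can be sketched in Python. The example is illustrative; `None` stands in for SYSMIS, and the helper names and values are hypothetical.

```python
def add_missing_aware(a, b):
    """Mimic the rule that a sum involving a missing value is itself missing."""
    return None if a is None or b is None else a + b

def recode_missing(value, new_value=0):
    """Recode a missing value to new_value, leaving other values unchanged."""
    return new_value if value is None else value

# Hypothetical responses to items C2 and C3; household 102 has C3 missing.
rows = [{"hhid": 101, "c2": 6, "c3": 5}, {"hhid": 102, "c2": 4, "c3": None}]

# Default behavior: the sum for household 102 is missing.
strict = {r["hhid"]: add_missing_aware(r["c2"], r["c3"]) for r in rows}

# Option 1: recode missing values to 0 before adding.
recoded = {r["hhid"]: recode_missing(r["c2"]) + recode_missing(r["c3"]) for r in rows}

# Option 2: exclude cases with a missing value through a condition.
complete_only = {
    r["hhid"]: r["c2"] + r["c3"]
    for r in rows
    if r["c2"] is not None and r["c3"] is not None
}
```

Recoding to zero is appropriate only where zero is the correct substantive value, as with aggregated counts and asset totals; otherwise excluding the case is the safer choice.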

Summary
This chapter guided users to aggregate and merge data from the adult (F2), child (F3) and asset (F4) files with data contained in the household file. The chapter also covered guidelines for transforming variables to create new household indicators. The end result of this process was the creation of an expanded household file containing all socioeconomic and poverty indicators required to complete the poverty assessment. This file can now be used to complete the initial stages of analysis to support the creation of a poverty index.

Conducting Descriptive Data Analysis
Chapter 8

Testing for significant differences between client and nonclient households
Checking for differences in socioeconomic characteristics between clients and nonclients can improve our understanding of why the two groups differ in terms of poverty levels. These kinds of details provide a background that can contribute to the interpretation of quantitative poverty-related findings. This chapter will provide guidance in how to test for differences between clients and nonclients to identify sampling differences between the two groups.
Differences between groups can be tested using both the t-test of differences between means and the chi-square test for cross tabulations. Determining which test to apply depends on the type of data scale used to measure the variable. For nominal and ordinal data, descriptive analysis of the relationship between two variables involves cross-tabulation tables to identify patterns of responses that differ by client status.
To test whether differences in responses between sample groups are significant (that is, to determine whether the variables are not independent of one another), the chi-square test is used. To determine whether the difference in means between two groups of independent samples for an interval variable is significant, the t-test of differences between two means is used. Significant differences found in the samples can be interpreted as representative of the population. In this case, the population refers to the entire group of MFI new clients and nonclients located in the same area.

How cross tabulation is applied
When one or more variables are measured on a nominal or ordinal scale, cross tabulation is a means of identifying a relationship between two variables. Cross tabulation categorizes into cells the number and percent of cases in which different combinations of responses occur. For example, if a researcher wanted to check for differences in the principal occupations for surveyed client and nonclient households, a cross tabulation would show the absolute numbers of the two household types working in each category of occupation, as well as the percent of total households falling into each category. In this way, patterns that differ between clients and nonclients can more easily be detected. Table 8.1 shows an example of a cross tabulation that segments responses by client status and occupation types for a case study used to test this assessment tool. As the table shows, on a percentage basis, nonclients are more likely to be self-employed in agriculture, working as casual labor, or unemployed and looking for a job. MFI clients are more likely to be self-employed in a nonfarm enterprise.
These results are noteworthy because they indicate that MFI selection criteria, such as a client being engaged in a microenterprise activity, translate into a higher share of households with adults engaged in business than would be expected in the general population. If households engaged in business tend to have higher or lower poverty levels than nonbusiness households, this would likely influence the overall ranking of client households within the poverty index.
The chi-square test is then used to determine whether differences in the distribution of responses across categories are significant in a statistical sense. The chi-square test answers the question of whether the observed differences in responses between categories reflect a sampling error or indicate a relationship. In the example shown in table 8.1, a chi-square value that is significant at 0.05 (or less) suggests that a difference between client and nonclient households exists in terms of the distribution of occupation. The nature of this relationship, however, can only be discovered through inspection of the cross-tabulation table. Table 8.2 shows the chi-square results for the cross tabulation shown in table 8.1. The chi-square level of significance is less than 0.001, indicating that the difference in the pattern of occupation responses between clients and nonclients is highly significant.
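The calculation behind the chi-square statistic can be sketched outside SPSS. The following is a minimal illustration in Python, not the tool's own procedure: the counts are hypothetical (they are not the values of table 8.1), and the function name `chi_square` is an assumption made for this example. The critical values are the standard 0.05 thresholds from a chi-square table.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if occupation and client status were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: each row is an occupation, columns are (clients, nonclients)
counts = [
    [40, 120],   # self-employed in agriculture
    [110, 60],   # self-employed in nonfarm enterprise
    [50, 120],   # casual labor
]

# Standard chi-square critical values at the 0.05 level, keyed by degrees of freedom
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

stat, df = chi_square(counts)
significant = stat > CRITICAL_05[df]
print(f"chi-square = {stat:.3f}, df = {df}, significant at 0.05: {significant}")
```

A statistic above the critical value only signals that the distributions differ; as in the SPSS output, the table itself must still be inspected to see which occupations drive the difference.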

Cross tabulation in SPSS
To run the cross-tabulation procedure in SPSS, click on Descriptive Statistics in the Analyze menu, then choose Crosstabs. This will open the "Crosstabs" dialogue box (figure 8.1). Move the variable for designating client status from the list at the left to the "Column(s)" box. Move the variables to be compared by client status into the "Row(s)" box. More than one variable can be selected at a time in the "Row(s)" box, but each combination will result in a separate cross-tabulation table. To run a cross tabulation for each survey cluster, move the MFI cluster code to the "Layer 1 of 1" box.
Now click on "Statistics" at the bottom of the page to open the "Crosstabs: Statistics" dialogue box (figure 8.2). Check the box for "Chi-square" and click "Continue" to return to the "Crosstabs" dialogue box. Click on "Cells" to open the "Crosstabs: Cell Display" dialogue box (figure 8.3) and check the "Observed" box under "Counts" and the "Column" box under "Percentages." Click on "Continue" to return to the "Crosstabs" dialogue box, then on "OK" to run the cross tabulation. The results automatically appear in the SPSS "Output View" window.

Interpreting a cross-tabulation table
It is possible to enter multiple variables into a row at once when running cross tabulations, but be certain to identify only the variable for client status in the "Column(s)" box, as shown in figure 8.1. In the "Crosstabs: Cell Display" dialogue box (figure 8.3), check only the "Column" option under "Percentages," as a percentage breakdown by row would not be useful. Once the test is run, check the output file that is shown on the screen for specification errors.
Interpreting a cross-tabulation table accurately takes practice. In most cases, the absolute number given in each cell of the table provides little insight. Instead, the percentage of the total cases falling into each cell is more informative for determining differences in the distribution of responses. Percentages can be given as either a share of the total number of cases in a column or the total number of cases in a row. In this study, percentage breakdowns of column totals will most often be used where the column variable indicates whether the respondent is an MFI client or nonclient household.
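To make the idea of column percentages concrete, the sketch below computes them from a handful of hypothetical household records. The function name `column_percentages` and the category labels are assumptions for illustration only; SPSS produces the same figures when "Column" is checked under "Percentages."

```python
from collections import Counter


def column_percentages(records):
    """records: list of (row_category, column_category) pairs, one per household."""
    cell_counts = Counter(records)
    col_totals = Counter(col for _, col in records)
    # Each cell is expressed as a share of its column (client or nonclient) total
    return {
        (row, col): 100.0 * n / col_totals[col]
        for (row, col), n in cell_counts.items()
    }


# Hypothetical records: (principal occupation, client status)
records = (
    [("farm", "client")] * 1
    + [("nonfarm", "client")] * 3
    + [("farm", "nonclient")] * 3
    + [("nonfarm", "nonclient")] * 1
    + [("unemployed", "nonclient")] * 1
)

shares = column_percentages(records)
for (occupation, status), pct in sorted(shares.items()):
    print(f"{status:10s} {occupation:11s} {pct:5.1f}%")
```

Because each column sums to 100 percent, the client and nonclient distributions can be compared directly even when the two groups differ in size.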
Cross tabulation can be done at different levels of data. Further clarification of the pattern of differences between clients and nonclients may be gained by dividing data into smaller categories, such as individual survey regions. In this way, the source of differences showing up in the aggregated data may be pinpointed to a more specific relationship. At the same time, cross tabulation that is more detailed may uncover a spurious relationship that appears only when subcategories are aggregated. The question to ask when deciding to delve deeper is: "Under what circumstances does this relationship exist?" In the previous example shown in table 8.2, a test of significance at the cluster level for the same data set uncovered the results shown in table 8.3. Based on chi-square levels of significance, it can be noted that occupational differences between clients and nonclients exist in four of the five regions. The region where no differences were found was highly urban, meaning fewer opportunities for agricultural enterprises existed.

Conducting specific analysis using cross tabulations
The following list of indicators from the adult file (F2) can be used to test for significant differences between clients and nonclients using cross tabulation:
• main occupation of household adults
• education levels of household adults
• marital status of household head
The analysis for each indicator can also be repeated at the cluster level. If significant differences are found in occupation or levels of education, the likely source of these differences should be interpreted.

How the t-test is applied
For most socioeconomic indicators examined by this assessment tool, the number of possible values for a variable will be too large to make use of a cross-tabulation table. This is particularly the case for interval- and ratio-scaled variables. One way to test for significant differences between MFI clients and nonclients on interval and ratio data is to compare the means of a variable for the two different groups.
The difference between the two group means and the deviation from the mean within each group are used to derive a t-value. This value can then be compared with what is called the "critical t-value." If the absolute value of the t-value is higher than the critical t-value, the groups can be considered different.
On the other hand, if the calculated t is lower than the critical t-value, one cannot conclude that a difference exists between the two groups regarding the variable in question. If the actual t-value is above the critical t-value, the level of significance will be .05 or less.
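The derivation of a t-value from group means and within-group deviation can be illustrated as follows. This is a sketch of the standard equal-variance (pooled) form of the independent-samples t-test, using hypothetical family sizes; the function name `pooled_t` is an assumption, and the critical value 2.228 is the standard two-tailed 0.05 value for 10 degrees of freedom taken from a t-table.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t-value, degrees of freedom) assuming equal variances in both groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


# Hypothetical family sizes for six client and six nonclient households
clients = [5, 6, 7, 6, 8, 7]
nonclients = [4, 5, 4, 6, 5, 3]

t_value, df = pooled_t(clients, nonclients)
CRITICAL_T = 2.228  # two-tailed, 0.05 level, df = 10
groups_differ = abs(t_value) > CRITICAL_T
print(f"t = {t_value:.3f}, df = {df}, groups differ: {groups_differ}")
```

With the large samples used in this methodology, the critical value approaches 1.96, so much smaller mean differences than in this toy example can be significant.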

SPSS procedure for running a t-test of means
To run a t-test, click on Compare Means in the Analyze menu, then on Independent-Samples T Test. This opens the "Independent-Samples T Test" dialogue box (figure 8.4), where interval or ratio variables can be selected as test variables. The grouping variable, client status, identifies how to differentiate two groups of cases. Select only MFI client status as the single grouping variable, then click on "Define Groups" to specify two codes for the groups to be compared. Be certain that the codes used match those entered in the data file.
Tables 8.4 and 8.5 show SPSS output tables for comparing the means of two independent samples. The first table shows the number of cases from each subcategory used for the calculation. The middle columns in the table show the calculated means for the two groups of MFI clients and nonclients, plus the standard deviation associated with each. The table shows the share of household adults who can write according to client status. Because the underlying code used "0" for "no" and "1" for "yes," the resulting means can be easily translated into a percentage. Ninety-five percent of adults in client households can write compared with ninety-two percent of adults in nonclient households.
Determining whether the means are significantly different requires studying the second output shown in table 8.5. First, the table indicates whether the variances of the two groups can be considered equal (Levene's test). If the level of significance of Levene's test is greater than 0.05, the variances can be treated as equal and the calculated t-value is that shown in the row for equal variances. If it is less than 0.05, the t-value shown in the row for unequal variances should be used instead.

Figure 8.4 SPSS "Independent-Samples T Test" dialogue box
The calculated t-value for this example is 2.2 and the significance of this value is 0.03. Because this is below 0.05, the calculated t-value exceeds the critical t-value. On the basis of this result, one can conclude that MFI client households have a significantly greater percentage of adults who can write than nonclient households.

Conducting specific analysis using the t-test of means
Results of interval- and ratio-scaled data that are tested using the t-test of means can be summarized in an SPSS output file. In addition, a summary narrative sheet can be prepared to describe significant differences between clients and nonclients found by analyzing the variables listed below. Using the cross-tabulation and t-test techniques explained in this chapter, the household data file (F1) can be expanded with the following variables:
• family size
• number of children
• percent of female-headed households
• average size of landholdings
• average value of landholdings
The adult data file (F2) can be expanded with a variable for the percentage of adults who can write, and the child file (F3) with a variable for the average age of children.

Summary
In this chapter, analysis has focused on how to identify differences between clients and nonclients based on a number of socioeconomic indicators. When differences between the two groups are found to be significant, this information may suggest that the selection criteria of the MFI have resulted in the sampled groups being different in ways that are not directly related to their poverty status, but which could influence their status. These differences should be noted when interpreting the measurement of relative poverty levels of households, a measurement that will be explored in chapter 9.

Chapter 9: Developing a Poverty Index

A basic premise of this Microfinance Poverty Assessment Tool is that, within the range of poverty indicators collected through survey techniques, a subset of indicators exists which measures different aspects of relative poverty at the household level. Which combinations of indicators prove the most instrumental in measuring relative poverty in a given survey area will differ, often in ways that are somewhat predictable. In countries where poverty is extreme, indicators signaling chronic hunger tend to differentiate the relative poverty of households. In densely populated countries, ownership of land and dwellings may better signal differences in relative poverty. Cultural differences will also influence certain types of indicators.
Developing an objective measure of poverty requires first identifying the strongest individual indicators that distinguish relative levels of poverty and then pooling their explanatory power into a single index. This chapter guides users to conduct data analysis to: (i) determine which indicators are the strongest measures of relative poverty for the surveyed households, (ii) create a ranking list of these variables on the basis of their correlation with the poverty benchmark indicator (per capita expenditure on clothing and footwear), and (iii) apply these ranked indicators systematically to calculate a household poverty index.

Linear correlation coefficient
The linear correlation coefficient procedure is the primary means of filtering poverty indicators to determine which variables best appear to capture differences in relative household poverty. The strength of each poverty indicator is determined by testing the level and direction of its correlation with the benchmark poverty indicator, per capita expenditure on clothing and footwear (PCEXPEND), across a wide array of ordinal and ratio variables (see box 9.1).
The linear correlation coefficient is a statistical procedure used to measure the degree to which two variables are associated. The correlation coefficient can determine the level and direction of a relationship between two variables. Linear correlation does not require that the units used in each variable be the same. The values of the correlation coefficient range from -1.00 to +1.00, and their sign and magnitude indicate how the two variables relate to one another. A coefficient value at or near -1 indicates that the variables are inversely related, that is, a higher value for one is associated with a lower value for the other. Higher education may, for example, be inversely related to consumption of inferior food, since higher education often brings higher income, which in turn pays for better-quality food. In contrast, a value at or near +1 suggests a strong positive relationship between the two variables. For example, the number of household members may be very closely related to the number of rooms in the household. Coefficient values at or near 0 suggest that no strong linear relationship exists between variables.
The interpretation of results is based on probability theory. This theory determines the level of significance at which differences found among sample groups can be applied to the entire survey population. In the assessment tool, levels of significance are set at 0.05 or less, meaning that a minimum 95 percent confidence level is used to either accept or reject the hypothesis that the association between two variables is random. If the level of significance is found to be less than 0.05, the association between the two variables is considered significant; if the significance level is found to be less than 0.01, the association is considered highly significant.
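As an illustration of how the sign and magnitude of the coefficient behave, the sketch below computes the Pearson coefficient directly from its definition. The data are hypothetical and chosen to mirror the two examples in the text; the function name `pearson_r` is an assumption for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data for five households
household_members = [2, 3, 4, 5, 6]
rooms = [1, 2, 2, 3, 4]                 # tends to rise with household size
years_of_education = [2, 4, 6, 8, 10]
inferior_food_days = [6, 5, 3, 2, 0]    # tends to fall as education rises

r_positive = pearson_r(household_members, rooms)
r_negative = pearson_r(years_of_education, inferior_food_days)

# Rescaling a variable (a change of units) leaves the coefficient unchanged
r_rescaled = pearson_r([100 * v for v in household_members], rooms)

print(f"members vs rooms: {r_positive:.3f}")
print(f"education vs inferior food: {r_negative:.3f}")
```

The rescaled result demonstrates the point that linear correlation does not require the two variables to share units.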

Using SPSS to measure linear correlation
Correlation tables are created in SPSS by selecting Correlate under the Analyze menu, then Bivariate as the type of correlation. This will open the "Bivariate Correlations" dialogue box (figure 9.1). Highlight variables in ordinal, interval, or ratio form in the variable list at the left and move these to the "Variables" box at the right by clicking on the arrow button.
At the bottom of the dialogue box, check "Pearson" as the type of correlation coefficient, and select "Two-tailed" as the type of significance. Choose PCEXPEND, or per capita expenditure on clothing and footwear, as the first variable in the "Variables" box. Add additional variables from the indicators shown in box 9.1 in groups of not more than six to eight at a time.

Figure 9.1 SPSS "Bivariate Correlations" dialogue box

Interpreting an SPSS correlation table
Correlation tables created by SPSS are in matrix form and appear in the "Output View" window. If the variable PCEXPEND tops the list in the "Bivariate Correlations" dialogue box, the first column in the output table will show the levels of correlation between PCEXPEND and all other variables run in the procedure. Table 9.1 is an example of a correlation output table. The results shown indicate that, of the three variables correlated with per person clothing and footwear expenditure (the shaded boxes), only the first two are found to be significantly associated. These are number of days meat and rice were served, respectively. Each was found significant at less than p = 0.01, meaning that a correlation of this size would be expected to arise by chance less than 1 percent of the time.
Although the correlation coefficient for days rice was served is only 0.179, note that the correlation is still considered highly significant. The variable for days inferior foods are served is negatively correlated with expenditures, as would be expected, but the level of significance (0.376) indicates that no significant association was found between per capita expenditures on clothing and footwear and consumption of inferior food.
Using the output shown in table 9.1, the two variables for number of days meat and rice were served can be added to a filtered list of indicators measuring aspects of poverty. To complete the filtering process, all other variables listed in box 9.1 would be correlated with PCEXPEND and those registering a significant level of correlation added to the filtered list of poverty indicators.
In large data sets, such as in this methodology, even small correlation coefficients may signal an association between two variables. To verify this, check that the association is found significant (level of significance is less than 0.01).
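Why a small coefficient can still be significant in a large sample can be seen from the standard t-statistic for a correlation coefficient, t = r * sqrt((n - 2) / (1 - r^2)). The sketch below applies it to the coefficient of 0.179 reported for days rice was served. The sample size of 500 is an assumption made for this illustration, as is the function name `correlation_t`; 2.576 is the standard large-sample two-tailed critical value at the 0.01 level.

```python
from math import sqrt


def correlation_t(r, n):
    """t-statistic for testing whether a correlation r from n cases differs from zero."""
    return r * sqrt((n - 2) / (1 - r * r))


CRITICAL_01_LARGE_SAMPLE = 2.576  # two-tailed, 0.01 level, large n

t_large = correlation_t(0.179, 500)  # assumed survey-sized sample
t_small = correlation_t(0.179, 30)   # same coefficient, hypothetical small sample

print(f"n = 500: t = {t_large:.2f}")
print(f"n = 30:  t = {t_small:.2f}")
```

The same coefficient that clears the 0.01 threshold with 500 cases falls well short of significance with 30, which is why the level of significance, and not the coefficient alone, should be checked.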

Selecting variables to test for correlation
The correlation procedure should always be set up so that the benchmark indicator, per capita expenditure on clothing and footwear, appears as the first variable listed in the "Bivariate Correlations" dialogue box (figure 9.1). To keep output tables a manageable size, run separate correlation tables for each group of indicators listed in box 9.1. By always including PCEXPEND as the first variable listed, the first column of the output table will always show the correlation coefficients between the benchmark poverty indicator and all other indicators.
Output from the analysis can be summarized in a table listing all indicators tested and ordered according to the strength of association measured, noting the number of cases found with missing values. An example of this format is shown in table 9.2. Indicators registering the highest levels of significance (p < 0.01) would top the list, while indicators registering insignificant levels of association (p > 0.05), would be excluded from the list. It is important to note the sign of the correlation coefficient, which indicates whether the relationship was found to be negative or positive. This table will be used again when estimating the poverty index.
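The filtering and ordering described above can be sketched as a small routine. All values below are hypothetical except the coefficient of 0.179 for days rice was served and the significance of 0.376 for days inferior food was served, which come from the discussion of table 9.1. The field names and the function `ranked_indicators` are assumptions for this illustration, and ordering ties in significance by the absolute size of the coefficient is one reasonable reading of "strength of association," not a rule stated by the tool.

```python
def ranked_indicators(results, alpha=0.05):
    """Drop indicators with p >= alpha, then order by significance and |r|."""
    kept = [row for row in results if row["p"] < alpha]
    return sorted(kept, key=lambda row: (row["p"], -abs(row["r"])))


# Hypothetical correlation results against PCEXPEND
results = [
    {"indicator": "days rice served", "r": 0.179, "p": 0.001, "missing": 3},
    {"indicator": "days inferior food served", "r": -0.04, "p": 0.376, "missing": 5},
    {"indicator": "days meat served", "r": 0.31, "p": 0.001, "missing": 2},
    {"indicator": "household size", "r": -0.12, "p": 0.02, "missing": 0},
]

ranking = ranked_indicators(results)
for row in ranking:
    sign = "positive" if row["r"] > 0 else "negative"
    print(f'{row["indicator"]:28s} r = {row["r"]:+.3f} ({sign}), '
          f'p = {row["p"]:.3f}, missing = {row["missing"]}')
```

Keeping the sign and the count of missing values alongside each indicator preserves the information needed later when selecting variables for the poverty index.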

Using principal component analysis to estimate a poverty index
This assessment tool develops a relative poverty index by applying principal component analysis (PCA). The PCA method is applied to determine how information from various indicators can be most effectively combined to measure a household's relative poverty status. The end result of PCA is a single index of relative poverty that assigns to each sample household a specific value, called a score, representing that household's poverty status in relation to all other households in the sample. An analyst creates the index from the combination of individual indicators that correlate significantly with one another on the basis of a shared underlying poverty component. PCA is used to identify (or extract) underlying components within a group of indicators that can at least partially explain why the indicator values differ between households in the way that they do. Each component is assumed to capture a unique attribute shared by survey households. One reason households answer indicator questions differently is their relative poverty status.
If indicators are related in more than one way, then more than one underlying component will be created. However, only one component will measure a household's relative poverty. Indicators may also relate to one another due to the rural or urban setting of households or to specific regional conditions. Other possible underlying components may capture aspects related to similarities between households in education, occupation, or cultural practices.
In general, each component extracted will capture a unique attribute shared by survey households. The number of components that can be "extracted" increases with the number of indicators included in the analysis. Figure 9.2 shows how components relate to the indicator variables used to describe them.
The principal objective of using PCA in a poverty assessment is to extract the "poverty component" that can be used to compute a household-specific index of relative poverty. Hence, PCA will use first and foremost indicators that already show a strong correlation with the poverty benchmark indicator, per person expenditure on clothing and footwear.
Filtering the indicators in this way supports a stronger poverty component, one that associates most consistently and strongly with those indicators that an analyst expects to closely measure relative poverty. This component can then be treated as a "poverty index." The following sections guide users in how to apply the PCA method to most effectively measure the poverty index.

Statistical tools used in creating a poverty index
The steps for creating a poverty index using the PCA method are as follows:
1. Select a screened group of variables highly correlated with the poverty benchmark indicator.
2. Run a test model and interpret the results.
3. Revise the model on the basis of the results of prior runs until the results meet the performance requirements.
4. From a final model, save poverty component scores as poverty index variables.

Step 1: Select a screened group of indicators
Before the PCA method is applied to the data, poverty indicators must go through a series of filters to ensure that the resulting index does not represent a distorted measure of poverty. A list of all indicators correlated with the poverty benchmark indicator was created in the first section of this chapter (see box 9.1). The reduced list of indicators in box 9.1 constitutes the first screening of indicators for PCA. These indicator variables are all in ordinal and ratio scale, which is required for the PCA method. Check the list for any variables with more than 25 missing values and use these as sparingly as possible. Add the variable PCEXPEND to the list. It is now treated as any other variable within the PCA method.
The following additional filters are used to further narrow selection of variables for the PCA model:
• Limit the number of indicators used in PCA. Having fewer variables reduces the complexity of the resulting calculated components. Closely related variables that effectively measure the same phenomenon can be screened, with only the strongest added to the PCA model. For example, if all three luxury foods correlate strongly with per capita expenditure on clothing and footwear, choose only one or two of these. It is recommended that at least 10, but no more than 20, variables be used to create the poverty index.
• Balance the range of indicators to reflect different dimensions of poverty. Several indicators measuring similar aspects of poverty can be included in a PCA model; however, a heavy concentration of similar indicator types can inappropriately skew the resulting poverty index to overemphasize one aspect of poverty. To avoid this, select several indicators from each section of the questionnaire.

Step 2: Run a test model and interpret the results
Components can be extracted from a series of indicators using several different techniques; however, only one, principal component analysis, is appropriate for the poverty assessment methodology. In PCA, each underlying component that is calculated represents a linear combination of the indicator variables used in the model.
The first component is the combination that accounts for the largest amount of variance in the sample. The second component accounts for the next-largest amount of variance and is uncorrelated with the first. Successive components explain progressively smaller portions of total sample variance. All components are uncorrelated with one another. Because of this trait, only one can be considered to measure relative poverty.
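What it means for the first component to be a linear combination that accounts for the largest share of variance can be shown with a small numerical sketch. This is not how SPSS computes its solution: the example uses power iteration on the correlation matrix, a simple technique chosen here because it needs only the Python standard library, and it extracts the first component only. The three indicators, their values, and all function names are hypothetical.

```python
from math import sqrt


def standardize(column):
    """Convert a column to z-scores (mean 0, sample standard deviation 1)."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(z_columns):
    """Correlation matrix computed from already standardized columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]


def first_component(matrix, iterations=500):
    """Power iteration: (largest eigenvalue, unit-length weights) of a symmetric matrix."""
    vector = [1.0] * len(matrix)
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
    eigenvalue = sum(p * v for p, v in zip(product, vector))
    return eigenvalue, vector


# Hypothetical indicators for six households, ordered from poorest to least poor
indicators = {
    "days meat served": [0, 1, 2, 3, 5, 6],
    "number of rooms": [1, 1, 2, 2, 3, 4],
    "days inferior food served": [7, 6, 4, 4, 1, 0],
}

z = [standardize(col) for col in indicators.values()]
eigenvalue, weights = first_component(correlation_matrix(z))

# Loadings are the correlations between each indicator and the component
loadings = [w * sqrt(eigenvalue) for w in weights]

# One score per household: weighted sum of z-scores, scaled to unit variance
scores = [sum(w * col[i] for w, col in zip(weights, z)) / sqrt(eigenvalue)
          for i in range(len(z[0]))]

for name, loading in zip(indicators, loadings):
    print(f"{name:28s} loading = {loading:+.3f}")
print("scores:", [round(s, 2) for s in scores])
```

In this toy data set the component rises with meat consumption and rooms and falls with inferior food, so higher scores correspond to relatively less poor households.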
Using SPSS to generate a PCA model. You are now ready to run an initial PCA model. From the Analyze menu, select Data Reduction, then Factor Analysis. This opens the "Factor Analysis" dialogue box (figure 9.3). Select 6 to 10 indicators from the list of variables in box 9.1 that register the strongest levels of association with the benchmark indicator. Scroll down the list of variables at the left and select the highest-ranked indicators. Move them to the box on the right by clicking on the upper arrow button.
Once you have selected the variables, select the indicator from the list that distinguishes MFI clients from nonclients. Click on the lower arrow button to move this indicator to the "Selection Variable" box. Click on "Value" to choose the value representing nonclient households (designated as "0" on the questionnaire). This will restrict your initial model to include only the 300 nonclient households. The nonclient sample represents the general population and is therefore a more appropriate group to use for building the initial model.
Click on "Descriptives" to open the dialogue box shown in figure 9.4. In this dialogue box, check the box "Initial Solution" in the top part and "KMO and Bartlett's test of sphericity" in the bottom part. Click on "Continue" to return to the main "Factor Analysis" dialogue box.
In this dialogue box, click on "Extraction" to open the dialogue box shown in figure 9.5. Set the extraction method to PCA by selecting "Principal components" from the "Method" drop-down list. Under "Analyze," check the box "Correlation Matrix." Under "Display," check the box "Unrotated factor solution." Note that, in the lower part of the form, it is possible to alter the minimum value of the Eigen value or to limit the number of factors to be extracted. Select the minimum Eigen value of "1" (the default value). This value will be used when saving the final results of the model. Click on "Continue" to return to the main "Factor Analysis" dialogue box.
At the bottom center of the "Factor Analysis" dialogue box, click on "Rotation." In the dialogue box that opens, check the box for "None" under "Method." Click on "Continue" to return to the "Factor Analysis" dialogue box.
Finally, in the lower right corner of the dialogue box (figure 9.3), click on "Options." In the dialogue box that opens, under "Missing Values," select "Replace with mean." In the bottom section of the screen, note the options to sort component coefficients by size and to choose a minimum value to be shown on the screen. These options can be used later when refining the model. Click on "Continue" to return to the main "Factor Analysis" dialogue box.
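Two of the settings chosen above, restricting the model to nonclient households and replacing missing values with the mean, can be sketched as follows. The records are hypothetical. The nonclient code "0" follows the questionnaire; the client code "1", the field names, and the function name `replace_missing_with_mean` are assumptions for this example.

```python
def replace_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical household records; None marks a missing answer
households = [
    {"client": 0, "rooms": 2},
    {"client": 1, "rooms": 4},
    {"client": 0, "rooms": None},
    {"client": 0, "rooms": 4},
]

# Keep only nonclient households, as with the "Selection Variable" setting
nonclients = [h for h in households if h["client"] == 0]

rooms = replace_missing_with_mean([h["rooms"] for h in nonclients])
print(rooms)
```

Note that the mean is taken over the selected nonclient cases only in this sketch, so a variable with many missing values contributes little real variation to the model.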

Step 3: Revising the model until results meet performance requirements
PCA does not provide an easy way to generate a best-fit model for a poverty index. The approach requires trial and error and continual scrutiny of variables to determine which combination yields the most logical results. The primary strategy is to systematically screen the list of variables that could be used in the model without compromising the explanatory power of the poverty index.
The starting point for this screening is the component matrix, described in the following subsection. In addition to the component matrix, several other techniques can be used to determine how to improve the PCA poverty index model.
The component matrix. The initial output for the PCA model will include four tables: the component matrix, the common variance table, the communalities table, and the KMO-Bartlett test. Each output can be used to interpret results and refine the model. However, the most critical output for determining the composition of the poverty index is the component matrix. A positive coefficient in this matrix indicates that as the value of the indicator increases, so does the value of the component, which in this case is the relative wealth of the household. Negative coefficients indicate an inverse relationship between the indicator and the relative wealth of the household. Table 9.3 shows component loading coefficients for a sample PCA model used to calculate a poverty index. As the table shows, two components were calculated from the indicators.

Figure 9.5 SPSS "Factor Analysis: Extraction" dialogue box
The size of the absolute value of all component loadings on the first component in table 9.3 indicates that all can be considered significant explanatory indicators. The signs on the coefficients also align with expected characteristics of relative poverty. The second component in this model captures another common aspect of households and may suggest a relationship hinging more on rural households. It does not appear to consistently capture variance related to relative poverty, since for some variables the loadings carry an unexpected sign, their magnitude is insignificant, and the results do not appear consistent from one variable to the next. An analyst can improve the model's explanatory power by screening out variables that have low component loadings on the poverty component, since these do not improve the explanatory power of the index, and by adding new variables from box 9.1 to see if the addition improves or weakens the model results. Table 9.4 shows a second component matrix, one which features a few indicators that appear to contribute little to the model. The indicators "number of days inferior food served" and "number of bulls and cows" have much lower coefficients than others in the model, although the signs of the coefficients reflect the expected relationship of the indicators to household wealth. (Both of these indicators were found insignificantly correlated with the benchmark poverty indicator and are included here only for illustrative purposes.) The model in table 9.4 could be improved by removing these two indicators and re-estimating coefficients for those indicators remaining in the model. When weak variables are removed from the model, the coefficients on the remaining variables often increase in magnitude and the number of extracted components declines.
Even the most experienced analyst will run numerous combinations of variables to determine the combination of indicator variables that most appropriately explains the underlying poverty component. Analysis of results can be repeated with alterations until the resulting model appears to be the most appropriate for the survey data. Ideally, the final version will capture several dimensions of poverty (for example, food security, human resources, and asset accumulation), with no single group of measures constituting the entire measure. It is unlikely that the final model will include more than 20 indicators. If the model has been carefully screened to include only indicators of poverty, the first component is likely to explain the variance associated with poverty. As variables are systematically added or deleted, Eigen values and the associated level of variance explained by the poverty component can guide an analyst to refine the model. As variables are deleted, the Eigen value for the poverty index component will change, as will the percentage of common variance explained by the component. The change in the share of explained variance can signal whether the addition or elimination of a variable improved or reduced the explanatory power of the poverty index.
As a rule, a minimum eigenvalue of 1 is needed if the component is to be considered representative of a common underlying dimension. In table 9.5, only the first two components indicate that a common variance is being measured. The first component (in this case, the poverty index) explains 37.5 percent of total variance; the second, 12.7 percent. In general, because the model has been refined to create a measure of relative poverty, it is reasonable to expect the poverty indicator to explain the most variance.
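The eigenvalue rule and the share of variance explained can be illustrated with a short Python sketch. When PCA is run on standardized indicators, the eigenvalues sum to the number of indicators, so a component's share of total variance is its eigenvalue divided by that sum. The eigenvalues below are hypothetical: they assume a model with eight indicators and were chosen only so that the first two shares match the percentages quoted above. They are not the values in table 9.5.

```python
def variance_shares(eigenvalues):
    """Return each component's share of total variance, in percent."""
    total = sum(eigenvalues)  # equals the number of standardized indicators
    return [100.0 * ev / total for ev in eigenvalues]


def retained_components(eigenvalues, minimum=1.0):
    """Return the 1-based positions of components with eigenvalue >= minimum."""
    return [i + 1 for i, ev in enumerate(eigenvalues) if ev >= minimum]


# Hypothetical eigenvalues for an assumed eight-indicator model.
eigenvalues = [3.0, 1.016, 0.95, 0.85, 0.75, 0.6, 0.5, 0.334]

shares = variance_shares(eigenvalues)
kept = retained_components(eigenvalues)

print(kept)                                # components passing the rule
print([round(s, 1) for s in shares[:2]])   # shares of the first two components
```

Rerunning such a calculation after each addition or deletion of a variable gives a quick view of whether the poverty component's share of variance rose or fell.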
Relative size of communalities. Another means of testing the appropriateness of the poverty model is to note the relative size of communalities in the model. Communalities represent the strength of the linear association among variables and components. Statistically, they represent the same measure as R-squared in a regression analysis. The values of communalities range between 0 and 1, with higher numbers indicating that a greater share of common variance is explained by the extracted components. Communalities indicate how well the indicators combine to identify different components. Since we are interested in only one of several shared components, communalities alone do not indicate the appropriateness of a variable for the poverty index model. Improving the measures for communalities will not improve the poverty index component if the added variables correlate strongly with components other than poverty.
Some variables may contribute to the explanatory power of a poverty factor, but not account for variances captured by other common factors. As a result, variables may have low communality coefficients but still be relevant indicators for building the poverty component. In general, however, communalities close to 0 (less than 0.1) signal that the variable in question may be a candidate for exclusion in subsequent runs. Table 9.6 is an example of a communalities table.
The table shows that communalities ranging in value from 0.198 to a high of 0.652 can be considered to fall within an acceptable range; all indicators prove highly explanatory of the poverty component.
Table 9.7 is an example of how the table appears in "Output View" of SPSS. In the first cell of the right column of the table is the KMO measured for the model. In this case the number, 0.855, is within the acceptable range for a well-specified model. The chi-square test is not used in this methodology because the test will almost always show less than 0.001 significance with sample sizes as large as 500.
Step 4: Saving component scores as a poverty index variable
Once the final model for computing the poverty index is decided, the sample size used to calculate the poverty component can be increased from the 300 nonclients to the full sample of 500 client and nonclient households. This can be done within the "Factor Analysis" dialogue box (figure 9.3) by removing the MFI status variable from the "Selection Variable" box. Using the full 500 sample size, rerun the model to register the final component calculations, then verify that no unusual results occur. If the measures of good fit decline slightly, do not re-specify the model. Because the random sample of MFI clients cannot be considered an unbiased representation of the local population, MFI client cases are not used to set model specifications.
Using the final version of the PCA model, save the standardized values of the poverty component as a variable in the household data file. This is easily done from the "Factor Analysis" dialogue box in SPSS (see figure 9.3). First, click on "Scores," at the bottom of the screen to open the "Factor Analysis: Factor Scores" dialogue box (figure 9.6). In this dialogue box, check "Save as variables" and, under "Method," check "Regression." Hit "Continue." Second, open the "Factor Analysis: Extraction" dialogue box by clicking on "Extraction" in the main "Factor Analysis" dialogue box (see figure 9.5). Check "Number of factors." This will cause the box to the right to be highlighted. Enter "1" to indicate that only the first component is to be saved as a variable. Rerun the PCA model. Check to ensure that a new variable, "factor regression score," was created in the household file. Change the variable name to POVINDEX and add a variable definition such as "household poverty index."
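For readers who want to see what saving a first-component score involves, the following Python sketch reproduces the underlying arithmetic using only the standard library. It is an illustration under stated assumptions, not a description of SPSS's implementation: the function names and the six-household data set are hypothetical, the first component is found by simple power iteration on the correlation matrix, and indicators are assumed to be oriented so that higher values mean less poverty.

```python
import math
import statistics


def standardize(column):
    """Convert values to z-scores (mean 0, sample standard deviation 1)."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def first_component_scores(columns, iterations=500):
    """Return (scores, eigenvalue) for the first principal component.

    columns: one list of household values per indicator.
    """
    z = [standardize(col) for col in columns]
    p, n = len(z), len(z[0])
    # Correlation matrix of the indicators.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    # Power iteration converges to the eigenvector of the largest eigenvalue
    # (assumes the starting vector is not orthogonal to that eigenvector).
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    # Dividing by sqrt(eigenvalue) gives the scores a standard deviation of 1.
    scores = [sum(v[i] * z[i][k] for i in range(p)) / math.sqrt(eigenvalue)
              for k in range(n)]
    return scores, eigenvalue


# Hypothetical indicator values for six households.
rooms = [1, 2, 2, 3, 4, 5]
assets = [10, 20, 15, 40, 55, 80]
schooling = [2, 4, 3, 6, 9, 8]

scores, eigenvalue = first_component_scores([rooms, assets, schooling])
print([round(s, 2) for s in scores])
print(round(eigenvalue, 2))
```

The resulting scores have a mean of zero and a standard deviation of 1, which is the property the next section relies on.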

Properties of the poverty index variable
The poverty index created through principal component extraction is estimated from standardized indicator values. This standardization is performed automatically by SPSS before running PCA. The poverty index is also in standardized form. Standardizing a variable strips away the units in which a variable is measured. A standardized variable has a mean of zero and a standard deviation of 1. Figure 9.7 shows the distribution of a poverty index in standardized form. Poverty scores shown in the graph range from -2.51 to 3.72. Approximately two-thirds of households fall in the range between -1 and 1. Figure 9.8 shows the cumulative frequency of a different poverty index graphed for clients and nonclients. As the figure indicates, a fairly large margin of difference exists between the two groups except for the poorest of households, where differences between client and nonclient scores converge. For the poorest 10 percent of households, no difference is seen between client and nonclient poverty levels.
However, for all other levels of relative poverty, clients appear poorer than nonclients. This can be cross-checked with the average poverty index score for clients against nonclients. In the example shown in figure 9.8, the average nonclient score is 0.22 and the average client score is -0.13, suggesting that, on average, clients are assessed as poorer than nonclients in the same area.

Checking index results
Once the composition of the poverty index has been decided, the researcher can explore the findings, first to identify the level of relative poverty differences between clients and nonclients, and second, to verify that the poverty index differentiates relative poverty among households consistently across survey areas and against individual indicators.
To check for significant differences between relative poverty levels of clients and nonclients, run a t-test of means using the poverty index as the dependent variable and the status of clients as the independent variable. Check if the level of significance is less than 0.05. If not, then there is no significant difference between the two samples.
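The comparison of means can be sketched in Python as follows. This is a minimal illustration, not the SPSS procedure itself: it computes the unequal-variance (Welch) form of the t statistic and, because the standard library has no t distribution, approximates the two-sided significance level with the normal distribution. That approximation is an assumption that is reasonable for samples of the size used in this methodology (200 and 300 households) but is rough for the small hypothetical samples shown here.

```python
import math
import statistics


def t_test_of_means(sample_a, sample_b):
    """Return (t, approximate two-sided p) for the difference in means."""
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    std_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    t = (mean_a - mean_b) / std_error
    # Normal approximation to the t distribution (large-sample assumption).
    p = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))
    return t, p


# Hypothetical poverty index scores.
clients = [-1.2, -0.8, -0.5, -0.3, 0.1, -0.9, -0.4, 0.2]
nonclients = [0.3, 0.8, -0.1, 0.5, 1.1, 0.0, 0.6, 0.9]

t, p = t_test_of_means(clients, nonclients)
print(round(t, 2), p < 0.05)
```

A negative t value with a significance level below 0.05 would indicate that clients score significantly lower (poorer) than nonclients in this hypothetical sample.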
Explore the index further by testing for significant differences between households ranked among the poorest (where the poverty index measures less than -1.0) as well as those ranked between -1.0 and 0, between 0 and +1.0, and, finally, for the least poor households with scores above +1. Note where significant differences are found, or where differences appear strongest. Check the means in each test to determine whether clients or nonclients are measured as less poor.
The poverty index can also be used to check for differences in the relative poverty of survey sites. Compare means between clients and nonclients at the MFI cluster level. Graph the results as shown in figure 9.9 to illustrate the average poverty score by MFI branch and client status.
Check that differences in average poverty among nonclients in the different survey areas correspond with the survey team's knowledge of which areas are considered poorer and wealthier. For example, the results from a case study displayed in figure 9.9 suggest that overall wealth levels may be lower in branches 2, 3, 4, and 5, and higher in branches 1 and 6 of the MFI.
In addition, the MFI seems to attract poorer households in areas 2, 3, and 4, and attract less poor households in areas 1 and 6 than are found in the nonclient population in these localities. Interestingly, in this particular example, clients in areas 1, 5, and 6 participate in a program without a specific targeting mechanism and those in areas 2, 3, and 4 are specifically screened to include only the poorest households.

When clients and nonclients are sorted by program type for this same case study, as shown in figure 9.10, a stark contrast in depth of poverty is seen between the targeted and non-targeted programs. The targeted clients are the poorest, on average, and the nonclients located in the targeted program areas are poorer, on average, than both clients and nonclients in the non-targeted program.

Using relative poverty terciles to interpret the poverty index
Defining the poor within the local population
The creation of the poverty index assigns a poverty-ranking score to each household. The lower the score, the poorer the household relative to all others with higher scores. The scores of MFI client and nonclient households can now be compared to indicate the extent to which the MFI reaches the poor. First, however, the share of the local population likely to fall into the poorest group, as defined by the poverty assessment, must be decided. If a researcher is interested in measuring the extent to which the MFI succeeds in reaching the poorest of the local population, an appropriate definition may be the poorest 20 percent. A broader definition of the poor may include the lower half of the local population.
As explained in chapter 1, the microfinance poverty assessment methodology uses a cutoff of 33 percent to define the poorest group within the local population. This decision is based on the usefulness of categorizing local populations into terciles that can be broadly interpreted to represent the lowest-, middle-, and higher-ranked groups of households based on their relative poverty. The methodology can be adapted to include additional categorizations, such as quartiles or quintiles. Although this manual uses terciles, researchers are advised to use the categorization that makes the most sense within the national context.
Each assessment study includes a random sample of 300 nonclient households and 200 client households. To use the poverty index for making comparisons, the nonclient sample is first sorted in ascending order according to poverty index score. Once sorted, nonclient households are divided into terciles based on their score: the top third of nonclient households are grouped in the "higher"-ranked group, followed by the "middle"-ranked group and, finally, the "lowest"-ranked group. Since there are 300 nonclients, each group contains 100 households. The cutoff scores for each tercile define the limits of each poverty group.
Client households are then categorized into the three groups based on their household scores. Figure 9.11 illustrates the use of cutoff scores to create poverty terciles from nonclient households. The cutoff scores of -0.70 and +0.21 were calculated from an actual test case study example. As noted in chapter 1, each poverty assessment will use different cutoff scores to group households into terciles. The steps involved in determining and applying these scores are described in the next section.
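The derivation of cutoff scores can be sketched in Python. The function below sorts nonclient scores, splits them into three equal groups, and returns the minimum and maximum of the middle group, which serve as the two cutoffs. The function name and the nine scores are hypothetical, and the simple integer split is an assumption; SPSS may treat ties and uneven group sizes differently.

```python
def middle_tercile_bounds(nonclient_scores):
    """Return (minimum, maximum) of the middle tercile of nonclient scores."""
    ordered = sorted(nonclient_scores)   # ascending: poorest first
    n = len(ordered)
    lower = n // 3                       # first position of the middle group
    upper = 2 * n // 3                   # first position of the higher group
    middle = ordered[lower:upper]
    return middle[0], middle[-1]


# Hypothetical nonclient poverty index scores, in no particular order.
nonclients = [0.5, -0.9, 1.8, -0.3, 0.2, -1.4, 1.1, 0.0, -0.7]

bounds = middle_tercile_bounds(nonclients)
print(bounds)
```

Scores below the first value fall in the lowest-ranked group, and scores above the second fall in the higher-ranked group.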

SPSS procedures for creating poverty terciles
Step 1: Limit sample to nonclients. First, group only the 300 nonclient households into terciles. From the Data menu in SPSS, click on Select Cases, then "Select If." Use the displayed dialogue box to filter cases only for nonclients.
Step 2: Rank nonclient households to create three relative poverty terciles. Terciles of the poverty index are created by selecting Rank Cases from the Transform menu. In the "Rank Cases" dialogue box (figure 9.12), select "poverty index score" from the cases listed in the box at the left, then use the arrow key to transfer it to the "Variable(s)" box at the right. Click "Smallest value" under "Assign Rank 1 to," then click "Rank Types" to open the "Rank Cases: Types" dialogue box (figure 9.13).
Click on "Ntiles" and type in the number 3. This will segment the sample of 300 nonclients into three groups. If done correctly, approximately 33 percent of all nonclient households, or roughly 100 households, will be assigned to each of the three groups.
To verify that the ranking was done correctly, run a frequencies test on the ranking variable that is automatically created in SPSS. The name of this variable begins with the letter "N," followed by the first seven characters of the variable name for the poverty index: NPOVINDE.
Step 3: Integrate MFI client households into relative poverty groupings. Each tercile created for nonclient households contains distinct value ranges of the poverty index. The maximum and minimum values for each range can now be used to assign the MFI client households. To create poverty terciles, first select only those cases that are currently assigned to the middle poverty tercile. Click on Select Cases under the Data menu, then filter cases to only those where the poverty tercile equals 2.
From the Analyze menu, select Descriptive Analysis, then Descriptives. In the dialogue box that opens, move the poverty index variable into the "Variable(s)" box and click on "OK." The resulting table will look somewhat like table 9.8. Note the minimum and maximum values; these values will be used to set boundaries for assigning the MFI client sample to the three terciles.
Once the range of values for each tercile of nonclients is recorded, assign MFI client households to each tercile according to their poverty index score. This is best done by computing a new variable, POVGROUP, that will list group numbers for the 500 households of the full sample. Before computing this variable, however, verify that all 500 households have values for the poverty index. If not, limit the sample to exclude all cases of missing values before continuing with the following procedure. Assign values to the variable POVGROUP as follows:
• "1" for all cases where the poverty index score is below the minimum value appearing in the descriptive analysis table generated by SPSS (-0.70134 in the example shown in table 9.8)
• "2" for all cases where the poverty index score is on or between the minimum and maximum values appearing in the descriptive analysis table (-0.70134 and 0.21338 in the example)
• "3" for all poverty index scores above the maximum value appearing in the descriptive analysis table (0.21338 in the example)
Begin by selecting Compute from the Transform menu. In the dialogue box that opens, type in the new variable name, POVGROUP, in the top left box, and in the box at the right, enter "2." Click on "Continue," then "OK." This will assign values of 2 to the new variable. Now revise the newly computed variable by repeating the above process but this time, type "1" in the top right text box (top portion of figure 9.14) and click "If" to open the "Compute Variable: If Cases" dialogue box (lower portion of figure 9.14). Set the "if" condition for POVGROUP = 1 so that it applies only to those cases where the value of the poverty index variable is less than the minimum value of the poverty index shown in the descriptive table (table 9.8).
Figure 9.14 SPSS "Compute Variable" and "Compute Variable: If Cases" dialogue boxes
The values of POVGROUP should now show values of 1 wherever the poverty index value is below -0.70134. Compute the POVGROUP values a final time to assign a value of 3 to poverty index values above the maximum level of the middle tercile. In our example, this value is 0.21338.
Step 4: Verify that all households have been categorized correctly. Once all cases have been assigned to a poverty group, run a frequencies table of POVGROUP to verify that the results are correct. Also verify that all cases with missing values for the poverty index also have missing values within POVGROUP. Choose Select Cases from the Data menu and set the "if" condition to "MISSING(povindex)." If these cases are found to have been assigned a POVGROUP value, recode them to SYSMIS using the Recode option in the Transform menu. Finally, add a variable label ("household ranking of relative poverty") and value labels for the new variable where 1 = lowest, 2 = middle, and 3 = highest.
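Steps 3 and 4 can be summarized in a short Python sketch. The two boundary values are those of the example in table 9.8; the function name, the use of None to mark a missing poverty index, and the six scores are illustrative assumptions.

```python
from collections import Counter

MIDDLE_MIN = -0.70134  # minimum of the middle nonclient tercile (table 9.8 example)
MIDDLE_MAX = 0.21338   # maximum of the middle nonclient tercile (table 9.8 example)


def assign_povgroup(score, middle_min=MIDDLE_MIN, middle_max=MIDDLE_MAX):
    """Return 1 (lowest), 2 (middle), 3 (highest), or None if score is missing."""
    if score is None:
        return None        # missing poverty index stays missing in POVGROUP
    if score < middle_min:
        return 1
    if score > middle_max:
        return 3
    return 2               # on or between the two boundary values


# Hypothetical poverty index scores; None marks a missing value.
scores = [-1.35, -0.70134, -0.2, 0.21338, 0.9, None]
groups = [assign_povgroup(s) for s in scores]

# Frequency check, analogous to running a frequencies table on POVGROUP.
frequencies = Counter(groups)
print(groups)
print(frequencies)
```

Handling the missing case first avoids the situation described in step 4, in which a household without a poverty index score ends up with a group number.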

Assessing MFI poverty outreach by poverty groupings
Now that all cases for MFI clients and nonclients have been assigned to poverty groupings, it is possible to compare differences between the two distributions. If the pattern of poverty among client households matches that of nonclient households, client households will divide equally among the three poverty groupings in the same way as the nonclient households did, with 33 percent falling in each group. Any deviation from this equal proportion will signal a difference between the client and nonclient populations. For instance, if 60 percent of client households fall into the first tercile, or lowest poverty category, the MFI reaches a disproportionate number of very poor clients relative to the general population. Figure 9.15 shows the results of a test case study that highlights significant differences in the poverty distribution between clients and nonclients. The graph shows that clients are overrepresented within the lowest tercile and underrepresented in the highest tercile. This would indicate that the MFI is reaching a larger share of poorest households than is generally found in the population. In contrast, the results of a second case study, shown in figure 9.16, found the opposite pattern. The results in this figure indicate that MFI clients are underrepresented in the lowest tercile and overrepresented in the highest tercile. This implies that the MFI is attracting better-off clients.
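The comparison of the two distributions amounts to computing the percentage of households in each poverty group separately for clients and nonclients. The Python sketch below shows the calculation; the function name is hypothetical, and the client counts are invented so that 60 percent of clients fall into the lowest tercile, as in the illustration above.

```python
from collections import Counter


def tercile_shares(povgroups):
    """Return the percentage of households in groups 1, 2, and 3."""
    counts = Counter(g for g in povgroups if g is not None)  # skip missing
    total = sum(counts.values())
    return {g: 100.0 * counts[g] / total for g in (1, 2, 3)}


# Hypothetical POVGROUP values: 200 clients and 300 nonclients.
clients = [1] * 120 + [2] * 50 + [3] * 30
nonclients = [1] * 100 + [2] * 100 + [3] * 100

client_shares = tercile_shares(clients)
nonclient_shares = tercile_shares(nonclients)
print(client_shares)
print(nonclient_shares)
```

By construction, nonclients divide evenly across the three groups, so any departure from one-third in the client shares reflects a difference between the two populations.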
To create graphs in SPSS similar to those shown in figures 9.15 and 9.16, choose Graphs, then Bar… This will open the "Bar Charts" dialogue box. Choose "Clustered" and select "Summaries for groups of cases" under the "Data in Chart Are" box. Click "Define" to open the "Define Clustered Bar" dialogue box. In the top part of the box, select the option "% of cases," then select the variable POVGROUP as category axis and the client status variable to define clusters. Click on the "Titles…" option in the lower right corner to add titles for the graph.
In addition to comparing differences in poverty, the poverty terciles can also be used to assess how various indicators range in magnitude across poverty groups. This is an important means of verifying the degree to which the poverty index captures differences in poverty levels between households.
Figure 9.17 shows data from a case study on the percentage of adults who completed at least grade 7 of education. Within the lowest-ranked group, roughly 40 percent of household adults had at least this level of education, compared to over 70 percent of those in the highest-ranked group. The graph shows a consistent rise in percent educated as household rankings increase; it also suggests that there are no unusual patterns within any of the branches.
A similar check on the poverty index can be made by creating a cross tabulation of ordinal indicators by poverty group. Table 9.9 uses data from another case study, which shows that 97 percent of the highest-ranked households have their own latrine or flush toilet, compared to 58 percent of the poorest-ranked households. A review of the chi-square test in table 9.10 shows these differences to be highly significant.
A final check on the appropriateness of indicators within the poverty index involves screening for inconsistencies in the infrastructure within survey locations. In particular, the absence of piped water facilities or electricity in some areas may introduce bias in how households within these communities are ranked, if indicators are used that assume these facilities are available.
Verify the contribution of each indicator used to create the poverty index through graphs of the means of each ratio variable and cross tabulations of each ordinal variable. If inconsistencies are found across branches or response codes, consider replacing the indicator in the poverty index with another that does not show inconsistencies. Once all variables have been reviewed and changes made, the composition of the poverty index can be finalized and the analyst can proceed to interpret the results, as discussed in chapter 10.
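The cross-tabulation check of an indicator against poverty groups can be sketched in Python as a chi-square test on a table of counts. The counts below are hypothetical: they assume 100 households per poverty group and were chosen only to echo the 58 and 97 percent figures quoted for latrine ownership. They are not the data in tables 9.9 and 9.10. The significance level uses the closed form that holds for exactly 2 degrees of freedom, which is the case for a table with two rows and three columns.

```python
import math


def chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts; columns are the lowest, middle, and highest groups.
table = [
    [58, 80, 97],  # households with own latrine or flush toilet
    [42, 20, 3],   # households without
]

statistic, df = chi_square(table)
# With exactly 2 degrees of freedom, the upper-tail probability is exp(-x / 2).
p_value = math.exp(-statistic / 2.0) if df == 2 else None
print(round(statistic, 2), df, p_value)
```

A significance level well below 0.05 would indicate that the indicator varies across poverty groups by more than chance alone would explain.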

Chapter 10 Interpreting the Results of a Poverty Assessment
A comprehensive assessment of an MFI must include an evaluation of its poverty outreach record and how this record reconciles with its mission and program objectives. MFIs differ in terms of geography, stated mission, type of market niche sought, preference for institutional culture, and a host of other factors. Ignoring these considerations or providing incomplete information on institutional details fails to tell a complete story. Interpreting the results of a poverty assessment within the context of a specific MFI adds depth of understanding to the quantitative measurement of the relative poverty differences between MFI clients and nonclients.
The CGAP Appraisal Format contains practical guidelines and indicators for measuring the financial and organizational performance of an MFI. This performance can be reviewed in light of external constraints, internal vision, and strategy to establish the context within which poverty outreach results should be interpreted.
This chapter guides researchers to use the poverty index to make comparisons across programs and countries by explaining how to develop summary ratios. These ratios can be used in conjunction with additional area-level and national-level information to interpret and compare the poverty outreach of different MFIs. The final section of the chapter relates the entire assessment process to the context of an individual MFI and counsels users how to prepare a summary report of findings.
Comparing results at the local, area, and national levels
MFI outreach to the poor can be assessed at three levels:
• Local: the extent to which the MFI provides services to households at different poverty levels in the survey area.
• Area: the extent to which the survey area represents relatively poor parts of the country.
• National: the extent to which the country can be assessed as poor relative to all other countries.

The first level of assessment has formed the core of this manual, yet an overall conclusion regarding the poverty outreach of an MFI must explicitly account for area- and national-level considerations. An overall picture that takes all three levels into account should be the basis for making final comparisons. These additional levels of information can then be combined with a qualitative institutional analysis of an MFI to interpret its poverty outreach profile.

Comparing poverty at the local level
Household poverty scores and poverty groupings do not indicate the absolute poverty of the local area. It is feasible for an MFI to operate in areas where 90 percent of the local population lives in extreme poverty and in areas where no more than 10 percent live in poverty. Comparisons between clients and nonclients can only indicate differences in the relative poverty distribution between these two groups.
If the pattern of poverty of client households were similar to that of nonclient households, we would expect client households to be distributed among the three poverty groupings in the same fashion as the nonclient households: 33 percent falling into each group. Any deviation from this proportion would thus signal a difference between the client and nonclient populations. Two measures based on this deviation can be observed (see box 10.1):
Measure 1: This measure reflects the extent to which the poorest households are represented in the client population. A measure of 33 indicates that the proportion of the poorest households among MFI clients is the same as in the general population. Measures greater than 33 imply that the proportion of the poorest households among MFI clients is greater than that in the general population. On the other hand, measures less than 33 imply that the proportion of the poorest households among MFI clients is less than that in the general population.
Measure 2: This measure reflects the extent to which the highest-ranked households are represented in the client population. A measure of less than 33 indicates that, compared with the nonclient population, a lesser proportion of client households falls into the highest-ranked group.
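The two measures can be written as simple ratios. In the sketch below, $n_{\text{low}}$ and $n_{\text{high}}$ denote the numbers of client households assigned to the lowest- and highest-ranked terciles, and $n$ the total number of client households assigned to a poverty group. These symbols, and the counts in the worked example, are illustrative assumptions rather than notation or data taken from the manual.

```latex
\[
\text{Measure 1} = 100 \times \frac{n_{\text{low}}}{n},
\qquad
\text{Measure 2} = 100 \times \frac{n_{\text{high}}}{n}
\]
% Hypothetical example: n = 200, n_low = 90, n_high = 40
\[
\text{Measure 1} = 100 \times \frac{90}{200} = 45,
\qquad
\text{Measure 2} = 100 \times \frac{40}{200} = 20
\]
```

In this hypothetical case, Measure 1 exceeds 33 and Measure 2 falls below it, so the poorest households would be overrepresented among clients and the highest-ranked households underrepresented.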

Comparing poverty of the MFI operational area to national poverty levels
A local-level assessment of the relative poverty of MFI clients will not provide a complete picture if MFIs tend to be located in better-off or worse-off areas within a given country. In wealthier regions, relatively poorer clients may still be better off, on average, than households living outside the operational area of the MFI. Conversely, in poorer regions, higher-ranked households may be worse off, on average, than households living outside the operational area of the MFI. Making assessments at the area or national level would require sampling households outside the operational area of the MFI, sharply escalating the cost of the assessment and rendering it impractical.
There are two options available for comparing the level of poverty in the operational area of an MFI to other parts of the country. The first option is to collect area-level poverty measurements from various published sources. This option is feasible only in countries where secondary information is regionally disaggregated, reliable, and available. However, secondary data that is sufficiently disaggregated to allow comparisons between the operational area of an MFI and the rest of the country is available only in a handful of developing countries.
Moreover, when data are compiled from more than one source (which is likely), differences can exist regarding the division of geographic areas, units of measure, definitions of terms, and the year in which data were collected. A quantitative approach is feasible only if a standard methodology that is both countrywide and area-specific can be used to measure poverty. In all other cases, a second option involving an expert opinion can be applied.
Using secondary data. Quantitative measures of poverty can take many forms and the researcher may need to review several different measures to arrive at a fair assessment (see box 10.2). Some of the indicators currently used by governments and international organizations to measure poverty include: (i) official poverty statistics such as the percentage of the population living under the poverty line or the estimated poverty gap by locality, (ii) recent national surveys of annual household and per capita consumption and expenditure, disaggregated by locality, (iii) databases measuring food insecurity or vulnerability by locality, such as poverty maps for food aid distribution, and (iv) an aggregate indicator of quality of infrastructure by locality.

Box 10.1 Deviations from an even tercile distribution
Measure 1
• percentage of clients belonging to the lowest-ranked poverty tercile
• higher values show more extensive outreach to the poorest households in the local area
Measure 2
• percentage of clients belonging to the highest-ranked poverty tercile
• higher values show more outreach to the better off in the local area
Using qualitative methods. The second option for assessing general poverty levels in an MFI operational area involves a qualitative assessment. This assessment, explained step by step in the text that follows, uses an expert panel to rate the poverty level of the MFI operational area against regional and national standards.
Step 1. Define areas to be assessed. Area-level assessment should include the entire operational area of the MFI, not just the branches selected for the household survey. First, make a list of all regions or branches, then indicate the names of localities where clients are located. When locations cannot be easily "recognized" by a potential panel of experts (see step 2 below), map each location to the closest commonly understood set of geographic coordinates, such as village names or local administrative units (about which information can be relatively easily solicited). The final list should be arranged as shown in table 10.1, with the first and second columns completed by the analyst.
Step 2. Identify the panel of experts. Key respondents for this assessment should be selected from a range of institutions, including major social science research institutes, governmental and/or nongovernmental organizations involved in poverty alleviation programs (including food aid distribution), and well-known but independent poverty experts. It is extremely important to ensure that the panel of experts has direct knowledge of the operational area of the MFI. This puts the panel in a position to rank the area against regional or national standards.

Box 10.2 Using secondary data in a poverty assessment
In the South Africa case study, several forms of secondary data were used to establish that general poverty levels within the MFI operational area (the Northern Province) fell well below those in other parts of the country. Comparing a provincial Human Development Index (HDI) to the national South Africa HDI, the Northern Province was found to have an HDI of 0.470, which was 69 percent of the HDI of South Africa as a whole. The comparison indicates a strong regional disadvantage. In addition, the program targeted only the African population, which was shown to be the poorest of all ethnic groups within the region and the country as a whole. Secondary information also showed that income levels within the area were comparable to those found in some of the poorer countries of Africa. The data indicated further that the region was poorer than other parts of the country and that the MFI tended to operate within poorer districts of the region.
Officials in health and agriculture ministries and local governments often have extensive knowledge of very specific areas of the country and can make comparisons across regions and different administrative boundaries. Select eight to ten experts who have different institutional backgrounds (government, social science research institutes, and nongovernmental organizations involved in poverty alleviation programs).
It may be necessary to interview a set of regional experts to assess how specific localities compare to the overall region, and national experts to assess how the regions compare to the nation as a whole.
Step 3. Set criteria for assessing area-level poverty. The area-based poverty assessment uses the opinions of a panel of experts to rate the overall poverty level in specific MFI operational areas against national-average poverty levels. Panel members are asked to assign a score to each locality using the criteria below:
1 = operational area ranks considerably below national average
2 = operational area ranks somewhat below national average
3 = operational area ranks at or around national average
4 = operational area ranks somewhat above national average
5 = operational area ranks considerably above national average
To ensure that rankings are simple and unambiguous for the panel of experts, only five levels are used. It is important that adequate guidance be provided to the expert panel to assist them in ranking and scoring. It should be made clear that their assessment should be based on due consideration of factors such as
• wage and employment (or unemployment) levels; type of employment
• sufficient physical and social infrastructure to meet the needs of local residents in terms of clean drinking water; availability of health care clinics and hospitals
• literacy levels; availability of elementary and secondary schooling
• agricultural conditions (if rural), levels of commercialization, extent of food insecurity
• unusual circumstances such as political unrest, natural disasters, or epidemics that may alter significantly the well-being of local residents
Step 4. Elicit information from panel of experts.
The worksheet shown in table 10.1 is distributed to all members of the expert panel, who are asked to fill in column 3 with a ranking for each locality within each region. If experts are unfamiliar with a particular locality, they should be asked to leave the particular cell blank rather than record a guess.
Step 5. Triangulation. Triangulation of information received from the panel of experts will be necessary, especially in cases where widely divergent views exist among the experts. This situation will arise, for example, when one expert assigns a score of five and another a score of one to the same locality. In such cases, the results of the expert panel should first be clearly tabulated to show the scores of each panel member for each locality. Further, whenever scores deviate by more than three points, those experts with divergent opinions should be asked to provide a brief written explanation supporting their conclusion. The tabulated results, along with the explanations, should then be recirculated to the panel to give them the opportunity to change their previous ranking in view of the overall results and explanations provided. If changes take place, the process should be repeated until no changes are made.
The best-case scenario is one where an overall consensus eventually emerges. If complete consensus does not emerge, but individual scores do not deviate by more than two points, then all expert opinions are given equal weight and average scores are computed as in step 6 below. If individual scores deviate by three or more points, it is highly likely that some of the members have incomplete information.
In such a situation, the analyst should independently evaluate all explanations for logical consistency and overall credibility, then decide which divergent opinion(s) to discard. All available secondary data should be used in making this decision, and the reasons for discarding an opinion should be clearly explained in the final evaluation report. Once a decision has been made, average scores may be computed from the individual scores retained.
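The tabulation and screening described in step 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the locality names, expert labels, and scores are hypothetical, and "deviation" is interpreted here as the gap between the highest and lowest score given to a locality, which the manual does not define explicitly.

```python
# Hypothetical panel results: locality -> {expert: score on the 1-5 scale}.
# None marks a cell the expert left blank (unfamiliar locality).
panel_scores = {
    "Locality A": {"expert_1": 2, "expert_2": 3, "expert_3": 2},
    "Locality B": {"expert_1": 1, "expert_2": 5, "expert_3": 2},
    "Locality C": {"expert_1": 4, "expert_2": None, "expert_3": 2},
}


def spread(locality_scores):
    """Gap between the highest and lowest non-blank score (assumed meaning of 'deviate')."""
    given = [s for s in locality_scores.values() if s is not None]
    return max(given) - min(given) if given else 0


def needs_written_explanation(locality_scores):
    """Step 5: scores deviating by more than three points call for written explanations."""
    return spread(locality_scores) > 3


def final_treatment(locality_scores):
    """After recirculation: average if the gap is at most two points, otherwise analyst review."""
    if spread(locality_scores) <= 2:
        return "average all scores"
    return "analyst review"


# Tabulate the spread and the resulting action for every locality.
tabulation = {
    locality: (spread(s), needs_written_explanation(s), final_treatment(s))
    for locality, s in panel_scores.items()
}
for locality, row in tabulation.items():
    print(locality, row)
```

In practice the tabulation would be recomputed after each round of recirculation, until no expert changes a score.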
Step 6. Calculate a weighted average rating for the MFI operational area. The worksheet in table 10.2 describes the process for computing the overall rating for the operational area of an MFI. For each locality, average ratings are computed by adding all scores for that locality and dividing this total by the number of expert responses. This average is entered in column 3 of table 10.2. The overall rating for the operational area of the MFI as a whole is then computed as the weighted average of all locality-specific ratings, using the locality's share of the total MFI client base as the weighting factor. In order to compute this rating, the number of active clients in each locality should be entered in column 4 and the client share of each locality in column 5. The client share is obtained by dividing the number of clients in the locality by the total client base of the MFI. The next step is to multiply columns 3 and 5 and place the result in column 6. The sum of this column is the weighted-average poverty rating for the entire MFI operational area. It is suggested that the actual tabulation of weighted averages be done using a spreadsheet program such as Microsoft Excel.
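For readers who prefer a script to a spreadsheet, the calculation in step 6 can be sketched as follows. The locality names, scores, and client counts are hypothetical; the column numbers in the comments refer to the worksheet columns described above. Blank cells (None) are excluded from the count of expert responses, following step 4.

```python
# Hypothetical inputs: expert scores per locality (None = cell left blank)
# and the number of active MFI clients in each locality (column 4).
expert_scores = {
    "Locality A": [2, 3, None, 1],
    "Locality B": [1, 2, 2, 1],
    "Locality C": [4, 3, None, 2],
}
active_clients = {"Locality A": 500, "Locality B": 300, "Locality C": 200}


def locality_average(scores):
    """Column 3: sum of scores divided by the number of expert responses."""
    given = [s for s in scores if s is not None]
    return sum(given) / len(given)


def weighted_area_rating(expert_scores, active_clients):
    """Sum of column 6: locality average weighted by the locality's client share."""
    total_clients = sum(active_clients.values())
    rating = 0.0
    for locality, scores in expert_scores.items():
        average = locality_average(scores)                  # column 3
        share = active_clients[locality] / total_clients    # column 5
        rating += average * share                           # column 6
    return rating


overall_rating = weighted_area_rating(expert_scores, active_clients)
print(round(overall_rating, 2))
```

With these illustrative figures the locality averages are 2.0, 1.5, and 3.0, the client shares are 0.5, 0.3, and 0.2, and the weighted rating is about 2.05.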
Alternative Approaches to Assessing Poverty
Appendix 1
There are three principal methods for assessing the poverty level of a household: (i) household expenditure analysis and computation of a poverty line, (ii) rapid appraisal or participatory appraisal methods, and (iii) indicator analysis, using an index of relative poverty. These methods, and their advantages and disadvantages as practical tools, are briefly described below for the evaluator. A number of references are given for readers who wish to expand their knowledge of these methods.

Detailed household expenditure survey
The expenditure survey method is widely used in nationally representative household surveys, such as the Living Standard Measurement Survey conducted by the World Bank. The standard practice in poverty analysis has been to use household total expenditure as the primary measure to evaluate the standard of living of households. It is argued that total expenditure is a good measure of a household's command over the goods and services it chooses to consume.
The basic criterion used to assess whether or not a household is poor is its income, that is, whether its income is sufficient to meet the food and other basic needs that all household members require for a healthy and active life. To make the assessment, a basket of goods and services satisfying a pre-set level of basic needs is constructed. This basket corresponds to local consumption patterns and is valued at local consumer prices to compute the minimum cost of its acquisition.
The value of the basket of minimum food, goods, and services is then called the "poverty line." This poverty line is most commonly expressed in per capita terms. If the per capita income of household members is below the poverty line, the household and its members are considered poor. If this condition does not hold, the household is categorized as nonpoor. For further references on household expenditure surveys and the poverty line, see, for example, Aho, Larivière, and Martin (1998); Chung et al. (1997); and Lipton and Ravallion (1995).
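The poverty line logic can be illustrated with a minimal sketch. The basket items, quantities, prices, and household figures below are invented for illustration only; an actual poverty line would be built from survey data on local consumption patterns and prices.

```python
# Hypothetical minimum basket: item -> (quantity per person per month, local unit price).
basket = {
    "maize_kg": (12, 0.50),
    "beans_kg": (3, 1.20),
    "fuel_l": (2, 0.80),
    "clothing": (1, 2.00),
}


def poverty_line(basket):
    """Minimum cost of acquiring the basket, per person per month."""
    return sum(quantity * price for quantity, price in basket.values())


def is_poor(household_income, household_size, line):
    """A household is poor if its per capita income falls below the poverty line."""
    return household_income / household_size < line


line = poverty_line(basket)

# Two hypothetical households: (monthly income, number of members).
print(is_poor(60.0, 5, line))  # per capita income of 12.0
print(is_poor(45.0, 3, line))  # per capita income of 15.0
```

Here the basket costs about 13.2 per person per month, so the first household is classified as poor and the second as nonpoor.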
The advantage of this method is that it is a widely accepted and fairly precise tool for measuring poverty, as far as the income dimension of poverty is concerned. The poverty line method allows for comparisons between clients and nonclients of MFIs within one area of a country and between countries. However, the data requirements of this method are very steep, and comprehensive, standardized questionnaires are needed. The standard practice is to record food consumption data using a recall period of one week and a combination of monthly or yearly recall periods to collect information on various nonfood items. Even though poor households in developing countries consume a small number of goods, accuracy in reporting is always a concern, given the long recall periods.
A more accurate method is to require households to maintain a written diary of expenditures, but this is hardly feasible in countries or environments where illiteracy is endemic. Moreover, even if consumption items could be accurately recalled, there are several other problems: ways must be found to value home-produced foods when market information is lacking; irregular weights and measures make fixing quantities problematic; and information on a number of high-value items, such as the rental value of housing, is likely to be seriously deficient.
Given these difficulties, it is likely that collected data on household expenditures will be quite inaccurate. Of course, the scale of these problems can be substantially reduced by extensive training of interviewers, multiple household visits, and cataloging informal weights and measures. However, the effect on survey cost and the time needed to control for potential errors is likely to be exponential.
Moreover, the analysis of expenditure data necessitates advanced skills in statistical data analysis. This requirement translates into high costs for both data collection and analysis. Another drawback of this method is that the definition of the minimum bundle of food and nonfood services required to achieve a minimum standard of living can be ambiguous in international comparisons if the minimum bundle of food and nonfood consumer items differs across countries.
The costs of an MFI client survey could potentially be reduced if the evaluator had access to benchmark data from a recently undertaken national household survey on poverty. If such data are accessible, the analyst may choose to undertake a similar household survey only for MFI clients, and to compare those results with the national benchmark (see, for example, Navajas et al. 2000). While this approach can reduce costs, it is only feasible in countries that have recently undertaken a national poverty study. However, in many developing countries, such data are either unavailable, outdated, or difficult and costly to obtain. In terms of cost, the evaluator would also need to spend considerable time becoming familiar with the national data.
In summary, while the household expenditure survey method can provide a reliable and valid assessment of poverty, it is far too costly, time-consuming, cumbersome, and analytically demanding to be chosen as the most practical method for assessing the poverty level of microfinance clients.

Rapid appraisal and participatory appraisal
Two other methods used for poverty assessment are Rapid Appraisal (RA) and Participatory Appraisal (PA). These methods are often thought to be the same, since they seek input by the community and its members using similar techniques, such as wealth ranking and community mapping.
The ultimate goal of PA is empowerment of the target group. The method requires extensive participation by the community and assumes an open research and development agenda. This cannot be done quickly, that is, within one or two days. RA methods, on the other hand, are meant to provide evaluators with data on the community in a very short time. RA requires the participation of the community, but the time frame is short (usually a one-day visit to the community) and the agenda of the inquiry is predetermined.
RA and PA methods are widely used and accepted tools for identifying vulnerable groups in a community. They are extensively used by development programs and institutions, including MFIs, for targeting services to poorer clients (Hatch and Frederick 1998). The RA method in particular has relatively low time requirements for data collection.
While these methods can be well suited for both targeting and the participatory design of development projects and services, they have a number of disadvantages for poverty assessments seeking to make regional, national, or international comparisons. First, the results are difficult to verify, as they stem from community members' subjective rating of who is poor in the community and who is not. Second, the approach is likely to find poor people in every community, and the percentages of poor people may not vary much across villages.
In other words, the method may be consistent in finding the poorest third in one village, but it may not be consistent in finding the communities in which the poorest third of an entire region reside. Finally, the PA method requires skillful and experienced communicators. For national and international comparisons, there would be concern about a bias resulting from the way in which the method is implemented.

Indicator-based method
Another method to assess poverty at the household level is to identify a range of indicators that reflect powerfully on the different dimensions of poverty and for which credible information can be quickly and inexpensively obtained. Once information on a range of indicators has been collected, they may be aggregated into a single index of poverty. Desirable attributes of poverty indicators are reviewed in appendix 2.
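As a rough illustration of how several indicators might be combined into one index, the sketch below standardizes three indicators and weights them by the first principal component of their correlation matrix, computed by power iteration. This is not the SPSS procedure used in the manual: the indicators, household values, and function names are hypothetical, and the indicator screening steps of the tool are not reproduced.

```python
import math

# Hypothetical data: one row per household, one column per indicator,
# each coded so that a higher value means a better-off household.
# Columns: rooms per person, meals per day, clothing expenditure per person.
households = [
    [0.5, 2, 10.0],
    [1.0, 3, 25.0],
    [0.3, 1, 5.0],
    [0.8, 3, 18.0],
    [1.5, 3, 40.0],
]


def standardize(rows):
    """Convert each column to z-scores (mean 0, sample standard deviation 1)."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(c) / n for c in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized columns."""
    n, k = len(z), len(z[0])
    return [
        [sum(row[i] * row[j] for row in z) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def first_component(matrix, iterations=200):
    """Power iteration: approximate the leading eigenvector of a symmetric matrix."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Fix the sign so that higher scores mean better-off households.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


z_scores = standardize(households)
weights = first_component(correlation_matrix(z_scores))
poverty_index = [sum(w * x for w, x in zip(weights, row)) for row in z_scores]

for household_id, score in enumerate(poverty_index, start=1):
    print(household_id, round(score, 2))
```

Because the indicators are standardized, the resulting index is relative: it has a mean of zero across the sample, with negative scores marking households that are poorer than the sample average.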
One well-known application of this method is the Human Development Index (HDI) of the United Nations Development Programme (for the HDI values used in this manual, see UNDP 2000). It is based on three components: educational attainment, life expectancy at birth, and per capita income expressed in purchasing-power-parity dollars. The latter two indicators are costly to measure in surveys and therefore not operational.
Another example is the housing index, which is used by many MFIs (particularly in South and Southeast Asia) for targeting financial services to poorer clients. Its advantage is that the list of indicators feeding into the housing index, such as the quality of the roof or walls of a house, can be obtained very quickly by visual inspection of a house. A major disadvantage of this method is that it focuses only on one dimension of poverty (housing), while neglecting others such as food security and human resources. Further, the housing index may not be applicable when housing is homogeneous in the community or when it is not an important poverty dimension, such as in a region with a good climate.
In principle, the time and cost requirements of the indicator method in terms of data collection and analysis can be relatively low if the number of indicators in a poverty index is limited. The method can be considered valid if several dimensions of poverty are included. For these reasons, the indicator method was chosen to measure the poverty level of microfinance clients for the Microfinance Poverty Assessment Tool.
Overall, a good indicator is a measure that is easily observable, verifiable, and objectively describes poverty. A wide range of indicators is recommended in order to capture aspects of an underlying dimension of relative poverty within households.
Two main types of indicators can be used to assess the actual level of household poverty: indicators on income and indicators on consumption. Studies comparing different indicators based on income and consumption conclude that it is difficult to recommend one alternative over the other (Skoufias, Davis, and Soto 2000). However, consumption over time (seasons or years) is more stable than income, and households provide information more easily on what they consume than on what they earn. For this reason, this tool relies on selected indicators of consumption, although selected indicators expressing means available to the household to increase its standard of living are also included.
The principal challenge in developing reliable indicators is to identify key components of consumption that are either unambiguous measures of poverty in themselves (such as incidences of hunger) or those that correlate well with, or are good proxies of, total household expenditure. Hence, it is not necessary to compile all food and nonfood expenditures of a household, since some types of expenses are closely related to the level of household poverty, while others are not.
Studies have shown that the proportion of clothing and footwear expenditure in the household budget remains stable at different income levels, around 5 to 10 percent of total expenses (Aho, Larivière, and Martin 1998; Minten and Zeller 2000). A recent study by Morris and others (1999) found clothing expenditure to be one expenditure component that increased proportionally with total household expenditures. Since clothing, unlike food commodities, usually requires a purchase of either a finished garment or materials to make a garment, it also avoids the valuation problems posed by food consumption and expenditure.

Recommended Questionnaire
Appendix 3