WPS4385
Policy Research Working Paper 4385

How Relevant is Targeting to the Success of an Antipoverty Program?

Martin Ravallion

The World Bank
Office of the Director
Development Research Group
November 2007

Abstract

Policy-oriented discussions often assume that "better targeting" implies larger impacts on poverty or more cost-effective interventions. The literature on the economics of targeting warns against that assumption, but evidence has been scarce. The paper begins with a critical review of the strengths and weaknesses of the targeting measures found in practice. It then exploits an unusually large micro data set for China to estimate aggregate and local-level poverty impacts of the country's main urban antipoverty program. Standard measures of targeting are found to be uninformative, or even deceptive, about impacts on poverty and cost-effectiveness in reducing poverty. In program design and evaluation, it would be better to focus directly on the program's outcomes for poor people than to rely on prevailing measures of targeting.

This paper--a product of the Director's office, Development Research Group--is part of a larger effort in the department to assess the reliability of the methods used in practice for guiding policy making. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be contacted at mravallion@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors.
They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

How Relevant is Targeting to the Success of an Antipoverty Program?

Martin Ravallion1
Development Research Group, World Bank, 181 H Street NW, Washington DC, USA

Keywords: Poverty, cash transfers, errors of targeting, China
JEL: I32, I38, O15

1 For helpful discussions on this topic and help with the data used here the author is grateful to Shaohua Chen, Jean-Yves Duclos, Emanuela Galasso, Garance Genicot, Margaret Grosh, Pilar Garcia Martinez, Philip O'Keefe, Adam Wagstaff, Dominique van de Walle, Youjuan Wang, Xiaoqing Yu, and seminar participants at Beijing University and at the Ministry of Finance-World Bank Roundtable on Public Finance, 2006. The findings, interpretations and conclusions of this paper are those of the author, and should not be attributed to the World Bank.

Various measures of the "targeting performance" of antipoverty programs have been widely used to inform policy discussions. These measures are typically interpreted by both analysts and policy makers as indicators of a program's performance in "...directing benefits toward poorer members of the population" (Coady, Grosh and Hoddinott, 2004a, p.81). Comparisons of such measures across different programs have informed public choices on which programs should be scaled up and which should be dropped.2 As in any situation in which measurement is used to inform policy, "the indicators need to be related to the overall policy problem, with an explicit formulation of the objective and constraints" (Atkinson, 1995, p.31).
It is widely agreed that the objective of this class of public programs is to reduce poverty, subject to the relevant constraints, including those related to the information available and the behavior of relevant agents, as well as resources. Better targeting is not seen as desirable in its own right, but rather as an instrument for reducing poverty.

Do the measures of targeting used in policy discussions provide useful indicators for this policy problem? The most widely used measures quantify some aspect of how well a given program concentrates its benefits on the poor, which is essentially what "targeting" has come to mean. An example is the share of transfers going to the poor. Cornia and Stewart (1995) have been influential in arguing that measurement practices and policy discussions have put too high a weight on avoiding one type of error--the "Type 1 error" of having (ineligible) non-poor participants--relative to the "Type 2 error" of incomplete coverage of the poor.3 Cornia and Stewart did not present data linking these aspects of targeting performance to poverty outcomes (though they do point to this as an important direction for further research). However, the literature warns us against assuming that better targeting, as assessed by standard measures, will necessarily enhance a program's total impact on poverty.4 A number of factors cloud the relationship between targeting performance and total impact on poverty, including aspects of program design, implementation and the context in which a program operates. Incentive issues have been a theme of one strand of the literature, pointing to the possibility that fine targeting will impose high marginal tax rates on recipients, possibly creating poverty traps.5 The literature has also warned that fine targeting can undermine political support for an antipoverty program; concentrating gains on the poor may induce a lower overall transfer to the poor, with benefits spread too thin, or covering too few people.6

2 Early empirical studies by Mateus (1983) and Grosh (1992) were influential in arguing the case for finer targeting. The meta studies of Grosh (1994) and Coady et al. (2004a,b) have provided the most comprehensive comparative data on program performance based on targeting measures.

3 The distinction between these two errors goes back to Weisbrod (1970) who called them "vertical-" and "horizontal target efficiency." Cornia and Stewart (1995) used the terms "E-mistakes" and "F-mistakes;" Smolensky et al. (1995) called them "errors of inclusion" and "errors of exclusion." The literature on social welfare policy in developed countries has also suggested that coverage of the poor is given little weight by standard measures of targeting. See, for example, the results of Duclos (1995) on the implications of incomplete take-up of Britain's welfare benefits for measures of targeting. However, the relationship between these problems and overall impacts on poverty has received little attention.

4 For an overview of the arguments and evidence see van de Walle (1998).

It is also unclear how useful these measures are as indicators of cost-effectiveness, as an input to scaling-up decisions. Here it is not the total impact on poverty that one is focusing on, but rather the impact per unit of the resources devoted to a given program. (The total impact then depends on the allocation of resources across programs, weighted by their cost-effectiveness ratios.) Intuitively, the impact on poverty will depend on both the share of transfers going to the poor and the total transfer. Plainly a large uniform transfer (received by everyone, whether poor or not) can have more impact on poverty than a small well-targeted transfer. But will the latter type of program, with low leakage to the non-poor, necessarily be more cost-effective?
The answer is far from obvious on a priori grounds. The factors noted above that cloud the relationship between targeting performance and a program's total impact on poverty will not, in general, vanish when total impact is normalized by total spending. For example, finer targeting typically entails administrative costs, which are debits against the total budget in determining the government's total transfer payment. Then the share of transfers going to the poor does not even identify the transfer to the poor per unit public spending. Less obviously, but no less importantly, targeting can generate "hidden" costs to participants, notably when there are conditionalities, such as work requirements, behavioral conditionalities or sources of social stigma. Given the costs of targeting, it is not difficult to imagine cases in which the better targeted program (with the higher share of transfers going to the poor) is less cost-effective in reducing poverty, and the literature already contains examples.7 In short, avoiding leakage to the non-poor can reduce the amount actually going to the poor, with theoretically ambiguous implications for poverty and cost-effectiveness in fighting poverty.

5 Besley and Kanbur (1993) pointed to this problem and other issues raised by targeting. Also see Smolensky et al. (1995). Kanbur et al. (1995) study the incentive issues in fine targeting, including characterizing an optimal scheme for poverty reduction, taking account of labor supply responses.

6 For theoretical analyses see De Donder and Hindriks (1998) and Gelbach and Pritchett (2000).

7 Ravallion and Datt (1995) and Murgai and Ravallion (2005) provide examples for workfare programs in India.

Whether better targeting, as measured in practice, implies a greater impact on poverty, or a more cost-effective intervention, is ultimately an empirical question.
Yet, beyond a few suggestive examples, we really know rather little about how well these popular targeting measures perform in practice.

This paper tries to help fill this gap in knowledge using a detailed case study of one program, namely China's "Minimum Livelihood Guarantee Scheme," popularly known in China as Di Bao (DB). This has been the government's main response to the new challenges of social protection in urban areas. A number of factors--the decentralized nature of the program, its scale and the availability of a large data set representative at local level--combine to make this an unusual opportunity to put targeting measures to the test. The program's targeting performance and impacts on poverty are estimated under standard assumptions across each of the 35 major municipalities of China. The most popular targeting measures found in policy-oriented discussions are thus tested as indicators of program performance in reducing poverty.

Measures

In principle, one can measure "targeting performance" by a program's impact on poverty relative to an explicit counterfactual, such as an un-targeted allocation of the same budget (as in Ravallion and Chao, 1989). Then the interpretation for poverty is unambiguous. That is not, however, the approach that has dominated the literature and practice. This discussion will focus on the main measures of targeting performance found in practice, and on which much of our current knowledge about "what works and what doesn't" is based. More precise definitions of the measures can be found in Table 1.8

Targeting measures

I focus on four main measures, the first three of which are based on the concentration curve, C(p), giving the cumulative share of transfers going to the poorest p% of the population ranked by (say) household income per person (Figure 1).

The first measure is the share of transfers going to the poorest H%, such as the poorest 40%. This is denoted S=C(H).
In the empirical work discussed later, it will be natural to identify the poorest H% as the target group, i.e., the set of people deemed to have incomes below the municipal Di Bao poverty line; more precisely, we can set H = H0, which is the pre-intervention headcount index of poverty--the proportion of the population living in households with pre-transfer income per person less than the poverty line. (The post-transfer headcount index is H1.) For much of the present discussion we can just take the poorest H% to be some reference group of poor or relatively poor people, without presuming that it is the precise target population for the program in question.

8 Table 1 relates only to the measures used in this study. For a more comprehensive discussion of these and other measures, including their analytic properties, see the excellent volume by Lambert (2001).

The popularity of S is evident in the fact that the meta-studies by Grosh (1994, 1995) and Coady, Grosh and Hoddinott (2004a,b) found that this was the most readily available measure in their primary sources.9 The measure's popularity may well stem from its ease of interpretation. Against this advantage, the measure has some obvious drawbacks. For one thing, it tells us nothing about how transfers are distributed amongst the poor; two programs can have the same share of transfers going to the poor, but in one case the gains are heavily concentrated amongst the poorest, while in the other case they only reach those just below the poverty line. Another concern is that this measure does not directly reflect the overall size of the transfer program, which will clearly matter to impacts on poverty, as discussed in the introduction.10

The second measure is the normalized share, NS, obtained by dividing S by H (Figure 1). Coady et al. (2004a, b) preferred to use NS as their measure of targeting performance, arguing that this was more comparable than S because it measures performance relative to a "...
common reference outcome...that would result from neutral (as opposed to progressive or regressive) targeting" (p.69).11 By "neutral targeting" they mean that everyone gets the same transfer amount (whether poor or not), i.e., a "uniform transfer."12 If the transfer is uniform then clearly NS=1. However, finding a value of NS close to unity does not imply that the allocation is "close" to being uniform. There are many ways one could get a value for NS of unity (or nearly so), with rather different interpretations. Similarly to S, the NS measure is insensitive to how transfers are distributed amongst the poor. The poor can receive H% of the transfers, but different people amongst the poor receive very different amounts; for example, the money could all go to either the poorest person or the least poor person; either way NS=1. NS also approaches unity as H approaches 100%, no matter how the money is distributed. When the reference outcome is this ambiguous, the usefulness of the measure becomes theoretically questionable.

9 Coady et al. provide the shares going to the poorest 10%, 20% and 40% for 85 of the antipoverty programs in their study (though with missing data in some cases).

10 The literature has pointed to the possibility that the share going to the poor can vary with the scale of a program, through the political economy of program capture; see Lanjouw and Ravallion (1999).

11 Coady et al. used H=40% when it was available, which was the case for about half the programs in their study, and the next lowest available number (20% or 10%) when the value for H=40% was not available. In the earlier comparative study of targeting performance by Grosh (1994), the value of H is set at 40% in all programs studied, in which case the first two measures will (of course) rank

12 This is sometimes called an "un-targeted transfer" in the literature, although it is not clear the absence of any effort at targeting would yield a uniform transfer.
The third measure is the concentration index, CI, which is widely used in studies of fiscal incidence. This can be thought of as a "generalized S" in that, instead of focusing on one point on the concentration curve, CI measures the area between the curve and the diagonal (along which the transfer is uniform); in Figure 1, CI is just twice the area marked A.13 The index is bounded above by 1 (at which point the poorest person receives all payments) and below by -1 (the richest person receives all). This measure has the attraction that it reflects distribution amongst the poor, and (indeed) over the whole range of incomes. A disadvantage is that it is not as easy to interpret as S or NS. And, as with the previous measures, it tells us nothing directly about the scale of transfers.

Although these measures are all based on the concentration curve, they can give quite different results. Of course, S and NS will always be in the same ratio to each other when the same value of H is used for all programs. However, these two measures can rank programs differently when H varies, as in the case study presented later in this paper, and as would presumably be the case in many applications. To illustrate, consider a transfer scheme operating in two cities and giving all participants the same amount. In city A all the transfers go to the poorest 20% and the overall poverty rate is 50%, while in city B the transfers go to the poorest 40% and the poverty rate is 10%. A far higher share of the transfers goes to the poor in A (S=100% versus 25% in B). City A also has the higher concentration index (CI=0.8 in A versus 0.6 in B). By contrast, it is in city B where the scheme is deemed to be better targeted according to the normalized share (NS=2.5 for B versus 2 for A). More generally, the concentration curve for program A could lie everywhere above that for program B and yet NS is higher for B, given its lower H.
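The two-city figures can be reproduced with a short calculation. The following is a minimal sketch, assuming (as in the example) a uniform transfer to the poorest fraction q of the population, so that the concentration curve is piecewise linear; the function names are illustrative and do not come from the paper, and CI uses the sign convention adopted here (positive when transfers favor the poor).

```python
def area_under_curve(points):
    """Trapezoid area under a piecewise-linear curve given as (p, C(p)) points."""
    area = 0.0
    for (p0, c0), (p1, c1) in zip(points, points[1:]):
        area += (p1 - p0) * (c0 + c1) / 2.0
    return area


def curve_based_measures(q, H):
    """S, NS and CI when a uniform transfer goes to the poorest fraction q
    of the population and the poverty rate is H (both as fractions)."""
    S = min(q, H) / q                  # S = C(H): share of transfers reaching the poor
    NS = S / H                         # normalized share
    curve = [(0.0, 0.0), (q, 1.0), (1.0, 1.0)]
    CI = 2.0 * (area_under_curve(curve) - 0.5)   # twice the area above the diagonal
    return S, NS, CI


city_a = curve_based_measures(q=0.20, H=0.50)
city_b = curve_based_measures(q=0.40, H=0.10)
print("City A: S=%.2f NS=%.2f CI=%.2f" % city_a)
print("City B: S=%.2f NS=%.2f CI=%.2f" % city_b)
```

City A ranks higher on S and CI, while city B ranks higher on NS, which is the reversal described in the text.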
The fourth measure is the "targeting differential," TD, which is the difference between the participation rate for the poor--which I will call the coverage rate (CR)--and that for the non-poor (Table 1).14 Alternatively, one can normalize the targeting differential by the mean transfer over all recipients; call this TD*. (When all recipients get the same transfer, TD=TD*.) However, it turns out later that the choice between TD and TD* makes little difference in the case study. Since TD is easier to interpret I shall focus on this measure. To interpret the targeting differential, note that when only the poor get help from the program and all of them are covered, TD = 1, which is the measure's upper bound; when only the non-poor get the program and all of them do, TD = -1, its lower bound. (In the "two cities" example above, TD=0.67 for city B and 0.4 for A.) This measure is easy to interpret, and it automatically reflects both leakage to the non-poor and coverage of the poor.

13 To assure that all measures go in the same direction, I multiply the usual definition of CI by -1.

How are these measures related to the incidence of Type 1 and Type 2 errors? A Type 1 error can be defined as incorrectly classifying a person as poor, while a Type 2 error is incorrectly classifying a person as not poor. A Type 1 error entails a leakage of transfers to the non-poor, while a Type 2 error implies lower coverage of the poor. Let the proportions of Type 1 and Type 2 errors in the populations of the non-poor and poor (respectively) be T1 and T2, as defined more precisely in Table 1.15 (Note that T2 = 1 - CR.) Consider S. This can be written as a function of T1, namely S = 1 - T1(1 - H)/P, where P is the overall program participation rate. (Alternatively S = 1 - T1*, where T1* is the proportion of participants who are Type 1 errors.) But one can equally well write S as a function of Type 2 errors, namely S = (1 - T2)H/P. (Or S = (H/P) - T2*.)
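These identities can be checked numerically on the "two cities" example. The sketch below assumes equal transfers to all participants, so that S is also the share of participants who are poor; the function and variable names are illustrative rather than taken from Table 1.

```python
def error_based_measures(q, H):
    """Coverage, error rates and TD when the poorest fraction q of the
    population participates and the poverty rate is H (both as fractions)."""
    poor_in = min(q, H)                # participants who are poor (population share)
    nonpoor_in = q - poor_in           # participants who are not poor
    CR = poor_in / H                   # coverage rate of the poor
    T1 = nonpoor_in / (1.0 - H)        # Type 1 errors as a share of the non-poor
    T2 = 1.0 - CR                      # Type 2 errors as a share of the poor
    P = q                              # overall participation rate
    return {
        "CR": CR,
        "T1": T1,
        "T2": T2,
        "TD": CR - T1,                       # participation gap, poor minus non-poor
        "S_from_T1": 1.0 - T1 * (1.0 - H) / P,
        "S_from_T2": (1.0 - T2) * H / P,
    }


city_a = error_based_measures(q=0.20, H=0.50)
city_b = error_based_measures(q=0.40, H=0.10)
for name, m in (("A", city_a), ("B", city_b)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Both expressions for S return the same value in each city, and the computed TD matches the 0.4 and 0.67 quoted for cities A and B.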
Nor is P likely to be independent of T1 and T2; for example, higher coverage of the poor (lower T2) may tend to come with larger programs. Thus S can be taken to depend on both T1 and T2. (The corresponding formulae are more complex for CI, and are omitted.) For the targeting differential, however, the relationship is very clear: TD automatically gives equal weight to both errors; more precisely, TD = 1 - (T1 + T2).

Thus standard targeting measures depend on the incidence of both types of errors. For the measures based on the concentration curve it should not be presumed that they will be largely unaffected by Type 2 errors. It is an empirical question what weights are attached to these two "errors of targeting."

14 This measure was proposed by Ravallion (2000). Also see Galasso and Ravallion (2005) on the properties of this measure and the discussion in Stifel and Alderman (2005).

15 One might prefer to normalize by population size; similar formulae for this case are easily derived, but the essential point remains.

Poverty impacts

In testing the relevance of these targeting measures to a program's poverty impacts I use three poverty measures: the headcount index, the poverty gap index (PG), and the squared poverty gap index (SPG) (introduced by Foster et al., 1984). The measures are defined in Table 1. The pros and cons of each are well-documented; for a review see Ravallion (1994). Briefly, H is the easiest to interpret, and is the most popular measure, but is unaffected by income gains or losses to the poor unless they cross the poverty line. PG reflects the mean income of the poor, but not inequality amongst the poor; sensitivity to that inequality is the main advantage of SPG. Impacts are measured by pre-transfer less post-transfer poverty measures (H0 - H1 and similarly for PG). Impacts on these measures are estimated on the same data, and under the same assumptions about how the scheme works (including behavioral responses), as used in measuring targeting.
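As an illustration of how these impacts are computed, the following minimal sketch evaluates the three Foster et al. (1984) measures before and after transfers. The incomes, transfers and poverty line are hypothetical numbers chosen for the example, not data from the paper, and pre-transfer income is taken to be observed income less the transfer received.

```python
def fgt(incomes, z, alpha):
    """Foster-Greer-Thorbecke poverty measure.
    alpha=0 gives the headcount index H, alpha=1 gives PG, alpha=2 gives SPG."""
    n = len(incomes)
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / n


# Hypothetical data: observed (post-transfer) income and transfer received.
z = 100.0
observed = [60.0, 90.0, 110.0, 150.0, 300.0]
transfers = [20.0, 5.0, 15.0, 0.0, 0.0]

# Assumption: no behavioral response, so pre-transfer income is observed
# income minus the transfer.
pre = [y - t for y, t in zip(observed, transfers)]

impacts = {}
for label, alpha in (("H", 0), ("PG", 1), ("SPG", 2)):
    before = fgt(pre, z, alpha)
    after = fgt(observed, z, alpha)
    impacts[label] = before - after
    print("%-3s pre=%.3f post=%.3f impact=%.3f" % (label, before, after, impacts[label]))
```

In this toy example one of the three pre-transfer poor crosses the line, so H falls from 0.6 to 0.4, while PG and SPG also register the gains of those who remain poor.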
In particular, I shall assume that income in the absence of the program is observed income less payments received under DB. This assumes that there is no displacement of other income sources through behavioral responses, such as reduced work effort or lower private transfer receipts. This is the most common assumption in the literature on measuring targeting performance; indeed, it appears that virtually all of the primary studies used by Coady et al. (2004a,b) made this assumption. The assumption is questionable, however; I offer some tests that, while not conclusive, suggest that the data are at least consistent with the assumption.

In assessing cost-effectiveness I will normalize the poverty impacts by the cost of the program, though a more flexible econometric method of controlling for total spending will also be used. Given the costs of targeting, it is not difficult to imagine cases in which the better targeted program by any of the above measures is less cost-effective against poverty. Consider again the example of cities A and B above in which the program in city A is better targeted according to both S and CI (but not NS or TD). Suppose that the total cost to the government is the same, but that the finer targeting of city A's program (for which it will be recalled that all of the transfers go to the poorest 20%, versus 40% in city B) entails extra costs to both the government and participants such that only 25% of participants in city A escape poverty, while in B all poor participants are able to do so. The headcount index falls by 5 percentage points in A, but by 10 percentage points in B. B's program has the higher impact on poverty and is more cost-effective.

There is a special case in which one of these measures, namely S, is a perfect indicator of cost-effectiveness for PG. That special case is when the program has no impact on H and there are no fiscal costs besides the transfers. Then it can be readily shown that the impact on PG per unit public spending is simply S.
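The special case can be verified in a few lines. The sketch below uses notation that is assumed here rather than taken from the paper: n is the population size, z the poverty line, y_i pre-transfer income and t_i the transfer received, with public spending measured per capita as a fraction of the poverty line. Table 1 may normalize differently.

```latex
\begin{align*}
PG_0 &= \frac{1}{n}\sum_{i:\,y_i<z}\frac{z-y_i}{z}, \qquad
PG_1 = \frac{1}{n}\sum_{i:\,y_i<z}\frac{z-y_i-t_i}{z}
  \quad\text{(no one crosses } z\text{, so } H_1=H_0\text{)} \\
PG_0-PG_1 &= \frac{1}{nz}\sum_{i:\,y_i<z} t_i \\
\frac{PG_0-PG_1}{\bar t/z} &= \frac{\sum_{i:\,y_i<z} t_i}{\sum_{i=1}^{n} t_i} = S,
  \qquad \bar t=\frac{1}{n}\sum_{i=1}^{n} t_i .
\end{align*}
```

The result relies on both stated conditions: if some recipients cross the poverty line, part of their transfer no longer reduces the gap, and if there are costs beyond the transfers, the denominator exceeds the mean transfer.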
Of course, this special case is unlikely to be of much practical interest, given that people in a neighborhood of the poverty line will presumably be transfer recipients and there will undoubtedly be other costs.

Under the same assumptions, it can be readily shown that the normalized share, NS, is a perfect indicator of cost-effectiveness in reducing the income-gap ratio (I) (Table 1). This is (implicitly) the poverty measure relevant to comparisons of program performance based on the normalized share. However, as a poverty measure, the income-gap ratio is known to have a number of undesirable properties; for example, if a poor person living above the mean for the poor escapes poverty then this measure perversely suggests higher poverty. (PG does not have this property.)

It should be noted that these measures of poverty impacts and cost-effectiveness can all be calculated from the same data required for the various measures of targeting performance described above. Of course, if one knows the impacts on poverty--which we agree to be the objective--then one does not need the targeting measures. However, since these targeting measures are widely used in assessing antipoverty programs and in comparative work, it is of interest to test their value as indicators for that policy problem.

Program and data

While economic reforms and structural changes in the Chinese economy have meant high rates of economic growth, it is believed that certain sub-groups have been adversely affected or have been unable to participate in the new economic opportunities due to their lack of skills, long-term illness or disability. The collapse of the old safety-net provided by guaranteed employment has clearly left some households vulnerable. Some of the "left behind" households started poor and some became poor, even though aggregate poverty rates have tended to fall over time. Urban areas have figured prominently in these concerns about the "new poor."
On paper, the Di Bao program provides a transfer to all urban households with incomes below a DB line sufficient to bring them up to that line. The scheme became a national policy in 1999 and expanded rapidly; by 2003 participation had leveled off at 22 million people, representing 6% of urban residents. Municipal authorities have considerable power over the program, including setting the DB lines, funding (the center provides partial co-financing) and implementation. China's cities vary in ways that could well be relevant to the outcomes of DB; for example, across the 35 largest urban areas studied in the paper, the highest mean household income per person (the city of Shenzhen) is over four times that of the lowest (Chongqing). The proportion living below the DB poverty line varies from 2% (in Fuzhou) to 19% (Haikou).

The analysis uses China's Urban Household Short Survey (UHSS) for 2003/04, as discussed in Chen et al. (2006). The UHSS was done by the Urban Household Survey Division of the National Bureau of Statistics (NBS). I use the UHSS sample for the 35 largest cities, giving a total sample of 76,000, varying from 450 (in Shenzhen) to 12,000 (in Beijing). For these 35 cities, the definitions of geographic areas in the UHSS coincide with those for the DB lines and the entire data set has been cleaned by NBS staff and made available for this research. While the UHSS is a relatively short survey, it allows us to measure a fairly wide range of household characteristics. The survey also included a question on household income and questions were added on DB participation and income received from DB.

As noted in the last section, in measuring targeting and poverty impacts I assume that income in the absence of the program is observed income less payments received under DB. While this is a common assumption, it is clearly questionable. Testing the assumption is difficult without panel data (and even then there can be severe identification problems).
With only a single cross-sectional survey it is hard to be confident in the results, given the likelihood of omitted variables correlated with both program placement and the behaviors of interest. However, I can offer some observations that are at least consistent with this assumption.

The design of DB intends that the benefits received will decrease as income rises, implying that participants face a positive marginal tax rate. Indeed, if the program works the way it is supposed to then it exactly fills the gap between current non-DB income and the DB line. Then participants will have no incentive to work (under the usual assumptions that leisure is a normal good and work yields no direct utility). Earned income net of DB will fall to zero. The program will have created a poverty trap, whereby participants do not face an incentive to raise their own incomes, because of the loss of benefits under DB.

The extent to which this is a real problem in practice is unclear. Benefits are unlikely to be withdrawn quickly. There are reports that local authorities allow DB benefits to continue for some period after the participant finds a job (O'Keefe, 2004). Observations from field work also indicate that a notion of "imputed income" was used in a number of provinces. This was a notional level of income that reflected the potential income given the household labor force; this was apparently done with the aim of minimizing work disincentives.16

Figure 2 plots DB payment (per capita) against the DB gap, given by the difference between the relevant DB line and income net of DB (both per capita). If the program exactly filled these gaps (when positive) then DB payments would rise with a slope of unity, but would be zero for those with income above the DB line.
We see a marked tendency for mean DB payments (conditional on the DB gap) to rise with the DB gap, though the conditional expected value (as measured by a non-parametric regression) has a slope appreciably less than unity. The regression line starts to be noticeably positive at per capita incomes that are about 2,000 Yuan above the DB line and peaks at a mean of around 300 Yuan per capita, at a DB gap of around 4,000 Yuan. (The conditional mean is, of course, positive throughout, but very close to zero below 2,000 Yuan.) Thus Figure 2 suggests that the average benefit withdrawal rate (BWR)--the amount by which mean DB payments change with an extra Yuan of income--is around -0.05; on average, a 100 Yuan increase in income entails a drop of only 5 Yuan in DB payments.

An alternative method of estimating the average benefit withdrawal rate is to regress the per capita DB payment received on income per person less DB receipts, with a complete set of dummy variables for municipalities (to capture the differences in the generosity of the program). The implied BWR is very low, at -0.0012 (t-ratio=-17.51, n=76,808). This does not allow for the censoring that is evident in Figure 2. Using a Tobit regression, the estimate is -0.004 (t=-76.23). Estimating the Tobits separately for each municipality, I obtained statistically significant BWRs in all cases, but all were very low, with none higher (in absolute value) than -0.001.

There is almost certainly attenuation bias in these estimates, due to income measurement errors. There is the usual source of measurement error in eliciting incomes using only one question, plus the fact that income net of DB payments will probably underestimate income in the absence of DB if there are behavioral responses.
To address this concern, I tried an Instrumental Variables Estimator (IVE), in which a set of household-level characteristics (including demographics, education attainments, occupation, housing conditions) are used as instrumental variables for income in estimating the BWR; Chen et al. (2006) provide details on the variables used in the first-stage regressions. Note that this only works for the unconditional regression coefficient of DB payments on pre-DB income, so the instrumental variables are automatically excluded from the main regression of interest; the conditional BWR is unidentified. The IV estimate of the unconditional BWR is -0.0021 (t=-28.33), again very low. I also repeated these calculations separately for each municipality, using the IVE for the full sample in each municipality. The estimates were significantly negative for all municipalities and ranged from -0.0102 to -0.0001.

16 This is based on a personal communication with Philip O'Keefe at the World Bank, drawing on his field-work discussions with local administrators.

While each of these tests requires an assumption that can be questioned, they all suggest that the benefit withdrawal rate for Di Bao is very small. It would thus appear unlikely that the program would provide any serious disincentive for earning income, thus supporting our assumption that income in the absence of DB is simply observed income minus DB payments received. However, at the same time, such a low BWR raises concerns about how well the program reaches the poorest and how well it adapts to changes in household needs. The BWR for the program is almost certainly too low; Kanbur et al. (1995) find that an optimal BWR around one half is consistent with evidence on the relevant income elasticity of labor supply.
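The simplest of the estimators described above, a regression of DB payments on net income with a full set of municipality dummies, can be sketched as follows. This is an illustration only: the rows are synthetic, constructed with a known withdrawal rate of -0.05, and the function name is hypothetical. It uses within-municipality demeaning, which gives the same slope as including the dummies, and it does not address censoring (the Tobit step) or measurement error (the IV step).

```python
def within_slope(rows):
    """OLS slope of payment on net income with municipality fixed effects,
    computed by demeaning both variables within each municipality.
    rows: iterable of (municipality, net_income, payment)."""
    groups = {}
    for city, x, y in rows:
        groups.setdefault(city, []).append((x, y))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in obs)
        sxx += sum((x - mean_x) ** 2 for x, _ in obs)
    return sxy / sxx


# Synthetic data (not from the UHSS): two municipalities with different
# generosity (intercepts) and a common withdrawal rate of -0.05.
incomes = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
rows = [("A", x, 400.0 - 0.05 * x) for x in incomes]
rows += [("B", x, 250.0 - 0.05 * x) for x in incomes]

bwr = within_slope(rows)
print("Estimated BWR: %.4f" % bwr)
```

Because the municipality intercepts are removed before the slope is computed, differences in program generosity across cities do not contaminate the estimated withdrawal rate.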
Targeting performance and poverty impacts

On calculating all these measures on the same data set and under the same assumptions, one can test the assumption commonly made in policy discussions that better targeting allows a greater impact on poverty and/or a more cost-effective antipoverty program. One can also revisit some of the findings from past research on the factors relevant to targeting success. I begin with the aggregate results and then turn to the city-level analysis.

Aggregate results

I find that 7.7% of the total population of the 35 cities had a net income (observed income minus DB receipts) below the DB line (Table 2). The program's total participation is equivalent to about half of the eligible population by this definition. About 40% of DB recipients are ineligible according to these data (0.43=1.69/3.91). The proportion of these Type 1 errors amongst the non-poor is clearly very low at 0.018 (=1.69/92.29). But there is a high proportion of Type 2 errors, with almost three-quarters of those who are eligible not being covered by the program (0.71=5.48/7.71, i.e., CR=0.29).17

Nonetheless, targeting performance appears to be excellent by international standards, with S=64%, NS=8.3 and CI=0.78.18 Coady et al. (2004a,b) provide estimates of NS for 85 programs. Argentina's Trabajar program has a NS = 4.0, making it the best performer by this measure amongst all programs surveyed by Coady et al.19 The median NS is 1.25. By this measure, Di Bao is a clear outlier in targeting performance internationally.

Turning to the fourth measure of targeting performance I find that while 29% of the poor receive DB, this is only true of about 2% of the non-poor. Thus I find that TD=0.27. The mean DB payment across all those with Y