Are There Lasting Impacts of Aid to Poor Areas?

                                    Evidence for Rural China

                      Shaohua Chen, Ren Mu and Martin Ravallion1

                            Development Research Group, World Bank
                                 1818 H Street NW, Washington DC


Summary: The paper re-visits the site of a large, World Bank-financed, rural
development program in China, 10 years after it began and four years after disbursements
ended. The program emphasized community participation in multi-sectoral interventions
(including farming, animal husbandry, infrastructure and social services). Data were
collected on 2,000 households in project and non-project areas, spanning 10 years. A
double-difference estimator of the program's impact (on top of pre-existing governmental
programs) reveals sizeable short-term income gains that were mostly saved. Only small
and statistically insignificant gains to mean consumption emerged in the longer-term --
though in rough accord with the gain to permanent income. The use of community-based
beneficiary selection greatly reduced the overall impact, given that the educated poor
were under-covered. The main results are robust to corrections for various sources of
selection bias, including village targeting and interference due to spillover effects
generated by the response of local governments to the external aid.

Keywords: Poor-areas, aid, credit, rural development, impact evaluation, China

JEL: D91, H43, I32, O22

World Bank Policy Research Working Paper 4084, March 2008
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the
exchange of ideas about development issues. An objective of the series is to get the findings out quickly,
even if the presentations are less than fully polished. The papers carry the names of the authors and should
be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely
those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors,
or the countries they represent. Policy Research Working Papers are available online at
http://econ.worldbank.org.



1        This study would not have been possible without the survey data collection effort over 10
years by the Rural Household Survey Team of China's National Bureau of Statistics (NBS). We
are particularly grateful to Wang Ping Ping at NBS, who ably supervised the surveys. The
authors have benefited from discussions with Guido Imbens and the comments of Kathleen
Beegle, Richard Blundell, Solveig Buhl, Shubham Chaudhuri, Richard Chiburis, Alan De Brauw,
Quy-Toan Do, Gershon Feder, Emanuela Galasso, Garance Genicot, Stuti Khemani, Alice
Mesnard, Alan Piazza, Dominique van de Walle and seminar participants at University College
London, the Overseas Development Institute, the University of Namur, the Indian Statistical
Institute, the Paris School of Economics and the World Bank. The support of the Bank's Research
Committee and the Knowledge for Change Trust Fund is gratefully acknowledged. The paper's
findings, interpretations and conclusions are those of the authors and should not be attributed to
the World Bank, its Executive Directors, or the countries they represent.

1.      Introduction

        Publicly-supported grants and loans to poor areas have long been an important

vehicle for development assistance. For example, China's anti-poverty policies have
emphasized such poor-area programs since the mid-1980s,2 motivated by the observation

that the country's success against poverty over the last 25 years has been geographically
uneven, with marked disparities in living standards emerging.3 Advocates of such

programs claim that credit constraints in poor areas perpetuate their poverty and that

targeted aid can relieve those constraints. By this view, capital-market failures in poor

areas entail that the investments made under such a program would be infeasible

otherwise, implying both efficiency and equity gains.

        It remains an open question how much impact can be expected. While not perfect,

capital markets may still work well enough to assure that marginal products of capital

come into rough parity between poor and non-poor areas in steady state. Then the

problem of lagging poor areas is not so much lack of capital as low productivity of capital,

for example because of poor natural conditions, lack of complementary knowledge or skills, or

poor policies.

        And even with credit constraints, some people are clearly more constrained than

others. If those selected are not credit constrained, their participation is voluntary, and

the interest rate is no different from other credit sources, then there will be no net gain

from the extra availability of credit. Heterogeneity in the impacts of such programs can

also arise from inequalities in the complementary skills or knowledge needed to derive

benefits from the extra investment. Beneficiary selection will then be crucial to the

outcomes. However, it is not obvious that the selection procedures found in practice

would "pick the winners." Beneficiary selection for local development programs has

come to rely heavily on local community groups. This practice may well achieve greater

equality in access to the aid within villages, but possibly at the expense of assuring that

the aid goes to those who would benefit the most.


2       See (inter alia) Leading Group (1988), World Bank (1992, 1997), Jalan and Ravallion
(1998) and Park et al. (2002).
3       See, for example, Knight and Song (1993), Jian et al. (1996), Khan and Riskin (1998),
Ravallion and Jalan (1996, 1999), World Bank (1992, 1997), Kanbur and Zhang (1999) and
Ravallion and Chen (2006).



        This paper provides the first rigorous assessment of the longer-term impacts of a

poor-area program. The program is the World Bank's Southwest China Poverty

Reduction Project -- the Southwest Program (SWP) for short. This comprised a package

of multi-sectoral interventions targeted to poor villages using community-based

participant and activity selection. The aim was to achieve a large and sustainable

reduction in poverty. The paper reports results from an intensive survey data collection

effort over 10 years, initiated by two of the authors and done in close collaboration with

the Rural Survey Organization of China's National Bureau of Statistics.

        Assessing development aid effectiveness at the project level raises a number of

challenges. A long-term commitment to collecting high-quality survey data is crucial,

but it is not sufficient. Impact can only be meaningfully assessed relative to a

counterfactual; our counterfactual is the absence of the SWP, which means that we assess

the incremental impacts, on top of pre-existing governmental spending. As in any

observational study, there are concerns about selection bias, i.e., differences in

counterfactual outcomes between SWP participants and non-participants. Our data

collection effort allows us to "difference out" the time-invariant component of the

selection bias (arising from non-random placement). However, it is not obvious on a

priori grounds that the bias would be constant over time, given that the initial village

characteristics that attract the program (such as poor infrastructure) may also influence

the growth rate under the counterfactual. We use both propensity-score weighted

regression and kernel-matching methods to balance the observable covariates between

sampled SWP and non-SWP villages.
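
        As an illustration of how propensity-score weighting can balance a comparison group, the following is a minimal sketch and not the specification estimated in this paper. It assumes that comparison villages are reweighted by the odds p/(1-p) of their estimated propensity score p, which is one common choice when the target is the effect on the treated. All scores, income changes and function names below are hypothetical.

```python
def odds_weights(scores):
    """Weights p / (1 - p) for comparison units (an assumed weighting scheme)."""
    return [p / (1.0 - p) for p in scores]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical changes in mean income between baseline and follow-up.
treated_changes = [12.0, 8.0, 10.0]     # SWP villages
comparison_changes = [6.0, 4.0, 9.0]    # non-SWP villages

# Hypothetical estimated propensity scores for the comparison villages.
comparison_scores = [0.5, 0.2, 0.8]

weights = odds_weights(comparison_scores)
treated_mean = sum(treated_changes) / len(treated_changes)

# Unweighted double difference: simple means of the changes.
unweighted_dd = treated_mean - sum(comparison_changes) / len(comparison_changes)

# Weighted double difference: comparison villages that look more like
# SWP villages (higher p) count for more in the comparison mean.
weighted_dd = treated_mean - weighted_mean(comparison_changes, weights)

print(round(unweighted_dd, 3), round(weighted_dd, 3))
```

        In this made-up example the comparison village with the highest score also has the largest income change, so weighting lowers the estimated impact relative to the unweighted comparison.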

        A further problem is that aid-financed poor-area development projects are likely

to violate the common assumption in impact evaluations (both experimental and non-
experimental) of no interference with the comparison units.4 A plausible source of

interference in this setting is through local public-spending spillover effects to non-SWP

villages. The local government cuts its own development spending in the villages

targeted for external aid, and the spending is diverted in part at least to the non-



4       This assumption is often implicit in impact evaluations but it was made explicit by Rubin
(1980), who dubbed it the stable unit treatment value assumption (SUTVA). SUTVA is known to
be implausible in certain bio-medical evaluations.



participants used to form the comparison group. We propose and implement a test for

spillover effects and we construct a bound for the bias.

        The paper's principal finding is that there were sizeable income gains from the

SWP during its disbursement period, but these gains did not survive four years later. The

longer-term impact on mean income is neither large nor statistically significant.

However, we do find significant gains for some sub-groups, notably those among the

poor with better schooling. Our results point to substantial losses from the community-

based beneficiary selection process.

        The following section describes the SWP while Sections 3 and 4 describe our data

and methods. Section 5 presents the main results while Section 6 draws some lessons for

future evaluations.


2.      Background on the program

        In 1986, the Government of China designated about 15% of the country's

2,200 counties as "poor counties," which would receive extra assistance, mainly in the

form of credit for development projects. Past research has suggested that the designated

poor counties are in fact poor (by a range of defensible criteria) and that they have seen

higher growth rates than one would have otherwise expected (Jalan and Ravallion, 1998;

Park et al., 2002). The gains have not been sufficient to reverse the underlying tendency

for growth divergence (whereby poorer counties tend to have lower growth rates) and

there is evidence that the impacts on economic growth may have declined in the 1990s

(Park et al., 2002). Within these designated poor counties, geographic pockets of extreme

poverty have persisted to the present day, mainly in upland areas.

        The SWP was introduced in 1995 with the aim of reversing the fortunes of

selected poor villages in the designated poor counties of Guangxi, Guizhou and Yunnan.

About one-quarter of the villages were selected for the SWP (1,800 out of 7,600

villages). The aim was to choose relatively poor villages within these counties, with

selection based on objective criteria, although not formulaic. The selection was done by

the county government's project office in consultation with provincial and central

authorities and the World Bank.





        The total outlay on the SWP was US$464 million, which was financed by World

Bank loans and counterpart funding from China's central and provincial governments.

The total investment per capita under the SWP was only slightly lower than mean annual

income per capita of the project villages.

        As in other World Bank projects, there were numerous appraisal and supervision

missions by Bank staff and consultants, and these missions often probed quite deeply into

the project's local operations, including numerous visits to participating counties and

villages. Two of the authors (Chen and Ravallion) participated in some of these missions

and also revisited a number of the sampled villages over two weeks in May 2005

(including some that they had visited 10 years earlier) and had informal discussions about

the SWP with numerous ex-participants.

        Within the selected villages, virtually all households were expected to benefit

from the infrastructure investments, such as improved rural roads, power lines and piped

water supply. Widespread benefits were also expected from the improved social services,

including upgrading village schools and health clinics, and training of teachers and

village health-care workers. Those with school-aged children also received tuition

subsidies (conditional cash transfers). Over half of the households in SWP villages also

received individual loans (accounting for about 60% of disbursements). The interest rate

was set at the same level as for loans from the government's poor-area programs and the

Agricultural Development Bank of China, although this is a lower rate than for

commercial sources of credit. The loans financed various activities including initiatives

for raising farm yields, animal husbandry and tree planting. There was also a component

for off-farm employment, including voluntary labor mobility to urban areas and support

for village enterprises. The selection of project activities aimed to take account of local

conditions and the expressed preferences of participants, although it is unclear how well

this worked in practice; there have been reports that farmers' preferences were sometimes

over-ruled by local cadres (World Bank, 2003).

        Household selection into the SWP was a less transparent process than village

selection, which could be based on data and field observations. The household selection

was typically done by the pre-existing "farmers' committee" in each village and was not

subject to rigorous monitoring. From our discussions in field work, it appears that

credit-worthiness criteria and successful past experience with similar project activities played an

important role. No doubt local level connections also played a role.

        In common with other development projects, the SWP provided the capital and

technical assistance, but it did not provide insurance, and many of the activities are likely

to entail non-negligible risk; the income gains will depend on a number of contingencies,

including the vagaries of the weather, uncertain demand for the new products and risks

associated with out-migration.

        The ex ante expectation was that the SWP would virtually eliminate poverty in

the selected villages over the longer term. The World Bank's Implementation

Completion Report (ICR) -- the final document giving the ex post "self-assessment" of a

lending operation by the relevant operational unit -- claimed that the SWP had a

substantial impact on poverty, citing survey data indicating that the poverty rate had been
more than halved in the project areas over 1995-2001 (World Bank, 2003).5 However,

the attribution of these gains to the SWP is questionable. The evaluative claims in the

ICR are reflexive comparisons, which only reveal the true impact under the assumption

that there would have been no progress against poverty in the absence of the project.

That assumption must be deemed highly implausible in this setting.
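
        The point can be made with simple arithmetic. The sketch below uses hypothetical poverty rates (none of these numbers come from the ICR or from our surveys) to show that a reflexive comparison equals the double difference plus whatever change would have occurred anyway, here proxied by the change in a comparison group.

```python
def reflexive_change(before, after):
    """Before-and-after change for participants only."""
    return after - before


def double_difference(p_before, p_after, c_before, c_after):
    """Change for participants minus change for the comparison group."""
    return (p_after - p_before) - (c_after - c_before)


# Hypothetical poverty rates in percent (illustrative only).
project_before, project_after = 40.0, 18.0          # project villages
comparison_before, comparison_after = 38.0, 22.0    # comparison villages

reflexive = reflexive_change(project_before, project_after)
dd = double_difference(project_before, project_after,
                       comparison_before, comparison_after)
counterfactual_trend = comparison_after - comparison_before

# The reflexive comparison attributes the whole trend to the project.
print("reflexive:", reflexive)
print("double difference:", dd)
print("trend absorbed by the reflexive comparison:", counterfactual_trend)
```

        The two measures coincide only when the comparison group shows no change, which is the assumption of no progress in the absence of the project.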

        Ravallion and Chen (2005) studied the impacts of the SWP over the disbursement

period, 1995-2000, using survey data for 2,000 randomly sampled households in both

SWP and observationally similar non-SWP villages that had first been surveyed in 1995

(at the beginning of the project) and then annually until project completion. On

comparing income changes in SWP villages with those in the matched non-SWP villages,

they found an average income gain over five years of around 10% of baseline mean

income, representing an average rate of return of 9%. The gains are not as dramatic as

suggested by the reflexive comparisons in the ICR, but they are still sizeable.

        However, Ravallion and Chen found that a large share of the income gain was

saved. On comparing the final year of disbursement with the first, Ravallion and Chen

found only a modest impact on mean consumption or consumption poverty. The savings

rate from the project's income gains was well above the pre-intervention savings rate.

5       This was confirmed by researchers at the Chinese Academy of Social Sciences, who also
pointed to a substantial increase in primary school completion rates and a decline in the infant
mortality rate which they attributed to the SWP (Guobao et al., 2004).



        Why was there such a high savings rate from the initial income gains? A number

of explanations can be suggested, carrying rather different implications for the long-term

impact of the SWP. Possibly households saved more to assure they could repay the

loans. That depends on the extent to which repayment was enforced. While the World

Bank's loan is made to the (central) Government of China, and repayment is virtually

certain, that is not the case for the loans made at local level, where enforcement problems

are common. Indeed, local repayment rates on loans for poverty reduction under the

government's own program were less than 25% in the three provinces covered under
SWP.6 However, it may be that the necessity of the center repaying the World Bank

"trickled down" in the form of greater local enforcement of SWP repayments than for the

loans made under the government's own poor-area programs.

        Another possibility is that the high initial savings rate reflected a perception on

the part of participants that the longer-term income gains from SWP would be modest or
uncertain at best -- raising concerns about the sustainability of the program's impacts.7

When interpreted in terms of the Permanent Income Hypothesis, the Ravallion-Chen

findings imply that participants felt that a large share of the income gain was transient,

and (hence) it was saved. While this would happen even without uncertainty about the

future income gains, such uncertainty is likely, and would probably lead to precautionary
saving in response to the project.8 In this regard, it is instructive that Ravallion and Chen

found large year-to-year differences in impact, which were primarily due to variability in

the annual returns to the program's investments rather than the level of investment. This

variability in the returns suggests that participants would have had a hard time assessing

the program's impact on permanent income.

        The transient-income explanation suggests that the income impacts of SWP would

diminish appreciably after disbursement. Precautionary saving would also start to fall as



6       The repayment rates on loans for poverty reduction in 1997 ranged from 8% in Yunnan
to 23% in Guizhou. Repayment rates were somewhat higher for other types of loans but the
overall average was still only 30% (Government of China, 1998).
7       The ICR rated "sustainability" as "highly likely." The Bank's internal evaluation of SWP
by its Operations Evaluation Department pointed to the need for further evidence on the longer-
term sustainability and impact of SWP.
8       There is evidence of precautionary savings in response to uninsured risk in the same
region of rural China; see Jalan and Ravallion (2001).



participants learn more about the impacts. Consumption gains should become evident in

due course, consistent with the project's underlying impact on permanent income.

        There is another explanation for the high savings rate from the short-term income

gains. This postulates that the SWP systematically alters the returns-to-saving in the

participating villages. By this view, the project provided local public goods that

increased the marginal product of private capital, and so stimulated higher savings to

support the desired private investment, which would yield longer-term income gains
beyond the life of the project.9 This assumes that there are capital-market imperfections,

which entail that investment depends on own-savings and that the marginal products of

private capital are not equalized across locations. With the poor facing severe constraints

on access to credit and yet having higher marginal products of capital in their own (farm

and non-farm) enterprises (given low capital stocks and concave production functions)

one might expect to see a sizeable (and pro-poor) investment response. Clearly, this

explanation offers a more positive view of the prospects of a sustained impact on poverty

from the SWP, in that it suggests that income gains will persist well beyond the

disbursement period (as the returns to investment start to be realized) and that sustainable

consumption gains would emerge.

        By re-surveying in 2004/05 the same sample studied by Ravallion and Chen

(2005) we hope to throw light on which of these explanations is most plausible.


3.      Data

        The original plan for the impact evaluation of SWP was to do a baseline survey in

1995 and to only do follow-up surveys during the Bank's disbursement period, up to

2000. However, we decided to re-survey the original sampled households in 2004/05, to

try to resolve the issues about longer-term impact raised by Ravallion and Chen (2005)

and discussed in the previous section.

        All surveys were implemented by the Rural Household Survey (RHS) team of the

government's National Bureau of Statistics (NBS). The surveys covered 2,000

randomly-sampled households in 200 villages, with roughly half not participating in the

9       Jalan and Ravallion (2002) provide a micro model of growth with imperfect capital
markets that is consistent with this property, and find supportive evidence in the same region of
rural China.



SWP. All villages were in counties covered under the government's poor-area program,

to assure that we will identify the impact of the SWP, on top of the government's
program. There are 112 SWP villages and 86 non-SWP villages.10 The SWP villages

were a random sample from all project villages, while the non-SWP villages were a

random sample from all other villages in the designated poor counties. Ten randomly

sampled households were interviewed in each village.

         The 1996-2000 and 2004/05 surveys included community, household and

individual questionnaires. The community schedule collected data on natural conditions,

infrastructure and access to services. The household survey collected data on (inter alia)

incomes, consumption and assets. The individual questionnaires covered gender, age,

education and occupation. A data set was collected from 1997 to 2001 on development

project activities (both SWP and under other existing government programs). There are

34 project activities identified in these data, in seven categories (farming, animal
husbandry, forestry, infrastructure, education, health and labor migration).11

         We follow Ravallion and Chen (2005) in using 1996 as the baseline. There are
serious comparability problems between the 1995 survey and later surveys.12 As a

baseline, the 1996 data are not free of contamination; 17% of the program's total

disbursement on household projects had been made by the end of 1996. We check

robustness to using 1995 as the baseline.

         Relative to other household surveys, unusual effort went into obtaining accurate

estimates of consumption and income from the 1996-2000 and 2004/05 surveys. While

the community, individual and project activity surveys used conventional one-time

interviews, the household survey was quite different. The household surveys from 1996

onwards were closely modeled on NBS's Rural Household Survey (RHS) (which is



10       In the 2004/05 survey, two villages (one SWP and one not) were inadvertently replaced
by two different villages in the same township.
11       A project activity survey in 1998 also gathered information about the scale and the
starting year of each SWP sub-project at village and household level, as well as data on other
funding these villages and households received from the government and other sources.
12       Because of delays in NBS being told the locations of SWP villages, the first survey in
December 1995 had to use a one-time interview method, asking recall over the full year. The use
of this long recall period is likely to lead to underestimation of income and consumption (though
this is of less concern for village-level characteristics). The subsequent surveys used the daily-
diary method over the full year, allowing more accurate income and consumption data.



described in detail in Chen and Ravallion, 1996). This is a good quality budget and

income survey, notable for the care that goes into reducing both sampling and

non-sampling errors. As in the RHS, sampled households maintained a daily record of

all transactions plus log books on production. Local interviewing assistants visited each

household at intervals of two to three weeks, at which time inconsistencies found at the local

(county-level) NBS office were checked. Other trained interviewers also visited at regular

intervals to collect additional data. This intensive interviewing method is a marked

contrast to most surveys in which the respondent is visited only once or twice.

        The consumption aggregate is built up from very detailed data on cash spending

on all commodities and imputed values of in-kind spending, which is mainly

consumption from household production, valued at local selling prices. Living

expenditures exclude spending on production inputs (which are accounted for in net

income from own-production activities). They also exclude transfer payments, though

these only account for a small share of total spending (3.7% over the whole sample in

1996). The income aggregate includes cash income from all sources and imputed values

for in-kind income. Income is measured net of all production costs, including interest on

debt (including loans from the SWP). The migrant workers were not tracked, although

the income aggregate includes remittances received from family members who migrated,

including those supported by the SWP. Remittances are expected to be the main means

by which the out-migration component reduced poverty in the short run.

        Given the unusual effort that went into collecting and checking the

consumption and income data, we expect that subtracting consumption from income will

give reasonably accurate estimates of savings. We also look into what forms the savings

took. There are many forms of saving in this setting, including money balances and

investment in own-production activities. The survey was not designed to allow a

complete independent accounting of all forms of saving. Some data were collected on

assets and liabilities, although the reliability of the reported values is questionable. We

also study impacts on holdings of specific assets.
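
        Because savings are measured as income minus consumption and the double difference is linear in means, the estimated impact on savings is the impact on income less the impact on consumption. The following minimal sketch illustrates this accounting with hypothetical mean changes per capita; the numbers and names are for illustration only and are not survey results.

```python
def double_difference(treated_change, comparison_change):
    """Difference between the mean changes of the two groups."""
    return treated_change - comparison_change


# Hypothetical mean changes per capita between baseline and follow-up.
income_change = {"swp": 300.0, "non_swp": 200.0}
consumption_change = {"swp": 170.0, "non_swp": 150.0}

# Savings are the residual: income minus consumption.
savings_change = {
    group: income_change[group] - consumption_change[group]
    for group in income_change
}

income_impact = double_difference(income_change["swp"], income_change["non_swp"])
consumption_impact = double_difference(consumption_change["swp"],
                                       consumption_change["non_swp"])
savings_impact = double_difference(savings_change["swp"], savings_change["non_swp"])

# Share of the income gain that was saved rather than consumed.
share_saved = savings_impact / income_impact

print(income_impact, consumption_impact, savings_impact, share_saved)
```

        Any measurement error in either income or consumption carries over directly to the savings residual, which is why the accuracy of both aggregates matters here.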

        For the 2004/05 follow-up survey we used exactly the same survey instrument as

for the 1996-2000 surveys, augmented with a module to elicit perceptions of both welfare

and the project's impacts. The module asked respondents to assess whether various




aspects of their lives had improved over the preceding 10 years. (The questions in this

module were asked in 2005.) These involved a long list of aspects of well-being and in

each case the respondent was asked whether this item had improved or not over the last
10 years, on a 10-point scale (from "extremely worse off" to "extremely better off").13

(The sample was restricted to adults who were at least 28 years of age at the time of the

interview.) Our idea here is to see whether such a rapid appraisal tool -- which does not

require any prior surveys, including a baseline -- gives similar results to our more costly

longitudinal survey-based method.

        Over 1996-2005, the attrition rate was 12% (6% over 2000-05). Using a probit

model for attrition over 1996-2005 we found a number of significant predictors,
including age of head, share of children, landholding and some geographic variables.14

(Being an SWP village was not a significant predictor of attrition.) NBS survey teams

were instructed to find replacement households as similar as possible to those that

dropped out. We also tested how well this replacement worked, using a regression for

the probability of being a replacement household estimated on the pooled sample of
replacement and "drop-out" households.15 Among the same set of covariates for attrition,

no regressor was a significant predictor for replacement and the regression had very low

overall explanatory power. It appears that the sample with replacements can be

considered representative of the population.

        We checked the robustness of our results to several potential data problems. One

problem concerns the aggregation of total living expenditures. It appears that in

processing the 2004/05 survey data, living expenditure in one county may have failed to
include in-kind consumption.16 The data for three other households whose in-kind

income was more than six times larger than their total living expenditure seem to have a

similar problem. We re-estimated the impacts on consumption and income, dropping this


13      The Chinese and local language versions of the module were refined over time on the
basis of field tests in poor villages in a number of locations.
14      A statistical addendum is available from the authors giving full details.
15      Note that we have baseline data for the "drop-outs" and the current year's data for the
replacements. To deal with the time difference we did a pro-rata adjustment of the data on drop-
outs to 2004 values according to the ratios of the means over time for each variable, based on the
balanced panel. In calculating the ratios, we also weighted by the attrition probability.
16      We suspect there is a problem because the total living expenditure of 68% of the sample in
that county is equal to cash expenditure, whereas net in-kind income is about half of the overall total.



one county and the three households. The results reported below were robust to this

change (details available from the authors).

        Another potential data problem is related to the coding of SWP projects. We find

in the village-level project data base that all ten villages in one county claim to have a

project funded by the SWP, even though six of them were officially designated as non-
SWP villages.17 It may well be that there was significant SWP participation by villages

that had not been selected for the project in this particular county (although we cannot

rule out coding errors). On deleting this county we found that our main results were

robust (details are available from the authors).


4.      Estimation methods and sources of bias

        Our aim is to estimate average treatment effects on the treated. The double-

difference ("difference-in-difference") method identifies a project's impact under the

assumption that the selection bias (the counterfactual difference in outcomes) is constant

over time and additive in its effect on outcomes. In the present context, we point to two

sources of time-varying selection bias: (i) outcome changes are correlated with initial

differences between the participating and non-participating areas, and (ii) spillover

effects, whereby the project itself alters the subsequent path of outcomes for the non-

participants.

4.1     Biases due to targeting

        Let us begin with the classic evaluation problem. We have data on an outcome

measure $Y_{it}$ for the i'th unit observed at dates t=0,1. Each unit is observed to be either a

participant ($T_{it} = 1$) or non-participant ($T_{it} = 0$). We can write the outcome measure as:

                $Y_{it} = Y_{it}^{C} + T_{it} G_{it}$   (t=0,1; i=1,...,N)                    (1)

where $G_{it} = Y_{it}^{T} - Y_{it}^{C}$ is the gain ("impact"), $Y_{it}^{T}$ is the outcome under treatment and $Y_{it}^{C}$

is the counterfactual outcome. $G_{it}$ is not directly observable for any i (or in expectation)

since we do not know $Y_{it}^{T}$ for $T_{it} = 0$ and $Y_{it}^{C}$ for $T_{it} = 1$. The selection bias is the mean


difference in counterfactual outcomes (dropping the i subscripts):


17      There are scattered minor reports of SWP activity in non-SWP villages elsewhere, but
these appear to be random, and are probably coding errors.



                 $B_t = E(Y_t^{C} \mid T_1 = 1) - E(Y_t^{C} \mid T_1 = 0)$                      (2)

We call this the unconditional bias, given that we have not yet allowed for control

variables. Given the purposive targeting of the SWP it must be presumed that $B_t \neq 0$.

        The standard double-difference estimator assumes that B1 = B0 , implying that the

change in mean gains for period 1 participants is consistently estimated by:

         $DD = E[(Y_1^{T} - Y_0^{T}) \mid T_1 = 1] - E[(Y_1^{C} - Y_0^{C}) \mid T_1 = 0] = E[G_1 - G_0 \mid T_1 = 1]$       (3)

If period 0 is a true baseline, with $T_0 = 0$ for all i (by definition), then $Y_{i0} = Y_{i0}^{C}$ for all i,

and so DD = E(G1 | T1 =1), i.e., mean impact on the treated units.
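
To make the arithmetic of (3) concrete, the following is a minimal sketch and not the authors' code. The function name `double_difference` and the toy outcome values are hypothetical; the only thing taken from the text is the definition of DD as the mean outcome change for participants minus the mean outcome change for non-participants.

```python
import numpy as np

def double_difference(y0, y1, treated):
    """Simple DD: mean change for participants minus mean change for non-participants."""
    y0 = np.asarray(y0, dtype=float)
    y1 = np.asarray(y1, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    change = y1 - y0  # outcome change between the baseline (t=0) and t=1
    return change[treated].mean() - change[~treated].mean()

# Hypothetical data: three participants followed by three non-participants.
y0 = [100.0, 120.0, 110.0, 150.0, 160.0, 170.0]
y1 = [140.0, 150.0, 160.0, 170.0, 180.0, 190.0]
t1 = [1, 1, 1, 0, 0, 0]
dd = double_difference(y0, y1, t1)
print(dd)
```

With a true baseline, this quantity estimates the mean impact on the treated only if the selection bias is the same in both periods, which is the assumption questioned in the next paragraph.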

        However, time-invariant unconditional bias ( B1 = B0 ) is implausible for poor-area

development programs. The targeted poor areas typically lack infrastructure and other

initial endowments, which could (in turn) affect the subsequent growth rates. DD will

then be a biased estimator, since the subsequent outcome changes are a function of initial

conditions that also influenced the assignment of the sample between the two groups. In
other words, the selection bias will not be constant over time.18

        The direction of bias in DD depends on whether the underlying growth process is

convergent or divergent. For the government's poor-area programs in southwest China in

the 1980s, Jalan and Ravallion (1998) found that failure to control for the initial

heterogeneity between the targeted counties and non-participating counties yields a
downward bias in a DD estimator, consistent with growth divergence.19 However, it is

unclear whether this also holds across villages within the same (poor) counties; indeed,

the results of Jalan and Ravallion (2002) (also for southwest China) suggest that inter-

county divergence can occur side-by-side with intra-county convergence.

        We address this issue by balancing treatment and comparison units in terms of the

initial conditions that may have influenced program placement. These variables are

represented by the vector X. Our key identifying assumption is that the selection bias is

time-invariant conditional on X, i.e., that:

     $E(Y_1^{C} \mid T_1 = 1, X) - E(Y_1^{C} \mid T_1 = 0, X) = E(Y_0^{C} \mid T_1 = 1, X) - E(Y_0^{C} \mid T_1 = 0, X)$      (4)


18      This echoes more general concerns about the importance of correcting for selection bias
based on observables (Rosenbaum and Rubin, 1983; Heckman et al., 1998).
19      Also see Jalan and Ravallion (2002) who find evidence of divergence at the county level.



On applying a result due to Rosenbaum and Rubin (1983), if outcome changes are

independent of participation given X, then they are also independent of participation

given the propensity score: $P(X_i) = \Pr(T_i = 1 \mid X_i)$, $(0 < P(X_i) < 1)$. This justifies

balancing on P(X) to remove selection bias based on X. Note that this only addresses

time-varying selection bias based on observables; a bias will remain if there are any latent

(time-varying) factors correlated with the changes in counterfactual outcomes. As

discussed later, a remaining bias due to unobservables appears to be more likely for

household selection than village selection.

        We use various methods for assuring balance on P(X). One method is to limit

comparisons to a trimmed sub-sample with sufficient overlap in propensity scores. For

our data, the region of common support (minimum score for treated, maximum score for

untreated) is (0.11, 0.95). For our "trimmed sample" we chose a slightly tighter interval

(0.1, 0.9), which are also the efficiency bounds recommended by Crump et al. (2006) for
estimating average treatment effects with minimum variance.20
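
The two operations described in this paragraph can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: the function names and the toy scores are hypothetical, and treating the trimming interval as closed at both ends is our assumption, since the text does not say how boundary values were handled.

```python
import numpy as np

def common_support(score, treated):
    """Return (minimum score among treated, maximum score among untreated)."""
    score = np.asarray(score, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    return score[treated].min(), score[~treated].max()

def trim(score, lower=0.1, upper=0.9):
    """Boolean mask keeping units whose estimated score lies in [lower, upper]."""
    score = np.asarray(score, dtype=float)
    return (score >= lower) & (score <= upper)

# Hypothetical estimated propensity scores and treatment indicators.
scores = [0.05, 0.2, 0.5, 0.95, 0.3, 0.97]
treated = [0, 0, 1, 0, 1, 1]
support = common_support(scores, treated)
keep = trim(scores)
print(support, keep)
```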

        We also use the weighted-regression method proposed by Hirano, Imbens and

Ridder (2003). Thus we estimate the DD from the following regression:

                 $Y_{it} = \alpha + DD \cdot T_{i1} t + \gamma T_{i1} + \delta t + \varepsilon_{it}$   (t=0,1; i=1,...,N)          (5)

where $E(\varepsilon_{it} \mid T_{i1}) = 0$. This is estimated with weights of unity for treated units and

$\hat{P}(X)/(1 - \hat{P}(X))$ for controls, where $\hat{P}(X)$ is a consistent estimate of P(X) and

$0 < \hat{P}(X) < 1$. Hirano et al. show that weighting the controls this way yields an efficient

estimator.21 We estimate (5) on both the pooled sample (for t=0,1 and including

replacement households) and for both the total sample and trimmed sample.
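
A weighted regression of this kind can be sketched with ordinary weighted least squares. The code below is a minimal illustration and not the authors' implementation: the function name `weighted_dd`, the stacked data layout and the toy numbers are all hypothetical. It uses the weights described above (unity for treated units, estimated odds for controls) and returns the coefficient on the interaction between the participation and period dummies.

```python
import numpy as np

def weighted_dd(y, treated, period, pscore):
    """Weighted least squares for y = a + DD*T*t + g*T + d*t + e; returns DD."""
    y = np.asarray(y, dtype=float)
    T = np.asarray(treated, dtype=float)
    t = np.asarray(period, dtype=float)
    p = np.asarray(pscore, dtype=float)
    # Weights: 1 for treated units, p/(1-p) for controls.
    w = np.where(T == 1, 1.0, p / (1.0 - p))
    X = np.column_stack([np.ones_like(y), T * t, T, t])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns OLS into WLS
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[1]

# Hypothetical stacked panel: units (treated, treated, control, control) at t=0, then t=1.
y = [100, 120, 150, 130, 140, 150, 160, 170]
treated = [1, 1, 0, 0, 1, 1, 0, 0]
period = [0, 0, 0, 0, 1, 1, 1, 1]
pscore = [0.6, 0.7, 0.2, 0.5] * 2
dd_hat = weighted_dd(y, treated, period, pscore)
print(dd_hat)
```

In this toy example the unweighted DD would be 10, while down-weighting the control unit with the low score brings the estimate to 1, which shows how the weights can change the result when controls differ in their resemblance to treated units.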




20      Using the formula in Crump et al. (2006), the exact bounds are 0.0997 and 0.9003. For
estimating the average treatment effect on the treated Crump et al. also recommend dropping
treatment units with scores less than about 0.8 (for our data), but keeping all un-treated units. We
did not follow this recommendation. For one thing, we felt that this entailed the loss of too many
treatment villages, raising concerns about inference for the population of treated villages.
Secondly, our balancing tests performed better when we also deleted the low-score untreated
villages, which are clearly poor comparison units.
21      If we wish to estimate the average treatment effect for the population, the weights are
$1/\hat{P}(X)$ for the treated units and $1/(1 - \hat{P}(X))$ for the controls (Hirano and Imbens, 2002).



         To interpret (5) note that, in a balanced panel, we could instead estimate the

equivalent regression in the more familiar "fixed-effects" form:

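                 $Y_{it} = \alpha^{*} + DD \cdot T_{i1} t + \delta t + \eta_i + \nu_{it}$                                  (6)

Here the fixed effect is $\eta_i = \eta_i^{T} T_{i1} + \eta_i^{C} (1 - T_{i1}) = \gamma T_{i1} + \bar{\eta}^{C} + \mu_i$ (with $E(\eta_i \mid T_{i1}) \neq 0$), where

$\gamma = \bar{\eta}^{T} - \bar{\eta}^{C}$, $\mu_i = (\eta_i^{T} - \bar{\eta}^{T}) T_{i1} + (\eta_i^{C} - \bar{\eta}^{C})(1 - T_{i1})$, $E(\mu_i) = 0$, $\varepsilon_{it} = \nu_{it} + \mu_i$ and

$\alpha = \alpha^{*} + \bar{\eta}^{C}$. Thus, $\gamma T_{i1}$ in (5) picks up differences in the mean of the latent individual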

effects, such as would arise from initial selection into the program. The advantage of (5)

is that it does not require a balanced panel, and hence it gives estimates that are robust to

selective attrition (recalling that the replacements appear to have preserved the sample's

ability to represent the population).

         As a robustness check, we compare these estimates with matching on the

propensity score. Note first that the sample estimate of mean impact can be written as:

                 $\sum_{i=1}^{N_T} \left[ (Y_{i1}^{T} - Y_{i0}^{T}) - \sum_{j=1}^{N_C} W_{ij} (Y_{j1}^{C} - Y_{j0}^{C}) \right] / N_T$                  (7)

where $N_T$ is the number of SWP participants, $N_C$ is the number of control observations,

and $W_{ij}$ is the propensity score-based weight given to the j'th non-participant in making

a comparison with the i'th participant. How many non-participants to include in the

control group and how to assign weights to each non-participant are practical questions

in implementing PSM. One option is to use the popular method of nearest-neighbor

matching. However, because of the non-smoothness of nearest neighbor matching, the

conventional bootstrapping method is inappropriate for estimating the standard errors

(Abadie and Imbens, 2006). In order to assure valid bootstrapped standard errors, we

choose to apply nonparametric kernel matching in which all the non-participants are used

as controls and weights are assigned according to a kernel function of the predicted

propensity score (following Heckman et al., 1997, 1998). The weights can be written as

$W_{ij} = K_{ij} / \sum_{k} K_{ik}$, where $K_{ij} = K((\hat{P}_j(X) - \hat{P}_i(X))/a_n)$, in which $K(\cdot)$ is a kernel

function and $a_n$ a bandwidth parameter. We use the normal density function as the kernel

and the odds ratio (rather than propensity score) because SWP villages are over-sampled

relative to their frequency in the population eligible for the project.
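
The kernel-matched estimator in (7) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function name `kernel_matched_dd`, the bandwidth value and the toy data are hypothetical. The argument `score` stands for whatever matching variable is used; passing the estimated odds ratio instead of the propensity score would follow the choice described above.

```python
import numpy as np

def kernel_matched_dd(change, treated, score, bandwidth):
    """Mean over participants of own change minus kernel-weighted control change."""
    change = np.asarray(change, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    score = np.asarray(score, dtype=float)
    d_t, d_c = change[treated], change[~treated]
    s_t, s_c = score[treated], score[~treated]
    # u[i, j] = (score of control j - score of participant i) / bandwidth
    u = (s_c[None, :] - s_t[:, None]) / bandwidth
    K = np.exp(-0.5 * u ** 2)  # normal kernel; its constant cancels in the weights
    W = K / K.sum(axis=1, keepdims=True)  # each row of weights sums to one
    return np.mean(d_t - W @ d_c)

# Hypothetical outcome changes (t=1 minus t=0): two participants, two controls.
change = [40.0, 30.0, 20.0, 20.0]
treated = [1, 1, 0, 0]
score = [0.6, 0.7, 0.3, 0.5]
est = kernel_matched_dd(change, treated, score, bandwidth=0.1)
print(est)
```

Because every non-participant receives some weight, the estimator is a smooth function of the data, which is the property that motivates the use of kernel matching with bootstrapped standard errors in the text.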




        The conditional independence assumption motivates a specification test of

whether there are differences in observables between the project and non-project villages

after conditioning on P(X ) through matching or re-weighting. Following Rosenbaum

and Rubin (1985) and Abadie and Imbens (2006) we test for covariate balancing using

differences in standardized means between the SWP villages and matched or re-weighted

non-SWP villages. To achieve a better balance of covariates and to allow for a more

flexible estimate of propensity scores, we also include polynomial terms for the initial

income levels (see, for example, Smith and Todd, 2005). We will show that the matching

and re-weighting procedures produce a satisfactory balancing of the observables between

SWP and comparison villages.
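
A standardized difference in means of the kind used in such balancing tests can be computed as below. This is a sketch, not the authors' test: the function name and toy data are hypothetical, and dividing by the square root of the average of the two unweighted group variances is one common convention that we assume here; the exact formula used in the paper is given in its Appendix 2, not in this section.

```python
import numpy as np

def standardized_difference(x, treated, weights=None):
    """(weighted mean of treated - weighted mean of controls) / pooled SD."""
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    m_t = np.average(x[treated], weights=w[treated])
    m_c = np.average(x[~treated], weights=w[~treated])
    # Assumed convention: pooled SD from the unweighted sample variances.
    sd = np.sqrt((x[treated].var(ddof=1) + x[~treated].var(ddof=1)) / 2.0)
    return (m_t - m_c) / sd

# Hypothetical covariate for two SWP villages and two comparison villages.
x = [1.0, 3.0, 5.0, 7.0]
treated = [1, 1, 0, 0]
raw = standardized_difference(x, treated)
reweighted = standardized_difference(x, treated, weights=[1, 1, 3, 1])
print(raw, reweighted)
```

A value closer to zero after re-weighting or matching indicates better balance on that covariate.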

4.2     Biases due to spillover effects

        All the methods described above assume that an observationally similar

comparison group pre-intervention reveals the counterfactual of what would have

happened over time to mean outcomes for the treatment group in the absence of the

intervention. This will clearly not be the case if there are any spillover effects, whereby

the intervention changes outcomes for non-participants.

        Spillover effects due to residential mobility between villages are unlikely in this

setting given the village-level administrative land allocation. Under China's rural land

laws, a migrating household would have little prospect of getting a share of the land

available (and almost certainly cultivated) at the destination and would also risk losing

their land at the origin.

        Another source of spillover effects is inter-village trade (possibly via urban hubs).

To the extent that the project has an impact on local incomes and prices, trade-induced

general equilibrium effects will entail spillover effects to the non-SWP villages used to

infer the counterfactual. We will test for impacts on prices as well as incomes,

distinguishing cash-incomes (as derived from inter-village trade) and incomes-in-kind.

        Local public spending responses to project aid can also be confounding. Recall

that there were other development activities supported by the local (county and

provincial) governments, side-by-side with the aid-financed SWP. Non-SWP villages

could then be affected by a SWP-induced re-allocation of public spending by local

authorities. If the SWP does not have a lasting impact then the bias will probably be




confined to the disbursement period. However, if there are lasting impacts (observable to

local authorities) then one would expect the local spending response to the SWP to

continue beyond the disbursement period.

        A theoretical model of the local public spending response to external aid can help

inform an assessment of the likely bias. Let GOVj denote the local government's

spending on its own poor-area development programs in village j=SW, NSW, which index

the SWP (project) and non-SWP (comparison) villages, and total spending is

GOV = GOVSW + GOVNSW . (We treat the two groups as having equal size but this does

not change the main result as long as the group sizes are fixed.) The external aid

provides extra spending in the amount AID in the project villages, so that total spending

on poor-area programs in the SWP villages is GOVSW + AID . The local government has

a preference ordering over its spending allocation across the two sets of villages and its

spending on all other activities, denoted Z. The preference ordering is represented by:

W(GOVSW + AID,GOVNSW , Z) and this function is strictly increasing in all three

elements, and strictly concave in all three; it simplifies the analytics if we also assume

that the function is additively separable, though this can be weakened. The local

government maximizes W subject to its local revenue constraint, which creates an upper

bound on GOV+Z. Under these assumptions we have the following result (that is proved

in Appendix 1):

        Proposition: The external aid will displace local government spending in the

        project villages, increase spending in the comparison villages, but decrease total

        local government spending across both sets of villages.

The implication for our evaluation is plain: Comparing outcome changes over time

between SWP and (matched) non-SWP villages in the same counties will under-estimate

the project's true impact.
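
The Proposition can be illustrated with a special case. The sketch below assumes, purely for illustration, that W is the sum of the logarithms of its three arguments and that the solution is interior; neither assumption is in the text, which only requires W to be strictly increasing, strictly concave and (for simplicity) additively separable. The function name `allocate` and the revenue and aid figures are hypothetical. Under log utility the three arguments of W are equalized at one third of revenue plus aid.

```python
def allocate(revenue, aid):
    """Optimal allocation when W = log(GOV_SW + AID) + log(GOV_NSW) + log(Z)."""
    share = (revenue + aid) / 3.0  # first-order conditions equalize the three terms
    gov_sw = share - aid           # local spending in project villages
    if gov_sw < 0:
        raise ValueError("corner solution: aid too large relative to revenue")
    return {"gov_sw": gov_sw, "gov_nsw": share, "z": share}

# Hypothetical local revenue of 90, without and with external aid of 15.
before = allocate(90.0, 0.0)
after = allocate(90.0, 15.0)
print(before)
print(after)
```

In this example local spending in the project villages falls from 30 to 20, spending in the comparison villages rises from 30 to 35, and total local poor-area spending falls from 60 to 55, in line with the Proposition. Total spending in project villages including aid (35) then equals spending in comparison villages (35), so a comparison of the two groups would miss the aid's effect entirely; this extreme outcome is an artifact of the symmetric log form assumed here.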

        We will test for spillover effects. The presence of non-SWP development projects

in the SWP villages provides the clue. We use the same evaluation methods described

above, but the "outcome variable" becomes the extent of non-SWP project activity in the

SWP villages. The theoretical result in the above proposition will be exploited in

determining an upper bound to the bias induced by spillover effects.




5.      Estimated impacts

        Table 1 gives mean income and consumption and the poverty rates for 1996, 2000

and 2004/05. We use the poverty line of 808 yuan per person per year in 1995 prices

(corresponding to the $1 a day line used by Ravallion and Chen, 2005, at 1993

purchasing power parity) as well as poverty lines above and below this figure. We see

that the income gains in SWP villages between 1996 and 2000 were larger than among

non-SWP villages, but that this reverses between 2000 and 2004/05. Ten years after its

commencement, the SWP does not appear to have allowed the selected poor villages to

catch up with the rest of these (poor) counties.

        Table 1 suggests that SWP had little or no impact on income and consumption.

However, before accepting that conclusion we need to probe more deeply into the

potential sources of bias described in the previous section. We begin with selection bias

due to non-random placement of the SWP. At the end of this section we test for bias due

to contaminating spillover effects.

5.1     Probits for selection into the SWP

        Table 2 gives probits for whether a village was selected for SWP, as used to

estimate the propensity scores. The variables were chosen to reflect the selection criteria

used by the project staff (based on our interviews at the time).

        We find that project villages tend to be in more hilly/mountainous areas, are less

likely to have electricity, less likely to have a school in the village or nearby, though
more likely to have a health clinic within the village relative to nearby.22 The SWP

villages also tend to have larger populations, with lower mean income in 1995 (from the

village-level data), lower mean consumption in 1995 (from the household survey) and

more land per capita. The latter characteristic probably reflects lower population density

and lower land quality in the project villages. In most respects, the results of Table 2

suggest that the SWP villages tend to be poorer than other villages within the project

counties, consistent with Table 1.

        Using the propensity scores based on Table 2 to re-weight the data we were able

to obtain a close balancing of the characteristics of the two samples (including in the


22      Remote villages are more likely to have a very basic health clinic, to compensate for the
inaccessibility to more comprehensive township facilities.



means of the initial outcome variables), particularly after trimming the samples, as

discussed in the previous section. Appendix 2 provides details on the balancing tests,

which pass comfortably; this was also the case for a full set of covariates in Table 2, for

which the balancing tests are reported in the Addendum available from the authors.

5.2      Double-difference estimates of average impacts

         In assessing impacts on mean consumption and income, we begin with the simple

DD estimates of the mean impacts for income, consumption and saving, as given in Table

3. We give estimates for both 2000 (at the end of disbursements) and 2004/05 and for

both the levels and the logs; the latter gives higher weight to the gains to poorer

households. The baseline is 1996 in both cases.

         Focusing first on the disbursement period, we see a sizeable and statistically

significant impact on income but not consumption; the bulk of the income gain was

saved. (The same pattern was found using 1995 as the baseline.) On decomposing

income (as wage income, farming, animal husbandry, fishery, forestry, non-farm

enterprises, transfers and asset income), the only component that showed a statistically

significant impact was animal husbandry, for which the simple DD impact on net income

was 90.85 yuan (t=2.92), which rose to 117.26 (t=3.37) and 136.15 (t=3.55) using
weighting and matching (respectively) to correct for selection bias (Table 4).23

         Another way of disaggregating income is into cash or kind (which will be relevant

when we consider trade spillovers in section 5.4). We found that the bulk of the short-

term income impact was income in-kind from animal husbandry, as is evident from Table

4. This is puzzling, as a sizeable share of income-in-kind from husbandry in a rural

economy is also consumed directly, and should then show up in consumption. However,

the income in-kind that is being affected by the project appears to be small non-

productive animals and new litters of productive animals, which are counted as income in
kind but are held over for consumption or sale at a later date rather than consumed.24 We

will return to this point when we discuss the longer-term impacts.




23       We only report the results for husbandry, and summarize those for other components; a
statistical addendum is available with full details.
24       We do not have data on this, but the practice of counting such animals as income in-kind
is discussed in the manual for enumerators provided by NBS.



        We can also disaggregate consumption expenditure. On separating food staples

(rice, wheat etc) from non-staples and other foods we found significant impacts in 2000

for non-staple foods (meat, vegetables etc); the simple DD for this category was 26.26

yuan (t=1.68) though rising to 40.64 yuan (t=2.69) and 42.58 yuan (t=2.70) for the PS

weighted and kernel matched estimators respectively. This is likely to entail nutritional

gains through higher protein and more micro-nutrients.

        The results change dramatically when we track the impacts through to 2004/05, as

is evident when we return to Table 3. We find no significant impacts on mean income or
consumption over the longer observation period.25 (This was also true for staples

and non-staples separately.) Table 3 also gives the DD estimates for mean income using

the propensity scores to balance project and non-project villages; we give results using

both weighting and matching, for both end dates, and for both the trimmed sample and

total sample. The basic pattern in the simple DD estimates is still evident. The results
are robust to using kernel matching instead of the re-weighted regression method.26

        While there is clearly some sensitivity to the choice of estimation method, the

pattern is still reasonably robust, indicating significant and sizeable income gains during

the disbursement period but much less in the longer term. The estimated income gains in

2000 tend to be larger when we correct for purposive selection of SWP villages; this is

consistent with a divergent growth process between villages. However, no such pattern is

evident for the 2004/05 impacts.

        We did find significant longer-term impacts on income in-kind. On breaking up

income in-kind by source, we found that both farming and husbandry accounted for

almost all these long-run impacts, though only husbandry was significant (Table 4). The

simple DD estimate of impact in 2004/05 on income in-kind was 130.30 yuan (t=2.11),

though this fell somewhat when we corrected for selection; with weighting we obtained

DD=111.90 (t=1.89) while with matching, DD=96.98 (t=1.78). We found no other

significant impacts in the long run amongst cash income components.




25      The same pattern was evident using 1995 as the baseline, although impacts were
somewhat lower.
26      The results were also robust to deleting the troublesome county and the observations with
problematic data (section 3).



        In contrast to the period up to 2000, we find consumption gains in the post-

disbursement period. The impact on total consumption in 2004/05 is not statistically

significant (Table 3). However, when we break this up according to cash or kind, we do

find signs of larger impacts on consumption in kind. The simple DD estimate for

consumption in kind in 2004/05 is 118.40 yuan (t=2.54), although this drops appreciably

when we correct for selection bias; using PS weighting the impact is 74.46 (t=1.50). The

longer-term impacts on consumption in kind probably include consumption of the income

in-kind from animal husbandry that we observed in the SWP disbursement period.

        For either the simple DD or the score-weighted DD, the consumption gains

exceed what one could reasonably expect under the permanent income hypothesis (PIH)

if the income gains from SWP were purely transient. For then the consumption gains in

the four-year period following SWP would simply be the rate of interest times the

permanent-income equivalent of the transient income gain. For the simple and score-

weighted DD, plausible rates of interest would imply lower consumption gains than we

see in Table 3, although this is not true for the kernel-matched DD for which the post-

disbursement consumption gains equal the increment to permanent income at a rate of

interest of about 10%. Statistically, however, we cannot reject the null hypothesis that the

post-disbursement consumption gain equals the increment to permanent income (at

reasonable interest rates) treating the SWP income gain as transient.

        The PIH interpretation begs the question as to why we saw no consumption gains

in the disbursement period. If SWP participants knew at the outset that the project would

entail only a transient income gain then consumption would have immediately reflected

the implied gain to permanent income. However, from what we know about the SWP, it

is unlikely that participants could have formed a reliable estimate of the gain to

permanent income due to SWP until at least project completion. As noted in section 2,

there was considerable uncertainty about the income gains, and high initial savings may

have been a short-term precautionary response.

        We found no evidence of impacts on interest payments on loans or the proportion
of households paying interest or paying back loans, for either 2000 or 2004/05. 27 So we

find no support for the idea that either the high savings from the short-term gains or the


27      Again we only summarize the results here; the addendum gives full details.



lower longer-term impacts on incomes stem from greater enforcement of interest or

repayment requirements under the SWP, compared to other credit sources.

         With weak enforcement of the SWP loan repayments, it might be conjectured that

taxes on SWP areas would increase, to help local authorities pay back the SWP loans to

higher levels of government. However, we did not find any evidence of impacts on taxes

or fees paid per capita, in either 2000 or 2004. It appears that higher levels of

government treated the SWP as, in large part, a transfer payment to lower levels.

         In testing for impacts on agricultural productivity, we used total farm income per
unit area.28 We found no evidence of impacts. Nor did we find much evidence of

impacts on holdings of productive assets and wealth (including housing). This was true

for both the disbursement period and the longer-term. An exception is that the village
data base revealed a significant impact on livestock holdings, notably cows and goats.29

         There is some sign of a demographic impact. Household size fell in both SWP

and non-SWP villages over 1996-2000, but more so in the former. The simple DD for

household size is -0.13 persons (t=-1.75) and it is slightly larger with the corrections for

selection bias (the PS-weighted estimate is -0.16, t=-1.64, and it was similar for kernel

matching). The demographic effect was associated with slightly fewer children.

However, the demographic impact was not evident in 2004.

         Nor did we find any evidence of impacts on remittances received from family
members migrating out, or on the probability of a family member migrating.30

         We did find significant impacts on school enrolment rates during the

disbursement period; our PS-weighted DD estimate was 0.074 (with a t-ratio of 2.20),

i.e., a 7.4 percentage point increase in the school enrollment rate of children aged 6-14 by the year


28       Ideally we would use physical output for a given crop per unit area under its cultivation.
However, only total land area under cultivation was collected. Instead we used an overall farm
productivity measure, obtained by dividing total net income from farming by total cultivated area;
this can be interpreted as a mean of crop-specific yields weighted by both prices and shares of land.
29       The simple DD for cows per person in 2000 was 0.05 (t-ratio=2.47); with score-
weighting it rose to 0.07 (t=3.54) and it was the same with kernel matching (t=4.33). By 2004 the
impacts were slightly higher and equally significant statistically; the simple DD estimate was 0.07
(t=3.69) while with score-weighting the impact was 0.09 (t=4.05) and with kernel matching it was
0.10 (t=3.92). Significant impacts were also evident for sheep, although with lower t-ratios.
30       Out migration in the previous year is only measured for those present in the village at the
time of the interview, although NBS made an effort to ask the individual questions at times of the
year when migrants are more likely to be present. Remittances may well be the better indicator.



2000 is attributed to SWP.31 However, this impact had dropped substantially by 2004/05;

the corresponding DD estimate fell to 0.032 (t=1.00). The transient schooling impact

probably reflects the fact that the tuition subsidies ended with other SWP disbursements.

Of course, even though the non-SWP villages caught up substantially with the SWP

villages in schooling by 2004/05, there were children in SWP villages who entered

school earlier than without the SWP and this will probably yield future income gains.

        There was almost no sign of impacts on the prices of agricultural outputs and
purchase prices for inputs for 13 items.32 We found positive impacts during the

disbursement period for a number of types of infrastructure, although they are generally

not statistically significant. We found little sign of impacts in the 2004/05 data. The

exception was TV reception, which showed significant impacts in the longer-term as well

as during the disbursement period.

        Table 5(a) gives the estimated impacts on the incidence of income poverty for

various poverty lines; Table 5(b) gives the corresponding results for consumption

poverty. Again we give estimates using the poverty line of 808 yuan per person per year
as well as selected poverty lines above and below this figure.33 The poverty impacts in

the SWP disbursement period are broadly consistent with our findings for the impacts on
the mean income and consumption in Table 3.34 In Figure 1 we also give the results

graphically, by plotting the DD estimate of the impact on the headcount index of poverty

(for income and consumption poverty in panels (a) and (b) respectively) against the

poverty line, which we vary over virtually the whole distribution. Impacts on the income

poverty rate are largest just below the 808 poverty line, for both end dates. The impacts

on consumption poverty echo our results for mean consumption around the middle of the

range of poverty lines, where 2004/05 consumption-poverty impacts exceed those for



31      The uncorrected DD was 0.046 (t=1.41) and the kernel matched DD was 0.072 (t=2.40).
32      The only exceptions were that diesel oil had a significantly higher price in the SWP
villages by 2004/05 and edible oil crop had a slightly lower price.
33      The table only gives results for the trimmed sample, which is better balanced. However,
although the precise estimates differ between the two samples, the basic pattern was the same,
and our main conclusions do not depend on this choice.
34      The results were also robust to deleting the county in which some SWP activity was
recorded in non-SWP villages. We found an impact on extreme consumption poverty in 2004
after deleting the consumption outliers. (The weighted DD at the 500 consumption poverty line is
-8.06 with a t-ratio of -1.72; the weighted DD at 600 is -9.20 with a t-ratio of -1.67.)



2000; the results imply a sizeable nine percentage point drop in the consumption poverty

rate at poverty lines around 600 yuan. However, this is not true at lower and higher lines,

where impacts over the two time periods agree fairly closely.

        For all of the above impact estimates, the counterfactual is the absence of the

SWP. There is an alternative counterfactual of interest, namely the absence of direct

participation in any anti-poverty program, including the government's programs. For

identifying this counterfactual we can use those households in non-SWP villages who did

not participate in any other program; this applied to 69% of the households in non-SWP

villages. So we repeated the above calculations dropping those who recorded any direct

participation in other programs. (The balancing tests passed comfortably.) The impacts

for 2000 were similar to those above. However, the long-run impacts on mean income

and consumption were larger. For example, the simple DD estimate of the impact on

mean income in 2004 rose to 125 yuan per person (as compared to 45 yuan in Table 5)

although this fell to 99 yuan when we corrected for selection bias using PS weighting.

Nonetheless, the impacts relative to this alternative counterfactual were still not

significantly different from zero; for example, the t-ratio on the simple DD for mean

income was 1.47, which dropped to 1.13 with PS weighting.

5.3     Heterogeneity in impacts

        We tested for differences in impacts according to the initial values of income,
education and ethnicity.35 The score-weighted DD's were not significantly different for

any of our outcome variables when we stratified by education or ethnicity. However, we

found a notable difference when stratified by initial income (above or below the median),

with significant longer term gains for the low-income group. When we interacted income

with education we found that the longer-term gains were strongest for the relatively well

educated (at least junior high school) amongst the low-income households, as can be seen

in Table 6.

        The heterogeneity in returns suggests that a different assignment of the loans

would have increased overall impact. The household participation rate was slightly higher

for the group of relatively poor but well educated households; 61.1% of this group in


35      We distinguish Han Chinese from all other ethnic minorities. The ICR points to concerns
about how well ethnic minorities were reached by the SWP (World Bank, 2003).



SWP villages participated, as compared to 58.8% of those with above median income and

higher education, 50.0% of those with high income but low schooling, and 47.8% of

those with both low income and schooling. (The program slightly favored better educated

households both above and below median income.) Suppose that beneficiary selection

had focused solely on the relatively well-educated poor, and saturated this group, with no

change to conditional mean impacts by subgroup, which were zero for other groups

(consistent with Table 6). Then the impact of the program as a whole would have risen
substantially, from a mean impact of about 40 yuan per person to about 150 yuan.36 To

achieve this outcome, the program would have had to over-ride the community-based

selection process, which evidently put too little weight on reaching the educated poor,

even though this group was already favored in the selection process.
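
The arithmetic behind this counterfactual can be sketched as follows. This is only a back-of-envelope illustration using the rounded figures quoted in the text and in footnote 36 (an impact of about 200 yuan for the educated poor, scaled down by 25%); the variable names are illustrative and do not come from the paper.

```python
# Back-of-envelope version of the re-targeting counterfactual.
# Inputs are the rounded figures quoted in the text and footnote 36;
# all variable names are illustrative assumptions.
impact_educated_poor = 200.0   # approx. impact for the educated poor (yuan per person)
scale_down = 0.25              # footnote 36: scaled down by 25%
actual_mean_impact = 40.0      # approx. mean impact under the actual selection (yuan)

# Mean impact if selection had focused solely on the educated poor.
counterfactual_mean_impact = impact_educated_poor * (1.0 - scale_down)
gain_ratio = counterfactual_mean_impact / actual_mean_impact

print(f"counterfactual mean impact: {counterfactual_mean_impact:.0f} yuan per person")
print(f"ratio to actual mean impact: {gain_ratio:.2f}")
```

With these rounded inputs the counterfactual mean impact is 150 yuan, a little under four times the mean impact under the actual selection.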

        While we found no impacts on average remittances and out-migration, significant

positive impacts were evident when we stratified by initial income and education; the

impacts were significant for those who were initially above median income and (among

those with above-median income) were larger for those with more schooling.

5.4     Are we underestimating the impacts due to spillover effects?

        Biases in long-term impact estimates can arise from interference due to spillover

effects, as discussed in section 4.2. Our results do not offer much support to the idea of

trade-induced spillover effects. We have seen that there were no significant impacts on

prices, although it might be argued that arbitrage eliminated any price differentials. More

damaging to the notion that there were significant trade-spillovers across villages is the

fact that we did not find significant impacts on cash income, even during the

disbursement period; the short-term income gains were in kind, and mainly from animal

husbandry. Since inter-village trade is likely to involve cash, there must be a

presumption that such trade was affected rather little by SWP.

        What about bias due to the responses of the local political economy? From the

data on project activities, we counted the number of new non-SWP projects of each type

that started between 1996 and 2001 (inclusive). (So this is the change in the number of

non-SWP projects during the period.) For the loans made to households, the project data

36      This is based on an impact of about 200 Yuan for this group (Table 5), scaled down by
25% to reflect the number of households in this group, which would then represent 75% of the
total number of SWP participants.



also give counts of the total number of beneficiary households. However, we cannot tell

what happened in the post-disbursement period since it was only possible to collect the

project data we use for these calculations during the SWP disbursement period.
         Table 7 gives the results for various project activities.37 Large displacement
effects are evident for virtually all non-SWP activities.38 For most categories, the mean

in SWP villages is half or less that in non-SWP villages, implying that 40% or more of

the non-SWP spending allocation to SWP villages was cut, and re-allocated to non-SWP
villages.39 Such large displacement effects would imply that the benefits of the SWP are

likely to have spilled over to our comparison villages, leading us to under-estimate the

impacts of SWP.
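
One way to reproduce roughly that "40% or more" figure is sketched below. It assumes, as an illustration only, that spending per capita would have been equal across villages without the SWP, that every yuan cut from SWP villages is re-allocated to non-SWP villages in the same county, and that one quarter of villages are SWP villages; the function name and the symbols S, k and w are shorthand for this sketch.

```python
def displaced_share(k: float, w: float) -> float:
    """Return S / GOV, the share of counterfactual spending cut from SWP villages.

    Assumed accounting (a sketch, not the paper's code):
        GOV_SW  = GOV - S                  spending per capita in SWP villages
        GOV_NSW = GOV + S * w / (1 - w)    spending per capita in non-SWP villages
    where w is the population share of SWP villages and k = GOV_SW / GOV_NSW.
    Solving these two equations for S / GOV gives the expression returned here.
    """
    if not (0.0 <= w < 1.0) or k < 0.0:
        raise ValueError("expected 0 <= w < 1 and k >= 0")
    return (1.0 - k) * (1.0 - w) / (1.0 - w * (1.0 - k))


w = 0.25  # about one quarter of villages received the SWP
for k in (0.5, 1.0 / 3.0):
    print(f"k = {k:.2f}: displaced share = {displaced_share(k, w):.3f}")
```

Under these assumptions a ratio of one half corresponds to a cut of about 43%, and a ratio of one third to a cut of 60%.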

         How large is the bias in our estimates of the impact on income due to these

spillover effects? We shall assume that the displacement is entirely within the same

county; that is plausible given that the county government is the key decision maker in

the sub-county allocation. Invoking the theoretical result in section 4.2, we expect that

total government spending (in both project and comparison villages) will also fall. In

other words spending is expected to rise in the comparison villages by less than the

amount that had been displaced in the project villages. To determine an upper bound to

the bias we can assume that the increase in spending in the comparison villages exactly

equaled the displaced spending in the project villages. In this case we will be over-

estimating the bias due to spillover effects.

         To help throw light on the likely magnitude of bias due to spillover effects, let

GOV denote the spending done under the government's own program, expressed as

spending per capita of the total population. Some of this spending is done in SWP

villages and some is in the non-SWP villages; GOV = wGOVSW + (1- w)GOVNSW where

w is the population share of the SWP villages while GOVSW and GOVNSW denote the


37       The main activities excluded are minor infrastructure projects none of which showed any
significant displacement. When there is no response from a village for a specific activity we treat
it as a zero; this is plausible, although we test robustness to treating it as a missing value.
38       We repeated these tests using the total samples and treating all cases in which no entry
was made as missing values. The results in Table 9 were reasonably robust. (The effects tended
to be stronger under the alternative treatment of "no response" entries.)
39       Recall that about one quarter of villages in SWP counties received the aid project, so that
a non-SWP village will receive, on average, one third of the displaced spending.



observed (post-SWP) levels of government spending in SWP and non-SWP villages

respectively (per capita of the relevant population). We assume that in the absence of the

SWP there would be no difference in the level of the government's spending between

these two types of villages. The amount of displacement of non-SWP spending in SWP

villages that is attributed to the SWP is then (GOVNSW - GOVSW )(1- w) .40 The bias in

the double-difference estimate is RNSW (GOVNSW - GOVSW ) where RNSW is the income

rate of return to the government's projects.41 The true impact is thus:

$$DD^* = DD + R_{NSW}(GOV_{NSW} - GOV_{SW}) \tag{8}$$

On noting that $DD^* = R^*_{SW}\,AID_{SW}$, where $R^*_{SW}$ is the true rate of return to the SWP and the

external aid-financed investment is $AID_{SW}$ per capita in the SWP villages, we can then

derive the following formula for the proportionate bias:

$$\frac{DD}{DD^*} = 1 - \frac{R_{NSW}\,GOV}{R^*_{SW}\,AID}\,\theta \quad \text{where} \quad \theta \equiv \frac{w(1-k)}{1 - w(1-k)} \tag{9}$$

and where $k \equiv GOV_{SW}/GOV_{NSW}$ and $AID = w\,AID_{SW}$. There will be no bias if there is no

displacement (k=1), or the SWP is negligible in size (w=0) or the rate of return to the

displaced government investment is zero ($R_{NSW} = 0$).
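
For readers who want the intermediate steps, one way to get from (8) to (9) is sketched below. It uses only the definitions given in the text; $\theta$ is used here as shorthand for the term $w(1-k)/[1-w(1-k)]$ that multiplies the ratio of returns in (9).

```latex
\begin{align*}
GOV &= w\,GOV_{SW} + (1-w)\,GOV_{NSW}
     = GOV_{NSW}\,[1 - w(1-k)],
     \qquad k \equiv GOV_{SW}/GOV_{NSW} \\
GOV_{NSW} - GOV_{SW} &= (1-k)\,GOV_{NSW}
     = \frac{(1-k)\,GOV}{1 - w(1-k)} \\
\frac{DD}{DD^*} &= 1 - \frac{R_{NSW}\,(GOV_{NSW} - GOV_{SW})}{DD^*}
     = 1 - \frac{R_{NSW}\,(1-k)\,GOV}{[1 - w(1-k)]\,R^*_{SW}\,AID_{SW}} \\
  &= 1 - \frac{R_{NSW}\,GOV}{R^*_{SW}\,AID}\cdot\frac{w(1-k)}{1 - w(1-k)}
     \qquad \text{using } AID = w\,AID_{SW}
\end{align*}
```

The three special cases in the text can be read off the last line: the second factor is zero when k=1 or w=0, and the first factor is zero when $R_{NSW} = 0$.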

        However, this is still not a usable formula for determining an upper bound for the

bias since the measured rate of return to SWP spending will also be contaminated by the

spillover effect. (We assume that the bias due to the local-spending spillover effects

induced by the external aid only contaminates estimates of the rate of return to that aid.)

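The true rate of return is $R^*_{SW} = R_{SW}\,DD^*/DD$. Substituting into (9) and solving we have:

$$\frac{DD^*}{DD} = 1 + \frac{R_{NSW}\,GOV}{R_{SW}\,AID}\,\theta \tag{10}$$

where $\theta \equiv w(1-k)/[1 - w(1-k)]$, as in (9).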



40      Note that if $S$ is the per capita government spending displaced from SWP villages then
$Sw/(1-w)$ is the corresponding gain (per capita) in the non-SWP villages. $GOV_{SW} = GOV - S$ and
$GOV_{NSW} = GOV + Sw/(1-w)$.
41      Note that $Y_{SW} = Y^*_{SW} - R_{NSW}S$ and $Y_{NSW} = Y^*_{NSW} + R_{NSW}Sw/(1-w)$ are the
measured income gains where the * denotes the values without the spillovers. Also note that
$DD = Y_{SW} - Y_{NSW}$ and $DD^* = Y^*_{SW} - Y^*_{NSW}$. The following result is then easily derived.





        What are seemingly plausible values for the parameters of (10)? Jalan and

Ravallion (1998) estimated an average rate of return of 12% for the Government's poor

area development program in the same region of China over 1985-90. Using different

methods, Park et al., (2002) also estimate a rate of return to the Government's national

poor-area program of 12% in the period 1992-95. Using the same data, and similar

methods to the present study, Ravallion and Chen (2005) estimated that the rate of return

to the SWP spending during the disbursement period was RSW = 9%. So we set

RNSW / RSW = 1.33. One-quarter of villages in the poor counties participated in SWP, so

w=0.25. Based on Table 7 we can take k=1/3 to be a reasonable lower-bound (noting that

DD / DD* is strictly increasing in k).42 So $\theta = w(1-k)/[1-w(1-k)] = 0.2$. The level of investment per capita

under the non-SWP projects is about half of that under SWP (GOV / AIDSW = 0.5),

implying that GOV / AID = 2.43 Inserting these numbers into equation (10) we obtain

DD* / DD = 1.53 (implying $R^*_{SW}$ = 14%).


         So allowing for spillover effects could yield as much as a 50% larger income

gain attributed to the SWP during the disbursement period. The bias-corrected simple

DD estimate of the income gain during the disbursement period could rise to about 200

yuan per person, from 130 yuan. In principle, the consumption gains could also be biased,

although, given that we find virtually zero (indeed negative) consumption impacts in the

disbursement period, our conclusion that the income gains were fully saved remains

unaffected.
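
The calculation can be retraced with the short sketch below. It reads equation (10) as DD*/DD = 1 + (RNSW/RSW)(GOV/AID)θ, with θ = w(1-k)/(1-w(1-k)), and plugs in the parameter values quoted above; the function and variable names are illustrative, and the second call uses the alternative k=0.58 from footnote 42 purely as a sensitivity check.

```python
def theta(w: float, k: float) -> float:
    """Displacement term w(1-k) / (1 - w(1-k))."""
    return w * (1.0 - k) / (1.0 - w * (1.0 - k))


def bias_factor(r_ratio: float, gov_over_aid: float, w: float, k: float) -> float:
    """DD*/DD under the reading of equation (10) described in the lead-in."""
    return 1.0 + r_ratio * gov_over_aid * theta(w, k)


r_nsw, r_sw = 0.12, 0.09          # rates of return quoted in the text
w = 0.25                          # share of villages in the SWP
gov_over_aid_sw = 0.5             # GOV / AID_SW
gov_over_aid = gov_over_aid_sw / w  # AID = w * AID_SW, so GOV / AID = 2

upper = bias_factor(r_nsw / r_sw, gov_over_aid, w, k=1.0 / 3.0)
alt = bias_factor(r_nsw / r_sw, gov_over_aid, w, k=0.58)

print(f"theta (k=1/3):        {theta(w, 1.0 / 3.0):.3f}")
print(f"DD*/DD (k=1/3):       {upper:.2f}")
print(f"implied true R_SW:    {r_sw * upper:.3f}")
print(f"corrected DD (yuan):  {130.0 * upper:.0f}")
print(f"DD*/DD (k=0.58):      {alt:.2f}")
```

With k=1/3 the factor is about 1.53, which takes the 130 yuan estimate to just under 200 yuan; with k=0.58 the factor is closer to 1.3, so the bias is sensitive to the assumed degree of displacement.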

        The more interesting question concerns the post-disbursement period. Recall that

the tests for displacement in Table 7 do not cover the post-disbursement period. It might

be expected that the local spending balance between the treatment and comparison

villages would be restored once the external aid ceased. Although the data used in Table

7 are not available for 2004/05, we can at least test for long-term impacts on new loan

activity from non-SWP sources, as an indication of whether the SWP displaced other



42      Using the project database to compare average loan amounts for non-SWP projects in SWP
villages with those in non-SWP villages gives k=0.58.
43      According to the project data, mean lending per capita under non-SWP projects (whether
in SWP or non-SWP villages) represents 53% of the corresponding mean loan under the SWP
(per capita of the population in SWP villages).



sources of finance in the post-disbursement period. (In 1995 we know who had received

SWP loans so we can net this out of total loans received. Of course, in 2004/05 there

were no new SWP loans.) By these calculations, we found no significant impacts on

non-SWP loans in 2004/05. This does not suggest there was long-term displacement of

other sources of finance.

        While the displacement effect is presumably greater in the disbursement period, it

cannot be ruled out post-disbursement. If there are in fact longer-term gains from the

SWP and this is known locally then continuing positive displacement will be expected,

making it harder to identify those gains. However, even the upper bound to the bias

derived above of DD* / DD =1.5 is well short of being sufficient to imply a significant

long-term impact on mean income; assuming that the standard error is not biased by the

spillover effect, one would need to quadruple the income gain in 2004 before it could be

deemed statistically significant.


6.      Conclusions

        The longer-term impacts of aid to poor areas depend crucially on why these areas

are poor in the first place. If persistently poor areas arise from generalized capital-market

failures then external aid can relieve the credit constraints and so enhance long-run

growth. If instead the credit market failures are specific to certain (liquidity-constrained)

subgroups of the population then the aid will need to be targeted to those groups.

However, persistently poor areas can arise from other causes, such as governance failures

or (possibly policy-induced) distortions in other markets (including labor, such as due to

restrictions on migration). Heterogeneity in impacts can also interact with the beneficiary

selection process in a way that attenuates the aggregate impact.

        So the benefits from extra aid to poor areas may well be modest. Unfortunately,

the absence of rigorous studies of the long-term impacts of aid to poor areas has left a gap

in our knowledge about both the causes of geographically concentrated poverty and aid

effectiveness.

        To help fill this gap in knowledge, we have used a specially designed set of high-

quality surveys collected over a 10 year period to study the impacts of a World Bank-

financed poor-area development program in southwest China. We find a sizeable and




statistically significant impact on mean household income in the participating villages

during the disbursement period. However, there was a much smaller impact on

consumption during that period; the short-term income gains were largely saved

(although with some improvements in diet quality). Four years after disbursements had

ended, both project and non-project villages had seen sizeable economic gains, with only

modest net gain to mean income attributed to the project. Indeed, we cannot reject the

null hypothesis that the longer-term average impact was in fact zero, although we do find

evidence of longer-term impacts on income in-kind from animal husbandry.

        The most plausible interpretation of our findings appears to be as follows. The

high savings rate from the initial income gains reflected uncertainty about the future

impacts -- no doubt compounded by the uncertainty about the project's loan repayment

and interest obligations, given uncertain contract enforcement at local level. Farm

animals were clearly an important form of saving as well as being the main source of the

short-term income gains. No doubt the relevant uncertainties were resolved in the longer

term. Productivity gains turned out to be small. The initial income gains proved to be

transient for most households, although there was some persistence in the income gains

from animal husbandry. The mean consumption gains over the longer period are in

rough accord with what one would expect from the (modest) increment to permanent

income attributable to the project.

        We highlight three findings that raise broader issues for development programs.

First, heterogeneity in impacts can play an important role in explaining poor overall

outcomes. We find that there were significant and lasting income gains among the subset

of households who were initially poor and relatively well educated. Presumably these

households had more productive investment options, which could not be financed

otherwise given the liquidity constraints facing the poorest. The program's community-

based selection process favored the better educated, but expanded coverage of those who

were also poor could have greatly enhanced the program's overall impact. Given the

heterogeneity in returns, the implied (ex-post) deficiencies of the community-based

selection process help explain the program's disappointing overall impact. While the

program performed well in selecting poor villages, overall impacts were greatly

attenuated by inadequate coverage of the (educated) poor within poor villages.




        This finding points to a potentially serious trade-off facing such programs. The

desirability of more participatory processes of local beneficiary selection may well come

at a large cost to overall impacts, including on poverty. To assure larger impacts one

would need to over-ride this process by dictating the types of households that should be

targeted, based on the likely benefits to them. (In the program studied here, it appears that

the presence of complementary skills and knowledge, as proxied by education, was

crucial to the impact.) Whether that is feasible or not in practice is a moot point.

        Second, our results point to the importance of taking account of the participants'

inter-temporal behavior, such as in response to the uninsured risks often associated with a

development project. Those responses can cloud impacts in both experimental and non-

experimental evaluations. An evaluation that focused solely on the income or

consumption gains during the disbursement period (as is commonly the case) can give a

deceptive picture of the true impacts.

        Third, our findings illustrate how the responses of local development agents can

cloud identification of the long-term impacts of geographically-placed projects (whether

randomly placed or targeted). We found evidence of positive spillover effects on the

comparison villages through the displacement of other development spending during the

program's disbursement period. Such interference suggests that the classic impact

evaluation methods will systematically underestimate the impact. In our case, the biases

could well be substantial, although it is unlikely that these effects are imparting a

sufficiently large bias on our impact estimates (under seemingly plausible assumptions)

to overturn our main qualitative results. But this may well be a bigger problem in other

settings.





Appendix 1: Proof of the proposition in Section 4.2

       The problem is to maximize $W(GOV_{SW} + AID, GOV_{NSW}, Z)$ s.t.

$GOV_{SW} + GOV_{NSW} + Z \le R$, where $R$ is the local government's revenue. The first-order

conditions for an optimum require that:

$$W_{SW}(GOV_{SW} + AID) = W_Z(Z) \tag{A1.1}$$

$$W_{NSW}(GOV_{NSW}) = W_Z(Z) \tag{A1.2}$$

(in obvious notation). By the implicit function theorem, the optimal levels of $GOV_j$ and $Z$

are functions of $AID$. Differentiating (A1.1) and (A1.2) totally with respect to $AID$ we

have:

$$(W_{SS} + W_{ZZ})\frac{\partial GOV_{SW}}{\partial AID} + W_{ZZ}\frac{\partial GOV_{NSW}}{\partial AID} = -W_{SS} \tag{A2.1}$$

$$W_{ZZ}\frac{\partial GOV_{SW}}{\partial AID} + (W_{NN} + W_{ZZ})\frac{\partial GOV_{NSW}}{\partial AID} = 0 \tag{A2.2}$$

where $W_{SS}$ is the second derivative of $W$ w.r.t. $GOV_{SW}$, $W_{NN}$ is the second derivative of

$W$ w.r.t. $GOV_{NSW}$ and $W_{ZZ}$ is the second derivative of $W$ w.r.t. $Z$. Solving (A2.1) and

(A2.2) we have:

$$\frac{\partial GOV_{SW}}{\partial AID} = -W_{SS}(W_{NN} + W_{ZZ})/J < 0 \tag{A3.1}$$

$$\frac{\partial GOV_{NSW}}{\partial AID} = W_{SS}W_{ZZ}/J > 0 \tag{A3.2}$$

Summing (A3.1) and (A3.2) we also have:

$$\frac{\partial GOV}{\partial AID} = -W_{SS}W_{NN}/J < 0 \tag{A4}$$

where $J \equiv W_{SS}W_{NN} + W_{SS}W_{ZZ} + W_{ZZ}W_{NN} > 0$. Proposition 1 follows immediately.
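
As an informal check on the signs in (A3.1), (A3.2) and (A4), the sketch below evaluates those expressions for one specific welfare function and compares them with finite differences of the closed-form optimum. The additively separable log form W = ln(GOVSW + AID) + ln(GOVNSW) + ln(Z), the numerical values of R and AID, and all function names are assumptions chosen for illustration; the appendix does not impose a functional form.

```python
def comparative_statics(w_ss: float, w_nn: float, w_zz: float):
    """Evaluate (A3.1), (A3.2) and (A4) for given (negative) second derivatives."""
    j = w_ss * w_nn + w_ss * w_zz + w_zz * w_nn
    d_gov_sw = -w_ss * (w_nn + w_zz) / j
    d_gov_nsw = w_ss * w_zz / j
    return d_gov_sw, d_gov_nsw, d_gov_sw + d_gov_nsw


def log_optimum(revenue: float, aid: float):
    """Interior optimum when W = ln(GOV_SW + AID) + ln(GOV_NSW) + ln(Z).

    The first-order conditions equalize the three arguments, so each equals
    (revenue + aid) / 3 when the budget binds (this needs revenue > 2 * aid).
    """
    share = (revenue + aid) / 3.0
    return share - aid, share, share  # GOV_SW, GOV_NSW, Z


revenue, aid, step = 100.0, 10.0, 1e-3
gov_sw, gov_nsw, z = log_optimum(revenue, aid)

# Second derivatives of ln(x) are -1/x^2, evaluated at the optimum.
w_ss = -1.0 / (gov_sw + aid) ** 2
w_nn = -1.0 / gov_nsw ** 2
w_zz = -1.0 / z ** 2
formula = comparative_statics(w_ss, w_nn, w_zz)

# Finite-difference derivatives of the closed-form optimum with respect to AID.
gov_sw_up, gov_nsw_up, _ = log_optimum(revenue, aid + step)
numeric = ((gov_sw_up - gov_sw) / step, (gov_nsw_up - gov_nsw) / step)

print("formula (dGOV_SW, dGOV_NSW, dGOV):", formula)
print("numeric (dGOV_SW, dGOV_NSW):      ", numeric)
```

In this example government spending in project villages falls by two thirds of each unit of aid, spending in comparison villages rises by one third, and total government spending falls by one third, in line with the signs derived above.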





References

Abadie, Alberto and Guido Imbens, 2006, "Large Sample Properties of Matching Estimators for

      Average Treatment Effects," Econometrica, 74(1): 235-267.

______________________________, 2006, "On the Failure of the Bootstrap for Matching

      Estimators", mimeo, University of California, Berkeley.

Chen, Shaohua and Martin Ravallion, 1996, "Data in Transition: Assessing Rural Living

      Standards in Southern China," China Economic Review, 7: 23-56.

Crump, R., J. Hotz, G. Imbens, and O. Mitnik, 2006, "Moving the Goalposts: Addressing

      Limited Overlap in Estimation of Average Treatment Effects by Changing the Estimand,"

      National Bureau of Economic Research, Technical Paper 330, Cambridge, Mass.

Government of China, 1998, Yearbook of China Agricultural Development Bank. Beijing: China

      Statistics Press.

Guobao, Wu, Qiulin Yang and Chengwei Huang, 2004, "The China Southwest Poverty

      Reduction Project," Paper presented at the conference, Scaling Up Poverty Reduction,

      Shanghai, China. Shortened version published in Reducing Poverty on a Global Scale,

      Edited by Blanca Moreno-Dodson, World Bank, 2005, CD-ROM, pp. 255-258.

Heckman, James and Petra Todd, 1995, "Adapting Propensity Score Matching and Selection

      Model to Choice-Based Samples," Working Paper. Department of Economics, University

      of Chicago.

Heckman, J., H. Ichimura, and P. Todd, 1997, "Matching as an Econometric Evaluation

      Estimator: Evidence from Evaluating a Job Training Program," Review of Economic

      Studies 64(4): 605-654.

Heckman, J., H. Ichimura, J. Smith, and P. Todd, 1998, "Characterizing Selection Bias using

      Experimental Data," Econometrica, 66: 1017-1099.

Hirano, Keisuke and Guido Imbens, 2002, "Estimation of Causal Effects using Propensity

      Score Weighting: An Application to Data on Right Heart Catheterization," Health

      Services and Outcomes Research Methodology, 2: 259-278.

Hirano, Keisuke, Guido Imbens, and Geert Ridder, 2003, "Efficient Estimation of Average

      Treatment Effects Using the Estimated Propensity Score," Econometrica Vol. 71(4):

      1161-1189.

Imbens, Guido, 2004, "Nonparametric Estimation of Average Treatment Effects under

        Exogeneity: A Review," Review of Economics and Statistics, 86(1): 4-29.

Jian, Tianlun, Jeffrey Sachs and Andrew Warner, 1996, "Trends in Regional Inequality in

        China," China Economic Review, 7(1), 1-21.

Jalan, Jyotsna and Martin Ravallion, 1998, "Are There Dynamic Gains from a Poor-Area

        Development Program?" Journal of Public Economics, 67: 65-85.

___________ and ______________, 2001, "Behavioral Responses to Risk in Rural China,"

        Journal of Development Economics, 66: 23-49.

___________ and ______________, 2002, "Geographic Poverty Traps? A Micro Model of

        Consumption Growth in Rural China," Journal of Applied Econometrics, 17(4):

        329-346.

Kanbur, Ravi and Xiaobo Zhang, 1999. "Which Regional Inequality? The Evolution of Rural-

        Urban and Inland-Coastal Inequality in China from 1983 to 1995," Journal of

        Comparative Economics 27: 686-701.

Khan, Azizur Rahman and Carl Riskin, 1998, "Income Inequality in China: Composition,

        Distribution and Growth of Household Income, 1988 to 1995," China Quarterly,

        154: 221-253.

Knight, John and Lina Song, 1993, "The Spatial Contribution to Income Inequality in Rural

        China," Cambridge Journal of Economics 17: 195-213.

Leading Group, 1988, Outlines of Economic Development in China's Poor Areas, Office of the

        Leading Group of Economic Development in Poor Areas Under the State Council,

        Agricultural Publishing House, Beijing.

National Bureau of Statistics (NBS), 2000, The Poverty Monitoring Report of Rural China 2000,

        Beijing: China Statistics Press.

Pack, Howard and Janet Pack, 1990, "Is Foreign Aid Fungible? The Case of Indonesia,"

        Economic Journal 100: 188-194.

Park, Albert, Sangui Wang and Guobao Wu, 2002, "Regional Poverty Targeting in China,"

        Journal of Public Economics, 86(1): 123-153.

Ravallion, Martin and Shaohua Chen, 2005, "Hidden Impact: Household Saving in Response to

        a Poor-Area Development Project," Journal of Public Economics, 89: 2183-2204.

Ravallion, Martin and Jyotsna Jalan, 1999, "China's Lagging Poor Areas," American Economic

        Review, Papers and Proceedings 89(2): 301-305.




Roemer, John and Joaquim Silvestre, 2002, "The Flypaper Effect is Not an Anomaly," Journal

       of Public Economic Theory 4(1): 1-17.

Rosenbaum, Paul R., and Donald B. Rubin, 1983, "The Central Role of the Propensity

       Score in Observational Studies for Causal Effects," Biometrika, 70: 41-55.

_________________and ___________, 1985, "Constructing a Control Group Using Multivariate

       Matched Sampling Methods that Incorporate the Propensity Score," American

       Statistician, 39(1): 33-38.

Rubin, Donald B., 1980, "Discussion of the Paper by D. Basu," Journal of the American

       Statistical Association 75: 591-593.

Smith, Jeffrey A. and Petra E. Todd, 2005, "Does Matching Overcome LaLonde's Critique of

       Nonexperimental Estimators?" Journal of Econometrics, 125: 305-353.

van de Walle, Dominique and Ren Mu, 2007, "Fungibility and the Flypaper Effect of Project Aid:

       Micro-Evidence for Vietnam," Journal of Development Economics, 84: 667-685.

World Bank, 1992, China: Strategies for Reducing Poverty, Washington DC: World

       Bank.

__________, 1997, China 2020: Sharing Rising Incomes, Washington DC: World Bank.

__________, 2003, Implementation Completion Report on the Southwest Poverty Reduction

       Project, Report No. 26132, Washington DC: World Bank.





Figure 1: Impacts on poverty (trimmed sample)

[Figure not reproduced. Panel (a): Income poverty. Panel (b): Consumption poverty. Each panel
plots the DD poverty impact (% points, vertical axis from -14 to 2) against poverty lines (Yuan
per person per year, horizontal axis from 350 to 1150, with the 808 line marked), with separate
series for Year 2000 and Year 2004.]




Table 1: Summary statistics on outcome indicators

                                        1996                          2000                      2004/05
                                 SWP         Non-SWP          SWP          Non-SWP         SWP        Non-SWP
                                villages       villages     villages        villages      villages     villages
Mean income                     996.061       1158.319     1263.412        1223.698      1390.766     1518.963
                               (715.402)     (604.914)     (910.036)       (669.843)     (902.030)    (930.867)
Mean consumption                843.559        945.201      943.550        1023.352      1130.588     1211.973
                               (469.555)     (445.787)     (566.183)       (698.428)     (794.167)    (795.499)
Income poverty rate
Poverty line=600 yuan            0.222          0.127        0.138           0.112         0.123        0.095
                                (0.416)        (0.332)      (0.345)         (0.316)       (0.329)      (0.294)
                808 yuan         0.453          0.306        0.290           0.262         0.242        0.182
                                (0.498)        (0.461)      (0.454)         (0.440)       (0.429)      (0.386)
                1000 yuan        0.614          0.456        0.449           0.415         0.369        0.290
                                (0.487)        (0.498)      (0.497)         (0.493)       (0.483)      (0.454)
Consumption poverty rate
Poverty line=600 yuan            0.290          0.183        0.276           0.219         0.179        0.135
                                (0.454)        (0.387)      (0.447)         (0.414)       (0.384)      (0.342)
                808 yuan         0.576          0.454        0.509           0.441         0.385        0.317
                                (0.494)        (0.498)      (0.500)         (0.497)       (0.487)      (0.465)
                1000 yuan        0.757          0.648        0.675           0.627         0.537        0.468
                                (0.429)        (0.478)      (0.468)         (0.484)       (0.499)      (0.499)
Notes: Standard deviations are in parentheses. Income, consumption and poverty measures are weighted by
household size. There are 112 project villages and 86 comparison villages. Mean income and consumption are in Yuan
per capita per year at 1995 prices.
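
As a rough illustration of the double-difference logic, the sketch below computes raw double differences from the mean incomes in Table 1. These raw figures need not coincide with the DD estimates discussed in the text, which may rest on a different (for example trimmed) sample and on propensity-score weighting; the dictionary layout and function name are illustrative, and the 2004/05 column is keyed as 2004.

```python
# Mean income from Table 1 (Yuan per capita per year at 1995 prices).
# The 2004/05 survey round is keyed as 2004 for convenience.
table1_mean_income = {
    ("SWP", 1996): 996.061,
    ("SWP", 2000): 1263.412,
    ("SWP", 2004): 1390.766,
    ("Non-SWP", 1996): 1158.319,
    ("Non-SWP", 2000): 1223.698,
    ("Non-SWP", 2004): 1518.963,
}


def double_difference(means: dict, end_year: int, base_year: int = 1996) -> float:
    """Change over time in SWP villages minus the change in non-SWP villages."""
    change_swp = means[("SWP", end_year)] - means[("SWP", base_year)]
    change_non_swp = means[("Non-SWP", end_year)] - means[("Non-SWP", base_year)]
    return change_swp - change_non_swp


for year in (2000, 2004):
    dd = double_difference(table1_mean_income, year)
    print(f"raw DD in mean income, 1996 to {year}: {dd:.1f} yuan per person")
```

The raw summary means show the same qualitative pattern as the text: a sizeable relative gain for SWP villages by 2000 and a much smaller one by 2004/05.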





Table 2: Probit regression of village participation in the SWP using baseline covariates

                                                                    Coeff.     z-value
     Village on the plains                                          Reference category
     Hills                                                            4.876 (4.02)
     Mountainous                                                      2.771 (3.05)
     Whether village has electricity                                 -0.672 (-1.82)
     ...telephones                                                   -0.070 (-0.2)
     ...road passing through it                                       0.215 (0.59)
     ...radio transmitters                                            0.352 (1.09)
     Whether village can receive TV transmission                      0.237 (0.82)
     Located <5km from the nearest market                             0.028 (0.05)
     ...5-10 km from the nearest market                              -0.494 (-0.94)
     ...10-20 km from the nearest market                              0.740 (0.95)
     ...>20km                                                       Reference category
     # of days in a cycle during which the market assembles          -0.115 (-0.76)
     County town within 5 km                                        Reference category
     Distance from village to county town is 5-10km                   1.373 (1.95)
     ...10-20km                                                      -0.530 (-0.85)
     ...>20km                                                        -0.448 (-0.83)
     Township=village                                               Reference category
     Distance from village to township is within 5km                  0.137 (0.19)
     ...5-10km                                                        0.229 (0.34)
     ...10-20km                                                      -1.628 (-2.55)
     Main mode of transportation used by the villager: bicycle       -0.296 (-0.4)
     ...bus                                                          -0.305 (-0.9)
     ...other automobile                                              0.913 (1.71)
     ...walking                                                     Reference category
     Nearest train station is within 5 km                            -0.586 (-0.62)
     ...5-10km                                                        0.999 (1.39)
     ...10-20km                                                       1.111 (1.52)
     ...>20km                                                       Reference category
     Nearest bus station is within 5 km                               0.021 (0.07)
     ...5-10km                                                        0.265 (0.64)
     ...10-20km                                                       0.469 (1)
     ...>20km                                                       Reference category
     Whether village has a day-care center                            0.724 (1.38)
     Elementary school is in village                                Reference category
     Nearest elementary school is within 5km                          0.055 (0.16)
     ...5-10km                                                        0.737 (1.6)
     Middle school is in village                                    Reference category
     Nearest middle school is within 5 km                             1.026 (2.09)
     ...5-10km                                                        0.142 (0.21)
     ...10-20km                                                       1.551 (1.63)
     ...>20km                                                         0.882 (1.13)
     Medical clinic in village                                      Reference category
     Nearest medical clinic is within 5 km                           -1.026 (-2.79)
     ...5-10km                                                       -0.420 (-1.11)
     ...10-20km                                                      -0.820 (-1.24)
     ...>20km                                                        -0.997 (-1.46)




Total population of the village                                                 0.000 (1.99)
Irrigated land (mu)                                                            -0.001 (-2.8)
Forest land (mu)                                                                0.000 (-0.87)
# of people working in TVEs over labor force                                    0.139 (1.99)
Whether village has TVE                                                        -0.798 (-1.35)
Output of grain per capita (kg/person)                                          0.001 (1.52)
Net income per capita                                                           0.020 (2)
Net income per capita squared                                                   0.000 (-1.97)
Net income per capita cubed                                                     0.000 (1.66)
(End of year) # of pigs per person                                              0.972 (1.75)
(End of year) # of cows per person                                              0.840 (0.7)
(End of year) # of sheep, goat per person                                       0.531 (1.12)
(End of year) # of poultry per person                                           0.419 (2.54)
(End of year) # of honey bees per person                                       -5.412 (-2.27)
Workforce per capita                                                            0.036 (1.4)
Average household size                                                         -0.042 (-1)
Share of workforce female                                                      -0.082 (-1.68)
Cultivated land per capita (mu)                                                 1.438 (3.19)
Grassland per capita (mu)                                                       1.887 (1.43)
Village mean of consumption (log)                                              -0.493 (0.198)
Village mean of school enrollment (age 6-14)                                   -2.029 (-2.84)
Guangxi                                                                         1.394 (2.73)
Guizhou                                                                         0.659 (0.92)
Yunnan                                                                        Reference category
Intercept                                                                      -2.522 (-0.88)
Pseudo-R2                                                                              0.360
Note: The village is the unit of observation (n=200) and all explanatory variables are pre-
intervention (1995). Standard errors are adjusted for clustering at the county level.





Table 3: Impact of SWP on household income and consumption using propensity-score weighting or matching
                                          1996 mean     Gain in     Gain in non-                              PS                      Kernel
                                           in SWP        SWP           SWP           Simple               weighted                   matched
                                           villages     project       villages        DD        t-ratio      DD          t-ratio         DD         t-ratio
Trimmed sample
     2000    income                        981.906      196.322        66.012        130.31     1.826      182.655       2.541        169.150       2.392
             consumption (C)               841.729      67.092         70.480        -3.388     -0.067     -17.662       -0.313       -45.762       -0.751
             saving (S)                    140.223      129.185        -4.525       133.711     2.107      200.333       2.723        214.93        2.685
  2004/05    income                        981.906      432.325       387.399        44.926     0.500      42.975        0.455        42.234        0.549
             consumption                   841.729      345.947       287.687         58.26     0.870      58.535        0.786        18.312        0.223
             saving                        140.223      86.333         99.655        -13.322    -0.159     -15.544        -0.18       23.941        0.289

     2000    log income                     6.747        0.18          0.051          0.128     2.046       0.161        2.395         0.133        2.251
             log consumption                6.629        0.058         0.019          0.040     0.755       0.031        0.537         0.001        0.003
             log(1+S/C)                     0.117        0.120         0.031          0.089      1.79       0.131        2.467         0.133        2.617
  2004/05    log income                     6.747        0.345         0.264          0.081     1.171       0.062        0.823         0.038        0.522
             log consumption                6.629        0.299         0.210          0.090     1.707       0.067        1.130         0.025        0.474
             log(1+S/C)                     0.117        0.046         0.055         -0.009     -0.148     -0.005        -0.078        0.014        0.263
Total sample
     2000    income                         989.45      273.962        65.379       208.583     3.346      213.605       3.287        192.731       2.985
             consumption (C)               843.559      99.991         78.151         21.84     0.510     -151.054       -1.180      -189.569       -1.427
             saving (S)                    145.934      173.928       -12.828       186.755     3.141      364.696       3.371        382.342       3.612
  2004/05    income                         989.45      401.316       360.644        40.673     0.537      -47.159       -0.423       -45.246       -0.344
             consumption                   843.559      287.029       266.772        20.258     0.371      36.752        0.633        25.893        0.439
             saving                        145.934      114.244        93.816        20.427     0.303      -83.874       -0.705       -71.097       -0.52

     2000    log income                     6.752        0.230         0.050          0.180     3.448       0.180        3.337         0.160        3.221
             log consumption                6.631        0.087         0.028          0.059     1.374      -0.046        -0.577       -0.081        -0.945
             log(1+S/C)                     0.121        0.143         0.021          0.122     2.727       0.227        3.568         0.241        3.615
  2004/05    log income                     6.752        0.310         0.231          0.078     1.314      -0.005        -0.064        -0.01        -0.112
             log consumption                6.631        0.223         0.188          0.035     0.682       0.021        0.388         0.007        0.115
             log(1+S/C)                     0.121        0.087         0.043          0.044     0.915      -0.026        -0.307       -0.017        -0.185
Notes: All calculations are weighted by household size. The t-ratios for kernel matching are obtained by bootstrapping (100 repetitions). Standard errors of the
weighted D-D estimates are robust to heteroskedasticity and serial correlation of households within each village. The total sample has 112 project
villages and 86 comparison villages; the trimmed sample has 71 project villages and 66 comparison villages.
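
The "Simple DD" column in Table 3 is the gain in project villages minus the gain in comparison villages; for 2000 income in the trimmed sample, 196.322 - 66.012 = 130.310. The sketch below reproduces that arithmetic and shows a weighted version on hypothetical household data. The weights stand in for household size, optionally multiplied by a propensity-score weight; the exact weighting scheme is not restated in this table, so the weights here are placeholders.

```python
def weighted_mean(values, weights):
    """Weighted average of `values`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def double_difference(gains_project, weights_project,
                      gains_comparison, weights_comparison):
    """Weighted mean gain in project units minus weighted mean gain in comparison units."""
    return (weighted_mean(gains_project, weights_project)
            - weighted_mean(gains_comparison, weights_comparison))

# Check against Table 3 (trimmed sample, 2000 income).
simple_dd_income_2000 = 196.322 - 66.012  # reported as 130.31

# Hypothetical household-level gains (yuan) and weights (household size).
dd_example = double_difference(
    gains_project=[200.0, 100.0], weights_project=[4, 2],
    gains_comparison=[50.0, 80.0], weights_comparison=[3, 3],
)
```

Passing per-household gains rather than levels keeps the sketch close to the table layout, which reports gains by group before differencing them.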

Table 4: Impacts on income from animal husbandry

                                                       Gain in       Gain in                                PS                    Kernel
        Revenue or costs from           1996 mean      SWP            non-                               weighted                matched
        animal husbandry (AH)            in SWP        project       SWP           DD          t-ratio      D-D        t-ratio      D-D        t-ratio
2000 total revenue                        326.983       107.6        33.894       73.706       2.302      100.03       2.703     118.883       2.498
        total cost of production          190.901      -15.516        1.623      -17.139       -0.801    -17.229       -0.779     -17.271      -0.717
        net income from AH                136.082      123.117       32.271       90.845       2.924      117.26       3.373     136.154       3.551
        cash income (net)                 142.204      12.411         1.587       10.824       0.663      14.684       0.858       -2.356      -0.685
        in-kind income (net)              -6.123       110.705       30.684       80.021        2.79     102.575       3.099     125.853       2.895
2004 total revenue                        326.983      196.889      225.753      -28.864       -0.507     -1.357       -0.023     12.356       0.246
        total cost of production          190.901      80.772       121.847      -41.075       -1.196    -36.175       -1.031     -42.896      -1.285
        net income                        136.082      116.118      103.906       12.212       0.282      34.818       0.785      55.252       1.344
        cash income (net)                 142.204      103.745      150.839      -47.093       -1.025    -30.646       -0.578      3.219       0.641
        in-kind income (net)              -6.123       12.372        -46.932      59.305       2.033      65.464       1.805      74.179       1.705
Notes: All calculations are weighted by household size. The t-ratios for kernel matching are obtained by bootstrapping (100 repetitions). Standard
errors of the weighted D-D estimates are robust to heteroskedasticity and serial correlation of households within each village. The trimmed sample is
used, for which there are 71 project villages and 66 comparison villages.





Table 5: Propensity score weighted estimates of impacts on poverty



                  Poverty         (1)            (2)
                 incidence    Change in       Change in
                 (1996) in       H in            H in         (1)-(2)
    Poverty        project      project      comparison       Double
      line        villages      villages       villages      difference   t-ratio
                              (a) Income poverty
                                       2000
      500            14.584        -6.747           0.957        -7.704    -2.138
      600            22.762        -7.331          -1.672        -5.659    -1.247
      700            35.116      -13.093            1.490       -14.582    -2.824
      808            46.697      -15.713           -4.599       -11.114    -1.515
      900            55.047      -15.193           -4.771       -10.422    -1.581
     1000            62.025      -12.906           -3.606        -9.300    -1.395
     1100            68.973      -10.802            1.642       -12.444    -2.195
     1150            72.405        -9.981           2.484       -12.465    -2.256
                                      2004/05
      500            14.584        -8.053          -5.021        -3.032    -0.809
      600            22.762      -12.250           -6.779        -5.470    -0.857
      700            35.116      -19.410         -11.533         -7.877    -1.046
      808            46.697      -24.907         -19.276         -5.630    -0.693
      900            55.047      -26.344         -22.915         -3.429    -0.444
     1000            62.025      -28.097         -23.816         -4.281    -0.530
     1100            68.973      -27.623         -19.537         -8.086    -1.352
     1150            72.405      -28.378         -20.347         -8.031    -1.424
                           (b) Consumption poverty
                                       2000
     500          18.673        -2.695         6.111         -8.806     -1.691
     600          29.053        0.078          5.298         -5.221     -0.841
     700          40.749        1.140          1.088         0.052       0.006
     808          57.392        -5.266        -1.902         -3.364     -0.386
     900          67.000        -5.761        -0.715         -5.046     -0.734
    1000          75.665        -6.102        -4.570         -1.532     -0.248
    1100          80.898        -4.987        -5.782         0.796       0.164
    1150          83.586        -5.184        -3.569         -1.615     -0.347
                                      2004/05
     500          18.673       -11.537        -4.081         -7.456     -1.500
     600          29.053       -16.661        -7.918         -8.743     -1.536
     700          40.749       -18.226        -13.352        -4.874     -0.747
     808          57.392       -23.241        -19.095        -4.146     -0.584
     900          67.000       -24.439        -22.567        -1.872     -0.267
    1000          75.665       -25.936        -23.121        -2.815     -0.520
    1100          80.898       -24.192        -21.455        -2.737     -0.511
    1150          83.586       -22.006        -17.962        -4.044     -0.789
  Notes: All the calculations are weighted by household size. Standard errors
  are robust to heteroskedasticity and serial correlation of households within
  each village. The trimmed sample is used with 71 project villages and 66
  comparison villages.




Table 6: Estimated impacts stratified by initial income and education

                                            Lower education group                          Higher education group
                                                    Weighted DD                                      Weighted DD                   Weighted
                                   1996 mean            for lower                  1996 mean           for higher                    triple
                                      in SWP           education                      in SWP           education                   difference
                                     villages          group (1)         t-ratio     villages          group (2)         t-ratio     (1)-(2)       t-ratio
                                                               Initial income below median
      2000 income                       643.538               81.686      1.015          645.831          207.958         2.525      -126.271       -1.491
               consumption              664.573              -43.809     -0.593          674.167            55.069        0.604       -98.878       -1.246
               saving                   -20.989             125.518       2.167          -28.290          152.875         1.460       -27.357       -0.291
               productive assets        413.096              -58.508     -0.753          311.452            86.098        1.424      -144.606       -1.737
               housing value            501.121              -39.476     -0.189          611.993          173.552         0.959      -213.028       -0.947
 2004/05 income                         643.538               43.687      0.319          645.831          197.933         2.026      -154.246       -1.079
               consumption              664.573               97.623      1.188          674.167          219.517         2.370      -121.894       -1.105
               saving                   -20.989              -53.914     -0.521          -28.290           -21.598       -0.247       -32.316       -0.277
               productive assets        413.096               80.478      0.752          311.452          134.206         1.985       -53.728       -0.446
               housing value            501.121             216.285       0.866          611.993          815.739         2.481      -599.454       -2.022
     Number of households                                      312 (173+139)                                   299 (169+130)
                                                               Initial income above median
      2000 income                      1465.163             305.638       1.535        1476.474           174.261         1.194       131.376        0.587
               consumption             1061.494            -237.040      -1.268        1170.625              -8.934      -0.071      -228.105       -0.979
               saving                   403.747             542.693       1.720          305.881          183.215         1.375       359.479        1.086
               productive assets        600.292            -160.040      -1.775          609.010           -34.391       -0.374      -125.649       -1.054
               housing value            842.872             343.787       1.768        1109.570             60.008        0.303       283.780        1.118
 2004/05 income                        1465.163              -27.644     -0.133        1476.474            -54.414       -0.348        26.770        0.107
               consumption             1061.494              -24.752     -0.179        1170.625          -136.847        -0.913       112.095        0.637
               saving                   403.747               -2.876     -0.015          305.881            82.452        0.493       -85.328       -0.389
               productive assets        600.292             120.572       1.089          609.010         -201.500        -1.258       322.072        1.816
               housing value            842.872             432.315       0.874        1109.570          -697.603        -0.910     1129.918         1.331
     Number of households                                        204 (97+107)                                  363 (170+193)
Notes: The numbers in parentheses are the numbers of observations in SWP villages and non-SWP villages respectively. Estimation uses a balanced
panel of 1178 households from the trimmed sample. Lower education is defined as a household head whose education is below junior high school
(illiterate or primary school). Higher education is defined as a household head with at least junior high school education. Standard errors of weighted D-D
estimates are robust to heteroskedasticity and serial correlation of households within each village. The balanced panel in the trimmed sample covers 67 project
villages and 62 comparison villages.
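
The "Weighted triple difference" column in Table 6 is the weighted DD for the lower education group minus that for the higher education group, column (1) minus column (2). A minimal sketch that checks this against two rows of the table; the function name is illustrative, and small discrepancies in the last digit reflect rounding in the published figures.

```python
def triple_difference(dd_lower_education, dd_higher_education):
    """Weighted DD for the lower education group minus that for the higher group."""
    return dd_lower_education - dd_higher_education

# Initial income below median, 2000 income: Table 6 reports -126.271.
td_income_2000 = triple_difference(81.686, 207.958)

# Initial income below median, 2004/05 housing value: Table 6 reports -599.454.
td_housing_2004 = triple_difference(216.285, 815.739)
```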





Table 7: Testing for displacement of new non-SWP development projects in SWP villages
                                        Mean in      Mean in non-                                       PS weighted                   Kernel
                                    SWP villages     SWP villages         Difference        t-ratio         diff.        t-ratio  matched diff.       t-ratio
Farming
Number of projects                        0.79            2.11               -1.32          -2.45           -1.68        -2.09         -2.04           -2.03
Number of households                    147.63          399.44             -251.80          -2.48         -182.99        -2.43        -205.03          -2.05
Animal husbandry
Number of projects                        1.51            3.03               -1.52          -2.08           -2.21        -1.98         -2.38           -1.78
Number of households                    135.09          324.87             -189.78          -1.17          -94.99        -1.18        -62.14           -1.00
Forestry
Number of projects                        0.54            1.34               -0.79          -2.50           -1.50        -1.84         -2.28           -1.66
Number of households                    131.63          296.63             -165.00          -1.41         -120.06        -1.97        -117.65          -3.15
Infrastructure
Terracing                                 0.12            0.65               -0.53          -2.08           -0.94        -1.58         -1.35           -1.46
Drinking water                            0.31            0.90               -0.59          -3.04           -0.86        -2.58         -1.04           -2.54
Irrigation                                0.24            0.60               -0.36          -1.80           -0.30        -1.42         -0.27           -1.31
Electricity                               0.28            0.58               -0.30          -2.21           -0.49        -2.01         -0.61           -1.52
Roads                                     0.19            0.39               -0.20          -1.89           -0.24        -1.39         -0.25           -1.53
Student subsidies: No.                    0.82            2.35               -1.53          -3.03           -1.74        -2.79         -1.75           -2.83
New schools: No.                          0.35            0.79               -0.44          -2.10           -0.55        -1.96         -0.84           -2.06
Teacher training: No.                     0.07            0.37               -0.30          -1.87           -0.39        -2.25         -0.37           -2.26
Health insurance                          0.16            0.31               -0.14          -2.05           -0.12        -1.26         -0.05           -0.69
New clinic                                0.06            0.24               -0.18          -3.84           -0.12        -1.85         -0.09           -1.46
Doctor training                           0.07            0.26               -0.18          -1.62           -0.18        -1.56         -0.12           -1.23
Total no. projects                        6.07           14.81               -8.73          -3.25          -11.72        -2.45        -13.68           -2.14
Total no. households                    415.38          1026.10            -610.71          -1.74         -399.02        -2.22        -386.20          -2.98
Notes: "No response" is treated as "no project". The t-ratios for kernel matching are obtained by bootstrapping (100 repetitions). Standard errors of
D-D and weighted D-D estimates are robust to heteroskedasticity and serial correlation of villages within each county. The trimmed sample is used, with 71
project villages and 65 comparison villages.





Appendix 2: Balancing tests for village characteristics and household outcomes with and without weighting and trimming


                                                                                             Difference in standardized means
                                                                                                      PS kernel-
                                                                                PS weighted for    matched for total    PS-weighted for       PS kernel-matched
                                   Standardized means*         Un-weighted        total sample          sample          trimmed sample       for trimmed sample

                                    SWP         Non-SWP
                                  villages       villages     mean     s.e.      mean      s.e.     mean        s.e.    mean         s.e.      mean        s.e.
 Village characteristics (1995)
 Total population                   0.009        -0.012      0.021    0.143      0.013    0.137     0.180     0.134     0.076      0.186       0.115     0.187
 Electricity                       -0.151         0.196      -0.347   0.141     -0.229    0.138     -0.028    0.157     0.104      0.164      -0.268     0.147
 Phone                              0.053        -0.069      0.122    0.143      0.109    0.141     0.373     0.132     0.155      0.168       0.072     0.171
 Road                              -0.061        0.079       -0.139   0.143     -0.094    0.141     0.090     0.155     0.211      0.164      -0.130     0.134
 Radio                              0.044        -0.058      0.102    0.143      0.075    0.135     0.241     0.126     0.271      0.155       0.193     0.170
 TV                                -0.084        0.109       -0.193   0.142     -0.131    0.143     0.056     0.152     0.117      0.175      -0.136     0.163
 Nearest market <5km               -0.036         0.047      -0.083   0.143     -0.068    0.148     0.100     0.152     0.078       0.187      0.417      0.206
 Elementary school in village      -0.009         0.011       -0.02   0.143     -0.031    0.143     0.102     0.129     -0.075     0.182      -0.005      0.18
 Clinic in village                  0.021        -0.028      0.049    0.143      0.051    0.141     0.258     0.129     0.043      0.170       0.073     0.172
 Net income per capita             -0.162         0.211      -0.373   0.141     -0.241    0.142     0.133     0.124     0.073       0.164      0.094     0.171
 Cultivated land per capita         0.134        -0.173      0.307    0.141      0.238    0.135     0.251     0.122     0.299      0.151      -0.159     0.144
 Household outcomes (1996)
 Consumption per capita            -0.156        0.203        -0.36   0.141     -0.217    0.190     -0.195    0.206     -0.069     0.181      -0.007     0.181
 Income per capita                 -0.168        0.219        -0.39   0.141      -0.23    0.139     -0.238    0.137     -0.182     0.181      -0.153     0.185
 Headcount poverty index
 ___600 yuan (income)               0.175        -0.227      0.402    0.141      0.384    0.169     0.442      0.18     0.248      0.201       0.244     0.234
 ___808 yuan (income)               0.196        -0.256      0.452     0.14      0.345    0.172     0.375     0.196     0.194       0.192      0.108     0.194
 ___1000 yuan (income)              0.212        -0.277      0.489    0.139      0.41     0.202     0.457     0.256     0.165       0.201      0.062     0.216
 ___600 yuan (consumption)          0.161         -0.21      0.371    0.141      0.455    0.186     0.534      0.18     0.259      0.198       0.325     0.214
 ___808 yuan (consumption)          0.155        -0.202      0.357    0.141      0.253    0.194     0.242     0.213      0.07      0.194       0.008     0.218
 ___1000 yuan (consumption)         0.171        -0.222      0.393    0.141      0.319    0.268     0.291     0.308     0.006      0.182      -0.130     0.203
Notes: * (sub-group mean minus mean for full sample)/standard deviation for full sample. In the total sample, there are 112 project villages and 86 comparison villages.
In the trimmed sample, there are 71 project villages and 66 comparison villages. Household income, consumption and poverty measures are weighted by
household size. The Addendum provides further balancing tests.
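
The standardized means defined in the note can be illustrated as follows: each group mean is centered on the full-sample mean and divided by the full-sample standard deviation, and the un-weighted difference is the gap between the two (for total population, 0.009 - (-0.012) = 0.021). This is a sketch on hypothetical data. Using the sample (n-1) standard deviation is an assumption, since the note does not say which version was used.

```python
import statistics

def standardized_group_means(values, in_project):
    """Return (project, comparison) means standardized by the full-sample mean and SD.

    Assumption: the full-sample SD is the sample (n-1) standard deviation.
    """
    full_mean = statistics.mean(values)
    full_sd = statistics.stdev(values)
    project = [v for v, flag in zip(values, in_project) if flag]
    comparison = [v for v, flag in zip(values, in_project) if not flag]
    z_project = (statistics.mean(project) - full_mean) / full_sd
    z_comparison = (statistics.mean(comparison) - full_mean) / full_sd
    return z_project, z_comparison

# Hypothetical village characteristic and project indicator.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
in_project = [True, True, True, False, False, False]

z_p, z_c = standardized_group_means(values, in_project)
difference = z_p - z_c

# Check against the first row of Appendix 2 (total population, un-weighted).
table_difference = 0.009 - (-0.012)  # reported as 0.021
```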



