WPS4155



                                           How Good a Map?
                           Putting Small Area Estimation to the Test

         Gabriel Demombynes, Chris Elbers, Jean O. Lanjouw, and Peter Lanjouw1

                                                   Abstract

         This paper examines the performance of small area welfare estimation. The
method combines census and survey data to produce spatially disaggregated poverty and
inequality estimates. To test the method, predicted welfare indicators for a set of target
populations are compared with their true values. The target populations are constructed
using actual data from a census of households in a set of rural Mexican communities.
Estimates are examined along three criteria: accuracy of confidence intervals, bias and
correlation with true values. We find that while point estimates are very stable, the
precision of the estimates varies with alternative simulation methods. While the original
Elbers et al (2002, 2003) approach of numerical gradient estimation yields standard errors
that seem appropriate, some computationally less-intensive simulation procedures yield
confidence intervals that are slightly too narrow. Precision of estimates is shown to
diminish markedly if unobserved location effects at the village level are not well captured
in underlying consumption models. With well specified models there is only slight
evidence of bias, but we show that bias increases if underlying models fail to capture
latent location effects. Correlations between estimated and true welfare at the local level
are highest for mean expenditure and poverty measures and lower for inequality
measures.


Keywords: Poverty, Inequality, Small Area Estimation
JEL Classification: C13, C88, D31, I32, O15, R13


World Bank Policy Research Working Paper 4155, March 2007

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the
exchange of ideas about development issues. An objective of the series is to get the findings out quickly,
even if the presentations are less than fully polished. The papers carry the names of the authors and should
be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely
those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors,
or the countries they represent. Policy Research Working Papers are available online at
http://econ.worldbank.org.




1World Bank, Free University of Amsterdam, UC Berkeley and World Bank. We are grateful to Martin
Ravallion and Danny Pfeffermann for comments and suggestions. The views in this paper are the authors'
and should not be interpreted to reflect those of the World Bank or affiliated institutions.

1. Introduction


        This paper examines the performance of a method for producing small area

estimates of the spatial description of economic welfare. The methodology is described in

Elbers, Lanjouw and Lanjouw (2002, 2003), henceforth referred to as ELL (2002). These

"poverty maps" offer the promise of generating useful data about poverty and inequality

at the local level, information which has potential applications in both the policy and

research spheres. In this paper, an unusual data set is used to compare community-level

welfare measures estimated using the small area estimation method against measures

created from direct observations of household expenditure collected over the entire

population within those communities.

        Poverty maps have two sets of uses. They can be used as tools for geographical

targeting of social spending. In a number of countries they have been used by

governments and non-governmental organizations to identify those areas where the poor

are concentrated as a first step towards directing resources to the poor. While

policymakers in wealthy nations are accustomed to having information about local level

conditions and welfare readily at hand, in the typical less developed country, information

compiled at the local level is scarce and only available through specialized surveys. In

such environments poverty maps are a potentially valuable resource.

        On the research front, poverty maps have a variety of applications. With the

resurgent interest in economic growth theory, and in particular the focus on inequality's

role, spatial profiles of welfare within a country can be useful. Poverty maps can also be

used to investigate the spatial relationship between poverty and a variety of outcomes,

including health and crime. The research applications for poverty maps are particularly

strong when poverty maps can be produced for multiple years in a single country. In such

cases poverty maps can be employed for policy evaluation.





        The method examined here has been employed for a number of countries, and the
resulting poverty maps have been utilized by both policymakers and researchers.2 The

growing popularity of the methodology adds to the need for a validation exercise.

        The analysis in this paper compares the predicted poverty and inequality rates

produced by the methodology for groups of rural Mexican communities to the actual

poverty and inequality rates in those communities. One strength of the small area

estimation approach is that it produces confidence intervals for its estimated welfare

measures. An important objective in this paper is to assess to what degree the confidence

intervals produced by the ELL method capture the distribution of error in the point

estimates. Bias in the point estimates is also examined. The paper is organized as follows.

Section 2 details the poverty mapping methodology. Section 3 describes the data

employed, Section 4 sketches the validation exercise, and Section 5 presents the results.

Section 6 concludes with a discussion of results and their implications.


2. Methodology


        This section reviews the poverty mapping methodology, which is explained in
more detail in ELL (2002).3 The basic approach is straightforward and typically involves

a household survey and a population census as data sources. First, the survey data are

used to estimate a prediction model for either consumption or incomes. The selection of

explanatory variables is restricted to those variables that can also be found in the census

(or some other large dataset) or in a tertiary dataset that can be linked to both the census

and survey. The parameter estimates are then applied to the census data, expenditures are

predicted, and poverty (and other welfare) statistics are derived. The key assumption is

that the models estimated from the survey data apply to census observations. The first

stage begins with an association model of per capita household expenditure for a

household h in location c, where the explanatory variables are a set of observable

characteristics:


2 Poverty Maps based on this method are now underway or completed in more than 30 developing
countries. Early examples include Alderman et al. (2002) and Mistiaen, Ozler, Razafimanantena and
Razafindravonona (2002). See also Demombynes et al (2002).





(1)     $\ln y_{ch} = E[\ln y_{ch} \mid x_{ch}] + u_{ch}$.



        The locations correspond to the survey clusters as they are defined in a typical

two-stage sampling scheme. The observable characteristics must be found as variables in

both the survey and the census or in a tertiary data source that can be linked to both data
sets.4

        Using a linear approximation to the conditional expectation, the household's

logarithmic per capita expenditure is modeled as


(2)     $\ln y_{ch} = x_{ch}^{T}\beta + u_{ch}$.



        The vector of disturbances, $u$, is distributed $F(0, \Sigma)$. The model in (2) is

estimated by Generalized Least Squares using the household survey data. In order to

estimate the GLS model, $\Sigma$, the associated error variance-covariance matrix, is estimated.

Individual disturbances are modeled as


(3)     $u_{ch} = \eta_{c} + \varepsilon_{ch}$,


where $\eta_{c}$ is a location component and $\varepsilon_{ch}$ is a household component. This error structure

allows for both spatial autocorrelation, i.e. a "location effect" for households in the same

area to the extent that it is not already covered by location-level explanatory variables,

and heteroskedasticity in the household component of the disturbance. The two

components are uncorrelated and (by construction) uncorrelated with observable

characteristics in the regression equation.

        The model in (2) is first estimated by simple OLS. The residuals from this

regression serve as estimates of overall disturbances, given by $\hat{u}_{ch}$. These residuals are

decomposed into uncorrelated household and location components:


3Early variants of the methodology were presented in Hentschel et al (2000) and Elbers, Lanjouw and
Lanjouw (2000). These earlier versions differ in important ways from the approach outlined in ELL (2002).




(4)     $\hat{u}_{ch} = \hat{\eta}_{c} + e_{ch}$.


The estimated location components, given by $\hat{\eta}_{c}$, are the within-cluster means of the

overall residuals. The household component estimates, $e_{ch}$, are the overall residuals net

of location components. Additional parameters are estimated: $\hat{\sigma}_{\eta}^{2}$, the variance of $\eta_{c}$,

and $\hat{V}(\hat{\sigma}_{\eta}^{2})$, the variance of $\hat{\sigma}_{\eta}^{2}$.5
        To allow for heteroskedasticity in the household component, a logistic model of

the variance of $\varepsilon_{ch}$ conditional on a set of variables, $z_{ch}$, is estimated, bounding the

prediction between zero and a maximum, $A$, set equal to $1.05 \cdot \max\{e_{ch}^{2}\}$:




(5)     $\ln\left[\dfrac{e_{ch}^{2}}{A - e_{ch}^{2}}\right] = z_{ch}^{T}\hat{\alpha} + s_{ch}$.




Letting $\exp\{z_{ch}^{T}\hat{\alpha}\} = B$ and using the delta method, the model implies a household-

specific variance estimator for $\varepsilon_{ch}$ of



(6)     $\hat{\sigma}_{\varepsilon,ch}^{2} \approx \left[\dfrac{AB}{1+B}\right] + \dfrac{1}{2}\,\mathrm{Var}(s)\left[\dfrac{AB(1-B)}{(1+B)^{3}}\right]$.


        This heteroskedasticity model generates a vector of coefficient estimates, $\hat{\alpha}$, and

the variance-covariance matrix, $\hat{V}(\hat{\alpha})$. The coefficient estimates are used to predict

$\hat{\sigma}_{\varepsilon,ch}^{2}$, the household-specific term for the variance of $\varepsilon_{ch}$.
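
As a minimal sketch of how the estimator in equation (6) might be evaluated for a single household, the function below takes the fitted index $z_{ch}^{T}\hat{\alpha}$, the bound $A$ and the residual variance $\mathrm{Var}(s)$ as inputs. The function and argument names are illustrative assumptions and are not taken from ELL (2002) or from POVMAP2.

```python
import math


def household_variance(z_alpha: float, A: float, var_s: float) -> float:
    """Approximate household-specific variance, following equation (6).

    z_alpha : fitted linear index z_ch' alpha_hat for one household
    A       : upper bound on the variance, 1.05 * max(e_ch ** 2)
    var_s   : estimated variance of the residual s_ch in equation (5)
    """
    B = math.exp(z_alpha)
    # Leading term: the logistic prediction, bounded between 0 and A.
    first_order = A * B / (1.0 + B)
    # Second-order delta-method correction for the nonlinearity in s_ch.
    correction = 0.5 * var_s * A * B * (1.0 - B) / (1.0 + B) ** 3
    return first_order + correction


# Illustrative values only: with B = 1 the correction term vanishes.
print(household_variance(0.0, 2.0, 0.5))
```

When $B = 1$ the correction term is zero and the prediction is $A/2$; for $B > 1$ the correction is negative, and for $B < 1$ it is positive.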



        These error calculations are used to produce two square matrices of dimension n,

where n is the number of survey households. The first is a block matrix, where each

block corresponds to a cluster, and the cell entries within each block are $\hat{\sigma}_{\eta}^{2}$. The second




4 Note that these variables need not be exogenous.
5See Appendix 1 of Elbers et al (2002) for details.



is a diagonal matrix, with household-specific entries given by $\hat{\sigma}_{\varepsilon,ch}^{2}$. The sum of these

two matrices is $\hat{\Sigma}$, the estimated variance-covariance matrix for the original model given

by equation (2). Once this matrix has been calculated, the original model is estimated by

GLS.
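
The residual decomposition in equation (4) and the assembly of the variance-covariance matrix can be sketched as follows. This is a simplified illustration in plain Python: the helper names are hypothetical, and the location variance and the household-specific variances are taken as given inputs rather than estimated as in Appendix 1 of ELL (2002).

```python
from collections import defaultdict
from statistics import mean


def decompose_residuals(residuals, clusters):
    """Split OLS residuals into cluster means (eta_hat) and remainders (e)."""
    by_cluster = defaultdict(list)
    for u, c in zip(residuals, clusters):
        by_cluster[c].append(u)
    # Location component: within-cluster mean of the overall residuals.
    eta_hat = {c: mean(us) for c, us in by_cluster.items()}
    # Household component: overall residual net of the location component.
    e = [u - eta_hat[c] for u, c in zip(residuals, clusters)]
    return eta_hat, e


def build_sigma(clusters, sigma2_eta, sigma2_eps):
    """Sum of a cluster block matrix and a household-specific diagonal matrix."""
    n = len(clusters)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if clusters[i] == clusters[j]:
                sigma[i][j] = sigma2_eta   # same entry everywhere within a block
        sigma[i][i] += sigma2_eps[i]       # household-specific diagonal term
    return sigma


# Illustrative inputs, not taken from the PROGRESA data.
eta_hat, e = decompose_residuals([0.3, 0.1, -0.2, -0.4], ["a", "a", "b", "b"])
sigma_hat = build_sigma(["a", "a", "b", "b"], 0.02, [0.1, 0.2, 0.3, 0.4])
```

Households in different clusters have zero covariance in this structure, so the matrix is block diagonal when households are ordered by cluster.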

         In the second stage predicted log expenditures and subsequently local-level

estimates of poverty and their accompanying standard errors can be generated via several

routes. Elbers et al (2002) describe a method based on numerical gradient estimation.

An alternative approach known as parametric bootstrapping (Pfeffermann and Tiller,
2005) has been found to yield closely similar results and proceeds as follows.6 A series

of simulations are conducted, where for each simulation r a set of first stage parameters

are drawn from their corresponding distributions estimated in the first stage. A set of beta
and alpha coefficients, $\tilde{\beta}^{r}$ and $\tilde{\alpha}^{r}$, are drawn from the multivariate normal distributions

described by the first stage point estimates and their associated variance-covariance
matrices. Additionally, $(\tilde{\sigma}_{\eta}^{2})^{r}$, a simulated value of the variance of the location error

component, is drawn.7 Combining the alpha coefficients with census data, for each
census household $(\tilde{\sigma}_{\varepsilon,ch}^{2})^{r}$, the household-specific variance of the household error

component, is estimated. Then, for each household, simulated disturbance terms, $\tilde{\eta}_{c}^{r}$ and

$\tilde{\varepsilon}_{ch}^{r}$, are drawn from their corresponding distributions.8 A value of expenditure for each

household, $\hat{y}_{ch}^{r}$, is simulated based on both predicted log expenditure, $x_{ch}^{T}\tilde{\beta}^{r}$, and the

disturbance terms:




6We will see below that while the methods yield very similar point estimates, the approach employed in
ELL (2002) produces slightly wider (and possibly more plausible) confidence intervals. In Appendix 1 we
outline yet a third approach that yields confidence intervals that also more closely track those obtained with
the method outlined in ELL (2002).
7The $(\tilde{\sigma}_{\eta}^{2})^{r}$ value is drawn from a gamma distribution defined so as to have mean $\hat{\sigma}_{\eta}^{2}$ and variance
$\hat{V}(\hat{\sigma}_{\eta}^{2})$.
8Non-normality is allowed for in the distribution of both $\eta_{c}$ and $\varepsilon_{ch}$. For example, for each distribution,
a Student's t-distribution can be chosen with degrees of freedom such that its kurtosis most closely matches
that of the first stage residual components, $\hat{\eta}_{c}$ or $e_{ch}$. An alternative, semi-parametric, approach can also
be adopted in which standardized residuals are drawn from the first-stage survey residuals.



(7)       $\hat{y}_{ch}^{r} = \exp\left(x_{ch}^{T}\tilde{\beta}^{r} + \tilde{\eta}_{c}^{r} + \tilde{\varepsilon}_{ch}^{r}\right)$.


Finally, the full set of simulated per capita expenditures, $\hat{y}_{ch}^{r}$, is used to calculate


estimates of the welfare measures for each target population.9
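
For concreteness, the welfare measures examined later in the paper (the headcount, the squared poverty gap and the generalized entropy measure with parameter 0) could be computed from a vector of per capita expenditures along the following lines. This is a sketch under the assumption that the poverty measures are of the standard Foster-Greer-Thorbecke form and that GE(0) is the mean log deviation; the function names are illustrative, and household size is used as the weight, as in footnote 9.

```python
import math


def fgt(expenditures, weights, poverty_line, alpha):
    """Foster-Greer-Thorbecke measure: alpha=0 headcount, alpha=2 squared gap."""
    total = sum(weights)
    acc = 0.0
    for y, w in zip(expenditures, weights):
        if y < poverty_line:
            acc += w * ((poverty_line - y) / poverty_line) ** alpha
    return acc / total


def ge0(expenditures, weights):
    """Generalized entropy measure with parameter 0 (mean log deviation)."""
    total = sum(weights)
    mu = sum(w * y for y, w in zip(expenditures, weights)) / total
    return sum(w * math.log(mu / y) for y, w in zip(expenditures, weights)) / total


# Illustrative data: two households of equal size, one below a line of 159.
headcount = fgt([100.0, 200.0], [1, 1], 159.0, 0)
inequality = ge0([100.0, 200.0], [1, 1])
```

Both functions require strictly positive expenditures and weights; GE(0) is zero when all households have the same per capita expenditure.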

         This procedure is repeated $R$ times, drawing a new $\tilde{\beta}^{r}$, $\tilde{\alpha}^{r}$, $(\tilde{\sigma}_{\eta}^{2})^{r}$ and

disturbance terms for each simulation. For each subgroup, the mean and standard

deviation of each welfare measure are calculated over all r=1,...,R simulations. For any

given location, these means constitute our point estimates of the welfare measure, while

the standard deviations are the standard errors of these estimates.
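
The simulation loop described above can be sketched for the headcount rate as follows. This is a deliberately simplified illustration rather than the procedure implemented in POVMAP2: it assumes normal disturbances, a homoskedastic household component (so no alpha coefficients are drawn), and independent normal draws for each beta coefficient in place of a full multivariate normal draw. All names are hypothetical.

```python
import math
import random


def simulate_headcount(census, beta_hat, beta_se, s2_eta_hat, var_s2_eta,
                       s2_eps, poverty_line, R=200, seed=0):
    """Return (point estimate, standard error) of the headcount rate.

    census: list of (cluster_id, x, household_size), x a list of covariates.
    """
    rng = random.Random(seed)
    # Gamma draw with mean m and variance v: shape = m^2 / v, scale = v / m.
    shape = s2_eta_hat ** 2 / var_s2_eta
    scale = var_s2_eta / s2_eta_hat
    clusters = sorted({c for c, _, _ in census})
    people = sum(size for _, _, size in census)
    rates = []
    for _ in range(R):
        # Assumption: independent normal draws, not a full multivariate normal.
        beta_r = [rng.gauss(b, se) for b, se in zip(beta_hat, beta_se)]
        s2_eta_r = rng.gammavariate(shape, scale)
        eta_r = {c: rng.gauss(0.0, math.sqrt(s2_eta_r)) for c in clusters}
        poor = 0
        for c, x, size in census:
            eps_r = rng.gauss(0.0, math.sqrt(s2_eps))
            log_y = sum(xk * bk for xk, bk in zip(x, beta_r)) + eta_r[c] + eps_r
            if math.exp(log_y) < poverty_line:   # equation (7), then poverty test
                poor += size                     # household-size weights
        rates.append(poor / people)
    # Mean over simulations is the point estimate; the standard deviation
    # over simulations serves as its standard error.
    point = sum(rates) / R
    se = math.sqrt(sum((r - point) ** 2 for r in rates) / (R - 1))
    return point, se


# Illustrative census: 5 clusters of 20 households, an intercept and one covariate.
census = [(c, [1.0, 0.1 * h], 4) for c in range(5) for h in range(20)]
estimate, std_error = simulate_headcount(
    census, [5.0, 0.2], [0.05, 0.02], 0.02, 0.0001, 0.25, 159.0, R=50, seed=1)
```

Fixing the seed makes a run reproducible, which is useful because simulation-based results can vary with the random number generator and seed.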

         There are two principal sources of error in the welfare measure estimates
produced by this method.10 The first component, referred to as model error in ELL

(2002), is due to the fact that the parameters from the first-stage model in equation (2) are

estimated. The second component, termed idiosyncratic error, is associated with the

disturbance term in the same model, which implies that households' actual expenditures

deviate from their expected values. While population size in a location does not affect

the model error, the idiosyncratic error increases as the number of households in a target

subgroup decreases.


3. Data


         The analysis in this paper uses data collected as part of the targeting and

evaluation program of PROGRESA, a health, education, and nutrition program of the

Mexican government. Assignment to PROGRESA for households in these communities

was randomized by community; a census of all households in 506 communities was

conducted in November 1997, 320 were integrated into PROGRESA in late spring of

1998, and three follow up surveys (complete censuses) of households in all 506

communities were conducted in 1998 and 1999. Additionally, a survey was conducted in


9These calculations are performed using household size as weights, implicitly assuming that expenditure is
distributed uniformly within households. The same methodology could be applied using equivalence scales
to capture alternative intrahousehold distributional assumptions.





March 1998, before PROGRESA was introduced to treatment communities. The March

survey included a fairly detailed expenditure survey.

         This paper employs household characteristic data from the November 1997
survey and an expenditure aggregate constructed using the March 1998 survey.11 While

it would be possible to undertake the analysis using income data from the November

survey, the expenditure data is preferred for two reasons. First, the income data is very

noisy. A substantial fraction of households report no income at all, and the income data

shows no correlation with the March expenditure aggregate. The March expenditure

aggregate, in contrast, is highly correlated with an expenditure aggregate from the June

1999 survey (for control group households), suggesting that it is a fairly consistent

measure of household welfare. Second, the applications of the ELL methodology thus far

have most commonly used household expenditure or consumption as the basis for welfare

analysis, following the consensus that given the potential for consumption smoothing,

consumption is likely to be a better indicator of long-term welfare than income. While it

would be preferable to have expenditure data collected at the same time as household

characteristics data, the household variables used here are unlikely to change

substantially over time. Consequently the time gap between the November and March

surveys should not distort the analysis.

         While detailed, the expenditure aggregate is less comprehensive than typical

consumption aggregates developed from some surveys carried out in developing

countries. It covers only cash expenditures and does not include figures for rent. The

expenditure survey was not carried out in 14% of households interviewed in November

1997. These households, which are concentrated in a small number of communities, are

not included in the analysis. The ten communities with fewer than 10 households with

expenditure information are also not included, leaving 20544 households in 496

communities.




10A third potential source of error is associated with computation methods. Elbers et al (2002) show that
this can be set arbitrarily small by selecting a sufficiently large number of simulations.
11Most questions in the November 1997 survey were similar to those in the 2000 national Mexican census.
They concerned household characteristics and recent income of the household.



4. Analysis


        The approach used for the validation exercise is to estimate a first-stage model

using a "pseudo-survey" drawn from the PROGRESA households, using a two-stage

sampling procedure. Welfare measures are then predicted with target populations

composed of groups of PROGRESA households. The PROGRESA communities

themselves have too few households to produce meaningful confidence intervals for the

estimates using the methodology. Previous experience, e.g. ELL (2002), has shown that

standard errors are very large for target populations with fewer than a few hundred

households. In order to generate a group of more suitably sized target populations, the

communities were grouped at random into 20 target populations. Both the pseudo-survey

and the target populations were drawn repeatedly, in order to generate estimates for a

large number of target populations.

        Specifically, the steps in the analysis were as follows:


1) A random sample of 50 localities was drawn from the 496 localities, with probability

   of selection proportional to the size of the locality. From each of the 50 localities, 10

   households were selected at random. The data from these households (a total of 500)

   serve as a pseudo-survey.


2) The first-stage methodology described above was applied using the pseudo-survey. A

   set of explanatory variables for log per capita expenditure was selected from a

   candidate list. An additional set of explanatory variables that best explained
   estimated location effects was selected from a set of community-level averages.12


3) The 496 localities were grouped into 4 groups of 24 communities and 16 groups of 25

   communities. These serve as the 20 target populations for the poverty mapping



12From equations (2) and (3) it is clear that the variance of the location effect $\eta_{c}$ must be small if acceptable
standard errors on welfare predictions are to be obtained. We have found that the inclusion of means of
explanatory variables, calculated from the census for the relevant enumeration areas, reduces $\sigma_{\eta}^{2}$
considerably. See ELL (2002) for details and see also below.



    analysis, and the location effect is modeled at the level of the localities. The target

    populations each cover an average of 1042 households.


4) True poverty and inequality rates were calculated for the 20 target populations based
    on actual per capita expenditure.13


5) The poverty mapping methodology was applied to predict poverty and inequality rates

    for the 20 target populations, using first-stage models estimated with the pseudo-

    survey.


6) The entire procedure was repeated 10 times, drawing a new pseudo-survey for each

    round of analysis.
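
The sampling design in steps 1 and 3 can be sketched as follows. The paper does not state how the size-proportional draw of localities was implemented; this sketch assumes a sequential draw without replacement, and the function names are hypothetical.

```python
import random


def draw_pseudo_survey(households_by_locality, n_localities=50,
                       n_households=10, seed=0):
    """Two-stage sample: localities with size-proportional odds, then households."""
    rng = random.Random(seed)
    remaining = {loc: len(hh) for loc, hh in households_by_locality.items()}
    chosen = []
    # Assumption: sequential size-weighted draws without replacement.
    for _ in range(n_localities):
        ids = list(remaining)
        pick = rng.choices(ids, weights=[remaining[i] for i in ids], k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    # Second stage: simple random sample of households within each locality.
    return {loc: rng.sample(households_by_locality[loc], n_households)
            for loc in chosen}


def group_into_targets(locality_ids, seed=0):
    """Randomly split 496 localities into 4 groups of 24 and 16 groups of 25."""
    ids = list(locality_ids)
    sizes = [24] * 4 + [25] * 16
    if len(ids) != sum(sizes):
        raise ValueError("expected exactly 496 localities")
    rng = random.Random(seed)
    rng.shuffle(ids)
    groups, start = [], 0
    for s in sizes:
        groups.append(ids[start:start + s])
        start += s
    return groups


# Illustrative population: 496 localities with between 10 and 39 households.
population = {loc: [f"{loc}-{h}" for h in range(10 + loc % 30)]
              for loc in range(496)}
survey = draw_pseudo_survey(population, seed=3)
targets = group_into_targets(population.keys(), seed=3)
```

Repeating both calls with ten different seeds would mirror step 6, giving ten pseudo-surveys and 200 target populations in total.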


          The output of this procedure is a set of poverty and inequality estimates and

associated standard errors for 200 target populations. To examine the sensitivity of the

estimates to the error specification, two different specifications are used for the second-

stage analysis. In the first, both the location component and the household component of

the error are modeled as Student's t-distributions. For the second specification, a semi-

parametric approach is used for both the location and the household components. In this

semi-parametric approach, instead of drawing from a t-distribution, the standardized

residuals are drawn from the first-stage survey residuals. For both specifications, the

household component of the error is modeled as heteroskedastic, with the predicted log
per capita expenditure as the sole explanatory variable.14




13The poverty line was set to 159 pesos, the per capita expenditure of the median household in the full set
of households. This corresponds roughly to PROGRESA's poverty-classification scheme; using
discriminant analysis techniques based on household income, approximately 50% of households were
initially classified as "poor" and thus qualified for PROGRESA.
14Note that for the semi-parametric approach, it is the standardized residuals that are drawn from the
observed distributions in the survey. These standardized residuals, with mean zero and variance equal one,
are drawn and multiplied by the square root of the relevant simulated variance (of the location or household
effect) to produce simulated residual values.
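
The semi-parametric draw described in footnote 14 can be sketched as follows: first-stage residuals are standardized to mean zero and unit variance, resampled with replacement, and rescaled by the square root of the relevant simulated variance. The helper names are illustrative assumptions.

```python
import math
import random


def standardize(residuals):
    """Rescale residuals to mean zero and variance one."""
    n = len(residuals)
    m = sum(residuals) / n
    sd = math.sqrt(sum((r - m) ** 2 for r in residuals) / n)
    return [(r - m) / sd for r in residuals]


def draw_semiparametric(std_residuals, variance, k, rng):
    """Draw k disturbances by resampling standardized residuals and rescaling."""
    scale = math.sqrt(variance)
    return [scale * rng.choice(std_residuals) for _ in range(k)]


# Illustrative residuals and simulated variance, not taken from the paper.
z = standardize([0.5, -0.2, 0.1, -0.4, 0.0])
simulated = draw_semiparametric(z, 0.04, 100, random.Random(0))
```

Because the draws come from the observed residuals, the simulated disturbances inherit the skewness and kurtosis of the survey residuals rather than those of an assumed parametric family.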



5. Results


First-Stage Results

         OLS Regression results from the first-stage models are given in Appendix 2
Tables A1-A10. Across the ten pseudo-surveys used here, the $R^2$ ranges from 0.415 to 0.53

(see Table 1). The explanatory power of the models in this analysis is in the general range
of models from past applications. The $R^2$ for models for particular strata ranged from

0.45 to 0.77 in Ecuador (Hentschel et al, 2000), 0.29 to 0.63 in Madagascar (Mistiaen et

al, 2002), and 0.47 to 0.72 in South Africa (Alderman et al, 2002). The explanatory

power achieved with the PROGRESA models is rather good given that the households in

the PROGRESA communities are more homogeneous than those within a stratum in a

typical application. All the communities in the PROGRESA sample were selected for the

program because they were poor and rural, based on indicators in the 1990 and 1995

censuses. Consequently, the households are more similar to one another than the

households in an entire stratum of a country.

         Household size was used in all models, and some variables were selected in

models for several pseudo-surveys, but there was generally little consistency in models

chosen across pseudo-surveys. The estimated location effects were generally small, with

variances ranging from 0.9% to 3.1% of the overall variance of the disturbance term after

the addition of cluster-level means. This can also be seen in that the models achieved

levels of explanatory power very close to what would be achievable with models that

employed, instead, a cluster-level fixed-effects specification (see Table 1).


Second-Stage Results

5.1 Point Estimates and Precision

         Tables 2 and 3 present illustrative results for the headcount rate based on two
pseudo surveys: 2 and 3.15 These tables present for each of the 20 target populations a

measure of the true headcount rate as well as the estimated headcount rate based on a

variety of procedures. Column 1 presents estimates and standard errors based on the



15These two pseudo-surveys have been chosen arbitrarily in order to avoid unnecessary repetition.
Qualitative conclusions are unchanged if other, or all, pseudo-surveys are examined.



numerical gradient simulation procedure sketched out in Elbers et al (2002). Columns 2-

4 present estimates based on the "parametric bootstrapping" (Pfeffermann and Tiller,

2005) procedure outlined in Section 2 and are computed using the POVMAP2 software

that has been purpose-written by Qinghua Zhao in the Research Department of the World
Bank.16 The parametric bootstrapping results vary depending on whether disturbances

are drawn from the empirical distribution (Column 2) or from parametric distributions

(Column 3). The estimates in column 4 are based on a program written in SAS, based

also on application of the procedure outlined in Section 2 (with disturbances drawn from a

parametric distribution), and are presented to illustrate that simulation based results do

vary depending on different random number generating algorithms as well as seeds.

Finally the results presented in Column 5 are based on an alternative, non-parametric,
scheme outlined in Appendix 1.17

         Point estimates differ only slightly across different simulation approaches. In

Table 2, while the true headcount rate for target population 1 is 60.5% the estimated rate

for this target population varies between 60.9% and 61.6% across the different estimation

approaches. The approaches are more clearly at odds in terms of the estimated standard

errors. In particular, standard errors deriving from the "parametric bootstrapping"

procedure described in Section 2 and summarized in Columns 2-4 tend to be somewhat

smaller than those based on the numerical gradient method described in ELL (2002)

(Column 1) and the non-parametric approach of Appendix 1 (Column 5). In the case of

pseudosurvey 2 the distinction is not of great significance: irrespective of methodology,

the 95% confidence interval around each target population's estimated headcount rate

encompasses the true poverty rate in 19 out of 20 cases. However, with other pseudo

surveys the distinction does matter. In Table 3, results are presented based on a model of

consumption estimated from pseudosurvey 3. With this survey, the "classical" approach

(Elbers et al, 2002) and the alternative approach outlined in the appendix yield three


16POVMAP2 can be freely downloaded at http://iresearch.worldbank.org.

17 Note that these estimates do not show significant differences in poverty between target populations. This
reflects the relative homogeneity of the group of PROGRESA households, the random composition of
target populations, and the small sizes of the target populations, about 1000 households. On the other hand,
discriminating between poverty of the target populations is not the subject of the current paper and all
standard errors are about the same size as one would get from survey-based estimates at the aggregate
level.



cases where a target population's true poverty rate falls outside the 95% confidence

interval around the estimated poverty rate. But with the parametric bootstrap approach

underpinning estimates in Columns 2-4 the failure rate is higher (7 cases). For this

pseudosurvey the parametric bootstrapping approach appears to produce standard errors

that are too "optimistic", suggesting greater precision of estimates than is warranted.

         Given this evidence of a tendency for the parametric bootstrapping procedure to

produce confidence intervals that are somewhat too narrow, we employ from now on,

unless noted explicitly otherwise, the non-parametric approach outlined in Appendix 1.

Additional comparisons, not reported here, confirm that conclusions derived with this

simulation procedure hold also for estimates based on the considerably more

computationally-intensive numerical gradient approach outlined in ELL (2002). The

important point to take away here is that simulation methods do seem to matter (with

respect to standard errors, if not point estimates). Further research is underway to
understand better why the different simulation methods do not always agree.18

         Table 4 looks more closely at the confidence intervals estimated around welfare

estimates produced with our non-parametric simulation scheme. If the confidence

intervals accurately reflect the true uncertainty in the estimates, the fraction of cases of

the "truth" falling within a confidence interval around an estimate should be

approximately equal to the corresponding confidence level. Note however that twenty

"target populations" are drawn for each of the ten "surveys" and so the experiments are

not entirely independent.

         For each welfare measure and each of the ten pseudosurveys the number of

instances is counted when true welfare in each of the 20 target populations falls within

two standard deviations around the target population's estimated welfare level. For

example, in the case of pseudo survey 1, the true value of each welfare measure (mean, headcount,

squared poverty gap, and Generalized Entropy class inequality measure with parameter 0)

always falls within the confidence interval around the estimated welfare measure. In

Table 2 we saw that for pseudosurvey 2 this occurred 95% of the time (19 out of 20

cases) for the headcount, and Table 4 shows the same was observed for the mean, while



18The most recent version of POVMAP2 now offers the user the choice of the "classical" numerical
gradient or the parametric bootstrapping procedures outlined in Section 2.



for the squared poverty gap and inequality calculated on the basis of the GE0 the truth

always falls within the confidence intervals calculated around the estimates. On average,

across all pseudosurveys the success rate is just under 95% for the mean consumption,

headcount, and squared poverty gap measures, and just below 90% for the GE0 measure.
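The coverage count described above can be reproduced for a single case. The following is a minimal sketch, not the authors' code: it uses the truth, estimate and standard error columns transcribed from Table 2, column (1) (pseudosurvey 2, headcount), scaled to thousandths so that the comparison is exact. The function name `coverage` is hypothetical. At the printed precision, target population 10 lies exactly on the edge of its interval and is counted here as inside.

```python
# Values transcribed from Table 2, column (1), expressed in thousandths so
# that the interval check uses exact integer arithmetic.
TRUTH = [605, 568, 572, 636, 612, 640, 621, 647, 610, 675,
         603, 568, 647, 604, 576, 595, 553, 589, 676, 613]
ESTIMATE = [614, 616, 621, 636, 586, 641, 568, 643, 592, 609,
            609, 681, 623, 591, 618, 613, 564, 634, 638, 654]
STD_ERROR = [30, 30, 32, 31, 34, 31, 34, 36, 29, 33,
             38, 37, 33, 35, 36, 30, 38, 39, 37, 30]


def coverage(truth, estimate, std_error, width=2):
    """Return (share inside, count outside) for estimate +/- width * s.e."""
    inside = [abs(t - e) <= width * s
              for t, e, s in zip(truth, estimate, std_error)]
    n_inside = sum(inside)
    return n_inside / len(inside), len(inside) - n_inside


share_inside, n_outside = coverage(TRUTH, ESTIMATE, STD_ERROR)
print(f"inside: {share_inside:.2f}, outside: {n_outside}")
# prints "inside: 0.95, outside: 1"
```

The single miss is target population 12, which agrees with the 19 out of 20 cases reported for the headcount in pseudosurvey 2.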

        In Table 5 we consider how sensitive our estimated standard errors are to the

presence of unobserved location effects. We saw in Table 1 that our preferred

specifications for the different pseudosurveys were quite successful in proxying

unobserved location effects ($\hat{\sigma}_\eta^2 / \hat{\sigma}_u^2$ ranges between 0.9% and 3%). How much larger

would standard errors be if our underlying models had not been so successful in this

respect? Table 5 compares estimates and standard errors on small area estimates of the

headcount rate from pseudosurvey 2 based on two models: one with our preferred

specification; and the other with a specification in which no census-mean variables were
included.19 In the latter model the share of the variance of the overall disturbance term that is

attributable to the variance of the cluster component is now 11.9%, a four-fold increase

over the 2.7% in the preferred model (Table 5). At the all-census level, the two models

predict headcount rates of 61.9% and 61.5%, respectively, both virtually

indistinguishable from the 61.1% actual headcount rate in the population. However, the

standard error on the model with no location variables is now 0.024, up by more than two

fifths from the standard error of 0.017 obtained with the preferred model. Part of the

increase in the standard error is due to the fact that the explanatory power of the model

with no location variables is lower than that of the preferred model. As a result,

idiosyncratic error would be expected to be higher; see Section 2 and ELL (2002).

However, at the level of the total population most of the idiosyncratic error will have

cancelled out (poverty is being estimated over a population of more than 20,000

households). Thus the increase in the standard error from 0.017 to 0.024 is likely due

mainly to the consequence of our failure to adequately capture unobserved location

effects. At the target population level, standard errors are higher than at the level of the

total population, irrespective of underlying models. Moving from the preferred

specification to the model with no location variables, standard errors rise considerably,


19 Our calculations here are based on the numerical gradient "classical" simulation procedure.



and in some cases the percentage change is even greater than at the level of the total

population. For example, standard errors across the two models rise by as much as 43%

for target population 2 (0.030*1.43=0.043). However, here, the changes in standard

errors are reflecting both the influence of idiosyncratic error and our failure to capture

location effects.
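The cancellation argument can be made explicit with a stylised calculation. As a sketch under assumptions that are ours rather than the paper's, take a population of $N$ households in $C$ equal-sized clusters, with disturbance $u_{ch} = \eta_c + \varepsilon_{ch}$ whose two components are independent with variances $\sigma_\eta^2$ and $\sigma_\varepsilon^2$. The variance of the average disturbance is then:

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{c=1}^{C}\sum_{h=1}^{N/C} u_{ch}\right)
  = \frac{\sigma_\eta^2}{C} + \frac{\sigma_\varepsilon^2}{N},
\qquad
\frac{0.024}{0.017} \approx 1.41 .
```

With $N$ above 20,000 the second term is close to zero, while the first shrinks only with the number of clusters. This is why a larger location share is the natural suspect for the roughly 41% rise in the all-census standard error. The decomposition applies to a mean of disturbances and is only indicative for non-linear measures such as the headcount.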


5.2     The Level of Location Effects

        Note that the location effect $\eta_c$ may include group effects at levels higher than

the survey cluster. To see this consider the following model with group random effects at

a `district' level (v), as well as the cluster level (c):


         $\ln y_{vch} = x_{vch}\beta + \tau_v + \eta_{vc} + \varepsilon_{vch}$

As before, the error components are uncorrelated. If clusters are the primary sampling

unit, a district is sampled only indirectly, viz. if one of the sampled clusters happens to be

located in that district. In a typical living standards survey there will only rarely be

districts that have been sampled more than once in this way, making it impossible to

separate the location effect in the sample into a `district effect' and a `cluster effect'.

Assume accordingly that a district is sampled at most once, and write v(c) for the unique

district sampled along with the cluster. The model now becomes


         $\ln y_{v(c)ch} = x_{v(c)ch}\beta + \tau_{v(c)} + \eta_{v(c)c} + \varepsilon_{v(c)ch}$.


Or, with obvious relabelling:


         $\ln y_{ch} = x_{ch}\beta + \eta^*_c + \varepsilon_{ch}$,


where $\eta^*_c = \tau_{v(c)} + \eta_{v(c)c}$. Consequently, the estimated variance of the location effect in a

model with only cluster-level random effects is in fact an estimate of $\sigma_\tau^2 + \sigma_\eta^2$, the

combined group effects operating at the sample's cluster level.
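A small simulation can illustrate the relabelling argument. This is a sketch under assumed values, not the authors' procedure: every sampled cluster is placed in its own district, disturbances are drawn as district effect plus cluster effect plus household error, and the location variance is recovered with a simple ANOVA-type moment estimator (between-cluster variance of cluster means minus the within-cluster variance divided by cluster size). All names and variance values are hypothetical.

```python
import random
import statistics


def estimate_location_variance(n_clusters=2000, n_households=10,
                               var_district=0.02, var_cluster=0.03,
                               var_household=0.20, seed=0):
    """Moment estimate of the cluster-level variance when each district is sampled once."""
    rng = random.Random(seed)
    cluster_means, within_vars = [], []
    for _ in range(n_clusters):
        # One district draw per cluster: the two effects cannot be told apart.
        district_effect = rng.gauss(0.0, var_district ** 0.5)
        cluster_effect = rng.gauss(0.0, var_cluster ** 0.5)
        u = [district_effect + cluster_effect + rng.gauss(0.0, var_household ** 0.5)
             for _ in range(n_households)]
        cluster_means.append(statistics.fmean(u))
        within_vars.append(statistics.variance(u))
    between = statistics.variance(cluster_means)
    within = statistics.fmean(within_vars)
    # E[between] = var_district + var_cluster + var_household / n_households
    return between - within / n_households


estimated = estimate_location_variance()
print(f"estimated location variance: {estimated:.3f} (sum of the two inputs: 0.050)")
```

The estimate should settle near the sum of the district and cluster variances rather than near the cluster variance alone.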





       In the simulation phase the analyst has to choose whether the location effect

estimated from the pseudosurvey should be applied at the cluster or the `district' level.

When there is no way of separating the location effect into a cluster and `district' effect

the best that one can do is to assume either that the effect is entirely a cluster-level effect,

or that it occurs entirely at the district-level. The latter will be quite a conservative

assumption as it will rule out that any part of the estimated location effect applies only at

the cluster level. This approach might be considered as yielding an "upper-bound" on the

standard error. The former will be "optimistic" in the sense that it will yield standard

errors that could be under-estimates of the true standard error, particularly if the

location effect is big. In our setting, it does not make sense to apply the location effect at

a level higher than the cluster, as the latter correspond to villages and these have been

assembled randomly into 20 target populations. ELL (2002) illustrate in the more

plausible setting of rural Ecuador, however, that when it is assumed that the location

effect estimated at the cluster level applies entirely at a higher level (in Ecuador, at the

parroquia level), then the idiosyncratic component of the standard error does rise

appreciably. However, they also show that the impact on overall standard errors is

negligible because in their setting, as in the present study, the size of the estimated

location effect is small. If the introduction of cluster-means or other cluster-level

variables is not successful in capturing group effects then the choice of level of

aggregation at which to apply the location effect in the simulations can affect final results

more substantially. In such a case there would be a larger range between the "optimistic"

standard errors and the upper-bound estimates obtained by assuming that the location

effect occurs entirely at the `district' level.
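The gap between the two assumptions can be gauged with a back-of-the-envelope calculation. The sketch below is illustrative only: it considers just the location component of the error in an average of disturbances, assumes equal-sized clusters and districts, and uses hypothetical numbers (a location variance of 0.01, 40 clusters, 4 districts). The function name `location_se_bounds` is our own.

```python
import math


def location_se_bounds(location_variance, n_clusters, n_districts):
    """Location-effect contribution to the s.e. of a target-population average.

    optimistic:  one independent draw of the effect per cluster
    upper_bound: one independent draw of the effect per district
    """
    optimistic = math.sqrt(location_variance / n_clusters)
    upper_bound = math.sqrt(location_variance / n_districts)
    return optimistic, upper_bound


optimistic, upper_bound = location_se_bounds(0.01, n_clusters=40, n_districts=4)
ratio = upper_bound / optimistic
print(f"optimistic={optimistic:.4f}, upper bound={upper_bound:.4f}, ratio={ratio:.2f}")
```

Under these assumptions the ratio depends only on the number of clusters per district, while the absolute gap scales with the size of the location variance, which is why a small estimated location effect leaves overall standard errors nearly unchanged.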


5.3 Bias

       Another way in which to gauge the reliability of small-area estimates of welfare is

to consider whether there is evidence of bias - a systematic tendency for estimates to

deviate from the truth in any way. Figures 1-4 show, for each target population and for

four different welfare measures, the relationship between true welfare and the difference

between true and estimated welfare. In Figure 1 we can see that there is some tendency

for the estimation procedure to overestimate mean per-capita consumption for those




target populations with a true mean consumption level that is low, and to underestimate

the mean consumption level of rich target populations. To see this note that when true

consumption is low, the bias (defined here as "truth" minus estimated consumption) is

negative, while it is positive when true average consumption is high. However this

relationship is not strong. Overall, the average difference between the estimated mean

consumption and true consumption is about 1.5 pesos: about 1% of the mean

consumption level of the poorest target population. The bias is similarly modest for the

headcount (Figure 2), squared poverty gap (Figure 3) and mean log deviation (Generalized

Entropy class measure with parameter 0) inequality measure (Figure 4).

        The extent of bias in these estimates is related to the degree to which the model

specification fails to capture location effects on the basis of census-mean variables or

other variables intended to capture locality-level characteristics. As we saw in the

preceding section and in Table 1, our model specifications are quite successful in

removing the effect of latent community level characteristics, and as a result the bias in

our estimates is quite modest. If we produce estimates that omit village-level census

means, then the bias is accentuated. Figure 5 illustrates how the slope of the line

capturing the extent to which headcount is overestimated in truly non-poor communities

and the headcount is underestimated in truly poor communities becomes steeper when

estimates are based on a consumption model that fails to capture unobserved location

effects. The intuition behind this bias is quite straightforward: if there is a sizeable

location effect, and our model fails to capture it, then there will be a tendency for poverty

to be over-estimated in communities that are relatively well-off, given the explanatory

variables in the model, i.e. that have large positive location effects. Part of the reason

that the communities are well-off is likely attributable to community-wide characteristics,

and this will not be reflected in estimates based on a model that fails to

capture the effect of those characteristics. As a result estimates will tend to overstate

poverty of such communities. Conversely, in truly poor communities, part of the reason

they are so poor will be due to the broader characteristics of the community. Again, if

the consumption model does not capture the impact of those broader characteristics, there

will be a tendency for estimated poverty to be an understatement of true poverty in the

community. We see, therefore, that not only is there a strong incentive to proxy location




characteristics in order to improve the precision of estimates (Section 5.1), but also in

order to minimize a systematic tendency to overstate poverty in truly non-poor

communities and understate poverty in truly poor communities.
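The mechanism can be reproduced in a stylised simulation. The following is a sketch under assumed parameters, not a replication of Figure 5: log consumption is an observed household covariate plus an unobserved community effect plus household noise, and the headcount is "estimated" from a model that knows the covariates but folds the community effect into a single error term. All names, variances and the poverty line are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def simulate_bias(n_communities=100, n_households=200, sd_x_mean=0.3,
                  sd_location=0.3, sd_household=0.5, poverty_line=0.0, seed=1):
    """Return true headcounts and (truth - estimate) gaps per community."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # The misspecified model treats location and household error as one term.
    sd_total = (sd_location ** 2 + sd_household ** 2) ** 0.5
    truths, gaps = [], []
    for _ in range(n_communities):
        x_mean = rng.gauss(0.0, sd_x_mean)   # observed community composition
        eta = rng.gauss(0.0, sd_location)    # location effect the model misses
        xs = [rng.gauss(x_mean, 0.5) for _ in range(n_households)]
        log_y = [x + eta + rng.gauss(0.0, sd_household) for x in xs]
        truth = fmean(1.0 if y < poverty_line else 0.0 for y in log_y)
        estimate = fmean(std_normal.cdf((poverty_line - x) / sd_total) for x in xs)
        truths.append(truth)
        gaps.append(truth - estimate)
    return truths, gaps


def pearson(a, b):
    """Plain Pearson correlation coefficient."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


truths, gaps = simulate_bias()
bias_correlation = pearson(truths, gaps)
print(f"correlation between true headcount and (truth - estimate): {bias_correlation:.2f}")
```

A positive correlation corresponds to the pattern described in the text: truth minus estimate is negative where true poverty is low (poverty overstated) and positive where true poverty is high (poverty understated).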


5.4 Correlation

        A further way to consider the reliability of the small area estimates is to examine

the correlation between the predictions and the true values. Table 6 shows simple Pearson

and Spearman rank correlations between true and predicted values. Each cell shows the

correlation between predicted welfare and true welfare across the 20 target populations.

Rows represent alternative pseudosurveys and columns indicate alternative welfare

measures. Correlations (both Pearson and rank) are positive and reasonably high for

mean consumption and the two poverty measures (headcount rate and squared poverty

gap). In the case of inequality the correlations are much lower, presumably because the

target populations vary very little in terms of true inequality. Indeed, households in the

PROGRESA communities are more homogeneous than those within a stratum in a typical

poverty mapping application. All the communities in the PROGRESA sample were

selected for the program because they were poor and rural, based on indicators in the

1990 and 1995 censuses. Consequently, the households are more similar to one another

than the households in an entire stratum of a country. This high level of homogeneity

across households (and target populations) is a somewhat unusual feature of this

empirical application. However, it might be expected to present a particularly difficult

setting in which to implement the small-area estimation methodology and therefore does

provide a useful (conservative) setting in which to gauge the methodology's performance.
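For readers who want to compute the two statistics themselves, a minimal sketch follows. It implements the Pearson coefficient and the Spearman coefficient (Pearson applied to average ranks, so that ties are handled) and applies them to the truth and column (1) estimates transcribed from Table 2. The paper does not state which simulation procedure underlies Table 6, so the output need not match the pseudosurvey 2 row there. Function names are our own.

```python
from statistics import fmean

# Truth and column (1) headcount estimates transcribed from Table 2.
TRUTH = [0.605, 0.568, 0.572, 0.636, 0.612, 0.640, 0.621, 0.647, 0.610, 0.675,
         0.603, 0.568, 0.647, 0.604, 0.576, 0.595, 0.553, 0.589, 0.676, 0.613]
ESTIMATE = [0.614, 0.616, 0.621, 0.636, 0.586, 0.641, 0.568, 0.643, 0.592, 0.609,
            0.609, 0.681, 0.623, 0.591, 0.618, 0.613, 0.564, 0.634, 0.638, 0.654]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def average_ranks(values):
    """Ranks from 1 to n, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))


r_pearson = pearson(TRUTH, ESTIMATE)
r_spearman = spearman(TRUTH, ESTIMATE)
print(f"Pearson: {r_pearson:.2f}, Spearman: {r_spearman:.2f}")
```

Average ranks matter here because the truth column contains ties (0.568 and 0.647 each appear twice).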


6. Discussion


        The results presented here offer a rough test of the ELL methodology and point to

some tentative conclusions that may inform future applications of the ELL welfare

mapping method. In terms of the predictive power of the method, the results provide

strong evidence that ELL estimates have important information content. Bias is low, and the

correlations between actual and predicted values of poverty indices and the mean are




generally positive and not insubstantial. For inequality figures, the results are generally

weaker. Because the signal-to-noise ratio is lower in these inequality estimates, it is

particularly important to take into account error in the estimates when applying them to

research or policy applications.

       The ability to provide confidence intervals is a crucial advantage of the ELL

method as compared with alternative approaches to welfare mapping. In the analysis

presented here, it was found that alternative simulation methods do influence the size of

the estimated standard errors on welfare estimates. The numerical gradient approach,

originally proposed in ELL (2002), was found to produce satisfactory standard errors, and

similarly for the non-parametric simulation procedure outlined in Appendix 1. However,

the parametric bootstrapping procedure described in Section 3 was found to yield

standard errors that are somewhat understated. It is not entirely clear why this latter

procedure should suffer from this propensity, and further research is needed to resolve

this concern.

       An important objective of this analysis has been to document how important it is,

when applying small-area estimation methods, to think hard about possible unobserved,

community-level, factors that may influence welfare outcomes. Experience with

"poverty mapping" in a large number of countries indicates that inclusion of census-

means as regressors in the underlying consumption model (and/or the inclusion of

household variables that capture "network" effects, or of additional community-level

variables from tertiary datasets such as administrative and GIS data) can go a long way

towards helping to secure specifications in which unobserved location effects are kept

small. The analysis here has shown that failure to capture such location effects in this

way can lead to markedly higher standard errors and also an increase in bias.

       It is important to recognize the limitations of the analysis in this paper. The data

used here are less well-suited to poverty mapping than those usually employed. First, the

expenditure aggregate used is less comprehensive than that found in a typical developing

country survey, and the general quality of the data may be worse than, for example, data

collected in a World Bank LSMS survey. This reduces the potential for variation in

expenditure to be explained by observed variables. Second, the data all come from poor

households in rural Mexico. Consequently, there is relatively little variation in




expenditure across households, and a relatively large fraction of the variation is due to

measurement error or short-term fluctuations and cannot be explained by observable

characteristics.

       The problem associated with the small range of expenditures is compounded in

this exercise by the fact that it was necessary to construct target populations by randomly

assembling groups of communities. This resulted in a narrow spread of welfare measure

values across the target populations. The ELL method is likely to produce estimates with

a higher signal-to-noise ratio when the underlying population has greater variation in

consumption.

       All in all, the analysis presented here suggests that the details of poverty mapping

matter. But the evidence does also suggest that the small area estimation procedure can

provide useful, and reliable, estimates of welfare at fine levels of aggregation that survey

data themselves would not be able to accommodate.





References

Alderman, Harold, Miriam Babita, Gabriel Demombynes, Nthabiseng Makhatha, and
       Berk Özler. 2002. "How Small Can You Go? Combining Census and Survey Data for
       Mapping Poverty in South Africa." Journal of African Economies, 11: 3.

Demombynes, Gabriel, Chris Elbers, Jenny Lanjouw, Peter Lanjouw, Johan Mistiaen and
       Berk Özler. 2002. "Producing a Better Geographic Profile of Poverty:
       Methodology and Evidence from Three Developing Countries." WIDER
       Discussion Paper no. 2002/39, The United Nations.

Elbers, Chris, Jean O. Lanjouw, and Peter Lanjouw. 2000. "Welfare in Villages and
       Towns: Micro-Measurement of Poverty and Inequality." Tinbergen Institute
       Working Paper No. 2000-029/2, Amsterdam, Netherlands.

Elbers, Chris, Jean O. Lanjouw, and Peter Lanjouw. 2002. "Micro-Level Estimation of
       Welfare." Policy Research Working Paper No. 2911, The World Bank, October
       2002.

Elbers, Chris, Jean O. Lanjouw, and Peter Lanjouw. 2003. "Micro-level Estimation of
       Poverty and Inequality." Econometrica 71:1, pp. 355-364.

Hentschel, J., Lanjouw, J.O., Lanjouw, P. and Poggi, J. 2000. "Combining Census and
       Survey Data to Study Spatial Dimensions of Poverty: A Case Study of Ecuador."
       World Bank Economic Review 14(1): 147-166.

Mistiaen, Johan, Berk Özler, Tiaray Razafimanantena, and Jean Razafindravonona. 2002.
       "Putting Welfare on the Map in Madagascar." World Bank Africa Region
       Working Paper Series No. 34, The World Bank.

Pfeffermann, D. and Tiller, R. 2005. "Bootstrap Approximation to Prediction MSE for
       State-Space Models with Estimated Parameters." Journal of Time Series Analysis,
       25(6), November, 893-916.





                                 Table 1
          Diagnostics for 10 Pseudosurvey Consumption Models

Pseudosurvey   Sample        No. of         $R^2$     $\hat{\sigma}_\eta^2/\hat{\sigma}_u^2$   $R^2/R^2_{f.e.}$
                 Size       Clusters

     1           500           50         0.4678      0.0291   0.927
     2           500           50         0.4593      0.0270   0.912
     3           500           50         0.5274      0.0247   0.927
     4           500           50         0.4151      0.0019   0.901
     5           500           50         0.5176      0.0195   0.961
     6           500           50         0.4766      0.0259   0.920
     7           500           50         0.4549      0.0263   0.971
     8           500           50         0.4205      0.0241   0.910
     9           500           50         0.4910      0.0088   0.945
    10           500           50         0.4193      0.0310   0.874





                                                         TABLE 2: Pseudosurvey 2

                 Truth         (1)                  (2)                   (3)                 (4)                (5)
                            `Classical'          PovMap2               PovMap2         SAS-based Program     Alternative
 Targetpop                 Procedure          (non-parametric)        (parametric)                            Procedure
                        (Elbers et al 2002)                                                                (see Appendix)
                                      s.e.                 s.e.                 s.e.                s.e.              s.e.
1                0.605  0.614       0.030    0.616        0.027    0.609       0.029   0.611       0.025   0.612     0.037
2                0.568  0.616       0.030    0.622        0.028    0.621       0.028   0.613       0.027   0.616     0.039
3                0.572  0.621       0.032    0.624        0.032    0.619       0.029   0.614       0.029   0.613     0.040
4                0.636  0.636       0.031    0.635        0.024    0.630       0.024   0.627       0.027   0.640     0.036
5                0.612  0.586       0.034    0.585        0.034    0.592       0.033   0.591       0.034   0.584     0.041
6                0.640  0.641       0.031    0.638        0.033    0.641       0.032   0.638       0.029   0.639     0.038
7                0.621  0.568       0.034    0.565        0.035    0.573       0.035   0.572       0.036   0.569     0.038
8                0.647  0.643       0.036    0.644        0.035    0.645       0.033   0.640       0.032   0.626     0.048
9                0.610  0.592       0.029    0.595        0.030    0.599       0.032   0.597       0.033   0.589     0.039
10               0.675  0.609       0.033    0.609        0.034    0.615       0.030   0.612       0.031   0.596     0.038
11               0.603  0.609       0.038    0.605        0.034    0.607       0.030   0.607       0.029   0.606     0.038
12               0.568  0.681       0.037    0.690        0.031    0.685       0.030   0.677       0.033   0.680     0.046
13               0.647  0.623       0.033    0.629        0.029    0.631       0.030   0.623       0.032   0.630     0.038
14               0.604  0.591       0.035    0.599        0.029    0.594       0.030   0.592       0.030   0.583     0.043
15               0.576  0.618       0.036    0.619        0.029    0.625       0.030   0.614       0.030   0.625     0.039
16               0.595  0.613       0.030    0.614        0.029    0.616       0.027   0.608       0.024   0.611     0.038
17               0.553  0.564       0.038    0.565        0.030    0.569       0.029   0.561       0.031   0.553     0.043
18               0.589  0.634       0.039    0.633        0.029    0.636       0.033   0.638       0.033   0.629     0.043
19               0.676  0.638       0.037    0.639        0.029    0.642       0.023   0.637       0.025   0.656     0.039
20               0.613  0.654       0.030    0.653        0.029    0.656       0.027   0.657       0.029   0.651     0.036
Cases of truth falling
outside the 2 s.e.              1                    1                     1                   1                  1
interval

                                                         TABLE 3: Pseudosurvey 3

                 Truth          (1)                 (2)                   (3)                 (4)                  (5)
                           `Classical'           PovMap2               PovMap2         SAS-based Program  Alternative Procedure
 Targetpop                 Procedure          (non-parametric)        (parametric)        (parametric)       (See Appendix)
                        (Elbers et al 2002)
                                      s.e.           s.e.            s.e.            s.e.            s.e.
1                0.605  0.555       0.030    0.554   0.023   0.555   0.030   0.554   0.034   0.554   0.022
2                0.568  0.570       0.037    0.569   0.024   0.570   0.037   0.560   0.040   0.568   0.026
3                0.572  0.544       0.033    0.544   0.030   0.544   0.033   0.531   0.043   0.541   0.030
4                0.636  0.554       0.034    0.551   0.029   0.554   0.034   0.548   0.043   0.554   0.024
5                0.612  0.576       0.032    0.582   0.028   0.576   0.032   0.562   0.040   0.580   0.028
6                0.640  0.591       0.033    0.587   0.027   0.591   0.033   0.581   0.040   0.591   0.026
7                0.621  0.571       0.033    0.575   0.028   0.571   0.033   0.566   0.039   0.573   0.026
8                0.647  0.629       0.036    0.629   0.032   0.629   0.036   0.619   0.040   0.632   0.029
9                0.610  0.554       0.034    0.556   0.024   0.554   0.034   0.554   0.038   0.558   0.023
10               0.675  0.595       0.033    0.600   0.026   0.595   0.033   0.574   0.043   0.594   0.025
11               0.603  0.584       0.038    0.586   0.025   0.584   0.038   0.587   0.037   0.586   0.027
12               0.568  0.562       0.034    0.561   0.027   0.562   0.034   0.556   0.043   0.563   0.028
13               0.647  0.567       0.040    0.568   0.027   0.567   0.040   0.568   0.040   0.567   0.025
14               0.604  0.527       0.030    0.525   0.025   0.527   0.030   0.531   0.039   0.523   0.022
15               0.576  0.548       0.030    0.545   0.022   0.548   0.030   0.549   0.037   0.545   0.025
16               0.595  0.589       0.026    0.589   0.026   0.589   0.026   0.593   0.040   0.588   0.025
17               0.553  0.492       0.033    0.495   0.022   0.492   0.033   0.487   0.030   0.497   0.025
18               0.589  0.548       0.040    0.549   0.024   0.548   0.040   0.546   0.042   0.547   0.025
19               0.676  0.649       0.031    0.651   0.025   0.649   0.031   0.641   0.033   0.651   0.024
20               0.613  0.652       0.040    0.653   0.025   0.652   0.040   0.632   0.039   0.652   0.027
Cases of truth falling
outside the 2 s.e.              3                  7               7               7               3
interval





     Table 4: Relative Frequency of True Target Population Welfare Falling
         Within 95% Confidence Interval Around Estimated Welfare

Survey         Mean            Headcount        FGT2            GE0
1              1.00            1.00             1.00            1.00
2              0.95            0.95             1.00            1.00
3              0.90            0.85             0.80            0.95
4              0.95            1.00             1.00            0.90
5              1.00            1.00             1.00            0.85
6              0.80            0.90             0.80            0.60
7              0.95            0.95             0.95            1.00
8              0.95            0.95             0.90            0.70
9              0.85            0.85             0.80            0.90
10             0.95            0.90             0.90            0.95
Overall        0.93            0.94             0.92            0.89





                                       Table 5
        Precision of Headcount Estimates with and without Location Variables
                       Numerical Gradient "Classical" Simulation
                       Pseudosurvey 2, POVMAP2 calculations


                                    I. Model with Location    II. Model with no
                                          Variables           Location Variables   % change
  Village    Population  True FGT0     Sample size=500         Sample size=500     in standard
   Code                                 $R^2 = 0.459$,          $R^2 = 0.413$,      error in
(sorted by                     $\hat{\sigma}_\eta^2/\hat{\sigma}_u^2 = 0.027$   $\hat{\sigma}_\eta^2/\hat{\sigma}_u^2 = 0.119$   moving from
true FGT0)                                                                         Model I to
                                    Estimated      s.e.    Estimated       s.e.     Model II
                                      FGT0                  FGT0
     1          946        0.605       0.614      0.030     0.600         0.040       33%
     2          1046       0.568       0.616      0.030     0.622         0.043       43%
     3          1162       0.572       0.621      0.032     0.604         0.042       31%
     4          991        0.636       0.636      0.031     0.598         0.041       32%
     5          1061       0.612       0.586      0.034     0.609         0.042       24%
     6          935        0.640       0.641      0.031     0.606         0.040       29%
     7          932        0.621       0.568      0.034     0.602         0.046       35%
     8          861        0.647       0.643      0.036     0.653         0.042       14%
     9          871        0.610       0.592      0.029     0.615         0.038       31%
    10          1219       0.675       0.609      0.033     0.622         0.040       21%
    11          845        0.603       0.609      0.038     0.615         0.038        0%
    12          992        0.568       0.681      0.037     0.624         0.044        9%
    13          1289       0.647       0.623      0.033     0.623         0.039       18%
    14          1271       0.604       0.591      0.035     0.624         0.045       29%
    15          854        0.576       0.618      0.036     0.612         0.039        8%
    16          1141       0.595       0.613      0.030     0.614         0.038       27%
    17          1181       0.553       0.564      0.038     0.582         0.044       16%
    18          820        0.589       0.634      0.039     0.616         0.045       15%
    19          1060       0.676       0.638      0.037     0.623         0.038        3%
    20          1008       0.613       0.654      0.030     0.637         0.040       33%
   Total       20485       0.611       0.619      0.017     0.615         0.024       41%





                          Figure 1: Checking for Bias

[Plot not reproduced. Horizontal axis runs from 150 to 190; vertical axis runs from -40 to 40.]

Average difference: -1.49

                          Figure 2: Checking for Bias

[Plot not reproduced. Horizontal axis runs from 0.55 to 0.69; vertical axis runs from -0.12 to 0.12.]

Average difference: 0.012

                          Figure 3: Checking for Bias

[Plot not reproduced. Horizontal axis runs from 0.100 to 0.160; vertical axis runs from -0.07 to 0.05.]

Average difference: -0.0015

                          Figure 4: Checking for Bias

[Plot not reproduced. Horizontal axis runs from 0.21 to 0.31; vertical axis runs from -0.15 to 0.08.]

Average difference: -0.0024

                    Figure 5: Model Specification and Bias

[Plot not reproduced. Horizontal axis runs from 0.55 to 0.69; vertical axis runs from -0.08 to 0.08.]

   Table 6: Correlations Between Estimated and True Welfare Across Target Populations

                Mean              Headcount             FGT2                 GE0
Survey     Pearson  Spearman   Pearson  Spearman   Pearson  Spearman   Pearson  Spearman
1            0.58      0.53      0.64      0.58      0.73      0.75      0.14     -0.05
2            0.27      0.32      0.20      0.22      0.47      0.55      0.02     -0.01
3            0.68      0.69      0.62      0.61      0.54      0.45      0.03      0.14
4            0.50      0.54      0.59      0.57      0.33      0.29     -0.11     -0.06
5            0.67      0.67      0.75      0.69      0.71      0.67     -0.02      0.12
6            0.45      0.50      0.67      0.73      0.80      0.78      0.06      0.15
7            0.37      0.36      0.35      0.30      0.21      0.20      0.24      0.07
8            0.66      0.67      0.59      0.50      0.53      0.51      0.18      0.15
9            0.22      0.11      0.23      0.12      0.15      0.04      0.11      0.18
10           0.28      0.21      0.38      0.28      0.18      0.08     -0.17     -0.18
Average      0.47      0.46      0.50      0.46      0.46      0.43      0.05      0.05
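
The two correlation measures reported in Table 6 can be illustrated with a minimal Python sketch. The function names and the toy numbers are illustrative assumptions, not the authors' code or data; tied values are assumed to share their average rank.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical headcount rates for five target populations (made-up numbers).
true_headcount = [0.55, 0.60, 0.62, 0.66, 0.69]
estimated_headcount = [0.58, 0.57, 0.63, 0.64, 0.70]
print(round(pearson(true_headcount, estimated_headcount), 3))
print(round(spearman(true_headcount, estimated_headcount), 3))
```

The Spearman measure depends only on how the target populations are ordered, so it is the more relevant of the two when estimates are used to rank localities.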





                                                Appendix 1
                              A Non-parametric Simulation Procedure

In this appendix we describe the procedure used for generating the welfare predictions
reported in the paper. The procedure was developed to diminish the role of distributional
assumptions and increase the role of bootstrapping.

A key aspect of the prediction is the handling of 'model error', that is, the
inevitable deviation between estimated and true parameters.20 So far we have accounted
for model error using the estimated covariance matrices of the model parameters.
Alternatively, the sampling error of the parameter estimates can be simulated directly by
resampling the survey and re-estimating the parameters, which is what we do in the
current paper. The survey is resampled by parametric bootstrapping of the error term,
based on an initial set of point estimates and residuals. This procedure also allows us to
detect bias in the estimators of the error-model parameters.

Starting from any given 'fake survey' the steps are as follows21:

    1. For the current application, model selection must necessarily be a semi-automatic
         procedure. Thus we carry out an OLS regression of log per capita consumption on
         an extensive set of candidate variables.
    2. Next we limit the number of covariates using a procedure for step-wise selection
         of regressors.
    3. With the resulting set of regressors, we specify and estimate a linear mixed effect
         model accounting for both cluster random effects and household-level
         heteroskedasticity.22 We have used the following specification for
         heteroskedasticity:

                  $$\sigma_h = \alpha_0 e^{\alpha_1 \hat{y}_h}$$

         where $\hat{y}_h$ denotes the point estimate of household h's log per capita consumption
         (pcx).
    4. The estimation yields
      - point estimates for the regression coefficients, $\hat{\beta}$.
      - point estimates for log per capita expenditure, $\hat{y}$.
      - point estimates for the heteroskedasticity model, $\hat{\alpha}$.
      - the $\hat{\alpha}$ allows us to derive point estimates for the standard deviation of
         household-level errors, $\hat{\sigma}$.

20 'True' is interpreted here as the parameter estimates that would result from a sample consisting of the full
population.
21 The computations have been carried out using R version 2.2.1 and the nlme package, version number
3.1.66. Script files of the procedure can be obtained upon request from the authors.
22 See Venables and Ripley (1997) and Bates and Pinheiro (1998). The procedures for estimating linear
mixed effect models in R's nlme package can handle cluster random effects and household-level
heteroskedasticity of a simple type.



      - residuals, which we split into mean residuals per cluster, $\hat{\eta}$, the standard deviation
         of these, $\hat{\sigma}_\eta$, and deviations from the cluster mean, $\hat{\varepsilon}$.
      - the standardized household residuals, $\hat{\varepsilon}^* = \hat{\varepsilon} / \hat{\sigma}$.

These estimates are used to check for bias in the estimation procedure. There is reason to
expect such a bias, especially for the heteroskedasticity model and the variance of the
cluster effects, $\sigma_\eta^2$.23
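
The residual decomposition in step 4 can be sketched in Python as follows. This is an illustrative sketch, not the authors' R/nlme scripts: the function name, the toy data and the values of `alpha0` and `alpha1` are assumptions, the exponential form of the household standard deviation follows the heteroskedasticity specification in step 3, and the use of the population standard deviation for the cluster means is also an assumption.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def decompose_residuals(clusters, residuals, fitted, alpha0, alpha1):
    """Split total residuals into cluster means (eta) and household
    deviations (eps), then standardize eps by the household sd."""
    by_cluster = defaultdict(list)
    for c, u in zip(clusters, residuals):
        by_cluster[c].append(u)
    # Mean residual per cluster and the spread of those means.
    eta = {c: mean(us) for c, us in by_cluster.items()}
    sigma_eta = pstdev(list(eta.values()))
    # Household deviation from its own cluster mean.
    eps = [u - eta[c] for c, u in zip(clusters, residuals)]
    # Household-specific standard deviation: alpha0 * exp(alpha1 * yhat_h).
    sigma = [alpha0 * math.exp(alpha1 * y) for y in fitted]
    eps_star = [e / s for e, s in zip(eps, sigma)]
    return eta, sigma_eta, eps, sigma, eps_star

# Toy example: two clusters with two households each (made-up numbers).
demo = decompose_residuals(
    clusters=["A", "A", "B", "B"],
    residuals=[0.3, 0.1, -0.2, -0.4],
    fitted=[6.0, 6.2, 5.8, 6.1],
    alpha0=0.2,
    alpha1=0.1,
)
print(demo[0])   # cluster mean residuals
print(demo[4])   # standardized household residuals
```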



    5. The general idea is to generate 100 samples by parametric bootstrapping, using the
         above parameters as the 'true' model. We resample $\eta$'s from $\hat{\eta}$ and standardized
         household residuals from $\hat{\varepsilon}^*$, multiplying the latter by each household's specific
         standard deviation from $\hat{\sigma}$. The total residual is added to $\hat{y}$ to yield a new value
         of log per capita expenditure for each household. The new value is compatible
         with the model estimated under step 3 above, and with the values of the household
         regressors.
    6. Each bootstrapped sample is used to re-estimate the model, and the mean of the
         estimates is used to check for estimation bias. It turns out that the bias (if any) is
         small and inconsequential. Nevertheless, we have compensated for bias in the
         estimators for $\alpha$ and $\sigma_\eta$ using the average bias found in this first round of
         simulations.

With the adjusted values for the variance estimators we again generate 100 samples by
parametric bootstrapping.
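
Steps 5 and 6 can be sketched in Python as below. To keep the example self-contained, the mixed-model re-estimation is replaced by a deliberately simple stand-in estimator (the standard deviation of cluster-mean residuals); the procedure described above instead re-estimates the full model on each sample. All function names and numbers are illustrative assumptions.

```python
import random
from statistics import mean, pstdev

def bootstrap_sample(fitted, clusters, eta_hat, eps_star_hat, sigma, rng):
    """Draw one bootstrap vector of log per capita expenditure."""
    # One cluster effect per cluster, resampled from the estimated effects.
    eta_draw = {c: rng.choice(eta_hat) for c in sorted(set(clusters))}
    # Household error: resampled standardized residual times household sd.
    return [y + eta_draw[c] + s * rng.choice(eps_star_hat)
            for y, c, s in zip(fitted, clusters, sigma)]

def sd_of_cluster_means(values, fitted, clusters):
    """Stand-in estimator of sigma_eta: sd of cluster-mean residuals."""
    groups = {}
    for v, y, c in zip(values, fitted, clusters):
        groups.setdefault(c, []).append(v - y)
    return pstdev([mean(g) for g in groups.values()])

def bias_adjusted(estimate, fitted, clusters, eta_hat, eps_star_hat, sigma,
                  n_boot=100, seed=0):
    """Average bootstrap bias of the stand-in estimator and the
    bias-compensated estimate."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        sample = bootstrap_sample(fitted, clusters, eta_hat,
                                  eps_star_hat, sigma, rng)
        draws.append(sd_of_cluster_means(sample, fitted, clusters))
    bias = mean(draws) - estimate
    return estimate - bias, bias

# Toy 'survey': six clusters of four households (made-up numbers).
clusters = [c for c in "ABCDEF" for _ in range(4)]
fitted = [6.0 + 0.05 * i for i in range(24)]
eta_hat = [-0.3, -0.1, 0.0, 0.1, 0.1, 0.2]
eps_star_hat = [-1.5, -0.5, 0.5, 1.5] * 6
sigma = [0.4] * 24
estimate = pstdev(eta_hat)
adjusted, bias = bias_adjusted(estimate, fitted, clusters,
                               eta_hat, eps_star_hat, sigma)
print(f"estimate={estimate:.3f} bias={bias:.3f} adjusted={adjusted:.3f}")
```

Treating the point estimates as the 'true' model means the bias of any estimator can be read off as the gap between its average over bootstrap samples and the value used to generate them.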

    7. For each sample we re-estimate the model, resulting in point estimates for $\beta$, $\alpha$, $\eta$,
         and $\varepsilon^*$. These are used to impute log per capita consumption values for
         households in the 'census'. For census 'EAs' an $\eta$ is drawn from the estimation
         result; for households an $\varepsilon^*$ is drawn and multiplied by the household-specific
         standard deviation, using the current value of $\alpha$. The sum of cluster and household 'error' is
         added to the systematic part of log per capita expenditure, based on the household
         regressors and the current value of $\beta$.

Thus we generate values of log per capita expenditure for all households in the census.
Using these we compute welfare statistics (poverty and inequality measures). The tables
and figures in the text represent means and standard deviations of the simulated welfare
statistics thus generated.
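
The final imputation step and the welfare statistics can be sketched in Python as follows. The headcount, FGT2 and GE(0) formulas are the standard textbook definitions; the poverty line, the toy census and the fixed parameter values are illustrative assumptions. For brevity the sketch reuses one set of parameters in every replication, whereas the procedure above re-estimates the model on each bootstrap sample.

```python
import math
import random
from statistics import mean, stdev

def fgt(expenditures, z, a):
    """Foster-Greer-Thorbecke index: a=0 is the headcount, a=2 is FGT2."""
    return mean(((z - x) / z) ** a if x < z else 0.0 for x in expenditures)

def ge0(expenditures):
    """Mean log deviation, GE(0) = mean(log(mean(x) / x_i))."""
    m = mean(expenditures)
    return mean(math.log(m / x) for x in expenditures)

def simulate_census(xb, eas, eta_pool, eps_star_pool, alpha0, alpha1, rng):
    """Impute per capita expenditure once for every census household."""
    # One cluster effect per enumeration area (EA).
    eta = {ea: rng.choice(eta_pool) for ea in sorted(set(eas))}
    out = []
    for mu, ea in zip(xb, eas):
        # Household sd from the systematic part mu (an assumption here).
        sigma_h = alpha0 * math.exp(alpha1 * mu)
        log_pce = mu + eta[ea] + sigma_h * rng.choice(eps_star_pool)
        out.append(math.exp(log_pce))
    return out

rng = random.Random(1)
xb = [5.6 + 0.02 * i for i in range(50)]   # hypothetical systematic parts
eas = [i // 10 for i in range(50)]         # five hypothetical EAs
eta_pool = [-0.2, -0.1, 0.0, 0.1, 0.2]
eps_star_pool = [-1.0, -0.5, 0.0, 0.5, 1.0]
z = math.exp(6.0)                          # hypothetical poverty line
sims = [simulate_census(xb, eas, eta_pool, eps_star_pool, 0.3, 0.05, rng)
        for _ in range(100)]

# Point estimate = mean over simulations; standard error = their sd.
for name, stat in [("headcount", lambda x: fgt(x, z, 0)),
                   ("FGT2", lambda x: fgt(x, z, 2)),
                   ("GE0", ge0)]:
    values = [stat(s) for s in sims]
    print(name, round(mean(values), 3), round(stdev(values), 3))
```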




23 See Pfeffermann and Glickman (2004) and Rao (2003). The estimators for the regression coefficients are
unbiased regardless of the error structure imposed.



Appendix References

Bates, D.M. and Pinheiro, J.C. (1998) "Computational methods for multilevel models".
       Available in PostScript or PDF format at http://franz.stat.wisc.edu/pub/NLME/
Pfeffermann, D., and Tiller, R. (2005). "Bootstrap Approximation to Prediction MSE for
       State-Space Models with Estimated Parameters". Journal of Time Series Analysis,
       26, 893-916.
Pfeffermann, D., and Glickman, H. (2004). "Mean Square Error Approximation in Small
       Area Estimation By Use of Parametric and Nonparametric Bootstrap". Invited
       lecture at the Joint Statistical Meeting, Toronto.
Rao, J.N.K. (2003) Small Area Estimation. Wiley: New York.
Venables, W.N. and Ripley, B.D. (1997) Modern Applied Statistics with S-plus. 3rd
       Edition, Springer-Verlag.





          Appendix 2: OLS Regression Results of Consumption Models

Table 1: Pseudo Survey 1
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      5.998746   0.376066 15.951 < 2e-16 ***
hsize           -0.088087   0.013425 -6.562 1.37e-10 ***
onlyindhead     -0.357614   0.187783 -1.904 0.057450 .
refrig           0.164402   0.076970   2.136 0.033187 *
toilet          -0.096050   0.052603 -1.826 0.068475 .
vehicle          0.203101   0.088630   2.292 0.022359 *
bilinghead      -0.341641   0.080568 -4.240 2.67e-05 ***
rechead          0.092900   0.059246   1.568 0.117526
av_femhead      -0.898957   0.371149 -2.422 0.015798 *
av_onlyindhead   2.250072   0.566152   3.974 8.13e-05 ***
av_primedhead    0.774239   0.260069   2.977 0.003056 **
av_rechead       0.786780   0.223840   3.515 0.000481 ***
av_runwater     -0.098425   0.066368 -1.483 0.138717
rhsize2          0.796609   0.167641   4.752 2.66e-06 ***
rroompp         -0.174065   0.039715 -4.383 1.44e-05 ***
rroompp2         0.011750   0.003473   3.384 0.000773 ***

Multiple R-Squared: 0.4838,    Adjusted R-squared: 0.4678

Table 2: Pseudo Survey 2
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        9.129474  0.804244 11.352 < 2e-16 ***
hsize             -0.096499  0.014865 -6.492 2.12e-10 ***
gasstove           0.172803  0.070264     2.459 0.014270 *
refrig             0.133641  0.081375     1.642 0.101186
toilet             0.087655  0.059192     1.481 0.139298
adultfracf         0.327968  0.159454     2.057 0.040243 *
av_adultfracm      0.747587  0.468641     1.595 0.111320
av_agehead        -0.033981  0.007541 -4.506 8.29e-06 ***
av_concreteroof   -0.382385  0.207337 -1.844 0.065759 .
av_femhead        -2.605026  0.637204 -4.088 5.09e-05 ***
av_primedhead     -0.659667  0.308155 -2.141 0.032800 *
av_radio          -0.874114  0.318263 -2.747 0.006249 **
av_rechead         0.451829  0.286645     1.576 0.115622
av_runwater       -0.179179  0.086103 -2.081 0.037964 *
av_television      0.776940  0.212897     3.649 0.000292 ***
av_waterheater     1.502314  0.854344     1.758 0.079308 .
rhsize2            0.953988  0.147147     6.483 2.23e-10 ***
rroompp           -0.027115  0.017004 -1.595 0.111454
ragehead2        123.342227 50.256095     2.454 0.014470 *

Multiple R-Squared: 0.4788,    Adjusted R-squared: 0.4593





Table 3: Pseudo Survey 3
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)       7.740004  0.553045 13.995 < 2e-16 ***
hsize            -0.111892  0.012125 -9.228 < 2e-16 ***
blender           0.142074  0.069276    2.051 0.040833 *
brickwall        -0.123116  0.065063 -1.892 0.059067 .
gasstove          0.231063  0.072605    3.182 0.001556 **
naturalroof      -0.169465  0.071431 -2.372 0.018070 *
onlyindhead       0.242028  0.166921    1.450 0.147733
radio             0.140417  0.055806    2.516 0.012193 *
stereo            0.247070  0.116874    2.114 0.035038 *
adultfracf        0.302865  0.165445    1.831 0.067787 .
bilinghead        0.163705  0.073534    2.226 0.026468 *
agehead          -0.002257  0.001585 -1.424 0.155226
secedhead         0.227859  0.118303    1.926 0.054693 .
av_agehead       -0.015256  0.006668 -2.288 0.022575 *
av_blender       -1.091239  0.259010 -4.213 3.02e-05 ***
av_concreteroof   1.030535  0.205624    5.012 7.63e-07 ***
av_femhead       -0.657499  0.421795 -1.559 0.119708
av_hsize         -0.096361  0.038338 -2.513 0.012285 *
av_onlyindhead   -0.539298  0.359583 -1.500 0.134336
av_primedhead    -0.386760  0.255997 -1.511 0.131505
av_radio         -0.745915  0.219001 -3.406 0.000715 ***
av_refrig         0.870410  0.258107    3.372 0.000807 ***
av_television     0.807982  0.192275    4.202 3.16e-05 ***
av_toilet        -0.258594  0.096860 -2.670 0.007851 **
av_waterheater   -1.194664  0.657062 -1.818 0.069666 .
rhsize2           0.949978  0.146055    6.504 1.99e-10 ***

Multiple R-Squared: 0.5511,   Adjusted R-squared: 0.5274





Table 4: Pseudo Survey 4
Dependent Variable: Log Per Capita Expenditure
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.061854   0.351890 14.385 < 2e-16 ***
hsize        -0.109834   0.016429 -6.685 6.35e-11 ***
refrig        0.174286   0.076323   2.284 0.022831 *
toilet        0.161254   0.054947   2.935 0.003497 **
adultfracm    0.320246   0.139893   2.289 0.022495 *
adultfracf    0.293536   0.138096   2.126 0.034042 *
bilinghead    0.143261   0.062064   2.308 0.021403 *
secedhead     0.205535   0.105298   1.952 0.051520 .
av_agehead    0.014903   0.007363   2.024 0.043521 *
av_blender    0.423415   0.159784   2.650 0.008314 **
av_brickwall  0.382044   0.128597   2.971 0.003117 **
av_radio     -0.727830   0.218147 -3.336 0.000914 ***
rhsize2       0.476885   0.148333   3.215 0.001392 **
rroompp      -0.140513   0.045565 -3.084 0.002161 **
rroompp2      0.012268   0.004478   2.740 0.006379 **

Multiple R-Squared: 0.4315,   Adjusted R-squared: 0.4151

Table 5: Pseudo Survey 5
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      6.20858   0.33501 18.533 < 2e-16 ***
hsize           -0.10914   0.01331 -8.198 2.22e-15 ***
blender          0.17330   0.06220    2.786 0.00554 **
brickwall        0.19870   0.06127    3.243 0.00126 **
onlyindhead     -0.31920   0.16104 -1.982 0.04804 *
toilet           0.09907   0.05699    1.738 0.08279 .
adultfracm       0.26519   0.13636    1.945 0.05239 .
av_adultfracm    1.05350   0.38360    2.746 0.00625 **
av_blender      -0.36338   0.16296 -2.230 0.02621 *
av_femhead      -0.88381   0.36526 -2.420 0.01590 *
av_refrig        1.56893   0.30584    5.130 4.21e-07 ***
av_runwater      0.19768   0.07834    2.524 0.01194 *
av_secedhead    -0.88101   0.49439 -1.782 0.07538 .
av_toilet       -0.38558   0.11117 -3.468 0.00057 ***
av_washmachine  -1.43055   0.49677 -2.880 0.00416 **
rhsize2          0.72648   0.15162    4.791 2.21e-06 ***
rroompp         -0.04117   0.01615 -2.550 0.01109 *

Multiple R-Squared: 0.5331,   Adjusted R-squared: 0.5176





Table 6: Pseudo Survey 6
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)       4.830e+00 4.540e-01 10.639 < 2e-16 ***
hsize            -1.031e-01 1.365e-02 -7.555 2.13e-13 ***
blender           1.693e-01 6.489e-02    2.608 0.009384 **
onlyindhead      -3.751e-01 1.941e-01 -1.933 0.053881 .
refrig            1.485e-01 7.901e-02    1.879 0.060809 .
bilinghead       -3.069e-01 7.068e-02 -4.342 1.73e-05 ***
agehead          -6.775e-03 2.913e-03 -2.325 0.020469 *
av_adultfracm     2.464e+00 6.953e-01    3.545 0.000432 ***
av_agehead       -1.184e-02 5.958e-03 -1.987 0.047493 *
av_blender       -5.047e-01 1.965e-01 -2.569 0.010503 *
av_brickwall      1.187e+00 2.092e-01    5.671 2.45e-08 ***
av_concreteroof  -7.636e-01 2.516e-01 -3.035 0.002537 **
av_onlyindhead    3.661e+00 5.887e-01    6.219 1.09e-09 ***
av_rechead        1.371e+00 2.384e-01    5.752 1.57e-08 ***
av_refrig         4.606e-01 3.069e-01    1.501 0.134090
av_washmachine   -7.053e-01 3.694e-01 -1.909 0.056798 .
av_waterheater    2.058e+00 7.781e-01    2.645 0.008436 **
rhsize2           6.923e-01 1.459e-01    4.746 2.74e-06 ***
rroompp          -4.672e-02 1.594e-02 -2.931 0.003541 **
ragehead2        -1.285e+02 8.692e+01 -1.479 0.139890

Multiple R-Squared: 0.4965,   Adjusted R-squared: 0.4766

Table 7: Pseudo Survey 7
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        7.05900   0.43468 16.240 < 2e-16 ***
hsize             -0.12896   0.01467 -8.793 < 2e-16 ***
brickwall          0.12956   0.06725    1.927 0.054605 .
refrig             0.27110   0.08101    3.347 0.000882 ***
toilet             0.10500   0.06705    1.566 0.117989
rechead            0.11263   0.05835    1.930 0.054186 .
av_brickwall       0.44800   0.19190    2.335 0.019975 *
av_concreteroof   -0.65035   0.24228 -2.684 0.007518 **
av_femhead        -2.13496   0.43132 -4.950 1.03e-06 ***
av_hsize           0.16780   0.04718    3.556 0.000414 ***
av_primedhead      0.73362   0.31380    2.338 0.019801 *
av_radio          -0.41700   0.19357 -2.154 0.031714 *
av_secedhead       1.06547   0.75789    1.406 0.160414
av_secplusedhead  -2.31016   1.26275 -1.829 0.067947 .
av_toilet         -0.33099   0.12981 -2.550 0.011084 *
av_waterheater    -1.91772   0.77601 -2.471 0.013809 *
rhsize2            0.51461   0.14845    3.467 0.000574 ***
rroompp           -0.04888   0.01765 -2.769 0.005839 **

Multiple R-Squared: 0.4735,   Adjusted R-squared: 0.4549





Table 8: Pseudo Survey 8
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)       6.734e+00 5.829e-01 11.552 < 2e-16 ***
hsize            -1.208e-01 1.689e-02 -7.151 3.22e-12 ***
radio             1.080e-01 6.370e-02    1.695 0.09076 .
refrig            2.748e-01 8.626e-02    3.186 0.00154 **
toilet            1.568e-01 7.117e-02    2.203 0.02806 *
vehicle           2.872e-01 1.095e-01    2.623 0.00898 **
agehead          -7.636e-03 3.424e-03 -2.230 0.02619 *
av_adultfracm     1.856e+00 9.437e-01    1.967 0.04976 *
av_concreteroof   8.002e-01 1.790e-01    4.472 9.70e-06 ***
av_femhead       -1.495e+00 5.201e-01 -2.875 0.00422 **
av_primedhead    -1.095e+00 4.020e-01 -2.724 0.00668 **
av_rechead        5.684e-01 2.779e-01    2.045 0.04139 *
av_runwater      -1.586e-01 8.212e-02 -1.931 0.05410 .
av_secedhead      2.328e+00 7.829e-01    2.974 0.00309 **
av_toilet        -2.154e-01 1.340e-01 -1.608 0.10844
rhsize2           8.414e-01 1.902e-01    4.424 1.20e-05 ***
rroompp          -7.209e-02 3.950e-02 -1.825 0.06864 .
rroompp2          6.130e-03 3.114e-03    1.968 0.04962 *
ragehead2        -1.873e+02 1.038e+02 -1.805 0.07173 .

Multiple R-Squared: 0.4414,   Adjusted R-squared: 0.4205

Table 9: Pseudo Survey 9
Dependent Variable: Log Per Capita Expenditure
Coefficients:
                    Value Std.Error DF    t-value p-value
(Intercept)      5.086357 0.1885547 441 26.975497 0.0000
hsize           -0.141745 0.0124185 441 -11.414072 0.0000
brickwall        0.104505 0.0600241 441  1.741055 0.0824
gasstove         0.135917 0.0672063 441  2.022382 0.0437
onlyindhead     -0.895540 0.1898896 441 -4.716112 0.0000
radio            0.137231 0.0543460 441  2.525141 0.0119
adultfracf       0.402884 0.1555154 441  2.590636 0.0099
bilinghead      -0.111148 0.0660533 441 -1.682702 0.0931
secedhead        0.260845 0.1083821 441  2.406719 0.0165
av_hsize         0.098639 0.0312940 44   3.152021 0.0029
av_runwater     -0.149705 0.0722344 44 -2.072487 0.0441
av_secedhead     1.286449 0.3965916 44   3.243761 0.0023
av_television   -0.318822 0.1260081 44  -2.530169 0.0151
av_washmachine   1.140216 0.2628267 44   4.338280 0.0001
rhsize2          0.653687 0.1320114 441  4.951745 0.0000

Multiple R-Squared: 0.506,    Adjusted R-squared: 0.491





Table 10: Pseudo Survey 10
Dependent Variable: Log Per Capita Expenditure
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.31149    0.23697 26.634 < 2e-16 ***
hsize         -0.11552    0.01597 -7.232 1.87e-12 ***
naturalroof   -0.18068    0.07998 -2.259 0.024322 *
television     0.14613    0.05767   2.534 0.011593 *
vehicle        0.26146    0.09715   2.691 0.007363 **
bilinghead    -0.15631    0.07185 -2.175 0.030083 *
av_adultfracm -1.67378    0.69667 -2.403 0.016655 *
av_blender    -0.65744    0.19602 -3.354 0.000859 ***
av_brickwall   0.22799    0.12851   1.774 0.076677 .
av_radio      -0.59248    0.21000 -2.821 0.004978 **
av_roompp      0.64006    0.23260   2.752 0.006150 **
av_secedhead   1.37118    0.53967   2.541 0.011371 *
rhsize2        0.72202    0.17679   4.084 5.17e-05 ***
rroompp       -0.03031    0.01717 -1.765 0.078153 .

Multiple R-Squared: 0.4344,  Adjusted R-squared: 0.4193



