Policy Research Working Paper                          11110




               When Aggregation Misleads
      Bias in Unit-Level Small Area Estimates of Poverty
                     with Aggregate Data

                            Paul Andres Corral Rodas




Poverty and Equity Global Department
May 2025


  Abstract
 This paper explores why small area poverty estimates from models at the household level that only
 use aggregate data as covariates exhibit systematic bias. The analysis demonstrates that this bias
 stems from the models' inability to capture the complete between-household variation in welfare, as
 they rely solely on covariates aggregated at geographic levels. Through model-based simulations, the
 paper shows that the bias in these models is minimized when the empirical variability of simulated
 welfare based on the model is closest to the true empirical variance of welfare at the area level.
 This finding also has implications for bias in unit-level models.




 This paper is a product of the Poverty and Equity Global Department. It is part of a larger effort by the World Bank to
 provide open access to its research and make a contribution to development policy discussions around the world. Policy
 Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The author may be contacted
 at pcorralrodas@worldbank.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
       When Aggregation Misleads: Bias in Unit-Level Small Area
                     Estimates of Poverty with Aggregate Data

                                         Paul Andres Corral Rodas∗




Key words: Small area estimation; poverty mapping; satellite imagery; census; official statistics

JEL classification: C13; C55; C87; C15
   ∗
    The World Bank Group - Poverty and Equity Global Practice (pcorralrodas@worldbank.org). The author
acknowledges financial support from the World Bank. Special thanks to Carlos Rodriguez-Castelan, Alexandru
Cojocaru, Tara Vishwanath, and Isabel Molina for comments on an earlier draft. Full replication package for the results
presented in this paper may be found in: https://github.com/pcorralrodas/UC_source_of_bias
1    Introduction

Household surveys aimed at gauging a population’s living standards often lack representativeness
beyond broad regions or specific population demographics. Additionally, there is a risk that many
pertinent locations or groups may be omitted from these surveys. However, detailed information
on poverty is crucial for effectively targeting resources to alleviate it.

The demand for disaggregated statistics has increased the reliance on indirect techniques that
integrate supplementary data from censuses, registries, or larger-scale surveys. These methods
are used to produce sufficiently precise statistics for granular populations. Small area estimation
encompasses a broad range of statistical techniques designed to enhance the precision of estimates
when household surveys lack the sample size required for the desired level of accuracy. Among these
techniques, model-based approaches stand out by leveraging the concept of "borrowing strength"
from larger datasets or auxiliary information. These methods use models that establish relationships
across areas (e.g., regression techniques), enabling the creation of indirect estimators (Molina and
Rao 2010).

Most model-based techniques fall into two main categories: unit-level models and area-level mod-
els. Unit-level models are generally applied when data on individual units (e.g., households) are
available, while area-level models are used when only aggregate data for specific geographic areas
(e.g., area means) are accessible, as described by Fay and Herriot (1979). In poverty estimation,
unit-level models predict the welfare distribution first and then apply a threshold to determine
the proportion of people below that threshold. In contrast, area-level models directly estimate the
poverty rate for an area.

Unit-level models face limitations when survey and census data are from different years — a common
issue in developing countries where censuses and surveys are conducted infrequently. Area-level
models, however, provide a feasible alternative. These models rely on linear functional forms and
perform estimation and prediction using only aggregate data for geographic entities of interest (Fay
and Herriot (1979); Torabi and Rao (2014)). Another approach, unit-context models, employs an
estimation stage where household-level measures are modeled as a linear function of area-level
characteristics (Nguyen (2012); Lange et al. (2018); Masaki et al. (2020)). Unit-context models,
like unit-level models, predict the welfare distribution first and then apply a threshold to determine
the proportion of people below that threshold, but they do so using only area-level characteristics.

Although appealing, unit-context models have been discouraged because they yield
biased estimates. Corral et al. (2022) show that the method is unable to fully replicate the
welfare distribution, which results in biased estimates. For a practical example of the resulting
bias, see Edochie et al. (2024). In that work the authors present the aggregated estimates at
the geographic level of representativeness and it is evident that across many areas the differences
compared to direct estimates are considerable (see Figures 7 to 9 in Edochie et al. (2024)). Moreover,
it can be seen that the majority of the estimates fall above the 45-degree line, suggesting an
upward bias in the estimates. Using real-world data, where samples are taken following methods
implemented in developing countries, Corral Rodas et al. (2023) present evidence on how unit-
context models produce biased estimates of poverty and that the noise and bias of unit-context
models is considerably larger than that produced by area-level models.

A key question is the source of the bias in unit-context models. In this note, the simulations
implemented by Corral et al. (2022) are used to study the source of the method’s bias. The
bias studied here goes beyond the potential bias of the method noted in Corral et al. (2021)
which was related to a sampling issue. Instead, the bias noted in this paper is related to the
transformation bias noted by Würz et al. (2022). These authors attempt to address the same issue
as Nguyen (2012), Lange et al. (2018) and Masaki et al. (2020) of not having auxiliary microdata
and relying on aggregate population-level auxiliary information. Würz et al. (2022) note that
relying on aggregate-level data to model household-level welfare leads to first-order bias from back-
transformation and second-order bias from using aggregate data. They note that using aggregate
means as covariates instead of individual values introduces additional bias due to the convexity of
the back-transformation function.

The bias studied here stems from the poor explanatory power of unit-context models and
is related to the bias noted by Würz et al. (2022). Because unit-context methods model transformed
household-level welfare as a linear function of area-level covariates only, they are unable to
adequately account for between-household variations in welfare. The simulations undertaken here
show that areas where the variance of predicted welfare is aligned to the area’s true variance of
welfare are those that exhibit the lowest bias across the entire welfare distribution. This illustrates
that under unit-level and unit-context models, bias at the area level is a function of the dependent
variable’s mean and the empirical variance of the linear fit. The prediction of the mean benefits from
the use of EB methods, but the empirical variance depends entirely on the auxiliary data at hand
and how well they align with the population in each area.

The note proceeds to present the assumed model for unit-level small area estimation and provides a
discussion on why not accounting for variance leads to biased estimates of poverty. The method for
creating simulated data is illustrated, followed by the results. Finally, conclusions are presented.



2    Small area estimation

The model-based small area estimation methods described in this paper depend on an
assumed model. The nested error model used for small area estimation was originally proposed by
Battese et al. (1988) to produce county-level corn and soybean crop area estimates for the American
state of Iowa. For the estimation of poverty and welfare, Molina and Rao (2010) and Elbers et al.
(2003) assume that the transformed welfare $y_{ah}$ for each household $h$ within each location $a$ in
the population is linearly related to a $1 \times K$ vector of characteristics (or correlates) $x_{ah}$ for that
household, according to the nested error model:1

\[
y_{ah} = x_{ah}\beta + \eta_a + e_{ah}, \quad h = 1, \dots, N_a, \; a = 1, \dots, A, \tag{1}
\]


where $\eta_a$ and $e_{ah}$ are, respectively, location- and household-specific idiosyncratic errors, assumed to
be independent of each other, following:

\[
\eta_a \overset{iid}{\sim} N\left(0, \sigma_\eta^2\right), \qquad e_{ah} \overset{iid}{\sim} N\left(0, \sigma_e^2\right),
\]

where the variances $\sigma_\eta^2$ and $\sigma_e^2$ are unknown. Here, $A$ is the number of locations into which the
population is divided and $N_a$ is the number of households in location $a$, for $a = 1, \dots, A$, and $n_a$ is
the sample size from area $a$. Finally, $\beta$ is the $K \times 1$ vector of regression coefficients.
One of the main assumptions is that, conditional on the observed characteristics, the model’s
errors are normally distributed. To
obtain estimates, the first step is to fit the model from Eq. 1 to the observed sample data via any
method providing consistent estimators. Usual fitting methods under this approach are maximum
likelihood (ML) or restricted maximum likelihood (REML), both based on the normal likelihood,
and the H3 method, which does not specify a distribution. This yields the vector of parameter estimates:

\[
\hat{\theta} = \left(\hat{\beta}, \hat{\sigma}_\eta^2, \hat{\sigma}_e^2\right).
\]


The empirical best (EB) area effects for the model are estimated from:

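\[
\hat{\eta}_a = \hat{\gamma}_a\left(\bar{y}_a - \bar{x}_a\hat{\beta}\right), \qquad \hat{\gamma}_a = \frac{\hat{\sigma}_\eta^2}{\hat{\sigma}_\eta^2 + \hat{\sigma}_e^2 / n_a},
\]

where $\bar{x}_a$ and $\bar{y}_a$ are the sample means in area $a$ for $x$ and $y$, respectively.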
The variance of the area effects is given by (Molina and Rao, 2010):

\[
\mathrm{var}\left[\eta_a \mid \bar{y}_a\right] = \sigma_\eta^2 \left(1 - \gamma_a\right)
\]
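
As a minimal numerical sketch of the shrinkage factor and the conditional variance above, the following Python snippet evaluates $\hat{\gamma}_a$ for several sample sizes. The variance values are taken from the simulation design in section 3 purely for illustration, the function names are hypothetical, and setting $\gamma_a = 0$ for an unsampled area is an assumption of this sketch rather than something stated in the text.

```python
def shrinkage_factor(sigma2_eta, sigma2_e, n_a):
    """gamma_a = sigma2_eta / (sigma2_eta + sigma2_e / n_a)."""
    if n_a == 0:
        # Assumption of this sketch: with no sample, nothing is learned about eta_a.
        return 0.0
    return sigma2_eta / (sigma2_eta + sigma2_e / n_a)


def eb_area_effect(gamma_a, ybar_a, xbar_beta_a):
    """Predicted area effect: gamma_a * (ybar_a - xbar_a * beta_hat)."""
    return gamma_a * (ybar_a - xbar_beta_a)


def area_effect_variance(sigma2_eta, gamma_a):
    """Conditional variance of the area effect: sigma2_eta * (1 - gamma_a)."""
    return sigma2_eta * (1.0 - gamma_a)


# Illustrative variance components (area effect sd 0.15, household error sd 0.5).
sigma2_eta, sigma2_e = 0.15 ** 2, 0.5 ** 2

for n_a in (0, 10, 200, 5000):
    g = shrinkage_factor(sigma2_eta, sigma2_e, n_a)
    v = area_effect_variance(sigma2_eta, g)
    print(f"n_a={n_a:5d}  gamma_a={g:.4f}  var[eta_a | ybar_a]={v:.5f}")
```

As the area sample size grows, $\hat{\gamma}_a$ approaches 1 and the conditional variance of the area effect shrinks toward zero, so the area mean is pinned down by the sample while the household-level spread is not.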

Making use of the parameters estimated from the model of Eq. 1, it is possible to produce a value
of yah for every household in the census data:


\[
y^*_{ah} = x_{ah}\hat{\beta} + \hat{\gamma}_a\left(\bar{y}_a - \bar{x}_a\hat{\beta}\right) + \varepsilon^*_{ah}
\]


where $\varepsilon^*_{ah}$ is drawn from $\varepsilon^*_{ah} \sim N\left(0, \sigma_\eta^2(1 - \gamma_a) + \sigma_e^2\right)$. This Monte Carlo procedure is often repeated
100 times in order to derive indicators under each simulated population and then average across them to
derive the final EB estimate. Notice how, in the simulated vectors, $x_{ah}\hat{\beta} + \hat{\gamma}_a\left(\bar{y}_a - \bar{x}_a\hat{\beta}\right)$ does not
vary across simulations; only $\varepsilon^*_{ah}$ varies across the simulated vectors.
   1
    For simplicity $y_{ah}$ is considered the transformed welfare. The most common transformation applied is the natural
logarithm.

Alternatively, because normally distributed errors are assumed, the probability of being poor for
any household $h$ in area $a$ is entirely dependent on its expected welfare, $x_h\hat{\beta}$, its idiosyncratic error,
$\hat{e}_h$, and the predicted area effect $\hat{\gamma}_a\left(\bar{y}_a - \bar{x}_a\hat{\beta}\right)$, which are assumed to follow $e_h \sim N\left(0, \sigma_e^2\right)$ and
$\eta_a \sim N\left(\gamma_a\left(\bar{y}_a - \bar{x}_a\beta\right), \sigma_\eta^2(1 - \gamma_a)\right)$, respectively. The empirical best estimated probability of a
household being poor is given by:

\[
Prob\left(poor_{ha}\right) = \Phi\left(\frac{z - \hat{y}_{ah}}{\sqrt{\hat{\sigma}_\eta^2\left(1 - \hat{\gamma}_a\right) + \hat{\sigma}_e^2}}\right) \tag{2}
\]

where $z$ is the transformed poverty threshold and $\hat{y}_{ah} = x_{ah}\hat{\beta} + \hat{\gamma}_a\left(\bar{y}_a - \bar{x}_a\hat{\beta}\right)$. The poverty rate
for the area is given by the average probability of being poor across households.2 Within a given
area, the only thing that varies across households is the covariates.
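
The equivalence between the Monte Carlo procedure and the closed-form probability in Eq. 2 can be checked numerically. The sketch below is illustrative only: the fitted values `y_hat`, the poverty line `z`, and the shrinkage factor `gamma_a` are hypothetical numbers, not quantities from the paper, and the error standard deviation is the square root of $\sigma_\eta^2(1 - \gamma_a) + \sigma_e^2$.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(12345)

# Hypothetical inputs for a single area (not taken from the paper).
sigma2_eta, sigma2_e, gamma_a = 0.15 ** 2, 0.5 ** 2, 0.9
sd_star = (sigma2_eta * (1.0 - gamma_a) + sigma2_e) ** 0.5
z = 3.0                                              # transformed poverty line
y_hat = [rng.gauss(3.2, 0.4) for _ in range(2000)]   # fitted values incl. area effect

# Closed form: average of the household-level probabilities of being poor.
phi = NormalDist().cdf
analytic = fmean(phi((z - yh) / sd_star) for yh in y_hat)

# Monte Carlo: 100 simulated populations, poverty rate averaged across them.
M = 100
rates = []
for _ in range(M):
    poor = sum(1 for yh in y_hat if yh + rng.gauss(0.0, sd_star) < z)
    rates.append(poor / len(y_hat))
monte_carlo = fmean(rates)

print(f"analytic={analytic:.4f}  monte_carlo={monte_carlo:.4f}")
```

The two numbers should agree up to simulation noise, which is why the only household-level input that matters within an area is the fitted value $\hat{y}_{ah}$.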
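Under the assumed model, variation in simulated welfare across the population ($S^{2*}_{y_a}$) in a given area
is the sum of two components: 1) the variability of the explained portion, $S^{2*}_{\hat{y}_a} = \frac{1}{N_a}\sum_h \left(\hat{y}_{ah} - \bar{y}^*_a\right)^2$,
and 2) variability in the simulated errors, $\frac{1}{N_a}\sum_h \varepsilon^{*2}_{ah}$:

\[
\begin{aligned}
S^{2*}_{y_a} &= \frac{1}{N_a}\sum_h \left(y^*_{ah} - \bar{y}^*_a\right)^2 \\
S^{2*}_{y_a} &= \frac{1}{N_a}\sum_h \left(\hat{y}_{ah} + \varepsilon^*_{ah} - \bar{y}^*_a\right)^2 \\
S^{2*}_{y_a} &= \frac{1}{N_a}\sum_h \left(\hat{y}^2_{ah} + \varepsilon^{*2}_{ah} + \bar{y}^{*2}_a + 2\hat{y}_{ah}\varepsilon^*_{ah} - 2\hat{y}_{ah}\bar{y}^*_a - 2\bar{y}^*_a\varepsilon^*_{ah}\right) \\
S^{2*}_{y_a} &= \frac{1}{N_a}\sum_h \left(\hat{y}_{ah} - \bar{y}^*_a\right)^2 + \frac{1}{N_a}\sum_h \varepsilon^{*2}_{ah}
\end{aligned}
\]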
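
The last step of the decomposition drops the cross term $\frac{2}{N_a}\sum_h \left(\hat{y}_{ah} - \bar{y}^*_a\right)\varepsilon^*_{ah}$, which is close to zero because the simulated errors are drawn independently of the fitted values. A small numerical check, using hypothetical fitted values and an illustrative error standard deviation, is sketched below.

```python
import random
from statistics import fmean

rng = random.Random(2025)

N_a = 5000
sd_star = 0.5                                        # illustrative sd of simulated errors
y_hat = [rng.gauss(3.0, 0.4) for _ in range(N_a)]    # hypothetical fitted values
eps = [rng.gauss(0.0, sd_star) for _ in range(N_a)]  # simulated errors
y_star = [yh + e for yh, e in zip(y_hat, eps)]       # simulated welfare
ybar_star = fmean(y_star)

total = fmean((y - ybar_star) ** 2 for y in y_star)          # S^{2*}_{y_a}
explained = fmean((yh - ybar_star) ** 2 for yh in y_hat)     # explained portion
noise = fmean(e ** 2 for e in eps)                           # simulated-error portion
cross = 2.0 * fmean((yh - ybar_star) * e for yh, e in zip(y_hat, eps))

print(f"total={total:.4f}  explained+noise={explained + noise:.4f}  cross={cross:.4f}")
```

The identity `total = explained + noise + cross` holds exactly, and `cross` shrinks toward zero as the area population grows, which is what justifies writing the simulated variance as the sum of the two components.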


The bias of poverty estimates at the area level will be determined by differences between the true
$S^2_{y_a}$ and the simulated $S^{2*}_{y_a}$. This is simplified if we assume that welfare in a given
area follows a log-normal distribution, so that the transformed dependent variable is normal. Then the
simulated poverty rate for a given area under a given threshold ($\ln z$) is:

\[
FGT^*_{0a} = \Phi\left(\frac{\ln z - \bar{y}_a}{\sqrt{S^{2*}_{y_a}}}\right) \tag{3}
\]

Consequently, the poverty rate for a given area is a function of $\bar{y}_a$ and $S^{2*}_{y_a}$.
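
To illustrate how Eq. 3 translates a misstated variance into poverty bias, the sketch below holds the area mean fixed and varies only the simulated variance. The mean, variances, and thresholds are hypothetical values chosen for illustration, and `fgt0` is a helper name introduced here.

```python
from statistics import NormalDist


def fgt0(ln_z, mean_y, var_y):
    """Poverty rate under normality of transformed welfare (Eq. 3)."""
    return NormalDist().cdf((ln_z - mean_y) / var_y ** 0.5)


mean_y, true_var = 3.0, 0.30          # hypothetical area mean and true variance

for ln_z in (2.5, 3.0, 3.5):          # threshold below, at, and above the mean
    for sim_var in (0.20, 0.30, 0.40):
        bias = fgt0(ln_z, mean_y, sim_var) - fgt0(ln_z, mean_y, true_var)
        print(f"ln z={ln_z:.1f}  simulated var={sim_var:.2f}  bias={bias:+.4f}")
```

With the mean held fixed, overstating the variance raises the poverty rate for thresholds below the mean and lowers it for thresholds above the mean, while a threshold exactly at the mean is unaffected. The sign of the bias therefore changes along the welfare distribution.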


Unit-context models are an approximation to the assumed underlying data generating process from
Eq. 1. They were originally introduced by Nguyen (2012), then re-introduced by Lange et al. (2018) and
modified by Masaki et al. (2020). Unit-context models are defined as models where household-level
welfare is modeled using only area and sub-area level characteristics. Masaki et al. (2020) suggest
  2
      Traditionally, this process is approximated via Monte Carlo simulations as noted above.




that the model should include characteristics that explain variability at a geographic level below
the one for which we aim to estimate poverty. A possible unit-context model follows:


\[
y_{sach} = z_{sac}\alpha + t_{sa}\omega + g_s\lambda + \eta_{sa} + \varepsilon_{sach}
\]

where $s$ is used for an aggregation level that is over the target areas (a super-area) and $c$ is used for
subareas, e.g., clusters that are nested in area $a$. Hence, $z_{sac}$ contains subarea-level characteristics,
$t_{sa}$ includes area-level characteristics and $g_s$ is composed of super-area-level characteristics (which
may include super-area fixed effects). The regression coefficients across these levels are respectively
denoted $\alpha$, $\omega$ and $\lambda$. The random effects, $\eta_{sa}$, are specified in this model at the area level, the
same as in Eq. 1. Note that, among the set of covariates in this model, none is at the unit level;
covariates only vary at the subarea level and above.

A key feature of unit-context models is that the linear fit only explains a relatively small amount
of the total variance of the dependent variable ($y$), with a coefficient of determination ($R^2$) that is
often considerably lower than that of models that include household-level covariates, and that ranges
between 0.15 and 0.25 in most instances. Defining $S_y$ as the standard deviation of the sample’s
dependent variable, the coefficient of determination, $R^2$, is given by:

\[
R^2 = 1 - \frac{\hat{\sigma}_\eta^2 + \hat{\sigma}_e^2}{S_y^2}
\]

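Consequently, because unit-context models have lower explanatory power since they only rely on
area-level covariates, if $\hat{\sigma}^2_{\eta_{uc}}$ is the unit-context model estimate of $\sigma_\eta^2$, and $\hat{\sigma}^2_{e_{uc}}$ is the unit-context
model estimate of $\sigma_e^2$, then:

\[
R^2_{uc} = 1 - \frac{\hat{\sigma}^2_{\eta_{uc}} + \hat{\sigma}^2_{e_{uc}}}{S_y^2} < R^2 = 1 - \frac{\hat{\sigma}_\eta^2 + \hat{\sigma}_e^2}{S_y^2}
\]

\[
\hat{\sigma}^2_{\eta_{uc}} + \hat{\sigma}^2_{e_{uc}} > \hat{\sigma}_\eta^2 + \hat{\sigma}_e^2. \tag{4}
\]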

The lower $R^2$ of unit-context models implies that the share of welfare variation they explain is
smaller than the share explained under the true model.
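
The inequality in Eq. 4 can be illustrated with a toy example. The sketch below is a deliberate simplification: it uses ordinary least squares with a single covariate rather than a nested-error fit, and the data generating values (intercept 3.0, slope 0.5, and the area-specific shares) are hypothetical. It compares a fit on the household-level covariate with a fit on its area mean only.

```python
import random
from statistics import fmean

rng = random.Random(7)

A, N_a = 50, 100                       # hypothetical number of areas and households
x, xbar, y = [], [], []
for a in range(A):
    p = 0.2 + 0.6 * a / (A - 1)        # area-specific share of households with x = 1
    eta = rng.gauss(0.0, 0.15)         # area effect
    xs = [1.0 if rng.random() <= p else 0.0 for _ in range(N_a)]
    area_mean = fmean(xs)
    for xv in xs:
        x.append(xv)
        xbar.append(area_mean)         # the only covariate a unit-context fit sees
        y.append(3.0 + 0.5 * xv + eta + rng.gauss(0.0, 0.5))


def ols_r2_and_resid_var(cov, dep):
    """Simple one-covariate OLS; returns (R^2, mean squared residual)."""
    mx, my = fmean(cov), fmean(dep)
    sxx = sum((c - mx) ** 2 for c in cov)
    sxy = sum((c - mx) * (d - my) for c, d in zip(cov, dep))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((d - intercept - slope * c) ** 2 for c, d in zip(cov, dep))
    sst = sum((d - my) ** 2 for d in dep)
    return 1.0 - ssr / sst, ssr / len(dep)


r2_unit, resid_var_unit = ols_r2_and_resid_var(x, y)
r2_uc, resid_var_uc = ols_r2_and_resid_var(xbar, y)
print(f"unit-level:   R2={r2_unit:.3f}  residual variance={resid_var_unit:.3f}")
print(f"unit-context: R2={r2_uc:.3f}  residual variance={resid_var_uc:.3f}")
```

Because the area mean discards all within-area variation in the covariate, the unit-context fit leaves more of the variance in the residual, which is the mechanism behind Eq. 4.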

When simulating vectors of the dependent variable at the national level, the empirical standard
deviation of the dependent variable is approximated under unit-level and unit-context models. However,
under unit-context models the empirical standard deviation at the area level is not properly
approximated due to the model misspecification. Consequently, when creating simulated vectors
of welfare under unit-context models, the variation in simulated welfare across the population in an
area ($S^{2*}_{y_a}$) will not match that of the true population. From Eq. 4, we know that:

\[
\sum_h \varepsilon^{*2}_{ah_{uc}} > \sum_h \varepsilon^{*2}_{ah}.
\]

Additionally, given the poor model fit we also know that:

\[
\sum_h \left(\hat{y}_{ah_{uc}} - \bar{y}^*_{a_{uc}}\right)^2 < \sum_h \left(\hat{y}_{ah} - \bar{y}^*_a\right)^2
\]

Hence, under unit-context models some areas will have a value of $S^{2*}_{y_a}$ that is larger than that of
the true DGP and some will have a smaller or equal value of $S^{2*}_{y_a}$. This result, coupled with Eq. 3,
implies that the bias of unit-context models will be larger in areas where $S^{2*}_{y_a}$ is most different from
the true model’s $S^2_{y_a}$.


The focus here is the model’s performance on estimating poverty. However, the poor model fit
also affects predictions of welfare. Corral et al. (2022) present evidence that unit-context models
will do a good job at predicting the model’s dependent variable (y ), thanks to the use of empirical
best predictors. A similar result is presented by Chen et al. (2024) who indicate that under model
misspecification, unit-context models will perform just as well as area level models in predicting the
model’s dependent variable.3 Nevertheless, this seems to hold only when the goal is to estimate the mean
of the model’s dependent variable. It does not hold when the goal is to estimate
the original untransformed equivalized income or expenditure. Under unit-context models, bias is
introduced when the target is a distributionally sensitive measure, e.g., poverty or
equivalized income or expenditure. Corral et al. (2022) present evidence from simulations on how
unit-context models will produce biased predictions of the back transformed dependent variable,
i.e. equivalized income or expenditure (see figure 5.3 in Corral et al. (2022)). Würz et al. (2022)
also present evidence on how unit-context models will lead to biased estimates of welfare that arise
from the back transformation of the dependent variable.4
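Assuming log linearity, the simulated nested error model ($y^*_{ah} = x_{ah}\hat{\beta} + \hat{\gamma}_a\left(\bar{y}_a - \bar{x}_a\hat{\beta}\right) + \varepsilon^*_{ah}$)
implies that $\exp\left(y^*_{ah}\right) = \exp\left(\hat{y}_{ah}\right)\exp\left(\varepsilon^*_{ah}\right)$. Since $\varepsilon^*_{ah} \sim N\left(0, \sigma_\eta^2(1 - \gamma_a) + \sigma_e^2\right)$, then
$E\left[\exp\left(\varepsilon^*_{ah}\right)\right] = \exp\left(0.5\left[\sigma_\eta^2(1 - \gamma_a) + \sigma_e^2\right]\right)$. Therefore, at the area level welfare will be equal to:

\[
\frac{1}{N_a}\sum_{h=1}^{N_a} \exp\left(\hat{y}_{ah}\right)\exp\left(0.5\left[\sigma_\eta^2(1 - \gamma_a) + \sigma_e^2\right]\right)
\]

Consequently, just like for poverty, the predicted welfare under unit-context models will be more
biased the greater the difference between $S^{2*}_{y_a}$ and the true model’s $S^2_{y_a}$.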
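
The back-transformation factor can be verified by simulation. The following sketch compares the Monte Carlo mean of $\exp(\varepsilon^*_{ah})$ with the closed-form value $\exp(0.5\,\sigma^{2})$ for an illustrative error variance. The value `gamma_a = 0.9` and the inflated variance used for the comparison are assumptions made only for this example.

```python
import math
import random
from statistics import fmean

rng = random.Random(99)

# Illustrative variance of the simulated error: sigma2_eta * (1 - gamma_a) + sigma2_e.
gamma_a = 0.9
sigma2_star = 0.15 ** 2 * (1.0 - gamma_a) + 0.5 ** 2

draws = [math.exp(rng.gauss(0.0, math.sqrt(sigma2_star))) for _ in range(200_000)]
mc_factor = fmean(draws)
closed_form = math.exp(0.5 * sigma2_star)
print(f"Monte Carlo={mc_factor:.4f}  closed form={closed_form:.4f}")

# An overstated error variance (hypothetical +20%) inflates the factor applied
# to every household's exp(y_hat) in the area.
inflated = math.exp(0.5 * 1.2 * sigma2_star)
print(f"relative change in the factor: {inflated / closed_form - 1.0:+.2%}")
```

The factor multiplies each household's $\exp(\hat{y}_{ah})$, so a misstated error variance moves the back-transformed errors in the same direction for every household in the area.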
   3
     Under unit-level and unit-context models welfare is usually transformed to ensure that errors are normally
distributed to conform to the model’s assumptions.
   4
     Under unit-level models welfare is usually transformed to ensure that errors are normally distributed to conform
to the model’s assumptions.




In the following section, I present a model-based simulation to illustrate how $S^{2*}_{y_a}$ is related to biased
estimates at the area level.



3       Simulation data

Data are generated for the simulations following the assumed model from Eq. 1.5 The population
size for the simulated data is $N = 500{,}000$, and the observations are allocated among $A = 100$
areas ($a = 1, \dots, A$). Within each area $a$, observations are uniformly allocated over $C_a = 20$ clusters
($c = 1, \dots, C_a$). Each cluster $c$ consists of $N_{ac} = 250$ observations. In this simulation experiment,
a simple random sample of $n_{ac} = 10$ households per cluster is taken, and this sample is kept fixed
across simulations. Using a sample, it is possible to compare with estimators based on the FH
model (see Corral et al. (2022); Molina et al. (2022); Rao and Molina (2015); Fay and Herriot
(1979)). The model that generates the population data contains both cluster and area effects.
Cluster effects are simulated as $\eta_{ac} \overset{iid}{\sim} N(0, 0.1)$, area effects as $\eta_a \overset{iid}{\sim} N\left(0, 0.15^2\right)$ and household-specific
residuals as $e_{ach} \overset{iid}{\sim} N\left(0, 0.5^2\right)$, where $h = 1, \dots, N_{ac}$, $c = 1, \dots, C_a$, $a = 1, \dots, A$.
The covariates are simulated as follows:

    1. $x_1$ is a binary variable, taking value 1 when a random uniform number between 0 and 1, at
        the household-level, is less than or equal to $0.3 + 0.5\frac{a}{40} + 0.2\frac{c}{10}$.

    2. $x_2$ is a binary variable, taking value 1 when a random uniform number between 0 and 1, at
        the household-level, is less than or equal to 0.2.

    3. $x_3$ is a binary variable, taking value 1 when a random uniform number between 0 and 1, at
        the household-level, is less than or equal to $0.1 + 0.2\frac{a}{40}$.

    4. $x_4$ is a binary variable, taking value 1 when a random uniform number between 0 and 1, at
        the household-level, is less than or equal to $0.5 + 0.3\frac{a}{40} + 0.1\frac{c}{10}$.

    5. $x_5$ is a discrete variable, simulated as the rounded integer value of the maximum between 1
        and a random Poisson variable with mean $\lambda = 3\left(1 - 0.1\frac{a}{40}\right)$.

    6. $x_6$ is a binary variable, taking value 1 when a random uniform value between 0 and 1 is less
        than or equal to 0.4. Note that the values of $x_6$ are not related to the area’s label.

    7. $x_7$ is generated from a random Poisson variable with mean $\lambda = 3\left(\frac{c}{20} - \frac{a}{100} + u\right)$, where $u$ is
        a random uniform value between 0 and 1.
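
The covariate rules listed above can be sketched in Python as follows. This is an illustrative sketch, not the replication code: the function names are hypothetical, the Poisson sampler is a generic textbook method (Knuth's multiplication algorithm), and $x_7$ is left out because the text does not say how a non-positive Poisson mean would be handled for some combinations of $a$ and $c$. The demo uses a reduced number of households per cluster to keep the run short.

```python
import math
import random
from statistics import fmean


def poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method (small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_covariates(A=100, C=20, N_ac=250, seed=1):
    """Return one dict per household with area a, cluster c and covariates x1..x6."""
    rng = random.Random(seed)
    households = []
    for a in range(1, A + 1):
        for c in range(1, C + 1):
            for _ in range(N_ac):
                households.append({
                    "a": a,
                    "c": c,
                    "x1": int(rng.random() <= 0.3 + 0.5 * a / 40 + 0.2 * c / 10),
                    "x2": int(rng.random() <= 0.2),
                    "x3": int(rng.random() <= 0.1 + 0.2 * a / 40),
                    "x4": int(rng.random() <= 0.5 + 0.3 * a / 40 + 0.1 * c / 10),
                    "x5": max(1, poisson(rng, 3 * (1 - 0.1 * a / 40))),
                    "x6": int(rng.random() <= 0.4),
                })
    return households


# Reduced population (10 households per cluster instead of 250) for a quick run.
pop = simulate_covariates(N_ac=10)
print(len(pop), round(fmean(h["x2"] for h in pop), 3), round(fmean(h["x6"] for h in pop), 3))
```

Averaging any of these covariates within each `(a, c)` pair would give the cluster-level means that a unit-context model uses in place of the household-level values.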

For the unit-context model variations implemented here, the PSU-level means of the covariates are
used as the eligible covariates to fit the model.
    5
   Data is simulated following the same approach as Corral et al. (2022). The write-up here is also borrowed from
Corral et al. (2022).




In this experiment, I take a grid of 99 poverty thresholds, corresponding to the 99 percentiles of
the very first population generated. In total, 1,000 populations are generated. In each of the 1,000
populations, the following quantities are computed in every area for each of the 99 poverty lines:

    1. True poverty indicators $\tau_a$, using the “census”.

    2. CensusEB estimators $\hat{\tau}_a^{CEBa}$ presented in Corral et al. (2021), based on a nested-error model
       with only area random effects and including the unit-level values of the covariates. The $R^2$
       for this model is roughly 0.60.

    3. Unit-context CensusEB estimators $\hat{\tau}_a^{UC-CEBa}$ based on a nested-error model with random
       effects at the area-level. This estimator follows the approach of Masaki et al. (2020). The
       $R^2$ of the resulting model hovers around 0.17.

The average difference between the true poverty indicator and the estimate across the 1,000 simu-
lations represents the empirical bias for each area.
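
The empirical bias for one area and one poverty line can be sketched as below. This is a toy illustration under stated assumptions, not the paper's simulation code: the sign convention (estimate minus truth), the lognormal toy welfare, and the hypothetical estimator that overshoots by a constant 0.02 are all introduced here for demonstration.

```python
import random


def headcount(welfare, line):
    """Share of units with welfare below the poverty line."""
    return sum(1 for w in welfare if w < line) / len(welfare)


def empirical_bias(true_vals, est_vals):
    """Average of (estimate - truth) across simulated populations."""
    diffs = [e - t for t, e in zip(true_vals, est_vals)]
    return sum(diffs) / len(diffs)


rng = random.Random(0)
truth, estimate = [], []
for _ in range(1000):  # 1,000 simulated populations, as in the experiment
    # One toy area of 200 households with lognormal welfare (hypothetical)
    welfare = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]
    tau = headcount(welfare, line=1.0)
    truth.append(tau)
    estimate.append(tau + 0.02)  # hypothetical estimator, 2 points too high

bias = empirical_bias(truth, estimate)
```

In the actual experiment this calculation would be repeated for each of the 100 areas and each of the 99 poverty lines.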


4     Results

    Figure 1: Empirical bias of poverty for CensusEB and Unit-Context small area estimation




Note: Simulation based on 1,000 populations generated as described in section 3. Each line corresponds to one of
the 100 areas. The x-axis represents the percentile on which the poverty line falls, and the y-axis is the empirical
bias.

Because poverty is predicted across the 99 thresholds noted in the previous section, it is possible to
plot the bias across all lines and areas. The 99 percentiles are considered since the goal
of unit-level small area estimation of poverty is to replicate the full welfare distribution and, from
it, estimate poverty. The simulations presented here train the model and predict using the same
data; thus, sampling does not play a role. This is done in order to remove other potential sources of
bias, such as the omitted variable bias in unit-context models noted by Corral et al. (2021).

As can be seen in Figure 1, the bias of unit-context models is present across all lines, but as noted
by Corral et al. (2022), the bias is lower for some areas and for some percentiles.

          Table 1: Statistics for the area level empirical variation ($S^2_{\hat{y}}$) of the model’s linear fit

                                                      True $S^2_{\hat{y}}$    UC $S^2_{\hat{y}}$    EB $S^2_{\hat{y}}$
                                             Min            0.374                 0.075                0.374
                                             Max            0.861                 0.106                0.861
                                             Mean           0.568                 0.088                0.568
                                             p25            0.448                 0.084                0.449
                                             p50            0.549                 0.088                0.549
                                             p75            0.682                 0.091                0.682
Note: Simulations based on 1,000 populations generated as described in section 3 and averaged to the area level. The
true level variation in the explained portion of the model ranges from 0.374 to 0.861, with the average across the 100
areas being 0.568.



Table 2: Statistics for the area level total empirical variance of the model simulated dependent
variable
                                         True            Unit-context model                               EB
                                     $S^2_{y_a}$    $S^{2*}_{y_a}$    $S^{2*}_{y_a}/S^2_{y_a}$    $S^{2*}_{y_a}$    $S^{2*}_{y_a}/S^2_{y_a}$

                               Min       0.632          0.813               0.744                  0.632               0.998
                               Max       1.119          0.844               1.295                  1.119               1.003
                               Mean      0.826          0.826               1.026                  0.826               1.000
                               p25       0.707          0.822               0.879                  0.707               0.999
                               p50       0.808          0.826               1.024                  0.807               1.000
                               p75       0.940          0.830               1.166                  0.940               1.001
Note: Simulations based on 1,000 populations generated as described in section 3 and averaged to the area level.


As argued in section 2, the bias arises because, at the area level, the unit-context model does a poor
job at capturing the empirical variation of the explained portion of the data, $S^2_{\hat{y}}$.6 Table 1 presents
the value of $S^2_{\hat{y}}$ across areas. The true empirical variation of the explained portion, $S^2_{\hat{y}}$, ranges
from 0.37 for the area with the lowest empirical variation to 0.86 for the area with the largest, a
similar range to the EB model. However, the range for the unit-context model is much smaller,
from 0.075 to 0.106. Nevertheless, on average, the total empirical variation of the dependent variable,
$S^2_y$, is matched by unit-context models (Table 2; mean). This is because the model is fit at the
national level and $\sigma^2_\eta$ and $\sigma^2_e$ are estimated to be much larger, since the covariates capture so little
of the total empirical variance of the dependent variable, $S^2_y$. Consequently, under the unit-context
model the range of the total empirical variance across areas is narrow, from 0.813 to 0.844, compared
to the truth of 0.632 to 1.119. Thus, in the unit-context models the larger estimates for $\sigma^2_\eta$ and
$\sigma^2_e$ only work at the national level; across areas, this leads to some areas where $S^{2*}_{y_a}$ is
considerably larger than the true $S^2_{y_a}$ and some where $S^{2*}_{y_a}$ is considerably smaller than the true
$S^2_{y_a}$.
   6 At the area level this is equal to: $\frac{1}{N_a}\sum_{h}\left(\hat{y}_{ah} - \bar{y}^{*}_{a}\right)^{2}$

Unit-context models are biased because they deviate from the truth by assuming welfare does not
depend on unit-level characteristics, which leads to poorly fitting models. Figure 2 illustrates
how this is manifested across selected poverty lines. The absolute bias for a given area decreases as
the ratio of the model's simulated variance to the true variance for that area, $S^{2*}_{y_a}/S^2_{y_a}$, gets
close to 1. Therefore, even if the use of empirical best methods guarantees that the mean of the
dependent variable of the model at the area level is unbiased, because poverty is distribution
dependent, the inability of unit-context models to approximate the full distribution at the area
level leads to biased poverty estimates.
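
A stylized calculation can illustrate why an unbiased mean is not enough. The sketch below assumes that the model's dependent variable is normally distributed within an area, that the model recovers the area mean exactly, and that only the variance is off by a given ratio; none of this is the paper's simulation design. The true variance of 0.826 is the mean of the true area variances in Table 2, and the ratios roughly span those reported there for the unit-context model. The poverty line is placed at the area's own 25th percentile, which is a simplification made for this example.

```python
from statistics import NormalDist


def poverty_rate(mean, variance, z):
    """P(y <= z) when y ~ Normal(mean, variance)."""
    return NormalDist(mean, variance ** 0.5).cdf(z)


def bias_by_variance_ratio(true_var, ratios, mean=0.0, percentile=0.25):
    """Poverty bias when only the variance is misstated by each ratio."""
    true_dist = NormalDist(mean, true_var ** 0.5)
    z = true_dist.inv_cdf(percentile)  # poverty line at a true percentile
    out = {}
    for r in ratios:
        # Same mean, variance scaled by r: bias is model rate minus true rate
        out[r] = poverty_rate(mean, r * true_var, z) - percentile
    return out


bias = bias_by_variance_ratio(true_var=0.826, ratios=[0.75, 1.0, 1.3])
for ratio, value in bias.items():
    print(f"variance ratio {ratio:.2f}: poverty bias {value:+.4f}")
```

Under these assumptions, for a line below the mean, a variance that is too small understates poverty and a variance that is too large overstates it, while the bias vanishes when the ratio equals 1.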




                  Figure 2: Empirical bias of Unit-Context models across select lines




Note: Simulation based on 1,000 populations generated as described in section 3. Each dot corresponds to one of the
100 areas. The x-axis represents $S^{2*}_{y_a}/S^2_{y_a}$ for the UC model, and the y-axis is the absolute empirical bias. The
gray line represents a quadratic fit.


Figure 3 averages the absolute bias from the model across all 99 poverty lines for each area.7 The
figure makes it clearer that the bias is smallest where the unit-context model’s
$S^{2*}_{y_a}/S^2_{y_a}$ is closest to 1. Hence, estimates are more biased for some areas than
for others.
   7 For area a, the average absolute bias across all lines is given by:
$\frac{1}{99}\sum_{p=1}^{99}|\hat{\tau}_{pa} - \tau_{pa}|$, where $\tau$ is the head count
poverty rate.
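
The average absolute bias defined in footnote 7 can be sketched as below. The toy values are hypothetical and chosen only to show why absolute values are averaged: an area that is overestimated at low poverty lines and underestimated at high ones would look nearly unbiased if signed errors were averaged instead.

```python
def average_absolute_bias(estimates, truths):
    """(1/P) * sum over p of |tau_hat_pa - tau_pa| for one area and P lines."""
    if len(estimates) != len(truths) or not truths:
        raise ValueError("estimates and truths must be non-empty and equal length")
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(truths)


# Hypothetical area whose true poverty rate equals the percentile of each line
truths = [p / 100 for p in range(1, 100)]
# Hypothetical estimates: 1 point too high below the median, 1 point too low above
estimates = [t + 0.01 if p < 50 else t - 0.01
             for p, t in zip(range(1, 100), truths)]

avg_abs = average_absolute_bias(estimates, truths)
avg_signed = sum(e - t for e, t in zip(estimates, truths)) / len(truths)
```

Here the signed average is close to zero because the errors cancel, whereas the average absolute bias stays at about 0.01.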


                Figure 3: Empirical absolute bias of Unit-Context models across areas




Note: Simulation based on 1,000 populations generated as described in section 3. Each dot corresponds to one of the
100 areas. The x-axis represents $S^{2*}_{y_a}/S^2_{y_a}$ for the unit-context model, and the y-axis is the absolute empirical bias
across all 99 percentiles. The gray line represents a quadratic fit.


A similar issue is observed for welfare ($\exp(y^{*}_{ah})$) under unit-context models. The absolute bias
decreases as $S^{2*}_{y_a}/S^2_{y_a}$ gets closer to 1 (Fig. 4, left). Considering that the true area values of
untransformed welfare range from roughly 18 to 110, the bias for some areas is quite considerable.
In accordance with what was noted in section 2, the method yields unbiased values of the model’s
dependent variable, $\bar{y}_a$, i.e. the transformed welfare (Fig. 4, right). This is a similar finding to
that of Chen et al. (2024), who through simulations illustrate that, in the presence of substantial
model misspecifications, the unit-context model shows similar performance to that of area-level
models with known variances. Nevertheless, Chen et al. (2024) do not provide results for the
back-transformed variable. Considering that welfare usually requires transformation so that model
assumptions are met, the authors’ results hold little value for international welfare and poverty
monitoring.
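
The contrast between the two panels can be illustrated with a standard lognormal identity: if y is normal with mean mu and variance s2, then the mean of exp(y) is exp(mu + s2/2), so it depends on the variance as well as on the mean. The sketch below assumes normality within an area and a log transformation, which are simplifications for this example; the value of mu is hypothetical, the true variance of 0.826 is the mean of the true area variances in Table 2, and the ratios roughly span those reported there for the unit-context model.

```python
import math


def lognormal_mean(mu, variance):
    """E[exp(y)] when y ~ Normal(mu, variance)."""
    return math.exp(mu + variance / 2.0)


mu = 3.5          # hypothetical area mean of the transformed welfare
true_var = 0.826  # mean true area variance reported in Table 2

# Relative bias of mean welfare when the mean of y is exact but the
# variance is misstated by the given ratio
relative_bias = {
    ratio: lognormal_mean(mu, ratio * true_var) / lognormal_mean(mu, true_var) - 1.0
    for ratio in (0.75, 1.0, 1.3)
}
for ratio, value in relative_bias.items():
    print(f"variance ratio {ratio:.2f}: relative bias of mean welfare {value:+.3f}")
```

Under these assumptions the relative bias equals exp((ratio - 1) * s2 / 2) - 1, which is zero only when the simulated and true variances coincide, even though the mean of y is matched exactly in every case.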




Figure 4: Empirical absolute bias of Unit-Context model’s predicted welfare, $\exp(y_a)$ and $\bar{y}_a$ across
areas




Note: Simulation based on 1,000 populations generated as described in section 3. Each dot corresponds to one of the
100 areas. The x-axis represents $S^{2*}_{y_a}/S^2_{y_a}$ for the unit-context model, and the y-axis is the absolute empirical bias of
the mean welfare for the area. The gray line represents a quadratic fit.




5     Conclusions

This paper provides evidence that the bias in unit-context models for small area poverty estimation
stems primarily from their inability to adequately capture the full variance of welfare at the area
level. While these models may achieve unbiased estimates of mean transformed welfare (i.e. the
model’s dependent variable) through empirical best prediction methods, they systematically fail to
replicate the true welfare distribution within areas, leading to biased poverty and welfare
estimates.

The simulation results reveal several key insights:

First, unit-context models typically explain only a small portion of the total variance in welfare ($R^2$
ranging from 0.13 to 0.25) compared to traditional unit-level models ($R^2$ around 0.50). This limited
explanatory power arises because unit-context models rely solely on area-level covariates, omitting
household-level variation. Adding more covariates to the unit-context model risks overfitting and
could lead to further bias; thus, the solution does not lie in adding more data. An approach
similar to the one presented by Würz et al. (2022) could work, but it may require access to
unit-level data and thus is not feasible when using satellite-derived data.

Second, while unit-context models may be able to match at the national level the total empirical
variation of welfare through the estimation of area and household error components ($\sigma^2_\eta$ and $\sigma^2_e$),
this compensation mechanism breaks down at the area level. Some areas end up with significantly
over- or under-estimated variation of welfare, leading to systematic bias in poverty estimates.

Third, the analysis reveals that the magnitude of bias in poverty estimates is directly related to how
well the model-simulated total empirical variation of welfare matches the true variability of welfare
in each area. Areas where the ratio of the two approaches 1 show minimal bias, while areas with
substantial mismatches exhibit larger biases.

These findings have important implications for practitioners. While unit-context models offer
practical advantages in situations where household-level census data is unavailable or outdated,
their inherent limitations in capturing welfare distributions should be carefully considered. Users
should be particularly cautious when interpreting poverty estimates for areas where the model’s
simulated variability of welfare differs substantially from the observed welfare’s empirical variance
in survey data. As noted by Corral Rodas et al. (2023) and Corral et al. (2022), using simulated
and real-world data, area-level models such as the well-known Fay-Herriot model will outperform
unit-context models.
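
One way to operationalize this caution is sketched below. It is an illustration only and not a procedure proposed in this paper: the function name, the input layout, and the 10 percent tolerance are hypothetical, and in practice the survey-based variance for a small area is itself noisy, so any threshold would need to account for sampling error. Variances are assumed to be computed on the scale of the model's dependent variable.

```python
from statistics import pvariance


def variance_ratio_flags(simulated, observed, tolerance=0.10):
    """Return {area: (ratio, flagged)} comparing simulated to observed variance.

    simulated and observed map an area label to a list of values of the
    model's dependent variable. tolerance is a hypothetical cut-off.
    """
    flags = {}
    for area, obs in observed.items():
        ratio = pvariance(simulated[area]) / pvariance(obs)
        flags[area] = (ratio, abs(ratio - 1.0) > tolerance)
    return flags


# Hypothetical toy data: area "A" is matched, area "B" is too compressed
observed = {"A": [1.0, 2.0, 3.0, 4.0, 5.0], "B": [1.0, 2.0, 3.0, 4.0, 5.0]}
simulated = {"A": [1.0, 2.0, 3.0, 4.0, 5.0], "B": [2.0, 2.5, 3.0, 3.5, 4.0]}

flags = variance_ratio_flags(simulated, observed)
```

In this toy case both areas have the same simulated mean, yet area "B" is flagged because its simulated values vary far less than the observed ones.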

Future research might explore methods to improve the empirical variance approximation in unit-
context models or develop diagnostic tools to identify areas where these models are most likely to
produce reliable estimates. Additionally, investigating alternative approaches that better capture
within-area welfare distributions while maintaining the practical advantages of unit-context models
could prove valuable.




References
Battese, G. E., Harter, R. M., and Fuller, W. A. (1988). An error-components model for predic-
  tion of county crop areas using survey and satellite data. Journal of the American Statistical
  Association, 83(401):28–36.

Chen, Y., Lahiri, P., and Salvati, N. (2024). Effects of model misspecification on small area
  estimators. arXiv preprint arXiv:2403.11276.

Corral, P., Molina, I., Cojocaru, A., and Segovia, S. (2022). Guidelines to small area estimation
  for poverty mapping. The World Bank, Washington, DC.

Corral, P., Molina, I., and Nguyen, M. (2021). Pull your small area estimates up by the bootstraps.
  Journal of Statistical Computation and Simulation, 91(16):3304–3357.

Corral Rodas, P. A., Henderson, H. L., and Segovia Juarez, S. C. (2023). Poverty mapping in the
  age of machine learning. World Bank Policy Research Working Paper, (10429).

Edochie, I., Newhouse, D., Tzavidis, N., Schmid, T., Foster, E., Hernandez, A. L., Ouedraogo,
  A., Sanoh, A., and Savadogo, A. (2024). Small area estimation of poverty in four West
  African countries by integrating survey and geospatial data. Journal of Official Statistics, page
  0282423X241284890.

Elbers, C., Lanjouw, J. O., and Lanjouw, P. (2003). Micro-level estimation of poverty and inequality.
  Econometrica, 71(1):355–364.

Fay, R. E. and Herriot, R. A. (1979). Estimates of income for small places: An application of James-
  Stein procedures to census data. Journal of the American Statistical Association, 74(366a):269–
  277.

Lange, S., Pape, U. J., and Pütz, P. (2018). Small area estimation of poverty under structural
  change. World Bank Policy Research Working Paper No. 8472.

Masaki, T., Newhouse, D., Silwal, A. R., Bedada, A., and Engstrom, R. (2020). Small area
  estimation of non-monetary poverty with geospatial data. World Bank Policy Research Working
  Paper No. 9383.

Molina, I., Corral, P., and Nguyen, M. (2022). Estimation of poverty and inequality in small areas:
  Review and discussion. TEST, pages 1–24.

Molina, I. and Rao, J. (2010). Small area estimation of poverty indicators. Canadian Journal of
  Statistics, 38(3):369–385.

Nguyen, V. C. (2012). A method to update poverty maps. The Journal of Development Studies,
  48(12):1844–1863.


Rao, J. and Molina, I. (2015). Small area estimation. John Wiley & Sons, Hoboken, NJ, 2nd
  edition.

Torabi, M. and Rao, J. (2014). On small area estimation under a sub-area level model. Journal of
  Multivariate Analysis, 127:36–55.

Würz, N., Schmid, T., and Tzavidis, N. (2022). Estimating regional income indicators under
  transformations and access to limited population auxiliary information. Journal of the Royal
  Statistical Society: Series A (Statistics in Society), 185(4):1679–1706.



