Policy Research Working Paper                        11034




            Predicting Income Distributions
                 from Almost Nothing
                             Daniel Gerszon Mahler
                                Marta Schoch
                               Christoph Lakner
                                Minh Nguyen
                                  Jose Montes




Development Data Group &
Poverty and Equity Global Department
January 2025


  Abstract
 This paper develops a method to predict comparable income and consumption distributions for all
 countries in the world from a simple regression with a handful of country-level variables. To fit
 the model, the analysis uses more than 2,000 distributions from household surveys covering 168
 countries from the World Bank's Poverty and Inequality Platform. More than 1,000 economic,
 demographic, and remote sensing predictors from multiple databases are used to test the models.
 A model is selected that balances out-of-sample accuracy, simplicity, and the share of countries
 for which the method can be applied. The paper finds that a simple model relying on gross domestic
 product per capita, under-5 mortality rate, life expectancy, and rural population share gives
 almost the same accuracy as a complex machine learning model using 1,000 indicators jointly. The
 method allows for easy distributional analysis in countries with extreme data deprivation where
 survey data are unavailable or severely outdated, several of which are likely among the poorest
 countries in the world.




 This paper is a product of the Development Data Group and the Poverty and Equity Global Department. It is part of a
 larger effort by the World Bank to provide open access to its research and make a contribution to development policy
 discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/
 prwp. The authors may be contacted at dmahler@worldbank.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
   Predicting Income Distributions from
              Almost Nothing
     Daniel Gerszon Mahler, Marta Schoch, Christoph Lakner, Minh Nguyen, Jose Montes1




JEL classification: C53, D31, I32, O10
Keywords: Income, consumption, data deprivation, machine learning, poverty, measurement




1 All authors are with the World Bank. We are grateful for comments received from Roy van der Weide,
Johannes Hoogeveen, Federico Haslop, Andres Fernando Chamorro Elizondo, Zander Prinsloo, Benjamin
Stewart, Emi Suzuki, and the Global Poverty Monitoring Working Group of the World Bank. The authors
gratefully acknowledge financial support from the UK government through the Data and Evidence for
Tackling Extreme Poverty (DEEP) Research Programme.
1 Introduction
Household surveys are needed to measure poverty and design distributional policies, yet in
several countries they are not conducted due to low statistical capacity, conflict, or lack of
resources. In other cases, household surveys are collected but not shared with researchers and
policy makers (Dang et al. 2019, Ekhator-Mobayode and Hoogeveen 2022). Lack of data
disproportionally affects poorer countries, meaning that ignoring or not properly accounting for
such countries in global analyses biases results.

We address these gaps by developing a method that through a simple regression can predict
annual income and consumption (henceforth welfare) distributions with a credible distributional
shape for each country. The method uses widely available social and economic indicators at the
country level and does not require any household survey data for the country of interest. We
explicitly search for a simple model that can be easily taken up in applied work and can work for
the most data-deprived countries. To estimate such a model, we leverage rich information on
welfare distributions from more than 2,000 household surveys available in the World Bank's
Poverty and Inequality Platform (PIP), which covers 168 countries, using data between 1991 and
2020. For each of these surveys, we have distributions of daily per-capita household income or
consumption expressed in purchasing-power-parity-adjusted dollars.

We sequentially remove one of the 168 countries from the sample and predict the excluded
distribution for this country using the remaining 167 countries and various predictors. As potential
predictors, we search among more than 1,000 candidate variables spanning multiple databases
and including remote sensing indicators. After repeating this leave-one-out cross-validation for
each of the 168 countries with available data, we compare the predicted distributions to the
survey-based distributions, favoring models that minimize the prediction error.
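
The leave-one-country-out loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the toy data, the function names (`fit_mean_log`, `leave_one_country_out`), and the trivial stand-in "model" that predicts the mean log welfare of the training countries are all hypothetical and replace the regressions actually used in the paper.

```python
from statistics import mean

def fit_mean_log(train_rows):
    """Hypothetical stand-in model: predict the mean log welfare of the training rows."""
    return mean(r["ln_y"] for r in train_rows)

def leave_one_country_out(rows):
    """Hold out all rows of one country at a time and predict them from the rest."""
    errors = {}
    for held_out in sorted({r["country"] for r in rows}):
        train = [r for r in rows if r["country"] != held_out]
        test = [r for r in rows if r["country"] == held_out]
        prediction = fit_mean_log(train)
        # Mean absolute error in logs for the held-out country.
        errors[held_out] = mean(abs(r["ln_y"] - prediction) for r in test)
    return errors

# Toy data (invented): log welfare observations for three countries.
rows = [
    {"country": "A", "ln_y": 1.0}, {"country": "A", "ln_y": 1.2},
    {"country": "B", "ln_y": 2.0}, {"country": "B", "ln_y": 2.2},
    {"country": "C", "ln_y": 3.0},
]
cv_errors = leave_one_country_out(rows)
```

The key design choice mirrored here is that the fold is the country, not the survey, so no information from the held-out country enters the training data.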

We find that a model that uses GDP per capita, under-5 mortality, life expectancy, rural population
shares, and regional dummies predicts welfare well, and that adding more information does not
lead to any relevant gains. National accounts data stand out as the single best predictor of welfare,
even in data-deprived countries, although to a lesser extent. This suggests that notwithstanding the
gaps between welfare from surveys and national accounts (Deaton 2005; Pinkovskiy and Sala-i-
Martin 2016, Deaton and Schreyer 2022, Prydz et al. 2022) and measurement issues with GDP
in autocratic countries (Martinez 2022), GDP provides valuable information on welfare in contexts
where household surveys are unavailable. However, about half of the countries are so data-
deprived that they not only lack an income or consumption distribution, but also GDP. For these
countries, we use World Bank income groups as a proxy, resulting in two tiers of models
depending on whether GDP data is available for the country of interest.

These two tiers outperform models based on remote sensing data (e.g., nighttime lights,
vegetation), which do not meaningfully lower the out-of-sample error if added to the models. This
suggests that, on average, remote sensing data are worse predictors of welfare at the national
level than GDP, which is consistent with evidence suggesting that remote sensing data might
produce household welfare estimates very different from survey-based ones (Van Der Weide et
al. 2024). Overall, a relatively simple model using readily available data outperforms
more complicated models using costly data.

We implement our preferred methods for all countries to estimate global poverty, benchmarking
the results against poverty estimates published by the World Bank. We find that the models track
poverty rates relatively well in general but with notable exceptions. We show that these errors are
in part due to poverty estimates not being comparable across countries and within countries over
time but surely also due to modeling errors.

On average, our preferred models mispredict income or consumption by around 30%. While this
is a large error, it is important to assess it against an appropriate benchmark. A random forest on
all 1,000+ possible predictors does not lead to any gains in the out-of-sample error, suggesting
that much of the remaining error is likely irreducible. Furthermore, at a global scale, a 30% error
is small in comparison with the large observed income differences: The median welfare of the
richest country in our sample is more than 100 times that of the poorest country, and the 75th
percentile of medians is 5 times greater than the 25th percentile.

A long-standing literature has tried to overcome data gaps and predict distributions when limited
information was available. Survey-to-survey imputations can be used when survey data on
welfare is unavailable at a desired point in time, but data on correlates of households' wellbeing
is available together with welfare data from a prior household survey (Stifel and Christiansen
2007, Roy and Van Der Weide 2024). Alternatively, national accounts data can be used to
extrapolate older welfare vectors forward in time (Mahler et al. 2022; Angrist et al. 2021). Others
have estimated full distributions when grouped data or summary statistics are available (Chen
2018; Chotikapanich et al. 2012, Eckernkemper and Gribisch 2021, Jorda and Niño-Zarazúa
2019; Hajargasht et al. 2012). Wealth indices have been used to predict full distributions for
individual countries that lack consumption or income data but have a Demographic and Health
Survey (Filmer and Pritchett 2001, Dang et al. 2019). However, all these methods require at least
one survey-based welfare vector, thus making them inapplicable for countries that do not have
any survey data.

Remote sensing data and mobile phone data have been used to predict mean welfare, the poverty
rate, or another distributional statistic in a country in the absence of survey data (Pinkovskiy and
Sala-i-Martin 2016; Blumenstock et al. 2015, Steele et al. 2017; Pokhriyala and Jacques 2017;
Lee & Braithwaite 2022; Engstrom et al. 2022). However, these approaches do not predict full
distributions. Given the multiple poverty lines and welfare measures used in practice (see for
example Jorda et al. 2023, Decerf and Ferrando 2022, Kanbur et al. 2022, Jolliffe & Prydz 2021),
using such a model for each relevant welfare metric would jeopardize the simplicity of our
approach. Moreover, the remote sensing data needed for these approaches do not stretch back
far in time for long-term time trends and are not always publicly available, making these methods
difficult to implement for practitioners.

The remainder of the paper is structured as follows. The data and method are described in
sections 2 and 3. Sections 4 and 5 present results and robustness checks. Section 6 applies the
models to global poverty measurement, and section 7 concludes.




2 Data
Our primary data source is household survey data on disposable income or consumption available
in PIP. We use data from 1,989 surveys covering 168 countries for the period 1991 to 2020. We
exclude data before the 1990s as the quality was generally worse then, particularly for low- and
middle-income countries. The data are standardized as far as possible but differences exist with
regard to the method of data collection, and whether the welfare aggregate is based on income
or consumption. We use information on per capita household welfare expressed in 2017 USD
PPPs. We use PIP's public percentile database (version 20230919), utilizing 99 percentiles on
the distribution from each income or consumption vector. Concretely, we use the values of income
or consumption such that the cumulative distribution function $F(y)$ takes the following values
$\{0.01, 0.02, \ldots, 0.99\}$. That is, we retain the 99 poverty lines that result in poverty rates of
$\{1\%, 2\%, \ldots, 99\%\}$. The final dataset consists of 196,903 quantile-country-year observations on
pairs of daily per-capita welfare and the associated quantile in the distribution.
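
As a rough illustration of what such quantile-welfare pairs look like, the sketch below computes 99 percentile cut points from a welfare vector using Python's standard library. It is not the procedure behind PIP's percentile database: the welfare vector is simulated, households are unweighted, and the interpolation rule is simply the default of `statistics.quantiles`.

```python
import random
from statistics import quantiles

random.seed(0)
# Invented welfare vector: 5,000 draws from a log-normal, in dollars per person per day.
welfare = [random.lognormvariate(1.5, 0.7) for _ in range(5000)]

# n=100 returns the 99 cut points that split the data into 100 equal-probability groups.
percentiles = quantiles(welfare, n=100)

# (quantile, welfare) pairs for F(y) = 0.01, 0.02, ..., 0.99.
rows = [(k / 100, y) for k, y in enumerate(percentiles, start=1)]
```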

We combine this survey data with various possible predictors of welfare at the country level. We
use data from the World Development Indicators (WDI) of the World Bank, which is one of the
largest databases of country-year development indicators spanning a wide range of topics. The
WDI contains information on around 1,400 indicators covering topics such as health, agriculture,
education, climate change, infrastructure and more. We also use all data from the World
Economic Outlook of the International Monetary Fund, which contains dozens of variables on
macroeconomic indicators, and all data from the UN's World Population Prospects, which
contains dozens of variables on population, health, and demographics. We complement GDP data
from the sources above with estimates from the Maddison Database (Bolt and Van Zanden 2024).
We also use country and region classifications by the World Bank and data on political rights, civil
liberties, and freedom status from Freedom House.

In addition, we use remote sensing data available from the Google Earth Engine. We use data on
nighttime lights, precipitation, temperature, impervious surface, cropland, the normalized
difference water/snow/vegetation index, and the enhanced vegetation index. While spatial
coverage of remote sensing variables is not a concern as they cover the Earth's entire surface,
temporal coverage is at times limited. Nighttime light data, for example, dates back to 1992. To fit
into this exercise, the remote sensing data need to be aggregated to the country-year level. We
first aggregate them to annual data by calculating the mean, max, min, and standard deviation of
each location (e.g., a pixel) over a year. Afterwards, we aggregate them spatially by taking the
mean, max, min, and standard deviation of the annual data for a country. This gives 16 features
for each type of variable. Nighttime lights are also converted to a per capita level by dividing the
sum of lights by the population size. We weight each grid cell equally, and hence the indices reflect,
for example, the mean temperature in the territory of a country, not the mean temperature
experienced by a person in a country. Given that many of the variables impact welfare through
agriculture, it is not clear that population weighting (which would give a dominance to urban areas)
would make them more related to welfare distributions. Yet we also add population-grid weighted
estimates of temperature and rainfall from Gortan et al. (2024).
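
The two-step aggregation described above (four temporal statistics per location, then four spatial statistics per country) can be sketched as follows. The input layout, the function name `country_year_features`, and the use of the population standard deviation are assumptions made for illustration; the paper does not specify these details.

```python
from itertools import product
from statistics import mean, pstdev

# Assumption: population standard deviation; the source does not say which variant is used.
STATS = {"mean": mean, "max": max, "min": min, "sd": pstdev}

def country_year_features(pixel_series):
    """pixel_series: list of per-pixel lists holding one year of observations.

    Step 1 collapses each pixel's year to a temporal statistic; step 2 collapses
    those values across pixels (equal weight per pixel) to a spatial statistic.
    Four temporal times four spatial statistics give 16 features.
    """
    features = {}
    for (t_name, t_fn), (s_name, s_fn) in product(STATS.items(), STATS.items()):
        annual = [t_fn(series) for series in pixel_series]  # one value per pixel
        features[f"{s_name}_of_annual_{t_name}"] = s_fn(annual)
    return features

# Invented example: three pixels with four observations each.
pixels = [[10.0, 12.0, 14.0, 12.0], [20.0, 22.0, 24.0, 22.0], [5.0, 5.0, 5.0, 5.0]]
features = country_year_features(pixels)
```

Because every pixel enters the spatial step with the same weight, the result describes the territory rather than the conditions experienced by the average person.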




Where sensible, we use all variables in levels and in logs. From the total set of covariates, we
remove all that have more than 50% missing values, as these are unlikely to be relevant in the
application where we will apply our models, which is for the most data-deprived countries. This
leaves us with a total of 1,444 candidate variables for predicting welfare distributions.
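
A minimal sketch of the missingness filter, assuming a simple column-oriented table in which `None` marks a missing country-year value. The table, the column names, and the function name are hypothetical.

```python
def drop_sparse_covariates(table, max_missing_share=0.5):
    """Keep columns whose share of missing (None) values is at most max_missing_share."""
    kept = {}
    for name, values in table.items():
        missing_share = sum(v is None for v in values) / len(values)
        if missing_share <= max_missing_share:
            kept[name] = values
    return kept

# Invented country-year table: column name -> values, with None marking a missing value.
table = {
    "gdp_per_capita": [1200.0, None, 3400.0, 560.0],    # 25% missing, kept
    "under5_mortality": [80.1, 45.3, 12.9, 101.4],      # 0% missing, kept
    "rare_indicator": [None, None, None, 7.0],          # 75% missing, dropped
}
candidates = drop_sparse_covariates(table)
```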

3 Method

3.1 Distributional assumptions
To ensure that the predicted cumulative distribution functions (CDFs) are well-behaved, we need to
impose a distributional assumption. Though the log-normal distribution is the typical two-
parameter distribution used in applied work (see for example Bergstrom 2022, Kraay & Van der
Weide 2022, and Soergel 2021), we find that the log-logistic distribution, also known as the Fisk
distribution (Fisk 1961), provides a marginally better fit (see section 5.1). This is consistent with
Bresson (2009), who finds the log-logistic distribution to be the best performing two-parameter
distribution.

The Fisk distribution is given by

$$F_{\mathrm{fisk}}(y) = \frac{1}{1 + \left(\frac{y}{\alpha}\right)^{-\delta}} \qquad (1)$$

where $\alpha$ is the scale parameter, which equals the median of the distribution, and $\delta$ is the shape
parameter, which equals the inverse of the Gini coefficient ($\mathrm{Gini} = \frac{1}{\delta}$). We are interested in
predicting welfare levels, which we can isolate on the left-hand side by using the quantile function of
the log-logistic distribution.

$$F^{-1}_{\mathrm{fisk}} = y = \alpha \left(\frac{p}{1-p}\right)^{\mathrm{Gini}} \;\leftrightarrow\; \ln(y) = \ln(\alpha) + \mathrm{Gini} \cdot \ln\left(\frac{p}{1-p}\right) \qquad (2)$$

where $p$ is the quantile of the distribution (i.e., percentile in our application). Equation 2 thus
expresses log welfare as a function of the quantile. This quantile function of the Fisk distribution
is convenient because it can be estimated through a simple OLS regression where the covariates
included directly in the regression will predict $\ln(\alpha)$, while the covariates interacted with $\ln\left(\frac{p}{1-p}\right)$
will predict the Gini. With one covariate, such a regression can be written as

$$\ln\left(y_{p,c,t}\right) = \beta_0 + \beta_1 \cdot x_{c,t} + \left(\gamma_0 + \gamma_1 \cdot x_{c,t}\right) \cdot \ln\left(\frac{p}{1-p}\right) \qquad (3)$$

where $y_{p,c,t}$ is daily per-capita income or consumption at percentile $p$, in country $c$, in year $t$, and
$x_{c,t}$ is a covariate of interest. While $x_{c,t}$ is only available at the country-year level, it can be used
together with the distributional assumption to generate predictions at the country-year-percentile-
level. We could instead predict the median and Gini index separately and then recover a
distribution under the log-logistic parameterization. A robustness check shows that this variant
performs marginally worse (see section 5.1). The advantage of our preferred approach is that it
leverages the microdata and accomplishes the predictions in one step.
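
To make the link between the Fisk quantile function and a linear regression concrete, the sketch below generates 99 percentiles from a Fisk distribution with a chosen median and Gini and then recovers both parameters with a closed-form OLS fit of log welfare on the log-odds of the quantile. It covers a single distribution without covariates; the parameter values and function names are illustrative assumptions, not estimates from the paper.

```python
import math

def fisk_cdf(y, alpha, gini):
    """Fisk CDF with scale alpha (the median) and shape delta = 1 / gini."""
    return 1.0 / (1.0 + (y / alpha) ** (-1.0 / gini))

def fisk_quantile(p, alpha, gini):
    """Fisk quantile function: welfare level at quantile p."""
    return alpha * (p / (1.0 - p)) ** gini

def ols_line(x, y):
    """Closed-form OLS of y on a constant and x; returns (intercept, slope)."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

alpha, gini = 6.0, 0.40  # invented median (dollars per day) and Gini
ps = [k / 100 for k in range(1, 100)]
log_odds = [math.log(p / (1 - p)) for p in ps]
ln_y = [math.log(fisk_quantile(p, alpha, gini)) for p in ps]

# The intercept estimates ln(alpha) and the slope estimates the Gini.
intercept, slope = ols_line(log_odds, ln_y)
median_hat, gini_hat = math.exp(intercept), slope
```

With real survey percentiles the fit is no longer exact, and the residuals measure how far the observed distribution departs from the log-logistic shape.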


To illustrate our approach, we use only log GDP per capita as a covariate without interacting it
with the percentile term, i.e., setting $\gamma_1 = 0$ in equation 3, and implicitly assuming that all countries
have the same Gini. Estimating this regression on our 196,903 quantile-country-year
observations yields the following result:

$$\ln y_{p,c,t}\left(x_{c,t}\right) = -5.794 + 0.869 \cdot \ln \mathrm{GDP/capita}_{c,t} + 0.385 \cdot \ln\left(\frac{p}{1-p}\right) \qquad (4)$$

Hence, the median (in 2017 USD PPP per person per day) is estimated to be $\alpha =
e^{-5.794 + 0.869 \cdot \ln \mathrm{GDP/capita}_{c,t}}$, which for an annual GDP per capita of \$20,000 equals \$16.6. The $\beta_1$
estimate implicitly derives a 'passthrough rate' indicating how much of GDP growth passes
through to growth in (median) welfare. According to equation 4, whenever GDP per capita grows
by 1%, the median grows by 0.87%, which is similar to other estimates in the literature (Prydz et
al. 2022, Lakner et al. 2022). The Gini is predicted to be 38.5.
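
The arithmetic behind these numbers can be reproduced directly from the coefficients reported in equation 4. The sketch below is a plain evaluation of that equation; the function name and the choice of percentiles are illustrative.

```python
import math

B0, B1, GINI = -5.794, 0.869, 0.385  # coefficients reported in equation 4

def predicted_welfare(gdp_per_capita, p):
    """Daily welfare (2017 USD PPP) at quantile p for a given annual GDP per capita."""
    ln_y = B0 + B1 * math.log(gdp_per_capita) + GINI * math.log(p / (1 - p))
    return math.exp(ln_y)

median = predicted_welfare(20_000, 0.50)  # the log-odds term is zero at p = 0.5
p10 = predicted_welfare(20_000, 0.10)
p90 = predicted_welfare(20_000, 0.90)
print(round(median, 1))  # 16.6

# Doubling GDP per capita scales every percentile by 2 ** 0.869, roughly 1.83.
scale = predicted_welfare(40_000, 0.50) / median
```

Since the Gini term does not depend on GDP in this specification, a change in GDP per capita shifts the whole predicted distribution proportionally without altering its shape.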

For any GDP per capita, we can turn the predictions of welfare at a particular percentile to a full
distribution. Figure 1 provides a graphical example of the predicted distribution for two specific
levels of GDP per capita using equation (4).

        Figure 1: Illustration of predicted distributions for two levels of GDP per capita




                      Note: Predicted distributions from the output of equation 4.


PIP includes a mix of consumption and income distributions, with income more commonly used
in richer countries. We can predict either an income or consumption distribution for any model we
run by adding an income/consumption dummy and interacting this dummy with $\ln\left(\frac{p}{1-p}\right)$. That
said, given that the countries that use income aggregates tend to be wealthier and vice versa,
predicting income aggregates for poor countries and consumption aggregates for wealthy
countries would mean predicting beyond where there is common support and hence involve a
greater level of uncertainty.


3.2 Model performance

We use spatial-block leave-one-out cross-validation to compare out-of-sample errors of different
models (Roberts et al. 2017). That is, we sequentially remove all surveys for one of the 168
countries with available survey data and estimate welfare at the 99 percentiles of each distribution of
the omitted country by running variants of equation (3) on the remaining 167 countries. We
remove all surveys available for a country, rather than one survey at a time, to better simulate the
scenario where no data is available for a country. At each of the 99 predicted values of $\ln(y)$, we
calculate the absolute difference between true log welfare and predicted log welfare, and
summarize the error of the survey, $\mathrm{Loss}_{c,t}$, as the mean absolute deviation:

$$\mathrm{Loss}_{c,t} = \frac{1}{99} \sum_{p \in P} \left| \ln\left(y_{p,c,t}\right) - \widehat{\ln}\left(y_{p,c,t}\right) \right|$$

Figure 2 shows a graphical example of this. The blue curve represents the true distribution of
Lesotho in 2017 ($\ln(y)$), the black curve is the result of our GDP-based prediction from equation
(4) ($\widehat{\ln}(y)$), and the loss is equal to the average width of the yellow lines.

                                        Figure 2: Illustration of loss function




              Note: Graphical example of the distance between the survey-based welfare
              distribution and the predicted welfare distribution on a log scale. The yellow
              bars indicate the absolute deviation between the two distributions at 99
              percentiles.

In the main specification, we use the mean absolute deviation rather than the mean squared error
since we are interested in minimizing the deviations between the true and predicted log welfare
while giving equal weight to all deviations. Using the mean squared error would give a larger
weight to large deviations, which often occur at very low or high percentiles. The welfare values
in the tails are most susceptible to measurement error.
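
The difference between the two loss functions can be seen in a small invented example. Both predictions below have the same mean absolute deviation in logs, but the one whose error is concentrated in a single tail is penalized more heavily under the mean squared error. The numbers and function names are illustrative only, and three percentiles stand in for the 99 used in the paper.

```python
import math

def mean_abs_log_deviation(actual, predicted):
    """Average of |ln(y) - ln(y_hat)| across percentiles (the per-survey loss)."""
    return sum(abs(math.log(a) - math.log(b)) for a, b in zip(actual, predicted)) / len(actual)

def mean_sq_log_deviation(actual, predicted):
    """Squared-error alternative, which weights large deviations more heavily."""
    return sum((math.log(a) - math.log(b)) ** 2 for a, b in zip(actual, predicted)) / len(actual)

actual = [1.0, 4.0, 30.0]
pred_even = [y * math.exp(0.2) for y in actual]   # off by 0.2 log points everywhere
pred_tail = [1.0, 4.0, 30.0 * math.exp(0.6)]      # off by 0.6 log points in one tail only

mad_even = mean_abs_log_deviation(actual, pred_even)
mad_tail = mean_abs_log_deviation(actual, pred_tail)
mse_even = mean_sq_log_deviation(actual, pred_even)
mse_tail = mean_sq_log_deviation(actual, pred_tail)
```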



We calculate the mean absolute deviation for every survey distribution available for a country,
average these losses, and then aggregate over all countries as follows:

$$\mathrm{Loss} = \frac{1}{N_C} \sum_{c \in C} \left( \frac{1}{N_{c,T}} \sum_{t \in T} \mathrm{Loss}_{c,t} \right),$$

where $N_{c,T}$ is the number of surveys for a country with data and $N_C$ is the number of countries with
surveys. Across countries, the frequency of surveys varies drastically and systematically with
income. For example, the average low-income country has 4 surveys between 1991 and 2020,
compared with 20 surveys in the average high-income country. To ensure that we do not select
models that only work well for countries with many surveys, our aggregation formula weights each
survey by the inverse of the number of surveys in its country, such that the total weight for each
country equals one. We multiply the final loss by 100 such that the loss approximately equals the
average error in predicting welfare in percent.
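
The effect of this weighting can be illustrated with a short sketch that averages survey-level losses within each country before averaging across countries. The losses and the function name are invented for illustration.

```python
from collections import defaultdict

def aggregate_loss(survey_losses):
    """survey_losses: list of (country, loss) pairs, one per survey.

    Average within each country first, then across countries, so every country
    has a total weight of one regardless of how many surveys it has. The factor
    100 expresses the log-point loss as an approximate percent.
    """
    by_country = defaultdict(list)
    for country, loss in survey_losses:
        by_country[country].append(loss)
    country_means = [sum(v) / len(v) for v in by_country.values()]
    return 100 * sum(country_means) / len(country_means)

# Invented losses: country A has four surveys, country B has one.
losses = [("A", 0.10), ("A", 0.10), ("A", 0.10), ("A", 0.10), ("B", 0.50)]
total = aggregate_loss(losses)                             # country-weighted
pooled = 100 * sum(l for _, l in losses) / len(losses)     # survey-weighted, for contrast
```

In this toy case the country-weighted loss is 30 while the pooled loss is 18, showing how pooling would understate the error for the country with few surveys.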

Our loss function and use of a linear model to predict $\ln(y)$ lead us to estimate equation (3) using
a quantile regression. Quantile regressions estimate the parameters while minimizing the
absolute size of the error, which is consistent with our loss function. As a robustness check, we
also use OLS, which minimizes the sum of squared errors, and hence is consistent with a mean
squared error loss function (see section 5.1).
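
The correspondence between estimator and loss function can be checked in the simplest possible case, a model with only a constant. A grid search (a deliberately naive method chosen for transparency, not the estimation routine used in the paper) shows that the constant minimizing the sum of absolute errors is the median, while the constant minimizing the sum of squared errors is the mean. The data are invented.

```python
from statistics import mean, median

values = [1.0, 2.0, 3.0, 4.0, 100.0]  # invented data with one outlier

def sum_abs(c):
    """Sum of absolute errors when predicting every value with the constant c."""
    return sum(abs(v - c) for v in values)

def sum_sq(c):
    """Sum of squared errors when predicting every value with the constant c."""
    return sum((v - c) ** 2 for v in values)

grid = [k / 10 for k in range(0, 1001)]  # candidate constants 0.0, 0.1, ..., 100.0
best_abs = min(grid, key=sum_abs)        # intercept-only median regression
best_sq = min(grid, key=sum_sq)          # intercept-only OLS
```

The outlier pulls the squared-error solution far above the bulk of the data, which is the same mechanism that gives tail percentiles more influence under OLS.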

3.3 Model selection

The selection of our preferred model is not only guided by the loss function, but also by three
additional principles, which we define below: simplicity, useability, and coherence.

With respect to simplicity, we prefer models that can be applied easily, which is in part why we
use a linear model. One challenge with a linear model is that missing values in the covariates
cannot be handled easily. A potential solution is to impute the missing values, but this would make
the model intractable. To keep the framework simple, we will only use one covariate with missing
values, and otherwise restrict the model to covariates with at most 1% missing values. This results
in two tiers of models, one where the covariate with more than 1% missing values is available and
one where it is not. We allow for 1% missing values in other covariates because some of the
databases we use exclude certain economies for political reasons (notably Taiwan, China; and
Kosovo), yet data are often available for these excluded economies from country-specific sources.
So in practice, the second tier model can be applied everywhere. To select the primary variable
of interest, we will compare the error from regressions of the type shown in equation (4) but
replacing GDP per capita sequentially with all covariates in our dataset.

By useability, we mean that the model can be applied and performs relatively well even in the
most data deprived contexts. The tier 1 covariate is selected based on a balance of predictive
accuracy and non-missingness. By construction, our tier 2 model will only use covariates available
for virtually all countries, and can hence be applied everywhere. Even if the model can be applied
everywhere, it is possible that the relationship between welfare and covariates is fundamentally
different for data deprived countries. Martinez (2022) for example finds that GDP growth is less
reliable for authoritarian countries, which also tend to be the ones that do not produce or publish
survey data. To ensure that our model works well for data deprived countries, we check if our
model selection and performance are robust to using only the countries with at most three poverty
estimates, which reflects the 25% most data deprived countries for which we have some data.
These countries tend to be more authoritarian and conflict-ridden but also include a couple of
wealthy countries which simply do not share much data (Table A.1).

With respect to coherence, we mean that we are willing to tolerate marginal increases in errors if
doing so results in a model that is easier to rationalize. This is in part why we are restricting
ourselves to predictions consistent with the log-logistic distribution, but it also matters for the
covariates we will select. For example, if we find that a model using GDP per capita in 2011 PPPs
performs slightly better than a model using GDP per capita in 2017 PPPs, we will favor the latter
because the welfare distributions are expressed in 2017 PPPs, which makes for a more consistent
model.

The use of these three principles in addition to the error of the model means that, at times, value
judgments are needed to assess when an increase in the error may be merited by a greater
adherence to the three principles. This complicates the model selection somewhat in contrast to
a sole reliance on the model error. As a robustness check, we use machine learning to select a
model guided exclusively by model performance as a test of how much accuracy we give up by
including these three additional principles (section 5.2).

4 Results
Our first objective is to select the main variable to use for the tier 1 model. To that end, we run
separate quantile regressions following equation (4) using one candidate covariate at a time,
replacing \(\ln(GDP)\) in the equation. We evaluate the fit in-sample for reasons of efficiency, since
evaluating all models out-of-sample is too computationally intensive. Given that all models are
identical and simple, overfitting is unlikely to be an issue. The left panel of Figure 3 plots the error
from these regressions against the availability of each indicator from 1991-2020 for the 218
economies considered by the World Bank. We exclude variables with less than 50% availability
given that they have too much missingness to be useful for our application.

We are primarily interested in variables for which no other variable has a lower error and lower
missingness. When using the full sample (panel a), this concerns national accounts variables and
under-five mortality. The national accounts variables on the frontier are GDP, Household Final
Consumption Expenditure (HFCE), and Gross National Income (GNI) (the national accounts
variables are expressed in logs and per capita terms throughout). We use versions of these
variables expressed in 2017 PPP-adjusted USD even though at times non-PPP adjusted versions
perform slightly better (see the unlabeled orange dots in Figure 3). These variables have the
advantage of being highly correlated with welfare but the drawback of not always being available.
For example, GNI per capita is available only for 58% of the country-year observations. Hence,
national accounts data can only offer a partial solution to predict the distribution of welfare. By
contrast, under-five mortality is universally available due to modeling done by the UN Population
Division.
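
The frontier criterion described above (no other variable has both a lower error and lower missingness) can be sketched as a small dominance check. This is only an illustration: the function name `frontier` and the toy numbers are assumptions, not values read from Figure 3.

```python
def frontier(candidates):
    """Return names of candidates not dominated on (error, missingness).

    A candidate is dominated if some other candidate has both a lower
    error and a lower share of missing country-years.
    """
    keep = []
    for name, err, miss in candidates:
        dominated = any(
            e < err and m < miss
            for other, e, m in candidates
            if other != name
        )
        if not dominated:
            keep.append(name)
    return keep


# Hypothetical (name, error, share missing) triples, for illustration only.
toy = [
    ("gdp", 33.0, 0.10),
    ("gni", 32.0, 0.42),
    ("u5m", 38.0, 0.00),
    ("lights", 45.0, 0.15),
]
on_frontier = frontier(toy)
print(on_frontier)
```

In this toy setup the nighttime lights entry drops out because the GDP entry is both more accurate and more widely available, while a fully available but less accurate variable stays on the frontier.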



                       Figure 3: Covariates' predictability of welfare and availability
                  (a) Full sample                                          (b) Data deprived countries




    Note: Each dot in the figure presents output from a separate regression using a particular covariate in the
    regression. The vertical axis shows the error of the regression, and the horizontal axis shows the availability of
    the covariate across countries. The labelled national accounts variables are in per capita 2017-PPP terms.
    Panel (a) uses the full sample while panel (b) uses countries with at most three surveys. Panel (b) evaluates
    missingness by looking at the country-years without a poverty estimate but with data on the covariate. Variables
    in the bottom-right have high accuracy and high availability. An error of, say, 30 means that the model predicts
    log welfare off by 0.3 on average, which is approximately equal to 30% (though for large errors, the
    approximation is less accurate).
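
To make the error scale in the note concrete, the sketch below computes an error score as 100 times the mean absolute difference in log welfare, and converts a log-point score into the exact percentage gap it implies. The metric definition is an assumption based on the note's wording, and the function names and toy values are hypothetical.

```python
import math


def error_score(actual, predicted):
    """100 x mean absolute difference in log welfare (assumed metric)."""
    diffs = [abs(math.log(a) - math.log(p)) for a, p in zip(actual, predicted)]
    return 100 * sum(diffs) / len(diffs)


def log_gap_to_percent(score):
    """Exact percentage gap implied by an error score in log points."""
    return 100 * (math.exp(score / 100) - 1)


# Hypothetical daily welfare values and predictions, for illustration only.
toy_actual = [2.0, 5.0, 10.0]
toy_predicted = [2.5, 4.0, 10.0]
print(round(error_score(toy_actual, toy_predicted), 1))

# A score of 30 corresponds to a gap of about 35%, not exactly 30%.
print(round(log_gap_to_percent(30), 1))
```

The second print shows why the note cautions that the approximation worsens for large errors: the gap between log points and percentages grows with the size of the error.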

In panel (b) we perform the same analysis on the data-deprived sample. Concretely, we look at
the performance among countries with at most three welfare distributions and evaluate
missingness based on the country-years without data. We do this to get a sense of how the
variables could perform in situations where the modeling efforts are most likely to be applied,
namely in settings where actual welfare data are unavailable. National accounts data perform
relatively worse in this subsample: HFCE and GNI are not even shown in the plot as they are not
available in more than half of the cases, and GDP per capita is now outperformed by sanitation
and electricity access. This could happen because the country-years for which the latter two
variables are available are easier to predict than the country-years for which GDP per capita is
available. For example, electricity access is essentially only available post-2000, which could give
rise to its relatively good performance if there is more noise in early welfare aggregates. Even in
the data deprived case, remote sensing variables including nighttime lights do not offer the most
accuracy. 2



2 From this evidence it is not possible to draw conclusions about the accuracy of remote sensing data in
other applications such as for small area or subnational estimation, where many of the variables we
consider here are unavailable.

The greater accuracy of variables without complete availability lends itself to using the two tiers
of models discussed previously: Tier 1, the model with a variable without complete availability,
and Tier 2, the model relying exclusively on variables with complete availability. We test the
performance of possible Tier 1 variables in Figure 4. To evaluate possible Tier 1 variables, we
delete all observations with missing values in any candidate variable, as this gives a fair
evaluation. As candidate variables we use the ones on the frontier of either panel of Figure 3:
GNI, HFCE, GDP, electricity access, access to sanitation, and under-five mortality. We add
nighttime lights for reference. Including HFCE and GNI reduces the common sample that can be
used on all models, since these variables have many missing observations. Hence, we consider
a version that includes these variables (panel a) and a version without them (panel b).
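
The common-sample restriction can be sketched as a complete-case filter. The records below are hypothetical and only show why including HFCE and GNI shrinks the sample; `common_sample` is an illustrative name.

```python
def common_sample(rows, candidates):
    """Keep only observations with a value for every candidate variable."""
    return [r for r in rows if all(r.get(c) is not None for c in candidates)]


# Hypothetical country-year records; None marks a missing value.
rows = [
    {"gdp": 8.1, "gni": 8.0, "hfce": 7.6, "u5m": 3.2},
    {"gdp": 7.4, "gni": None, "hfce": None, "u5m": 4.1},
    {"gdp": 9.0, "gni": 9.1, "hfce": None, "u5m": 2.0},
    {"gdp": None, "gni": None, "hfce": None, "u5m": 4.5},
]
panel_a = common_sample(rows, ["gdp", "gni", "hfce", "u5m"])  # with HFCE and GNI
panel_b = common_sample(rows, ["gdp", "u5m"])                 # without them
print(len(panel_a), len(panel_b))
```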

                                      Figure 4: Candidate tier 1 variables
                  (a) With HFCE and GNI                                        (b) Without HFCE and GNI




    Note: The figure plots the prediction error (vertical axis) of various predictors (horizontal axis). Observations:
    in panel (a) N=132,163; in panel (b) N=159,385.

GNI performs best in the full sample, marginally outperforming GDP, while HFCE performs best
in the data deprived sample. HFCE and GNI are however only available in less than half of
country-years without poverty data, so a model with GDP has a broader use. For that reason, we
select GDP over GNI or HFCE. If we remove those two variables (panel b), GDP clearly
outperforms other contenders in the full sample, but only marginally in the data deprived sample.
Notably using sanitation access or electricity access gives essentially the same performance.
However, these two variables are not well suited for predicting poverty in non-poor countries,
since access is universal. Therefore, once again, there are reasons to prefer GDP as the tier 1
variable also in panel (b).

Next, we test if adding more variables to the tier 1 model would reduce the error, and which
variables we should use for the tier 2 model. To that end, we run a quantile regression lasso
predicting \(\ln(y)\) using equation (3) (Sherwood & Maidman 2017). As possible covariates for tier
2, we use all the variables we gathered with at least 99% availability, all of these variables
interacted with \(\ln\left(\frac{p}{1-p}\right)\) (such that they also predict the distribution, not just the mean), and all of
these two sets of variables interacted with whether income or consumption is used to measure
welfare. For tier 1 we also add the interaction of log GDP per capita with \(\ln\left(\frac{p}{1-p}\right)\) and the welfare
type. For both tiers, only three types of variables are selected by the lasso: mortality rates (infant, under-5,
under-40, and under-60), life expectancy (overall and at 80), and the rural population share.
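
The candidate set fed to the lasso can be pictured as an expansion of each base covariate into a level term, a distributional term, and welfare-type interactions. The sketch below is a simplified illustration under that reading; the function and key names are hypothetical, and including \(\ln\left(\frac{p}{1-p}\right)\) on its own as a main effect is an assumption.

```python
import math


def logit(p):
    """Log-odds of percentile p, the term that drives the spread of welfare."""
    return math.log(p / (1 - p))


def build_features(covariates, p, is_income):
    """Expand base covariates into level, distributional, and welfare-type terms."""
    feats = dict(covariates)                      # level terms (shift the whole distribution)
    lp = logit(p)
    feats["logit_p"] = lp                         # assumed main effect of the percentile
    for name, value in covariates.items():
        feats[f"{name}:logit_p"] = value * lp     # lets a covariate change inequality
    for name, value in list(feats.items()):
        feats[f"{name}:income"] = value * is_income   # income vs consumption shift
    return feats


# Hypothetical covariate values for one country-year at the 90th percentile.
features = build_features({"ln_u5m": 3.85, "lifeexp": 65.0}, p=0.9, is_income=1)
print(len(features))
```

With k base covariates this produces 4k + 2 candidate terms, which is why a selection step such as the lasso is needed before fitting a final model.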

In principle, we could use the lasso alone to select the final models, but upon further inspection,
we find that the lasso does not select group variables (income groups and regions) in an ideal
way, results in a relatively unstable model selection, and does not deal with income/consumption
dummies efficiently. 3 For those reasons, we use a stepwise model selection using as inputs the
subset of variables that the lasso identified, to which we add income groups to tier 2 as a proxy
of GDP per capita. Figure 5 shows the order in which variables enter and the out-of-sample error
with each additional variable included. For tier 1, the variables chosen in order are GDP/capita,
under-60 mortality, under-5 mortality, and life expectancy. For tier 2, the variables chosen in order
are under-5 mortality, income groups, rural population share, and life expectancy. After these
variables are included, no additional variable lowers the out-of-sample error notably.
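
The stepwise procedure can be sketched as a greedy forward search that stops once no remaining variable lowers the out-of-sample error notably. The code below is a generic illustration, not the authors' implementation: `oos_error` stands in for whatever cross-validated error routine is used, `min_gain` is a hypothetical threshold for "notably", and the toy error function simply subtracts fixed gains.

```python
def forward_stepwise(candidates, oos_error, min_gain=0.5):
    """Greedy forward selection; returns [(variable, error after adding it), ...]."""
    selected, path = [], []
    remaining = list(candidates)
    best = oos_error(selected)
    while remaining:
        # Try each remaining variable on top of the current model.
        trial = {v: oos_error(selected + [v]) for v in remaining}
        winner = min(trial, key=trial.get)
        if best - trial[winner] < min_gain:
            break                                  # no notable improvement left
        selected.append(winner)
        remaining.remove(winner)
        best = trial[winner]
        path.append((winner, best))
    return path


# Toy error function with made-up additive gains (real gains depend on
# which variables are already in the model).
GAINS = {"gdp": 16.3, "u60m": 3.4, "u5m": 1.0, "lifeexp": 0.8, "noise": 0.1}


def toy_error(variables):
    return 50.0 - sum(GAINS[v] for v in variables)


path = forward_stepwise(GAINS, toy_error)
print([name for name, _ in path])
```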

                                       Figure 5: Error as variables enter the model
                              (a) Tier 1                                                 (b) Tier 2




      Note: The figure plots the order in which variables enter the two tiers and the errors as each variable is
      added. For example, GDP per capita is the first variable to enter in tier 1, and when it is the only variable in
      the model, the error is 33.7. The next variable to enter is under-60 mortality, which, when added to the model
      that includes GDP per capita, lowers the error to 30.3.




3 We find that very small variations in the penalty parameter result in either all income groups, regions, and their
interaction with welfare type being included or none.

The lasso and stepwise regression did not include interactions. We next test if adding interactions
with the income/consumption dummy, or between the variables and \(\ln\left(\frac{p}{1-p}\right)\) (i.e. letting the
variables influence the Gini) could lower the error. As we show in the next section, regional
dummies are most predictive of the Gini index among all 1,000+ variables. Therefore, we also
test whether accuracy improves when we allow for variation in the Gini index by region. We try
both Gini dummies for all regions and for the three regions that show notable differences in their
Gini indices from the rest of the world. These regions are Latin America & the Caribbean and Sub-
Saharan Africa which are regions with high inequality (Haddad et al. 2024), and Europe & Central
Asia, which tends to have lower inequality. Finally, we test if the Gini should be a second-order
polynomial of log GDP per capita, as the Kuznets curve would suggest (Kuznets 1955).

The error of both tiers would be reduced by letting the Gini differ by the three regions mentioned
above (by 1.4 for tier 1 and 1.2 for tier 2). For tier 1, there is additional evidence in favor of adding
a dummy for welfare type (income or consumption) and interacting this with GDP per capita. This
means that as GDP grows, it has differential impacts on income and consumption vectors, which
is in line with existing evidence (Lakner et al. 2022, Mahler et al. 2022, Prydz et al. 2022). This
reduces the error of tier 1 further to 26.7. On the other hand, for tier 2, there is no similar case for
adding a welfare type dummy or interacting the income group classifications with welfare type.
Finally, we replace under-60 mortality with the rural population share in the tier 1 model, which
increases the error very marginally (from 26.67 to 26.69). This improves the consistency between
the two models, and hence reduces revisions to the predictions if a country moves from one tier
to another. In sum, we end up with two almost parallel models, where GDP per capita is proxied
by income group in tier 2. The final out-of-sample errors of the two tiers are 26.7 and 30.1. The
regression outputs from the final models are shown in Table 1 (for the full sample).

                                              Table 1: Regression output
      Outcome variable: Log welfare (ln(y))                      Tier 1                  Tier 2
      Intercept                                             -1.975***  (0.052)      1.837***  (0.047)
      Log GDP per capita (2017 PPP)                          0.393***  (0.004)
      Log under-5 mortality (per 1,000 live births)         -0.185***  (0.003)     -0.307***  (0.004)
      Life expectancy (years)                                0.016***  (0.000)      0.017***  (0.000)
      Rural population share (0-100)                        -0.003***  (0.000)     -0.008***  (0.000)
      Income group
              Low income                                                            Base
              Lower-middle income                                                   0.229***  (0.006)
              Upper-middle income                                                   0.450***  (0.008)
              High income                                                           1.102***  (0.011)
      Welfare type (income = 1, consumption = 0)            -3.671***  (0.030)
      Welfare type * log GDP per capita                      0.392***  (0.003)
      ln(p/(1 - p))                                          0.354***  (0.002)      0.350***  (0.002)
      ln(p/(1 - p)) * I[Europe & Central Asia]              -0.045***  (0.002)     -0.034***  (0.003)
      ln(p/(1 - p)) * I[Latin America & Caribbean]           0.159***  (0.002)      0.166***  (0.003)
      ln(p/(1 - p)) * I[Sub-Saharan Africa]                  0.060***  (0.002)      0.069***  (0.003)
      Observations                                           194,626                194,824
      Pseudo R2                                              0.7512                 0.719
      Note: * p<0.05, ** p<0.01, *** p<0.001. Robust standard errors in parentheses.
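
One way to read the last rows of Table 1 is through a standard property of the log-logistic distribution: if \(\ln(y)\) is linear in \(\ln\left(\frac{p}{1-p}\right)\) with slope \(b < 1\), the Gini index equals \(b\). Under that reading, the sketch below lists the Gini implied by the tier 1 coefficients for each region and checks the property numerically. The function names are illustrative, and the numerical check is a rough midpoint-rule approximation.

```python
# Coefficients on ln(p/(1-p)) from the tier 1 column of Table 1.
BASE = 0.354
REGION_SHIFT = {
    "Europe & Central Asia": -0.045,
    "Latin America & Caribbean": 0.159,
    "Sub-Saharan Africa": 0.060,
}


def implied_gini(region=None):
    """Slope on ln(p/(1-p)), read as the Gini of a log-logistic distribution."""
    return BASE + REGION_SHIFT.get(region, 0.0)


def numeric_gini(slope, n=100_000):
    """Midpoint-rule check of G = 2*int(p*Q(p))/int(Q(p)) - 1, Q(p) = (p/(1-p))**slope."""
    total = weighted = 0.0
    for i in range(n):
        p = (i + 0.5) / n
        q = (p / (1 - p)) ** slope
        total += q
        weighted += p * q
    return 2 * weighted / total - 1


for region in [None, *REGION_SHIFT]:
    print(region or "Other regions", round(implied_gini(region), 3))
```

Read this way, the regional interactions shift the predicted Gini up for Latin America & the Caribbean and Sub-Saharan Africa and down for Europe & Central Asia, relative to the remaining regions.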


We use two examples to illustrate how these regressions can predict welfare and poverty rates.
The first example is for predicting a consumption distribution in a Tier 1 country in Sub-Saharan
Africa (i.e., GDP is available). In this case, the predicted log consumption equals

\[
\ln(y) = -1.975 + 0.393\,\ln(\mathit{GDPpc}) - 0.185\,\ln(\mathit{u5m}) - 0.003\,\mathit{ruralshare} + 0.016\,\mathit{lifeexp} + (0.354 + 0.060)\,\ln\left(\frac{p}{1-p}\right)
\]

Isolating \(p\), which can be interpreted as the poverty rate associated with the poverty line \(y\), yields

\[
p(y) = \left(1 + \left(\frac{\exp\left(-1.975 + 0.393\,\ln(\mathit{GDPpc}) - 0.185\,\ln(\mathit{u5m}) - 0.003\,\mathit{ruralshare} + 0.016\,\mathit{lifeexp}\right)}{y}\right)^{\frac{1}{0.354+0.060}}\right)^{-1}
\]

Using data for Ethiopia in 2021, GDP per capita is $2,319, under-5 mortality is 47, the rural share
is 78 percent, and life expectancy is 65 years. At the international poverty line of $2.15, this yields
\(p(y) = 0.276\), so a poverty rate of 27.6%.
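
The Ethiopia calculation can be reproduced with a few lines of Python. This is a minimal sketch using the tier 1 coefficients from Table 1 for a consumption distribution (so the welfare-type terms drop out); the function and argument names are illustrative.

```python
import math


def tier1_poverty_rate(line, gdp_pc, u5m, rural_share, life_exp, gini_slope):
    """Consumption-based tier 1 headcount at a poverty line (Table 1 coefficients)."""
    linear = (-1.975 + 0.393 * math.log(gdp_pc) - 0.185 * math.log(u5m)
              - 0.003 * rural_share + 0.016 * life_exp)
    # Invert ln(y) = linear + gini_slope * ln(p / (1 - p)) for p.
    return 1 / (1 + (math.exp(linear) / line) ** (1 / gini_slope))


ethiopia = tier1_poverty_rate(
    line=2.15, gdp_pc=2319, u5m=47, rural_share=78, life_exp=65,
    gini_slope=0.354 + 0.060,   # base slope plus the Sub-Saharan Africa shift
)
print(round(ethiopia, 3))
```

The result should land close to the 27.6% reported in the text; small differences can arise because the published coefficients are rounded to three decimals.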

The second example predicts the income distribution of an upper-middle-income country in East
Asia & Pacific, which lacks GDP data (i.e., using the Tier 2 model):

\[
\ln(y) = 1.837 + 0.450 - 0.307\,\ln(\mathit{u5m}) + 0.017\,\mathit{lifeexp} - 0.008\,\mathit{ruralshare} + 0.350\,\ln\left(\frac{p}{1-p}\right)
\]

Again, isolating \(p\) gives

\[
p(y) = \left(1 + \left(\frac{\exp\left(1.837 + 0.450 - 0.307\,\ln(\mathit{u5m}) + 0.017\,\mathit{lifeexp} - 0.008\,\mathit{ruralshare}\right)}{y}\right)^{\frac{1}{0.350}}\right)^{-1}
\]

Using data for Thailand (an upper-middle-income country in East Asia) in 2021, under-5 mortality
is 8, life expectancy is 79 years, and the rural population share is 48 percent. Setting \(y = 6.85\),
which is the typical poverty line for upper-middle-income countries, results in \(p(y) = 0.124\), so a
poverty rate of 12.4%.
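
Because the regression describes the whole quantile function, the same coefficients give both the welfare level at any percentile and the headcount at any line. The sketch below illustrates this for the Thailand inputs with the tier 2 coefficients from Table 1; the function names are hypothetical.

```python
import math


def tier2_linear(u5m, life_exp, rural_share, income_group_effect):
    """Tier 2 linear predictor, i.e. the predicted log median welfare."""
    return (1.837 + income_group_effect - 0.307 * math.log(u5m)
            + 0.017 * life_exp - 0.008 * rural_share)


def quantile(p, linear, slope=0.350):
    """Welfare level at percentile p (inverse CDF of the log-logistic)."""
    return math.exp(linear + slope * math.log(p / (1 - p)))


def headcount(line, linear, slope=0.350):
    """Share of the population below the line (CDF of the log-logistic)."""
    return 1 / (1 + (math.exp(linear) / line) ** (1 / slope))


# Upper-middle-income effect is 0.450; no regional shift applies in East Asia & Pacific.
thailand = tier2_linear(u5m=8, life_exp=79, rural_share=48, income_group_effect=0.450)
print(round(headcount(6.85, thailand), 3))       # headcount at the $6.85 line
print(round(quantile(0.5, thailand), 2))         # predicted median daily welfare
```

Evaluating `quantile` on a grid of percentiles yields the full predicted distribution, from which other statistics can be derived.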

5 Robustness checks
This section conducts three robustness checks of our model selection. First, we predict welfare
distributions using other distributional approaches. Second, we predict distributions flexibly using
machine learning on all the indicators we gathered. This helps assess how much we are giving
up by insisting on a simple model. Third, we look at how well the preferred model performs in
relevant subsamples to explore if there are cases where it may be less appropriate to use.

5.1      Accuracy using other distributional approaches
We try three other ways of predicting full distributions to assess the robustness of our main
approach. In all cases, we use the same covariates as in the final two-tier model but change how
they translate into full distributions. First, instead of predicting the full distributions directly using
the inverse CDF of the log-logistic distribution, we first predict the log of the median and the Gini,
and then infer the full distribution from the inverse CDF using the predicted median and Gini.
Second, we run OLS regressions rather than quantile regressions, and hence estimate the
parameters using the MSE rather than the MAD. Third, rather than assuming the distributions are
log-logistic, we assume that they are log-normal. Given that the quantile function for the
log-normal distribution results in \(\ln(y) = \ln(\mu) + \sqrt{2\sigma^2}\,\mathrm{erf}^{-1}(2p - 1)\), we can essentially run the
same regressions as we have run so far, but instead of interacting covariates with \(\ln\left(\frac{p}{1-p}\right)\), we
interact them with \(\mathrm{erf}^{-1}(2p - 1)\). We find that in all three cases, the error increases (modestly)
compared to our preferred approach (Figure A.1).
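
The two distributional terms can be compared side by side. The sketch below computes \(\mathrm{erf}^{-1}(2p - 1)\) from the standard normal quantile function in Python's standard library, using the identity \(\Phi^{-1}(p) = \sqrt{2}\,\mathrm{erf}^{-1}(2p - 1)\); the helper names are illustrative.

```python
import math
from statistics import NormalDist


def erfinv(x):
    """Inverse error function via the standard normal quantile function."""
    return NormalDist().inv_cdf((x + 1) / 2) / math.sqrt(2)


def lognormal_term(p):
    """Percentile term used under the log-normal assumption."""
    return erfinv(2 * p - 1)


def loglogistic_term(p):
    """Percentile term used under the log-logistic assumption."""
    return math.log(p / (1 - p))


for p in (0.05, 0.25, 0.5, 0.75, 0.95):
    print(p, round(loglogistic_term(p), 3), round(lognormal_term(p), 3))
```

Both terms are zero at the median and antisymmetric around it, so swapping one for the other changes only how the tails of the predicted distribution behave.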

5.2    Accuracy using machine learning on all gathered indicators
The errors of our two models, 26.7 and 30.1, are substantial on an absolute scale, as they imply
that welfare is predicted off by around 30% on average. We test whether the error is high because
our model ultimately is too simple to perform well, because of the distributional assumption we
impose, or because of irreducible error related to the uncertainty involved with constructing
welfare aggregates and converting them to a common currency.

First, to estimate the limitations imposed by the simple model, we predict the median and Gini
using a conditional inference random forest (Hothorn et al. 2006) using all the 1,000+ covariates
we gathered. This helps shed light on whether the functional form imposed or the set of
covariates considered causes our relatively large error. When doing so, we get an
out-of-sample error of 27.9 on the full sample and 36.5 on the data-deprived sample. This is higher
than the errors from the tier 1 model (26.7 and 31.1). Though it certainly is possible to improve
upon this error using even more covariates or other machine learning methods, this suggests that
the main reason for the error is not the simplicity of the model we used. The intuition for this result
is clear from the variable importance plots of the random forest: The main predictor of medians is
GDP per capita, while the most predictive variable of the Gini is the region (Figure A.2). Both
variables are already included in our preferred models.

Second, to assess the potential loss from the distributional assumption, we estimate a
hypothetical error if we predicted the median and Gini perfectly and then impose a log-logistic
distribution. This results in errors of 5.6 on the full sample and 6.6 on the data-deprived sample.
Hence of the final error of our model, around 20% can be ascribed to our distributional
assumption, while the remaining 80% are most likely due to irreducible error.
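
The 20% figure follows from dividing the error that remains under a perfectly predicted median and Gini by the total tier 1 error. The short sketch below reproduces that arithmetic; treating the ratio as a share assumes the error components are roughly additive, which is a simplification.

```python
def share_from_assumption(distribution_only_error, total_error):
    """Fraction of the total error attributable to the distributional assumption."""
    return distribution_only_error / total_error


full = share_from_assumption(5.6, 26.7)        # full sample
deprived = share_from_assumption(6.6, 31.1)    # data-deprived sample
print(round(100 * full), round(100 * deprived))
```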

Some additional statistics can also help bring clarity on the size of the error of the two models.
Firstly, the R2 from the regressions of our final models are 0.71 and 0.75, whereas the R2 of a
model without any predictors is 0.15 (i.e. a model which would assign approximately the
distribution of the median country to any country). Hence, our models explain most of the variance
across and within countries. In addition, it is important to note that countries differ greatly in their
welfare levels. The median welfare of the richest country in our sample is more than 100 times
that of the poorest country, and even the 75th percentile of medians is 5 times greater than the 25th
percentile. Hence, an error of around 30% would not lead to a substantial mis-ranking of a
particular country, given the large differences in living standards in the world at large.

5.3    Accuracy in subsamples
We next explore how our preferred models perform in various settings (Figure 6). If the model
performs relatively poorly in a setting it may either be because the chosen covariates are less
related to welfare distributions in this setting or because welfare distributions in general are noisy
and hard to predict in those settings. We try to distinguish between these two explanations by
also reporting the error of the corresponding machine learning model. If the error in a particular
subsample is higher both for our models and the machine learning model, then welfare in general
is hard to predict in this setting. By contrast, if the machine learning model gives low errors, while
the errors of our models are high, then the covariates we have selected are not as relevant for
this subsample.
                                  Figure 6: Performance by country groups




Note: The out-of-sample error for different groups of countries. FCV refers to whether the country-year is classified as
in fragility, conflict, or violence according to the World Bank's classification.

The errors are lower in data rich settings, for rich countries, and using more recent welfare
aggregates. In general, whenever the error is low, the random forest outperforms our models,
while in poor and data deprived settings, the random forest does worse. This suggests that when
there is high-quality information available, a simple model does not pick up all the relevant
information, and a more complex model could do better. By contrast, when the data are sparse
and of worse quality, there is less to be gained from complex models.

6 Application to global poverty measurement
We apply our model to all countries in the world from 1991-2020 to measure global poverty. We
use the international poverty line ($2.15/day, which is the median poverty line of low-income
countries), the median poverty line of upper-middle-income countries ($6.85), and the median
poverty line of high-income countries ($24.35) (Jolliffe et al. 2024). In a handful of cases where
data are still missing in the variables we use in the two tiers, we find alternative values. 4 We
benchmark the results against the official global poverty estimates by the World Bank, which rely
on survey data, but also interpolations and extrapolations using growth in national accounts. We
can in theory recover predictions for both income and consumption distributions for each country
using tier 1. To approximate what countries actually rely upon, we use income distributions for
countries in Latin America & the Caribbean and high-income countries. Results are shown in
Figure 7.

4 Under-5 mortality is measured by the UN Inter-agency Group for Child Mortality Estimation. It uses country-level
estimates from vital registration, surveys and censuses and fits a Bayesian B-spline bias-reduction model to create a
smooth trend through these estimates (Alkema & New 2014). All 218 economies in the World Bank's list of economies
have data on this indicator. Life expectancy is based on the World Population Prospects 2022, relying on similar sources
as under-5 mortality but using different methods to smooth and fill gaps (United Nations 2022). Life expectancy has no
missing values. Rural population shares are based on data from the 2018 World Urbanization Prospects (United
Nations 2018), which relies on national criteria of urban and rural populations, using extrapolations from past trends
when timely data are missing. The rural share is missing for Kosovo; Taiwan, China; and St. Martin (French part). For
St. Martin (French part), we assume the same rural population share as the Dutch part, while the data for Kosovo is
taken from Serbia, and for Taiwan, China, we use the share of China. Income groups are defined based on a country's
Gross National Income (GNI) per capita. Countries may have an income group classification even though they do not
publish GDP (or GNI) data. This happens when GNI estimates of sufficient quality exist to place a country in an income
category but not with sufficient precision to release the numerical estimate. The only economy among the World Bank's
218 economies without an income group classification (since 2015) is the República Bolivariana de Venezuela, which
is assumed to carry its last classification (upper-middle income) forward to 2021.

               Figure 7: Predicted and survey-based global and regional poverty trends

         Note: Predicted poverty rates using the two tiers and actual poverty rates from PIP.

The predictions at the extreme poverty line globally are a bit below the actual trend and the
reduction is a bit less strong, with the predicted $2.15 poverty rate falling from 26% in 1991 to 7%
in 2019, while PIP's poverty rate falls from 37% to 9% over the same period. At the $6.85 line,
the predicted rates are also below PIP, but the rate of progress is a bit faster, with the predicted
poverty rate going from 67% to 38%, while PIP's falls from 69% to 47%. At the high-income line,
the two are more aligned. Despite these differences, the two-tiered model is able to predict the
decline in global poverty rather well using a handful of readily available country-level variables.

The predicted poverty rates track rather well in all regions except for East Asia & Pacific and
South Asia, where the predicted rates are notably lower. This is in large part due to China and
India, which influence the regional and global results due to their size, not because the model
necessarily performs worse there. We will briefly discuss both countries, which illustrate issues
related to consumption surveys that might also be present elsewhere (Nicoletti et al. 2011).

China's extreme poverty rate is "only" predicted to have fallen from 24% to 0.6% from 1993-2020,
while PIP's survey-based estimates suggest it fell from 63% to 0.1% (Figure 8). This is in part
explained by China's poverty rates not being comparable before and after 2012. The
non-comparable (survey-based) poverty rate fell from 8.5% to 2.9% from 2012 to 2013, while the
predicted (comparable) poverty rate only fell from 1.5% to 1.3%. This suggests that some of the
discrepancies between the predicted and actual poverty rates might be due to the actual poverty
rates not being fully comparable within countries over time.
                    Figure 8: Predicted and survey-based country poverty trends




  Note: Predicted poverty rates using the two tiers and actual poverty rates from PIP. When estimates from PIP
  are not connected with a line, this indicates that the two estimates are not comparable to each other. Uses only
  the survey-year observations. The regional aggregates from PIP in Figure 7 use these survey-year observations
  but also interpolations and extrapolations at the country-level.

In India, many different extreme poverty rates exist depending on the measure of consumption
used and how one extrapolates from the most recent survey (Tarozzi 2007). At the time of this
analysis, PIP used a consumption aggregate based on a uniform recall period and modeled
poverty post 2011-12 based on estimates from Roy and Van der Weide (2025). In 2004 the
predicted poverty rate is 23% while the one from PIP based on the uniform recall period is 40%.
If the modified mixed recall period was used for 2004-05 instead, PIP would have a poverty rate
of 28%, hence much more in line with the predictions from the model. Again, this suggests that
some of the differences between the predicted and survey-based poverty rates can be explained by
particularities of how poverty is measured in a country.

Figure 8 shows the predicted and survey-based poverty rates for other populous, or poor and
data deprived countries. In most cases, the two are fairly well aligned, though there are
exceptions, such as the cases of China and India just discussed as well as the República
Bolivariana de Venezuela, where the predicted poverty rates are much lower than the
survey-based ones. Figures A.3-A.11 in the annex show similar plots for 218 economies.

PIP currently assigns countries without any poverty estimate (or without national accounts data
to extrapolate/interpolate from an old poverty estimate) the regional poverty rate. This concerns
about 3% of the global population. In Appendix B we show how PIP's regional and global poverty
rates would change if we used the estimates from our two tiers instead of applying the regional
average, as PIP currently does. Global poverty is projected to be revised upwards in 2019 by 11
million at the $2.15 line and 16 million at the $6.85 line. For certain countries the implications are
larger, notably Afghanistan, Somalia, and the Democratic People's Republic of Korea.

7 Conclusion
This paper proposes a method to estimate welfare distributions in contexts where little data is
available. Data deprivation is often a bigger concern in poorer and more fragile countries where
data collection might be limited due to active conflict, lack of resources, or institutional fragility,
but where efforts to monitor changes in living conditions are essential. While comprehensive
surveys of household income and consumption remain the best way to measure household
welfare, we offer a simple alternative to estimate welfare distributions when surveys are not
available.

The method consists of leveraging income or consumption distributions from more than 2,000
household surveys available in the World Bank's Poverty and Inequality Platform (PIP) for 168
countries covering the period between 1991 and 2000. We combine these distributions with more
than 1,000 predictors at the country-year level, including remote sensing variables, across various
databases. We develop a simple model which predicts distributions for all countries in the world
from a regression on country-level data. Guided by variable-selection techniques, we try various
versions of such regressions, each time sequentially excluding all surveys available for one of the
168 countries available in our dataset to predict the distribution for the excluded country using the
remaining 167 ones. We then calculate the absolute deviation between the true welfare and
predicted welfare at 99 points on each distribution. Once we have these deviations for all surveys
in the 168 countries, we calculate the mean absolute deviation across all surveys and all
countries.
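
The leave-one-country-out evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses a single hypothetical predictor, three percentile points instead of 99, and ordinary least squares as a stand-in for the regressions actually used. The function names and the toy data are invented for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the paper's regressions)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def loco_mad(surveys):
    """Leave-one-country-out mean absolute deviation in log welfare.

    surveys: list of dicts with keys 'country', 'x' (one predictor) and
    'log_welfare' (list of log welfare values, one per percentile point).
    """
    n_points = len(surveys[0]["log_welfare"])
    deviations = []
    for held_out in sorted({s["country"] for s in surveys}):
        # Drop every survey of the held-out country before fitting.
        train = [s for s in surveys if s["country"] != held_out]
        test = [s for s in surveys if s["country"] == held_out]
        for p in range(n_points):
            a, b = fit_line([s["x"] for s in train],
                            [s["log_welfare"][p] for s in train])
            for s in test:
                deviations.append(abs(s["log_welfare"][p] - (a + b * s["x"])))
    # Average over all held-out surveys and all percentile points.
    return mean(deviations)


def toy(country, x):
    """Hypothetical survey whose log welfare is exactly linear in x."""
    return {"country": country, "x": x,
            "log_welfare": [c + 0.8 * x for c in (-6.0, -5.5, -5.0)]}


surveys = [toy("A", 7.0), toy("A", 7.2), toy("B", 8.0),
           toy("B", 8.3), toy("C", 9.0), toy("C", 9.1)]
mad = loco_mad(surveys)
print(f"Leave-one-country-out MAD: {mad:.6f}")
```

Because all surveys of a country are removed together, the error reflects prediction for a country the model has never seen, which is the situation faced in data deprived countries.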

We find that when predicting distributions with GDP per capita (or income groups if GDP/capita
is unavailable), under-5-mortality, life expectancy, and rural population shares, no additional
feature significantly reduces the error further. Our preferred model predicts log welfare off by
around 30% on average. Though this may sound high, we show that a random forest using all
1,000+ predictors is unable to reduce the error, suggesting that much of the remaining error is
likely to be noise.

We demonstrate how to apply our model to predict poverty rates in a given country, and apply our
preferred model to global poverty measurement, comparing it to the World Bank's poverty
estimates. We show that the model in general tracks the official poverty estimates well, but with
notable exceptions. These exceptions may in part be explained by deficiencies in the survey
methodology of a particular country, and not only by shortcomings of our model.

References
Alkema, Leontine, and Jin Rou New. 2014. "Global Estimation of Child Mortality Using a Bayesian
        B-spline Bias-Reduction Model." The Annals of Applied Statistics: 2122-2149.
Angrist, Noam, Pinelopi Koujianou Goldberg, and Dean Jolliffe. 2021. "Why Is Growth in
        Developing Countries So Hard to Measure?" Journal of Economic Perspectives 35(3):
        215-42.
Bergstrom, Katy. 2022. "The Role of Income Inequality for Poverty Reduction." World Bank
        Economic Review 36(3): 583-604.
Bolt, Jutta, and Jan Luiten Van Zanden. 2024. "Maddison-Style Estimates of the Evolution of the
        World Economy: A New 2023 Update." Journal of Economic Surveys.
Bresson, Florent. 2009. "On the Estimation of Growth and Inequality Elasticities of Poverty with
        Grouped Data." Review of Income and Wealth 55(2): 266-302.
Castaneda Aguilar, R. Andres. 2022. "Pip: Stata Module to Access World Bank's Global Poverty
        and Inequality Data (Version 0.3.8)." STATA. https://worldbank.github.io/pip/.
Chen, Yi-Ting. 2018. "A Unified Approach to Estimating and Testing Income Distributions with
        Grouped Data." Journal of Business & Economic Statistics 36(3): 438-55.
Chotikapanich, Duangkamon, William E. Griffiths, D. S. Prasada Rao, and Vicar Valencia. 2012.
        "Global Income Distributions and Inequality, 1993 and 2000: Incorporating Country-Level
        Inequality Modeled with Beta Distributions." Review of Economics and Statistics 94(1):
        52-73.
Christiaensen, Luc, Peter Lanjouw, Jill Luoto, and David Stifel. 2012. "Small Area
        Estimation-Based Prediction Methods to Track Poverty: Validation and Applications."
        The Journal of Economic Inequality 10(2): 267-297.
Dang, Hai-Anh, Dean Jolliffe, and Calogero Carletto. 2019. "Data Gaps, Data Incomparability,
        and Data Imputation: A Review of Poverty Measurement Methods for Data-Scarce
        Environments." Journal of Economic Surveys 33(3): 757-797.
Datt, Gaurav, Valerie Kozel, and Martin Ravallion. 2003. "A Model-Based Assessment of India's
        Progress in Reducing Poverty in the 1990s." Economic and Political Weekly: 355-361.
Datt, Gaurav, and Martin Ravallion. 2002. "Is India's Economic Growth Leaving the Poor Behind?"
        Journal of Economic Perspectives 16(3): 89-108.
Deaton, Angus. 2005. "Measuring Poverty in a Growing World (or Measuring Growth in a Poor
        World)." The Review of Economics and Statistics 87(1): 1-19.
Deaton, Angus, and Paul Schreyer. 2022. "GDP, Wellbeing, and Health: Thoughts on the 2017
        Round of the International Comparison Program." Review of Income and Wealth 68(1):
        1-15.
Decerf, Benoit, and Mery Ferrando. 2022. "Unambiguous Trends Combining Absolute and
        Relative Income Poverty: New Results and Global Application." World Bank Economic
        Review 36(3): 605-628.
Eckernkemper, Tobias, and Bastian Gribisch. 2021. "Classical and Bayesian Inference for Income
        Distributions Using Grouped Data." Oxford Bulletin of Economics and Statistics 83(1):
        32-65.
Ekhator-Mobayode, Uche E., and Johannes Hoogeveen. 2022. "Microdata Collection and
        Openness in the Middle East and North Africa." Data & Policy 4: e31.
Elbers, Chris, Jean O. Lanjouw, and Peter Lanjouw. 2003. "Micro-Level Estimation of Poverty and
        Inequality." Econometrica 71(1): 355-364.
Engstrom, Ryan, Jonathan Hersh, and David Newhouse. 2022. "Poverty from Space: Using High
        Resolution Satellite Imagery for Estimating Economic Well-Being." World Bank Economic
        Review 36(2): 382-412.
Filmer, Deon, and Lant H. Pritchett. 2001. "Estimating Wealth Effects without Expenditure Data--or
        Tears: An Application to Educational Enrollments in States of India." Demography
        38(1): 115-132.
Gortan, Marco, Lorenzo Testa, Giorgio Fagiolo, and Francesco Lamperti. 2023. "A Unified
        Repository for Pre-Processed Climate Data Weighted by Gridded Economic Activity."
        Scientific Data 11: 533.
Haddad, Cameron Nadim, Daniel Gerszon Mahler, Carolina Diaz-Bonilla, Ruth Hill, Christoph
        Lakner, and Gabriel Lara Ibarra. 2024. "The World Bank's New Inequality Indicator: The
        Number of Countries with High Inequality." Policy Research Working Paper Series 10796.
Hajargasht, Gholamreza, William E. Griffiths, Joseph Brice, D. S. Prasada Rao, and Duangkamon
        Chotikapanich. 2012. "Inference for Income Distributions Using Grouped Data." Journal
        of Business & Economic Statistics 30(4): 563-575.
Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. "Unbiased recursive partitioning: A
        conditional inference framework." Journal of Computational and Graphical Statistics 15:
        651-674.
Jolliffe, Dean, and Espen Beer Prydz. 2021. "Societal Poverty: A Relative and Relevant Measure."
        World Bank Economic Review 35(1): 180-206.
Jolliffe, Dean Mitchell ⓡ Daniel Gerszon Mahler ⓡ Christoph Lakner ⓡ Aziz Atamanov ⓡ
        Samuel Kofi Tetteh Baah. 2024. "Assessing the Impact of the 2017 PPPs on the
        International Poverty Line and Global Poverty." World Bank Economic Review, lhae035.
Jorda, Vanesa, and Miguel Niño-Zarazúa. 2019. "Global Inequality: How Large Is the Effect of
        Top Incomes?" World Development 123: 104593.
Jorda, Vanesa, Miguel Niño-Zarazúa, Laurence Roope, and Finn Tarp. 2023. "Global income
        polarization: Relative and absolute perspectives." WIDER Working Paper 146.
Kanbur, Ravi, Eduardo Ortiz-Juarez, and Andy Sumner. 2022. "The global Inequality
        Boomerang." WIDER Working Paper 27.
Kraay, Aart, and Roy Van der Weide. 2022. "Measuring Intragenerational Mobility Using
        Aggregate Data." Journal of Economic Growth 27(2): 273-314.
Kraay, Aart ⓡ Christoph Lakner ⓡ Berk Ozler ⓡ Benoit Decerf ⓡ Dean Jolliffe ⓡ Olivier Sterck
        ⓡ Nishant Yonzan. 2023. "A New Distribution Sensitive Index for Measuring Welfare,
        Poverty, and Inequality." Policy Research Working Paper 10470. Washington, D.C.: World
        Bank Group.
Kuznets, Simon. 1955. "Economic Growth and Income Inequality." American Economic Review
        45(1): 1-28.
Lakner, Christoph, Daniel Gerszon Mahler, Mario Negre, and Espen Beer Prydz. 2022. "How
        Much Does Reducing Inequality Matter for Global Poverty?" Journal of Economic
        Inequality 20(3): 559-585.
Lee, Kamwoo, and Jeanine Braithwaite. 2022. "High-Resolution Poverty Maps in Sub-Saharan
        Africa." World Development 159: 106028.
Mahler, Daniel Gerszon, R. Andrés Castañeda Aguilar, and David Newhouse. 2022. "Nowcasting
        Global Poverty." World Bank Economic Review 36(4): 835-856.
Mahler, Daniel Gerszon ⓡ Nishant Yonzan ⓡ Christoph Lakner. 2022. "The Impact of COVID-19
        on Global Inequality and Poverty." Policy Research Working Paper 10198. Washington,
        DC: World Bank Group.
Martinez, Luis R. 2022. "How Much Should We Trust the Dictator's GDP Growth Estimates?"
        Journal of Political Economy 130(10): 2731-2769.
Nicoletti, Cheti, Franco Peracchi, and Francesca Foliano. 2011. "Estimating Income Poverty in
        the Presence of Missing Data and Measurement Error." Journal of Business & Economic
        Statistics 29(1): 61-72.
Pinkovskiy, Maxim, and Xavier Sala-i-Martin. 2016. "Lights, Camera ... Income! Illuminating the
        National Accounts-Household Surveys Debate." The Quarterly Journal of Economics
        131(2): 579-631.
Pokhriyal, Neeti, and Damien Christophe Jacques. 2017. "Combining Disparate Data Sources for
        Improved Poverty Prediction and Mapping." Proceedings of the National Academy of
        Sciences 114(46): E9783-E9792.
Prydz, Espen Beer, Dean Jolliffe, and Umar Serajuddin. 2022. "Disparities in Assessments of
        Living Standards Using National Accounts and Household Surveys." Review of Income
        and Wealth 68: S385-S420.
Roberts, David R., Volker Bahn, Simone Ciuti, Mark S. Boyce, Jane Elith, Gurutzeta
        Guillera-Arroita, and Severin Hauenstein. 2017. "Cross-Validation Strategies for Data with
        Temporal, Spatial, Hierarchical, or Phylogenetic Structure." Ecography 40(8): 913-929.
Roy, Sutirtha, and Roy Van Der Weide. 2025. "Estimating poverty for India after 2011 using
        private-sector survey data." Journal of Development Economics 172: 103386.
Sherwood, B., and A. Maidman. 2017. "rqPen: Penalized Quantile Regression." R package
        version 2.
Soergel, Bjoern, Elmar Kriegler, Benjamin Leon Bodirsky, Nico Bauer, Marian Leimbach, and
        Alexander Popp. 2021. "Combining Ambitious Climate Policies with Efforts to Eradicate
        Poverty." Nature Communications 12(1): 2342.
Stifel, David, and Luc Christiaensen. 2007. "Tracking Poverty over Time in the Absence of
        Comparable Consumption Data." The World Bank Economic Review 21(2): 317-341.
Tarozzi, Alessandro. 2007. "Calculating Comparable Statistics from Incomparable Surveys, With
        An Application to Poverty in India." Journal of Business and Economic Statistics 25(3):
        314-336.
UN DESA. 2016. "Transforming Our World: The 2030 Agenda for Sustainable Development."
United Nations. 2018. "World Urbanization Prospects: The 2018 Revision." Department of
        Economic and Social Affairs, Population Division. New York: United Nations.
United Nations, Department of Economic and Social Affairs, Population Division. 2022. "World
        Population Prospects 2022: Methodology of the United Nations population estimates and
        projections." UN DESA/POP/2022/TR/NO. 4.
Van Der Weide, Roy, Brian Blankespoor, Chris Elbers, and Peter Lanjouw. 2022. "How Accurate
        Is a Poverty Map Based on Remote Sensing Data? An Application to Malawi." Journal of
        Development Economics 171: 103352.




Appendix A: Additional results
                     Table A.1: Countries with at most three poverty estimates in PIP
             Algeria                    Iraq               Nepal                      Syrian Arab Republic
             Angola                     Japan              Papua New Guinea           Timor-Leste
             Cabo Verde                 Kiribati           Samoa                      Tonga
             Central African Republic   Lebanon            São Tomé and Príncipe      Trinidad and Tobago
             Chad                       Lesotho            Sierra Leone               Turkmenistan
             Comoros                    Liberia            Solomon Islands            Tuvalu
             Congo, Rep.                Marshall Islands   South Sudan                United Arab Emirates
             Congo, Dem. Rep.           Mauritius          St. Lucia                  Vanuatu
             Gabon                      Micronesia         Sudan                      Yemen, Rep.
             Guyana                     Myanmar            Suriname                   Zimbabwe
             Haiti                      Nauru


             Figure A.1: Alternative distributional approaches vis-à-vis baseline approach




Note: Error of alternative approaches minus the error of our baseline approach. 'Indirect' refers to estimates from first
predicting the median and Gini, and then inferring a full log-logistic distribution. 'OLS' refers to using an OLS regression
rather than a quantile regression. 'Log-normal' refers to assuming that the distributions are log-normal rather than
log-logistic.
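
To illustrate the 'Indirect' approach mentioned in the note, the sketch below recovers a poverty headcount from a median and a Gini under a log-logistic assumption. It relies on two standard properties of the log-logistic distribution: its scale parameter equals the median, and its Gini index equals the inverse of its shape parameter. The function name and the input values are hypothetical and are not taken from the paper.

```python
def loglogistic_headcount(poverty_line, median, gini):
    """Share of the population below poverty_line under a log-logistic distribution.

    Uses scale = median and shape = 1 / Gini (valid for 0 < Gini < 1).
    """
    shape = 1.0 / gini
    return 1.0 / (1.0 + (poverty_line / median) ** (-shape))


# Hypothetical country: median of $3.00 per day and a Gini of 0.40.
rate = loglogistic_headcount(2.15, median=3.0, gini=0.40)
print(f"Share below $2.15: {rate:.3f}")
```

Because the whole distribution is pinned down by two numbers, the same function returns a headcount at any poverty line.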




                     Figure A.2: Variables important for predicting median and Gini
                                                      (a) Median




                                                        (b)   Gini




Note: The figure shows the ten most important predictors of the median and Gini out of 1000+ variables featuring in
our models. The predictor with the most relevance is scaled to 1.




      Figure A.3: Predicted and actual poverty rates (1 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.4: Predicted and actual poverty rates (2 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.5: Predicted and actual poverty rates (3 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.6: Predicted and actual poverty rates (4 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.7: Predicted and actual poverty rates (5 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.8: Predicted and actual poverty rates (6 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.9: Predicted and actual poverty rates (7 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.10: Predicted and actual poverty rates (8 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




      Figure A.11: Predicted and actual poverty rates (9 of 9)




Note: When poverty rates from PIP are disconnected, estimates are not comparable.




Appendix B: Application to the World Bank's global poverty and
  inequality measures
The method we developed lends itself to being used in the World Bank's global poverty measures
for the countries that do not have any survey data. For the purpose of calculating regional and
global poverty headcounts, countries without a household survey at any point in time, or countries
without national accounts data needed for year-to-year extrapolations, are currently assigned the
population-weighted regional average poverty rate. This affects 55 economies (Table B.1).⁵ A few
countries, such as Afghanistan, Saudi Arabia, the República Bolivariana de Venezuela, and the
Democratic People's Republic of Korea, account for most of the population without data,
which adds up to more than 200 million people across all the missing economies (Table B.2).
          Table B.1: Economies without household survey data at any point in time in PIP
 Afghanistan                Cuba                        Kosovo                        San Marino
 American Samoa             Curaçao                     Kuwait                        Saudi Arabia
 Andorra                    Dominica                    Libya                         Singapore
 Antigua and Barbuda        Equatorial Guinea           Liechtenstein                 Sint Maarten (Dutch part)
 Aruba                      Eritrea                     Macao SAR, China              Somalia
 Bahamas, The               Faeroe Islands              Monaco                        South Sudan
 Bahrain                    French Polynesia            Nauru                         St. Kitts and Nevis
 Barbados                   Gibraltar                   New Caledonia                 St. Martin (French part)
 Bermuda                    Greenland                   New Zealand                   St. Vincent and the Grenadines
 British Virgin Islands     Grenada                     Northern Mariana Islands      Timor-Leste
 Brunei Darussalam          Guam                        Oman                          Turks and Caicos Islands
 Cambodia                   Hong Kong SAR, China        Palau                         Venezuela, RB
 Cayman Islands             Isle of Man                 Puerto Rico                   Virgin Islands (U.S.)
 Channel Islands            Korea, Dem. People's Rep.   Qatar


                          Table B.2: Missing economies ranked by population in 2019

                               Economy                      Population 2019 (in millions)
                               Afghanistan                               38
                               Saudi Arabia                              36
                               Venezuela, RB                             29
                               Korea, Dem. People's Rep.                 26
                               Cambodia                                  16
                               Somalia                                   16
                               Cuba                                      11
                               South Sudan                               10
                               Hong Kong SAR, China                       8
                               Libya                                      7
                               Other Economies                           40
                               Total Missing                            205

⁵ Table B.1 includes economies that are missing in all years (e.g., the Democratic People's Republic of
Korea), but also economies that are missing only for some years (e.g., the República Bolivariana de
Venezuela). The latter occurs when the national accounts data needed for the extrapolations is unavailable
in some years.

Though the World Bank often focuses on poverty at specific poverty lines ($2.15/day and
$6.85/day), PIP allows users to query any poverty line. Therefore, any method to be used for
countries without data needs to predict a poverty rate for any poverty line, and hence, in practice
recover a full distribution. In addition, full distributions are needed to extrapolate and nowcast
poverty, and to calculate inequality measures, such as the prosperity gap (Kraay et al. 2023).
This makes our method particularly suitable.

In this appendix, we first show how our model performs relative to the current practice of using
regional averages. Next, we demonstrate how the model performs compared to a model which
directly predicts poverty rates at the global poverty lines of $2.15 and $6.85 per person per day
to see how much is lost by predicting a full distribution. Third, we explain how the model relates
to PIP's current extrapolation rules. Finally, we show the implications of adopting the model on
global and regional poverty and inequality measures.

B.1 Accuracy compared to PIP's current method
Figure B.1 compares the errors using PIP's current method with the two tiers for the full sample
and for data deprived countries. We add the error for two similar methods: using the regional
average by World Bank region instead of PIP region, and using the income group average. On
the full sample, the error of PIP's method is 18% higher than that of tier 2 and 33% higher than
that of tier 1. PIP's method performs better than using income groups or World Bank regions in
the full sample, but for data deprived countries, income groups outperform the PIP regions. This
suggests that data deprived countries are less representative of their region than of their income
group. On the data deprived sample, the error of PIP's method is 23% and 39% higher than that of
tier 2 and tier 1, respectively.
                   Figure B.1: Accuracy of PIP's method compared to two tiers

                 Note: Prediction error of two tiers, PIP's current method ("PIP region") and PIP's
                 current method applied to two alternative groupings.

B.2 Accuracy compared to model predicting poverty rates directly
One concern with the proposed tiers is that their focus on predicting full distributions may imply
that they are not as accurate as models that predict a particular poverty rate. Given that most
cross-country poverty comparisons use the international poverty line or another fixed poverty line,
such as the typical poverty line of upper-middle-income countries, $6.85, this could come at some
cost to the accuracy of such work.

Figure B.2 shows the errors from our two models when evaluated by how well they predict poverty
rates at the $2.15 line or $6.85 line. The predicted poverty rate is then compared with the true
poverty rate at that line, and the error is now evaluated as the mean absolute deviation in poverty
rates. We compare the errors from these models with the errors from running fractional logit
regressions, where the left-hand-side is the true poverty rate, and the covariates used on the
right-hand side are the ones used for the two tiers. Fractional logit regressions are generalizations
of logit regressions where the left-hand-side variable can take any value in the unit interval. In all
cases, we still evaluate errors using leave-one-country-out cross-validation. We show the errors
on the full sample and the data deprived sample.
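
A fractional logit of the kind described above models the expected poverty rate as a logistic function of the covariates and can be estimated by Bernoulli quasi-likelihood. The sketch below is a minimal pure-Python illustration with one hypothetical covariate (log GDP per capita) and made-up data generated from a known curve. It is not the specification behind Figure B.2, which uses the covariates of the two tiers.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_fractional_logit(x, y, iterations=50):
    """Fit E[y | x] = sigmoid(a + b*x) for y in [0, 1] using Newton steps
    on the Bernoulli quasi-log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            # Score (gradient) terms.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the negative Hessian.
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton update: solve the 2x2 system H * step = g.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: poverty rates that fall as log GDP per capita rises.
x = [6.0, 7.0, 8.0, 9.0, 10.0]
y = [sigmoid(8.0 - xi) for xi in x]

a, b = fit_fractional_logit(x, y)
predicted = [sigmoid(a + b * xi) for xi in x]
print(f"intercept={a:.3f}, slope={b:.3f}")
```

Unlike a linear regression on the poverty rate, the fitted values always stay inside the unit interval.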
                  Figure B.2: Accuracy of proposed method on specific poverty lines




    Note: Errors in predicted poverty rates using our two-tier distributional models ("Distributions") and the errors
    from predicting the poverty rates directly without recovering full distributions ("Direct poverty rate").

The error of 6.7 for the distributional approach on the full sample at the $2.15 line suggests that
our model on average predicts $2.15 poverty rates 6.7 percentage points off. Unsurprisingly, the
errors are larger for tier 2 and for the data deprived sample. Somewhat surprisingly, the errors
are almost always higher when the poverty rates are predicted directly through fractional logit
regressions. This suggests that not much is lost by predicting full distributions; in fact, the error
might be reduced. It is important to note, though, that the choice of variables in the prediction
model is held constant in these comparisons: if we had focused exclusively on predicting poverty
at the $2.15 line, it is likely that other covariates would have been chosen, which could have led
to a model predicting $2.15 poverty rates directly performing better than shown above.

B.3 Consistency with PIP's extrapolation rule
PIP currently extrapolates old welfare vectors forward in time by assuming that a 1% growth in
GDP per capita or HFCE per capita leads to a 1% growth in welfare across the distribution when
income is used, and 0.7% when consumption is used. The coefficients from the tier 1 regression
(Table 1) suggest that, all else equal, when GDP per capita grows 1%, the predicted median
grows by 0.393% for consumption vectors and 0.785% (0.393+0.392) for income vectors. Yet this
ignores that growth in GDP is likely to also impact welfare indirectly through lowering under-5
mortality, rural population shares, and by increasing life expectancy (the other three variables).
Regressing changes in these other variables on growth in GDP per capita suggests that a 1%
growth in GDP per capita is associated with a 0.82% decline in under-5 mortality, a 0.15 pct. point
decline in the rural population share, and a 0.06 increase in life expectancy. Taking these factors
into account, our estimate suggests that a 1% growth in GDP per capita is expected to increase
consumption vectors by 0.69% and income vectors by 1.08%, which is very close to the
extrapolation rule.
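
The decomposition behind these numbers can be written compactly. The notation below is introduced only for this sketch and does not appear in the paper: $m$ is the predicted median, $g$ is GDP per capita, $z_k$ are the three other predictors, and $\beta$ denotes the tier 1 coefficients from Table 1, whose values for the $z_k$ are not reproduced here.

```latex
\begin{align*}
\frac{d \ln m}{d \ln g}
  &= \underbrace{\beta_g}_{\text{direct effect}}
   + \underbrace{\sum_{k=1}^{3} \beta_k \,\frac{d z_k}{d \ln g}}_{\text{indirect effect}} \\[4pt]
\text{Consumption:}\quad 0.69 &= 0.393 + \text{indirect}
  \;\Rightarrow\; \text{indirect} \approx 0.30 \\
\text{Income:}\quad 1.08 &= 0.785 + \text{indirect}
  \;\Rightarrow\; \text{indirect} \approx 0.30
\end{align*}
```

Under this reading, the indirect channel implied by the reported totals is roughly 0.3 percentage points of welfare growth per 1% of GDP per capita growth for both welfare types.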

B.4 Implications for global poverty and inequality measurement
Finally, we implement our preferred methods to predict poverty rates at the international poverty
line ($2.15/day) and the upper-middle-income poverty line ($6.85/day) for countries that are
missing from PIP. Specifically, we replace missing headcounts for 1,721 country-year
observations with predictions from our Tier 1 model when GDP per capita is available (1,092
country-year observations), and from our Tier 2 model in the remainder of the cases (629
observations). Implementing our proposed method increases the global poverty rate in 2019 by
about 0.14 percentage points at the $2.15 line and 0.21 percentage points at the $6.85 line,
equivalent to an additional 11 million and 16 million poor people globally, respectively (Figure B.3). There is a
similarly small estimated increase in global inequality, with the global Gini coefficient estimated
at 61.9 instead of 61.8 in 2019. For both global poverty and global inequality, the trend from 1991-
2019 looks strikingly similar with or without our two tiers. This is because countries without data
make up only 3% of the global population.
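
The fallback rule used to fill the missing country-years can be sketched as below. The records are hypothetical and the function name is an assumption made for the example. The only logic taken from the text is that Tier 1 is used when GDP per capita is available and Tier 2 otherwise.

```python
def choose_tier(gdp_per_capita):
    """Return the model to use for one country-year (None means GDP is missing)."""
    return "tier1" if gdp_per_capita is not None else "tier2"


# Hypothetical country-year records: (economy, year, GDP per capita or None).
records = [
    ("X", 2005, 1250.0),
    ("X", 2006, None),
    ("Y", 2005, None),
    ("Z", 2005, 830.5),
]

counts = {"tier1": 0, "tier2": 0}
for _, _, gdp in records:
    counts[choose_tier(gdp)] += 1

print(counts)
```

Applied to the observations described above, the same split gives 1,092 Tier 1 cases and 629 Tier 2 cases, which sum to the 1,721 replaced country-years.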

The differences are slightly larger at the regional level, with the extreme poverty rate in East Asia
& Pacific increasing by 7%, and the rate of Latin America & the Caribbean and South Asia
increasing by 4% (Figure B.4). The increase in the poverty rate for these regions is driven by
populous countries without data for which the two tiers suggest higher poverty rates than the
regional average. In Afghanistan, the extreme poverty rate is 20 percentage points higher in 2019
(equivalent to 7 million additional poor), while the figures are also notable for Somalia (increase
by 14 pct. points, 2 million additional poor), the República Bolivariana de Venezuela (5 pct. points,
1.5 million poor), Cambodia (7 pct. points, 1.1 million poor) and the Democratic People's Republic
of Korea (3 pct. points, 0.7 million poor). These findings suggest that poverty rates are
underestimated in data deprived countries, which tend to be poorer than others in their regions.
Country examples of the poverty rates from the current method and our proposal are shown in
Figure B.5.
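
The link between the percentage-point changes and the number of additional poor is simple arithmetic, sketched below with the rounded 2019 populations from Table B.2 and the rounded changes quoted above. Because both inputs are rounded, the results only approximate the figures reported in the text. The variable names are illustrative.

```python
# Population in 2019, in millions (rounded values from Table B.2).
population_millions = {
    "Afghanistan": 38,
    "Somalia": 16,
    "Venezuela, RB": 29,
    "Cambodia": 16,
    "Korea, Dem. People's Rep.": 26,
}

# Increase in the extreme poverty rate, in percentage points (from the text).
change_pct_points = {
    "Afghanistan": 20,
    "Somalia": 14,
    "Venezuela, RB": 5,
    "Cambodia": 7,
    "Korea, Dem. People's Rep.": 3,
}

# Additional poor (millions) = population (millions) * change (pct. points) / 100.
additional_poor = {
    economy: population_millions[economy] * change_pct_points[economy] / 100
    for economy in population_millions
}

for economy, millions in additional_poor.items():
    print(f"{economy}: about {millions:.1f} million additional poor")
```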

                                 Figure B.3: Global poverty and inequality

 (a) Global poverty rate, 1991-2019                      (b) Global inequality, 1991-2019




Note: Panel (a) compares the poverty rates in PIP (version 20230919_2017_01_02_PROD) with estimates when using
the models presented in this paper for countries without a poverty estimate in PIP (in dashed lines). The two nearly
perfectly overlap with each other. Panel (b) does the same for the global Gini, using estimates from Mahler et al.
(2022), updated to reflect PIP version 20230919_2017_01_02_PROD.

                                    Figure B.4: Regional poverty in 2019




Note: Compares regional poverty rates in 2019 using PIP's current method to deal with missing countries ("From PIP")
and the proposed models from this paper ("Predictions").




          Figure B.5: Country examples comparing current method to proposed alternative




Note: Compares country-level poverty rates for countries without any data in PIP using the regional average poverty
rate ("from PIP") and the proposed models from this paper ("Predictions").

The country predictions are more volatile than the ones implicitly used in PIP because the current
procedure averages out country fluctuations by relying on information from the whole region. By
contrast, the input data used in the two tiers here at times change drastically year-to-year. For
example, Somalia's life expectancy changes from 27 years in 1992 to 51 years in 1993, Cuba
moves from LMIC to UMIC in 2007, and Cambodia's GDP/capita falls from US$1,717 to US$1,078
from 1993 to 1994.



