Publication: Small Area Estimation of Poverty and Wealth Using Geospatial Data: What Have We Learned So Far ?
Date
2023-07-18
Published
2023-07-18
Author(s)
Newhouse, David
Abstract
This paper offers a nontechnical review of selected applications that combine survey and geospatial data to generate small area estimates of wealth or poverty. Publicly available data from satellites and phones predict poverty and wealth accurately across space when evaluated against census data, and their use in model-based estimates improves the accuracy and efficiency of direct survey estimates. Although the evidence is scant, models based on interpretable features appear to predict at least as well as estimates derived from Convolutional Neural Networks. Estimates for sampled areas are significantly more accurate than those for non-sampled areas due to informative sampling. In general, estimates benefit from using geospatial data at the most disaggregated level possible. Tree-based machine learning methods appear to generate more accurate estimates than linear mixed models. Small area estimates using geospatial data can improve the design of social assistance programs, particularly when the existing targeting system is poorly designed.
Citation
“Newhouse, David. 2023. Small Area Estimation of Poverty and Wealth Using Geospatial Data: What Have We Learned So Far ?. Policy Research Working Papers; 10512. © World Bank. http://hdl.handle.net/10986/40028 License: CC BY 3.0 IGO.”
Related items
Showing items related by metadata.
Publication: Small Area Estimation of Poverty in Four West African Countries by Integrating Survey and Geospatial Data (Washington, DC: World Bank, 2024-09-05)
The paper presents a methodology to generate experimental small area estimates of poverty in four West African countries: Chad, Guinea, Mali, and Niger. Due to the absence of recent census data in these countries, household-level survey data are integrated with grid-level geospatial data, which are used as covariates in model-based estimation. Leveraging geospatial data enables reporting of poverty estimates more frequently at disaggregated administrative levels and makes estimation feasible in areas for which survey data are not available. The paper leverages the availability of a recent census in Burkina Faso for evaluation purposes. Estimates obtained with the same survey instruments and candidate geospatial covariates as the other four countries are compared against estimates obtained using recent census data and an empirical best predictor under a unit-level model. For Burkina Faso, the estimates obtained using geospatial data are highly correlated with the census-based ones in sampled areas but moderately correlated in non-sampled areas. The results demonstrate that in the absence of recent census data, small area estimation with publicly available geospatial covariates is

Publication: Small Area Estimation of Non-Monetary Poverty with Geospatial Data (World Bank, Washington, DC, 2020-09)
This paper uses data from Sri Lanka and Tanzania to evaluate the benefits of combining household surveys with geographically comprehensive geospatial indicators to generate small area estimates of non-monetary poverty. The preferred estimates are generated by utilizing subarea-level geospatial indicators in a household-level empirical best predictor mixed model with a normalized welfare measure. Mean squared errors are estimated using a parametric bootstrap procedure. The resulting estimates are highly correlated with non-monetary poverty calculated from the full census in both countries, and the gain in precision is comparable to increasing the size of the sample by a factor of three in Sri Lanka and five in Tanzania. The empirical best predictor model moderately underestimates uncertainty, but coverage rates are similar to standard survey-based estimates that assume independent outcomes across clusters. A variety of checks, including adding noise to the welfare measure and model-based and design-based simulations, confirm that the main results are robust. The results demonstrate that combining household survey data with subarea-level geospatial indicators can greatly increase the precision of survey estimates of non-monetary poverty at comparatively low cost.

Publication: Updating Poverty Estimates at Frequent Intervals in the Absence of Consumption Data : Methods and Illustration with Reference to a Middle-Income Country (World Bank Group, Washington, DC, 2014-09)
Obtaining consistent estimates on poverty over time as well as monitoring poverty trends on a timely basis is a priority concern for policy makers. However, these objectives are not readily achieved in practice when household consumption data are neither frequently collected, nor constructed using consistent and transparent criteria. This paper develops a formal framework for survey-to-survey poverty imputation in an attempt to overcome these obstacles, and to elevate the discussion of these methods beyond the largely ad-hoc efforts in the existing literature. The framework introduced here imposes few restrictive assumptions, works with simple variance formulas, provides guidance on the selection of control variables for model building, and can be generally applied to imputation either from one survey to another survey with the same design, or to another survey with a different design. Empirical results analyzing the Household Expenditure and Income Survey and the Unemployment and Employment Survey in Jordan are quite encouraging, with imputation-based poverty estimates closely tracking the direct estimates of poverty.

Publication: Estimating Poverty in the Absence of Consumption Data : The Case of Liberia (World Bank Group, Washington, DC, 2014-09)
In much of the developing world, the demand for high frequency quality household data for poverty monitoring and program design far outstrips the capacity of the statistics bureau to provide such data. In these environments, all available data sources must be leveraged. Most surveys, however, do not collect the detailed consumption data necessary to construct aggregates and poverty lines to measure poverty directly. This paper benefits from a shared listing exercise for two large-scale national household surveys conducted in Liberia in 2007 to explore alternative methodologies to estimate poverty indirectly. The first is an asset-based model that is commonly used in Demographic and Health Surveys. The second is a survey-to-survey imputation that makes use of small area estimation techniques. In addition to a standard base model, separate models are estimated for urban and rural areas and an expanded model that includes climatic variables. Special attention is paid to the inclusion of cell phones, with implications for other assets whose cost and availability may be changing rapidly. The results demonstrate substantial limitations with asset-based indexes, but also leave questions as to the accuracy and stability of imputation models.

Publication: Small Area Estimation of Monetary Poverty in Mexico Using Satellite Imagery and Machine Learning (World Bank, Washington, DC, 2022-09)
Estimates of poverty are an important input into policy formulation in developing countries. The accurate measurement of poverty rates is therefore a first-order problem for development policy. This paper shows that combining satellite imagery with household surveys can improve the precision and accuracy of estimated poverty rates in Mexican municipalities, a level at which the survey is not considered representative. It also shows that a household-level model outperforms other common small area estimation methods. However, poverty estimates in 2015 derived from geospatial data remain less accurate than 2010 estimates derived from household census data. These results indicate that the incorporation of household survey data and widely available satellite imagery can improve on existing poverty estimates in developing countries when census data are old or when patterns of poverty are changing rapidly, even for small subgroups.
Users also downloaded
Showing related downloaded files
Publication: Doing Business 2020 (Washington, DC: World Bank, 2020)
Doing Business 2020 is the 17th in a series of annual studies investigating the regulations that enhance business activity and those that constrain it. It provides quantitative indicators covering 12 areas of the business environment in 190 economies. The goal of the Doing Business series is to provide objective data for use by governments in designing sound business regulatory policies and to encourage research on the important dimensions of the regulatory environment for firms.

Publication: Early Identification of At-Risk Youth in Latin America : An Application of Cluster Analysis (World Bank, Washington, DC, 2007-10)
A new literature on the nature of and policies for youth in Latin America is emerging, but there is still very little known about who are the most vulnerable young people. This paper aims to characterize the heterogeneity in the youth population and identify ex ante the youth that are at-risk and should be targeted with prevention programs. Using non-parametric methodologies and specialized youth surveys from Mexico and Chile, the authors quantify and characterize the different sub-groups of youth, according to the amount of risk in their lives, and find that approximately 20 percent of 18 to 24 year old Chileans and 40 percent of the same age cohort in Mexico are suffering the consequences of a range of negative behaviors. Another 8 to 20 percent demonstrate factors in their lives that pre-dispose them to becoming at-risk youth - they are the candidates for prevention programs. The analysis finds two observable variables that can be used to identify which children have a higher probability of becoming troubled youth: poverty and residing in rural areas. The analysis also finds that risky behaviors increase with age and differ by gender, thereby highlighting the need for program and policy differentiation along these two demographic dimensions.

Publication: Poverty Mapping in the Age of Machine Learning (World Bank, Washington, DC, 2023-05-04)
Recent years have witnessed considerable methodological advances in poverty mapping, much of which has focused on the application of modern machine-learning approaches to remotely sensed data. Poverty maps produced with these methods generally share a common validation procedure, which assesses model performance by comparing subnational machine-learning-based poverty estimates with survey-based, direct estimates. Although unbiased, survey-based estimates at a granular level can be imprecise measures of true poverty rates, meaning that it is unclear whether the validation procedures used in machine-learning approaches are informative of actual model performance. This paper examines the credibility of existing approaches to model validation by constructing a pseudo-census from the Mexican Intercensal Survey of 2015, which is used to conduct several design-based simulation experiments. The findings show that the validation procedure often used for machine-learning approaches can be misleading in terms of model assessment since it yields incorrect information for choosing what may be the best set of estimates across different methods and scenarios. Using alternative validation methods, the paper shows that machine-learning-based estimates can rival traditional, more data intensive poverty mapping approaches. Further, the closest approximation to existing machine-learning approaches, using publicly available geo-referenced data, performs poorly when evaluated against “true” poverty rates and fails to outperform traditional poverty mapping methods in targeting simulations.

Publication: Sustaining Poverty Gains (Washington, DC: World Bank, 2024-09-05)
Poverty maps are a useful tool for targeting social programs on areas with high concentrations of poverty. However, a static focus on poverty ignores its temporal dimension. Thus, current nonpoor households still face substantial welfare volatility and are at risk of becoming poor in the face of shocks. This paper combines the methods of poverty mapping and vulnerability estimation to create highly disaggregated vulnerability maps. The maps include predictions of the share of chronically poor households (poverty-induced vulnerability), the focus of traditional poverty maps, and the share of households showing a significant probability of falling into poverty (risk-induced vulnerability). As an application of the method, the paper estimates a vulnerability map for Senegal that provides quotas for the expansion of the social registry. Accounting for the poor and the population at risk of poverty implies, in practice, the expansion of coverage into urban and peri-urban areas that tend to experience lower poverty rates. The inclusion of nonpoor households also serves as a first step toward supporting a dynamic social registry.

Publication: Guidelines to Small Area Estimation for Poverty Mapping (Washington, DC : World Bank, 2022-06-16)
The eradication of poverty, which was the first of the millennium development goals (MDG) established by the United Nations and followed by the sustainable development goals (SDG), requires knowing where the poor are located. Traditionally, household surveys are considered the best source of information on the living standards of a country’s population. Data from these surveys typically provide a sufficiently accurate direct estimate of household expenditures or income and thus estimates of poverty at the national level and larger international regions. However, when one starts to disaggregate data by local areas or population subgroups, the quality of these direct estimates diminishes. Consequently, national statistical offices (NSOs) cannot provide reliable wellbeing statistical figures at a local level. For example, the module of socioeconomic conditions of the Mexican national survey of household income and expenditure (ENIGH) is designed to produce estimates of poverty and inequality at the national level and for the 32 federate entities (31 states and Mexico City) with disaggregation by rural and urban zones, every two years, but there is a mandate to produce estimates by municipality every five years, and the ENIGH alone cannot provide estimates for all municipalities with adequate precision. This makes monitoring progress toward the sustainable development goals more difficult.