Publication:
Integrating Survey and Geospatial Data to Identify the Poor and Vulnerable: Evidence from Malawi

Published
2022-12-08
Date
2023-01-10
Abstract
Generating timely data to identify the poorest villages in developing countries remains a fundamental challenge for existing data systems. This paper investigates the accuracy of four alternative methods for predicting a measure of village economic welfare for approximately 4,500 villages in 10 poor Malawian districts: (1) proxy means test scores calculated from the 2017 social registry, (2) the Meta Relative Wealth Index, (3) predictions derived from a standard household survey and publicly available geospatial indicators, and (4) predictions derived from a two-step approach that first predicts welfare into a hypothetical partial registry of approximately 450 villages, and then predicts welfare into the remaining villages using geospatial indicators. Geospatial indicators include land coverage indicators, weather data, night light data, building patterns, distance to major roads, and population density. Predictions are evaluated against a benchmark village welfare measure, constructed by imputing log per capita consumption from the 2016 integrated household survey into the 2018 household census using gradient boosting. Incorporating the hypothetical partial registry vastly improves the performance of the predictions. When using the partial registry, the rank correlation between the predicted and benchmark welfare measures is 0.75, while those for the other three methods range from -0.02 to 0.2, and similar results are seen when examining the area under the curve. Doubling the size of the partial registry does little to improve predictive performance. The results are robust to using a linear post–Least Absolute Shrinkage and Selection Operator model instead of gradient boosting for prediction. However, predictions using both methods are less accurate when the benchmark welfare measure is derived from a linear post–Least Absolute Shrinkage and Selection Operator model.
Overall, the results strongly suggest that collecting partial registries of household-level poverty predictors in low-income contexts can vastly improve the performance of machine learning models that combine survey and satellite imagery for the purpose of village-level targeting.
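The abstract evaluates each method by the rank correlation between predicted and benchmark village welfare. As a minimal sketch of that evaluation step, the code below computes a rank correlation using only the Python standard library. It assumes Spearman's coefficient with average ranks for ties; the abstract does not say which rank correlation the authors used. The function names are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value ties with the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rank_correlation(predicted, benchmark):
    """Pearson correlation of the ranks of two equal-length sequences.

    Assumption: Spearman's coefficient stands in for the unspecified
    "rank correlation" reported in the abstract.
    """
    if len(predicted) != len(benchmark) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rank_p = average_ranks(predicted)
    rank_b = average_ranks(benchmark)
    mean_p, mean_b = mean(rank_p), mean(rank_b)
    cov = sum((a - mean_p) * (b - mean_b) for a, b in zip(rank_p, rank_b))
    var_p = sum((a - mean_p) ** 2 for a in rank_p)
    var_b = sum((b - mean_b) ** 2 for b in rank_b)
    if var_p == 0 or var_b == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_p * var_b) ** 0.5
```

In use, `predicted` would hold one predicted welfare value per village and `benchmark` the corresponding benchmark value, in the same village order. A value near 1 means the method ranks villages almost exactly as the benchmark does, which is the property that matters for village-level targeting.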
Citation
Gualavisi, Melany; Newhouse, David Locke. 2022. Integrating Survey and Geospatial Data to Identify the Poor and Vulnerable: Evidence from Malawi. © World Bank. http://hdl.handle.net/10986/38442 License: CC BY 3.0 IGO.
Report Series
Other publications in this report series
  • Publication
    The Economic Value of Weather Forecasts: A Quantitative Systematic Literature Review
    (Washington, DC: World Bank, 2025-09-10) Farkas, Hannah; Linsenmeier, Manuel; Talevi, Marta; Avner, Paolo; Jafino, Bramka Arga; Sidibe, Moussa
    This study systematically reviews the literature that quantifies the economic benefits of weather observations and forecasts in four weather-dependent economic sectors: agriculture, energy, transport, and disaster-risk management. The review covers 175 peer-reviewed journal articles and 15 policy reports. Findings show that the literature is concentrated in high-income countries and most studies use theoretical models, followed by observational and then experimental research designs. Forecast horizons studied, meteorological variables and services, and monetization techniques vary markedly by sector. Estimated benefits even within specific subsectors span several orders of magnitude and broad uncertainty ranges. An econometric meta-analysis suggests that theoretical studies and studies in richer countries tend to report significantly larger values. Barriers that hinder value realization are identified on both the provider and user sides, with inadequate relevance, weak dissemination, and limited ability to act recurring across sectors. Policy reports rely heavily on back-of-the-envelope or recursive benefit-transfer estimates, rather than on the methods and results of the peer-reviewed literature, revealing a science-to-policy gap. These findings suggest substantial socioeconomic potential of hydrometeorological services around the world, but also knowledge gaps that require more valuation studies focusing on low- and middle-income countries, addressing provider- and user-side barriers and employing rigorous empirical valuation methods to complement and validate theoretical models.
  • Publication
    Direct and Indirect Impacts of Transport Mobility on Access to Jobs: Evidence from South Africa
    (Washington, DC: World Bank, 2025-11-12) Iimi, Atsushi
    Access to jobs is essential for economic growth. In Africa, unemployment rates are notably high. This paper reexamines the relationship between transport mobility and labor market outcomes, with a particular focus on the direct and indirect effects of transport connectivity. As predicted by theory, wages are influenced by the level of commuting deterrence. Generally, higher earnings are associated with longer commute times and/or higher commuting costs. Local accessibility is also important, especially for individuals with time constraints. Both direct and indirect impacts are found to be significant in South Africa, where job accessibility has been challenging since the end of apartheid. For the direct impact, the wage elasticity associated with commuting costs is significant. Returns on commute are particularly high for women. Local accessibility to socioeconomic facilities, such as shops and health services, is also found to have a significant impact, consistent with the concept of mobility of care. To enhance employment, therefore, it is crucial to connect people not only to job locations but also to various socioeconomic points of interest, such as markets and hospitals, in an integrated manner. This integration will enable individuals to spend more time working and commuting longer distances.
  • Publication
    The Macroeconomic Implications of Climate Change Impacts and Adaptation Options
    (Washington, DC: World Bank, 2025-05-29) Abalo, Kodzovi; Boehlert, Brent; Bui, Thanh; Burns, Andrew; Castillo, Diego; Chewpreecha, Unnada; Haider, Alexander; Hallegatte, Stephane; Jooste, Charl; McIsaac, Florent; Ruberl, Heather; Smet, Kim; Strzepek, Ken
    Estimating the macroeconomic implications of climate change impacts and adaptation options is a topic of intense research. This paper presents a framework in the World Bank's macrostructural model to assess climate-related damages. This approach has been used in many Country Climate and Development Reports, a World Bank diagnostic that identifies priorities to ensure continued development in spite of climate change and climate policy objectives. The methodology captures a set of impact channels through which climate change affects the economy by (1) connecting a set of biophysical models to the macroeconomic model and (2) exploring a set of development and climate scenarios. The paper summarizes the results for five countries, highlighting the sources and magnitudes of their vulnerability, with estimated gross domestic product losses in 2050 exceeding 10 percent of gross domestic product in some countries and scenarios, although only a small set of impact channels is included. The paper also presents estimates of the macroeconomic gains from sector-level adaptation interventions, considering their upfront costs and avoided climate impacts and finding significant net gross domestic product gains from adaptation opportunities identified in the Country Climate and Development Reports. Finally, the paper discusses the limits of current modeling approaches, and their complementarity with empirical approaches based on historical data series. The integrated modeling approach proposed in this paper can inform policymakers as they make proactive decisions on climate change adaptation and resilience.
  • Publication
    From Policy to Practice: Lessons from the Implementation of the Refugee Work Rights Policy in Ethiopia
    (Washington, DC: World Bank, 2025-11-10) Perez, Ana Maria; Rozo, Sandra V.
    This paper examines the early implementation of Ethiopia’s refugee work rights policy, with a focus on the issuance of permits that enable refugees to engage in economic activities. Building on significant legal and institutional advances under the 2019 Refugee Proclamation and subsequent directives, the analysis explores how these reforms are being operationalized in practice. Using a mixed-methods approach, combining document review, administrative data analysis, and semi-structured interviews, the paper identifies both progress and remaining challenges. Permit issuance has increased since the adoption of detailed operational guidance in 2024, reflecting the Government of Ethiopia’s commitment to operationalizing its progressive legal framework and ensuring that refugees can exercise their right to work. However, take-up remains modest, with about 5.2 percent of the working-age population holding a permit. Preliminary evidence suggests that coordination gaps, limited subnational capacity, low awareness among refugees and employers, and disincentives to formalize in a largely informal labor market are contributing to the low take-up. The paper offers policy suggestions, grounded in the Ethiopian context and emerging evidence, to help translate legal commitments into improved labor market outcomes for refugees.
  • Publication
    Monitoring Global Aid Flows: A Novel Approach Using Large Language Models
    (Washington, DC: World Bank, 2025-11-04) Luo, Xubei; Rajasekaran, Arvind Balaji; Scruggs, Andrew Conner
    Effective monitoring of development aid is the foundation for assessing the alignment of flows with their intended development objectives. Existing reporting systems, such as the Organisation for Economic Co-operation and Development’s Creditor Reporting System, provide standardized classification of aid activities but have limitations when it comes to capturing new areas like climate change, digitalization, and other cross-cutting themes. This paper proposes a bottom-up, unsupervised machine learning framework that leverages textual descriptions of aid projects to generate highly granular activity clusters. Using the 2021 Creditor Reporting System data set of nearly 400,000 records, the model produces 841 clusters, which are then grouped into 80 subsectors. These clusters reveal 36 emerging aid areas not tracked in the current Creditor Reporting System taxonomy, allow unpacking of “multi-sectoral” and “sector not specified” classifications, and enable estimation of flows to new themes, including World Bank Global Challenge Programs, International Development Association–20 Special Themes, and Cross-Cutting Issues. Validation against both Creditor Reporting System benchmarks and International Development Association commitment data demonstrates robustness. This approach illustrates how machine learning and the new advances in large language models can enhance the monitoring of global aid flows and inform future improvements in aid classification and reporting. It offers a useful tool that can support more responsive and evidence-based decision-making, helping to better align resources with evolving development priorities.

Related items

Showing items related by metadata.

  • Publication
    Integrating Survey and Geospatial Data for Geographical Targeting of the Poor and Vulnerable
    (Washington, DC: World Bank, 2024-05-27) Gualavisi, Melany; Newhouse, David
    To address the challenge of identifying the poorest villages in developing countries, this study introduces a cost-effective strategy that leverages a combination of household consumption surveys, geospatial data, and a partial registry. The study simulates a partial registry that contains proxy poverty indicators for 450 villages across 10 impoverished districts of Malawi. These indicators are used to impute household per capita consumption estimates, which in turn are used to train a prediction model using publicly available geospatial data. This method is evaluated against an imputed reference of village welfare, derived from the 2016 household survey. The partial registry approach is benchmarked against three alternatives: proxy means test scores, the Meta Relative Wealth Index, and predictions from household surveys with geospatial indicators. Results show the partial registry model’s rank correlation with actual welfare measures at 0.75, outperforming the other methods significantly, which ranged from −0.02 to 0.2. These findings hold under various robustness checks, including the addition of Gaussian noise, indicating that collecting household-level proxy poverty data in low-income areas can significantly improve the performance of machine learning models that integrate survey and satellite imagery data for village-level geographic targeting.
  • Publication
    Small Area Estimation of Poverty in Four West African Countries by Integrating Survey and Geospatial Data
    (Washington, DC: World Bank, 2024-09-05) Edochie, Ifeanyi; Newhouse, David; Tzavidis, Nikos; Schmid, Timo; Foster, Elizabeth; Hernandez, Angela; Ouedraogo, Aissatou; Sanoh, Aly; Savadogo, Aboudrahyme
    The paper presents a methodology to generate experimental small area estimates of poverty in four West African countries: Chad, Guinea, Mali, and Niger. Due to the absence of recent census data in these countries, household-level survey data are integrated with grid-level geospatial data, which are used as covariates in model-based estimation. Leveraging geospatial data enables reporting of poverty estimates more frequently at disaggregated administrative levels and makes estimation feasible in areas for which survey data are not available. The paper leverages the availability of a recent census in Burkina Faso for evaluation purposes. Estimates obtained with the same survey instruments and candidate geospatial covariates as the other four countries are compared against estimates obtained using recent census data and an empirical best predictor under a unit-level model. For Burkina Faso, the estimates obtained using geospatial data are highly correlated with the census-based ones in sampled areas but moderately correlated in non-sampled areas. The results demonstrate that in the absence of recent census data, small area estimation with publicly available geospatial covariates is
  • Publication
    Combining Survey and Geospatial Data Can Significantly Improve Gender-Disaggregated Estimates of Labor Market Outcomes
    (World Bank, Washington, DC, 2022-06) Merfeld, Joshua D.; Newhouse, David; Weber, Michael; Lahiri, Partha
    Better understanding the geography of women’s labor market outcomes within countries is important to inform targeted efforts to increase women’s economic empowerment. This paper assesses the extent to which a method that combines simulated survey data from urban areas in Mexico with broadly available geospatial indicators from Google Earth Engine and OpenStreetMap can significantly improve estimates of labor force participation and unemployment rates. Incorporating geospatial information substantially increases the accuracy of male and female labor force participation and unemployment rates at the state level, reducing mean absolute deviation by 50 to 62 percent for labor force participation and 25 to 52 percent for unemployment. Small area estimation using a nested error conditional random effect model also greatly improves municipal estimates of labor force participation, as the mean absolute error falls by approximately half, while the mean squared error falls by almost 75 percent when holding coverage rates constant. In contrast, the results for municipal unemployment rate estimates are not reliable because values of unemployment rates are low and therefore poorly suited for linear models. The municipal results hold in repeated simulations of alternative samples. Models utilizing Basic Geo-Statistical Area (AGEB)–level auxiliary information generate more accurate predictions than area-level models specified using the same auxiliary data. Overall, integrating survey data and publicly available geospatial indicators is feasible and can greatly improve state-level estimates of male and female labor force participation and unemployment rates, as well as municipal estimates of male and female labor force participation.
  • Publication
    Small Area Estimation of Poverty and Wealth Using Geospatial Data
    (World Bank, Washington, DC, 2023-07-18) Newhouse, David
    This paper offers a nontechnical review of selected applications that combine survey and geospatial data to generate small area estimates of wealth or poverty. Publicly available data from satellites and phones predict poverty and wealth accurately across space, when evaluated against census data, and their use in model-based estimates improves the accuracy and efficiency of direct survey estimates. Although the evidence is scant, models based on interpretable features appear to predict at least as well as estimates derived from Convolutional Neural Networks. Estimates for sampled areas are significantly more accurate than those for non-sampled areas due to informative sampling. In general, estimates benefit from using geospatial data at the most disaggregated level possible. Tree-based machine learning methods appear to generate more accurate estimates than linear mixed models. Small area estimates using geospatial data can improve the design of social assistance programs, particularly when the existing targeting system is poorly designed.
  • Publication
    Small Area Estimation of Non-Monetary Poverty with Geospatial Data
    (World Bank, Washington, DC, 2020-09) Masaki, Takaaki; Newhouse, David; Silwal, Ani Rudra; Bedada, Adane; Engstrom, Ryan
    This paper uses data from Sri Lanka and Tanzania to evaluate the benefits of combining household surveys with geographically comprehensive geospatial indicators to generate small area estimates of non-monetary poverty. The preferred estimates are generated by utilizing subarea-level geospatial indicators in a household-level empirical best predictor mixed model with a normalized welfare measure. Mean squared errors are estimated using a parametric bootstrap procedure. The resulting estimates are highly correlated with non-monetary poverty calculated from the full census in both countries, and the gain in precision is comparable to increasing the size of the sample by a factor of three in Sri Lanka and five in Tanzania. The empirical best predictor model moderately underestimates uncertainty, but coverage rates are similar to standard survey-based estimates that assume independent outcomes across clusters. A variety of checks, including adding noise to the welfare measure and model-based and design-based simulations, confirm that the main results are robust. The results demonstrate that combining household survey data with subarea-level geospatial indicators can greatly increase the precision of survey estimates of non-monetary poverty at comparatively low cost.

Users also downloaded

Showing related downloaded files

  • Publication
    Lebanon Economic Monitor, Fall 2022
    (Washington, DC, 2022-11) World Bank
    The economy continues to contract, albeit at a somewhat slower pace. Public finances improved in 2021, but only because spending collapsed faster than revenue generation. Testament to the continued atrophy of Lebanon’s economy, the Lebanese Pound continues to depreciate sharply. The sharp deterioration in the currency continues to drive surging inflation, in triple digits since July 2020, impacting the poor and vulnerable the most. An unprecedented institutional vacuum will likely further delay any agreement on crisis resolution and much needed reforms; this includes prior actions as part of the April 2022 International Monetary Fund (IMF) staff-level agreement (SLA). Divergent views among key stakeholders on how to distribute the financial losses remain the main bottleneck for reaching an agreement on a comprehensive reform agenda. Lebanon needs to urgently adopt a domestic, equitable, and comprehensive solution that is predicated on: (i) addressing upfront the balance sheet impairments, (ii) restoring liquidity, and (iii) adhering to sound global practices of bail-in solutions based on a hierarchy of creditors (starting with banks’ shareholders) that protects small depositors.
  • Publication
    Argentina Country Climate and Development Report
    (World Bank, Washington, DC, 2022-11) World Bank Group
    The Argentina Country Climate and Development Report (CCDR) explores opportunities and identifies trade-offs for aligning Argentina’s growth and poverty reduction policies with its commitments on, and its ability to withstand, climate change. It assesses how the country can reduce its vulnerability to climate shocks through targeted public and private investments and adequation of social protection. The report also shows how Argentina can seize the benefits of a global decarbonization path to sustain a more robust economic growth through further development of Argentina’s potential for renewable energy, energy efficiency actions, the lithium value chain, as well as climate-smart agriculture (and land use) options. Given Argentina’s context, this CCDR focuses on win-win policies and investments, which have large co-benefits or can contribute to raising the country’s growth while helping to adapt the economy, also considering how human capital actions can accompany a just transition.
  • Publication
    Digital Africa
    (Washington, DC: World Bank, 2023-03-13) Begazo, Tania; Dutz, Mark Andrew; Blimpo, Moussa
    All African countries need better and more jobs for their growing populations. "Digital Africa: Technological Transformation for Jobs" shows that broader use of productivity-enhancing, digital technologies by enterprises and households is imperative to generate such jobs, including for lower-skilled people. At the same time, it can support not only countries’ short-term objective of postpandemic economic recovery but also their vision of economic transformation with more inclusive growth. These outcomes are not automatic, however. Mobile internet availability has increased throughout the continent in recent years, but Africa’s uptake gap is the highest in the world. Areas with at least 3G mobile internet service now cover 84 percent of Africa’s population, but only 22 percent uses such services. And the average African business lags in the use of smartphones and computers as well as more sophisticated digital technologies that catalyze further productivity gains. Two issues explain the usage gap: affordability of these new technologies and willingness to use them. For the 40 percent of Africans below the extreme poverty line, mobile data plans alone would cost one-third of their incomes—in addition to the price of access devices, apps, and electricity. Data plans for small- and medium-size businesses are also more expensive than in other regions. Moreover, shortcomings in the quality of internet services—and in the supply of attractive, skills-appropriate apps that promote entrepreneurship and raise earnings—dampen people’s willingness to use them. For those countries already using these technologies, the development payoffs are significant. New empirical studies for this report add to the rapidly growing evidence that mobile internet availability directly raises enterprise productivity, increases jobs, and reduces poverty throughout Africa. 
To realize these and other benefits more widely, Africa’s countries must implement complementary and mutually reinforcing policies to strengthen both consumers’ ability to pay and willingness to use digital technologies. These interventions must prioritize productive use to generate large numbers of inclusive jobs in a region poised to benefit from a massive, youthful workforce—one projected to become the world’s largest by the end of this century.
  • Publication
    Classroom Assessment to Support Foundational Literacy
    (Washington, DC: World Bank, 2025-03-21) Luna-Bazaldua, Diego; Levin, Victoria; Liberman, Julia; Gala, Priyal Mukesh
    This document focuses primarily on how classroom assessment activities can measure students’ literacy skills as they progress along a learning trajectory towards reading fluently and with comprehension by the end of primary school grades. The document addresses considerations regarding the design and implementation of early grade reading classroom assessment, provides examples of assessment activities from a variety of countries and contexts, and discusses the importance of incorporating classroom assessment practices into teacher training and professional development opportunities for teachers. The structure of the document is as follows. The first section presents definitions and addresses basic questions on classroom assessment. Section 2 covers the intersection between assessment and early grade reading by discussing how learning assessment can measure early grade reading skills following the reading learning trajectory. Section 3 compares some of the most common early grade literacy assessment tools with respect to the early grade reading skills and developmental phases. Section 4 of the document addresses teacher training considerations in developing, scoring, and using early grade reading assessment. Additional issues in assessing reading skills in the classroom and using assessment results to improve teaching and learning are reviewed in section 5. Throughout the document, country cases are presented to demonstrate how assessment activities can be implemented in the classroom in different contexts.
  • Publication
    World Development Report 2006
    (Washington, DC, 2005) World Bank
    This year’s World Development Report (WDR), the twenty-eighth, looks at the role of equity in the development process. It defines equity in terms of two basic principles. The first is equal opportunities: that a person’s chances in life should be determined by his or her talents and efforts, rather than by pre-determined circumstances such as race, gender, social or family background. The second principle is the avoidance of extreme deprivation in outcomes, particularly in health, education and consumption levels. This principle thus includes the objective of poverty reduction. The report’s main message is that, in the long run, the pursuit of equity and the pursuit of economic prosperity are complementary. In addition to detailed chapters exploring these and related issues, the Report contains selected data from the World Development Indicators 2005, an appendix of economic and social data for over 200 countries. This Report offers practical insights for policymakers, executives, scholars, and all those with an interest in economic development.