Publication: Poverty Mapping in the Age of Machine Learning
Published
2023-05-04
Abstract
Recent years have witnessed considerable methodological advances in poverty mapping, many of which have focused on the application of modern machine-learning approaches to remotely sensed data. Poverty maps produced with these methods generally share a common validation procedure, which assesses model performance by comparing subnational machine-learning-based poverty estimates with survey-based, direct estimates. Although unbiased, survey-based estimates at a granular level can be imprecise measures of true poverty rates, meaning that it is unclear whether the validation procedures used in machine-learning approaches are informative of actual model performance. This paper examines the credibility of existing approaches to model validation by constructing a pseudo-census from the Mexican Intercensal Survey of 2015, which is used to conduct several design-based simulation experiments. The findings show that the validation procedure often used for machine-learning approaches can be misleading in terms of model assessment since it yields incorrect information for choosing what may be the best set of estimates across different methods and scenarios. Using alternative validation methods, the paper shows that machine-learning-based estimates can rival traditional, more data-intensive poverty mapping approaches. Further, the closest approximation to existing machine-learning approaches, using publicly available geo-referenced data, performs poorly when evaluated against “true” poverty rates and fails to outperform traditional poverty mapping methods in targeting simulations.
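The validation concern described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's pseudo-census experiment. It assumes hypothetical area-level poverty rates, small simple random samples per area, and two invented estimators: `model_a`, which stays close to the true rates, and `model_b`, a deliberately extreme estimator that simply reproduces the noisy direct estimates. All names, sample sizes, and noise levels are assumptions chosen for illustration.

```python
import random


def mse(estimates, benchmark):
    """Mean squared error of area-level estimates against a benchmark."""
    return sum((e - b) ** 2 for e, b in zip(estimates, benchmark)) / len(estimates)


def simulate(n_areas=200, sample_size=20, seed=0):
    """Toy setup; every parameter here is an illustrative assumption."""
    rng = random.Random(seed)
    # Hypothetical "true" poverty rates, as a full census would reveal them.
    true_rates = [rng.uniform(0.1, 0.7) for _ in range(n_areas)]
    # Direct estimates: share of poor households in a small sample per area.
    # Unbiased for the true rate, but imprecise when sample_size is small.
    direct = [
        sum(rng.random() < p for _ in range(sample_size)) / sample_size
        for p in true_rates
    ]
    # model_a: hypothetical estimator close to the truth (small Gaussian error).
    model_a = [min(1.0, max(0.0, p + rng.gauss(0, 0.03))) for p in true_rates]
    # model_b: hypothetical estimator that reproduces the survey noise exactly.
    model_b = list(direct)
    return true_rates, direct, model_a, model_b


true_rates, direct, model_a, model_b = simulate()

# Validation against the noisy direct estimates versus against the truth.
err_direct = {"a": mse(model_a, direct), "b": mse(model_b, direct)}
err_true = {"a": mse(model_a, true_rates), "b": mse(model_b, true_rates)}

print("MSE vs direct estimates:", err_direct)
print("MSE vs true rates:      ", err_true)
```

In this contrived setup, validating against direct estimates favors `model_b`, while validating against the true rates favors `model_a`. The measured error of `model_a` is also inflated by the sampling noise in the benchmark. Real validation exercises are more nuanced, for example because they use held-out areas, so this only shows the direction of the problem.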
Citation
“Henderson, Heath; Corral, Paul; Segovia, Sandra. 2023. Poverty Mapping in the Age of Machine Learning. Policy Research Working Papers; 10429. © World Bank. http://hdl.handle.net/10986/39783 License: CC BY 3.0 IGO.”
Related items
Publication: Small Area Estimation of Monetary Poverty in Mexico Using Satellite Imagery and Machine Learning (World Bank, Washington, DC, 2022-09)
Estimates of poverty are an important input into policy formulation in developing countries. The accurate measurement of poverty rates is therefore a first-order problem for development policy. This paper shows that combining satellite imagery with household surveys can improve the precision and accuracy of estimated poverty rates in Mexican municipalities, a level at which the survey is not considered representative. It also shows that a household-level model outperforms other common small area estimation methods. However, poverty estimates in 2015 derived from geospatial data remain less accurate than 2010 estimates derived from household census data. These results indicate that the incorporation of household survey data and widely available satellite imagery can improve on existing poverty estimates in developing countries when census data are old or when patterns of poverty are changing rapidly, even for small subgroups.

Publication: Identifying Urban Areas by Combining Human Judgment and Machine Learning (World Bank, Washington, DC, 2020-02)
This paper proposes a methodology for identifying urban areas that combines subjective assessments with machine learning, and applies it to India, a country where several studies see the official urbanization rate as an under-estimate. For a representative sample of cities, towns and villages, as administratively defined, human judgment of Google images is used to determine whether they are urban or rural in practice. Judgments are collected across four groups of assessors, differing in their familiarity with India and with urban issues, following two different protocols. The judgment-based classification is then combined with data from the population census and from satellite imagery to predict the urban status of the sample. The Logit model, and LASSO and random forests methods, are applied. These approaches are then used to decide whether each of the out-of-sample administrative units in India is urban or rural in practice. The analysis does not find that India is substantially more urban than officially claimed. However, there are important differences at more disaggregated levels, with “other towns” and “census towns” being more rural, and some southern states more urban, than is officially claimed. The consistency of human judgment across assessors and protocols, the easy availability of crowd-sourcing, and the stability of predictions across approaches, suggest that the proposed methodology is a promising avenue for studying urban issues.

Publication: When Aggregation Misleads (Washington, DC: World Bank, 2025-05-01)
This paper explores why small area poverty estimates from models at the household level that only use aggregate data as covariates, exhibit systematic bias. The analysis demonstrates that this bias stems from the model’s inability to capture the complete between-household variation in welfare, as they rely solely on covariates aggregated at geographic levels. Through model-based simulations, the paper shows that the bias in these models is minimized when the empirical variability of simulated welfare based on the model is closest to the true empirical variance of welfare at the area level. This finding also has implications for bias in unit-level models.

Publication: Using Machine Learning to Assess Yield Impacts of Crop Rotation (World Bank, Washington, DC, 2020-06)
To overcome the constraints for policy and practice posed by limited availability of data on crop rotation, this paper applies machine learning to freely available satellite imagery to identify the rotational practices of more than 7,000 villages in Ukraine. Rotation effects estimated based on combining these data with survey-based yield information point toward statistically significant and economically meaningful effects that differ from what has been reported in the literature, highlighting the value of this approach. Independently derived indices of vegetative development and soil water content produce similar results, not only supporting the robustness of the results, but also suggesting that the opportunities for spatial and temporal disaggregation inherent in such data offer tremendous unexploited opportunities for policy-relevant analysis.

Publication: Tracking Advances in Access to Electricity Using Satellite-Based Data and Machine Learning to Complement Surveys (World Bank, Washington, DC, 2021-04-15)
Access to electricity is widely considered a major determinant of socioeconomic development. But despite long-standing efforts to expand access, 789 million people remained without electricity in 2018. Accurate and reliable data to keep track of electrification efforts must be the first step toward achieving universal access. Monitoring access with the finest granularity and taking into account local socioeconomic characteristics enable a realistic depiction of electrification progress. Such data can be used to plan efficient and robust energy access policies and programs, to raise public awareness of the urgency of action, to sustain the pace of electrification, and ultimately to connect the hardest-to-reach populations. In addition to identifying where efforts should be targeted, high-resolution data are needed to show which electricity supply options are most relevant. Remote sensing techniques and geographic information systems have revolutionized data collection by providing a range of location-specific information that was not previously accessible. The use of standardized geospatial tools and methods has made it possible to offer countries technical assistance and operational support for the development of national electrification strategies, least-cost electrification plans, and country-based investment prospectuses that combine grid, mini-grid, and off-grid technologies.