Publication:
Predicting Income Distributions from Almost Nothing

Published
2025-01-13
Author(s)
Mahler, Daniel Gerszon
Schoch, Marta
Lakner, Christoph
Nguyen, Minh Cong
Montes, Jose
Abstract
This paper develops a method to predict comparable income and consumption distributions for all countries in the world from a simple regression with a handful of country-level variables. To fit the model, the analysis uses more than 2,000 distributions from household surveys covering 168 countries from the World Bank’s Poverty and Inequality Platform. More than 1,000 economic, demographic, and remote sensing predictors from multiple databases are used to test the models. A model is selected that balances out-of-sample accuracy, simplicity, and the share of countries for which the method can be applied. The paper finds that a simple model relying on gross domestic product per capita, under-5 mortality rate, life expectancy, and rural population share gives almost the same accuracy as a complex machine learning model using 1,000 indicators jointly. The method allows for easy distributional analysis in countries with extreme data deprivation where survey data are unavailable or severely outdated, several of which are likely among the poorest countries in the world.
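The idea of a small regression judged on out-of-sample accuracy can be illustrated with a minimal sketch. It is not the paper's specification: the abstract does not give the functional form, the outcome variable, or the validation scheme. The data below are synthetic, the four columns of `X` only stand in for the four predictors named in the abstract, the outcome `y` is a hypothetical summary of a welfare distribution, and the helper names (`fit_ols`, `predict`, `leave_one_group_out_rmse`) are made up for illustration. Holding out one country at a time is one assumed way to mimic predicting for a country with no survey.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y = a + X @ b by least squares; return [a, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients (intercept first) to new predictor rows."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coef


def leave_one_group_out_rmse(X, y, groups):
    """Out-of-sample RMSE, holding out all rows of one group at a time."""
    errors = []
    for g in np.unique(groups):
        test = groups == g
        coef = fit_ols(X[~test], y[~test])
        errors.append(predict(coef, X[test]) - y[test])
    return float(np.sqrt(np.mean(np.concatenate(errors) ** 2)))


# Synthetic data (an assumption, not from the paper):
# 40 hypothetical countries with 5 surveys each and 4 standardized predictors.
rng = np.random.default_rng(0)
n_countries, n_surveys = 40, 5
groups = np.repeat(np.arange(n_countries), n_surveys)
X = rng.normal(size=(n_countries * n_surveys, 4))
true_coef = np.array([1.0, 0.8, -0.3, 0.2, -0.1])  # made-up values
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=0.1, size=len(X))

coef = fit_ols(X, y)
rmse = leave_one_group_out_rmse(X, y, groups)
print("coefficients:", np.round(coef, 3))
print("leave-one-country-out RMSE:", round(rmse, 3))
```

Grouping the hold-out by country rather than by survey keeps every survey from the held-out country outside the training data, which is closer to the use case the abstract describes: countries where survey data are unavailable.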
Citation
Mahler, Daniel Gerszon; Schoch, Marta; Lakner, Christoph; Nguyen, Minh Cong; Montes, Jose. 2025. Predicting Income Distributions from Almost Nothing. Policy Research Working Paper; 11034. © World Bank. http://hdl.handle.net/10986/42676 License: CC BY 3.0 IGO.