Publication:
When Aggregation Misleads: Bias in Unit-Level Small Area Estimates of Poverty with Aggregate Data

Files in English
English PDF (1 MB)
English Text (49.31 KB)
Published
2025-05-01
Author(s)
Corral, Paul
Abstract
This paper explores why small area poverty estimates from household-level models that use only aggregate data as covariates exhibit systematic bias. The analysis demonstrates that this bias stems from the models’ inability to capture the complete between-household variation in welfare, as they rely solely on covariates aggregated at geographic levels. Through model-based simulations, the paper shows that the bias in these models is minimized when the empirical variability of simulated welfare based on the model is closest to the true empirical variance of welfare at the area level. This finding also has implications for bias in unit-level models.
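The variance mechanism described in the abstract can be illustrated with a small stylized calculation. The sketch below is not the paper's simulation design. It assumes log welfare is normal within each area, that every area has the same mean and the same size, and that a model with only area-level covariates assigns every area one pooled residual variance. The function name, parameter values, and poverty line are all hypothetical.

```python
import math


def normal_cdf(v):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def poverty_bias(within_sd_x, sigma_e=0.5, area_mean=0.0, poverty_line=-0.5):
    """Compare true and model-implied poverty rates, area by area.

    Assumed (hypothetical) setup: log welfare = household covariate + noise.
    The household covariate has a different within-area spread in each area,
    so the true within-area variance of welfare differs across areas.
    A model that only sees area aggregates cannot explain that spread and
    applies one pooled residual variance to every area (equal area sizes
    are assumed, so the pooled variance is a simple average).
    """
    true_var = [s ** 2 + sigma_e ** 2 for s in within_sd_x]
    pooled_var = sum(true_var) / len(true_var)
    rows = []
    for tv in true_var:
        true_rate = normal_cdf((poverty_line - area_mean) / math.sqrt(tv))
        model_rate = normal_cdf((poverty_line - area_mean) / math.sqrt(pooled_var))
        rows.append({
            "variance_gap": pooled_var - tv,   # model variance minus true variance
            "true_rate": true_rate,
            "model_rate": model_rate,
            "bias": model_rate - true_rate,    # bias in the poverty rate
        })
    return rows


# Four hypothetical areas with increasing within-area covariate spread.
for row in poverty_bias([0.2, 0.5, 0.8, 1.1]):
    print({k: round(v, 4) for k, v in row.items()})
```

In this toy setting the area whose true welfare variance is closest to the variance implied by the model also shows the smallest absolute bias in its poverty rate, which is consistent with the pattern the abstract reports. The sign of the bias depends on whether the poverty line sits below or above the area mean.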
Citation
Corral, Paul. 2025. When Aggregation Misleads: Bias in Unit-Level Small Area Estimates of Poverty with Aggregate Data. Policy Research Working Paper; 11110. © World Bank. http://hdl.handle.net/10986/43150 License: CC BY 3.0 IGO.
Report Series
Other publications in this report series
  • Publication
    The Economic Value of Weather Forecasts: A Quantitative Systematic Literature Review
    (Washington, DC: World Bank, 2025-09-10) Farkas, Hannah; Linsenmeier, Manuel; Talevi, Marta; Avner, Paolo; Jafino, Bramka Arga; Sidibe, Moussa
    This study systematically reviews the literature that quantifies the economic benefits of weather observations and forecasts in four weather-dependent economic sectors: agriculture, energy, transport, and disaster-risk management. The review covers 175 peer-reviewed journal articles and 15 policy reports. Findings show that the literature is concentrated in high-income countries and most studies use theoretical models, followed by observational and then experimental research designs. Forecast horizons studied, meteorological variables and services, and monetization techniques vary markedly by sector. Estimated benefits even within specific subsectors span several orders of magnitude and broad uncertainty ranges. An econometric meta-analysis suggests that theoretical studies and studies in richer countries tend to report significantly larger values. Barriers that hinder value realization are identified on both the provider and user sides, with inadequate relevance, weak dissemination, and limited ability to act recurring across sectors. Policy reports rely heavily on back-of-the-envelope or recursive benefit-transfer estimates, rather than on the methods and results of the peer-reviewed literature, revealing a science-to-policy gap. These findings suggest substantial socioeconomic potential of hydrometeorological services around the world, but also knowledge gaps that require more valuation studies focusing on low- and middle-income countries, addressing provider- and user-side barriers and employing rigorous empirical valuation methods to complement and validate theoretical models.
  • Publication
    The Macroeconomic Implications of Climate Change Impacts and Adaptation Options
    (Washington, DC: World Bank, 2025-05-29) Abalo, Kodzovi; Boehlert, Brent; Bui, Thanh; Burns, Andrew; Castillo, Diego; Chewpreecha, Unnada; Haider, Alexander; Hallegatte, Stephane; Jooste, Charl; McIsaac, Florent; Ruberl, Heather; Smet, Kim; Strzepek, Ken
    Estimating the macroeconomic implications of climate change impacts and adaptation options is a topic of intense research. This paper presents a framework in the World Bank's macrostructural model to assess climate-related damages. This approach has been used in many Country Climate and Development Reports, a World Bank diagnostic that identifies priorities to ensure continued development in spite of climate change and climate policy objectives. The methodology captures a set of impact channels through which climate change affects the economy by (1) connecting a set of biophysical models to the macroeconomic model and (2) exploring a set of development and climate scenarios. The paper summarizes the results for five countries, highlighting the sources and magnitudes of their vulnerability, with estimated gross domestic product losses in 2050 exceeding 10 percent of gross domestic product in some countries and scenarios, although only a small set of impact channels is included. The paper also presents estimates of the macroeconomic gains from sector-level adaptation interventions, considering their upfront costs and avoided climate impacts and finding significant net gross domestic product gains from adaptation opportunities identified in the Country Climate and Development Reports. Finally, the paper discusses the limits of current modeling approaches, and their complementarity with empirical approaches based on historical data series. The integrated modeling approach proposed in this paper can inform policymakers as they make proactive decisions on climate change adaptation and resilience.
  • Publication
    Rigging the Scores: Corruption through Scoring Rule Manipulation in Public Procurement Auctions
    (Washington, DC: World Bank, 2025-12-02) Chen, Qianmiao
    Public procurement is highly susceptible to corruption, especially in developing countries. Although open auctions are widely adopted to curb it, this paper finds that corruption remains prevalent even within this procurement format. Procurement officers can collaborate with firms to manipulate scoring rules, ensuring predetermined winners, while corrupt firms submit noncompetitive bids to meet minimum bidder requirements. Using extensive data from Chinese public procurement auctions, the paper introduces model-driven statistical tools to detect such corruption, identifying a corruption rate of 65 percent. A procurement expert audit survey confirms the tools’ reliability, with a 91 percent probability that experts recognize suspicious scoring rules when flagged. Firm-level analysis reveals that local, state-owned, and less productive firms are favored in corrupt auctions. Lastly, the paper explores policy implications. Analysis of the national anti-corruption campaign since 2012 suggests that general investigations may be insufficient to address deeply ingrained corrupt practices. Using counterfactuals based on an estimated structural model, the paper shows that implementing anonymous call-for-tender evaluations could improve social welfare by 10 percent by eliminating suspicious rules and encouraging broader participation.
  • Publication
    Labor Demand in the Age of Generative AI: Early Evidence from the U.S. Job Posting Data
    (Washington, DC: World Bank, 2025-11-18) Liu, Yan; Wang, He; Yu, Shu
    This paper examines the causal impact of generative artificial intelligence on U.S. labor demand using online job posting data. Exploiting ChatGPT’s release in November 2022 as an exogenous shock, the paper applies difference-in-differences and event study designs to estimate the job displacement effects of generative artificial intelligence. The identification strategy compares labor demand for occupations with high versus low artificial intelligence substitution vulnerability following ChatGPT’s launch, conditioning on similar generative artificial intelligence exposure levels to isolate substitution effects from complementary uses. The analysis uses 285 million job postings collected by Lightcast from the first quarter of 2018 to the second quarter of 2025. The findings show that the number of postings for occupations with above-median artificial intelligence substitution scores fell by an average of 12 percent relative to those with below-median scores. The effect increased from 6 percent in the first year after the launch to 18 percent by the third year. Losses were particularly acute for entry-level positions that require neither advanced degrees (18 percent) nor extensive experience (20 percent), as well as those in administrative support (40 percent) and professional services (30 percent). Although generative artificial intelligence generates new occupations and enhances productivity, which may increase labor demand, early evidence suggests that some occupations may be less likely to be complemented by generative artificial intelligence than others.
  • Publication
    Investment Policy Reforms and Foreign Direct Investment Inflows
    (Washington, DC: World Bank, 2025-12-01) Fwaga, Sammy; Chakrapani, Deepa; Abebe, Girum
    Foreign direct investment has the potential to introduce much-needed capital and expertise in emerging and developing economies. To attract foreign direct investment, many countries have eased restrictions on foreign ownership in various sectors, reformed their institutions, and set up investment promotion agencies. Until the mid-2010s, Ethiopia remained one of the few countries that resisted this trend, with several stringent restrictions in place on foreign direct investment entry and operations in the country. This study employs a synthetic control method to examine patterns in foreign capital inflows following a series of investment policy reforms that were substantively introduced in the mid-2010s (circa 2015). The study offers evidence that investment policy reforms contributed to a significant foreign direct investment inflow in Ethiopia, compared to what would have occurred in the absence of these policies. An alternative strategy that conservatively specifies the donor country pool using an AI-assisted deep search technique changes the donor pool weighting matrix of the synthetic control method, but the estimated policy effects largely remain robust to this specification. The findings highlight the importance of targeted reforms in promoting foreign direct investment inflow in developing countries.

Related items

Showing items related by metadata.

  • Publication
    Poverty Mapping in the Age of Machine Learning
    (World Bank, Washington, DC, 2023-05-04) Henderson, Heath; Corral, Paul; Segovia, Sandra
    Recent years have witnessed considerable methodological advances in poverty mapping, much of which has focused on the application of modern machine-learning approaches to remotely sensed data. Poverty maps produced with these methods generally share a common validation procedure, which assesses model performance by comparing subnational machine-learning-based poverty estimates with survey-based, direct estimates. Although unbiased, survey-based estimates at a granular level can be imprecise measures of true poverty rates, meaning that it is unclear whether the validation procedures used in machine-learning approaches are informative of actual model performance. This paper examines the credibility of existing approaches to model validation by constructing a pseudo-census from the Mexican Intercensal Survey of 2015, which is used to conduct several design-based simulation experiments. The findings show that the validation procedure often used for machine-learning approaches can be misleading in terms of model assessment since it yields incorrect information for choosing what may be the best set of estimates across different methods and scenarios. Using alternative validation methods, the paper shows that machine-learning-based estimates can rival traditional, more data intensive poverty mapping approaches. Further, the closest approximation to existing machine-learning approaches, using publicly available geo-referenced data, performs poorly when evaluated against “true” poverty rates and fails to outperform traditional poverty mapping methods in targeting simulations.
  • Publication
    Migration, Remittances and Forests : Disentangling the Impact of Population and Economic Growth on Forests
    (2011-12-01) Bhattarai, Keshav; Tiwari, Sailesh
    International migration has increased rapidly in recent decades and this has been accompanied by a remarkable increase in transfers made by migrants to their home countries. This paper investigates the effect of the rural economic growth brought about by migration and remittances on Nepal's Himalayan forests. The authors assemble a unique village-panel dataset combining remote sensing data on land use and forest cover change with data from the census and multiple rounds of living standards surveys to test various inter-relationships between population, economic growth and forests. The results suggest that rural economic growth spurred by remittances has had an overall positive impact on forests. The paper also finds that remittances caused an increase in rural wages and an increase in income, but a decrease in land prices. Considered together, however, the relationship between forests and remittances is driven largely through the income channel, indicating that the demand for amenities provided by forests in the rural Nepali setting may have been more important than factor prices in influencing land use changes for the period of the study.
  • Publication
    Estimating Small Area Population Density Using Survey Data and Satellite Imagery
    (World Bank, Washington, DC, 2019-03) Engstrom, Ryan; Newhouse, David; Soundararajan, Vidhya
    Country-level census data are typically collected once every 10 years. However, conflict, migration, urbanization, and natural disasters can cause rapid shifts in local population patterns. This study uses Sri Lankan data to demonstrate the feasibility of a bottom-up method that combines household survey data with contemporaneous satellite imagery to track frequent changes in local population density. A Poisson regression model based on indicators derived from satellite data, selected using the least absolute shrinkage and selection operator, accurately predicts village-level population density. The model is estimated in villages sampled in the 2012/13 Household Income and Expenditure Survey to obtain out-of-sample density predictions in the nonsurveyed villages. The predictions approximate the 2012 census density well and are more accurate than other bottom-up studies based on lower-resolution satellite data. The predictions are also more accurate than most publicly available population products, which rely on areal interpolation of census data to redistribute population at the local level. The accuracies are similar when estimated using a random forest model, and when density estimates are expressed in terms of population counts. The collective evidence suggests that combining surveys with satellite data is a cost-effective method to track local population changes at more frequent intervals.
  • Publication
    Guidelines to Small Area Estimation for Poverty Mapping
    (Washington, DC : World Bank, 2022-06-16) Corral, Paul; Cojocaru, Alexandru; Segovia, Sandra; Molina, Isabel
    The eradication of poverty, which was the first of the millennium development goals (MDG) established by the United Nations and followed by the sustainable development goals (SDG), requires knowing where the poor are located. Traditionally, household surveys are considered the best source of information on the living standards of a country’s population. Data from these surveys typically provide a sufficiently accurate direct estimate of household expenditures or income and thus estimates of poverty at the national level and larger international regions. However, when one starts to disaggregate data by local areas or population subgroups, the quality of these direct estimates diminishes. Consequently, national statistical offices (NSOs) cannot provide reliable wellbeing statistical figures at a local level. For example, the module of socioeconomic conditions of the Mexican national survey of household income and expenditure (ENIGH) is designed to produce estimates of poverty and inequality at the national level and for the 32 federate entities (31 states and Mexico City) with disaggregation by rural and urban zones, every two years, but there is a mandate to produce estimates by municipality every five years, and the ENIGH alone cannot provide estimates for all municipalities with adequate precision. This makes monitoring progress toward the sustainable development goals more difficult.
  • Publication
    Estimating Small Area Poverty and Welfare Indicators in Timor-Leste Using Satellite Imagery Data
    (World Bank, Washington, DC, 2020-09-28) Purnamasari, Ririn; Wirapati, Bagus Arya; Alatas, Hamidah; Nasiir, Mercoledi
    This report is structured as follows: an in-depth explanation of the FHSAE method is presented in section two. Section three reviews the sub-district level data used in this study, which include imprecise TL-SLS and DHS direct estimates as well as satellite imagery data. The variable selection method used for the FHSAE model in this study is explained in section four. Section five provides the results of the FHSAE exercise on poverty estimates, average real per capita consumption, and the welfare index, presenting them as maps. Section six concludes.

Users also downloaded

Showing related downloaded files

  • Publication
    Business Ready 2024
    (Washington, DC: World Bank, 2024-10-03) World Bank
    Business Ready (B-READY) is a new World Bank Group corporate flagship report that evaluates the business and investment climate worldwide. It replaces and improves upon the Doing Business project. B-READY provides a comprehensive data set and description of the factors that strengthen the private sector, not only by advancing the interests of individual firms but also by elevating the interests of workers, consumers, potential new enterprises, and the natural environment. This 2024 report introduces a new analytical framework that benchmarks economies based on three pillars: Regulatory Framework, Public Services, and Operational Efficiency. The analysis centers on 10 topics essential for private sector development that correspond to various stages of the life cycle of a firm. The report also offers insights into three cross-cutting themes that are relevant for modern economies: digital adoption, environmental sustainability, and gender. B-READY draws on a robust data collection process that includes specially tailored expert questionnaires and firm-level surveys. The 2024 report, which covers 50 economies, serves as the first in a series that will expand in geographical coverage and refine its methodology over time, supporting reform advocacy, policy guidance, and further analysis and research.
  • Publication
    World Bank East Asia and Pacific Economic Update, October 2025: Jobs
    (Washington, DC: World Bank, 2025-10-07) World Bank
    GDP growth in the East Asia and Pacific (EAP) region remains above the global average but is projected to slow down in 2025 and even further in 2026. The sluggishness is due to a less favorable external environment—rising trade restrictions, easing but still elevated global uncertainty, and slowing global growth—as well as persistent domestic difficulties. Today, many people are in low-productivity or informal jobs, and many of the young cannot find any jobs. The class of people vulnerable to falling into poverty is now larger than the middle class in most countries. In a region that thrived because export-oriented, labor-intensive growth created more productive jobs, firms must deal with higher tariffs and workers must contend with the growing use of robots, AI and digital platforms. More productive jobs would be created by reforms to enhance economic opportunity, human capacity and their virtuous interplay.
  • Publication
    Taking Stock, September 2025: Special Focus : Nurturing Viet Nam’s High-tech Talents
    (Washington, DC: World Bank, 2025-09-04) World Bank
    Following strong momentum in the first half of 2025, driven by front-loaded exports, the Vietnamese economy is expected to moderate over the remainder of the year as export growth normalizes, with a forecast real GDP growth of 6.6 percent in 2025. As an export-oriented economy, Viet Nam remains vulnerable to slower global growth and softening demand from major trading partners. Trade-policy uncertainty may also begin to weigh on business and consumer confidence. Over the medium term, growth is projected to ease to 6.1 percent in 2026 before rebounding to 6.5 percent in 2027, supported by a recovery in global trade and Viet Nam’s continued appeal as a competitive manufacturing base. To support growth and hedge against external uncertainty, the report recommends a focus on scaling up public investment, mitigating financial-sector risks, and advancing structural reforms. The special focus of this edition, titled “Nurturing Viet Nam’s High Tech Talents,” highlights the need to build a skilled talent base that can support and accelerate the country’s innovation ecosystem. Achieving Viet Nam’s high tech ambitions and its goal of high income status by 2045 will require not only a broad and growing pipeline of young STEM graduates, but also a stronger core of experts who lead research, run laboratories, and turn ideas into market-ready products. The report highlights the potential to raise public and private R&D spending in Viet Nam, complementing broader business enabling reforms. Total R&D spending in Viet Nam remains lower than more developed regional peers. There is scope to increase PhD-level faculty to grow the pipeline of advanced-degree graduates and high-caliber researchers. Strengthening university–industry-government linkages could catalyze the development of a work-ready workforce and promote technology transfer and knowledge spillovers.
  • Publication
    Forging Viet Nam's Semiconductor Future: Talent and Innovation Leading the Way
    (Washington, DC: World Bank, 2025-09-10) World Bank
    Vietnam has prioritized semiconductors as one of ten critical technologies and set explicit goals of becoming a global semiconductor talent hub by 2030, moving up the value chain toward higher value-added segments, and developing a complete semiconductor value chain by 2045. Delivering on that pledge hinges on strategic and urgent investment in top talent, science, and innovation. This report concentrates on the most urgent gap: cultivating a cadre of scientists and engineers (S&E) and, to the extent possible, tech entrepreneurs, to support Vietnam’s semiconductor and high-tech ambitions. The analysis and interventions focus on the highly skilled workforce and frontier talent development, and cover research, innovation, and university-industry linkages in relation to the talent challenges. The report recognizes that to unlock its semiconductor potential, Viet Nam must address other binding constraints, notably: weak linkages between FDI and domestic firms, emerging infrastructure gaps (e.g., energy and logistics), and intellectual property rights regulations, amid evolving geoeconomic conditions.
  • Publication
    Global Economic Prospects, June 2025
    (Washington, DC: World Bank, 2025-06-10) World Bank
    The global economy is facing another substantial headwind, emanating largely from an increase in trade tensions and heightened global policy uncertainty. For emerging market and developing economies (EMDEs), the ability to boost job creation and reduce extreme poverty has declined. Key downside risks include a further escalation of trade barriers and continued policy uncertainty. These challenges are exacerbated by subdued foreign direct investment into EMDEs. Global cooperation is needed to restore a more stable international trade environment and scale up support for vulnerable countries grappling with conflict, debt burdens, and climate change. Domestic policy action is also critical to contain inflation risks and strengthen fiscal resilience. To accelerate job creation and long-term growth, structural reforms must focus on raising institutional quality, attracting private investment, and strengthening human capital and labor markets. Countries in fragile and conflict situations face daunting development challenges that will require tailored domestic policy reforms and well-coordinated multilateral support.