Publication: Integrating Survey and Geospatial Data to Identify the Poor and Vulnerable: Evidence from Malawi
Published
2022-12-08
Date
2023-01-10
Abstract
Generating timely data to identify the poorest villages in developing countries remains a fundamental challenge for existing data systems. This paper investigates the accuracy of four alternative methods for predicting a measure of village economic welfare for approximately 4,500 villages in 10 poor Malawian districts: (1) proxy means test scores calculated from the 2017 social registry, (2) the Meta Relative Wealth Index, (3) predictions derived from a standard household survey and publicly available geospatial indicators, and (4) predictions derived from a two-step approach that first predicts welfare into a hypothetical partial registry of approximately 450 villages, and then predicts welfare into the remaining villages using geospatial indicators. Geospatial indicators include land coverage indicators, weather data, night light data, building patterns, distance to major roads, and population density. Predictions are evaluated against a benchmark village welfare measure, constructed by imputing log per capita consumption from the 2016 integrated household survey into the 2018 household census using gradient boosting. Incorporating the hypothetical partial registry vastly improves the performance of the predictions. When using the partial registry, the rank correlation between the predicted and benchmark welfare measures is 0.75, while those for the other three methods range from -0.02 to 0.2, and similar results are seen when examining the area under the curve. Doubling the size of the partial registry does little to improve predictive performance. The results are robust to using a linear post–Least Absolute Shrinkage and Selection Operator (LASSO) model instead of gradient boosting for prediction. However, predictions using both methods are less accurate when the benchmark welfare measure is derived from a linear post-LASSO model. Overall, the results strongly suggest that collecting partial registries of household-level poverty predictors in low-income contexts can vastly improve the performance of machine learning models that combine survey and satellite imagery for the purpose of village-level targeting.
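The abstract scores each method by the rank correlation between predicted and benchmark village welfare and by the area under the curve. The following is a minimal sketch of how those two metrics could be computed, not the authors' code. The function names, the ten illustrative welfare values, and the choice to label the poorest 20 percent of villages as "poor" for the AUC are all assumptions made for illustration; the abstract does not state the threshold used.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(predicted, benchmark):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(benchmark))


def auc_poorest(predicted, benchmark, share=0.2):
    """AUC for flagging the poorest `share` of villages (assumed threshold).

    Villages at or below the benchmark cutoff are labeled poor. Predicted
    welfare is negated so that a higher score means "more likely poor".
    The AUC is computed from the rank sum of the poor villages' scores.
    """
    n_poor = max(1, round(share * len(benchmark)))
    cutoff = sorted(benchmark)[n_poor - 1]
    is_poor = [b <= cutoff for b in benchmark]
    score_ranks = average_ranks([-p for p in predicted])
    n_pos = sum(is_poor)
    n_neg = len(is_poor) - n_pos
    rank_sum = sum(r for r, poor in zip(score_ranks, is_poor) if poor)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical village-level log per capita consumption (illustration only).
benchmark = [7.1, 6.4, 6.9, 5.8, 6.1, 7.4, 6.6, 5.9, 6.3, 7.0]
predicted = [6.8, 6.5, 7.0, 6.0, 5.9, 7.2, 6.4, 6.2, 6.6, 6.9]

print(f"rank correlation: {spearman(predicted, benchmark):.2f}")
print(f"AUC (poorest 20%): {auc_poorest(predicted, benchmark):.2f}")
```

Both metrics depend only on how villages are ordered, not on the predicted levels. That suits village-level targeting, where the goal is to rank villages from poorest to richest.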
Citation
“Gualavisi, Melany; Newhouse, David Locke. 2022. Integrating Survey and Geospatial Data to Identify the Poor and Vulnerable: Evidence from Malawi. © World Bank. http://hdl.handle.net/10986/38442 License: CC BY 3.0 IGO.”