Publication: Implications of Choice of Second Stage Selection Method on Sampling Error and Non-Sampling Error: Evidence from an IDP Camp in South Sudan
Published
2024-01-25
Abstract
The most common sampling approach for cross-sectional household surveys in the developing world is a stratified two-stage design, where the first stage is usually a sample from a census-based area frame, and the second stage is a random sample of households from each of the areas selected in the first stage. To overcome the problem of outdated census frame information, it is common to conduct a household listing operation within these areas. However, these listing operations have severe implications for survey cost, timeframe, and quality. To avoid such listing operations, some surveys choose alternative approaches for their second-stage selection. This paper compares five of these approaches, namely satellite mapping, segmentation, grid square, the north method, and random walk, through simulations based on a census conducted in a refugee camp in South Sudan. The paper compares the simulation results with the estimates derived from the actual experiment and finds that all the resulting estimates are biased. Nevertheless, notwithstanding their practical challenges, the satellite mapping, segmentation, and grid square approaches exhibit the smallest bias. Although random walk shows the worst performance in the simulations, it regains ground in its implementation, especially vis-à-vis the north method, where implementation adds most significantly to its bias. In conclusion, most probability-based methods perform better than non-probability methods like random walk and are therefore preferable when no traditional household listing can take place. Although it is important to consider the theoretical properties of sampling approaches, implementation is at least as important. Training, implementation modalities, and monitoring of compliance are key factors in the overall performance.
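As a rough illustration of the simulation logic described in the abstract (repeatedly drawing samples from a known census and comparing the average estimate with the census value), the following minimal Python sketch may help. It is not the paper's code, data, or any of its five methods: the synthetic census, the link between segment size and the outcome, and the function names `make_census`, `srs_sample`, `segment_sample_unweighted`, and `simulated_bias` are all assumptions made for illustration. The second sampler is a hypothetical stand-in for any selection rule that gives households unequal selection probabilities without compensating weights.

```python
import random
import statistics


def make_census(n_segments=20, seed=0):
    """Build a synthetic census (hypothetical data, not the South Sudan census).

    Segments have sizes 5, 8, 11, ... and larger segments have a lower
    average outcome, so unequal selection probabilities matter.
    """
    rng = random.Random(seed)
    census = []
    for s in range(n_segments):
        size = 5 + 3 * s
        segment_mean = 100 - size  # assumed link between size and outcome
        census.append([segment_mean + rng.gauss(0, 5) for _ in range(size)])
    return census


def srs_sample(census, n, rng):
    """Simple random sample of households from a full listing."""
    households = [y for segment in census for y in segment]
    return rng.sample(households, n)


def segment_sample_unweighted(census, n, rng):
    """Pick a segment uniformly at random, then one household inside it.

    Households in small segments are over-represented, and no weights
    are applied to correct for that.
    """
    return [rng.choice(rng.choice(census)) for _ in range(n)]


def simulated_bias(census, sampler, n=30, reps=2000, seed=1):
    """Average sample mean over repeated draws, minus the census mean."""
    rng = random.Random(seed)
    truth = statistics.fmean([y for segment in census for y in segment])
    estimates = [statistics.fmean(sampler(census, n, rng)) for _ in range(reps)]
    return statistics.fmean(estimates) - truth


census = make_census()
print("bias, simple random sample:", simulated_bias(census, srs_sample))
print("bias, unweighted segment pick:", simulated_bias(census, segment_sample_unweighted))
```

In this toy setup the listing-based simple random sample should show a bias close to zero, while the unweighted segment pick should overestimate the mean because small, better-off segments are selected too often. The sketch assumes perfect implementation of each rule; it says nothing about the implementation effects that the abstract identifies as at least as important.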
Citation
“Himelein, Kristen; Pape, Utz; Wild, Michael. 2024. Implications of Choice of Second Stage Selection Method on Sampling Error and Non-Sampling Error: Evidence from an IDP Camp in South Sudan. Policy Research Working Paper; 10675. © World Bank. http://hdl.handle.net/10986/40968 License: CC BY 3.0 IGO.”