Publication: Agricultural Data Collection to Minimize Measurement Error and Maximize Coverage
Published
2021-07
Date
2021-08-05
Author(s)
Carletto, Calogero; Dillon, Andrew; Zezza, Alberto
Abstract
Advances in agricultural data production provide ever-increasing opportunities for pushing the research frontier in agricultural economics and designing better agricultural policy. As new technologies present opportunities to create new and integrated data sources, researchers face trade-offs in survey design that may reduce measurement error or increase coverage. This paper first reviews the econometric and survey methodology literatures that focus on the sources of measurement error and coverage bias in agricultural data collection. Second, it provides examples of how agricultural data structure affects testable empirical models. Finally, it reviews the challenges and opportunities offered by technological innovation to meet old and new data demands and address key empirical questions, focusing on the scalable data innovations of greatest potential impact for empirical methods and research.
Citation
“Carletto, Calogero; Dillon, Andrew; Zezza, Alberto. 2021. Agricultural Data Collection to Minimize Measurement Error and Maximize Coverage. Policy Research Working Paper;No. 9745. © World Bank. http://hdl.handle.net/10986/36056 License: CC BY 3.0 IGO.”
Other publications in this report series
Publication: The Economic Value of Weather Forecasts: A Quantitative Systematic Literature Review (Washington, DC: World Bank, 2025-09-10)
This study systematically reviews the literature that quantifies the economic benefits of weather observations and forecasts in four weather-dependent economic sectors: agriculture, energy, transport, and disaster-risk management. The review covers 175 peer-reviewed journal articles and 15 policy reports. Findings show that the literature is concentrated in high-income countries and most studies use theoretical models, followed by observational and then experimental research designs. Forecast horizons studied, meteorological variables and services, and monetization techniques vary markedly by sector. Estimated benefits even within specific subsectors span several orders of magnitude and broad uncertainty ranges. An econometric meta-analysis suggests that theoretical studies and studies in richer countries tend to report significantly larger values. Barriers that hinder value realization are identified on both the provider and user sides, with inadequate relevance, weak dissemination, and limited ability to act recurring across sectors. Policy reports rely heavily on back-of-the-envelope or recursive benefit-transfer estimates, rather than on the methods and results of the peer-reviewed literature, revealing a science-to-policy gap. These findings suggest substantial socioeconomic potential of hydrometeorological services around the world, but also knowledge gaps that require more valuation studies focusing on low- and middle-income countries, addressing provider- and user-side barriers and employing rigorous empirical valuation methods to complement and validate theoretical models.
Publication: The Macroeconomic Implications of Climate Change Impacts and Adaptation Options (Washington, DC: World Bank, 2025-05-29)
Estimating the macroeconomic implications of climate change impacts and adaptation options is a topic of intense research. This paper presents a framework in the World Bank's macrostructural model to assess climate-related damages. This approach has been used in many Country Climate and Development Reports, a World Bank diagnostic that identifies priorities to ensure continued development in spite of climate change and climate policy objectives. The methodology captures a set of impact channels through which climate change affects the economy by (1) connecting a set of biophysical models to the macroeconomic model and (2) exploring a set of development and climate scenarios. The paper summarizes the results for five countries, highlighting the sources and magnitudes of their vulnerability, with estimated gross domestic product losses in 2050 exceeding 10 percent of gross domestic product in some countries and scenarios, although only a small set of impact channels is included. The paper also presents estimates of the macroeconomic gains from sector-level adaptation interventions, considering their upfront costs and avoided climate impacts and finding significant net gross domestic product gains from adaptation opportunities identified in the Country Climate and Development Reports. Finally, the paper discusses the limits of current modeling approaches, and their complementarity with empirical approaches based on historical data series. The integrated modeling approach proposed in this paper can inform policymakers as they make proactive decisions on climate change adaptation and resilience.
Publication: Labor Demand in the Age of Generative AI: Early Evidence from the U.S. Job Posting Data (Washington, DC: World Bank, 2025-11-18)
This paper examines the causal impact of generative artificial intelligence on U.S. labor demand using online job posting data. Exploiting ChatGPT’s release in November 2022 as an exogenous shock, the paper applies difference-in-differences and event study designs to estimate the job displacement effects of generative artificial intelligence. The identification strategy compares labor demand for occupations with high versus low artificial intelligence substitution vulnerability following ChatGPT’s launch, conditioning on similar generative artificial intelligence exposure levels to isolate substitution effects from complementary uses. The analysis uses 285 million job postings collected by Lightcast from the first quarter of 2018 to the second quarter of 2025. The findings show that the number of postings for occupations with above-median artificial intelligence substitution scores fell by an average of 12 percent relative to those with below-median scores. The effect increased from 6 percent in the first year after the launch to 18 percent by the third year. Losses were particularly acute for entry-level positions that require neither advanced degrees (18 percent) nor extensive experience (20 percent), as well as those in administrative support (40 percent) and professional services (30 percent). Although generative artificial intelligence generates new occupations and enhances productivity, which may increase labor demand, early evidence suggests that some occupations may be less likely to be complemented by generative artificial intelligence than others.
Publication: The Lasting Effects of Working while in School (Washington, DC: World Bank, 2025-08-18)
This paper provides the first experimental evidence on the long-term effects of work-study programs, leveraging a randomized lottery design from a national program in Uruguay. Participation leads to a persistent 11 percent increase in formal labor earnings, observable seven years after the program. Effects are stronger for youth who participate during pivotal educational transitions and are larger for vulnerable youth and men, while remaining positive for women and non-vulnerable youth. The program is highly cost-effective, with average impacts exceeding those of job training programs and comparable to early childhood investments.
Publication: It’s Not (Just) the Tariffs: Rethinking Non-Tariff Measures in a Fragmented Global Economy (Washington, DC: World Bank, 2025-10-22)
As tariffs have declined, non-tariff measures (NTMs) have become central to trade policy, especially in high-income countries and regulated sectors like food and green technologies. Although NTMs may serve legitimate goals, they could also sort countries and firms into or out of markets based on compliance capacity and differences in product mix. Documenting recent advances in the estimation of ad valorem equivalents (AVEs), this paper uncovers new patterns of use and exposure of NTMs. High-income countries rely more heavily on NTMs relative to tariffs, while low- and middle-income countries face steeper AVEs on their exports. Firm-level evidence shows that NTMs disproportionately affect smaller firms, leading to market exit and concentration. Poorly designed NTMs can harm productivity and welfare, while coordinated, capacity-aware use can deliver inclusive outcomes. Policy design, transparency, and diagnostics must evolve to reflect the growing role, and risks, of NTMs in a fragmented global trade landscape.
Related items
Showing items related by metadata.
Publication: Missing(ness) in Action: Selectivity Bias in GPS-Based Land Area Measurements (World Bank, Washington, DC, 2013-06)
Land area is a fundamental component of agricultural statistics, and of analyses undertaken by agricultural economists. While household surveys in developing countries have traditionally relied on farmers' own, potentially error-prone, land area assessments, the availability of affordable and reliable Global Positioning System (GPS) units has made GPS-based area measurement a practical alternative. Nonetheless, in an attempt to reduce costs, keep interview durations within reasonable limits, and avoid the difficulty of asking respondents to accompany interviewers to distant plots, survey implementing agencies typically require interviewers to record GPS-based area measurements only for plots within a given radius of dwelling locations. It is, therefore, common for as much as a third of the sample plots not to be measured, and research has not shed light on the possible selection bias in analyses relying on partial data due to gaps in GPS-based area measures. This paper explores the patterns of missingness in GPS-based plot areas, and investigates their implications for land productivity estimates and the inverse scale-land productivity relationship. Using Multiple Imputation (MI) to predict missing GPS-based plot areas in nationally representative survey data from Uganda and Tanzania, the paper highlights the potential of MI in reliably simulating the missing data, and confirms the existence of an inverse scale-land productivity relationship, which is strengthened by using the complete, multiply-imputed dataset. The study demonstrates the usefulness of judiciously reconstructed GPS-based areas in alleviating concerns over potential measurement error in farmer-reported areas, and with regard to systematic bias in plot selection for GPS-based area measurement.
Publication: Fact or Artefact: The Impact of Measurement Errors on the Farm Size - Productivity Relationship (2011-12-01)
This paper revisits the role of land measurement error in the inverse farm size and productivity relationship. By making use of data from a nationally representative household survey from Uganda, in which self-reported land size information is complemented by plot measurements collected using Global Positioning System devices, the authors reject the hypothesis that the inverse relationship may just be a statistical artifact linked to problems with land measurement error. In particular, the paper explores: (i) the determinants of the bias in land measurement, (ii) how this bias varies systematically with plot size and landholding, and (iii) the extent to which land measurement error affects the relative advantage of smallholders implied by the inverse relationship. The findings indicate that using an improved measure of land size strengthens the evidence in support of the existence of the inverse relationship.
Publication: Recall Length and Measurement Error in Agricultural Surveys (World Bank, Washington, DC, 2020-01)
This paper assesses the relationship between the length of recall and nonrandom error in agricultural survey data. Using data from the World Bank's Living Standards Measurement Study–Integrated Surveys on Agriculture in Malawi and Tanzania, the paper shows that key input and output variables are systematically related to the length of the recall period, indicating the presence of nonrandom measurement error. With longer recall periods, farmers report greater quantities of harvest, labor, and fertilizer inputs. Farmers list fewer plots as the recall period increases. The paper argues that it is plausible that farmers overestimate plot-level outcomes, or they forget some of their more marginal plots due to longer recall periods. The analysis also finds evidence of measurement error related to the length of recall in common measures of agricultural productivity. The size of the recall effect typically varies between 2 and 5 percent per additional month of recall length, which is economically significant. With data reliability affecting policy effectiveness, improving agricultural survey data quality remains an important concern. Mainstreaming objective measures where possible and reducing the risk of recall error through shorter recall periods appear to be promising avenues to improve the quality of key variables in agricultural surveys.
Publication: Food Counts - Measuring Food Consumption and Expenditures in Household Consumption and Expenditure Surveys (Elsevier, 2017-10)
This introductory paper presents the results of an international multi-disciplinary research project on the measurement of food consumption in national household surveys. Food consumption data from household surveys are possibly the single most important source of information on poverty, food security, and nutrition outcomes at national, sub-national and household level, and contribute building blocks to global efforts to monitor progress towards the major international development goals. The paper synthesizes case studies from a diverse set of developing and OECD countries, looking at some of the main outstanding research issues as identified by a recent international assessment of 100 existing national household surveys (Smith et al., 2014). The project mobilized expertise from different disciplines (statistics, economics, food security, nutrition) to work towards enhancing our understanding of how to improve the quality and availability of food consumption and expenditure data, while making them more valuable for a diverse set of users. The individual studies summarized in this paper analyze, both theoretically and empirically, how different survey design options affect the quality of the data being collected and, in turn, the implications for statistical inference and policy analysis. The conclusions and recommendations derived from this collection of studies will be instrumental in advancing the methodological agenda for the collection of household level food data, and will provide national statistical offices and survey practitioners worldwide with practical insights for survey design, while providing poverty, food and nutrition policymakers with greater understanding of these issues, as well as improved tools for and better guidance in policy formulation.
Publication: From Guesstimates to GPStimates: Land Area Measurement and Implications for Agricultural Analysis (World Bank, Washington, DC, 2013-07)
Land area measurement is a fundamental component of agricultural statistics and analysis. Yet, commonly employed self-reported land area measures used in most analysis are not only potentially measured with error, but these errors may be correlated with agricultural outcomes. Measures employing Global Positioning Systems, on the other hand, while not perfect especially on smaller plots, are likely to provide more precise measures and errors less correlated with agricultural outcomes. This paper uses data from four African countries to compare the use of self-reported and Global Positioning Systems land measures to (1) examine the differences between the measures, (2) identify the sources of the differences, and (3) assess the implications of the different measures on agricultural analysis focusing on the inverse productivity relationship. The results indicate that self-reported land areas systematically differ from Global Positioning Systems land measures and that this difference leads to potentially biased estimates of the relationship between land and productivity.