Agricultural Data Collection to Minimize Measurement Error and Maximize Coverage

Advances in agricultural data production provide ever-increasing opportunities for pushing the research frontier in agricultural economics and designing better agricultural policy. As new technologies present opportunities to create new and integrated data sources, researchers face tradeoffs in survey design that may reduce measurement error or increase coverage. In this chapter, we first review the econometric and survey methodology literatures that focus on the sources of measurement error and coverage bias in agricultural data collection. Second, we provide examples of how agricultural data structure affects testable empirical models. Finally, we review the challenges and opportunities offered by technological innovation to meet old and new data demands and address key empirical questions, focusing on the scalable data innovations of greatest potential impact for empirical methods and research.


Introduction
In the past two decades, innovations in data systems have led to the production of more real-time, disaggregated, and interoperable data on agriculture than ever before. Increasing data demands and emerging policy questions are driving much of this innovation, with fast technological change and methodological advances providing an opportunity to collect more and better data at lower costs (Akogun et al., 2020; Carletto et al., 2015; Dillon et al., 2021a; Kosmowski et al., 2019; Liao, 2018; Lobell et al., 2019). Investments in country-level data infrastructure have enabled new approaches to methodological innovation, such as incorporating randomized control trials into national panel data collection or devising improved methods to ensure greater data interoperability.
Meanwhile, new types of data, such as remote sensing data and citizen-generated data, and new technologies, such as portable sensors, DNA fingerprinting, and computer-assisted personal interviewing (CAPI), provide unparalleled prospects for collecting and analyzing a wide array of agricultural constructs in a more granular, timely, and cost-effective manner. These advantages are further enhanced by integrating new types of data with traditional data sources such as household surveys, censuses, and administrative data.
While other data sources are becoming increasingly important, household and farm surveys are likely to remain the centerpiece of policy research for agricultural and development economists.
Not only are household surveys a key data source in their own right, but they serve as interoperable complements and validation instruments for other data sources, such as for the ground-truthing of remote sensing data, or for the ex-post adjustment of bias in studies based on citizen-generated and other non-probability data. Emerging literature on a wide array of agricultural measurement issues in land, production, and gender analysis has relied upon innovations in survey design, as fostered in the past decade through data initiatives like the Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) and the Global Strategy to Improve Agricultural and Rural Statistics (GSARS).
The influential publication on household survey data collection by Grosh and Glewwe (2000), and in particular the chapter by Reardon and Glewwe (2000) on agriculture, together with other chapters on consumption, income, and enterprises, provided an original contribution to the field of survey measurement issues that remains relevant to this day, as does the influential work by Sudman and Bradburn (1974) on response effects in the United States. However, significant innovations in methodological development for household surveys have taken place in recent years, including on the collection of agricultural data in multi-purpose surveys. Agricultural survey design continues to evolve through important innovations such as scaling up the collection of plot-level data in low-income countries, gender-disaggregated agricultural data 1, agricultural panel surveys, and the collection of national agricultural household and enterprise data 2, inter alia.
While the importance of household and farm surveys within national agricultural data systems is indisputable, it is equally important to recognize their limitations in addressing new data challenges. For instance, household and farm surveys may be ill-suited to capture the evolving value chains of rapidly transforming agri-food systems (Barrett et al., forthcoming). Surveys seldom collect sufficient data on contracting and on the different agents involved in transactions with the household and, when they do, they tend to be case studies focused on a few commodities in limited geographies or be qualitative in nature 3 (Barrett et al., 2020; Minten et al., 2016).
Furthermore, surveys often lack sufficient spatial and temporal resolution and are unable to provide the real-time data needed by policy makers, being limited by cost and sample size considerations. In higher income countries, remote sensing has been widely used for decades as a complement to ground-based measures for an array of applications, including sample frame construction, crop area and land use estimation, crop conditions assessment, climate data, and production forecasting (Hale et al., 1999). In recent years, the use of Earth Observation data for agricultural applications, combined and validated with ground-based measurements, has been spreading rapidly in low- and middle-income countries, yielding promise for more accurate and timely agricultural data in these contexts (Gourlay et al., 2019).
1 See Doss and Quisumbing (this volume citation) for an extensive review of gender-disaggregated data.
2 Agricultural sector censuses such as FAO's World Programme for the Census of Agriculture include agricultural households and agricultural enterprises. For a recent review of this program, see WCA (2020).
3 An exception in national surveys is the collection of network data in a few LSMS-ISA surveys, where information is collected from respondents on agents involved in the transaction of agricultural inputs and outputs.
Unfortunately, despite impressive progress in both traditional and new data sources, large gaps still persist in terms of the availability and quality of agricultural data. Furthermore, mounting global challenges such as rising inequality, climate change, and rapid population growth are likely to disproportionately affect the agriculture sector and rural areas, with more significant impacts for low- and middle-income countries. Meanwhile, the ongoing COVID-19 pandemic provided a stark reminder of the need to accelerate the production of more timely and accurate data to save lives. The pandemic has also exposed growing inequities in data systems across countries, with innovation moving at a faster pace in higher-income countries (United Nations and World Bank, 2020). Worse still, agricultural data gaps tend to be the largest where good data are needed the most, that is, in resource-constrained countries for which agriculture represents the lifeline of the majority of households and the whole economy. At the same time, the emergence and diffusion of complex farms in higher income countries (Kling and Mackie, 2019; Macdonald, 2016) creates new layers of difficulty in data collection and measurement. Recognizing that individual data sources are often unable to single-handedly address these complex and multifaceted challenges, researchers are increasingly focusing on the potential offered by improved data integration and interoperability between data sources.
While appreciating the importance of improving agricultural data in all countries along the entire income gradient, this chapter intentionally focuses on some of the data challenges and scalable applications and tools most suitable to low- and middle-income countries. Because of this geographic focus, we primarily limit our discussion to household and farm surveys, as they are likely to remain the instrument of choice and backbone of agricultural data systems in many countries for years to come. The attention to surveys is also warranted by the availability of a fully developed total survey quality framework around which we develop the narrative of the chapter.
The growing attention to survey design issues and a burgeoning literature on rigorous survey methodological experiments (de Weerdt et al., 2020) also provide added motivation for the chapter focus.
In this chapter, we will argue and provide evidence that renewed attention to data quality issues, specifically in terms of measurement error and data coverage, is critical for advancing the research frontier in agricultural economics and designing better agricultural policy. Both measurement error and issues of limited data coverage threaten the internal and external validity of empirical analysis on agriculture, constraining its efficacy and relevance in informing sectoral policies and investments. A better understanding of measurement error and error-generating processes is crucial, as errors negatively affect the accuracy and validity of inferences resulting from data, and thus limit the usefulness of data to policymaking. Given the significance of these issues, agricultural economists and survey practitioners have paid increasing attention to measurement error in recent years, drawing on insights from existing literature on labor economics, survey methodology, and statistics. The fact that this Handbook contains for the first time a full chapter dedicated to measurement and data is testament to the prominence that data, in general, and measurement issues, in particular, have acquired in the profession today. The purpose of this chapter is to demonstrate that improving agricultural data structures (that is, making agricultural data systems more credible and fit-for-purpose) can address both measurement error and coverage issues to facilitate better empirical analysis on agriculture. For our purposes, we define data structure as the full set of survey design choices that comprise the data production process, including sampling, questionnaire design, and fieldwork implementation.
Today, technology and a well-piloted modernization agenda offer the opportunity to push the data quality production frontier, both in terms of availability and quality of data. Furthermore, increasing demands for evidence-based policymaking and accountability have generated the tailwind to achieve critical advances in agricultural data in general, and agricultural survey data in particular. Addressing existing flaws in survey data would greatly contribute to raising the credibility and, ultimately, the quality of the resulting research and analysis (Jerven and Johnston, 2015). Achieving the "credibility revolution" in empirical research as advocated by Angrist and Pischke (2010) calls for better research design choices, beginning with addressing measurement error and coverage issues. Making agricultural research more policy-relevant, credible, and fit-for-purpose begins with improving the quality of its underlying data to expand the set of testable empirical models.
This chapter highlights the importance of improving agricultural data structures for empirical analysis, while accounting for the tradeoffs inherent in designing data collection for agricultural research and policy analysis. In the section that follows, we review sources of measurement error from the perspective of the economics, survey methodology, and statistics literatures, referring to this rich bibliography for a more detailed discussion of the issues. In section three, we turn to design choices related to coverage, including sampling design, the unit of analysis, survey timing, data collection modes, and attrition. The fourth section integrates sources of measurement error and coverage biases to assess their implications and tradeoffs in the empirical specification of a few example agricultural models, documenting where innovation in data structure has advanced the research frontier. The fifth section offers innovative approaches for addressing measurement error and coverage biases in agricultural data, based on recent technological advances and foreseen opportunities. In the sixth and final section, we conclude with recommendations on priorities for accelerating improvements in the accuracy and coverage of agricultural data, to ultimately support higher-quality research for better agricultural policy.

Minimizing Measurement Error
Measurement error and related issues of non-random measurement error have been discussed in some of the earliest work by Fisher (1926) and Working (1925). Since then, these topics have been extensively articulated and well-documented across many subdisciplines in economics, such as health, labor, industrial organization, and applied welfare analysis (Bound et al., 2001; Chesher and Schluter, 2002; De Haan et al., 2019; Gottschalk and Huynh, 2010; Hu and Schennach, 2008; Hyslop and Imbens, 2001; Pischke, 1995; Schennach, 2004, 2016; Rom et al., 2020). Most of these papers consistently highlight that bias induced in parameter estimates depends on the structure of the measurement error found in the data, as well as the identifying assumptions that empirical economists make when estimating those parameters. Making the right assumptions for these structures and tackling the sources of errors, at both the design and analytical stages, can greatly improve the accuracy and relevance of agricultural data.
While the field of statistics boasts a rich and longstanding literature on measurement error (Biemer, 2009, 2010; Biemer et al., 1991; Biemer and Lyberg, 2003; Carroll et al., 2006; Deming, 1944; Groves, 1989; Groves and Lyberg, 2010; Kasprzyk, 2005; Kish, 1965; Wansbeek and Meijer, 2000), we have only more recently witnessed a burgeoning literature in agricultural and development economics journals addressing the sources, magnitude, and implications of measurement error, and proposing new ways to validate and correct for measurement error biases.
Measurement error can result in both bias and variable error, or variance. With non-random measurement error biases in parameter estimation come faulty conclusions and misguided policies.
Even with random measurement error, increased statistical noise requires larger sample sizes to identify parameters of interest, increasing the cost of data collection. Hence, we again emphasize the importance of understanding the sources of measurement error and attenuating its impact.
In the field of survey methodology, the Total Survey Error (TSE) framework has been the dominant paradigm. The framework serves as a useful organizing structure for assessing the extent and composition of different sources of errors that affect estimates, guiding researchers and data collection practitioners towards appropriate design choices for minimizing measurement error and maximizing coverage (Groves and Lyberg, 2010). TSE "refers to the accumulation of all errors that may arise in the design, collection, processing and analysis of survey data" (Biemer, 2010).
The paradigm implies that total errors must be minimized for a given budget and that the major sources of errors should be identified and prioritized to achieve maximum accuracy for a given cost (Biemer, 2010). Broadly speaking, TSE can be viewed as encompassing the concept of data quality which, in statistical terms, is partially captured by the Mean Square Error (MSE), a metric of the accuracy of the estimated variable.
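As a reminder of why the MSE captures both components of error discussed above, the standard decomposition can be written as follows. The notation is our own illustrative choice: $\hat{\theta}$ denotes a survey estimate and $\theta$ the true population value.

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathbb{E}\!\left[(\hat{\theta} - \theta)^{2}\right]
  = \underbrace{\mathrm{Var}(\hat{\theta})}_{\text{variable error}}
  + \underbrace{\left(\mathbb{E}[\hat{\theta}] - \theta\right)^{2}}_{\text{squared bias}}
```

Under this decomposition, a design choice that lowers variance at the price of a small bias (or the reverse) can still reduce total error, which is the kind of tradeoff the TSE paradigm asks researchers to weigh for a given budget.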
Minimizing measurement error in agricultural data has been problematic due to a number of inherent features of agricultural processes, particularly for certain crops and agronomic practices in smallholder farming. These features include the highly seasonal nature of production and the irregularity of inputs required in the sequencing of production. Multiple studies have shown that across a variety of issues, farmers' self-reported information, which often involves long recall periods, has proven to be inadequate (Beegle et al., 2013; Deininger et al., 2011; Fermont and Benson, 2011; Gourlay et al., 2017).
Although long aware of the existence of measurement error, only recently have agricultural economists shown interest in how these errors affect their inferences and the policy recommendations deriving from their analysis. Even when measurement errors were considered, the common practice was to make rather cavalier suppositions about the property and distribution of the errors by assuming a classical measurement error (CME), that is, assuming that the error in the variable of interest is independent from its true value as well as from the measurement errors in all other variables in the model and the stochastic error term. While reliance on the CME assumption can be justified in some instances, it is seldom the case for many variables, for which the error-generating processes appear to follow more complex and systematic patterns that fail the classical assumption. The assumption appears to be even more troublesome for non-linear models (Bound et al., 2001). More recently, the agricultural economics literature has aptly focused on the potential systematic biases resulting from measurement error and how design choices and new technologies can help improve measurement (for some recent applications of non-classical measurement error in agricultural data, see Abay et al., 2019; Carletto et al., 2013; Desiere and Jolliffe, 2018; Gourlay et al., 2017). We argue that addressing potential bias ex-ante through appropriate design choices may ultimately be a more effective way to tackle the issue, although careful ex-post analysis and modelling may also be helpful in mitigating its impact on estimates (Gollin and Udry, 2021; Maue et al., 2020).
Policy researchers hold the power and responsibility to make wiser design choices at the data collection stage for given objectives and budget constraints. To this end, the TSE framework provides a useful blueprint for understanding the underlying error-generating processes and the relative importance of the different components, as well as how to ameliorate their impact on estimates. While the TSE framework is useful for this chapter, given its focus on sample surveys as one of the main sources of data for policy research in agriculture, it is important to note that most features of TSE also apply to other data sources. For instance, Biemer (2017) argues that TSE provides very useful insights on how to deal with errors in Big Data, drawing clear parallels between errors in surveys and the often selective, incomplete, and erroneous nature of Big Data-generating processes. As researchers increasingly rely on alternative data sources such as citizen-generated data and crowdsourcing to collect agricultural data, similar data quality frameworks should be developed for those types of data. However, even in the case of TSE, full consensus on a comprehensive typology of errors is yet to exist. Groves and Lyberg (2010) conclude that this lack of consensus is the natural consequence of the continuous evolution of methods and data collection technologies, as well as the different objectives and constraints of different data producers and analysts. As a result, any list defining the universe of TSE is bound to be incomplete and/or to emphasize certain components over others (Groves and Lyberg, 2010).
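To make the classical measurement error assumption concrete, the following minimal simulation sketches its textbook consequence in a bivariate linear model: when a regressor is observed with independent, mean-zero noise, the OLS slope is attenuated toward zero by the reliability ratio (the variance of the true regressor divided by the variance of the observed one). All parameter values, variable names, and the data-generating process are illustrative assumptions, not estimates from any survey discussed in this chapter.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, beta=2.0, sd_x=1.0, sd_u=1.0, seed=0):
    """Compare OLS slopes using the true regressor and a noisy report of it.

    Assumed (hypothetical) setup: y = beta * x + e, and the survey records
    x_obs = x + u, where u is classical error independent of x and e.
    """
    rng = random.Random(seed)
    x_true = [rng.gauss(0, sd_x) for _ in range(n)]
    y = [beta * x + rng.gauss(0, 1) for x in x_true]
    x_obs = [x + rng.gauss(0, sd_u) for x in x_true]
    # Reliability ratio: share of observed variance that is true signal.
    reliability = sd_x ** 2 / (sd_x ** 2 + sd_u ** 2)
    return ols_slope(x_true, y), ols_slope(x_obs, y), beta * reliability


slope_true, slope_noisy, slope_expected = simulate_attenuation()
print(f"slope with true x:      {slope_true:.3f}")
print(f"slope with noisy x:     {slope_noisy:.3f}")
print(f"beta * reliability:     {slope_expected:.3f}")
```

With equal signal and noise variances, as assumed here, the slope on the noisy regressor should sit near half the true coefficient. The sketch covers only the classical case; the non-classical patterns documented in the studies cited above can bias estimates in either direction and are not captured by this simple formula.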
We must note here that focusing solely on minimizing total survey error with expensive measurement methods ignores the research design cost-variance tradeoff and the full set of research design choices. For instance, a researcher may be willing to accept some degree of measurement error, if reducing such error would also reduce the statistical power of the research design. If a researcher is implementing a randomized control trial, measurement error that is not correlated with treatment status may not bias estimates, whereas in a non-experimental design, measurement error might bias parameter estimates and thus have consequences for internal validity and policy recommendations.
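The point about randomized designs can be illustrated with a small simulation, offered as a sketch under stated assumptions rather than a general result. The outcome is measured with error; when that error is unrelated to treatment status, the difference in means stays centered on the true effect, whereas a hypothetical reporting shift among treated respondents (the `reporting_shift` parameter, our own illustrative device) is absorbed one-for-one into the estimate.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate_rct(n=40000, effect=0.5, sd_noise=1.0, reporting_shift=0.0, seed=1):
    """Difference in mean observed outcomes between treated and control units.

    Assumed (hypothetical) setup: treatment is assigned by a fair coin flip,
    the true outcome is standard normal plus `effect` for treated units, and
    the survey records the true outcome plus measurement error. Setting
    `reporting_shift` above zero makes the error correlated with treatment.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        is_treated = rng.random() < 0.5
        y_true = rng.gauss(0, 1) + (effect if is_treated else 0.0)
        error = rng.gauss(0, sd_noise) + (reporting_shift if is_treated else 0.0)
        (treated if is_treated else control).append(y_true + error)
    return mean(treated) - mean(control)


est_random_error = simulate_rct()
est_differential_error = simulate_rct(reporting_shift=0.3)
print(f"true effect:                          0.500")
print(f"estimate, error unrelated to arm:     {est_random_error:.3f}")
print(f"estimate, error correlated with arm:  {est_differential_error:.3f}")
```

Even in the benign case the noise is not free: it inflates the variance of the observed outcome and therefore lowers statistical power for a given sample size, which is the cost-variance tradeoff discussed above.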
To conceptualize these research design tradeoffs more clearly,  build on earlier writing in the statistical literature (Biemer, 2010, among others) to introduce the idea of the data quality production function. For any given research project, the researcher's objective is to maximize the knowledge or evidence generated from the research project. To do so, the researcher makes decisions about the identification strategy, statistical power, and external validity of the project, subject to a budget constraint and the data quality production function. The data quality production function includes choices on questionnaire design as well as other variables such as sampling, empirical approach, and field implementation modes, protocols, and constraints. These latter choices include decisions based on the availability of financial resources, personnel capacity, and the competing demands and/or mandates of the researcher or agency collecting the data.
Thus, measurement error and bias, which closely relate to the concept of internal validity, must be weighed against other important features of model inferences, including the power of the estimates, external validity and coverage, and the intended use of the data. From a user's perspective, data accuracy (and the costs involved in achieving it) must be weighed against other idiosyncratic user preferences related to the broader construct of fitness-for-use of the data (Juran and Gryna, 1980) as part of a broader Total Survey Quality (TSQ) framework (Biemer, 2010).
This more complete construct of survey quality, going beyond accuracy, includes concepts such as comparability, relevance, timeliness, accessibility, credibility, usability, interpretability, completeness, and coherence (Biemer, 2010). For instance, the temporal or spatial granularity of the estimates and other features related to improved coverage may be more important to some users, who may be willing to sacrifice some degree of accuracy in exchange. Another highly relevant dimension is the interoperability of the data and how data integration can improve accuracy and decrease bias while also playing a role in enhancing and/or reducing coverage. For instance, the use of mixed-mode data collection (such as high-frequency phone surveys that are fully integrated into a less-frequent face-to-face large-scale survey) has the potential to reduce measurement errors due to recall bias, but may introduce other problems such as under-coverage due to the incompleteness of sampling frames or higher levels of attrition. As proposed by Biemer and Lyberg (2003), one could treat all these additional dimensions of quality as constraints in an error minimization problem (Biemer, 2010). While highly relevant for sample surveys, the total survey quality paradigm can also be extended to other sources of data (Amaya et al., 2020).
Keeping in mind the specific design choices that researchers face, we classify the possible sources of measurement errors, corresponding to what Groves (1989) calls errors of observation, into five groupings: (1) questionnaire design, (2) interviewer effects, (3) respondent effects, (4) mode of data collection, and (5) data processing. Equally important sources of errors may derive from incomplete coverage, or lack of representativeness (that is, errors of non-observation), including sampling errors as well as non-sampling errors, further categorized into coverage errors and nonresponse. We address these in the following section.
This taxonomy of sources of errors can be juxtaposed with a typical data structure (with units of observation in the rows and variables in the columns) to show the relationship, and thus potential tradeoffs, between sources of measurement errors affecting variables (the columns) vis-à-vis noncoverage errors affecting units of observation (the rows), as well as the tradeoffs between measurement error and coverage. It must be noted, however, that many of these sources and design choices may affect both measurement error and coverage (e.g., mode effects) or be correlated and have covariate effects on total error (e.g., interviewer and respondent effects). Furthermore, sources of measurement errors are likely to simultaneously affect multiple variables, both dependent and independent, generating complex error structures that have differential implications on inferences. Hausmann (2001) reviews approaches to dealing with measurement error in either dependent or independent variables and in the case of continuous and discrete variables. Hyslop and Imbens (2001) provide a clear and succinct classification of the effect of measurement error on either dependent or independent variables, as well as on both. A common approach to measurement error in empirical labor economics is to model measurement error as an 'errors in variables' problem whose proposed solution is an instrumental variable. However, increased concerns about weak instruments have caused such methods to be disfavored in labor economics and this approach to measurement error in empirical agricultural economics has been rare. Finally, measurement error in continuous dependent variables may lead to reduced statistical precision but not necessarily bias, but the cost of increasing sample size (adding more rows), particularly for numerous sample strata and domains of inference, is often high.
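The 'errors in variables' approach mentioned above can be sketched with a simple simulation. The example assumes, purely for illustration, that a second measurement of the same regressor is available whose error is independent of the error in the first; that second measurement then serves as the instrument. The variable names and parameter values are hypothetical, and the sketch says nothing about whether such an instrument exists or is strong enough in any real agricultural dataset.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def simulate_iv(n=20000, beta=2.0, sd_u=1.0, seed=2):
    """OLS versus IV when the regressor is measured with classical error.

    Assumed (hypothetical) setup: y = beta * x + e; the analyst observes
    x1 = x + u1 and a replicate x2 = x + u2 with u1 and u2 independent.
    x2 is used as an instrument for x1.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * v + rng.gauss(0, 1) for v in x]
    x1 = [v + rng.gauss(0, sd_u) for v in x]
    x2 = [v + rng.gauss(0, sd_u) for v in x]
    ols = covariance(x1, y) / covariance(x1, x1)  # attenuated toward zero
    iv = covariance(x2, y) / covariance(x2, x1)   # simple IV (Wald) ratio
    return ols, iv


ols_estimate, iv_estimate = simulate_iv()
print(f"OLS slope on mismeasured regressor: {ols_estimate:.3f}")
print(f"IV slope using replicate measure:   {iv_estimate:.3f}")
```

The IV estimate recovers the assumed coefficient here only because the replicate is, by construction, strongly correlated with the true regressor and its error is independent of everything else. When the instrument is weak or the errors are correlated, as the weak-instrument concerns noted above suggest is often plausible, the ratio can be badly behaved.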
While econometric approaches are inherently ex-post solutions to measurement error that take the data-generating process as given, we see opportunities for ex-ante solutions within current international efforts to build capacity in data quality assurance and methodological innovation in national statistical offices. These capacity building initiatives provide an opportunity to create better agricultural data structures that address research hypotheses and policy concerns by maximizing data quality. To this end, with due consideration to the various tradeoffs, researchers can make design choices in several areas to minimize measurement error in the collection of agricultural data. Below, we present in detail the five main sources of measurement errors listed above. Understanding these groupings and their potential impact on bias and variance can help researchers make the right design choices for their research objectives.

Questionnaire design
Agricultural questionnaire design requires researchers and policy makers to clearly outline the unit of analysis and agricultural processes that they would like to measure. Rozelle (1991) outlined various approaches to agricultural survey design, such as production function approaches, income statement approaches, and balance sheet approaches, each of which requires a different questionnaire design. Production function approaches map inputs to outputs to estimate the returns to inputs.
Income statement designs measure farm profits based on revenue and expense information. A balance sheet approach values farm assets and liabilities in addition to inputs and outputs. An early resource for agricultural questionnaire design is the Reardon and Glewwe (2000) agricultural chapter in Grosh and Glewwe (2000), which outlines broad principles of agricultural module design in multi-topic household surveys. Dillon et al. (2021a) provide an updated review incorporating recent innovations in survey design choices for agricultural questionnaires, including the integration of plot-level crop production and input modules as well as livestock production questionnaires.
A broad questionnaire design literature explores best practices to minimize measurement error.
Errors from questionnaire design may result from unclear wording, poor formatting, priming, excessive length of questions and instrument, sequencing and skipping of questions, duration of reference period, and differences in reference periods or the coding of responses (Schwarz, 1997; Fowler, 1995; Gideon, 2012; Iarossi, 2006; Manski and Molinari, 2008; Payne, 1980; Sudman and Bradburn, 1973; Sudman et al., 1996; de Weerdt et al., 2020). The impact of questionnaire design choices on data quality can be substantial, with even minor changes having adverse consequences on the accuracy and comparability of estimates (Beegle et al., 2020; Das et al., 2012; De Weerdt et al., 2016). Specification errors, which occur "when the concept implied by the survey question and the concept that should be measured in the survey differ" (Biemer, 2010), can also contribute to errors from poorly designed questionnaires. One example of specification error in many agricultural surveys is lack of clarity when defining plots relative to parcels, which may have large implications for productivity estimates (see section 3.2 on units of analysis for a discussion of plots versus parcels). Lack of consistent specification in the definition of household membership or contextual differences in the social and economic criteria of household membership may also lead to faulty estimates (Beaman and Dillon, 2012). Other examples of questionnaire design choices are the use of rosters to collect individual or plot-level data, or the collection of individual components of income or profits in lieu of eliciting information in a more aggregated format (De Mel et al., 2008; Vijverberg and Mead, 2000). In both paper and CAPI-based questionnaires, visual aids are widely used for capturing non-standard units of measurement for more accurate estimations of both agricultural production and food consumption (Eisenhower et al., 1991; Mathiowetz, 2000; Oseni et al., 2017).
Particularly in electronic questionnaires, area maps using GPS are increasingly used for estimating land area, for listing dwellings and plots for sampling, as well as for supervision and quality control purposes.
Questionnaire length and complexity, as well as the sequencing of the questions and modules, may also have important implications for measurement error (Kilic and Sohnesen, 2019; Strack et al., 1988; Schuman and Presser, 1981; Schwarz and Hippler, 1991). Furthermore, the use of an open or closed format may have an impact on responses, where closed questions with clearly identified response options can help respondents in both remembering information and choosing appropriate responses (Kasprzyk, 2005; Schwarz and Hippler, 1991). Finally, the language(s) and translation of the questionnaire, as well as differences in language and cultural background between the survey designer and the respondent, may also contribute to errors (Vaessen et al., 1987). (Ponzini et al., 2021).
Technology is aiding these innovations by facilitating the transfer of information across survey visits via increasingly flexible CAPI applications.

Interviewer effects
Interviewer effects occur when personal characteristics of the interviewer, such as education, ability, motivation, or language barriers, affect the interview process. Proper recruitment, training, and monitoring of job performance are used to minimize errors associated with interviewer effects (Fowler, 2004). A meta-analysis of the literature (West and Blom, 2017) establishes that interviewer behavioral traits and demographic characteristics influence survey responses, and by extension, data quality. Response rates and response biases are particularly influenced by specific interviewer characteristics (such as age, ethnicity, experience, and education), behaviors (such as formal versus conversational interview styles), and cognitive and non-cognitive skills (such as mathematical ability, reading, attention to detail, and empathy).
Existing literature on interviewer effects finds that data quality can be a function of who is asking the questions. Responses vary by the interviewer's ethnicity (Davis et al., 2010; Davis and Silver, 2003), gender (Benstead, 2010; Flores and Lawson, 2008), and religion (Blaydes and Gillum, 2013), especially for questions sensitive to race, gender, and religion, respectively. Studies have also explored the association of data quality with interviewer skills and behaviors such as probing, providing feedback for responses, and rapport building (Belli et al., 2004). Some interviewer characteristics are fixed, while skill-based characteristics may change in response to training.
Responses and measurement error may also vary based on the interviewer's adherence to a script.
For instance, in the context of the Agricultural Labor Survey in the United States, Ridolfo et al. (2021) show how interviewers' lack of adherence to the script resulted in significant measurement errors. Similarly, using the same survey, Rodhouse et al. (2019) quantify the extent to which deviating from the script affects the likelihood of measurement errors and conclude that the presence of measurement error is highly associated with the interviewer's ability to adhere to the script. Biagas et al. (2019), using the same data, use a novel multi-method approach to identify patterns of interviewer behavior and its contribution to total survey error.
Recent research on interviewer effects in a randomized experiment in Uganda by Di Maio and Fiala (2019) found that interviewer characteristics and their differences from respondent characteristics affected survey responses and ultimately data quality for sensitive topics. By contrast, responses to less sensitive topics were much less, or not at all, susceptible to interviewer characteristics. This is supported by additional research suggesting that the salience and sensitivity of the questions influence the nature and magnitude of interviewer effects (Himelein, 2015; Laajaj and Macours, 2021). Marx et al. (2018) provide evidence on the impacts of team composition and ethnic diversity on interviewer performance. Data on the time use of field teams suggest that teams composed solely of interviewers organize tasks more efficiently than teams that include supervisors, interviewers, and data monitors, which demonstrate lower levels of effort. In a review of several studies, Groves (1989) suggests that demographic traits result in interviewer effects only when the specific question is related to the demographic characteristics of the interviewer (for example, an interviewer effect based on the race of the interviewer may be found for questions about race).
This may be particularly relevant in contexts with large ethnic and racial diversity.
The effect of priming in surveys and the inconsistent application of interviewing protocols and wording across interviewers are also likely to generate systematic biases (Lavrakas, 2008).
Similarly, the interview setting may also affect interviewers' recording of responses and result in systematic errors. Collecting detailed metadata on the interview process is often used to partially control for potential biases generated by poor interview settings; unfortunately, this practice is not consistently applied across surveys.

Respondent effects
Respondents can also contribute to TSE in several additional ways, either intentionally or unintentionally. Assumptions about the structure of those respondent biases are often uninformed by empirical evidence, although Hyslop and Imbens (2001) provide a categorization of different types of potential biases. For instance, respondents may intentionally under-report the amount of land they own because of taxation concerns or may conversely over-report their land holdings because of prestige considerations or social desirability. Social desirability concerns are likely to result in the under-reporting of "socially undesirable" behavior, and the over-reporting of socially desirable occurrences (Bound et al., 2001). For some agricultural statistics such as child labor, context may determine whether children's work in agriculture carries social stigma and hence potential reporting bias. Similar response behavior may also occur with the reporting of income variables (Tourangeau et al., 2000). Respondents may also round up the amount of land owned to integer values, resulting in a phenomenon known as heaping. Research on land area measurement consistently finds systematic errors in farmers' self-reporting, with farmers who own smaller land holdings systematically over-reporting (and farmers with larger land holdings under-reporting), as well as considerable heaping in reporting (Carletto et al., 2013, 2015). Errors in responses may also be unintentional, resulting from limited knowledge or recall bias due to memory decay as the length of the recall period increases. Errors may also derive from limited understanding of the questions, potentially correlated with the cognitive level of the respondent (Laajaj et al., 2019; Laajaj and Macours, 2021).
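A simple descriptive check for the heaping described above is to compare the share of self-reported areas that fall exactly on whole numbers with the same share among objectively measured areas, where such measurements are available. The sketch below is illustrative only: the helper name, the tolerance, and the sample values are assumptions and are not drawn from the cited studies.

```python
def integer_share(areas, tol=1e-9):
    """Share of area values that fall on a whole number (hypothetical helper)."""
    if not areas:
        raise ValueError("areas must be non-empty")
    on_integer = sum(1 for a in areas if abs(a - round(a)) <= tol)
    return on_integer / len(areas)


# Made-up illustrative values in hectares, not survey data.
self_reported = [1.0, 2.0, 0.5, 3.0, 1.0, 2.0, 1.5, 4.0]
objectively_measured = [0.83, 2.21, 0.47, 2.64, 1.12, 1.78, 1.39, 4.35]

# A large positive gap is consistent with heaping in self-reports.
heaping_gap = integer_share(self_reported) - integer_share(objectively_measured)
```

A gap of this kind is only a diagnostic; it does not by itself separate rounding from genuine clustering of plot sizes at whole-number values.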
The use of bounding techniques, providing an easy-to-remember temporal reference point in respondents' memory to better contextualize the answer, can be used to reduce the effect of telescoping in recall (Abate et al., 2020;Neter and Waksberg, 1964). Recall biases are also affected by the salience of the event being recalled (Beegle et al., 2012;Gaddis et al., 2020;Kilic et al., 2021;Wollburg et al., 2020). Gaddis et al. (2020) and Arthi et al. (2017) analyze the impact of recall on the measurement of agricultural labor. Their findings suggest that a seasonal recall approach to agricultural labor measurement may result in underestimated labor productivity. In their cross-country study, Beegle et al. (2012) find no evidence of bias in harvested quantities for both staple and cash crops. Recall bias, however, was present in hired labor reporting, although the direction of these biases varied by country. As already mentioned, similar findings emerge from Wollburg et al. (2020) related to the design choices of number and timing of field visits, as more visits and shorter roll-out periods reduce the length of the recall period.
Interestingly, at least in domains outside of agriculture, perceptions of salience may be influenced by the length of the recall period (Winkielman et al., 1998) and may vary by the income level of the respondent (Das et al., 2012). Understanding respondents' cognitive strategies is crucial for choosing the most appropriate length of recall and thus minimizing measurement errors in responses. Evidence suggests that beyond a certain recall length, respondents switch from enumeration to estimation, each translating into different errors (de Nicola and Giné, 2014; Scott and Amenuvegbe, 1991).
The use of proxy respondents and widespread reliance on the most informed respondent (often the male head of the household) is also likely to result in biased responses (Dillon and Mensah, 2020; Doss et al., 2019; Kilic and Moylan, 2016; Krosnick, 1999; Moore, 1988). Bardasi et al. (2011) find that using proxy responses led to the under-reporting of men's participation rates in agricultural activities. Kilic et al. (2020a) show significant impacts of respondent selection strategy in the collection of labor data. Kilic et al. (2020b) also find that the common practice of proxy reporting results in different reporting of land assets relative to those reported by self-respondents. The use of proxy reporting by the "most knowledgeable household member" results in higher rates of exclusive ownership of agricultural land among men, and lower rates of joint ownership among women, as compared to the gold standard approach of individual, self-respondent interviews (Kilic et al., 2020b). In this context, interview setting has also been shown to greatly affect responses. For instance, the common practice of non-private interviewing (i.e., where other members of the household and community may be present during the interview), more often conducted through proxies, results in significant under-reporting of employment relative to measurement through private, self-respondent interviews, with stronger effects for women than men (Kilic et al., 2020a). Dillon and Mensah (2020) note that when proxies report household-level agricultural variables as opposed to individual-level responses, proxy response bias is composed of both aggregation errors and asymmetric information within the household.
Thus, their findings suggest that proxy response bias is not solely due to asymmetric information within the household, as is commonly assumed in the literature on proxy response bias for individual-level variables.
Measurement error can also derive from the use of peers (e.g., neighbors, co-workers, or key informants) as proxy respondents, potentially resulting from projection or false consensus biases, among other things (Hogset and Barrett, 2010). Despite the potential biases of proxy reporting of peer behaviors (Ashenfelter and Krueger, 1994), the practice is widely used (Hogset and Barrett, 2010). In some cases, however, gathering data from peers may be sub-optimal yet preferable, such as when collecting highly sensitive information.
While the use of proxy respondents should be minimized to the extent possible, one must also acknowledge the tradeoffs between respondent bias and coverage, as restricting interviewing to the selected respondents is likely to result in higher attrition and unit missingness. Furthermore, the use of proxy respondents is often unavoidable due to logistics or cost considerations. In such cases, the proxy respondent selection process should be conducted based on strict standardized field protocols.

Mode of data collection
The mode of data collection (whether face-to-face, self-administered or interviewer-led, by phone, or by web, and whether on paper or in electronic format) can have substantial effects on measurement error as well as coverage. In terms of measurement error, several studies have shown that the effect depends on the type of question as well as interviewer ability and respondent characteristics (Biemer and Lyberg, 2003; Caeyers et al., 2010; De Leeuw, 2005; De Leeuw and Van der Zouwen, 1988). While errors of coverage may result from incompleteness of frames and respondent selection, phone or web surveys may allow more frequent data collection for greater temporal granularity and lower measurement error due to recall bias. Similarly, crowdsourcing and other forms of citizen-generated data are increasingly used in agriculture and potentially offer enormous opportunities for collecting data at greater temporal and spatial resolution. However, these modes of data collection also exhibit serious limitations in terms of representativeness as well as overall data quality which, if left unaddressed, are bound to produce biased inferences.

Processing errors
Finally, processing errors include possible errors generated during data entry, editing, coding, weighting, and analysis of data. New technologies and data processing power have transformed the set of opportunities and ways to reduce processing errors. Unfortunately, the relatively faster growth in the volume of data being generated, combined with the complexity of the new data landscape, has created additional challenges in terms of processing errors. For a review of processing errors and how they may impact total error, see Biemer and Lyberg (2003).

Tradeoffs in Maximizing Coverage
Coverage errors can also derive from omitted variables in model specification due, for instance, to missing environmental variables affecting production decisions (Sherlund et al., 2002). Because it is seldom collected in surveys, information on environmental conditions and on the inter-farm heterogeneities affecting farmers' choices is often missing, potentially resulting in biased estimates, including biased coefficients. With the widespread availability of inexpensive geospatial data that can be linked to household-level data, filling these gaps and potentially reducing the risk of omitted variable bias has become increasingly easy. However, the georeferencing of survey data at farm and plot level is yet to become common practice in many low- and middle-income countries. Furthermore, many of the hurdles related to data privacy for secure and confidential dissemination and use remain unresolved.
In addition to non-coverage errors due to a priori exclusion from the frame, non-response and other reasons for attrition are also likely to affect the validity of the estimates. Non-response can be further divided into unit and item non-response. Unit non-response occurs when a unit of concern is included in the frame but is either not reached or is unwilling to participate in the survey. This missingness is seldom random and is often treated ex-post through imputation methods (Rubin, 1987, 1996). Conversely, item non-response occurs when a respondent fails to provide information to a question during interviewing. This missingness is most often handled at the data processing stage through complex imputation methods. However, the lack of consistent guidelines and practices of value imputation, combined with poor documentation on how missing values have been treated (potentially leading to systematic errors), once again highlights the potential tradeoffs between coverage and measurement error.
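To make the documentation point concrete, the sketch below fills item non-response with a class mean and records an explicit flag for every imputed value, so that later users can see how missing values were treated. It is deliberately simple and is not the multiple imputation approach of Rubin; the function name, field names, and values are hypothetical.

```python
from statistics import mean


def impute_class_mean(records, value_key, class_key):
    """Fill missing values with the class mean and flag each imputed record."""
    observed = {}
    for r in records:
        if r[value_key] is not None:
            observed.setdefault(r[class_key], []).append(r[value_key])
    class_means = {c: mean(v) for c, v in observed.items()}

    completed = []
    for r in records:
        new = dict(r)  # copy, so the raw data are left untouched
        missing = r[value_key] is None
        if missing and r[class_key] not in class_means:
            raise ValueError(f"no observed values in class {r[class_key]!r}")
        new[value_key] = class_means[r[class_key]] if missing else r[value_key]
        new[value_key + "_imputed"] = missing  # documentation flag
        completed.append(new)
    return completed


# Made-up plot records; None marks item non-response.
plots = [
    {"region": "north", "fertilizer_kg": 10.0},
    {"region": "north", "fertilizer_kg": None},
    {"region": "north", "fertilizer_kg": 20.0},
    {"region": "south", "fertilizer_kg": 5.0},
]
completed = impute_class_mean(plots, "fertilizer_kg", "region")
```

Keeping the flag alongside the value lets analysts rerun estimates with and without imputed observations. Note that single mean imputation understates variance, which is one reason more complex methods are often preferred.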
In line with the broader total survey quality framework, we review the issue of coverage error found in the statistical literature, considering several of the dimensions of coverage related to fitness-for-use. As postulated above, researchers are likely to be faced with tradeoffs between measurement error and other dimensions of data quality. Furthermore, these tradeoffs are also related to the particular data source. For instance, the feasibility of various possible decisions on the unit of analysis, timing, or level of disaggregation will vary based on whether data is being collected through an agricultural census, a national survey, or a randomized control trial. Coverage error is also affected by the mode of data collection, such as face-to-face, phone, or web-based interviewing, which can also lead to mode effects, another factor contributing to total error. In this section, we describe some of the design choices related to sampling frames, units of analysis, survey timing, modes of data collection, and attrition, all of which have implications for data coverage and the tradeoffs between measurement error and coverage.

Sampling frame
The sampling frame used for agricultural surveys, whether based on population listing or geographic areas, may often be missing some sub-populations or units of concern. Under-coverage could be accidental or intentional and, if deliberate, may be driven by many motives. Of particular relevance to smallholder agriculture is the issue of under-coverage of more remote areas, farm holdings, or individual plots, driven by either cost considerations or convenience. Capturing pastoralist and transient populations is also particularly challenging, and their exclusion is likely to result in biased estimates (Himelein et al., 2014).
As stated above, coverage will also depend on the type of sampling frame chosen. Multi-purpose household surveys like the LSMS-ISA use a population-based listing, with the household as the unit of analysis. More recent agricultural surveys, particularly in more developed economies, have increasingly relied on area or point frames, which present advantages when the objective is to estimate production at the national or sub-national level. In fact, one concern with using population-based listings is the possibility of missing some plots due to misreporting, thus resulting in lower total production. This problem is further compounded for pastoralist and seminomadic populations, which are entirely missing from or hard to reach using population frames; in these cases, using area frames in combination with population listings may be more appropriate (Himelein et al., 2014). However, collecting socioeconomic information on the farm household starting from a given area or point is particularly challenging, and hardly ever done in socioeconomic studies, thus limiting the analytical use of the data. Finding ways of reconciling the choice of frames and maximizing coverage by linking multiple frames has been the subject of recent research under the 50x2030 Data Smart Agriculture initiative (D'Orazio, 2020).
One key shortcoming of using a population frame for agricultural data collection is the wholesale exclusion of medium and large commercial farms. This has been the practice in agricultural household surveys such as those conducted under the LSMS-ISA initiative, which has raised concerns about the validity of inferences made on a truncated distribution of farm holdings (Ali and Deininger, 2014). On the other hand, agricultural censuses and farm surveys that focus on both farming households and commercial farms may be more suitable for sector-level estimation of agricultural indicators, but fall short of meeting the analytical objectives of surveys of households as both production and consumption units (Singh et al., 1986).
The use of multi-frame sampling strategies, combining the strengths of the individual sources, is gaining ground in lower-income countries, albeit at a slower pace due to capacity constraints and the technical difficulties involved.
When a household listing is chosen as a population-based frame, the definitions of household and household membership have important measurement implications, in part because social and economic definitions of the household diverge (Beaman and Dillon, 2012). This is especially the case in communities with extended farming families with land inheritance claims, common production of family lands, or complicated land use rights. The household definition matters from an agricultural perspective, as household membership defines the individuals that will be included in modules on agricultural land holdings, labor, assets, and marketing decisions. Unsurprisingly, reported values of agricultural production are lower when the household definition excludes some agricultural producers. Residency requirements also complicate the measurement of pastoralist activities when households are involved in transhumant pastoralism.

Units of analysis
Agricultural data is often collected at different units of analysis, including at national or subnational levels, geospatial area units, and at the household, farm, plot, or plot-crop-season-manager levels. Depending on the frame used, selection may be based on population listing or a map, subdivided into grids. It may be the case that, for a specific application, both types of frames may be used, which requires reconciling the estimates at the national or sub-national level. For instance, Pelletier et al. (2020) use small area estimation (SAE) to reconcile deforestation estimates from area-based frames with smallholders' use of modern inputs drawn from a population-based frame.
In fact, it is conceptually appealing to use area frames when the focus is the measurement of land area and related agronomic features, as a population-based survey may result in under-counting plots and farming activities. On the other hand, the use of population-based listings may be more appropriate than area frame sampling in capturing data from hard-to-reach farmers living far away from the plots. The use of multiple frames, which in the agricultural domain involves the combined use of both area and list frames, has been advocated as a way to ensure completeness of the frame and full coverage of the sector (FAO, 2015b; Gonzalez Villalobos and Wigton, 2011). However, avoiding duplication and overlapping units is often a challenge when constructing multi-frames.
Indirect sampling has also been proposed as a way to overcome the shortcomings intrinsic to household listings as frames for agricultural statistics, by applying Generalized Weighted Sampling Methods (GWSM) to obtain estimates for landholdings (the unknown and more relevant universe) from a household listing (the known population) (Falorsi et al., 2016;Gennari et al., 2013).
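The core idea of indirect sampling can be sketched as a weight-share calculation: each landholding reached through the sample receives the sum of the design weights of the sampled households linked to it, divided by the total number of households in the population linked to that holding. The code below is a minimal sketch of this idea as it is commonly described in the survey sampling literature, not the specific procedure of the studies cited above; all names and numbers are hypothetical, and it assumes the total number of links per holding can be collected during the interview.

```python
from collections import defaultdict


def weight_share(sample_weights, links, total_links):
    """
    sample_weights: {household_id: design weight} for sampled households.
    links: iterable of (household_id, holding_id) pairs observed in the sample.
    total_links: {holding_id: number of households in the population
                  linked to that holding, sampled or not}.
    Returns {holding_id: estimation weight}.
    """
    shared = defaultdict(float)
    for household_id, holding_id in links:
        shared[holding_id] += sample_weights[household_id]
    return {h: w / total_links[h] for h, w in shared.items()}


# Made-up example: holding h2 is operated jointly by sampled households
# A and B and by one household that was not sampled.
sample_weights = {"A": 100.0, "B": 50.0}
links = [("A", "h1"), ("A", "h2"), ("B", "h2")]
total_links = {"h1": 1, "h2": 3}

holding_weights = weight_share(sample_weights, links, total_links)
```

Dividing by the total number of links prevents holdings operated by several households from being over-represented simply because they can be reached through more than one sampled household.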
Any livestock sector outcomes should be recorded at the herd level rather than at the household level, particularly for nomadic populations. Household surveys with population-based sampling frames will never capture herd-level outcomes, for which area-based-sampling measures may be a better unit of analysis.
In an example of innovative survey design, LSMS-ISA surveys redesigned agricultural modules by recognizing that the unit of observation for agricultural production is often not the household but the plot, which may be managed by differing household members, with significant implications for sex-disaggregated analysis. In early versions of LSMS surveys and many other multi-topic household surveys, agricultural activities were not detailed at the plot level, as this requires a higher respondent burden relative to household level recall. However, a plot-level approach more accurately measures the relationship between inputs and outputs in agricultural production.
Production heterogeneity results from differences in crops cultivated that require different levels of inputs and may not be managed by the household head. The level of detail required in subsequent modules makes this survey design choice non-trivial. For example, plot-level data collection requires not simply measuring land area at the plot level, but also production, labor, capital, chemical inputs, and land management techniques.
Choosing to measure production at the plot level requires a wider series of choices in identifying the unit of analysis, which has implications for both design and implementation. First, agricultural production is seasonal and multiple crop cycles on a given plot need to be measured. Second, plots are not always associated with a single crop, as multi-cropping or inter-cropping is a common land management practice for increasing yield and land quality. In contexts such as smallholder agriculture in sub-Saharan Africa, inter-cropping is the norm, not the exception. security as it relates to the control of and access to other assets, through access to credit and investment in land, for example, these SDG indicators seek to measure specific aspects of land tenure at the individual level, rather than at the household level.
In summary, the key innovation in conducting plot-level production analysis is not to simply measure inputs and outputs at the plot level, but to distinguish the unit of analysis as plot-crop-season-manager. This unit of analysis facilitates comprehensive measurement of household production, allowing multiple analytical strategies from seasonal, crop, and gender perspectives, but also has some limitations, particularly in the context of a panel survey, given the changing demarcation of plots across seasons. Tracking parcels over time is often a more feasible option.
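One way to picture the plot-crop-season-manager unit is as a record keyed on all four dimensions, from which household, seasonal, or gender-disaggregated totals can be rebuilt. The sketch below is purely illustrative: the field names, identifiers, and harvest values are assumptions and do not reproduce any actual survey layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlotCropSeasonRecord:
    household_id: str
    plot_id: str
    crop: str
    season: str
    manager_id: str
    manager_sex: str
    harvest_value: float  # hypothetical, in local currency units


records = [
    PlotCropSeasonRecord("hh1", "p1", "maize", "main", "m1", "male", 400.0),
    # Inter-cropping: a second crop on the same plot in the same season.
    PlotCropSeasonRecord("hh1", "p1", "beans", "main", "m1", "male", 80.0),
    PlotCropSeasonRecord("hh1", "p2", "cassava", "main", "m2", "female", 300.0),
    # The same plot is managed by a different member in the next season.
    PlotCropSeasonRecord("hh1", "p1", "maize", "short", "m2", "female", 150.0),
]


def total_by(records, *fields):
    """Sum harvest value over any combination of record fields."""
    totals = defaultdict(float)
    for r in records:
        totals[tuple(getattr(r, f) for f in fields)] += r.harvest_value
    return dict(totals)


household_totals = total_by(records, "household_id")
by_manager_sex = total_by(records, "household_id", "manager_sex")
by_season = total_by(records, "household_id", "season")
```

Because the finest unit is retained, the household total remains recoverable, whereas the reverse is not possible once data are recorded only at the household level.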
Across different agricultural systems, the vocabulary associated with an agricultural landholding may also differ. Farmers use different words to indicate their farms, parcels, and plots, often with contradictory meanings. It is important that any agricultural survey design reflects a clear conception of the hierarchy of units consistent with the agricultural system that is being measured. The FAO defines an agricultural holding as "an economic unit of agricultural production under single management comprising all livestock kept and all land used wholly or partly for agricultural production purposes, without regard to title, legal form or size..." Holdings can be divided into parcels, and the FAO notes that "a distinction should be made between a parcel, a field and a plot", where "a field is a piece of land in a parcel separated from the rest of the parcel by easily recognizable demarcation lines such as paths, cadastral boundaries, fences, waterways or hedges. A field may consist of one or more plots, where a plot is a part or whole of a field on which a specific crop or crop mixture is cultivated, or which is fallow or waiting to be planted". However, when designing and implementing an agricultural survey, practitioners should confirm the tiers and definitions used by the national statistical office, as these may not always coincide with the FAO definitions. As suggested above, tracking parcels may be a more practical option for longitudinal studies, given the changing size of plots across seasons.
Recording agricultural information at the household level inherently aggregates individual production and imposes a linearity assumption across plots for input utilization and asset use. The main tradeoff in recording agricultural information at the plot level is that farmers must recall input allocation at the plot level, which requires more cognitive effort and response time. These recall biases may be compounded by proxy response bias, as plot-level self-reporting is time-consuming in the field and may not be feasible for all survey responses. Proxy respondents may have incomplete information on plots managed by other household members. For farmers who purchase inputs collectively with their family for multiple plots, it may be difficult to accurately assess how much fertilizer, seed, or other input was applied to a particular plot. Consistent with time use data, it may also be difficult for a farmer to recall individual household labor allocations to particular plots over an agricultural season or with respect to particular agricultural tasks. While more research is needed to understand the measurement implications of the disaggregation of input and production data to the plot level from the household level, the known analytical advantages of doing so, such as analysis of male-managed plots vis-à-vis female-managed plots, outweigh the unknown risk of aggregation in many surveys, including LSMS-ISA surveys.
Due to variation in land tenure status and land use rights, it is also important to account for seasonality in production on plots and changes in plot management when considering the unit of analysis. Depending on the agricultural season, a plot may be cultivated by a different member of the household and use different levels of inputs along with different cropping choices. Researchers have often cited asymmetries in crop type and input use, and therefore productivity and earnings, by the gender of the plot manager. O'Sullivan et al. (2014) estimate that, after controlling for plot size and region, productivity differences across male- and female-managed plots in Africa ranged from 23 to 66 percent. In order to appropriately account for plot-level production, and to enable analysis of the timing of production and gender asymmetries, both season and plot manager should be considered. The plot manager may differ from one season to the next, often depending on gender-based norms.
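One way to read the linearity assumption noted at the start of the paragraph above is through a small numerical example. If the plot-level response to an input is non-linear, output computed from a household total that is treated as evenly spread across plots will differ from the sum of plot-level outputs. The sketch below assumes a square-root response and made-up input levels purely for illustration; neither is taken from the source.

```python
import math


def plot_output(fertilizer_kg):
    """Hypothetical concave plot-level response (square root is an assumption)."""
    return math.sqrt(fertilizer_kg)


# Made-up fertilizer use on two plots of the same household.
plot_inputs = [1.0, 9.0]

# Sum of plot-level outputs, using the plot-level inputs.
true_total = sum(plot_output(x) for x in plot_inputs)

# Output implied when only the household total is recorded and is
# treated as evenly spread across the plots.
even_share = sum(plot_inputs) / len(plot_inputs)
implied_total = len(plot_inputs) * plot_output(even_share)

# Zero only if the response is linear or inputs are identical across plots.
aggregation_gap = implied_total - true_total
```

With a linear response the two totals coincide, which is why household-level recording is harmless only under linearity or when plots are treated identically.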
Just as there are tradeoffs in empirical specifications among units of analysis, differences in units of analysis also imply different constraints when repeated observations are an objective of the survey design.

Survey timing
Survey timing is a critical design choice that affects coverage as well as measurement error due to questionnaire design or respondent effects. Survey timing can refer to the timing of visits within a single agricultural season, as well as the timing of visits between seasons. LSMS-ISA surveys feature an innovative survey design that includes repeated visits both within and between seasons (Carletto et al., 2010). New international surveys such as those produced under the 50x2030 Initiative will also feature multiple within- and between-season observations. Here, we review survey timing decisions that increase coverage across certain dimensions of time.
As a research and policy issue, seasonality and the presence of multiple cropping cycles within the agricultural calendar imply that multiple agricultural surveys may be best timed according to Variation in production and other agricultural variables as well as the ability to monitor shocks and household resilience could also be captured through community sentinel sites complementing less frequent surveys (Barrett and Headey, 2014). The authors convincingly argue for establishing a multi-country system of sentinel sites in selected communities as a way to improve the timeliness and coverage of agricultural data, in the face of ever more frequent shocks affecting the resilience of rural households.

Mode of data collection
As discussed in the previous section, the choice of survey mode may have significant implications in terms of measurement error, either directly or through its interaction with other design choices related to questionnaire design, interviewer selection, and respondent features. Similarly, certain modes of data collection may also affect survey coverage. Poor representativeness because of inadequacy of the sample frame, selectivity, and potentially high attrition is a major challenge for phone surveys (Ballivian et al., 2015;Gibson et al., 2019). For instance, the use of mobile phones or the web affects not just how responses are elicited, but also whether respondents agree to participate in the survey, and/or whether respondents are included in the frame in the first place.
In terms of frames, phone surveys predominantly rely on three options: (1) recent representative surveys with phone numbers of respondents; (2) lists of phone numbers from mobile phone providers; and (3) random digit dialing. Each option involves significantly different implications for both coverage and attrition. Proper tracking and field protocols, combined with the collection of selected information for ex-post weighting and bias adjustment, can greatly enhance the representativeness and usability of phone surveys and reduce mode bias.
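As an illustration of ex-post weighting, the sketch below computes simple post-stratification weights that rescale a phone sample so that its composition matches known population shares. It is a minimal sketch under the assumption that reliable population shares exist for each cell (for example, from a recent census or face-to-face survey); the cell labels and numbers are made up.

```python
def poststratify(sample_counts, population_shares):
    """Weight per respondent in each cell: population share / sample share."""
    n = sum(sample_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"cell {cell!r} has no respondents")
        weights[cell] = population_shares[cell] / (count / n)
    return weights


# Made-up example: rural households are under-represented among
# phone respondents relative to the population.
sample_counts = {"urban": 80, "rural": 20}
population_shares = {"urban": 0.4, "rural": 0.6}

weights = poststratify(sample_counts, population_shares)
```

Such weights correct only for the characteristics used to define the cells. If phone owners differ from non-owners within a cell, bias remains, and cells with no respondents cannot be recovered by weighting at all.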
Irrespective of the type of frame used, phone surveys are more likely to miss more remote and poorly connected households, as well as poorer households who do not own a phone or live in areas with poor mobile coverage. This is particularly relevant for agricultural data, where large shares of respondents live in remote and poorly connected areas and are more likely to be poor and technologically illiterate. Educational level, age, and technological literacy will also systematically affect overall coverage. Similarly, selection bias for respondents in citizen-generated data and crowdsourcing makes collecting agricultural data using those modes particularly concerning in terms of representativeness and coverage, especially when it comes to their use in official statistics. They found that pre-survey text messages did little to improve non-response but did increase survey completion. In the Philippines, pre-survey text messages actually increased non-response, while having no effect in the Colombia, Mexico, or Rwanda samples. In those countries, pre-survey text messaging increased survey completion by between one and four percentage points. In nine countries, time-of-day and day-of-week effects were estimated, with midday interviews increasing participation and evening calls reducing participation (Das et al., 2021). These effects were relatively small, with an effect size of four and eight percentage points over the base pickup and completion rates, respectively. The day-of-week effects varied substantially between countries, with few generalizable day-of-week effects between countries. Within countries, effect sizes were substantial, but often in different directions across countries. More country-specific evidence may need to be generated to reduce non-response, underscoring the importance of understanding local contexts and the differences in time distributions between work and leisure within a country.
The impact of mode of data collection on coverage extends to other survey design features. For instance, the use of diaries, while possibly more accurate than recall modes when properly implemented, may lead to greater under-coverage among illiterate respondents as well as higher non-response among richer households that face higher opportunity costs for answering lengthy diaries. Also, differences in record-keeping across groups of respondents, such as between smallholders and larger-scale farmers, may result in systematic variations based on the chosen method (Lyberg and Kasprzyk, 2004; Silberstein and Scott, 2004). In the next section, we discuss in detail how measurement error and coverage bias affect the empirical estimation of common agricultural models.

Attrition
Significant coverage biases due to attrition affect both the internal and external validity of empirical work for both randomized control trials and observational panels (Beegle et al., 2011; Falaris, 2003; Outes-Leon and Dercon, 2008; Rosenzweig, 2003; Thomas et al., 2001, 2012; Zabel, 1998; see also Millan and Macours, 2019). Thomas et al. (2012) also discuss planning for attrition and protocols for reducing attrition. They find that success in tracking movers depended not only on observable characteristics of respondents, but also on the characteristics of interviewers who initially interviewed respondents.
Reducing coverage bias due to attrition is likely to be most successful not simply when surveys are designed to track respondents who have moved, but also when initial interviews collect tracking data and interviewers are trained to establish connections with survey respondents.

Empirical Specification, Data Structure, and Measurement Error
For any empirical analysis, the set of theoretical models that can be tested is defined by the available data. Each dataset has its own data structure, which we defined above as the full set of survey design choices that comprise the data production process, including sampling, questionnaire design, and fieldwork implementation choices. National production surveys such as agricultural censuses imply a specific subset of production models that can be tested. Household surveys that integrate agricultural data, such as LSMS-ISA surveys, are implicitly informed by producer models or agricultural household models, but measurement error or coverage bias can reduce the precision and utility of estimates and restrict the set of testable models. In this section, we review tradeoffs in the empirical specification of agricultural models and data requirements.
We discuss how survey design choices that increase data coverage present a tradeoff in potentially increasing measurement error in prominent empirical models.
While we cannot review the interaction of data structure and empirical specification across all prominent models in agricultural economics given the focus of this chapter, it is illustrative to choose a few common specifications to demonstrate how innovations in data structure have

Profit and production functions
A large literature examines models of the producer problem (Chambers, 1988;Chambers and Quiggin, 2000;Mundlak, 2001) and the specification of the agricultural production function (Pope and Just, 2001). Pope and Just (2003) provide a summary of production technologies and their functional forms. In this earlier production literature, measurement error and coverage bias were central concerns in the field. Pope and Just (2003) specifically discuss coverage bias and its effect on production function specification, as well as the modeling of measurement error. Aggregated district or national data sources led to misattribution of the returns to inputs, as the unit of analysis in the data was not at the producer level where profit-maximizing decisions were made in the theoretical model.
Measurement error due to unobservable decision variables is also a source of bias in production function estimation, but distinguishing measurement error from unobserved heterogeneity and potential misallocation is challenging. Yields can be biased by output or land size measurement as noted by Abay et al. (2020). Inputs such as fertilizer or labor can be biased by errors in both quantity and quality over the relevant recall period. In the case of livestock production, inputs such as medical care and feeding practices may be difficult to attribute within herds. Measurement error in these input and output variables is likely correlated with unobserved heterogeneity in farmer ability. As agricultural production is also characterized by stochastic disturbances such as weather shocks which require similar modeling assumptions to address unobserved farmer heterogeneity, error terms capture multiple sources of stochastic shocks. In principle, researchers can model such errors in the producer problem depending on the data structure. In a similar spirit, Gollin and Udry (2021) model measurement error, unobserved heterogeneity, and misallocation using panel data from Tanzania and Uganda. Several important identification challenges are addressed due to data structure, in particular, a unit of analysis at the farmer-crop-plot-season level. After explaining differences in production across farms due to observable differences, Gollin and Udry (2021) note that unobserved variation could be due to unobserved land characteristics, risk, measurement error, or misallocation. With repeated panel data of farmers over time, a production function, whose error term is disaggregated among these different unobserved components, can be estimated. Gollin and Udry (2021)'s estimates suggest that measurement error and heterogeneity explain two-thirds to three-quarters of productivity differences, while misallocation affects productivity only modestly.
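To illustrate why measurement error in an input matters for production function estimation, the following minimal simulation shows the textbook attenuation result. It rests on assumptions that are ours and purely illustrative: a log-linear relationship with a single input, an output elasticity of 0.5, and classical reporting error that is independent of everything else. Since measurement error in survey data is likely correlated with farmer heterogeneity, the bias encountered in practice need not follow this simple formula.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


rng = random.Random(0)    # fixed seed so the illustration is reproducible
N_PLOTS = 20000
TRUE_ELASTICITY = 0.5     # assumed output elasticity of the input
SD_INPUT = 1.0            # spread of the true log input across plots
SD_ERROR = 1.0            # spread of classical reporting error in the log input

true_log_input = [rng.gauss(3.0, SD_INPUT) for _ in range(N_PLOTS)]
log_output = [2.0 + TRUE_ELASTICITY * x + rng.gauss(0.0, 0.3) for x in true_log_input]
reported_log_input = [x + rng.gauss(0.0, SD_ERROR) for x in true_log_input]

slope_true_input = ols_slope(true_log_input, log_output)
slope_reported_input = ols_slope(reported_log_input, log_output)

# Classical errors-in-variables: the slope shrinks by the reliability ratio.
reliability = SD_INPUT ** 2 / (SD_INPUT ** 2 + SD_ERROR ** 2)
expected_attenuated = TRUE_ELASTICITY * reliability

print(f"slope using the true input:     {slope_true_input:.3f}")
print(f"slope using the reported input: {slope_reported_input:.3f}")
print(f"slope implied by reliability:   {expected_attenuated:.3f}")
```

With equal variances for the true input and the reporting error, the reliability ratio is one half, so the estimated elasticity is pulled toward half of its assumed value.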
In considering the advances in the identification of measurement error and misallocation in the production function, we note from a data structure perspective the tradeoff between improved empirical specification of misallocation and measurement error due to survey design. Recall of input allocations at the farmer-crop-plot-season level is ideal, as it permits researchers to map inputs to outputs, but measurement error may actually be increased if farmers do not recall inputs at the farmer-crop-plot-season level. For example, farmers may make bulk fertilizer purchases within their household that are then divided within the household and across the farmer's plots. Precisely recalling the amount of fertilizer applied to a farmer's maize field relative to their inter-cropped legumes may be impossible, even if the farmer knows exactly how much fertilizer was purchased in total.
We note that the Gollin and Udry (2021) identification strategy capitalizes on a farmer panel to disentangle the effects of measurement error from misallocation and farmer unobservables, but we also note that assumptions about the production technology, as in Pope and Just (2003), are required. Estimates of misallocation using different production functions would certainly vary, along with the level of measurement error estimated for each. This provides an important example of the tradeoffs between data structure and empirical modeling. Advances in farmer panels allow Gollin and Udry (2021) to estimate misallocation and measurement error addressing farmer unobservables through farmer fixed effects in their production function.
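The role of farmer fixed effects can be sketched with a small simulated panel. This is not the Gollin and Udry (2021) estimator; it is a generic within-farmer transformation under assumptions we impose for illustration: a single input, a log-linear technology with an elasticity of 0.4, and a time-invariant farmer ability term that raises both input use and output.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def demean_within(values, groups):
    """Subtract each group's mean from its observations (within transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


rng = random.Random(1)    # fixed seed so the illustration is reproducible
TRUE_ELASTICITY = 0.4     # assumed output elasticity of the input
N_FARMERS, N_SEASONS = 500, 6

farmer_id, log_input, log_output = [], [], []
for farmer in range(N_FARMERS):
    ability = rng.gauss(0.0, 1.0)    # time-invariant and unobserved by the analyst
    for _ in range(N_SEASONS):
        x = 0.8 * ability + rng.gauss(0.0, 1.0)    # abler farmers use more input
        y = TRUE_ELASTICITY * x + ability + rng.gauss(0.0, 0.5)
        farmer_id.append(farmer)
        log_input.append(x)
        log_output.append(y)

# Pooled OLS confounds the input elasticity with farmer ability.
pooled_slope = ols_slope(log_input, log_output)

# The within estimator uses only season-to-season variation for each farmer.
within_slope = ols_slope(
    demean_within(log_input, farmer_id),
    demean_within(log_output, farmer_id),
)

print(f"pooled OLS slope:       {pooled_slope:.3f}")
print(f"farmer fixed effects:   {within_slope:.3f}")
```

The within estimator removes the ability term only because ability is assumed constant over seasons; time-varying unobservables and measurement error would remain in the residual and require the additional structure discussed above.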
Coverage biases in profit and production functions also largely depend on the unit of analysis. In

The agricultural household model
A second class of models frequently used in agricultural economics links agricultural production decisions with household welfare. We sketch an agricultural household model to motivate measurement error and coverage biases when welfare analysis of production decisions is an empirical objective. In cases where separability is assumed, the model reduces to a profit maximization and utility maximization problem given production choices. The agricultural household model is a useful example of tradeoffs in measurement error and coverage because we implicitly cover a range of producer models within the agricultural household model. Household decisions are constrained by an agricultural production function, time endowment, and an intertemporal budget constraint (see Bardhan and Udry, 1999; LaFave et al., 2013; Singh et al., 1986). The household's problem is to choose own-produced agricultural goods ($c_{at}$), purchased market goods ($c_{mt}$), agricultural inputs ($x_t$), and leisure ($\ell_t$) to maximize the discounted stream of expected utility, given observed ($\mu$) and unobserved household characteristics ($\varepsilon$).
In a non-separable formulation of the agricultural household model, production factors such as input prices also influence the household's consumption choices. Coverage biases may exist in the collection of input price data if household surveys do not measure market prices faced by farmers.
Imputed input price data for fertilizer, seed, or pesticides/herbicides ignore substantial price variation within input class correlated with product quality and efficacy.
Equation 1 provides the reduced form purchased market goods demand, which can be derived from the first order condition:
$$c_{mt} = c_m\left(p_{mt},\, p_{at},\, p_{xt},\, r_{t+1},\, \pi_t(\theta_t),\, y_t,\, \lambda_t;\ \mu,\, \varepsilon\right) \qquad (1)$$
where good $m$ consumption depends on market ($p_{mt}$) and own-produced agricultural good prices ($p_{at}$), the price of variable inputs ($p_{xt}$) such as agricultural labor, fertilizer, pesticides, or herbicides, interest rates ($r_{t+1}$), farm profits ($\pi_t$) conditional on climate variability ($\theta_t$), exogenous income ($y_t$), and future prices via the marginal utility of wealth ($\lambda_t$). Consumption also depends on household characteristics, both observed ($\mu$; size and composition) and unobservable ($\varepsilon$; food preferences). Input prices affect household consumption when markets are incomplete, and we cannot assume that income alone affects household consumption demand. Therefore, the consumption demand equation includes not only variables that affect household income, but also those that affect production decisions.
While we have discussed above the challenges in measuring agricultural variables in the demand equation, we now highlight coverage biases and measurement error in the estimation of equation 1. First, coverage biases could have significant effects on consumption demand when consumption is measured substantially after agricultural variables are realized and/or uses a different reference period. For example, annual household surveys that record production data from the last agricultural season may be lagged by months relative to household food consumption data, which is often recorded for the last seven-day reference period. Second, measurement error in food consumption aggregates can be substantial, due to the conversion of non-standard units and the subsequent imputed food prices (Oseni et al., 2017). Deaton and Zaidi (2002)  In the next section, we discuss advances in measuring agricultural variables that are paramount to the producer and agricultural household models described above, but also to a wide set of models in agricultural economics that are beyond the scope of this chapter. We note that the two models chosen in this section are examples, but issues of identification, measurement error, and coverage are not limited to producer and agricultural household models. Advances in data infrastructure improve internal and external validity by expanding the possibilities for improved identification and coverage, providing data sources for testing a wide range of potential empirical models.

Advances in Data Collection
The combined availability of new data sources, affordable computing power and data storage options, and digital technologies allowing for innovative modes of data collection (such as mobile and smart phones, tablets, and sensors of all kinds) has created a new data landscape with novel opportunities for more accurate, affordable, and timely data collection (Hill et al., 2019 in data collection have resulted from addressing constraints to data collection in low- and middle-income countries, but we also highlight experiences from high-income settings to emphasize how these issues are in fact globally relevant.  (Dillon et al., 2019; Cohen, 2019), the associated measurement error is larger in relative terms for very small plots but is not correlated with land size (Carletto et al., 2017b).

Advances in selected thematic areas
Innovation is now proceeding in the direction of integrating GPS measurement in CAPI applications through the testing of features allowing plot delineation on preloaded satellite imagery (Masuda et al., 2020) or on printed high resolution imagery (Dillon and Rao, 2021). In the coming years, these developments can be expected to be brought to scale to address some of the drawbacks of measuring land area with GPS units, such as the cost of plot visits and the inability to measure all plots, particularly those that are more distant or particularly large. While in situ GPS measurement certainly reduces bias, some of these concerns about item non-response can be mitigated through imputation methods, which have been shown to effectively predict GPS plot measures for all plots by using farmers' self-reporting alongside other plot characteristics, or by further technological development if plot delineation on high resolution imagery can reduce the drudgery of the field visit that typically plagues GPS measurement.
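The imputation idea can be sketched as follows. This is a deliberately simplified single-predictor regression imputation, not the procedure used in the studies referred to above, which draw on additional plot characteristics and typically on multiple imputation. All plot values and field names (`self_ha`, `gps_ha`) are hypothetical.

```python
import math

# Hypothetical plots: self-reported area for all plots, GPS area only where visited.
plots = [
    {"plot_id": 1, "self_ha": 0.5, "gps_ha": 0.30},
    {"plot_id": 2, "self_ha": 0.5, "gps_ha": 0.42},
    {"plot_id": 3, "self_ha": 1.0, "gps_ha": 0.85},
    {"plot_id": 4, "self_ha": 1.0, "gps_ha": 1.10},
    {"plot_id": 5, "self_ha": 2.0, "gps_ha": 2.30},
    {"plot_id": 6, "self_ha": 3.0, "gps_ha": 3.60},
    {"plot_id": 7, "self_ha": 1.0, "gps_ha": None},    # distant plot, not visited
    {"plot_id": 8, "self_ha": 0.25, "gps_ha": None},   # not visited
]


def fit_log_log(measured):
    """OLS of log GPS area on log self-reported area for the measured plots."""
    xs = [math.log(p["self_ha"]) for p in measured]
    ys = [math.log(p["gps_ha"]) for p in measured]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def impute_gps_area(all_plots):
    """Fill missing GPS areas with predictions and flag the imputed records."""
    measured = [p for p in all_plots if p["gps_ha"] is not None]
    intercept, slope = fit_log_log(measured)
    completed = []
    for p in all_plots:
        row = dict(p)                       # leave the input records untouched
        row["imputed"] = p["gps_ha"] is None
        if row["imputed"]:
            # Exponentiating a log prediction gives a median-type prediction.
            row["gps_ha"] = math.exp(intercept + slope * math.log(p["self_ha"]))
        completed.append(row)
    return completed


completed = impute_gps_area(plots)
for row in completed:
    print(row["plot_id"], round(row["gps_ha"], 3), row["imputed"])
```

Keeping an explicit `imputed` flag lets analysts check the sensitivity of results to the imputed records.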

Agricultural output and yields
Recent empirical work has reviewed the quality of agricultural output data, related to both the level of data collection as well as biases in farmers' self-reporting of agricultural output, with farmers substantially over-reporting production on small plots and under-reporting production on larger plots. Currently, these biases can be corrected through the use of crop cuts on sub-samples, and looking ahead, through Earth Observation data calibrated with ground-truthing from field observations. Two levels of integration will be key to moving the agenda forward: integration between subjective (recall) and objective (crop cuts) data, and between ground and satellite data.
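One simple way to use a crop-cut sub-sample is a ratio calibration by plot-size class, sketched below. The size threshold, the field names, and all quantities are hypothetical, and the crop-cut figures are assumed to have already been scaled up to plot-level output. Model-based calibration would normally use richer covariates.

```python
from collections import defaultdict

SMALL_PLOT_THRESHOLD_HA = 0.5    # hypothetical cut-off between size classes


def size_class(area_ha):
    return "small" if area_ha < SMALL_PLOT_THRESHOLD_HA else "large"


def calibration_factors(subsample):
    """Ratio of crop-cut to self-reported output within each plot-size class."""
    totals = defaultdict(lambda: {"crop_cut": 0.0, "self": 0.0})
    for plot in subsample:
        cls = size_class(plot["area_ha"])
        totals[cls]["crop_cut"] += plot["crop_cut_kg"]
        totals[cls]["self"] += plot["self_kg"]
    return {cls: t["crop_cut"] / t["self"] for cls, t in totals.items()}


def calibrate(plots, factors):
    """Rescale self-reported output on all plots using the sub-sample ratios."""
    return [
        {**p, "calibrated_kg": p["self_kg"] * factors[size_class(p["area_ha"])]}
        for p in plots
    ]


# Hypothetical sub-sample with both self-reports and crop cuts.
subsample = [
    {"area_ha": 0.2, "self_kg": 300.0, "crop_cut_kg": 200.0},
    {"area_ha": 0.3, "self_kg": 400.0, "crop_cut_kg": 300.0},
    {"area_ha": 1.0, "self_kg": 1500.0, "crop_cut_kg": 1800.0},
    {"area_ha": 2.0, "self_kg": 2500.0, "crop_cut_kg": 3000.0},
]
# Hypothetical plots from the full sample, with self-reports only.
full_sample = [
    {"area_ha": 0.25, "self_kg": 350.0},
    {"area_ha": 1.5, "self_kg": 2000.0},
]

factors = calibration_factors(subsample)
calibrated = calibrate(full_sample, factors)
print(factors)
print(calibrated)
```

In this illustration the factor is below one for small plots and above one for large plots, mirroring the direction of the reporting biases described above.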
Where available, administrative data can also be combined with survey data (as well as with satellite imagery and climate data) to produce disaggregated model-based yield estimates (see for instance Erciulescu et al., 2019 for county-level yield estimates in the United States).
Meanwhile, challenges persist in the measurement of yields in fields using mixed or inter-cropping planting techniques (Wineman et al.). Estimating land area apportioned to a specific crop as well as its production is particularly difficult. Most household surveys acknowledge the complications of production and input estimates on inter-cropped plots by identifying these plots and apportioning the area planted, to divide reports of plot-level inputs by production reported by crop. However, proportional input attribution implies crop input demands including fertilizer, weeding, and harvest time are similar by crop, which may not always be an accurate assumption.
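The implication of proportional attribution can be made concrete with a short sketch. The function name, the crops, and the quantities are hypothetical; the point is only that splitting both area and inputs by the same shares forces an identical application rate on every crop in the plot.

```python
def apportion_by_area_share(plot_area_ha, input_kg, crop_shares):
    """Split plot area and a plot-level input across crops using area shares."""
    total_share = sum(crop_shares.values())
    if total_share <= 0:
        raise ValueError("crop shares must sum to a positive number")
    allocation = {}
    for crop, share in crop_shares.items():
        weight = share / total_share      # normalise in case shares do not sum to 1
        allocation[crop] = {
            "area_ha": plot_area_ha * weight,
            "input_kg": input_kg * weight,
        }
    return allocation


# Hypothetical inter-cropped plot: 1.5 ha, 50 kg of fertilizer, 60/40 maize-beans.
allocation = apportion_by_area_share(1.5, 50.0, {"maize": 0.6, "beans": 0.4})
rates = {crop: a["input_kg"] / a["area_ha"] for crop, a in allocation.items()}

for crop, a in allocation.items():
    print(crop, round(a["area_ha"], 2), round(a["input_kg"], 1), round(rates[crop], 1))
```

Both crops end up with the same fertilizer rate per hectare by construction, which is exactly the assumption that may not hold when, for example, fertilizer is targeted at the maize.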
The Global Strategy to Improve Agricultural and Rural Statistics provides methodological guidance on implementing the above methods to measure the area under a given crop in intercropped systems (GSARS, 2018). Unfortunately, guidance on best practices supported by evidence from methodological survey experiments is not currently available. Remote sensing or crop cut production estimates are possible alternatives, but these measures are also challenging to implement. For instance, crop cutting, in addition to its high costs due to the need for closer supervision and multiple visits over the growing period, can only be done in a very restricted time window which may be difficult to plan correctly in a large survey operation. It also carries implementation difficulties that are associated with specific error generation mechanisms (Kosmowski et al., 2021). Furthermore, Wahab (2020) finds a substantial discrepancy between crop-cuts and self-reported output measures, which he ascribes in part to the variability in crop performance within plots, leading to plot area loss in the course of the season.
Yield prediction models based on remote sensing data clearly face bigger challenges the smaller the plots and the more complex the cropping patterns, particularly related to the degree of intercropping or the presence of canopy cover. Lobell et al. (2019) report lower accuracy of remotely sensed production estimates compared to crop cut production estimates for maize intercropped plots in Uganda. However, they also clearly show the benefit of properly calibrating the spatial model through accurate ground-truthing based on high-quality crop cutting, even if only on a small sub-sample of plots. Řezník et al. (2020) compare yield predictions from satellite data with measured yield data on spring barley, winter wheat, corn, and oilseed rape in the Czech Republic, finding the yield predictions to be credible, with only two out of nine measures reporting differences between measured and predicted yields larger than 5 percent.
Agricultural labor data have been typically sourced through labor force surveys or national censuses (with information generally limited to the primary occupation) and used primarily in aggregate-level productivity analysis and macro-level comparisons of national agricultural GDP with labor shares. The availability of higher quality labor data in the last decade has raised questions about the validity of evidence that shows a six-fold labor productivity gap between agriculture and non-agricultural sectors of the economy (Gollin et al., 2014). Studies that use more carefully collected labor data from household surveys have shown that the measured labor productivity gap is substantially reduced when data allow for measuring production per hours worked, as opposed to just per person per year (McCullough, 2017), and for individual fixed effects (Hamory et al., 2021).
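The arithmetic behind the per-person versus per-hour comparison can be sketched with illustrative numbers. The figures below are hypothetical and chosen only so that the per-person gap is six-fold; they are not estimates from any of the cited studies.

```python
def productivity_gap(output_nonag, output_ag, labor_nonag, labor_ag):
    """Ratio of non-agricultural to agricultural output per unit of labor."""
    return (output_nonag / labor_nonag) / (output_ag / labor_ag)


# Hypothetical annual value added per worker in each sector.
VALUE_ADDED_AG = 600.0
VALUE_ADDED_NONAG = 3600.0

# Hypothetical annual hours actually worked per worker in each sector.
HOURS_AG = 700.0
HOURS_NONAG = 2100.0

# Labor measured in persons: each worker counts as one unit.
gap_per_person = productivity_gap(VALUE_ADDED_NONAG, VALUE_ADDED_AG, 1.0, 1.0)

# Labor measured in hours: fewer hours in agriculture narrow the measured gap.
gap_per_hour = productivity_gap(VALUE_ADDED_NONAG, VALUE_ADDED_AG, HOURS_NONAG, HOURS_AG)

print(f"gap per person-year: {gap_per_person:.1f}")
print(f"gap per hour worked: {gap_per_hour:.1f}")
```

Whenever agricultural workers supply fewer hours per year than workers in other sectors, a per-person measure overstates the gap in output per hour, which is why the unit in which labor is recorded matters for this comparison.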
In the US, where data on agricultural labor are collected via a dedicated survey, farm labor hours have historically been difficult for respondents to report, as a low percentage of operators based their responses on formal records (National Research Council, 2008;Ott, 1999). Difficulties in this case refer also to capturing the complexity of the pay structure, recording information on different tasks, since many agricultural workers perform multiple tasks on the farm (Ridolfo and Ott, 2021), and collecting data on contract workers (Ridolfo and Ott, 2020).
Advances in the measurement of labor inputs in recent years have been based on both technology-enhanced and low-tech innovations, including by leveraging mode of data collection to ease the cognitive response burden. Notable technology-enhanced innovations include the use of mobile phones for high-frequency interviews (Arthi et al., 2018; Dillon, 2012), and the use of wearable accelerometers for the measurement of physical effort (Akogun et al., 2020). Arthi et al. (2018) find that phone surveys can be a more accurate alternative to face-to-face interviews for measuring labor inputs, and this finding remains consistent when the research question calls for collecting high-frequency data or repeated measures. In such cases, the cost of additional phone interviews is a fraction of the cost that would be implied by additional face-to-face visits (Table 1).

Table 1. Per-Household Interviewing Cost Increases
Source: Arthi et al., 2017.
Akogun et al. (2020) measure the physical activity of sugarcane cutters using accelerometers, which is a direct measure of effort in their piece rate wage setting. They find a high correlation between administrative data on output per worker recorded by the firm and the worker's physical activity, as well as large changes in the intensity of such activity in response to malaria testing and treatment. Integrating objective physical activity measures into a sub-sample of observations in national surveys may be used to calibrate biases in reported time as well as to predict effort-based measures of agricultural labor productivity.
Aside from the mode of data collection, substantial recent advances in methodologies relate to the key set of survey design choices in agricultural labor measurement. Bardasi et al. (2011) investigate how survey design elements such as screening questions and proxy response result in biased estimates of labor force participation, hours worked, and income by gender and sector of employment. Female labor participation statistics are not affected by the use of proxy respondents in their survey experiment from Tanzania, but male employment rates are, due to the under-reporting of agricultural activity by proxy respondents. Evidence from Malawi indicates that employment is further under-reported when recall periods increase and when women are the subject of proxy reporting. Recent advances in data collection software and the ubiquitous use of CAPI can also make it easier to avoid another source of coverage-related bias unearthed by Ambler et al. (2020). They show that the fact that household members are not listed randomly in the labor module, coupled with respondent fatigue, leads to age and gender related biases in employment measures.
Software that allows for randomizing the ordering of household members when collecting data on the labor module can mitigate this source of systematic bias, as can avoiding the use of proxy respondents. Avoidance of proxy respondents to minimize measurement error, however, can potentially lead to greater errors of coverage.
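A minimal sketch of such randomization is given below. It is not tied to any particular CAPI package; the function name, the seed string, and the household identifiers are hypothetical. Seeding the generator with the household identifier makes the order reproducible, so that the sequence used in the field can be reconstructed during data cleaning.

```python
import random


def randomized_roster(members, household_id, survey_seed="labor-module-v1"):
    """Return household members in a random but reproducible interview order."""
    # A string seed gives the same sequence on every run for a given household.
    rng = random.Random(f"{survey_seed}:{household_id}")
    order = list(members)     # copy, so the original roster listing is preserved
    rng.shuffle(order)
    return order


# Hypothetical roster in the usual listing order (head first, then by seniority).
roster = ["head", "spouse", "son_17", "daughter_15", "grandmother"]

order_first_call = randomized_roster(roster, household_id="HH-0001")
order_second_call = randomized_roster(roster, household_id="HH-0001")
order_other_household = randomized_roster(roster, household_id="HH-0002")

print(order_first_call)
print(order_other_household)
```

Because the order no longer depends on age or relationship to the head, any fatigue effect is spread evenly across household members rather than concentrated on those typically listed last.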
The effects of different recall periods for measuring agricultural labor are investigated by Arthi et al. (2018), who use Tanzania data to compare weekly agricultural labor reporting with end-of-season reporting. The latter is associated with a fourfold increase in the hours reported by individuals at the plot level, in comparison to reports obtained via weekly visits, their preferred benchmark. However, they note that aggregation to household-level reporting causes the differences in reported hours between the weekly and end-of-season recall periods to disappear. In interpreting these findings, the authors note how different recall biases are associated with memory decay (which shorter recall would help address), but also with the mental burden of reporting that varies by the level of aggregation. In their study, aggregating plot-person hours to the household level happens to compensate for competing biases arising from over-reporting at the intensive margin and under-reporting at the extensive margin. However, this is not a result that can be extrapolated to other settings. Understanding the level of disaggregation at which individuals provide the most accurate reports on their agricultural labor inputs should be an area of focus for future research. Research by Gaddis et al. (2020) in Ghana finds much less dramatic differences in the magnitude of the recall bias compared to Arthi et al. (2018), but also discovers that an important source of bias is the omission of plots and farm workers at the listing stage, which can be mitigated by explicit attention to this specific aspect of survey design.
In the United States, a substantial amount of randomized testing (Reist et al., 2019) and cognitive interview piloting (Ridolfo and Ott, 2020; Ridolfo and Ott, 2021) is routinely devoted to testing innovations aimed at easing response burden and addressing complex questions about workers' remuneration and tasks. The findings suggest that the optimal design of instruments to collect labor data will likely require a fair amount of adaptation based on the context and the intended use of the data. For instance, while respondents in the United States appear comfortable separating base and overtime hours, they had difficulties distinguishing base pay from bonuses, a concept that is hardly applicable to respondents paying piece rate (Ridolfo and Ott, 2021). For low-income settings, Sagesaka et al. (2020) have systematized recent findings from survey research into practical data collection guidance for survey practitioners.

Non-labor inputs
One empirical regularity that has recently come to the fore is that measurement error in land area is strongly correlated with farmers' self-reporting of their application levels of agricultural inputs (Abay et al., 2019b; Bevis and Barrett, 2019; Burke et al., 2019). These patterns in the data naturally raise questions on the mechanisms that drive the relationship between non-classical measurement error (NCME) for land area and self-reported input application rates. One such mechanism could be that farmers have a mental heuristic for input application rates and thus self-report, for example, seed or fertilizer quantities based on the amount of land they believe they cultivated, along the lines of the optimal error prediction model of measurement error. Such a heuristic is easy to imagine in the case of fertilizer or seed, for which extension agents and agricultural input dealers commonly offer recommendations in the form of application rates per unit of land cultivated. If this is indeed the mechanism behind the observed correlation between area NCME and agricultural input levels or application rates, it could imply either of two possibilities. On the one hand, NCME in land area might propagate into NCME in agricultural input data; that is, the measurement error in inputs would merely reflect the error in land area, permitting statistical correction using observed area measurement error. On the other hand, land area NCME could actually affect agricultural input use by farmers, if farmers' decisions on input intensity are based on misperceived land area (Abay et al., 2019b). Eliciting input use information after the collection of objective land area measures to better understand how the mental heuristic of optimal application rates may be influencing farmers' self-reporting is a key methodological research area for improving data collection on input use.
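The first of these two possibilities can be illustrated with a short sketch. It assumes, purely for illustration, that farmers report input quantities as a recommended rate multiplied by the area they believe they cultivate, and that the input actually applied follows the same rate on the true area. The function names and all quantities are hypothetical.

```python
def reported_input_under_heuristic(rate_kg_per_ha, self_reported_area_ha):
    """Input quantity a farmer reports when reasoning from rate times perceived area."""
    return rate_kg_per_ha * self_reported_area_ha


def corrected_input(reported_input_kg, self_reported_area_ha, gps_area_ha):
    """Rescale a reported input by the ratio of measured to self-reported area."""
    if self_reported_area_ha <= 0:
        raise ValueError("self-reported area must be positive")
    return reported_input_kg * gps_area_ha / self_reported_area_ha


RECOMMENDED_RATE = 100.0    # hypothetical recommendation, kg of fertilizer per ha
SELF_REPORTED_AREA = 0.5    # hypothetical area the farmer believes was cultivated
GPS_AREA = 0.4              # hypothetical objectively measured area

reported_kg = reported_input_under_heuristic(RECOMMENDED_RATE, SELF_REPORTED_AREA)
corrected_kg = corrected_input(reported_kg, SELF_REPORTED_AREA, GPS_AREA)

print(f"reported fertilizer:  {reported_kg:.1f} kg")
print(f"corrected fertilizer: {corrected_kg:.1f} kg")
print(f"implied rate on measured area: {corrected_kg / GPS_AREA:.1f} kg/ha")
```

This rescaling is only appropriate under the propagation interpretation. If farmers instead purchase and apply inputs according to the misperceived area, the reported quantity reflects what was actually applied and the same correction would introduce error rather than remove it.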
Aside from application rates, measuring the quality of inputs is an important and often unobserved characteristic of input investments. The fact that input quality is often not directly observable poses a problem not only for the analysis of agricultural productivity, but also for farmers in making decisions on input use. Perceived quality may influence input demand and use more than actual attributes of quality. Such questions have been difficult to explore until recently, as economists have begun complementing traditional data collection from farmer respondents with laboratory analysis. The latter is also not free from error, however. An early study by Bold et al. (2017) finding widespread problems with nutrient quality in Uganda has since been contradicted by a series of large-scale sample surveys finding limited evidence of widespread quality issues in synthetic urea in East Africa. There is also evidence that perceptions of quality are influenced by other factors that in turn influence productivity, such as rainfall patterns (Hoel et al., 2021;Michaelson et al., 2020;Ashour et al., 2019a;Sanabria et al., 2018). Collecting better data on both perceived and actual fertilizer quality is essential to explain farmers' behavior with respect to their adoption, and the extent to which possible remedial action for low levels of fertilizer use may come from certification or the use of other policy levers (Hoel et al., 2021).
For herbicides, Ashour et al. (2019b) find that there are widespread quality issues with the herbicides available in local markets in Uganda, but that farmers' perceptions of poor herbicide quality are overstated and poorly correlated with actual measures of product quality from laboratory testing. Prices correlate with measured quality, but very weakly. In a technical report using the same dataset, Ashour et al. (2019a) report poor correlation between tests in two different labs and ascribe the difference to flawed procedures in one of the facilities, a reminder to researchers that 'objective' measures conducted with the aid of technology are, as with any measurement operation, not immune from error.
In countries that have administrative data systems around the use of agricultural inputs such as pesticides, these offer the potential to be combined with survey data both to improve the accuracy of the data compared to respondents' recall and to reduce the burden on survey participants. This is for instance the case in the United States, where at least some states (Arizona, California) are using data from mandatory pesticide use reporting systems instead of asking farmers (NRC, 2008). However, these methods may be more difficult to implement when the objective is to collect crop or field level data: in such cases, the US National Agricultural Statistics Service (NASS) collects data from respondents on one randomly selected field for selected crops of interest (NASS, 2021). A similar type of use of multiple data sources may also be more difficult to implement in poorer countries, where administrative data systems suffer from low quality and credibility.
These studies are examples of ways in which administrative or market-level data collection can be combined with household-level survey data to provide evidence on the use and quality of inputs available to farmers. In terms of our conceptual framework, this implies efforts towards improving the accuracy of input quality (via objective testing) and quantity (via the use of administrative records) estimates as well as the coverage (e.g. via market-level sampling for quality testing which can be linked to farm level behavioral variables), but also to the collection of additional (omitted) variables related to farmers' perception of quality, as these may be only tenuously linked to actual quality attributes.

Soil quality and soil health
Stevens (2018) writes that soil health "is a straightforward concept in the abstract, but difficult to define in practice". Not only do soils have many attributes that require multiple, complex measures, but these attributes are also interdependent, and the attributes (or their combinations) of significance can vary depending upon the application for which an assessment of soil health is needed.
In Europe, the 'Land Use/Cover
In low-income settings, where large-scale soil surveys are not usually available, recent research has cast serious doubts on the reliability of farmers' self-reporting on soil quality and soil health, with studies in Ethiopia (Carletto et al., 2017a; Kosmowski et al., 2020a), Kenya, and Tanzania (Berazneva et al., 2018) consistently finding poor or no correlation between farmers' assessments of soil quality and objective measures based on lab analyses or portable spectrometers. Unlike land area measurement, there are no clear systematic biases emerging in the case of soil quality attributes; the concern is mainly with the lack of explanatory power of the traditional measures relying on farmers' assessments. While some predictive power has been reported for soil type (Berazneva et al., 2018) and soil color and texture (Gourlay, 2017), the reported correlations are very weak.
Portable spectrometers piloted for in situ objective measurement of key soil health features such as organic carbon, pH, nitrogen, potassium, and clay percentage have been shown to perform well when compared to Conventional Soil Analysis (Carletto et al., 2017a; Kosmowski et al., 2020; Vasques et al., 2020). While portable spectrometers are not nearly as widely available as GPS units, their cost and weight are expected to decline rapidly as technology advances, making the prospects for their use at scale ever more attractive, particularly when soil attributes are important for the research question at hand. In lieu of field-ready soil sensors, some survey efforts have moved towards smartphone-based soil assessments such as LandPKS (Herrick et al., 2013), but these have largely been on pilot-level or small-sample surveys (see for example Nord and Snapp, 2020).
The other related avenue through which advances in soil health data can be expected to rapidly materialize is the integration of remote sensing data with georeferenced survey data. The correlation between available modeled georeferenced data such as AfSIS (see Hengl et al., 2015 for details) has been shown to be encouraging but far from perfect, particularly when there are high variations in soil quality within a given geography (Gourlay et al., 2017). As more objectively measured ground data on soil health is collected and used to train models based on Earth Observation data, however, the quality of the modeled data will increase (Kosmowski et al., 2020a).

Agricultural machinery and farm implements
Agricultural capital in the form of machinery and farm implements can increase the production capacity of smallholder farmers. Understanding the mechanization of agriculture is critical to understanding changes in farm size and profitability over time. While it is generally regarded as easy for farmers to recall agricultural capital within the household, the plot-level attribution and control of such capital are measurement challenges. Plot-level attribution of machinery use is often avoided, as it may be assumed by the survey designer that agricultural capital is shared equally in the household. inputs modules to assess differences in the intra-household allocation of inputs (Udry, 1996).
Recall periods for agricultural machinery and implements usually focus on the availability of assets over the previous 12 months. Differences in input use by crop-plot-season are important to capture, but this may not be possible if the frequency of survey administration is annual rather than seasonal. The age of machinery is usually collected with the intention of calculating depreciation, but much of the depreciation of machinery depends on its maintenance and frequency of use.

Crop variety identification
Possibly the most important technological choice farmers face is that of choosing which crop, and specifically which crop variety, to plant. A good proportion of the budget for agricultural research globally is directed at breeding crops and livestock with desirable traits. While data on the uptake and impact of improved varieties have traditionally been collected by eliciting information from either farmers or panels of experts, the shortcomings of such methods have become evident in the past decade; as a result, they are gradually being replaced or combined with more objective methods (Maredia et al., 2016;Stevenson et al., 2018;Wossen et al., 2019). The method that is currently being more widely adopted is DNA fingerprinting, which entails the collection of plant material that is subsequently sent for lab analysis. While logistically cumbersome, its implementation has been shown to be possible at reasonable scale, and protocols for its adoption are emerging (Poets et al., 2020).
Asking farmers to identify the crop variety they are planting has often been shown to be utterly inaccurate, even when augmented with photo aids or phenotypic trait-related questions aimed to improve the accuracy of the data. This holds true for different crops across different settings, including sweet potato (Kosmowski et al., 2019), wheat, maize, barley and sorghum in Ethiopia (Jaleta et al., 2020;Kosmowski et al., 2020b;Yirga et al., 2016), cassava in Ghana and beans in Zambia (Maredia et al., 2016), maize in Uganda  and Tanzania (Wineman et al., 2020), and cassava in Vietnam (Le et al., 2019), Colombia (Floro et al., 2017) and Nigeria (Wossen et al., 2019). A few studies report more encouraging self-reported results, with farmers in Bangladesh being most able to discern modern from traditional varieties for both rice (Kletzschmar et al., 2018) and lentils (Yigezu et al., 2019). The latter study is also of interest in that the panel of experts was, on the contrary, found to overestimate adoption by 89 percent compared to DNA fingerprinting.
Even in the studies where farmers' self-reporting is close to the objective benchmark, DNA fingerprinting was found to have advantages for the analysis of determinants of adoption (Yigezu et al., 2019) as well as for detecting lack of authenticity in modern varieties present in seed markets and in the field (Kletzschmar et al., 2018). When technology adoption is an important component of research design, researchers should consider adopting DNA fingerprinting as a data collection method. The option of conducting such objective, yet more costly, measurement could be more routinely considered on a sub-sample or for priority crops of interest. When field visits for area measurement or crop cuts for output measurement are being performed, the research design can exploit significant economies of scale by performing additional tasks during the same visit to the plot. This does pose other constraints to data collection processes, as such field work needs to be performed within a specific time window (i.e., while crops are still in the field). Ethiopia has been able to incorporate DNA fingerprinting at scale in a national socioeconomic survey for three main crops: wheat, barley, and sorghum (Kosmowski et al., 2020b). Barriga and Fiala (2020) use DNA lab analysis to investigate seed quality along the seed supply chain, looking at genetic variation, physical purity, and performance, focusing for the latter on germination rate, moisture level, and vigor. This allows them to identify issues with the handling and storage of seeds, rather than counterfeiting or adulteration.
In addition, Kosmowski and Worku (2018) report promising results for the use of spectrometers for varietal identification on cultivars of barley, chickpea, and sorghum in Ethiopia, with an overall correct classification accuracy of respectively 89, 96, and 87 percent in their sample. Sinha et al. (2020) report similarly encouraging results from a study on banana varieties in Uganda by extrapolating the ground-based hyperspectral measures to high-resolution satellite imagery, therefore creating the potential of mapping the distribution of banana varieties at a higher spatial resolution. This is an exciting area of innovation which is currently at the experimental stage but is likely to become mainstream over the next few years, provided validation efforts continue and implementation protocols are devised.

Measurement of farm level food losses
While research on food losses has increased in recent years, the available data is extremely heterogeneous with respect to the measurement approaches used, the stages of the value chain investigated, and the conceptual framework adopted. Bellemare et al. (2017) propose a different conceptualization of food waste from that used by others in this domain, whose estimates of food losses would be largely overestimated according to their definition (Table 2).

Table 2. A Comparison of Quantity and Cost Estimates of Food Waste Across Definitions
Source: Bellemare et al., 2017.
In the existing literature, storage is the stage of the value chain where most food losses are concentrated (FAO, 2019). Xue et al. (2017) attribute differences in food losses to different storage conditions, and research from Bachewe et al. (2018) and Minten et al. (2015) also points to the importance of storage losses. Despite the interest and prominence that the debate on food losses has acquired, data of sufficient quality and robustness on storage losses are lacking, hindering the design and implementation of interventions to reduce them systematically and at scale.
Comparisons between objective and self-reported measurements of food losses routinely find systematic differences between the two. While objective measures are more accurate, they are also more costly, time-consuming (selecting, sorting, and weighing samples of grains), and logistically challenging. Model-generated methods of estimation are therefore being researched, as they offer a possibility to deliver measurements in a more cost-effective manner (FLW Protocol Steering Committee, 2016). Model-based estimates could be used in conjunction with rather than as a replacement for survey data, for instance, by estimating losses between survey rounds. These estimates can determine storage outcomes by taking into account the effect of variables related directly to storage conditions (e.g., the type of storage facility, the application of pest protection products, or the moisture content at which the grain is stored) as well as contextual variables (e.g.,

Livestock production and management
The bias of agricultural economists for crops over livestock is reflected in the relatively limited efforts seen to date on developing better data collection methods for livestock (Kristjanson et al., 2014;Little et al., 2008;McCarthy et al., 2004;Pica-Ciamarra et al., 2014).
Most methodological work has been directed at pastoral or agro-pastoral systems, which is to be expected, given both the specific challenges these systems pose to data collection and the importance of livestock for people living in regions where pastoralism is prevalent. Recent work in this area has focused on herd mobility, to address the challenges that it poses for enumerating nomadic or semi-nomadic populations, as well as to study mobility patterns linked to the state and management of natural resources (e.g., grazing, water) upon which livestock and their herders depend.
For example, Himelein et al. (2014) conducted a pilot in the Afar region of Ethiopia to explore the use of random geographic cluster sampling as an alternative to conventional sampling methods.
The approach is based on the random selection of points around which circles are drawn and all eligible respondents found inside those circles interviewed. The approach aims to reduce the under-coverage of mobile populations expected when samples are drawn based on lists of dwellings within a primary sampling unit, as is typically the case for household surveys.
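The selection step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used by Himelein et al. (2014): the bounding box, the circle radius, and the respondent records are hypothetical, circle centres are drawn uniformly in latitude and longitude (a simplification that is not equal-area), and the sketch does not compute the selection probabilities that a real design would need for weighting.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def draw_circle_centres(bbox, n_points, rng):
    """Draw circle centres inside a (lat_min, lat_max, lon_min, lon_max) box.

    Assumption: uniform draws in degrees, which is only approximately
    uniform over area for a small study region.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    return [(rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max))
            for _ in range(n_points)]


def respondents_in_circles(centres, radius_km, respondents):
    """For each circle, list the ids of respondents located inside it.

    `respondents` is a list of (id, lat, lon) tuples, standing in for the
    eligible people that field teams would find and interview.
    """
    selected = []
    for c_lat, c_lon in centres:
        inside = [rid for rid, lat, lon in respondents
                  if haversine_km(c_lat, c_lon, lat, lon) <= radius_km]
        selected.append(inside)
    return selected
```

In practice the list of respondents is not known in advance; enumerators canvass each circle on the ground, which is what allows the method to reach people who would be missed by a dwelling-based listing.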
Otherwise, methods have not evolved significantly from the surveys at enumeration points (i.e., water, dipping or vaccination points, stock routes) and aerial surveys recommended by ILCA in the 1970s and 1980s (FAO, 1992;GSARS, 2016;ILCA, 1990), except for the fact that these aerial surveys can now also be implemented using higher-resolution imagery captured by drones (Chamoso et al., 2014) or satellites. However, these methods are still in the experimental stage and have not to our knowledge been applied at scale.
Advances in spatial data, both from satellites and on the ground, are creating opportunities for the collection of data on the interaction between livestock, mobility, and natural resources. On the ground, GPS trackers placed on cattle have been used to characterize the mobility of herds and their use of rangeland resources (Bailey et al., 2018;Liao et al., 2017, 2018;Swain et al., 2011;Turner et al., 2000), although few of these applications have appeared in economics journals. From space, satellite imagery is being used to characterize the state of rangeland resources (Reinermann et al., 2020) and we expect that the potential for applications in agricultural and natural resource economics will expand dramatically as a result.
On improved measures of livestock productivity, recent studies led by economists are limited.
Specialized livestock surveys often select a random animal in the herd and ask questions about that animal. In household surveys, this is not generally done, as the herd may not be present, and a visit would add to the interview time. Livestock experts also tend to measure productivity using the reproductive capacity of the herd, and thus their focus is on demographic parameters (Lesnoff et al., 2014). For milk off-take, a methodological study conducted in Niger comparing different types of recall to an objective measure provides some confidence in the accuracy of recall measures (Zezza et al., 2016a). Other technologies, such as 3D and thermal cameras, are being used to assess livestock weight and health (Song et al., 2018;Stajnko et al., 2008), but mostly by animal scientists rather than economists or statisticians. Nonetheless, there is a clear potential for economic applications to emerge, as the value of livestock is primarily determined by parameters linked to weight and health, which are notoriously difficult to elicit from survey respondents. Guidance for data collection on livestock in low-income countries has been systematized in recent years in GSARS (2016) and Zezza et al. (2016b). Model-based estimates of livestock populations have been developed by researchers at the FAO (Robinson et al., 2014) and are continuously being updated as new spatial datasets become available and modeling techniques evolve (Nicolas et al., 2016;Da Re et al., 2020).

Land tenure
Holden et al. (2016)  For household farms, whenever individual-level data are of interest, such as when the research objective is to study gender gaps in productivity, wealth, or vulnerability, land ownership should be reported by self rather than proxy respondents, owing to well-documented and large discrepancies between proxy and self-responses. While research on the implication of different possible approaches is still needed, the primary issue is the method of respondent selection, where researchers increasingly favor interviewing multiple individuals per household. Approaches may vary, and will also depend on the objective of the analysis, but they can be reduced to essentially three options: (1) interview all household members, (2) focus on the members of the principal couple if one is present, or (3) select a random age-eligible household member and his/her partner if applicable (Doss et al., 2019). When multiple household members are interviewed, they should be interviewed separately and whenever possible concurrently or consecutively, so as to avoid the possibility of contamination in their responses (United Nations, 2019).

Climate: weather events, perceptions of and adaptation to climate change
Climate data have experienced a revolution in recent decades which continues to the present day.
While climate and weather have always been central to explanations of agricultural productivity, attention has increased with the emergence of debates on climate change, climate-smart agriculture (Lipper et al., 2018), and index-based insurance (Benami et al., 2021;Carter et al., 2017;Jensen and Barrett, 2017;Rosenzweig and Udry, 2014). Dell et al. (2014) and Auffhammer et al. (2013) provide excellent reviews of the types of available climate data as well as their accompanying measurement bias and coverage concerns, which economists should consider when relying on climate data for making inferences. In terms of the production and availability of climate data, there has been a surge in data from remote sensing and in situ sensors (which are discussed later in the chapter), as well as concerns in Africa and small island states regarding the decline in the availability of traditional meteorological stations (Dinku, 2019;Dobardzic et al., 2019).
Weather data are commonly classified into four categories: ground station data, gridded data, satellite data, and reanalysis data. Data from ground stations offer direct observation of key weather variables, but their coverage is neither universal nor constant over time, with weather stations being relatively sparse in many low- and middle-income countries. Additionally, their coverage and trends are often related to the distribution of weather variables, posing estimation problems similar to those of selective attrition. Gridded data provide complete coverage at different resolutions by interpolating weather station data and assigning a value for weather variables for each cell on the grid. They present the desirable advantages of balanced panels, but analysts should be aware that results will differ for different products, particularly for outcomes that have greater spatial variation such as precipitation. The presence of missing values in the underlying station data and the spatial correlation introduced by extrapolation algorithms both create potential biases in the estimated coefficients and standard errors when gridded products are used as independent variables in econometric analyses (Dell et al., 2014).
Satellite data use readings from satellite-borne sensors but do not directly measure weather events.
Their time series are shorter than those for station and gridded data (starting in the 1990s and increasing since the 2000s), and their quality may not be uniform, due to changes in satellites and sensor features. Reanalysis data combine information from other weather data sources and elaborate them with a climate model to estimate (not simply interpolate) weather variables across a grid. Analysts should consider whether such modeled data are preferable to interpolated gridded data, given the objective of the analysis, and should be aware that the correlation across models is often weak, particularly for rainfall data. Dell et al. (2014) and Auffhammer et al. (2013) provide a more detailed discussion, while Michler et al. (2020) and Parkes et al. (2019) provide examples of empirical applications testing the behavior of different gridded products as explanatory variables in agricultural productivity analyses for India and sub-Saharan Africa, respectively.
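To make the interpolation step behind gridded products concrete, the sketch below fills a grid from a handful of station readings using inverse distance weighting. This is an illustrative stand-in, not the algorithm of any particular gridded product, which typically use more elaborate methods; the planar coordinates, the `power` parameter, and the function names are assumptions made for the example.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    `stations` is a list of (x, y, value) tuples in planar coordinates
    (an assumption; real products work with geographic coordinates).
    """
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - tx, y - ty)
        if dist == 0.0:
            return value  # the cell centre coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def build_grid(stations, x_coords, y_coords, power=2.0):
    """Return one interpolated value per grid cell, as rows indexed by y."""
    return [[idw_interpolate(stations, (x, y), power) for x in x_coords]
            for y in y_coords]
```

Because every cell is a weighted average of the same few stations, neighbouring cells share measurement error by construction, which is one way to see why interpolated values are spatially correlated and why sparse station networks matter for inference.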
Analysts must also identify the most appropriate set of climatic variables to use when specifying explanatory models for outcomes heavily dependent on climatic inputs. Advances have come from the increased cross-fertilization between crop science and statistical models, which has expanded the range of climate variables used in empirical analysis beyond standard rainfall and temperature measures. Newly included climate variables include growing degree days (GDD) and extreme heat degree days (EHDD) as well as measures to better account for humidity and evapotranspiration such as vapor pressure deficit (VPD), wind speed, and sunshine duration (Roberts et al., 2013;Zhang et al., 2017). Challenges remain for statistical models in accounting for the effects of carbon dioxide (CO2) that accompany warming or the concentration of ozone (O3) that may be associated with the burning of fossil fuels (Lobell and Asseng, 2017).
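As an illustration of how degree-day variables are typically constructed, the sketch below accumulates GDD and EHDD from a series of daily mean temperatures. It is a simplified version under stated assumptions: the thresholds of 8 and 30 degrees Celsius are placeholders chosen for the example (appropriate values are crop-specific), and daily means are used in place of the within-day temperature distribution that more careful applications rely on.

```python
def degree_days(daily_mean_temps, t_low=8.0, t_high=30.0):
    """Return (GDD, EHDD) accumulated over a season.

    GDD  = sum over days of min(max(T - t_low, 0), t_high - t_low)
    EHDD = sum over days of max(T - t_high, 0)

    `t_low` and `t_high` are illustrative thresholds in degrees Celsius,
    not values taken from the studies cited in the text.
    """
    gdd = 0.0
    ehdd = 0.0
    for temp in daily_mean_temps:
        # Heat between the two thresholds counts towards crop growth.
        gdd += min(max(temp - t_low, 0.0), t_high - t_low)
        # Heat above the upper threshold is tracked separately as extreme heat.
        ehdd += max(temp - t_high, 0.0)
    return gdd, ehdd
```

Splitting exposure into the two measures lets a regression assign different, and possibly opposite-signed, coefficients to beneficial and damaging heat, which a single average temperature variable cannot do.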
In parallel, there have also been increased efforts to capture both subjective perceptions of climate change and the adoption of adaptation practices by farmers. While several researchers have engaged in collecting data with this objective in mind (e.g., Di Falco, 2011) there have been few attempts (McCarthy, 2011) to systematize data collection instruments in this domain; as such, this remains an area in need of further development. Moreover, recent studies comparing self-reported data on weather events to recorded, observed weather data find a very weak correlation between the two. More importantly, they find that self-reported weather data are influenced by variables of interest such as the involvement in off-farm activities (Nguyen & Nguyen, 2020;Waldman et al., 2019). Self-reported data hold more promise for investigating perceptions and adaptation actions by farmers, whereas indicators referring to realized weather events should be based on objective data whenever possible. Researchers of smallholder, rain-fed production systems face particular challenges in achieving the granular resolution required for conducting plot-level analysis of the determinants of productivity, yield variability, and other key outcomes.

Earth Observation
The ever-increasing number of satellites orbiting the Earth has exponentially increased the availability of satellite-borne sensors supplying a variety of data at high temporal and spatial resolution. A classification of the satellite sensors categories, with their main features, as per the classification of the European Space Agency is provided in Table 3. for global soil mapping). The combined use of multiple sources is the most promising avenue for agricultural data systems to minimize error and maximize coverage. As for climate variables, users should be aware of the error structures present in modeled estimates when using them as independent variables in econometric analyses. One key obstacle to using Earth Observation data in conjunction with spatially explicit survey data is that of overcoming confidentiality concerns.
For some years now, the United States Department of Agriculture has been aware of the lack of precise spatial information as a major weakness of their flagship ARMS survey, limiting its value for a range of applications. Other international survey programs such as the Demographic and Health Survey (DHS) and the Living Standards Measurement Study (LSMS) adopt protocols to publicly disseminate 'masked' coordinates while preserving anonymity. However, researchers and the global statistical community are still searching for dissemination standards that can maximize the value of spatially explicit data for analytical applications while also preserving anonymity (Croft et al., 2021).
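One simple way to see what masking coordinates involves is random displacement: each location is moved by a random distance in a random direction before release. The sketch below illustrates that idea only; it is not the protocol of the DHS, the LSMS, or any other program, and the maximum displacement `max_km` is a hypothetical parameter chosen by the user.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def displace_point(lat, lon, max_km, rng):
    """Move a point by a random bearing and a random distance up to `max_km`.

    Uses the standard destination-point formula on a sphere. Returns the
    masked (lat, lon) in degrees, with longitude wrapped to [-180, 180).
    """
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    angular = rng.uniform(0.0, max_km) / EARTH_RADIUS_KM
    lat1, lon1 = math.radians(lat), math.radians(lon)
    lat2 = math.asin(math.sin(lat1) * math.cos(angular)
                     + math.cos(lat1) * math.sin(angular) * math.cos(bearing))
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2))
    lon_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon_deg
```

The tradeoff discussed above is visible in the single parameter: a larger `max_km` makes re-identification harder but adds more noise to any Earth Observation variable extracted at the masked location.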

Crowdsourced and citizen-generated data
An innovative source of data that is likely to be increasingly used for research in the coming years is citizen-generated data (Lämmerhirt et al., 2018). This includes data generated via crowdsourcing, that is, by enlisting a large 'crowd' of individuals (volunteers or for pay) or devices (e.g. sensors) to collect and share data. In the cognitive science discipline, one third to one half of the scientific papers in top tier journals are now based on crowdsourced datasets (Stewart et al., 2017). However, at the time of this writing, crowdsourced data are still relatively underused in agricultural economics and are more often employed for operational purposes rather than academic work. In economic research more broadly, the disciplines most likely to use such data are those more amenable to the wholesale enlisting of respondents through dedicated platforms, such as labor market or consumer research. Citizen-generated data are already contributing or demonstrating the potential to contribute to advancing the global data agenda (Fraisl et al., 2020).
Their supply and use can be expected to expand rapidly in the coming years, but this will require solutions to overcome issues around quality control and validation (Balázs et al., 2021;Wiggins et al., 2021). In the agriculture and food domain, crowdsourced data are more common in price data collection efforts, where agents or volunteers can be recruited to survey markets (UN Global Pulse, 2015;Zeug et al., 2017;Ochieng and Baulch, 2020). They are also used for obtaining climate data, such as rainfall, which is less correlated in space (Minet et al., 2017) and can be crowdsourced by connecting micro rain gauges to the internet (Van de Giesen et al., 2014). Another option is soil data collection, which can be crowdsourced to farmers using smartphone apps to collect soil profile information (Herrick et al., 2013). One study crowdsourced the visual interpretations of satellite imagery from popular mapping applications to estimate the global distribution of field size (Lesiv et al., 2019). In a review article, Ebitu et al. (2021) identify data collection as the main current thrust for citizen science in agriculture, with key challenges including validation procedures, but primarily the recruitment, motivation, and retention of volunteers.
Citizen-generated data are attractive due to their potential to return data at high levels of spatial and temporal resolution with relatively limited costs. However, these data present significant limitations in their representativeness and the quality of the data generation process that must be understood and managed for statistical inference. Based on a review of survey data, Wiggins et al. (2011) propose a quality assurance framework for citizen science data organized along two categories of sources of errors (which may derive from participants or field protocols) and three entry points in the data production process. While recognizing the huge potential of citizen science data for agriculture and beyond, it is clear that before it can become mainstreamed in data production, more effort must go into ensuring that data collected through "volunteers" with varying levels of expertise and commitment are of acceptable quality (Bonter and Cooper, 2012). Mehrabi et al. (2021) warn of an emerging global divide in data-driven farming, linked to the differential access to mobile data technologies for low-resourced farmers, particularly in Africa, as a result of a combination of differential ownership of mobile devices, poorer data connection, and connectivity costs. However, the rapid increase in both mobile phone ownership and phone coverage in most countries bodes well for a more widespread adoption of phone data collection.
In the cognitive science literature, where crowdsourced data are mainly generated via the Amazon Mechanical Turk platform, concerns have arisen on the professionalization of the individuals contributing the data, with many of them sharing information on internet fora in ways that pose concerns for the independence of the observations (Stewart et al., 2017). Statistics Canada is one of a few statistical offices that have actively published data generated through crowdsourcing, for public policy applications ranging from urban planning to gauging the price of marijuana on the illegal market ahead of its legalization. Tellingly, such data are not accompanied by indications regarding their accuracy (including bias and coverage) that accompany other published statistics (Statistics Canada, 2021).
Methodologies for validating and correcting crowdsourced data through post-stratification efforts (Arbia et al., 2020) or other efforts to assess and improve the bias and variability of the estimates are now starting to emerge (Buil-Gil et al., 2020). With their further development, crowdsourced data will surely become an increasingly important source of data for agricultural economics applications.
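The basic logic of post-stratification can be shown with a short example: observations are grouped into strata, the mean is computed within each stratum, and the stratum means are combined using known population shares rather than the shares that happen to arise in the crowdsourced sample. This is a generic sketch and not the specific estimators proposed by Arbia et al. (2020) or Buil-Gil et al. (2020); the strata and population shares are hypothetical inputs that would have to come from a census or a representative survey.

```python
from collections import defaultdict


def poststratified_mean(records, population_shares):
    """Post-stratified mean of crowdsourced observations.

    `records` is a list of (stratum, value) pairs from contributors.
    `population_shares` maps each stratum to its share of the target
    population (shares are assumed to sum to one).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in records:
        sums[stratum] += value
        counts[stratum] += 1

    # A stratum with no contributors cannot be reweighted; fail loudly.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no observations for strata: {sorted(missing)}")

    return sum(share * sums[stratum] / counts[stratum]
               for stratum, share in population_shares.items())
```

For instance, if three of four price reports come from urban markets but urban markets account for only 30 percent of the target population, the unweighted mean overstates the urban price level, while the post-stratified mean gives the single rural report a weight of 70 percent. The correction removes bias only to the extent that contributors resemble non-contributors within each stratum, and it does so at the cost of higher variance when some strata have few observations.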

Phone surveys
Phone surveys have been around for decades and are in fact part and parcel of the survey data collection in several high-income countries (NRC, 2008;Slavec and Toninelli, 2015). In low-income countries, phone surveys were for some time confined predominantly to the collection of data in conflict- or disaster-affected areas where ground operations are more constrained (Hoogeveen and Pape, 2020), or in urban areas where phone ownership and coverage are higher.
However, their adoption has quickly become ubiquitous with the onset of the COVID-19 pandemic in 2020, as statistical offices and practitioners increasingly recognize how phone surveys can become an integral part of a modernized survey system beyond the contingency of the pandemic response period (Glazerman et al., 2020;Young Lives 2020;Josephson et al., 2021).
There are specific coverage concerns for phone surveys linked to the extent and patterns of (mobile) phone penetration, which can be expected to be correlated with variables of interest. Such concerns are far more severe in low-income countries, where phone penetration has been increasing but is still far from universal, and specifically in rural areas, where agricultural economists often focus their research interests (Dillon, 2012;Ballivian et al., 2015;Leo et al., 2015;Lamanna et al., 2019;Mehrabi et al., 2021;GSMA, 2020;Dabalen et al., 2016).
During the COVID-19 pandemic, phone surveys made it possible to contact respondents amid widespread travel and social distancing restrictions, without exposing them or the enumerators to a health risk. Phone surveys can also generate much more frequent data relative to face-to-face interviews, due to their reduced cost and simplified logistics (e.g., not requiring travel). This can limit survey error for variables that are more prone to recall error (such as agricultural labor (Arthi et al., 2018) or continuous crop production estimates), as well as increase the temporal dimension of data collection for outcomes that have low autocorrelation (McKenzie, 2011), or where short-term changes over time are of the essence, as is the case for the study of resilience (Knippenberg et al., 2019).
Concerns remain for the representativeness and coverage of phone surveys, not only for specific households that may be less likely to have access to a phone connection, but also for individuals who are less likely to be phone owners or are otherwise less represented in phone survey samples (Leo et al., 2015;Brubaker et al., 2021). Such issues can be mitigated when the phone survey sampling frame is based on an adequate set of information on observable household and individual characteristics, as is the case when the phone survey is tied to a recent representative face-to-face survey that collected respondent phone numbers (Ambel et al., 2021). While the sample size of phone surveys using this approach is limited by the sample size of the existing representative survey, phone surveys that use a sampling strategy based on sampling numbers from a list or via Random Digit Dialing (RDD) usually lack sociodemographic information associated with each phone number, making it harder to assess and improve their representativeness (Henderson and Rosenbaum, 2020;Himelein et al., 2020).
Other limitations of phone surveys are related to the type of information that can be asked over the phone, both because respondents may not feel comfortable sharing some content over the phone and because of limits on the overall interview length (Abay et al., 2021). Even so, recent experience has demonstrated the value of collecting information over the phone on issues related to agriculture and food security (Amankwah and Gourlay, 2020;Hirvonen et al., 2021), charting the way for a survey research and implementation agenda to leverage the integration of high-frequency data collection via phones and other mobile technology with traditional face-to-face surveys. Such a mixed-mode approach can carry the added advantage of freeing up space in face-to-face surveys from items that can be collected via remote data collection to generate data that are characterized by both reduced survey error and higher temporal resolution. Mixed-mode models can also be instrumental for achieving the temporal resolution needed for many indicators, as well as for providing a low-cost platform to collect more accurate data on high-frequency, repeated occurrences, such as labor allocation in agriculture and other time use data. This is a likely direction for investment in the survey research agenda in the coming years, where the involvement of agricultural economists in influencing the structure and features of the resulting data will be paramount.

Panel data
Understanding agriculture and the fast transformation processes ongoing in all countries at different stages of development requires panel data. Partly in response to this renewed awareness, we have recently witnessed a surge in the availability of panel data related to agriculture and rural development in low- and middle-income countries. While for decades, the ICRISAT village study (Walker and Ryan, 1990) was one of the few longitudinal datasets allowing research on agricultural and rural livelihoods, over the past two decades the availability of such datasets has increased substantially, even if they remain limited in numbers and geographic coverage. Michigan State in Zambia, among others. These surveys have generated an invaluable wealth of research and contributed to answering key policy questions that cross-sectional data have been unable to convincingly address.
We have discussed above (see section 3.5) several actions that can be taken to manage attrition in panel data, whether ex-ante by improving the design and implementation of tracking protocols, or ex-post. The availability and penetration of mobile phones and the growing adoption of CAPI have been important innovations that have enabled implementing and improving the tracking outcomes for longitudinal surveys in low-income countries. Collecting as many contact numbers as feasible at baseline greatly improves the likelihood of being able to recontact households that move between survey waves, and has also played a fundamental role in allowing the longitudinal tracking of households for phone surveys during the COVID-19 pandemic (Glazerman et al., 2020;Gourlay et al., 2021). Georeferencing households is an additional technology-based solution that can help relocate the site of the dwelling in areas where these are not otherwise clearly marked or identifiable (Witoelar, 2011).
Additionally, the technology embedded in CAPI applications is providing new approaches for survey designers and implementers to understand and manage attrition (Kreuter, 2013) as well as to improve data quality through better remote supervision. Specifically, the paradata produced during CAPI interviews enables the understanding of certain features that predict attrition as they materialize during the course of the interview, including enumerator effects. These paradata can inform actions to minimize attrition and monitor individuals at higher risk of dropping out of the sample, thus countering the predominant coverage issue for longitudinal data (Mercer, 2012;Roßmann and Gummer, 2016).
Finally, following the onset of the COVID-19 pandemic, the availability of well-established long-term longitudinal studies put countries at an advantage for rapidly shifting to high-frequency phone surveys to monitor the impact of the pandemic. This served to fill critical data demands, while also reducing the potential coverage biases of phone surveys by providing better sampling frames and a wealth of information for the ex-post mitigation of bias.

Conclusions
Agricultural data continue to suffer from lack of availability, poor quality, and incomplete coverage. However, in recent years, increasing data demands and emerging policy questions such as climate change and demographic trends, among others, have driven innovation in the sector, with rapid technological change and methodological advances providing an opportunity to collect more and better data at lower cost. In the past two decades, technology has expanded the data production frontier to generate more accurate, granular, and frequent data within shrinking budget envelopes. These innovations have been accompanied by greater attention to issues of measurement error and coverage, focused on ways to attenuate tradeoffs and achieve both high accuracy and high representativeness to the greatest extent possible, and by greater rigor in testing the validity of changes in methods via randomized validation exercises.
This chapter is testament to the increased importance of data and data quality issues within the agricultural economics profession. Researchers hold the power and responsibility to make wiser design choices throughout the data production process. However, reaching the full potential of improvements in data structures for producing policy-relevant empirical analysis may require changes in researchers' incentives and priorities to generate knowledge that is accurate, relevant and credible. For instance, a recent evidence synthesis paper exposes a striking disconnect between empirical agricultural and social science research and policy questions (Porciello et al., 2020).
Throughout the chapter, we have highlighted the importance of improving agricultural data structures for empirical analysis, while accounting for the inherent tradeoffs involved in designing data collection for agricultural research and policy. Measurement error creates both internal and external validity issues that limit causal inference and descriptive understanding of national agricultural systems. Coverage biases also create internal and external validity issues, particularly when limited coverage biases the testing of underlying mechanisms that drive agricultural choices.
While surveys remain the linchpin of agricultural policy analysis, other traditional data sources such as administrative data and agricultural censuses, as well as new data like Earth Observation and remote sensing data, play equally important roles in improving the coverage of agricultural data in its many domains. Additionally, alternative data sources such as citizen-generated data and methods such as machine learning, while not yet mainstreamed in agricultural data production, offer tremendous opportunities for the future. To achieve their potential, these newer data sources require fully developed quality assurance frameworks to address multiple sources of errors and biases, just as traditional ones do.
As data users become more integrated into data system design, data systems can be better designed for empirical research and policy to minimize measurement error. As emphasized by many authors, non-classical measurement error and its effects vary by sample and are not necessarily adequately treated and corrected using ex-post econometric tools. Nonetheless, tradeoffs are inevitable, as increased coverage can lead to measurement error and internal validity concerns, while low coverage reduces policy relevance and the external validity of parameter estimates.
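To illustrate why non-classical measurement error may not be handled by a standard ex-post correction, the following is a minimal simulation sketch. All names and parameter values (the standardized "true" regressor, `rho`, the unit noise variances) are hypothetical assumptions chosen for illustration, not estimates from any study cited in this chapter. It compares the OLS slope under classical error, where the error is independent of the true value, with the slope under error that is negatively correlated with the true value.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, beta=1.0, rho=-0.5, seed=0):
    """Compare OLS slopes under classical and non-classical regressor error.

    rho is a hypothetical parameter: the measurement error equals
    rho * x_true + noise, so rho < 0 means the error shrinks large values
    and inflates small ones.
    """
    rng = random.Random(seed)
    # Hypothetical standardized true regressor and an outcome that depends on it.
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * x + rng.gauss(0, 1) for x in x_true]
    # Classical error: noise independent of the true value.
    x_classical = [x + rng.gauss(0, 1) for x in x_true]
    # Non-classical error: correlated with the true value through rho.
    x_nonclassical = [x + rho * x + rng.gauss(0, 1) for x in x_true]
    return {
        "true": ols_slope(x_true, y),
        "classical": ols_slope(x_classical, y),
        "nonclassical": ols_slope(x_nonclassical, y),
    }


if __name__ == "__main__":
    for label, slope in simulate().items():
        print(f"{label:>12}: {slope:.3f}")
```

Under these assumptions the classical case attenuates the slope by the reliability ratio (here 1/2), whereas the non-classical case yields beta(1 + rho) / ((1 + rho)^2 + 1), which is 0.4 for these values. Rescaling by the classical reliability ratio would therefore not recover the true slope in the second case.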
To promote more systemic learning, validation studies and experimentation must be carried out more systematically within or in parallel to other data collection efforts, and lessons learned from the existing vast body of research in the impact evaluation literature must be streamlined and systematized to offer guidelines on best practices for researchers. Specifically, we propose bridging the gap between the impact evaluation literature and observational studies by methodically incorporating survey experiments to validate new methods and types of data collection. The empirical standard in many validation studies is to use a "gold standard" as numeraire, although such "gold standard" metrics are also likely to be measured with error. As a result, many of the available validation studies tend to measure error relative to a standard deemed "closer to the truth". While technology presents an opportunity to benchmark agricultural measures and generate more objective benchmarks for validation purposes (e.g., DNA fingerprinting to measure improved seed variety), these processes are often considered too costly to be conducted at scale. However, the rapidly decreasing costs and diffusion of new technologies bode well for the future. Furthermore, future survey experiments need to expand the set of econometric techniques to identify unbiased effects of survey design choices beyond pairwise comparisons (Dillon et al., 2019), as has been the case in labor economics, where even the 'gold standard' of United States administrative data has been challenged (Abowd and Stinson, 2013).
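The point about imperfect benchmarks can be made explicit with a short derivation. The notation is an assumption introduced here for illustration: y_i^* is the unobserved true value, m_i is the method under validation with error e_i, and g_i is the "gold standard" benchmark with its own error v_i.

```latex
\begin{align*}
m_i &= y_i^{*} + e_i, \qquad g_i = y_i^{*} + v_i, \\
d_i &= m_i - g_i = e_i - v_i, \\
\mathbb{E}[d_i] &= \mathbb{E}[e_i] - \mathbb{E}[v_i], \\
\operatorname{Var}(d_i) &= \operatorname{Var}(e_i) + \operatorname{Var}(v_i)
  - 2\,\operatorname{Cov}(e_i, v_i).
\end{align*}
```

The observed discrepancy d_i therefore identifies the error of the method under validation only when the benchmark error v_i is negligible. If v_i is independent of e_i, the variance of the discrepancy overstates Var(e_i) by Var(v_i); if the two errors are positively correlated, the comparison can instead understate it.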
Fostering greater integration and interoperability across data sources would also create more opportunities to minimize measurement error while maximizing spatial and temporal coverage.
As shown, sample surveys have been used to ground-truth remote sensing imagery for the estimation of crop productivity and other agricultural metrics from space. These experiments are examples of how reducing measurement error and improving coverage can simultaneously be achieved through better data interoperability. This is best done when proper design choices are made ex-ante, so as to also minimize the measurement errors of the ground data. Achieving greater reliability of remote sensing data could radically improve the geographic granularity, timeliness, and frequency of agricultural estimates, while also potentially constraining costs. Attaining such a goal will require the better coordination and acceleration of research efforts, including the production of multi-purpose ground layers of high-quality measurements.
Maximizing coverage of agricultural data also requires improving other traditional sources such as routine data systems and agricultural censuses. The weak data quality of both sources, as well as the low periodicity and predictability of agricultural censuses, particularly in lower-income countries and regions, remain matters of concern. With regard to administrative data, underfunding and the persistent neglect of extension services in past decades are responsible for the current unenviable state of affairs. Digitalization and the adoption of technological solutions can accelerate progress in this area. Furthermore, linking administrative data to newer data sources such as crowdsourced data or high-frequency community surveys through sentinel sites could go a long way towards enhancing the statistical rigor of administrative data. Rethinking administrative data collection and its interoperability with other data sources, while also ensuring better access, should be prioritized to contribute to minimizing error and maximizing coverage of agricultural data. The trend towards greater reliance on administrative data is well advanced in more developed economies, with low- and middle-income countries lagging behind.
New data sources and modes of data collection such as phone or web surveys as well as crowdsourcing and other forms of citizen-generated data offer tremendous potential to improve the availability and frequency of agricultural data. However, to fully exploit these opportunities, better methods are needed to account for likely biases due to selectivity and under-coverage. It is also important to raise awareness, particularly among young researchers, of the pitfalls of ignoring these potential errors and to build their capacity in addressing them, both at the design and analytical stages.
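One simple ex-post adjustment for under-coverage is post-stratification, sketched below. This is a minimal illustration under stated assumptions, not a method prescribed in this chapter: the function name, the strata labels, the values, and the population shares are all hypothetical, and the shares are assumed to come from a reliable external frame such as a census or a baseline face-to-face survey.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Reweight stratum means by known population shares.

    sample: list of (stratum, value) pairs from the selective survey.
    population_shares: dict mapping stratum -> share in the target
        population (assumed known and summing to 1).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in sample:
        totals[stratum] += value
        counts[stratum] += 1
    # A stratum with no respondents cannot be reweighted; fail loudly.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in strata: {sorted(missing)}")
    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Hypothetical phone-survey respondents: larger farms are over-represented.
phone_sample = [("large", 4.0), ("large", 6.0), ("large", 5.0), ("small", 1.0)]
# Hypothetical shares of each stratum in the target population.
shares = {"large": 0.2, "small": 0.8}

naive_mean = sum(value for _, value in phone_sample) / len(phone_sample)
adjusted_mean = poststratified_mean(phone_sample, shares)
```

In this toy example the unweighted mean is 4.0, while the post-stratified mean is 0.2 × 5.0 + 0.8 × 1.0 = 1.8. The adjustment only corrects for selection across the observed strata; respondents who differ from non-respondents within a stratum would still bias the estimate.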
Finally, relying on direct measurements in contrast to the more common practice of asking farmers to self-report, often based on long recalls, has become steadily more feasible due to the declining cost of technology. Nonetheless, cost considerations remain an issue in implementing such methods on a full sample or, in the case of agricultural censuses, on the entire population of