Mission Impossible? Exploring the Promise of Multiple Imputation for Predicting Missing GPS-Based Land Area Measures in Household Surveys

Methodological research has showcased GPS technology as the new gold-standard in land area measurement in large-scale household surveys. Nonetheless, facing budget constraints, survey agencies continue to measure with GPS only plots within sampled enumeration areas or a given radius of dwelling locations. It is, subsequently, common for significant shares of plots not to be measured, and research has demonstrated that the incomplete datasets are subject to selection bias. This study relies on nationally-representative survey data from Malawi and Ethiopia that exhibit near-negligible missingness in GPS-based plot areas and uses these datasets to gauge the limits to the accuracy of a Multiple Imputation (MI) application for predicting GPS-based areas for plots that would typically be considered out-of-scope. The analysis (i) artificially creates missingness in area measures, ranging from 1 to 100 percent, among the plots that are beyond two operationally-relevant distance thresholds with respect to the dwellings; (ii) multiply-imputes "missing" values in each dataset created by a distance threshold-missingness combination; and (iii) compares the distributions of the imputed plot-level outcomes with the distributions of their true, observed counterparts. In Malawi, the multiply-imputed distribution of plot-level land productivity is statistically indistinguishable from the true distribution in each imputed dataset with up to 82 percent missingness in GPS-based plot areas that are more than 1 kilometer away from the associated dwellings. The comparable figure in Ethiopia is 56 percent. The study highlights the promise of MI for simulating missing area measures and provides recommendations for optimizing fieldwork to capture the minimum required data.


Introduction
Land is a fundamental component of household and personal wealth in rural areas and is the key factor of production in smallholder production systems. As such, the data on parcel and cultivated plot areas are at the heart of economic research linked to agriculture and the design and implementation of land registration, titling and redistribution programs. Furthermore, the Sustainable Development Goal (SDG) Target 2.3 requires the doubling of agricultural productivity and incomes of small-scale food producers by 2030, and the monitoring of the progress towards this target relies on land area information sourced from household or farm surveys.
While large-scale household and farm surveys in low- and middle-income countries have traditionally relied on farmer reporting to elicit information on land areas, this can be problematic, particularly in the African context, which is characterized by the high incidence of smallholder farming and the fragmentation of farms into multiple parcels with irregular shapes and without formal titles. Several reasons may contribute to the inaccuracy in self-reported land areas. First, farmers may knowingly overstate or understate their landholdings for strategic reasons that may relate to access to development programs and/or taxation. Second, there is a natural tendency to round off numbers and provide approximations, which leads to heaping of the data around discrete values. Third, geography, particularly slope, can influence the way farmers assess distance and area. Fourth, the use of non-standard measurement units and within-country variation in the type and standard unit equivalence of these units complicate the compilation of conversion factors for land area measurement. In fact, methodological research has shown that self-reported land areas are subject to systematic measurement error with direct implications for the accurate measurement and analysis of land productivity (Carletto, et al., 2013, 2015). These reasons, combined with (i) the validated accuracy of GPS-based land area measurement in household survey experiments in Ethiopia, Nigeria, and Tanzania (Zanzibar), and (ii) the ever-increasing affordability and accuracy of handheld GPS devices, make GPS-based land area measurement a desirable alternative for household and farm surveys in countries dominated by smallholder agricultural production. However, with the emergence of GPS-based area measurement as the new, scalable gold-standard for household and farm surveys, a key drawback is related to the operationalization of the technology.
To reduce transportation costs, keep household interview durations within reasonable limits, and avoid the difficulty of asking respondents to accompany enumerators to agricultural plots that are situated far from dwelling locations, survey implementing agencies often require enumerators to obtain GPS-based area measures only for plots within a given (arbitrary and non-cross-country-comparable) radius of dwelling locations. Consequently, non-ignorable shares of area measures are missing in public use datasets. For instance, among the selected national, multi-topic panel household surveys that are supported by the World Bank Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) program, the rate of missingness in GPS-based plot areas ranges from 13 percent (Nigeria) to 44 percent (Uganda), as shown in Table 1. Given the potential selection biases that may be brought on by analyzing only non-missing portions of the datasets, there is a concern that the missing data may limit the operational relevance and the analytical value of GPS-based area measures. This concern is particularly pressing in the context of a survey program such as the LSMS-ISA that has catalyzed a significant expansion in development research on Africa over the last decade. 2
2 As of November 12, 2018, the official World Bank Microdata Library download count for the publicly available, LSMS-ISA-supported household surveys stood at 37,750, and the lower-bound for the number of research outputs over the last decade based on the LSMS-ISA data, according to the continuing LSMS monitoring of online development research outlets, is estimated at 1,000.
Recognizing the need to address the problem of missing data for increasing the usability of household survey data, Kilic, et al. (2017) use the LSMS-ISA data from Tanzania and Uganda to show that the missing GPS-based plot areas indeed constitute a non-random subset of the unit-record data and that the missing data can be simulated by Multiple Imputation (MI). In their analysis of plot-level land productivity, the authors document the non-trivial effects of using the datasets that are completed based on MI.
Underlined by the relatively recent adoption of GPS-based land area measurement in national household surveys, and the evolving appreciation of the recent methodological research on addressing (ultimately unavoidable) missingness in GPS-based land area measures, there is a continuing need to work on two fronts. The first is to offer operational guidance for survey practitioners and data users, including agricultural and development economists, regarding "acceptable" rates of item non-response in GPS-based areas for "distant" plots whose areas could instead be simulated. The second is to further elevate the importance of relying on easily-accessible, model-based simulation approaches, including MI, to judiciously address missingness in public use datasets that are at the core of development research.
To address these needs, we work with the unique, nationally-representative household survey data from Malawi and Ethiopia that exhibit near-negligible rates of missingness in GPS-based plot areas and use these datasets to gauge the validity and accuracy of an MI-based approach to predict missing GPS-based land areas among plots that would otherwise be deemed "distant" in a typical survey operation. In doing so, we test the typically-untestable assumptions of MI and identify the acceptable rates of missingness beyond which these assumptions are less likely to hold, specifically for the reliable estimations of cultivated area and agricultural productivity.
The use of actual data collected as part of large-scale household surveys that have adopted GPS-based area measurement is key to the operational relevance of our research. As such, we provide operational recommendations that can enable survey practitioners, including agricultural and development economists involved in primary data collection, to collect the minimum-required data for model-based imputation applications. In addition, we make available the constructed datasets and syntax files to replicate our analyses; enable future MI applications to address similar missing information problems; and catalyze further research for deriving alternative acceptable rates of missingness that can drive the design and implementation of future survey efforts that may have different analytical objectives than those that we work with.
Our headline finding is that in Malawi, the multiply-imputed distribution of plot-level agricultural productivity is statistically indistinguishable from the true distribution in each of the 50 imputed datasets with up to 82 percent missingness in GPS-based plot areas that are more than 1 kilometer away from the associated dwelling. The comparable figure in Ethiopia is 56 percent. If one sets the distance threshold at 500 meters, the tolerable rates of missing GPS-based areas among distant plots stand at 45 percent in Malawi and 36 percent in Ethiopia. If one focuses on plot area as the outcome variable of interest, as opposed to productivity, the estimated tolerable rates of missingness present an even more optimistic outlook regarding the promise of MI, irrespective of the country and/or the distance threshold in question.

The paper is organized as follows. Section 2 describes the data. Section 3 presents the empirical approach. Section 4 discusses the results. Section 5 concludes, expanding on the relevance of our findings for household and farm surveys that visit sampled households at least twice and in sync with a given agricultural season.

Data
The IHS3 data were collected within a two-stage cluster sampling design, and are representative at the national, urban/rural, regional, and district levels, covering 12,271 households in 768 enumeration areas (EAs). ESS2 is part of a long-term project to collect panel data. It covered all regional states including the capital, Addis Ababa. Much of the sample is comprised of rural areas as it was carried over from ESS1. The survey is representative at the national and urban/rural levels and for 6 strata (4 regions plus Addis Ababa and the other regions), covering 5,262 households in 433 EAs.
In terms of questionnaire instruments, the IHS3 and the ESS2 both had Household, Agriculture, and Community Questionnaires. In each setting, the sample households were administered a multi-topic Household Questionnaire that collected individual-disaggregated information on demographics, education, health, wage employment, nonfarm enterprises, anthropometrics, and control of income from non-farm income sources, as well as data on housing, food consumption, food and non-food expenditures, food security, and durable and agricultural asset ownership, among other topics. In addition, agricultural households received the Agriculture Questionnaire, which solicited plot-level information on land areas, manager/holder identification, labor and non-labor input use, and crop cultivation and production. 3 Further, agricultural production data were collected for the two main agricultural seasons in each survey. Handheld global positioning system (GPS)-based locations and land areas of the plots were recorded, permitting us to link household- and plot-level data to outside geographic information system (GIS) databases.
The IHS3 required GPS-based area measurement of all plots that are owned and/or cultivated by the sampled households, within 2 hours of travel with respect to the household location, regardless of mode of transportation. For the distant plots, the field teams were advised to cluster them in accordance with their location, and to visit them in a coordinated fashion by using the team vehicle. For the sub-sample of IHS3 households that were visited twice, the first visit data were also reviewed, and the missing GPS-based plot areas were fed forward to the second visit interviews for potential capture by the field teams. While the first visit constraints leading to missing data still applied to most of these households during the second visit, the continuing emphasis on increasing the volume of GPS-based plot area measures did result in additional data capture. On the other hand, the ESS2 instructed the enumerators to take GPS-based area measures of all plots that are owned and/or cultivated by the sampled households, irrespective of distance. For plots less than 40 square meters, the enumerators measured areas by traversing, instead of GPS units. The overall rates of missingness in GPS-based plot areas were considerably low in both settings: 3.8 percent in Malawi and 6.2 percent in Ethiopia. These are in fact the lowest levels observed among the surveys supported by the LSMS-ISA program.
3 Both the IHS3 and the ESS2 make a clear distinction between a parcel and a plot. A parcel is conceptualized as a continuous piece of land under a common tenure system, while a plot is defined as a continuous piece of land on which a unique crop or a mixture of crops is grown, under a uniform, consistent crop management system, not split by a path of more than one meter in width, and with boundaries defined in accordance with the crops grown and the operator. Therefore, a parcel can be made up of one or more plots. This distinction is key since, for the purposes of within-farm analysis of agricultural productivity, the ideal is to capture within-parcel, plot area measurements linked with plot-level measurement of agricultural production. Parcel-level GPS-based area measurement, on the other hand, could serve other objectives, such as surveying of land for land registration or titling programs or for land ownership measurement. An open empirical question is the extent to which parcel-area measurement could be reliably backed out from the aggregation of within-parcel, plot area measures, an exercise that will be mediated by the precision with which parcel and plot boundaries are established in the field prior to GPS-based area measurement.
Our analysis assumes both data sets to be complete and representative of the true distributions of interest and is subsequently conducted using plots for which GPS-based land area measurements are available. 4 Table 2 shows the distribution of plots according to their distance from the dwelling for both datasets. Table 3 presents the summary statistics based on the IHS3 and the ESS2, including the plot-level means for the entire sample; for the sample within 1 kilometer of the dwelling; and for the sample that lies outside of the 1-kilometer radius of the dwelling. Table 4 accomplishes the same objective but for the samples split by the alternative distance threshold of 500 meters. We provide the differences between the sample means and note when a given mean difference is statistically significant.
Several noteworthy findings emerge from Tables 2, 3 and 4. First, the distribution of plots per distance threshold is quite similar across the two countries. Between 54 and 60 percent of the plots are within 500 meters and between 72 and 77 percent are within 1 kilometer. Second, the plots within the distance threshold tend to be of significantly smaller areas than the plots beyond that threshold. Third, several important plot- and household-level characteristics that are expected to be associated with productivity-related outcomes display statistically significant differences by distance threshold status. These observations highlight the importance of systematically addressing missingness in GPS-based plot areas, if such GPS data are to be used in a robust fashion.
Note: † denotes a dummy variable. *** p<0.01, ** p<0.05, * p<0.1. Sample of plots within a 1-kilometer radius is the comparison group for the tests of mean differences.

Artificial Missingness Creation
Missing GPS-based plot area measurements are often tied to numerous field logistics and cost constraints. The variable that underlies the overwhelming majority of missing GPS-based plot areas in household survey operations is the plot distance from the dwelling or the plot location with respect to the EA boundaries. 5 As noted above, the IHS3 instructed the enumerators to measure all plots within 2 hours of travel time from the dwelling locations, while the ESS2 required the measurement of all plots, with the exception of those less than 40 square meters, irrespective of distance/travel time. For a more time- and/or budget-constrained operation, a lower threshold for GPS-based land area measurements could be enforced.
The first step in our analysis is to generate missing GPS-based plot areas in a way that would be similar to real-life field experience, or in other words, identify plots that could be deemed as "distant" in a large-scale survey operation. Towards this end, we work with two arbitrary but operationally-relevant distance thresholds that map well to the existing practices. In the first scenario, the plots are identified as distant if the georeferenced plot corner is located more than 500 meters from the georeferenced dwelling unit of the associated household. The second scenario relies on a threshold of 1 kilometer instead. 6 Since survey implementers may want to get a sense of the time requirements associated with visiting plot locations that are below versus above the chosen distance thresholds, consider, for instance, the walking time associated with the inclination-adjusted minimum cost distance between dwelling and plot locations in Malawi. For plots that are within the 500-meter and within the 1-kilometer threshold, the average walking time is 4 minutes and 6 minutes, respectively. Conversely, for plots that are outside the 500-meter and outside the 1-kilometer threshold, the average walking time is 33 minutes and 47 minutes, respectively.
5 Kilic et al. (2017) report that missingness in GPS-based plot areas due to refusal or physical inaccessibility, as opposed to distance, is near-negligible in the LSMS-ISA-supported surveys in Tanzania and Uganda. In the case of Malawi IHS3 2010/11, the considerable missingness in the stated reasons for lack of GPS-based plot area measurement prevents us from reporting specific statistics on missingness due to refusal or physical inaccessibility. In the case of the more recent Malawi Integrated Household Panel Survey (IHPS) 2016, the GPS-based plot area measures that are missing due to refusal or physical inaccessibility constitute 9 percent of the overall sample of plots without GPS-based areas, and 1 percent of the overall sample of plots.
6 The variable underlying each distance threshold is the Euclidean (crow-fly) distance between the geo-referenced plot and dwelling location. Other geospatial measures of the plot distance to the dwelling were considered, including the estimated minimum cost distance that considers topography; the walking time associated with the minimum cost distance; and the inclination-adjusted measures of these two variables. The weighted pairwise correlation between any of the alternatives and our Euclidean distance measure is above 99 percent, and our results are robust to the use of these alternative distance measures.
Once the non-random, distant plots are identified according to one of the distance thresholds, we artificially create missingness in GPS-based areas among these plots at random, at a rate of 1 to 100 percent and with an increment of 1 percentage point and save these datasets separately. The choice of creating artificial missingness at random above an arbitrary distance threshold that identifies a non-random portion of the data is anchored in the specific way in which we see our findings would be operationalized, as explained below, particularly as part of multi-visit household and farm surveys that are in sync with a given agricultural season and that field a first, post-planting visit for parcel and plot demarcation and area measurement.
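The missingness-creation step can be sketched in a few lines. The following is a minimal illustration rather than the syntax used for the paper: the field names `dist_m` and `gps_area`, the function `make_missing`, and the toy plot sample are all hypothetical, and sampling weights are ignored.

```python
import random

def make_missing(plots, threshold_m, rate_pct, seed=0):
    """Return a copy of `plots` in which the GPS area is blanked (None) for a
    random `rate_pct` percent of the plots farther than `threshold_m` metres."""
    rng = random.Random(seed)
    # Distant plots are the non-random subset defined by the distance threshold.
    distant = [i for i, p in enumerate(plots) if p["dist_m"] > threshold_m]
    n_drop = round(len(distant) * rate_pct / 100)
    # Within the distant subset, missingness is assigned completely at random.
    drop = set(rng.sample(distant, n_drop))
    return [
        {**p, "gps_area": None} if i in drop else dict(p)
        for i, p in enumerate(plots)
    ]

# Toy sample of 200 hypothetical plots (distance in metres, area in hectares).
toy_rng = random.Random(42)
plots = [
    {"dist_m": toy_rng.uniform(0, 3000), "gps_area": toy_rng.uniform(0.05, 2.0)}
    for _ in range(200)
]

# One dataset per distance threshold (500 m, 1 km) and missingness rate (1..100).
datasets = {
    (thr, rate): make_missing(plots, thr, rate, seed=rate)
    for thr in (500, 1000)
    for rate in range(1, 101)
}
```

Each of the 200 resulting datasets would then be passed to the imputation step; the original `plots` list is left untouched so that it can serve as the validation sample.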

Multiple Imputation
The second step in our analysis is to carry out Multiple Imputation (MI) to fill the gaps in GPS-based plot areas in each unique data set that is created by a given distance threshold-artificial missingness combination. MI, first proposed by Rubin (1987), is a Monte Carlo technique that replaces missing values for a given variable with m > 1 simulated alternatives. MI typically consists of three steps: (i) m imputations (i.e. m complete datasets) are generated based on an imputation model that encompasses a vector of observable covariates that predict the missingness in a given variable, (ii) statistical analysis is performed separately with each of the m complete datasets, and (iii) the results obtained from m complete data analyses are combined into a single set of multiply-imputed parameter estimates and standard errors. The conditions under which valid inferences could be obtained from missing data have been laid out by Rubin (1987). Our procedure assumes that data are missing at random (MAR), that is, that missing data could be predicted based on observable attributes underlying missingness. While the MAR assumption is not empirically testable, the limits of its tenability could be assessed in our study since we have otherwise complete datasets that are used as validation samples.
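Step (iii) above is commonly implemented with Rubin's combining rules: the pooled estimate is the mean of the m estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. A minimal sketch follows; the function name `pool_rubin` and the example numbers are illustrative only and are not results from this study.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine m complete-data estimates and their sampling variances
    into one pooled estimate and standard error (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total ** 0.5

# Three hypothetical complete-data estimates, each with sampling variance 0.04.
est, se = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.04, 0.04])
```

The (1 + 1/m) factor inflates the between-imputation component to account for using a finite number of imputations.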
In building the imputation model, the literature (Rubin, 1996; van Buuren, et al., 1999) advises including as explanatory variables: (i) the variables appearing in the analysis model that features the multiply-imputed variable(s), (ii) the variables that are known to have influenced the occurrence of missing data, and other variables for which the distributions differ between the response and non-response groups, (iii) the variables that explain a considerable amount of variance of the multiply-imputed variable(s) and that help to reduce the uncertainty of the imputations, and (iv) the variables with information on the features of the complex survey design, including stratum and cluster identifiers, and sampling weights.
In their MI application to missingness in GPS-based land areas in Tanzania and Uganda, Kilic, et al. (2017) attempt to provide support for the MAR assumption by (i) detailing the field work processes underlying the missing data, (ii) providing insights from their field experience and interactions with the survey teams, (iii) systematically documenting the established guidelines on imputation model specification, and (iv) including in the imputation model explanatory variables that influence the occurrence of missing data; that have different distributions between the response and non-response groups; that explain a considerable amount of variance of the multiply-imputed variable; and that include information on the survey design.
Our imputation model specification is anchored in these considerations, and includes farmer-reported plot area, which is both a powerful predictor and an alternative measure of the GPS-based plot area. The use of a self-reported variable in the imputation model to tackle item non-response in an objectively-measured variable has been pursued also by Schenker, et al. (2010), who feed self-reported health measures into a model to multiply impute clinical values in a different survey. 7 For illustration, Table 5 and Table 6 show the details of the Ordinary Least Squares (OLS) imputation model for Malawi and Ethiopia, respectively. In addition to farmer-reported plot area, we include plot manager, household and other plot attributes as predictors. The model specification differs slightly between the IHS3 and the ESS2 depending on the availability of the variables or the specificity of the data set. For example, the raw data on farmer-reported plot areas could have been expressed in non-standard measurement units in the ESS2; as such, we add dummy variables for these units in the imputation model for Ethiopia.
7 The literature on the use of MI to address missing information problems is vast and cuts across several disciplines, including but not limited to economics, statistics, sociology, public health, medicine, and epidemiology. The empirical work relying on MI to deal with missingness in income data is noteworthy given the types of household surveys that inform our analyses, and considering the rates of item non-response that are dealt with in the literature on income and that present similarities to the patterns in Table 1 (Schenker, et al., 2006; Zarnoch, et al., 2010; Giusti & Little, 2011; Ahearn, et al., 2011; Vermaak, 2011).
We estimate the imputation model using each unique dataset that is created by a given distance threshold-artificial missingness combination. While the results confirm that the predictions are essentially driven by the farmer-reported plot area, the more comprehensive model improves the accuracy and precision of our predictions. As pointed out by Kilic, et al. (2017), it is worth emphasizing that the imputation model neither intends to provide a parsimonious description of the data nor attempts to portray structural relationships among variables. Instead, it attempts to be as comprehensive as possible to minimize any bias that could stem from omitting variables that might be relevant to the pattern of missingness or the subsequent analysis.
"The possible lost precision when including unimportant predictors is usually viewed as a relatively small price to pay for the general validity of analyses of the resultant multiply-imputed database" (Rubin, 1996).
In multiply imputing missing values that have been artificially created in each scenario, we fit plot-level OLS regression models with the GPS-based plot area as the dependent variable and obtain linear predictions for all plots in the dataset. Under the partially parametric method of predictive mean matching (PMM) 8 , we use the linear prediction as a distance measure to form a set of 5 nearest neighbors chosen from the plot sample with GPS-based area measures, and randomly pick one of the neighbors whose observed GPS-based plot area value replaces the missing value for the incomplete case at hand. 9 The imputation is carried out 50 times to reduce the potential sampling error due to imputation; 10 50 complete datasets are generated; and the posterior estimates of the model parameters are obtained from a bootstrapped sample 11 . By drawing from the observed data, PMM preserves the distribution of observed values in the missing part of the data, which makes it more robust than the fully parametric regression approach. In total, we generate 50 complete datasets of GPS-based plot areas for each rate of missingness (from 1 to 100 percent) above each distance threshold for each country. These data sets are used to assess the tolerable rates of missingness, as explained below.
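The PMM procedure described above can be sketched as follows. This is a deliberately simplified illustration and not the specification in Tables 5 and 6: it assumes a single predictor (farmer-reported area) instead of the full covariate set, ignores survey weights, and uses hypothetical names (`ols_fit`, `pmm_impute`) and simulated toy data.

```python
import random

def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def pmm_impute(reported, gps, k=5, rng=None):
    """Fill None entries of `gps` by predictive mean matching on `reported`."""
    rng = rng or random.Random()
    obs = [i for i, g in enumerate(gps) if g is not None]
    # Bootstrap the observed cases so that coefficients vary across imputations.
    boot = [rng.choice(obs) for _ in obs]
    a, b = ols_fit([reported[i] for i in boot], [gps[i] for i in boot])
    pred = [a + b * r for r in reported]          # linear prediction for all plots
    completed = list(gps)
    for i, g in enumerate(gps):
        if g is None:
            # k observed plots whose predictions are closest to this plot's.
            donors = sorted(obs, key=lambda j: abs(pred[j] - pred[i]))[:k]
            completed[i] = gps[rng.choice(donors)]  # borrow an observed area
    return completed

# Toy data: 100 hypothetical plots, every fourth GPS area set to missing.
toy = random.Random(1)
reported = [toy.uniform(0.1, 2.0) for _ in range(100)]
gps = [0.9 * r + toy.gauss(0, 0.05) for r in reported]
gps_missing = [None if i % 4 == 0 else g for i, g in enumerate(gps)]

# m = 50 completed datasets, one per seed.
imputations = [
    pmm_impute(reported, gps_missing, k=5, rng=random.Random(s)) for s in range(50)
]
```

Because every imputed value is borrowed from a donor plot, the completed data can only contain areas that were actually observed, which is the property that makes PMM less sensitive to misspecification of the regression model.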

Assessing the tolerable rates of missingness in GPS-based plot areas
To assess the performance of the imputation model, we compare the distributions of the true, observed versions of key variables that rely on GPS-based plot areas with the distributions of their completed (observed plus imputed) counterparts. The key outcomes that our assessment focuses on are GPS-based plot area and plot-level agricultural productivity, which is measured as the quantity or value of crop harvested based on farmer-reporting (the numerator) over cultivated land (the denominator). As discussed earlier, plot-level agricultural productivity is of policy relevance.

Given the nature of the problems to which MI is applied, it is often difficult for analysts to verify the appropriateness of their imputation procedures. Imputation values are guesses of unobserved, unknown values (Abayomi, et al., 2008). In this study, however, missingness is artificially created such that the true values are known. This allows direct comparison of the distributions of the observed versus the completed data. Numerically, the comparison of the empirical distributions is done using the Kolmogorov-Smirnov (KS) test separately for each outcome variable for each of the levels of missingness, raising the flag if there are statistically significant differences at the 5 percent level 12 for at least 1 of the 50 imputations generated.
As noted by Abayomi, et al. (2008), there is no reason to suppose that setting a 5 percent level of significance will be appropriate when producing an MI diagnostic through density comparisons. However, it is useful to start with this rule and further examine the results.
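The flagging rule can be sketched as below. The two-sample KS test here is a plain-Python version that uses the standard asymptotic p-value, which may differ slightly from the statistical package used for the paper. The function names are hypothetical, and `tolerable_rate` encodes one reading of the criterion, namely the highest missingness rate up to which all m imputations pass at every lower rate.

```python
import math
from bisect import bisect_right

def ks_2samp(a, b, terms=100):
    """Two-sample Kolmogorov-Smirnov statistic and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # Largest gap between the two empirical CDFs, evaluated at every data point.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return d, 1.0  # the series below converges slowly here; p is ~1 anyway
    p = 2 * sum((-1) ** (j - 1) * math.exp(-2 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return d, min(max(p, 0.0), 1.0)

def count_passing(true_values, imputations, alpha=0.05):
    """Number of completed datasets not rejected by the KS test at level alpha."""
    return sum(ks_2samp(true_values, imp)[1] >= alpha for imp in imputations)

def tolerable_rate(passing_by_rate, m=50):
    """Highest missingness rate such that all m imputations pass at that
    rate and at every lower rate (0 if the first rate already fails)."""
    tol = 0
    for rate in sorted(passing_by_rate):
        if passing_by_rate[rate] < m:
            break
        tol = rate
    return tol
```

In use, `count_passing` would be evaluated once per missingness rate for each outcome variable, and the resulting dictionary of counts passed to `tolerable_rate`.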

Results
The results of our simulations are illustrated in Figure 1. Each panel shows the results from the analyses conducted with a given distance threshold in a given country. In each panel, the y-axis shows the number of imputations out of 50 for which the KS test indicates that the distribution of the relevant outcome variable derived from the imputed dataset is statistically indistinguishable from its observed counterpart. The x-axis, on the other hand, shows the percentage of simulated missing GPS-based plot area measurements beyond a given distance threshold. We additionally highlight the tolerable rates of missingness with vertical lines. Three general observations emerge from Figure 1.

Figure 1: Tolerable Rates of Missingness in GPS-Based Plot Areas Above a Given Distance Threshold for Plot Area & Plot-Level Yield Analysis
First, at low rates of missingness, the distribution of each outcome variable in each of the 50 imputed datasets are statistically indistinguishable from the observed counterpart. As the rate of missingness increases, this count starts to decrease until only a small number (between 0 and 10) of the imputations appear to have distributions that are not statistically different from the true distribution. Second, the tolerable rates of missingness are lower with the 500-meter distance threshold in comparison to the 1kilometer counterpart. Third, plot-level agricultural productivity is more sensitive to missingness than plot area (i.e. the tolerable rate of missingness is reached earlier in the case of the latter). The first and second observations confirm the expectations anchored in the descriptive analyses discussed in Section 2. Plots that are further from the dwelling are inherently different from the ones that are closer.
Thus, as missingness increases, the pool of plots with similar characteristics (and thus comparable areas) to draw from gets smaller, and it is understandable that the imputed distribution differs substantially from the true one. The third observation is also expected: since land area is the denominator in the formula for yield, a small deviation of the imputed values from the observed land areas brings about a proportionally larger deviation in the yield estimates obtained from them. Consequently, the yields calculated from the imputed land areas differ substantially from the true yields at lower rates of missingness.
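A small numerical sketch makes the denominator effect concrete. The harvest, plot area, and error values below are hypothetical and chosen only for illustration; they are not taken from the IHS3 or ESS data.

```python
def yield_per_ha(harvest_kg, area_ha):
    """Yield as harvest divided by land area."""
    return harvest_kg / area_ha

def relative_error(estimate, truth):
    """Signed deviation of an estimate from the truth, as a share of the truth."""
    return (estimate - truth) / truth

# Hypothetical plot: 400 kg harvested on 0.4 ha, i.e. a true yield of 1,000 kg/ha.
harvest_kg, true_area_ha = 400.0, 0.4
true_yield = yield_per_ha(harvest_kg, true_area_ha)

# Two imputed areas that miss the true area by the same amount in opposite directions.
low_area, high_area = 0.3, 0.5
yield_low = yield_per_ha(harvest_kg, low_area)    # area understated by 25 percent
yield_high = yield_per_ha(harvest_kg, high_area)  # area overstated by 25 percent

print(relative_error(yield_low, true_yield))   # about +0.33
print(relative_error(yield_high, true_yield))  # about -0.20

# The area errors cancel on average, but the yield errors do not.
mean_yield = (yield_low + yield_high) / 2
print(relative_error(mean_yield, true_yield))  # about +0.067
```

Understating the area by 25 percent overstates the yield by about 33 percent, and symmetric area errors leave an upward bias in the average yield because yield is a convex function of area.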
We now compare the results obtained in the different panels depicted in Figure 1. For convenience, the tolerable rates of missing GPS-based plot areas are summarized in Table 7. Along with the tolerable rates, expressed as the percentages of plot area observations that could go missing beyond a given distance threshold, we report the corresponding overall rates of missingness in parentheses. In what follows, we focus on the results pertaining to plot-level agricultural productivity, given the policy relevance of the outcome and its lower tolerance to missingness vis-à-vis plot area.
Note: The overall rates of missingness implied by the tolerable rates of missingness above a given distance threshold are noted in the parentheses.
The results obtained with the 1-kilometer threshold are very encouraging. In Malawi, the multiply-imputed distribution of the plot-level productivity measure is statistically indistinguishable from the true distribution in each of the 50 imputed datasets with up to 82 percent missingness in GPS-based plot areas that are more than 1 kilometer away from the associated dwelling. The comparable figure in Ethiopia is 56 percent. Put differently, the number of plots for which GPS-based area measurement can be forgone represents 23 percent and 13 percent of the overall plot sample in Malawi and Ethiopia, respectively. These findings indicate that plot-level agricultural productivity estimation is more sensitive to missingness in distant GPS-based plot areas in Ethiopia than in Malawi.
Further, as noted above, we get lower tolerable rates of missingness among distant GPS-based plot areas when we lower the distance threshold from 1 kilometer to 500 meters. In Malawi, the multiply-imputed distribution of the plot-level productivity measure is statistically indistinguishable from the true distribution in each of the 50 imputed datasets with up to 45 percent missingness in GPS-based plot areas that are more than 500 meters from the associated dwelling. The comparable figure in Ethiopia is 36 percent. In this case, the number of plots for which GPS-based area measurement can be omitted represents 21 percent and 15 percent of the overall plot sample in Malawi and Ethiopia, respectively.
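The link between the threshold-specific and overall rates is simple arithmetic: the overall rate equals the tolerable rate multiplied by the share of plots located beyond the threshold. The sketch below backs out that implied share from the rounded percentages quoted above, assuming the overall rate is an unweighted share of the plot sample. It is a back-of-the-envelope check only, and the implied shares are approximations rather than figures reported in the paper.

```python
# (tolerable rate beyond threshold, overall rate), both in percent, as quoted in the text.
reported = {
    ("Malawi", "1 km"): (82, 23),
    ("Ethiopia", "1 km"): (56, 13),
    ("Malawi", "500 m"): (45, 21),
    ("Ethiopia", "500 m"): (36, 15),
}

def implied_share_beyond(tolerable_pct, overall_pct):
    """Share of all plots lying beyond the threshold implied by the two rates."""
    return overall_pct / tolerable_pct

def overall_rate(tolerable_pct, share_beyond):
    """Overall missingness (percent) implied by a threshold-specific rate."""
    return tolerable_pct * share_beyond

for (country, threshold), (tolerable, overall) in reported.items():
    share = implied_share_beyond(tolerable, overall)
    print(f"{country}, {threshold}: about {share:.0%} of plots beyond the threshold")
```

Because the inputs are rounded to whole percentages, the implied shares should be read as rough orders of magnitude.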
The cross-country differences in tolerable missingness rates are likely tied in part to the differences in farm organization, which mediate the differences in variability in the outcomes of interest. 13 On the one hand, the average plot size in hectares in Malawi (0.4) is twice the comparable statistic in Ethiopia (0.2), as reported in Table 3. On the other hand, the household-level average number of plots per holding in Ethiopia (11.7) is more than six times the comparable figure in Malawi (1.9). While the spatial distribution of the plot samples across the distance intervals in Table 2 is comparable across the two settings, the average plot distance from the dwelling is 2.19 kilometers in Malawi, with a 95 percent confidence interval of 1.91-2.47, versus 1.10 kilometers in Ethiopia, with a 95 percent confidence interval of 0.76-1.43. The plot distance from the dwelling further exhibits cross-country distributional differences that are statistically significant at the 1 percent level.
13 Unless otherwise stated, the statistics in this paragraph are not reported in any of the tables but have been computed based on the same datasets used for analysis.
Finally, Table 8 presents country-specific comparisons of the multiply-imputed mean versus the true mean for plot-level area and agricultural productivity, following MI at the identified tolerable rates of missingness above the distance thresholds, as reported in Table 7. Irrespective of the distance threshold and country in question, the root mean square error for plot area is close to zero and the difference between the MI mean and the true mean as a percentage of the true mean does not exceed 1.5 percent. For plot-level agricultural productivity, the findings are more promising in Malawi than in Ethiopia. In Malawi, for instance, at 82 percent missingness above the 1-kilometer threshold, the difference between the MI mean and the true mean as a percentage of the true mean stands at 7.5 percent. The comparable statistic for Ethiopia is 40.4 percent. These findings underscore the relative sensitivity to missingness of plot-level agricultural productivity measures vis-à-vis plot area, and the fact that this sensitivity is likely to vary by country and production system complexity, as in this study.
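The comparison statistics can be sketched as below. The function names and toy values are hypothetical, and the exact definitions used for Table 8 are an assumption here: we take the MI mean to be the average of the per-imputation means (the usual combining rule for point estimates) and the root mean square error to be computed over plot-level imputed and true values.

```python
import math

def mean(values):
    return sum(values) / len(values)

def mi_mean(imputed_datasets):
    """MI point estimate of the mean: average of the per-imputation means."""
    return mean([mean(dataset) for dataset in imputed_datasets])

def pct_diff_in_means(mi_estimate, true_values):
    """Absolute gap between the MI mean and the true mean, in percent of the true mean."""
    true_mean = mean(true_values)
    return 100 * abs(mi_estimate - true_mean) / true_mean

def rmse(imputed_values, true_values):
    """Root mean square error between plot-level imputed and true values."""
    squared = [(i - t) ** 2 for i, t in zip(imputed_values, true_values)]
    return math.sqrt(mean(squared))

# Toy example: three plots, two imputations in which only the third area was imputed.
true_areas = [1.0, 2.0, 3.0]
imputations = [[1.0, 2.0, 3.6], [1.0, 2.0, 3.0]]

estimate = mi_mean(imputations)
print(round(estimate, 3))                                  # MI mean
print(round(pct_diff_in_means(estimate, true_areas), 3))   # gap in percent of the true mean
print(round(rmse(imputations[0], true_areas), 3))          # RMSE for the first imputation
```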

Conclusion
This paper provides further evidence that combining GPS-based plot area measurements with farmer-reported plot areas in a Multiple Imputation (MI) application can result in reliable simulations of missing GPS-based plot areas. The analysis extends previous research by using survey data from Malawi and Ethiopia that feature negligible levels of missing GPS-based area measures. By artificially simulating missingness among distant plots in otherwise assumed-to-be-complete data, we compare the MI-based predictions to the true, observed values and gauge the levels of missingness in GPS-based land area measurements that can be handled with MI without compromising the robustness of key land area related statistics.
Since the microdata on land areas inform a wide range of research applications on smallholder agriculture, agricultural and development economists who rely heavily on public use datasets are encouraged to think more critically about the use of model-based approaches to address missingness not only in GPS-based land areas but also in other variables with known missing information mechanisms that can be captured with confidence as part of the imputation model specification.
Among the outcome variables of interest, plot-level agricultural productivity, as measured by maize yield in Malawi and total harvest value per land area in Ethiopia, is found to be more sensitive to missingness.
Still, we show that in Malawi, by obtaining GPS-based area measures for only a randomly selected 18 percent of the distant plots that are more than 1 kilometer from the dwelling location, and by multiply-imputing the remaining missing GPS-based plot areas, we can derive means and distributions that are comparable to those of the true data. In the case of Ethiopia, we need a randomly selected sample of at least 44 percent of the distant plots based on the same distance threshold to achieve the same objective.
As explained above, GPS-based area measures are obtained within a specific radius that is defined ex-ante by the implementing agency, traditionally in terms of subjective assessments of distance, travel time, and plot location as it relates to enumeration area/village boundaries. Hence, to the extent that a smallholder production system presents a sufficient degree of similarity to the Malawian or Ethiopian contexts and a given implementing agency is fielding a survey that mirrors the fieldwork setup in these countries, following the post-planting visit, the survey management team could consider reviewing the set of unmeasured, distant plots and selecting a random subset of those in an attempt to achieve one of the eight tolerable rates of missingness, as reported in Table 7, as a function of the distance threshold-outcome variable combination. This random subset of distant plots could then be prioritized for GPS-based area measurement during the subsequent visit(s) to the sampled households, and the resulting, "more complete" dataset could be subject to MI to predict the remaining missing GPS-based plot areas.
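The selection step could look roughly like the sketch below. It assumes, for simplicity, that every distant plot is unmeasured after the post-planting visit and that plots are drawn by simple random sampling without stratification; the function name, plot identifiers, and seed are hypothetical.

```python
import math
import random

def plots_to_measure(distant_plot_ids, tolerable_missing_pct, seed=None):
    """Draw a simple random subset of distant plots to prioritize for GPS measurement.

    The subset is sized so that the share of distant plots left unmeasured
    does not exceed tolerable_missing_pct (an integer percentage).
    """
    n_total = len(distant_plot_ids)
    n_needed = math.ceil(n_total * (100 - tolerable_missing_pct) / 100)
    rng = random.Random(seed)  # seeded so the selection can be reproduced and audited
    return sorted(rng.sample(list(distant_plot_ids), n_needed))

# Hypothetical example: 100 distant plots, productivity analysis, 1-kilometer threshold.
distant_plots = list(range(1, 101))
malawi_subset = plots_to_measure(distant_plots, tolerable_missing_pct=82, seed=2024)
ethiopia_subset = plots_to_measure(distant_plots, tolerable_missing_pct=56, seed=2024)
print(len(malawi_subset), len(ethiopia_subset))
```

Rounding the subset size up keeps the realized missingness at or below the tolerable rate; a real application would also need to respect the survey's sampling design.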
However, since the tolerable missingness rates may vary by country, distance threshold and outcome variable, similar analyses could be replicated (i) based on the IHS3 and ESS datasets that inform our analysis but by using alternative distance thresholds and outcome variables, and (ii) using other survey data that exhibit low rates of missingness in GPS-based plot areas. The findings would in turn catalyze the convergence onto comprehensive operational guidelines for survey practitioners. 14 Finally, although dealing with missingness empirically in the post-fieldwork period is always an option, there is no substitute for good fieldwork to prevent unwarranted missing measurements as much as possible. Thus, survey practitioners, including agricultural and development economists involved in primary data collection, should follow a combination of (i) well-supervised field practices to reduce missingness, as exemplified in Section 2, and (ii) sound MI applications to fill the data gaps that will still be unavoidable to a degree.