Land Measurement Bias and its Empirical Implications: Evidence from a Validation Exercise

This paper investigates how land size measurements vary across three common land measurement methods (farmer estimated, Global Positioning System (GPS), and compass and rope), and the effect of land size measurement error on the inverse farm size relationship and input demand functions. The analysis utilizes plot-level data from the second wave of the Nigeria General Household Survey Panel, as well as a supplementary land validation survey covering a subsample of General Household Survey Panel plots. Using this data, both GPS and self-reported farmer estimates can be compared with the gold standard compass and rope measurements on the same plots. The findings indicate that GPS measurements are more reliable than farmer estimates, where self-reported measurement bias leads to over-reporting land sizes of small plots and under-reporting of large plots. The error observed across land measurement methods is nonlinear and results in biased estimates of the inverse land size relationship. Input demand functions that rely on self-reported land measures significantly underestimate the effect of land on input utilization, including fertilizer and household labor.


Policy Research Working Paper 7597
This paper is a product of the Development Data Group, Surveys and Methods Team. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at goseni@worldbank.org.

Introduction
Land measurement is critical to empirical development analyses and national agricultural statistical reporting. Land holdings are dependent variables, independent variables or a relative component indicating the scale of production in key empirical relationships estimated in development economics. Aggregate land holdings measure a household's wealth stock in rural areas. Plot sizes are an input in agricultural production functions. Accurate yield measures and input intensities are essential to the estimation of the inverse-land size relationship and input demand functions, and the policy decisions driven by these estimations. A large econometric literature has assessed the effect of measurement error, particularly non-random measurement error, on empirical relationships (see for example the reviews in Hausman 2001 and Wooldridge 2008, among others), though specific applications to the above-mentioned empirical relationships in development economics remain limited.
Mismeasured land holdings have uncertain, though possibly important, effects on econometric relationships estimated in the literature. Classical measurement error in continuous dependent variables decreases the precision of estimates, but does not necessarily bias parameter estimates of independent variables (Wooldridge 2008). However, classical measurement error in independent continuous variables does potentially bias parameter estimates (Wooldridge 2008). Therefore, measurement error of land size has the potential to distort important and policy relevant econometric relationships, especially when land size is included (either directly or indirectly) in the set of independent variables. In their investigation of the commonly found inverse relationship between land size and agricultural productivity (henceforth IR), Carletto, Savastano, and Zezza (2013) and Carletto, Gourlay, and Winters (2015) compared their findings using farmer and GPS estimates of plot size. They find a sizeable difference in the magnitude of the IR depending on which land measure is used. One (or both) of the IR estimates is potentially subject to some degree of measurement error and without a gold standard measure of land size it is difficult to determine which IR estimate is closer to the truth. GPS measures of land area are generally assumed to be more accurate than self-reported land size, though measurement error is non-trivial in many commercial GPS units, particularly as plot size decreases. Another commonly estimated relationship in the literature that can be biased due to land size mismeasurement is the input demand function (Deininger et al. 2003; Marenya, Paswel, and Barret, 2007; Erenstein, 2006; Thuo et al., 2011). Input intensification and the estimation of yield gaps related to input intensities are potentially biased when land size is mismeasured.
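To make the classical case concrete, the following is a textbook sketch and not a specification estimated in this paper. All symbols are illustrative assumptions: $x^{*}$ stands for true land size, $x$ for its measured value, and $u$ for a mean-zero measurement error that is independent of both $x^{*}$ and the regression error $\varepsilon$.

```latex
\begin{align}
  y &= \alpha + \beta x^{*} + \varepsilon, \qquad x = x^{*} + u, \\
  \operatorname{plim}\,\hat{\beta}_{OLS}
    &= \beta \,\frac{\sigma^{2}_{x^{*}}}{\sigma^{2}_{x^{*}} + \sigma^{2}_{u}} .
\end{align}
```

Under these assumptions the OLS estimate from regressing $y$ on the mismeasured $x$ is attenuated toward zero. When the error is instead correlated with true land size, for example when small plots are over-reported and large plots under-reported, this formula no longer applies and the direction of the bias is not determined in advance.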
Despite the relative importance of land measurement, acquiring information on farmers' land area in developing countries can be time consuming, costly, and potentially inaccurate depending on the method employed. In the context of multi-topic household surveys, the costs of correct measurement of one variable must be weighed against the multiple objectives of the survey. The generally accepted gold standard method of measurement (FAO 1982), the compass and rope (CR) method, is an arduous and time consuming process. Two alternative and less costly methods often used in agricultural surveys involve either (1) asking farmers to estimate the size of their plots (SR or self-reported) or (2) measurement of plot size using a GPS unit. However, both of these methods are potentially subject to greater error than CR through respondent error (farmer estimates) or technological error (GPS). Several studies have found significant land size differences both between GPS and farmer estimates (Goldstein & Udry, 1999; Carletto, Savastano, & Zezza, 2013; Carletto, Gourlay, & Winters, 2015) as well as between CR and GPS estimates (Schoning et al., 2005; Keita & Carfagna 2009). It appears that the potential for land area mismeasurement is significant depending on the measurement method employed, potentially due to either respondent driven or enumerator driven errors. The compass and rope method may also be prone to some degree of measurement error as the dimensions of the plot and compass bearings are determined by the enumerator and not automated. Any measurement with human intervention can be biased, but we treat the CR measurement as the truest measure going forward.
In this paper we aim to investigate (1) how land size measurements vary across measurement methodologies, and (2) the econometric and policy implications of land measurement error in estimation of common agricultural relationships found in the literature. We utilize data from the second wave of the Nigeria General Household Survey Panel (GHS-Panel) as well as a supplementary survey covering a subsample of GHS-Panel plots. As a part of the GHS-Panel, plots were measured using GPS devices and farmers were also asked to estimate the size of their plots. In the supplementary survey several months later, a subset of GHS-Panel plots were again measured according to farmer estimates and GPS but with the addition of a CR measurement. This allows us to compare all three measurement methods for the same plots. Such a comparison has not previously been performed on panel households, with two visits in the same season.
Utilizing the sample design of the GHS-Panel and specially collected validation sample data, we first conduct a descriptive comparison of differences across the three measures followed by an investigation of the household, respondent and plot characteristics that are correlated with farmer-estimated and GPS plot mismeasurement. We then examine two econometric relationships often estimated in the land literature: the inverse farm size-productivity relationship and input demand functions (farm size relationship with utilization of hired labor, fertilizer, and herbicide/pesticide). By estimating these common behavioral equations, we will be able to estimate any bias introduced by using self-reported and GPS plot sizes.
In our comparison of the three different measurement methods, we find that mean self-reported land size is not statistically different from mean land size under either GPS or CR, but that self-reports overestimate land size on small plots and underestimate land size on large plots relative to either GPS or CR. Overreporting bias on the smallest tercile of plots is 83%, while self-reports on the largest tercile of plots underreport by over 20%. The GPS error is relatively consistent, ranging from -0.2 percent for large plots to -2.8 percent for medium plots and -2.0 percent for small plots. The small-plot result is particularly important, as this is where the GPS measurement error is most likely to diverge from CR. We also find that plot characteristics such as non-titled land distributed by either the community or family partially explain the SR measurement error. The comparison of the measurement methods overall indicates that GPS estimates provide a relatively accurate estimate of plot size while SR estimates are subject to sizeable error, especially for smaller plots.
In comparing SR to CR estimates, we find the expected negative relationship between land size and productivity and between land size and input utilization. We find considerable bias in SR estimates for plots in the top two thirds of the land distribution, leading to overestimation of the negative inverse productivity relationship. We also find considerable differences in the estimates of the input demand functions. SR land measures underestimate the effect of land size on fertilizer use, pesticide/herbicide use and hired labor.
The remainder of this paper is organized as follows. In the next section, we provide background on land measurement methods as well as briefly review the literature associated with land measurement. In Section 3, we describe the data used in the analysis. We then analyze and attempt to explain the land measurement differences across methodologies in Section 4. In Section 5 we assess the potential impact of mismeasurement on estimates of the IR relationship and input demand functions using the three methodologies. Section 6 concludes.

Land Measurement: Background and Literature
The extent of possible land measurement error is highly dependent on the method used to measure land size. Three common measurement methods include compass and rope, plot measurement with GPS devices, and farmer self-reported land size. A longstanding approach to collect land size is the compass and rope (CR) method which uses poles, ropes and a compass to carry out a systematic measure of land size (FAO 1982). It does not rely on advanced technology, only basic geometry and commonly available equipment.
The CR method requires only a compass, measuring tape, ranging poles, a programmable calculator, and two to three persons to measure the area of a plot with significantly more accuracy than by subjective estimates. GPS and CR measures require careful implementation by a field team, though GPS measurements may also be affected by the GPS device's margin of calibration error which can be affected by environmental factors. GPS measurements may be subject to more error than CR since the coordinates measured by the GPS device are not exact. 1 When carefully implemented, CR provides precise estimates, and it can therefore be considered as a benchmark against which to assess the precision of other methods.
Although the CR method is highly accurate when performed properly, it is time consuming and costly due to the necessity of careful training, monitoring, transportation costs to individual farmer plots, and enumerator time spent measuring the plot. The use of GPS can on average require as little as 28% of the time needed for compass and rope (Keita and Carfagna, 2009; Schoning et al., 2005). Keita and Carfagna (2009) find that on small plots CR can take up to 17 times as long as GPS, though GPS measurement still requires survey team relocation to the plot and some survey team time to trace the field's perimeter. Because of this, many surveys rely on a farmer's own estimate of land size, which avoids the time and cost of actual measurement. However, self-reported estimates could be subject to greater error given that (1) many farmers in developing countries acquire land through informal means where record keeping (and thus information on plot dimensions) is limited and (2) farmers are more likely to give rounded and inexact size estimates. More recently, the use of GPS units to measure land area has become more common in developing countries due to increased affordability of the technology and portability of the equipment. GPS measurements provide a more exact estimate of land size than farmer estimates and are much less time consuming (and thus less expensive) than measurements using CR.
The spread of GPS measured land size has led to several studies that examine the differences between the GPS and both CR and self-reported measures including the potential impact of measurement error resulting from inaccurate land size measurements. In two examinations of GPS versus CR measures in Sub-Saharan Africa, Schoning et al. (2005) and Keita and Carfagna (2009) find that GPS based measurements are generally lower than CR measures though neither study provides a technical explanation for this difference. Both studies find that the GPS-CR difference is statistically significant for smaller plots but not for larger plots. 2 It appears that for smaller plots, the fixed margin of error associated with determining GPS coordinates is large relative to the size of the plot and thus results in a greater measurement error. Since most plots in developing countries are relatively small, the potential for land mismeasurement using GPS could be significant. However, rounding of very small plot areas in Schoning et al. (all plots less than 0.01) clouds the reliability of their findings for smaller plots. This paper builds upon these two studies in that we are able to assess the degree of GPS accuracy for even the smallest of areas with the inclusion of CR measures.

1 Standard GPS devices have reported position accuracy of approximately 15 meters depending on the model's satellite calibration algorithm (http://www8.garmin.com/aboutGPS/).
2 Schoning et al. (2005) find a statistically significant difference for plots under 0.5 hectares but not for plots equal to or greater than 0.5 hectares. Keita and Carfagna (2009) separate their sample into 5 clusters based on plot size. They find a statistically significant difference for the clusters with the lowest land size but not for the 2 clusters with larger plots. However, they do not specify the plot size range for each cluster.
Fewer studies have examined the differences between self-reported and GPS measured land sizes. Three studies have found inconsistencies between GPS and self-reported land sizes (Goldstein & Udry 1999; Carletto, Savastano, & Zezza, 2013; Carletto, Gourlay, & Winters, 2015). Although ancillary to their main analysis, Goldstein and Udry (1999) found that the correlation between GPS and self-reported land size was only 0.15 in their dataset from Ghana. The authors do point out that historically, field measurements in the region were based on length and not area and that this could partially explain the lack of a strong correspondence between farmer and GPS estimates. However, Carletto, Savastano, and Zezza (2013) also find a sizeable difference in Uganda, though one that varies with plot size. The authors find that the bias is wider for smaller (less than 1.45 acres) and larger (greater than 3.58 acres) plots while self-reported measures are a reasonable estimate for medium size plots. In a similar analysis, Carletto, Gourlay, and Winters (2015) find inconsistencies between GPS and self-reported measurements in Malawi, Uganda, Tanzania, and Niger. They find that in all four countries self-reported plot sizes were higher than GPS for smaller plots but lower for larger plots, suggesting that farmers over-report the area of small plots and under-estimate the area of large plots. Both Carletto, Savastano, and Zezza (2013) and Carletto, Gourlay, and Winters (2015) go further and attempt to explain the difference using a basic econometric model. Carletto, Savastano, and Zezza (2013) find that in addition to plot size, the rounding of self-reported measurements, the age of the household head, and whether the plot was in a dispute with relatives were all positively associated with a greater GPS-self-report difference in land size in Uganda. Carletto, Gourlay, and Winters (2015) similarly find that rounding of farmer estimates is consistently associated with a difference between GPS and self-reported plot size in all four countries.
Following Carletto, Savastano, and Zezza (2013) and Carletto, Gourlay, and Winters (2015), we examine differences between GPS and farmer estimates and attempt to explain those differences, but also include the gold standard CR measure in the analysis. The CR will serve as a better reference measure to accurately assess and explain the measurement error associated with farmer estimates and GPS measurement.

Data Description
Two data sets are used for the analysis: (1) the second wave of the Nigeria General Household Survey-Panel (GHS-Panel) and (2) data from a subsequent land measurement validation survey (henceforth, validation sample) administered to a subsample of GHS-Panel households interviewed in the second wave.

GHS-Panel Sample
The GHS-Panel is a detailed multi-topic household and agricultural survey implemented by the Nigeria National Bureau of Statistics (NBS) and supported by the Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) at the World Bank. The GHS-Panel was administered to around 5,000 households and is designed to be nationally and regionally representative. As its name indicates, the GHS-Panel is a panel survey which at present consists of two waves: households were first interviewed in 2010/2011 and then re-interviewed in 2012/2013. While households are followed between waves of the GHS-Panel, plots were not followed and therefore cannot be precisely matched between waves. Therefore, we shall focus our attention on the second wave of the GHS-Panel.
Together, the NBS and the LSMS-ISA aim to improve agricultural data for Nigeria through implementation of the GHS-Panel, particularly with respect to land area measurement. To this end, the GHS-Panel captures plot size using both farmer estimates and GPS measurements. However, extraordinary differences were observed between GPS and self-reported measures of agricultural land areas in the first wave of the GHS-Panel especially when compared to other LSMS-ISA surveys. 3 Table 1 below summarizes the average measurements found in recent LSMS-ISA surveys. 4 The SR-GPS difference observed in Nigeria is considerably larger than in the four other LSMS surveys listed.

Validation Sample from the GHS-Panel
In order to further investigate these large differences in Nigeria, a subsample of plots from wave 2 were selected for a subsequent interview where CR, GPS measurements and farmer estimates were taken. This extra information enables comparison of methods and identification of potential measurement error associated with each methodology. The GHS-Panel has a sample of 5,000 households interviewed in the first wave. Out of this, 3,220 are agriculture households with a total of 5,125 plots. Limitations on time and budget prevented implementing compass and rope measurement on all plots owned and/or utilized by households in the GHS-Panel.
Four states were selected for the exercise based on safety and geographical dispersion. The four states are Oyo and Osun from the South West region and Benue and Kogi from the North Central region. Restricting the geographic area to be covered to two zones provided the benefit of limiting the number of enumerators required. Having fewer enumerators lends internal consistency to the measurements and, at the same time, minimizes training needs.
To ensure the sample covered a range of plot sizes, the plot selection was first stratified by GHS-Wave 2 plot size so we can examine the difference between farmer self-reported, GPS and CR at different plot sizes. Secondly, 100 plots were randomly selected within each stratum. 5 The selection of plots based on the above stratification led to the inclusion of 211 households. Because the greatest expense, both temporal and budgetary, is travel to various households, we then decided to measure all plots within the 211 households for which at least one plot was selected by stratification. This allowed us to increase the sample size without significant increases to the budget. This resulted in a sample size of 518 plots, which is representative of the four states covered in the exercise. Stratification by plot size in the validation sample results in the unequal probability of plot selection within households from the GHS-Wave 2 sample. Sample weights were calculated for the validation sample to make them representative of the same household population sampled in Wave 2. 6 Table 2 contains the distribution of the final validation sample across the four states.

<<< TABLE 2 HERE >>>

5 Strata: (1) <=1000 sq. meters, (2) 1000-2500 sq. meters, (3) 2500-5000 sq. meters, (4) >5000 sq. meters.
6 The final weight for these plots would be equal to the Panel Survey household weight for Wave 2 times the inverse of the subsampling rate for the plots. The subsampling rate for the plots is based on the probability that at least one of Wave 2 household plots is selected. Given the complexity of calculating this probability, it is simpler to determine the probability that none of the plots is selected, and then subtract this from 1 to determine the probability that one or more of the plots of the household are selected.
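The weighting logic in the footnotes can be sketched as follows. This is a minimal illustration, not the procedure actually coded for the survey: it assumes simple random sampling without replacement, drawn independently within each stratum, and the function names, stratum counts, and household weight used here are hypothetical.

```python
from math import comb

# Stratum upper bounds in square meters, following the four strata in the
# footnote (<=1000, 1000-2500, 2500-5000, >5000).
STRATUM_BOUNDS = (1000.0, 2500.0, 5000.0)


def stratum_of(area_sqm):
    """Return the stratum (1 to 4) for a plot area in square meters."""
    for index, upper in enumerate(STRATUM_BOUNDS, start=1):
        if area_sqm <= upper:
            return index
    return len(STRATUM_BOUNDS) + 1


def prob_household_selected(plots_in_stratum, stratum_sizes, n_per_stratum=100):
    """P(at least one household plot is drawn) = 1 - P(no plot is drawn).

    plots_in_stratum: {stratum: number of the household's plots in it}
    stratum_sizes:    {stratum: total number of plots in the stratum}
    ASSUMPTION of this sketch: simple random sampling without replacement,
    independent across strata.
    """
    prob_none = 1.0
    for stratum, k in plots_in_stratum.items():
        total = stratum_sizes[stratum]
        n = min(n_per_stratum, total)
        # Hypergeometric chance that none of the k plots is among the n drawn.
        prob_none *= comb(total - k, n) / comb(total, n)
    return 1.0 - prob_none


def validation_weight(household_weight, plots_in_stratum, stratum_sizes):
    """Wave 2 household weight times the inverse of the subsampling rate."""
    rate = prob_household_selected(plots_in_stratum, stratum_sizes)
    return household_weight / rate


if __name__ == "__main__":
    # Hypothetical stratum totals and household; not taken from the survey.
    sizes = {1: 400, 2: 300, 3: 250, 4: 200}
    household = {1: 1, 4: 1}  # one plot in stratum 1, one in stratum 4
    print(prob_household_selected(household, sizes))
    print(validation_weight(1200.0, household, sizes))
```

Working through the complement keeps the calculation to one product per stratum, which is the simplification the footnote describes.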
Stratification by plot size is essential in isolating the population of interest. Because the current literature questions the accuracy of GPS at smaller areas (Keita and Carfagna, 2009, and Schoning et al., 2005, have suggested that GPS measurements on plots less than 0.5 hectares are significantly different than CR assessments), collecting measurement data on plots in each stratum is necessary if we are to analyze the accuracy of the GPS and farmer estimates at various plot sizes and, particularly, address the issue of accuracy on small areas. More than 80 percent of plots measured in the Wave 1 GHS-Panel were smaller than 1 hectare. Although compass and rope measurement is more time intensive on larger areas, we did not restrict the sample by employing an upper limit on area. Rather, we consider it critical to measure plots of all sizes in order to enable the analysis of all plot areas. In Nigeria in particular, the farmer-reported areas and GPS areas have been wildly divergent at all plot sizes, not only the smaller plots on which GPS accuracy is called into question. In measuring plots of all sizes, therefore, we aim to gain an understanding of the discrepancy between the two measures at all plot sizes.
Table 3 presents sample averages for the explanatory variables used in the analysis including the validation sample and the wave 2 GHS-Panel sample. In the first three columns, we compare the full wave 2 sample with the validation sample with the difference between the two shown in the third column. There are many significant differences between the validation and wave 2 samples, suggesting the validation sample is not representative of the national GHS-Panel survey. Nigeria is a large and diverse country so this is not surprising given the relatively limited scope of the validation sample. In the middle columns, we limit the wave 2 sample to zones covered by the validation sample (North Central and South West). Comparing the reduced wave 2 sample to the validation sample, there are far fewer significant differences though still a few household characteristics where the samples differ. Restricting the wave 2 sample to the states covered by the validation sample in the far right columns, there are no significant differences between the samples except the land area measurements. The mean GPS measurement from the GHS-Panel (validation states only) is significantly different from the mean GPS measurement from the validation visit, but with limited significance (at the 10% level) and low magnitude. There are large differences, both in magnitude and significance level, between the SR estimates collected in the GHS-Panel visit and the validation visit. The change in field staff between the GHS-Panel (professional enumerators) and the validation exercise (enumerated by higher-level staff from NBS headquarters) may partially explain the difference observed in the SR estimates. However, the field staff for the validation exercise were repeatedly instructed to record the farmer's estimate without assisting or correcting the farmer. As the validation results may have limited external validity for states outside the validation sample, the paper's analysis is supplemented with the nationally representative wave 2 sample when possible.

Descriptive Differences between Measurement Methods
In this section we examine differences in plot size measurements across the three land measurement methods using data from the validation exercise 7 as well as the full GHS-Panel wave 2. Since the CR method is considered to be the gold standard measurement method, our analysis assumes that it is the benchmark plot size. Therefore, any deviation from the CR measure will be classified as error or bias relative to this benchmark. Table 4 presents the differences between SR and CR estimates. As shown in the fourth row of the table, the mean difference between SR and CR is small at only -0.03 acres or -2.1 percent of CR. In order to limit the influence of outliers, in the bottom row of panel A of Table 4, we trim the top and bottom 1 percent of SR and CR measurements. The scaled mean bias between SR and CR is slightly larger at 2.9 percentage points after trimming outliers. 8 The results of an adjusted Wald test are also presented in the far right column of Table 4. The test results comparing overall SR and CR measurements suggest that there is no significant difference between the two. However, this comparison masks much larger and significant differences that vary with plot size.
When we separate the sample into terciles based on CR plot size, both tests are in agreement for small and medium sized plots. We find that the difference between CR and SR is significant for both small (1st tercile) and medium (2nd tercile) plots. In both cases, the SR estimate is about 0.25 acres larger than the CR measure. Relative to plot size, the difference is much larger for smaller plots with an SR estimate that is on average nearly twice the actual (CR) size. Overall, this seems to indicate that SR measurements are fairly accurate for larger plots, but become less accurate as plot size decreases. This is in partial agreement with Carletto, Savastano, and Zezza (2013) in Uganda who found that SR estimates were a good approximation of GPS measures for medium plots but not small and large plots.
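The tercile comparison described above can be sketched in a few lines. This is an illustrative sketch only: it ignores sample weights and outlier trimming, it assumes that the percent bias is the mean difference divided by the mean benchmark area within each tercile, and the function names and example areas are hypothetical.

```python
from statistics import mean


def tercile_labels(values):
    """Assign each value to tercile 1, 2, or 3 according to its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = 1 + (3 * rank) // len(values)
    return labels


def bias_by_tercile(reported, benchmark):
    """Mean bias (reported minus benchmark) by benchmark tercile.

    Returns {tercile: (mean bias in acres, bias as percent of the mean
    benchmark area)}. ASSUMPTION of this sketch: unweighted means.
    """
    labels = tercile_labels(benchmark)
    summary = {}
    for tercile in (1, 2, 3):
        pairs = [(r, b) for r, b, lab in zip(reported, benchmark, labels)
                 if lab == tercile]
        mean_bias = mean(r - b for r, b in pairs)
        mean_bench = mean(b for _, b in pairs)
        summary[tercile] = (mean_bias, 100.0 * mean_bias / mean_bench)
    return summary


if __name__ == "__main__":
    # Hypothetical plot areas in acres; not taken from the survey.
    cr = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 2.0, 3.0, 4.0]
    sr = [0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 2.0, 2.5, 3.0]
    for tercile, (acres, percent) in bias_by_tercile(sr, cr).items():
        print(tercile, round(acres, 3), round(percent, 1))
```

Splitting on the benchmark measure, as the text does with CR, keeps the tercile definition independent of the error in the measure being evaluated.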

<<< TABLE 4 HERE >>>
7 All validation plots with land size information have been included in this analysis. Plots that are missing information on other characteristics are excluded from the regression analysis that follows and hence the number of plot observations is different.
8 Outliers were trimmed for the pair of SR and CR observations which results in decreases of the SR and CR means.
The CR measurements and GPS measurements are compared in Table 4, panel B. Overall, the difference between the CR and GPS is very small at 0.01 acres but is statistically significant, though only at the 10 percent level. The results remain largely unchanged after trimming outliers. When disaggregating by plot size tercile, the difference for all terciles is very low (never more than 2.8 percent of CR size). The adjusted Wald test results indicate that the difference between CR and GPS is only significant for medium sized plots. These test results are in partial agreement with Schoning et al. (2005) and Keita and Carfagna (2009) who found that GPS was less accurate for smaller plots. However, the difference is so small in magnitude that it may not cause significant bias when GPS is used in econometric specifications. This issue will be further explored in Section 6.
In Table 4, panel C, SR and GPS measured plots are compared. As would be expected, the differences are largely identical to those for SR versus CR measurements. The difference between SR and GPS is slightly higher with farmers reporting plot sizes that are 0.045 acres larger than GPS measurements. Also similar to Table 4, panel A, the difference between SR and GPS is largest (in percentage terms) for small plots, followed by medium plots, and is statistically different for all three plot terciles.
We can expand our comparison of GPS and SR measurements to the full GHS-Panel sample. Table 4, panel D, presents the results for the SR and GPS measurements taken in the wave 2 visit of the GHS-Panel. This highlights the error associated with farmer estimated plot size, relative to the GPS measured plot which was conducted as part of the GHS-Panel. The difference between the two measurements is significantly larger for the GHS-Panel sample than for the validation sample. While the difference is statistically significant for small and medium sized plots, the relative measurement error is largest for the smallest plots (437% of GPS size), followed by medium plots (205%), and then the largest plots (27%, though not significant). These sizeable differences highlight the potential for significant measurement error associated with farmer estimates of plot size.
In order to better examine the differences in the measures across the plot size distribution, we plot the distributions (using kernel methods) for all three validation measures in Figure 1 and the wave 2 measures in Figure 2. Figure 1 clearly illustrates the strong correspondence between the GPS and CR measures. The two distributions are nearly indistinguishable from one another. The SR distribution does largely mimic those for GPS and CR, but there are clear differences at the lower and middle portions of the distribution.
The wave 2 GPS and SR distributions in Figure 2 are largely similar to those in Figure 1 but they also reflect the larger GPS-SR difference found in wave 2 than in the validation data. In the next section, the reasons for this difference are explored.

Determinants of Differences between Methods
In the first econometric specification, we investigate whether the measurement differences observed in the previous section are robust when accounting for the influence of other household and plot characteristics on plot size using the validation sample. In equation 1, land size is measured for each validation sample plot with three observations: one each for SR, CR, and GPS. This allows us to directly compare all three measurements in a single regression.

$$A_{im} = \alpha + \beta_{SR}\,SR_{im} + \beta_{GPS}\,GPS_{im} + \mu_i + \varepsilon_{im} \qquad (1)$$

In equation 1, $A_{im}$ is the size in acres of plot $i$ according to measurement $m$, $SR_{im}$ and $GPS_{im}$ are dummy variables for SR and GPS measurement observations, $\mu_i$ represents plot fixed effects, and $\varepsilon_{im}$ is the idiosyncratic error term. Because we have multiple observations for each plot, we can estimate equation 1 with plot fixed effects which will capture the influence of any observed or unobserved plot characteristics on plot size for each measurement. In this formulation, $\beta_{SR}$ and $\beta_{GPS}$ will estimate the mean difference between CR and both SR and GPS after accounting for plot fixed effects.
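A stacked regression with plot fixed effects of the kind described above can be illustrated with a small within-estimator. This is a sketch under simplifying assumptions (every plot has exactly one CR, one SR, and one GPS observation, and no weights are applied); the function name and example areas are hypothetical, and it is not the estimation code used for the paper.

```python
def fixed_effects_bias(plots):
    """Estimate the SR and GPS dummy coefficients with plot fixed effects.

    plots: list of dicts with keys "cr", "sr", "gps" (areas in acres).
    CR is the omitted category. Returns (coef_sr, coef_gps).
    """
    y, d_sr, d_gps = [], [], []
    for plot in plots:
        areas = [plot["cr"], plot["sr"], plot["gps"]]
        plot_mean = sum(areas) / 3.0
        # Within transformation: subtract plot means from the outcome and
        # from each dummy, which absorbs the plot fixed effect.
        for m, area in enumerate(areas):
            y.append(area - plot_mean)
            d_sr.append((1.0 if m == 1 else 0.0) - 1.0 / 3.0)
            d_gps.append((1.0 if m == 2 else 0.0) - 1.0 / 3.0)

    # OLS on the demeaned data via the 2x2 normal equations.
    s11 = sum(a * a for a in d_sr)
    s22 = sum(b * b for b in d_gps)
    s12 = sum(a * b for a, b in zip(d_sr, d_gps))
    s1y = sum(a * v for a, v in zip(d_sr, y))
    s2y = sum(b * v for b, v in zip(d_gps, y))
    det = s11 * s22 - s12 * s12
    coef_sr = (s22 * s1y - s12 * s2y) / det
    coef_gps = (s11 * s2y - s12 * s1y) / det
    return coef_sr, coef_gps


if __name__ == "__main__":
    # Hypothetical plot areas in acres; not taken from the survey.
    sample = [
        {"cr": 1.0, "sr": 1.5, "gps": 0.98},
        {"cr": 2.0, "sr": 1.8, "gps": 1.97},
        {"cr": 0.5, "sr": 1.0, "gps": 0.49},
    ]
    print(fixed_effects_bias(sample))
```

With a balanced design like this one, each coefficient reduces to the average within-plot difference from the CR measurement, which is why the dummies can be read as mean bias relative to the benchmark.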
In a second specification, we estimate the interactions of SR and GPS with an indicator of the CR reported tercile, as the descriptive statistics in Table 4 indicate substantial variation by tercile in measurement differences.
We can also estimate the effect of self-reported land measures relative to GPS measures using the wave 2 data. As the wave 2 data only include SR and GPS measurements, we cannot directly compare coefficients across the specifications or control for plot fixed effects. Therefore, we estimate the following equation with the wave 2 data, controlling for observable characteristics which may influence self-reported land size:

$$A_{im} = \alpha + \sum_{t=1}^{3} \beta_t \left(SR_{im} \times T_{it}\right) + X_i \gamma + \left(SR_{im} \times X_i\right)\theta + \eta_e + \varepsilon_{im} \qquad (2)$$

where $A_{im}$ is the size of plot $i$ according to measurement $m$, $SR_{im}$ is a dummy variable for SR observations, $T_{it}$ indicates the land area tercile $t$ of plot $i$, $X_i$ is a matrix of manager, household, and plot characteristics, and $\eta_e$ represents enumeration area fixed effects. In this formulation, $\beta_t$ captures the inherent difference between SR land size by tercile and the omitted measure (GPS), $\gamma$ will capture any characteristics that are correlated with overall plot size (regardless of measurement method), and the interactions between $SR_{im}$ and $X_i$ will capture any specific effects of characteristics on SR plot size relative to GPS. The latter will provide insight on what characteristics contribute to the differences between the two measurement methods.

... underestimating the size of large plots. The disaggregated GPS land measurement bias relative to CR is negative across the land distribution and statistically different from CR in land tercile 2 (-0.02 acres relative to 0.83 acres tercile mean). The hypothesis of equality of the SR and GPS coefficient estimates is rejected at the 5% level.

<<< TABLE 5 HERE >>>
The SR land bias is estimated relative to GPS land measures in the nationally representative wave 2 sample in columns 3, 4, and 5. Though CR measures are considered the 'gold standard' in land measurement, the validation sample indicates only small differences between CR and GPS measures. The wave 2 results for estimation of equation 1 in column 3 highlight the more striking differences we observed between the GPS and SR measurements obtained in the second wave of the GHS-Panel, controlling for plot fixed effects. The SR marginal effect relative to GPS measures in wave 2 is highly significant and much higher than that observed in the validation data (0.7 acres for tercile 1, 1.4 acres for tercile 2, and 1 acre for tercile 3). The SR bias is also nonlinear, as in the validation sample.
In column 5, we estimate equation 2 using the wave 2 data to determine which characteristics help explain the difference between SR and GPS measurements, controlling for EA fixed effects. The specification controls for observable household, manager, plot and interview characteristics in estimating land measurement bias. The results indicate that larger households have larger plots on average. Likewise, whether a plot can be used as collateral and the number of crops grown on a plot are both positively associated with plot size. Larger plots are more likely to be available for use as collateral and have a greater capacity for cultivation of multiple crops.
Household, manager, plot and interview characteristics potentially capture important dimensions of land measurement bias. These characteristics are interacted with the SR indicator in equation 2 to estimate their effect on SR land size relative to GPS-measured land size in wave 2. Observable manager and household characteristics largely do not have statistically significant interactions with self-reported land sizes. Plot characteristics, including plot tenure status, explain differences between SR and GPS plot sizes in the wave 2 data. SR plot size was higher on plots which were distributed by the community or family compared to purchased plots (the omitted category). This could indicate that farmers are less knowledgeable about the size of plots that did not involve a land market transaction. Farmers are more likely to know the true size of plots acquired through a transaction (whether purchased or rented) since size is important in determining the price of the plot. Interview characteristics, including the day of the week on which the interview was conducted, did have statistically significant effects on the difference between SR and GPS land measurements. Respondents also gave lower SR estimates for interviews that started in the late afternoon compared with those that started in the morning.
One potential concern with our results in column 5 is omitted variable bias. In columns 3 and 4, we control for all observed and unobserved plot characteristics with the plot fixed effects. In column 5, we drop the level of fixed effects to the enumeration area (EA) and include observable characteristics to explain observed plot size. If there were no omitted variables that are important covariates with plot size, we would expect the coefficient estimates on the SR measurement indicators to be the same in columns 4 and 5. However, the estimates in column 5 are nearly twice those in column 4. This could indicate that omitted variable bias is present in that specification. We should therefore be cautious in interpreting the effects of the interactions of observable characteristics with SR land measures. The next section examines the potential implications of SR land measurement bias in the estimation of the inverse farm size and input demand specifications, frequently estimated in empirical applications.

Inverse Farm Size Relationship and Input Demand Functions
We have shown above that there are significant nonlinear biases in self-reported land estimates. In this section, we examine two well-established agricultural relationships to estimate the effect of these biases on the inverse farm size relationship and input demand functions. First, we propose to estimate the inverse land size-productivity relationship (IR) documented by Carter (1984), Barrett (1997), Assunção and Braido (2008), Carletto, Savastano, and Zezza (2013), and Carletto, Gourlay, and Winters (2015).

Second, we estimate input demand functions. Previous studies find that a farmer's decision to invest is closely related to the size of the farm, as production technologies change with increasing farm size. Deininger et al. (2003) find that farmers with more land per capita are less likely to adopt the long-term investment of planting trees on their farm. Other researchers have found that farmers who cultivate more land are more likely to invest in improved inputs such as fertilizer, herbicides, pesticides, and improved seed varieties (Marenya and Barrett, 2007; Erenstein, 2006; Thuo et al., 2011). We estimate the input demand function (equation 4) using a linear probability model, where $I_i$ indicates use of fertilizer, herbicide/pesticide, or hired labor on plot $i$. As for equation 3, we estimate equation 4 for each measurement method in both the validation and full samples, including the plot area tercile interactions for the validation sample and EA fixed effects. Differences across measurement methods in the coefficient $\delta$ on plot size can be attributed to the measurement bias in plot size. Using this difference in coefficient estimates, we can estimate the change in predicted probability of agricultural investment ($I$) due to plot measurement error.

In columns 3 and 4, we estimate the IR relationship using the wave 2 main visit SR and GPS measurements for validation sample plots. The SR IR estimate in column 3 is -0.693, very similar to the estimate in column 1 of -0.576.
We also find that the wave 2 GPS IR estimate in column 3 is not significantly different from the estimate using SR areas. Column 4 presents the analysis when plots are disaggregated by land area tercile.

Results: Inverse Farm Size Relationship
Unlike the results from the validation data (columns 1 and 2), there appears to be no significant difference between the IR estimates produced with GPS and SR area measures at any tercile. The hypothesis test of the equality of coefficients cannot reject the null hypothesis.

<<< TABLE 6 HERE >>>
In columns 5 and 6 we compare results for the wave 2 measurements for the full wave 2 sample. The IR estimates are surprisingly similar for the SR and GPS measures for the full sample. This could indicate that the measurement bias in the full sample is relatively similar for both the wave 2 SR and GPS, though we cannot explicitly measure the bias. While we cannot directly compare these estimates with the validation sample estimates, it is interesting to note that the IR relationship for wave 2 GPS is almost identical to that of the validation CR specification.
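Because plot area enters an IR regression twice, in the denominator of yield and as the regressor, error in the area measure can shift the estimated slope even when output is recorded correctly. The sketch below illustrates this one mechanical channel with hypothetical numbers. It is not the paper's estimating equation, and the function names, the assumed true elasticity of -0.3, and the factor-of-1.5 reporting errors are illustrative assumptions only.

```python
import math

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

def ir_slope(output, area):
    """Regress log yield (output per acre) on log area."""
    log_area = [math.log(a) for a in area]
    log_yield = [math.log(q / a) for q, a in zip(output, area)]
    return ols_slope(log_area, log_yield)

# Hypothetical true areas (acres); output is built so the true IR slope is -0.3.
true_area = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
output = [a ** 0.7 for a in true_area]

# Hypothetical reporting error: each area is over- or under-stated by a factor
# of 1.5, in a pattern uncorrelated with log true area in this small sample.
factors = [1.5, 1 / 1.5, 1 / 1.5, 1.5, 1.5, 1 / 1.5, 1 / 1.5, 1.5]
reported_area = [a * f for a, f in zip(true_area, factors)]

slope_true = ir_slope(output, true_area)
slope_reported = ir_slope(output, reported_area)
```

In this constructed case the slope estimated with the error-ridden areas is more negative than the slope estimated with the true areas. The direction and size of the shift depend on how the error relates to true plot size, so the example should not be read as a prediction for any particular survey.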
The results have shown that there can be considerable measurement error associated with estimating the IR relationship using the different land measurement methods. It appears that the bias associated with GPS measurements is consistently lower than the bias when using SR. We also find that the SR estimate of the IR is larger in magnitude than that for GPS, particularly for larger plots. This is in contrast to Carletto, Savastano, and Zezza (2013), who find the opposite to be the case in Uganda, but largely in agreement with the findings of Carletto, Gourlay, and Winters (2015) in Malawi, Uganda, and Niger.

Table 7 presents the results of fertilizer, pesticide/herbicide, and hired labor demand, respectively. The results from the validation sample are provided in columns 1, 2, 5, 6, 9, and 10. For all three we find the expected positive relationship between farm size and use of hired labor, fertilizer and herbicide, albeit with very small effects. This is not surprising, as the use of hired labor and adoption of non-labor inputs like fertilizer, herbicide and pesticide are low for farm households in developing countries. In the validation data, the interaction terms on GPS and SR with log of plot area are not significantly different from zero in any of the three models, nor when the data are disaggregated by land area tercile.

Results: Input Demand Functions
The results from the full GHS-Panel wave 2 sample are provided in columns 3, 4, 7, 8, 11, and 12 of Table 7. Similar to the results from the validation sample, the coefficients on both the SR estimates and the GPS measurements are insignificant. When disaggregated into land area terciles, however, there is evidence of bias with the SR estimate. Fertilizer demand is decreased for the largest plots when using the SR estimates, while SR estimates in the pesticide/herbicide model suggest decreased demand for both medium and large plots. For hired labor, the SR estimate for medium plots is reduced with a coefficient of -0.038, suggesting that SR measurement error may bias the estimate of the demand for hired labor downwards.
Overall, it appears that land size bias leads to underestimating the probability of use for inputs such as hired labor, fertilizer, and pesticide/herbicide when SR land sizes are used compared to CR and GPS estimates, particularly for medium and larger plots. Land measurement bias in input demand functions may lead to significant underestimation of the potential effects of land pressure on input intensification.
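The comparison of the plot size coefficient across measurement methods can be sketched as follows. This is a minimal illustration, assuming a linear probability model of input use on log plot area with EA fixed effects removed by demeaning within each EA. The function `lpm_within_ea`, the two EAs, and the six plots are hypothetical; the tercile interactions and any further terms in the paper's specification are not reproduced, and areas are assumed strictly positive so that no zero-value correction is needed.

```python
import math
from collections import defaultdict

def lpm_within_ea(rows):
    """rows: list of (ea_id, uses_input, area_acres), uses_input in {0, 1}.

    Returns the coefficient on log area from a linear probability model
    with enumeration-area fixed effects (within-EA demeaning).
    """
    groups = defaultdict(list)
    for ea, used, area in rows:
        groups[ea].append((float(used), math.log(area)))

    sxx = sxy = 0.0
    for obs in groups.values():
        y_bar = sum(y for y, _ in obs) / len(obs)
        x_bar = sum(x for _, x in obs) / len(obs)
        for y, x in obs:
            sxx += (x - x_bar) ** 2
            sxy += (x - x_bar) * (y - y_bar)
    return sxy / sxx

# Hypothetical plots: (EA, input used, GPS acres, SR acres).
plots = [
    ("EA1", 0, 1.0, 1.0),
    ("EA1", 0, 2.0, 4.0),
    ("EA1", 1, 4.0, 2.0),
    ("EA2", 0, 2.0, 2.0),
    ("EA2", 1, 4.0, 4.0),
    ("EA2", 1, 8.0, 8.0),
]

delta_gps = lpm_within_ea([(ea, used, gps) for ea, used, gps, _ in plots])
delta_sr = lpm_within_ea([(ea, used, sr) for ea, used, _, sr in plots])
coefficient_gap = delta_gps - delta_sr
```

In this made-up example, misreporting two plot sizes in one EA halves the estimated coefficient relative to the GPS-based estimate, which is the kind of gap the text attributes to measurement bias.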

Conclusion
Land measurement is a critical component of many statistical analyses centered on agriculture. However, acquiring information on farmers' land area in developing countries can be time consuming, costly, and potentially inaccurate depending on the method employed. Given the relative importance of plot measurements in informing, implementing, and targeting a variety of agricultural interventions, it is vital to have a clear understanding of the potential measurement error associated with each method. In this paper we have investigated how land size measurements vary across measurement methodologies and the econometric and policy implications of land measurement error in estimation of the commonly estimated inverse land size-productivity relationship and selected input demand functions. Using data from the Nigeria GHS-Panel coupled with the validation survey, we were able to compare the three main land measurement methods. Such a comparison using panel data has not been available in previous research, to the best of our knowledge.
In our comparison of the three different measurement methods, we find that measurement error is larger for SR plot size estimates than GPS estimates. On average, farmers overestimate plot size by 2.9 percent while GPS underestimates CR size by only 1 percent. Mean differences in land size hide much of the interesting variation in differences across the land distribution related to measurement method. The error associated with SR is highly variable depending on plot size. The error for the smallest third of plots was over 80 percent, compared to 30 percent for medium-sized plots and -20 percent for large plots. The GPS error was relatively consistent, ranging from -0.2 percent for large plots to -2.8 percent for medium plots. The comparison of the measurement methods overall indicates that GPS estimates provide a relatively accurate estimate of plot size, while SR estimates are subject to sizeable error, with overestimates especially for small plots and significant underestimates for larger plots.
Lastly, we examine two econometric relationships often estimated in the land literature: the inverse farm size-productivity relationship (IR) and input demand functions. By estimating these common behavioral equations, we are able to quantify the effect of land measurement bias. For all three measures, we find the expected negative relationship between land size and productivity. However, there are some considerable differences in the magnitudes of the IR estimate. The bias associated with using SR land size is much higher than when using GPS land size in IR estimates. The much stronger and biased estimated relationship between plot size and agricultural productivity using SR measurements could mislead policy makers to conclude that increasing the size of a farmer's plot will result in a sizeable decrease in productivity when in fact it is only a modest decrease. We also found considerable differences in the estimates for the input demand functions. The SR measurement underestimates the effect of land size on the use of the selected inputs. With GPS and CR measurements, it appears that the link between land size and input use is stronger. Of particular concern is the consistent finding for both the IR and input demand functions that small and larger plots induce different directions of bias due to land measurement error.
Overall, our findings indicate that GPS estimates are more reliable than SR measurements. Farmer-estimated land size was subject to greater measurement error, varied more between the main GHS-Panel wave 2 visit and the validation survey visit, and introduced the greatest bias into estimation of the negative land size-productivity relationship and the input demand functions. With well-trained enumerators, GPS estimates provide an effective approximation of CR measurements at a lower cost.

Note: All estimates are weighted to account for survey design (with the exception of Signed-rank tests).

Notes: OLS estimates with standard errors clustered at the enumeration area in parentheses. Significance denoted: *** p<0.01, ** p<0.05, * p<0.1. Equation 5 includes household, plot manager, plot and interview characteristics as well as interactions of these variables with the SR land size. These estimates are available upon request.

Linear probability coefficient estimates presented with standard errors in parentheses. All regressions include enumeration area fixed effects. Indicators for the major crops were omitted to save space. We use the Battese (1997) method to correct for zero values when taking the natural log. The results for Battese dummy variables associated with zero values are omitted to save space. Significance denoted: *** p<0.01, ** p<0.05, * p<0.1.