Could the Debate Be Over? Errors in Farmer-Reported Production and Their Implications for the Inverse Scale-Productivity Relationship in Uganda

Based on a two-round household panel survey conducted in Eastern Uganda, this study shows that the analysis of the inverse scale-productivity relationship is highly sensitive to how plot-level maize production, hence yield (production divided by GPS-based plot area), is measured. Although farmer-reported production-based plot-level maize yield regressions consistently lend support to the inverse scale-productivity relationship, the comparable regressions estimated with maize yields based on sub-plot crop cutting, full-plot crop cutting, and remote sensing point toward constant returns to scale, at the mean as well as throughout the distributions of objective measures of maize yield. In deriving the much-debated coefficient for GPS-based plot area, the maize yield regressions control for objective measures of soil fertility, maize genetic heterogeneity, and edge effects at the plot level; a rich set of plot, household, and plot manager attributes; as well as time-invariant household- and parcel-level unobserved heterogeneity in select specifications that exploit the panel nature of the data. The core finding is driven by persistent overestimation of farmer-reported maize production and yield vis-a-vis their crop cutting-based counterparts, particularly in the lower half of the plot area distribution. Although the results contribute to a larger, and renewed, body of literature questioning the inverse scale-productivity relationship based on omitted explanatory variables or alternative formulations of the agricultural productivity measure, the paper is among the first documenting how the inverse relationship could be a statistical artifact, driven by errors in farmer-reported survey data on crop production.


Introduction
Worldwide, 475 million farms, constituting 84 percent of a total of 570 million farms, are estimated to be less than 2 hectares (Lowder et al., 2016), and (smallholder) agricultural activities constitute an integral part of livelihoods in rural areas that are home to nearly 70 percent of the population in low-income countries. In Africa specifically, the average share of rural household income stemming from agriculture could be up to 69 percent (Davis et al., 2017), and research has consistently documented higher rates of expected poverty reduction associated with agricultural vis-à-vis nonagricultural growth (see Dorosh and Thurlow, 2016 and the studies cited therein).
The importance of agriculture for development is recognized in the formulation of the Sustainable Development Goal (SDG) Targets 2.3 and 2.4, which require doubling of agricultural productivity and incomes of small-scale food producers, and ensuring sustainable food production systems and implementing resilient agricultural practices that increase productivity and production. Both targets are associated with indicators that rely on crop production and land area information sourced from household or farm surveys, and the documented effects of measurement methods on the accuracy of land productivity measurement and analysis underscore the need for high-quality survey data for cross-country monitoring, broader economic research, and policy formulation (Carletto et al., 2013, 2015; Kilic et al., 2017a, 2017b).
An often observed yet insufficiently explained puzzle in smallholder agriculture is that of the inverse relationship between scale (in terms of farm or plot size) and (land) productivity (henceforth referred to as the IR). The existence of the IR could have a direct bearing on policy formulation as it, for instance, relates to reforms targeting land and non-land input markets; land redistribution programs; and identification of beneficiary target universe for agricultural development programs. If the IR does indeed exist, and is not an artifact of measurement error in the data, that would suggest governments might encourage redistribution of land, favoring smaller allotments, even if the operational implications of such a recommendation are not clear. Alternatively, the existence of the IR could simply temper concerns around plots becoming smaller or, more broadly, "small" plots, often managed by females and poorer farmers (Kilic et al., 2015a, 2015b).
The body of literature on the IR is rich, and yet inconclusive. In the African context, Larson et al. (2014) revisit the IR debate with a focus on maize, using data from a wide array of household and farm surveys conducted from 1999 to 2009 across the continent, and provide support to the existence of the IR at both the farm and plot levels. Following Eastwood et al. (2010), it has been suggested that the IR may be explained by (i) lower supervision costs on smaller holdings and plots (Feder, 1985); (ii) missing or incomplete factor markets (Barrett, 1996; Eswaran and Kotwal, 1986); (iii) omitted variables, in particular, controls for farmer ability and land quality (Assuncao and Braido, 2007; Assuncao and Ghatak, 2007; Benjamin, 1995; Bhalla and Roy, 1988; Lamb, 2003); and (iv) errors in land area measurement (Lamb, 2003).
Working backwards, on hypothesis (iv), Carletto et al. (2013) in a Uganda-specific study, and Carletto et al. (2015) in a cross-country study focused on Sub-Saharan Africa, document that land areas, on average, are over-reported by farmers, but that the IR persists even after objective, GPS-based land area measures are used in empirical analyses. On hypothesis (iii), Nkonya et al. (2004) and Barrett et al. (2010) show that controlling for objectively-measured soil quality based on laboratory analyses does not explain the IR at the plot-level. On hypothesis (ii), Ali and Deininger (2015) show that at the farm-level, the IR can be explained by using a productivity measure that nets out the labor input valued at market wages. A more recent hypothesis not reviewed by Eastwood et al. (2010) suggests that land productivity may be greater along the edge of plots as labor may be concentrated in those visible areas (Bevis and Barrett, 2017). Smaller plots have a greater ratio of visible edge area to less-visible interior area, which, if edges are indeed more productive, may explain the often-observed IR. Bevis and Barrett (2017) provide support for this hypothesis in the context of Uganda.
In debating the IR, however, the question of potential measurement error in production figures has rarely been examined. This is a profound oversight considering that the studies quoted above all measure agricultural productivity based on farmer-reported production. This is likely attributable (at least in part) to the fact that data on objectively-measured agricultural production via crop cutting is rarely available in large-scale household survey operations due to resource constraints. Yet, severe systematic biases have been found in other farmer-estimated data, such as plot area (see Carletto et al., Forthcoming, and the studies cited therein). Farmer-reported production estimates are exceptionally complicated. The use of non-standard production units, various conditions and states of crop harvests, and the reporting of permanent crop harvests, for starters, threaten the quality of farmer-estimated production. Additionally, humans often exhibit an inclination to round off numbers, as shown for land area measurement in Carletto et al. (2015), which may bias production estimates.
Using unique panel data from a methodological study on maize productivity in Eastern Uganda that includes self-reported, as well as highly-supervised crop-cutting and remotely-sensed production and yield estimates, this study aims to determine whether the IR can be explained by measurement error in self-reported production estimates. The yield measures, irrespective of the approach to production measurement, are anchored in GPS-based plot area measurement. Overall, we provide unambiguous support for the sensitivity of the plot-level IR to the choice of the method by which maize production and yield are computed. While farmer-reported production-based maize yield regressions consistently imply diminishing returns to GPS-based plot area, the comparable regressions estimated with maize yields based on sub-plot crop cutting, full-plot crop cutting, and high-resolution satellite imagery-based remote sensing point towards constant returns to scale.
In view of the aforementioned hypotheses put forth for the IR, it is important to note that the results are robust to the inclusion of objective measures of soil fertility, maize genetic heterogeneity and edge effects at the plot-level; a rich set of plot, household and plot manager attributes; as well as household and parcel fixed effects in select specifications that exploit the panel nature of the data. In other words, irrespective of the exhaustive list of controls and panel estimation, the IR exists while using farmer-reported maize production, and ceases to do so when maize yield is anchored in objective measurement methods. In fact, the IR persists throughout the distribution of plot-level maize yields that are based on farmer-reported production, while constant returns to scale prevail across the productivity distribution when one uses objective yield measures.
Our core finding is driven by persistent over-estimation of farmer-reported maize production and yield vis-à-vis their crop cutting-based counterparts, particularly in the lower half of the plot area distribution. Though our findings contribute to a larger (and renewed) body of literature questioning the inverse scale-productivity relationship based on omitted explanatory variables or alternative formulations of the agricultural productivity measures, the analysis, together with Desiere and Jolliffe (2017), is the first to document how the IR could be a statistical artifact, driven by errors in self-reported survey data on crop production.
The paper is organized as follows. Section 2 discusses the conceptual basis for errors in farmer-reported crop production and their potential implications for the IR. Section 3 provides relevant background on the Ugandan context. Section 4 describes the data. Section 5 lays out the empirical strategy. Section 6 presents the results. Section 7 concludes.

Conceptual Basis for Potential Errors in Farmer-Reported Crop Production
The quality of farmer-reported estimates of production is degraded by a myriad of issues, summarized in Table 1. Human performance and behavior are subject to a number of constraints, one of them being finite memory. Recall bias can, therefore, pose a serious threat to the quality of farmer-reported estimates of production. This is especially relevant if there is partial green harvest (for immediate consumption or sale) or if permanent or extended-harvest crops, such as cassava, are grown. In these situations, aggregating all of the production from the previous completed agricultural season becomes a taxing mental exercise.
Also, inherent in human behavior is the inclination to round off values. For instance, recent research has illustrated the persistence of rounding in plot area estimates and the ensuing effects on data quality (Carletto et al., 2015; Forthcoming). The same problem is likely present in the estimation of production. Rounding of production estimates may pose a bigger problem on plots with lower production, such as smaller plots, as the rounding error relative to true production is likely greater on these plots. Suppose, for the sake of example, that farmer A, with true production of 1.5 bags, reports that he produced 2 bags. Farmer B has a true production of 11.5 bags, but he reports 12 bags of production. Farmer A overstates his production by 33.3 percent, while farmer B only overstates by 4.3 percent.
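The arithmetic behind this example, and its possible consequence for the IR, can be sketched in a few lines of Python. The sketch is illustrative only: the function names, the simulated area and yield distributions, the 100 kg bag size, and the assumption that farmers round up to the next whole bag are all hypothetical and are not taken from the MAPS data. True yield is simulated as independent of plot area, so any negative slope in the reported-yield regression arises purely from rounding.

```python
import numpy as np


def relative_overstatement(true_bags: float, reported_bags: float) -> float:
    """Percentage by which reported production exceeds true production."""
    return 100.0 * (reported_bags - true_bags) / true_bags


# The two farmers from the example in the text.
print(round(relative_overstatement(1.5, 2), 1))    # 33.3
print(round(relative_overstatement(11.5, 12), 1))  # 4.3


def simulate_slopes(n_plots: int = 5000, seed: int = 0):
    """Slopes of log yield on log area, for true and for rounded production.

    All distributions below are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    log_area = rng.normal(np.log(0.1), 0.8, n_plots)       # plot area, hectares
    log_yield = rng.normal(np.log(1500.0), 0.5, n_plots)   # true yield, kg/ha
    true_bags = np.exp(log_area + log_yield) / 100.0        # 100 kg bags
    reported_bags = np.ceil(true_bags)                      # assumed: round up
    log_reported_yield = np.log(reported_bags * 100.0) - log_area
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope_true = np.polyfit(log_area, log_yield, 1)[0]
    slope_reported = np.polyfit(log_area, log_reported_yield, 1)[0]
    return slope_true, slope_reported


slope_true, slope_reported = simulate_slopes()
print("slope with true production:   ", slope_true)
print("slope with rounded production:", slope_reported)
```

Under these assumed settings, the slope on log plot area is close to zero for true yield and negative for yield based on rounded production. This mirrors the mechanism described above without implying anything about its magnitude in real survey data.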
Further, depending on the context and farmer characteristics, intentional bias may be at play. Farmers may be inclined to understate production if they perceive an incentive to do so. Perceived incentives may come in the form of eligibility for various assistance programs or a possible threat of taxation, for example. Social desirability bias may lead farmers to overstate their production in order to appear successful.
Setting the challenges with human behavior aside, aggregating farmer-reported production data for analysis poses several challenges, as summarized by Oseni et al. (2017). First, farmers frequently utilize non-standard measurement units for the quantification of production. These units vary across countries, and often exhibit within-country variation across space and time. Even if the questionnaire instrument permits the use of non-standard measurement units for recording production quantities, for standard analyses, these values must either be monetized or converted into kilogram (kg) equivalent terms. In the case of monetization, if the common measurement units used for production quantification differ from the common units associated with crop sales, the valuation of production will be a challenge without converting crop production and sales into kg-equivalent terms. In the case of kg-equivalent conversion, one would need to rely on crop-unit-specific conversion factors (possibly differentiated by region), which are often either unavailable to the researcher or available but not documented adequately and/or suspected to be out of date or in need of further validation.
A factor that further mediates the success of the kg-equivalent conversion is the issue of crop condition and state, and in the case of cereals, the ability to express production in kg-grain-equivalent terms. Put differently, a farmer that reports two 100 kg bags of green maize that is on the cob would have drastically different production than a farmer that reports two 100 kg bags of maize in dried grain form. If the questionnaire instrument does not allow for the distinction of crop condition and state, the researcher would need to make assumptions regarding the crop conditions and states associated with the reported production quantities. Even if the questionnaire instrument allows for the distinction of crop condition and state, the conversion factor database would need to cover each crop-unit-condition-state combination, a more demanding challenge for survey implementers to address compared to having a simpler conversion factor database at the crop-unit level.
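To make the conversion problem concrete, the following Python sketch shows what a lookup keyed on crop, unit, and condition/state might look like. The conversion factors, unit names, and function name are placeholders invented for illustration; they are not values from any survey's conversion factor database.

```python
from typing import Dict, Tuple

# (crop, unit, condition/state) -> kg of dry grain per reported unit.
# HYPOTHETICAL factors, for illustration only.
ConversionKey = Tuple[str, str, str]
KG_GRAIN_PER_UNIT: Dict[ConversionKey, float] = {
    ("maize", "100kg_bag", "dry_grain"): 100.0,
    ("maize", "100kg_bag", "dry_cob"): 70.0,
    ("maize", "100kg_bag", "green_cob"): 35.0,
    ("maize", "basin", "dry_grain"): 18.0,
}


def to_kg_grain(crop: str, unit: str, state: str, quantity: float) -> float:
    """Convert a reported quantity into kg-grain-equivalent terms.

    Raises KeyError when no factor exists for the combination, rather than
    silently falling back on an assumed condition or state.
    """
    key = (crop, unit, state)
    if key not in KG_GRAIN_PER_UNIT:
        raise KeyError(f"no conversion factor for {key}")
    return quantity * KG_GRAIN_PER_UNIT[key]


# Two farmers who both report "two bags", in different crop states.
print(to_kg_grain("maize", "100kg_bag", "dry_grain", 2))  # 200.0
print(to_kg_grain("maize", "100kg_bag", "green_cob", 2))  # 70.0
```

Raising an error for a missing combination is a design choice of this sketch: it makes gaps in the conversion factor database visible instead of hiding them behind a default.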
As the literature on the IR largely overlooks the abovementioned issues with relying on farmer-reported production values, we now turn to the description of the Ugandan context and the data that allow us to explore whether, and the extent to which, errors in farmer-reported production information exist and how, if at all, they affect the often-observed IR.

Country Context
Uganda is a landlocked East African nation of approximately 39 million people, with an annual population growth rate of 3.25 percent and 83.9 percent of the population living in rural areas (2015). National and rural rates of the population living in poverty, with respect to the national poverty line, are estimated at 19.5 and 22.4 percent, respectively (2012), and gross domestic product per capita in current US dollars stands at 705 (2015). The economy is heavily dependent on agriculture, such that agricultural land constitutes 71.9 percent of total land area (2014), agriculture value added corresponds to 25.8 percent of the GDP (2015), and agricultural employment makes up 71.7 percent of total employment (2013). Agriculture, specifically increasing agricultural production and productivity, is identified as an essential vehicle for wealth creation in the country's key policy documents (GoU, 2013; MAAIF, 2013).
Maize is a major staple, commercial, and export crop in Uganda. It is the leading cereal crop grown in almost all parts of the country. In Eastern Uganda, the country's leading maize producing region (UBOS, 2010), the crop accounts for the highest share (25 percent) of crop income (World Bank, 2016). At the same time, Eastern Uganda, following Northern Uganda, is also the region with the highest concentration of the country's poor, and the latest estimate of the regional absolute poverty rate stands at 24.5 percent (World Bank, 2016). Commercialization of maize is relatively low in Eastern Uganda, with less than 41 percent of maize producing households selling any quantity of maize. According to FAOSTAT data depicted in Figure 1 and Figure 2, the trends in maize area harvested and maize yield are positive for the period of 1995-2014. The national area harvested increased steadily over time, from 571,000 hectares to 1.1 million hectares in 2014. The national yield fluctuated around the 1.5 tons per hectare mark from 1995 to 2007, but has been above 2.3 tons per hectare in the period of 2008-2014. The latest national yield estimate in 2014 was 2.5 tons per hectare. Recent FAOSTAT yield estimates, however, are markedly higher than those computed from household survey data, including from the methodological experiment informing this study, which employs objective measurement methods. These discrepancies in yield estimates call into question, at least for Uganda, the validity of FAOSTAT figures.

Overview
MAPS: Methodological Experiment on Measuring Maize Productivity, Soil Fertility and Variety is a two-round household panel survey that was conducted in Eastern Uganda to test the relative accuracy of subjective approaches to data collection vis-à-vis objective survey methods for maize yield measurement, soil fertility assessment, and maize variety identification. The survey has been implemented by the Uganda Bureau of Statistics, with technical and financial assistance provided by an inter-agency partnership that is led by the World Bank Living Standards Measurement Study (LSMS), using the Survey Solutions Computer-Assisted Personal Interviewing (CAPI) platform. [9]

Sampling Design
In Round I, the MAPS fieldwork was conducted during the first rainy season of 2015, from April to October 2015, in Eastern Uganda, the top maize-producing region of the country. The sample was composed of 75 enumeration areas (EAs) that were selected from the 2014 Population and Household Census (PHC) EA frame and that were distributed across 3 strata, namely (1) Sironko district (15 EAs), (2) Serere district (15 EAs), and (3) a 400 km² remote sensing tasking area spanning Iganga and Mayuge districts (45 EAs). In each stratum, the EAs were sampled with probability proportional to size, in accordance with the pre-dissemination 2014 PHC EA-level household counts. In each sampled EA, a household listing exercise was conducted to identify, separately, the list of households that were cultivating at least 1 pure stand maize plot, and the list of households that were cultivating at least 1 intercropped maize plot, on which maize is self-identified to be the dominant crop. Given the interest in the validation of survey methods in key sub-samples and the fact that approximately two-thirds of all maize plots are intercropped in Uganda, the original intention had been to select, at random, 6 households from each of the pure stand and intercropped universes of households of an EA, and ensure an even sample split by maize cultivation status.
[9] The technical assistance to MAPS I (2015) and MAPS II (2016)
Still, due to the low incidence of pure stand households, and the cases in which pure stand households would switch to intercropping status between the household listing and the first interview, the sample at the start of MAPS I fieldwork was composed of 900 households, of which 385 were pure stand (43 percent) and 515 were intercropped (57 percent). Within the remote sensing tasking area specifically, the MAPS I fieldwork started out with 540 households, of which 249 were pure stand (46 percent) and 291 (54 percent) were intercropped. In each MAPS household, 1 maize plot, matching the household cultivation status, was selected at random by the Survey Solutions CAPI application for crop cutting (for objective yield measurement) and soil sampling (for objective soil fertility analysis). The variety identification component was implemented among the 540 MAPS I households residing in the remote sensing tasking area (specific to the plots on which crop cutting and soil sampling took place), as explained below.
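For readers less familiar with probability-proportional-to-size (PPS) selection of EAs, the following Python sketch shows one common variant, systematic PPS sampling on cumulated size measures. The source does not state which PPS procedure was used in MAPS, so the systematic variant, the function name, and the household counts below are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def pps_systematic_sample(sizes: Sequence[int], n: int, seed: int = 0) -> List[int]:
    """Select n units with probability proportional to size (systematic PPS).

    Returns the indices of the selected units. A unit whose size exceeds the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    step = total / n                      # sampling interval
    rng = random.Random(seed)
    start = rng.random() * step           # random start in [0, step)
    targets = [start + k * step for k in range(n)]

    selected: List[int] = []
    cumulative = 0.0
    next_target = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every target that falls inside this unit's size range selects it.
        while next_target < n and targets[next_target] < cumulative:
            selected.append(index)
            next_target += 1
    return selected


# HYPOTHETICAL household counts for ten EAs in one stratum.
household_counts = [120, 80, 200, 150, 95, 60, 180, 110, 140, 75]
sampled_eas = pps_systematic_sample(household_counts, n=3, seed=1)
print(sampled_eas)
```

Because larger EAs span a wider range of the cumulated household count, they are more likely to contain one of the evenly spaced selection points.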
In MAPS II, the MAPS fieldwork was conducted during the first rainy season of 2016, from June to October 2016. The field teams attempted to track and re-interview 540 households that had been interviewed in MAPS I in the 400 km² remote sensing tasking area cutting across Iganga and Mayuge districts. Appendix I provides the MAPS II household tracking protocol that was followed by the field teams. The MAPS II fieldwork successfully interviewed 489 out of 540 households, and the sample informing our analyses is composed of 440 households for which we have obtained crop cutting measures in both rounds. [10] Further, as in Round I, 1 maize plot was selected from each household for crop cutting and variety identification components, in accordance with the following rules. Whenever possible, a plot was selected among those that were matching the household cultivation status in Round I. Preference was also given such that a plot would be selected from the same parcel that had contained the plot selected in Round I. If multiple plots matched the household cultivation status, the CAPI application selected one plot at random. Appendix II lays out the MAPS II plot selection protocol that was implemented by the field teams.
[10] Of the 51 households that we did not interview in MAPS II, 34 were not cultivating maize in the first season of 2016. The remaining 17 households can be broken down as follows: 5 households could not be tracked or were outside of the tracking area, defined as the Iganga and Mayuge districts; 4 households had suffered total crop loss prior to the post-planting interview; 7 households had already harvested their maize by the post-planting interview; and 1 household refused.
The final analysis sample of 440 households does not exhibit statistically significant differences vis-à-vis the original 100 records that had at least been subject to the post-planting interview in MAPS I, in terms of their MAPS I yield measures and control variables for our regressions. The mean comparisons are available upon request. In line with this finding, our multivariate regressions are robust to the use of inverse predicted response probabilities in MAPS I as attrition weights.

Fieldwork
In each round, there were three visits to each household, namely post-planting, crop-cutting, and post-harvest. During the post-planting visit, each household was administered a farm survey that collected information on (1) age, sex, education, economic activities, ethnicity, religion, and extension service receipts for all household members; (2) household dwelling and ownership of consumer durables and farm assets; (3) area, tenure, and individual-disaggregated ownership and rights for all parcels that were owned and/or cultivated by households during the reference rainy season; and (4) information on area, cultivation pattern, management and decision-making, conservation agricultural activities, farmer-assessed soil attributes and quality, pre-harvest labor and seed inputs for all maize plots that were cultivated during the reference rainy season. The plot-level information was intended to be solicited from the corresponding plot manager. Specific to our analysis sample, the rate at which the plot-level information was solicited from the intended plot manager stands at 83 percent in both MAPS I and MAPS II. Following the completion of the household post-planting interview, the enumerator visited the randomly selected maize plot, measured its area and saved its boundaries on a Garmin eTrex 30 handheld GPS device, and set up crop cut sub-plots, in accordance with international best practices, for later harvesting and weighing. A local crop monitor was recruited in each EA to ensure that the designated crop cut sub-plots were not harvested by the households until the subsequent visit. Specific to MAPS I, top and sub-soil samples were also obtained by the enumerators during the post-planting period, for objective soil fertility testing, as detailed below.
During the crop cutting visit, the enumerator harvested the crop cut sub-plots in order to obtain objectively measured harvest quantities. Two household members and the local crop cut monitor helped in the harvesting and shelling of the crop cut harvests that were weighed initially by the enumerator using an industrial digital scale. The samples were later transferred to a centralized facility for additional drying, moisture measurement, and final weighing. At crop cutting, the manager of the randomly selected maize plot also provided information on the morphological attributes of the maize plants on his/her plot with the help of a photo aid, as described below. In the final, post-harvest, visit, farmer-reported information on total plot-specific maize production, non-labor inputs and harvest labor inputs was solicited for all maize plots that were cultivated during the reference season. The post-harvest visit was scheduled within a 2-month period following the completion of each household's harvest.

Key Measurement Domains and Methods
The measurement methods in each of the key measurement domains of interest, namely maize production, plot area, and soil fertility, are summarized below.

Plot Area Measurement
After walking the perimeter of a given plot with the plot manager to identify the boundaries, the enumerators re-paced the perimeter and measured the area with a Garmin eTrex 30 handheld GPS device. The area was recorded on the questionnaire in square meters, and the raw GPS track outline was stored for the remote sensing work program; linking relevant geospatial variables to the plot location; and calculating plot shape metrics. As noted above, the competing yield measures in our study are all anchored in GPS-based plot area measurement.
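As a rough illustration of how area and a simple shape metric can be derived from a stored plot outline, the Python sketch below applies the shoelace formula to boundary vertices. It assumes the GPS track has already been projected to planar coordinates in meters. The perimeter-to-area ratio is used here only as an example of a shape metric; the source does not specify which metrics were calculated, and the function names and coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # projected (x, y) coordinates in meters


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula (square meters)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]   # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Perimeter of the closed polygon (meters)."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


# HYPOTHETICAL plot outlines: a 40 m x 25 m plot and a 10 m x 10 m plot.
large_plot = [(0.0, 0.0), (40.0, 0.0), (40.0, 25.0), (0.0, 25.0)]
small_plot = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

for name, plot in [("large", large_plot), ("small", small_plot)]:
    area = polygon_area(plot)
    perimeter = polygon_perimeter(plot)
    print(name, area, perimeter, perimeter / area)
# large 1000.0 130.0 0.13
# small 100.0 40.0 0.4
```

The smaller plot has the higher perimeter-to-area ratio, which is the geometric fact behind the edge-effect hypothesis discussed in the Introduction.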

Crop Cutting
Crop cutting has been recognized as the gold standard for yield measurement since the 1950s by the Food and Agriculture Organization of the United Nations (FAO). Besides the cost- and supervision-intensive nature of the exercise, several concerns have been raised regarding the accuracy of the method. Even if one places only one random crop cutting sub-plot within the sampled plot, the resulting yield estimate may carry a sampling error if the yields exhibit within-plot heterogeneity. Other suggested sources of error include (1) more thorough harvesting of crop cut sub-plots vis-à-vis typical farmer harvesting practices; (2) possible rounding of crop cut production estimates obtained through scales; (3) use of faulty or inappropriate scales; (4) failure to net out the weight of the measurement container from the measured production; (5) inclusion of plants that fall outside of the sub-plot; and (6) non-random placement of crop cut sub-plots (Fermont and Benson, 2011).
By implementing a well-supervised crop cutting operation that relied on (a) random sub-plot placement and high-precision digital weighing scales in both rounds, (b) full plot harvests in a subsample of MAPS II plots, and (c) field staff and local crop cut monitors that were intensively trained to circumvent the abovementioned criticisms, our crop cutting-based maize yield estimates, in comparison to those that rely on farmer-reported production, are assumed, and later shown, to be more accurate approximations of the true levels. Further, given our interest in the IR, the main assumption in the estimation of our production functions is that the error in crop cutting estimates is independent of the plot size. Given the random placement of the crop cut sub-plots whose size did not vary by plot size, this assumption should be tenable, as also maintained by Desiere and Jolliffe (2017).
Appendix III provides the MAPS sub-plot and full plot crop cutting protocol in detail. In Round I, a 4x4 meter subplot (divided into four 2x2m quadrants) and a separate 2x2 meter subplot were laid on the chosen maize plot during the post-planting visit following a strict protocol to ensure the location of the subplots was random, as described in Appendix III. The subplots were cordoned off until harvest, and were supervised by the local crop cut monitors between the post-planting and the crop cutting visits. Each plot manager was asked not to harvest any crop from the sub-plots until the crop cutting visits, and not to manage the sub-plot any differently than the rest of the plot. These messages, first communicated by the enumerator, were intended to be enforced by the local crop cut monitors.
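A sub-plot harvest is typically converted into a plot-level production estimate by scaling it by the ratio of plot area to sub-plot area, and into a yield by dividing by plot area. The Python sketch below illustrates this arithmetic for the MAPS I layout (a 4x4m and a 2x2m sub-plot, 20 m² in total). The function names, the example weights, and the dry-matter formula used to standardize weights to 12 percent moisture are assumptions made for illustration; the source does not spell out the exact moisture formula applied.

```python
def standardize_moisture(weight_kg: float, moisture_pct: float,
                         target_pct: float = 12.0) -> float:
    """Rescale a grain weight to a target moisture content.

    ASSUMED formula: holds dry matter constant between the measured and
    the target moisture levels.
    """
    return weight_kg * (100.0 - moisture_pct) / (100.0 - target_pct)


def plot_production_kg(subplot_harvest_kg: float, plot_area_m2: float,
                       subplot_area_m2: float) -> float:
    """Scale the sub-plot harvest up to the whole GPS-measured plot."""
    return subplot_harvest_kg * plot_area_m2 / subplot_area_m2


def yield_kg_per_ha(production_kg: float, plot_area_m2: float) -> float:
    """Production divided by plot area, expressed per hectare."""
    return production_kg * 10_000.0 / plot_area_m2


# HYPOTHETICAL example: five 2x2m quadrant weights (kg) at 13 percent moisture.
quadrant_weights_kg = [0.62, 0.58, 0.66, 0.54, 0.60]
harvest_kg = standardize_moisture(sum(quadrant_weights_kg), moisture_pct=13.0)

plot_area_m2 = 1000.0      # hypothetical GPS-based plot area
subplot_area_m2 = 20.0     # 4x4m plus 2x2m sub-plots
production = plot_production_kg(harvest_kg, plot_area_m2, subplot_area_m2)
print(production, yield_kg_per_ha(production, plot_area_m2))
```

In this formulation, plot area cancels out of the sub-plot-based yield, which reduces to the sub-plot harvest divided by the sub-plot area. Plot area enters only the plot-level production estimate.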
During the crop cutting visit, the shelled maize harvests tied to each of the five 2x2m quadrants were weighed and barcoded separately in the field, and were reweighed at a central location in Kampala under strict supervision following additional drying (once the moisture content was in the range of 12 to 14 percent). At the time of the final weighing, the moisture content of each sample was captured so as to standardize all crop cut sample weights used for our analyses at 12 percent moisture. The MAPS I sub-plot crop cutting-based plot-level maize production estimates are computed by multiplying the combined crop cut sub-plot production across the 20 m² area covered by the combination of the 4x4m and the 2x2m subplots by the ratio of the entire GPS-based plot area in m² to 20 m². [13] In MAPS II, only one 8x8 meter sub-plot (divided into four 4x4m quadrants) was laid on each plot. [14] The harvests tied to each of the four 4x4m quadrants were weighed and barcoded separately in the field, and the rest of the protocols for crop cut sub-plot supervision and harvest management were identical to those followed in MAPS I. [15] The MAPS II sub-plot crop cutting-based plot-level maize production estimates are computed by multiplying the crop cut sub-plot production across the 64 m² area covered by the 8x8m subplot by the ratio of the entire GPS-based plot area in m² to 64 m². In addition, prior to the start of the MAPS II fieldwork, half of the target household population was chosen at random, within each of the pure stand and intercropped domains in each EA, to be subject to a full-plot crop cut, as part of which the entire area of the selected plot was harvested, shelled, and weighed by the enumerator, with help from the crop cut assistants recruited from the household members and the EA-specific crop cut monitor. On the plots selected for full-plot harvest, the harvest of the designated 8x8m subplot was weighed separately from the full-plot harvest to allow for comparative yield analysis.
The full-plot harvests were weighed only in the EAs, as transporting them to a central location for additional drying and reweighing was deemed logistically infeasible.
[13] While not reported here, the MAPS I average sub-plot crop cutting-based plot-level maize yield was not sensitive to whether one used the 20 m² crop cut sub-plot area across both sub-plots; the 16 m² crop cut sub-plot area covered by the 4x4m sub-plot; or the 4 m² crop cut sub-plot area covered by the 2x2m subplot. These results are available upon request.
[14] The change to the number and size of the sub-plots in MAPS II was motivated by two factors. First, in MAPS I, within the remote sensing tasking area, the difference between the average maize yield based on the 4x4m subplot and the comparable statistic based on the 2x2m subplot was not statistically significant. Second, the standard deviation among the yields obtained from the five 2x2m crop cut sub-plots was used to calculate the standard error of the mean yield for the field, assuming that the variability within the sample was representative of the variability throughout the field. A large fraction of the fields had standard errors above 10, even 20, percent of the mean, especially for fields with relatively low estimates of mean yield. This in turn raised a question around the extent to which, say, the yield based on a 4x4m sub-plot could be deemed representative of the yield based on the entire plot area. Taking these observations and other logistical considerations into account, the crop cut sub-plot area was increased to 8x8m in MAPS II, with each of the 4x4m quadrant harvests tracked separately to ensure maximum comparability to MAPS I.
[15] In cases where the respondent harvested, despite the instructions during the post-planting visit, a portion of the crop-cutting sub-plot prior to the crop cutting visit, the crop-cutting production figure was inflated by the percentage of the sub-plot pre-harvested, as reported by the enumerator, based on his/her observation, on the crop-cutting questionnaire. In establishing the pre-harvest percentage, the enumerators were trained extensively to distinguish pre-harvest from crop damage. In cases where the full sub-plot was pre-harvested by the farmer, the observation was dropped from analysis. Of the remaining 440 observations in each wave, 8.4 percent and 20 percent of households pre-harvested a portion of the sub-plot in MAPS I and II, respectively. Of those plots on which pre-harvesting took place, the mean share of the sub-plot pre-harvested was 28.5 percent in MAPS I and 26.2 percent in MAPS II.

Remote Sensing
There is a longstanding and active literature on using remote sensing for agricultural monitoring and assessment. Among more recent advances, the GEOGLAM effort to provide in-season forecasting for major cropping regions (Becker-Reshef et al., 2010; Franch et al., 2015) represents an important step forward, as it has operationalized yield forecasting at the national scale. However, because it is based primarily on high temporal, coarse spatial resolution (~5 km) data, it is of little value for field-level assessment. At the same time, many studies have evaluated correlations between different reflectance indices and field or sub-field scale crop biomass or yield in specific site/years (e.g., Shanahan et al., 2001; Lobell et al., 2003; Sibley et al., 2014). Although these studies have demonstrated the potential for accurate satellite-based yield estimates, they generally (i) are for large-scale commercial systems and (ii) rely on calibrations that are specific to sites and/or image timings and thus do not generalize well across broad regions. A recent effort by Lobell et al. (2015) has developed a more general approach to combining crop simulation models and remote sensing data to map yields at the field scale, but this approach so far has been tested at the field scale only in large, homogeneous fields of the U.S.
The crop cut yields were not adjusted for any crop damage that may have materialized between the post-planting and the crop cutting visits. 16 While not reported here, the MAPS II average sub-plot crop cutting plot-level maize yield was not sensitive whether one used the 64m 2 crop cut sub-plot area or the 16m 2 crop cut sub-plot area covered by a randomly selected 4x4m quadrant within the 8x8m sub-plot. These results are available upon request. Several reviews of the sector point to the key needs of (i) better integration of different data sources, including the new higher resolution data available from several providers, (ii) development of more robust algorithms that can be applied in many different settings without additional calibration, and (iii) more complete and accurate data sets of ground-based measures of crop productivity in order to rigorously establish the accuracy of remote sensing approaches (Gallego et al., 2010;Atzberger, 2013;Lobell;. The MAPS remote sensing work program addresses all of these needs in a unique way. To our knowledge, no other research groups working on agriculture have comparable access to high resolution imagery from the private sector; approaches that are as scalable and potentially operational for field-scale mapping in smallholder systems; and ability to collect accurate plot-level maize yield measurement for hundreds of plots. Specifically, in Round I, remote sensing based yield estimates were obtained based on four images acquired by the Terra Bella (formerly Skybox) satellites over the 400 km 2 tasking area on May 15, June 9, June 27, and July 28, 2015. These images were first geometrically corrected to ensure that plot boundaries were properly aligned with the imagery. Clouds and cloud shadows were manually masked out, and each image was radiometrically corrected to surface reflectance by a standard approach of histogram matching to Landsat images from the same locations and time of year. 
Reflectance values for individual bands were then used to compute two standard vegetation indices, namely (1) the normalized difference vegetation index (NDVI), defined by Rouse et al. (1973) as:
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$$
and (2) the green chlorophyll vegetation index (GCVI), defined by Gitelson et al. (2003) as:
$$\mathrm{GCVI} = \frac{\mathrm{NIR}}{\mathrm{Green}} - 1$$
where NIR, Red, and Green denote reflectance in the near-infrared, red, and green bands, respectively. For each image date, the mean NDVI and GCVI values for each plot were calculated based on all pixels that fell within the GPS-based plot boundaries. The crop cutting yields, based on the 20m² area covered by the combination of the 4x4m and the 2x2m sub-plots, were then used to calibrate an empirical model relating yield to GCVI on the first three imagery dates, namely May 15, June 9, and June 27. NDVI was also tested but performed significantly worse.
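The plot-level index computation described in this paragraph can be sketched as follows. This is a minimal illustration and not the processing code used for MAPS: the function name, the band arrays, the boolean plot mask, and all reflectance values are hypothetical.

```python
import numpy as np

def plot_mean_indices(nir, red, green, plot_mask):
    """Return (mean NDVI, mean GCVI) over the pixels inside one plot.

    nir, red, green: 2-D arrays of surface reflectance (hypothetical inputs).
    plot_mask: boolean array, True for pixels inside the GPS-based boundary.
    """
    nir, red, green = (np.asarray(b, dtype=float) for b in (nir, red, green))
    ndvi = (nir - red) / (nir + red)   # NDVI = (NIR - Red) / (NIR + Red)
    gcvi = nir / green - 1.0           # GCVI = NIR / Green - 1
    # Indices are computed per pixel and then averaged within the plot.
    return ndvi[plot_mask].mean(), gcvi[plot_mask].mean()

# Made-up 2x2 image; the bottom-right pixel lies outside the plot.
nir = np.array([[0.40, 0.50], [0.45, 0.30]])
red = np.array([[0.10, 0.10], [0.15, 0.20]])
green = np.array([[0.10, 0.10], [0.15, 0.15]])
mask = np.array([[True, True], [True, False]])

mean_ndvi, mean_gcvi = plot_mean_indices(nir, red, green, mask)
```

The sketch averages per-pixel index values rather than computing an index from plot-mean reflectances, which is one reading of "based on all pixels that fell within the GPS-based plot boundaries."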
When using the entire data set of crop-cutting yields on pure stand fields (n = 235), the correlation between yields and GCVI was low, with an adjusted R² of 0.13 from a model using the three dates of GCVI. Based on the possibility that the area covered by the combination of the sub-plots may not represent the heterogeneity across the entire plot area, the model was estimated using different subsets of plots for which the standard error in the crop-cutting yields (as calculated based on the standard deviation among the yields obtained from the five 2x2m crop cut sub-plots) was below different thresholds. These models demonstrated much higher adjusted R² values. For instance, the adjusted R² was equal to 0.33 when using 20 percentage points as the threshold (n = 53) and 0.38 when using 10 percentage points as the threshold (n = 30). The latter model was then used to predict yields across the entire plot sample. 17
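A rough sketch of this calibration step is given below, under stated assumptions: a linear functional form in the three GCVI dates, ordinary least squares, a threshold expressed as a fraction of the mean yield, and entirely synthetic data. The function and variable names are hypothetical, and the numbers bear no relation to the MAPS results.

```python
import numpy as np

def fit_calibration(gcvi, yield_kg_ha, rel_se, threshold):
    """OLS of yield on GCVI (one column per image date), using only plots
    whose relative standard error of crop-cut yield is below `threshold`."""
    keep = rel_se < threshold
    X = np.column_stack([np.ones(keep.sum()), gcvi[keep]])
    y = yield_kg_ha[keep]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape  # k counts the intercept
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return beta, r2_adj

def predict_yield(beta, gcvi):
    """Apply the fitted coefficients to every plot, not only the subset."""
    return np.column_stack([np.ones(len(gcvi)), gcvi]) @ beta

# Synthetic example: 200 plots, GCVI on three dates, noisier crop cuts on
# plots with a larger relative standard error.
rng = np.random.default_rng(0)
n_plots = 200
gcvi = rng.uniform(1.0, 4.0, size=(n_plots, 3))
rel_se = rng.uniform(0.02, 0.40, size=n_plots)
true_yield = 500.0 + 300.0 * gcvi[:, 1]
observed = true_yield * (1.0 + rel_se * rng.standard_normal(n_plots))

beta, r2_adj = fit_calibration(gcvi, observed, rel_se, threshold=0.10)
predicted = predict_yield(beta, gcvi)
```

Restricting the estimation sample while predicting for all plots mirrors the logic in the text: the low-variance plots are treated as the most reliable ground truth for calibration.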

Farmer Estimation
Plot managers were asked to report their estimate of maize harvest at the parcel-plot level during the post-harvest visit, replicating the design of the relevant Uganda National Panel Survey (UNPS) questionnaire modules. 18 Each plot manager was allowed to report production in non-standard measurement units, and was asked to report on both the condition (e.g. green harvested; dry after additional drying; etc.) and the state (e.g. with cob but without stalk or husk; grain; etc.) of up to three maize harvests that may have occurred on the plot over a period of time. The production measurement units, conditions, and states were borrowed directly from the UNPS, and are provided in Appendix IV. The dry grain-equivalent harvest quantities in kilograms were calculated in each round by using the conversion factor database that was compiled by the UBOS during the 2007 Uganda Census of Agriculture (UCA) for each non-standard measurement unit-condition-state combination and that has been complemented by the data solicited during the UNPS 2009/10, 2010/11, and 2011/12 waves for the (rare) combinations that were not captured as part of the UCA exercise. 19 Moreover, in Rounds I and II, farmer-reported production information was solicited for all maize plots cultivated by the household during the respective season, inclusive of the plot on which crop cut sub-plots were laid.
17 The MAPS II remote sensing work program was continuing at the time of the publication of the working paper version of this manuscript. Given the limited availability of cloud-free Terra Bella imagery in 2016 (acquired mirroring the timeline of the Round I imagery acquisition), the remote sensing validation in MAPS Round II relies primarily on Sentinel-2 imagery.
18 It is important to note that the identification of parcels versus plots within parcels was anchored in the precise definitions that have been referenced above and that have been in effect since the UNPS 2009/10 wave. The operationalization of these definitions is such that each enumerator, prior to the administration of the post-planting questionnaire, has a detailed discussion with the holder regarding the organization of his/her farm. This conversation (1) ensures that the enumerator and the farmer are on the same page regarding what parcels versus plots within parcels mean, (2) often culminates in sketches of different parcels and plots within parcels that are being cultivated during that reference season, and (3) establishes how parcels and plots within parcels will be rostered in the questionnaire instrument. The established parcels and plots within parcels are then reviewed at each subsequent visit to the household.
19 The conversion factors have been made available as part of a study by Oseni et al. (2017), and can be accessed here https://goo.gl/HgdbBv. To calculate the dry grain-equivalent harvest quantities, we start with the UCA+UNPS augmented database that contains the kilogram value for each measurement unit-condition-state combination. We merge this file at the unit-condition-state level to each of the reported harvests on the maize plot, and multiply the reported quantity by the conversion factor to calculate an initial kg-equivalence. Further, there are two domains of raw harvest quantities: those that are reported on the cob versus those that are reported as grain. We first work in these domains separately. Within the "cob" domain, we adjust the initial kg-equivalence for all harvest quantities such that they would be reported in terms of the condition "dry, after additional drying" and state "with cob, without husk or stalk." The adjustments are precisely the conversion factor ratios within the cob domain between the preferred condition-state combination and all other observed combinations. Within the "grain" domain, we carry out a similar adjustment procedure such that all harvested quantities are expressed in terms of condition "dry after additional drying" and state "grain." The final adjustment is for expressing the standardized harvest quantities within the cob domain in terms of condition "dry after additional drying" and state "grain." To do this, we first compute EA-specific adjustment factors as the average ratio between shelled and unshelled (on the cob, dry after additional drying, prior to shelling) crop cut sub-plot harvests obtained in MAPS II, and multiply all kg-equivalent production estimates within the "cob" domain in MAPS I and MAPS II by the corresponding EA-specific adjustment factor.

Objective Soil Fertility Measurement (MAPS I Only)
Analysis of soil fertility was done in partnership with the World Agroforestry Center (ICRAF). Plot-level soil samples were collected from each plot selected for crop cutting following a protocol carefully designed to maximize the representativeness of the samples while maintaining feasibility of implementation. From each plot, four samples were collected from the top-soil (0-20cm depth) and combined to create one composite top-soil sample. Additionally, a single sub-soil sample (20-50cm depth) was collected from the center of the plot. After being processed at the ICRAF Kampala office, the samples were shipped to the ICRAF Nairobi office, where approximately 10 percent were subject to conventional wet chemistry testing and all samples were subject to spectral soil analysis. A portion of this 10 percent sample was used to calibrate the prediction models, while the remainder was used to verify the predictions based on the spectral data. 20 The final results from the soil analysis include key indicators of soil fertility such as pH, texture analysis (% sand, % clay, % silt), cation exchange capacity, and the concentration of multiple elements and macro- and micronutrients, including carbon, nitrogen, and potassium. 21
20 For details, see Shepherd and Walsh (2002).
21 During the post-planting interview and prior to visiting the plot for GPS-based area measurement and laying crop cut sub-plots, each plot manager was also asked a multitude of questions about the soil attributes, specifically color, texture, type, and overall quality, of the plot selected for crop cutting. We do not provide further information on these variables as we rely only on the objectively measured soil quality index.

Descriptive Statistics
Before moving to the empirical strategy used to examine the existence of the IR, descriptive statistics are used to illustrate the deviations between yield measures based on farmer-reported maize production vis-a-vis crop-cutting and remote sensing. Though not reported, the average GPS-based plot area was 0.14 hectares in MAPS I (with a maximum value of 1.37 and a standard deviation of 0.16) and 0.18 hectares in MAPS II (with a maximum value of 2.56 and a standard deviation of 0.23). 22, 23 The plot-level averages for maize production (kilograms) and yield (kilograms per hectare, using GPS-based plot areas) based on self-reported maize production, crop cutting and remote sensing are presented in Table 2 for each round. 24 To quell any concerns of bias in self-reported estimates stemming from exposure to full-plot harvest, the computation of self-reported MAPS II averages excludes plots on which full-plot harvest was conducted. Certain patterns emerge that are consistent with expectations. First, maize production and yield are, on average, higher on pure stand plots than on their intercropped counterparts, irrespective of the approach to measurement of maize production. This is in part due to still using the entire cultivated plot area, without further adjustments for planting density, while computing the intercropped maize yields. Second, there is a marked decline in productivity from MAPS I to MAPS II. This is in line with the increased incidence of crop damage observed by both plot managers (on the full plot) and enumerators (on the crop cutting sub-plots), primarily attributable to a reported increase in drought. 25 Third, in each round, the overall self-reported maize yield is, on average, at least 85 percent higher than its crop cutting and remote-sensing based counterparts. The comparable degree of discrepancy is at least 25 percent in terms of plot-level maize production, although the self-reported maize production is, on average, 84 percent higher than the comparable figure based on full-plot crop cutting.
While the means presented in Table 2 illustrate the over-estimation of maize production and yields on the part of our farmers, they do not address whether a systematic bias is at play. The correlates of over-estimation of yields with respect to sub-plot crop cutting based yield measurement are explored in Section 5.
Fourth, the comparison of the averages for production and yield based on alternative measurement methods, including sub-plot crop cutting, full-plot crop cutting and remote sensing, reveals a greater similarity among the estimated yields. On pure stand plots, in MAPS II, the differences in the average plot-level maize yield based on sub-plot crop cutting versus full-plot crop cutting are not statistically significant, lending confidence to sub-plot crop cutting estimates in this domain. Similarly, in MAPS I, the difference in the average plot-level maize yield on pure stand plots based on sub-plot crop cutting versus remote sensing is also not statistically significant, raising hopes regarding the application of high-resolution satellite imagery based remote sensing for the measurement of crop yields in smallholder production systems.
22 The round-specific descriptive statistics, including GPS-based plot area, are reported in Table A1. Where available, means are reported for the comparable UNPS 2015/16 sample. Due to high missingness rates of GPS-based area measurement, and area measurement at the parcel level rather than plot level, plot area and yield figures for UNPS are not comparable to those in MAPS I or II, and are, therefore, not reported.
23 In MAPS I and II, the GPS-based area measure was obtained in each household only for the randomly selected maize plot subject to crop cutting. Farmer-reported area was, however, solicited for the complete set of maize plots that may have been cultivated by the household during the reference agricultural season. Following Kilic et al. (2017a, 2017b), we estimated a simple imputation model of GPS-based plot areas pooled across MAPS I and II as a function of farmer-reported plot area, a dichotomous variable identifying 2016 round observations, and EA fixed effects. The R² for the imputation model was 0.59. Within a multiple imputation framework, we subsequently obtained a single imputation of GPS-based plot area for the maize plots that were not measured, using predictive mean matching. Specifically, we use the linear GPS-based plot area prediction as a distance measure to form a set of 5 nearest neighbors out of the plot sample measured with GPS, and randomly pick one of these neighbors whose observed GPS-based area value replaces the missing value for the incomplete case at hand. The completed data set with the imputed GPS-based plot areas was then collapsed at the household-round level to provide a better understanding of the scale of the maize farms. The average GPS-based household-level total area cultivated with maize was 0.29 hectares in MAPS I (with a maximum value of 4.49 and a standard deviation of 0.36) and 0.26 hectares in MAPS II (with a maximum value of 2.56 and a standard deviation of 0.26).
24 The sample is restricted to those in which remote sensing, crop-cutting, and self-reported estimates are available, and to those households which had crop-cutting and farmer-estimated production estimates in MAPS I and MAPS II.
25 Please see Table A2 for round-specific breakdowns of the reported reasons for production loss.
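The predictive mean matching step described in footnote 23 can be sketched as follows. This is a simplified illustration under explicit assumptions: the linear prediction uses OLS point estimates rather than parameter draws from a full multiple imputation routine, EA fixed effects are omitted for brevity, and the function name and all data are hypothetical.

```python
import numpy as np

def _with_constant(X):
    """Prepend an intercept column to a 2-D regressor matrix."""
    return np.column_stack([np.ones(len(X)), X])

def pmm_impute(X_obs, y_obs, X_mis, k=5, seed=0):
    """Single imputation by predictive mean matching with k nearest donors."""
    rng = np.random.default_rng(seed)
    beta, *_ = np.linalg.lstsq(_with_constant(X_obs), y_obs, rcond=None)
    pred_obs = _with_constant(X_obs) @ beta
    pred_mis = _with_constant(X_mis) @ beta
    imputed = np.empty(len(X_mis))
    for j, p in enumerate(pred_mis):
        # Donor pool: the k measured plots whose predictions are closest.
        donors = np.argsort(np.abs(pred_obs - p))[:k]
        # The donor's OBSERVED value, not its prediction, fills the gap.
        imputed[j] = y_obs[rng.choice(donors)]
    return imputed

# Synthetic data: farmer-reported area, a round dummy, and a GPS-based area
# that is measured for only the first 80 of 120 plots.
rng = np.random.default_rng(42)
farmer_area = rng.uniform(0.05, 1.0, size=120)
round_2016 = rng.integers(0, 2, size=120)
gps_area = 0.8 * farmer_area * np.exp(rng.normal(0.0, 0.2, size=120))
measured = np.arange(120) < 80

X = np.column_stack([farmer_area, round_2016])
imputed_area = pmm_impute(X[measured], gps_area[measured], X[~measured])
```

Because every imputed value is borrowed from a measured plot, the imputations stay within the observed range of GPS-based areas, which is the usual motivation for matching rather than using the linear prediction directly.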
On the other hand, our analysis further underscores the difficulty of estimating yields in the intercropped domain, even with objective measurement approaches. In MAPS II, we find that the average full-plot crop cutting yield on intercropped plots is 85 kilograms per hectare lower than its sub-plot crop cutting counterpart, and the difference is significant at the 5 percent level. This is in line with the expectation that the sub-plot crop cutting yield may not entirely reflect the true measure given the across-household variation in the types of crops intercropped with maize, and both across-household and intra-plot variation in maize seeding rate. And, if we can think of the sub-plot crop cutting yield estimate in the intercropped domain as an upper bound for the true yield based on MAPS II data, we see that remote sensing overestimates plot-level maize production and yield in MAPS I by a significant margin, likely due to difficulties in attributing vegetation growth to maize production on intercropped plots. 26
Overall, the level and direction of discrepancy between farmer estimates of production and crop-cutting observed in MAPS I and II run contrary to some previous works, synthesized by Fermont and Benson (2011). Specifically, while MAPS data reveal that maize yields based on farmer-reported production are significantly over-reported relative to both sub-plot and full-plot crop cutting, Verma et al. (1988) assert that, on average, farmer estimates are in fact more accurate than sub-plot crop-cutting relative to full-plot harvests. There are, however, two concerns associated with this claim.
First, the small sample used by Verma et al. (1988) precluded analysis across plot sizes. Yet, as presented below and in line with the findings of Desiere and Jolliffe (2017), the degree of error between farmer estimates and crop-cutting measures is systematic in nature, with production and yields more significantly over-estimated by the farmer on smaller plots. Ignoring distributional differences between farmer estimates and crop-cutting, therefore, can mask the true relationship between the two measures. Second, the plots analyzed by Verma et al. (1988) were all subject to a full-plot harvest, which would likely contaminate farmer-reported production values. Full-plot harvests also eliminate the need for farmers to aggregate periodic harvests (such as early, green harvest with final, dry harvest), further contributing to the accuracy of farmer-reported production values in the presence of contamination.
Similarly, Fermont and Benson (2011) compile historical maize yield estimates in Uganda from a variety of sources, and report that the self-reported yield estimates are consistently lower than the estimates based on sub-plot crop cutting, in contrast with our findings. Yet, the assertion of Fermont and Benson (2011) is potentially misleading, since their reported estimates based on crop cutting originate from on-farm trials, where farmers may be positively selected and may receive technical assistance for optimal management practices, as opposed to standard household or farm survey operations in which no agronomic guidance would be provided to the farmers. Hence, crop cuts from on-farm trials likely constitute an upper bound for maize yields in field conditions, and would, therefore, be expected to be greater than maize yields based on farmer-reported production.

Inverse Scale-Productivity Relationship
To investigate whether the inverse scale-productivity relationship is sensitive to the way in which the plot-level maize yield is measured, we estimate several variants of three regressions, Equations 1 through 3. In each equation, Y is the logarithmic transformation of the plot-level maize yield (kilograms per hectare), based on GPS-based plot area. Equation 1 is a cross-sectional linear regression that is estimated separately in each survey round; Equation 2 is a panel linear regression that is estimated with household fixed effects; Equation 3 is a panel linear regression that is estimated with parcel fixed effects. As shown in Table 3, Equations 1 through 3 are estimated using several plot sample definitions, and alternative maize yield measures based on (1) self-reporting, (2) sub-plot crop cutting, (3) full-plot crop cutting, and (4) remote sensing.
The following is an overview of the notation used in Equations 1 through 3. First, i and h denote plot and household, respectively; t denotes survey round in the panel regressions; and α and ε are the constant and the error term, respectively. The vectors common to all equations are A, P, H, and M, whose choice takes into account the explanatory variables commonly featured in production functions estimated to investigate the inverse scale-productivity relationship.
A is the logarithmic transformation of GPS-based plot area, and β1 is the main coefficient of interest across all estimations. A negative and statistically significant β1 would be in support of the inverse scale-productivity relationship at the plot-level.
P is a vector of plot-level characteristics, including (1) a binary variable identifying whether the plot was pure stand with maize; (2) logarithmic transformation of seeding rate under intercropping 27 ; (3) logarithmic transformation of kilograms of maize seed planted; (4) a binary variable identifying whether any inorganic fertilizer was applied on the plot 28 ; (5) logarithmic transformation of total household member days of season-specific labor input on the plot 29 ; (6) a binary variable identifying whether there was any hired labor input on the plot; (7) logarithmic transformation of total hired days of season-specific labor input on the plot 30 ; (8) percent seasonal (May-June) rainfall deviation from plot location-specific long-term average rainfall 31 ; (9) logarithmic transformation of GPS-based distance between plot and dwelling in kilometers; (10) enumerator-assessed percent damage in the crop cut sub-plot 32 ; and (11) a binary variable identifying whether any cover crops were on the plot prior to planting.
H is a vector of household characteristics, including (1) wealth index, (2) agricultural implement and machinery index, (3) logarithmic transformation of household size, and (4) dependency ratio. M is a vector of plot manager characteristics, including (1) a binary variable identifying whether the respondent is also the plot manager, (2) a binary variable identifying whether the plot manager received agricultural extension services on topics relevant to crop production and marketing in the last 12 months, (3)
Further, the vector O includes the rest of the objectively-measured plot-level covariates, including (1) soil fertility index, (2) genetic heterogeneity of the maize on the plot, based on DNA fingerprinting of seed samples obtained from the combined harvest tied to the 4x4m crop cut sub-plot in MAPS I 34 , and (3) edge effects, specifically, the share of the crop cut sub-plot that is within 4 meters of the nearest plot edge (and separately, within 1 meter of the nearest plot edge, specifically in Round 2, given the increase in the crop cut sub-plot area to 8x8m). The inclusion of this vector is motivated by the competing hypotheses on the weakening or disappearing inverse scale-productivity relationship as a result of controlling for plot attributes, such as soil quality (Barrett et al., 2010), and edge effects (Bevis and Barrett, 2017). Appendix V provides more details on the construction of the soil fertility index and the edge effects.
27 This is calculated as the ratio between the quantity of maize seed planted under intercropping and the counterfactual quantity of maize seed that would have been planted had the plot been cultivated pure stand, as reported by the farmer. The pure stand maize plots are assigned a value of 1.
28 The rare occurrence of inorganic fertilizer use prevents us from using the logarithmic transformation of the quantity of inorganic fertilizer applied.
29 This is calculated as the sum of all household member-specific labor inputs reported by the farmer at the plot level.
30 The plots without any hired labor are assigned a value of 0.
31 The plot location-specific dekadal time series rainfall data are sourced from the CHIRPS database. The plot location-specific long-term average for the period of May-June is computed over the period of 1981-2015 for MAPS I, and 1981-2016 for MAPS II.
32 In MAPS I, this is calculated as the average percent damage across both 4x4m and 2x2m sub-plots.
33 MAPS I parcel rosters were carried forward for updating in MAPS II to identify the MAPS I parcels that were still owned and/or cultivated by the households during the first rainy season of 2016. This information, together with the known nesting of the maize plots within the parcels in each round, was used to determine whether the selected plot in MAPS II happened to be on the same parcel as the selected plot in MAPS I.
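For orientation, a set of specifications consistent with the notation described in this section might be written as below. This is a hedged sketch rather than the paper's own equations: only α, ε, β1, and the vectors A, P, H, M, and O are named in the text, so the coefficient labels β2 through β5, the fixed-effect symbols μ_h and λ_p, the parcel index p, and the exact subscripting are assumptions made for illustration. The set of controls that enters each estimated variant is the one given in Table 3.

```latex
\begin{align}
% Cross-sectional regression, estimated separately in each survey round
Y_{ih}   &= \alpha + \beta_1 A_{ih}
           + \beta_2^{\prime} P_{ih} + \beta_3^{\prime} H_{h}
           + \beta_4^{\prime} M_{ih} + \beta_5^{\prime} O_{ih}
           + \varepsilon_{ih} \\
% Panel regression with household fixed effects (\mu_h is an assumed symbol)
Y_{iht}  &= \alpha + \beta_1 A_{iht}
           + \beta_2^{\prime} P_{iht} + \beta_3^{\prime} H_{ht}
           + \beta_4^{\prime} M_{iht} + \beta_5^{\prime} O_{iht}
           + \mu_{h} + \varepsilon_{iht} \\
% Panel regression with parcel fixed effects (\lambda_p and p are assumed)
Y_{ipht} &= \alpha + \beta_1 A_{ipht}
           + \beta_2^{\prime} P_{ipht} + \beta_3^{\prime} H_{ht}
           + \beta_4^{\prime} M_{ipht} + \beta_5^{\prime} O_{ipht}
           + \lambda_{p} + \varepsilon_{ipht}
\end{align}
```

Because both Y and A are in logarithms, β1 reads as an elasticity of yield with respect to plot area: β1 = 0 corresponds to constant returns to scale, and β1 < 0 to the inverse scale-productivity relationship.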

Drivers of Farmer Over-Estimation
Descriptive analysis is first used to assess the validity of the hypothesized sources of measurement error in self-reported production estimates, primarily heaping of production figures. Consistency of the error across time is additionally explored, with an eye toward examining the potential to correct this measurement error econometrically. Following Desiere and Jolliffe (2017), we subsequently explore the correlates of farmer over-estimation through regression analysis of the log ratio of self-reported to crop cutting yields. Since it is feasible that the drivers of farmer over-estimation vary along the distribution of over-estimation, recentered influence function (RIF) regressions are estimated. The econometric method, put forth by Firpo et al. (2009), executes an unconditional quantile regression by first estimating an influence function recentered on a given quantile, and subsequently utilizing the estimated RIF value as the dependent variable in a linear regression. The influence function for the dependent variable, Y, in this case the degree of measurement error in self-reported production-based yields proxied by the log ratio of self-reported to crop cutting yields, is as follows:
$$IF(Y; Q_\tau) = \frac{\tau - \mathbb{1}\{Y \le Q_\tau\}}{f_Y(Q_\tau)} \qquad (4)$$
where $\mathbb{1}\{Y \le Q_\tau\}$ equals 1 if the dependent variable is less than or equal to the quantile $Q_\tau$ (0 otherwise), $Q_\tau$ is the population $\tau$-quantile of the unconditional distribution of Y, and $f_Y(Q_\tau)$ is the density of the marginal distribution of Y. The RIF is a function of equation (4), assuming $IF(y; v)$ is the influence function for an observed productivity outcome y:
$$RIF(y; v) = v(F_Y) + IF(y; v)$$
where $v(F_Y)$ is the distributional statistic for the dependent variable. Therefore,
$$RIF(Y; Q_\tau) = Q_\tau + \frac{\tau - \mathbb{1}\{Y \le Q_\tau\}}{f_Y(Q_\tau)}$$
Finally, the RIF values are regressed on a series of covariates using linear regressions.
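The two-step procedure attributed to Firpo et al. (2009) above can be sketched in a few lines. This is an illustration under assumptions, not the estimation code behind the reported results: the Gaussian kernel, the normal-reference bandwidth rule, the function names, and the synthetic data are all choices made here for the example.

```python
import numpy as np

def kernel_density_at(y, point):
    """Gaussian kernel density estimate of y evaluated at `point`,
    using a normal-reference rule-of-thumb bandwidth (an assumption)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    h = 1.06 * y.std(ddof=1) * n ** (-1.0 / 5.0)
    z = (point - y) / h
    return np.exp(-0.5 * z**2).sum() / (n * h * np.sqrt(2.0 * np.pi))

def rif_quantile(y, tau):
    """RIF(y; Q_tau) = Q_tau + (tau - 1{y <= Q_tau}) / f_Y(Q_tau)."""
    y = np.asarray(y, dtype=float)
    q = np.quantile(y, tau)
    f_q = kernel_density_at(y, q)
    return q + (tau - (y <= q).astype(float)) / f_q

def rif_regression(y, X, tau):
    """Step 2: OLS of the RIF values on the covariates (plus a constant)."""
    rif = rif_quantile(y, tau)
    Xc = np.column_stack([np.ones(len(rif)), X])
    beta, *_ = np.linalg.lstsq(Xc, rif, rcond=None)
    return beta

# Synthetic data: y stands in for the log ratio of self-reported to crop
# cutting yield, x for log plot area. Values are invented.
rng = np.random.default_rng(1)
log_area = rng.normal(-2.0, 0.8, size=500)
log_ratio = 0.5 - 0.3 * log_area + rng.normal(0.0, 0.5, size=500)

rif_median = rif_quantile(log_ratio, tau=0.5)
beta_median = rif_regression(log_ratio, log_area, tau=0.5)
```

Note that for a given quantile the RIF takes only two distinct values, one for observations at or below the quantile and one for those above it, so the second-step regression is effectively a rescaled linear probability model for lying above the quantile.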
The covariates included are those believed to influence productivity itself, as discussed above, as well as controls for whether self-reported production estimates were rounded/heaped on whole numbers, including 100, 200, 300, 400, or 500 kg, or 1, 2, 3, 4, 5, or 10 sacks of 100 kg.

Inverse Scale-Productivity Relationship
Table 4 reports the plot area coefficients from the regressions that are estimated as specified in Table 3. The full set of regression results is reported in the Appendix Tables A4.1-A4.3. As discussed above, a negative and statistically significant coefficient on plot area confirms the IR. The results suggest that when using farmer-reported production estimates to define plot-level maize yields, the IR holds across the entire set of specifications and analysis samples of interest, including in the instances where we control for unobserved time-invariant heterogeneity at the household and parcel level. The magnitude of the IR is non-trivial; on average, a 1 percent increase in GPS-based plot area is associated with a 0.35 to 0.9 percent reduction in maize yields, depending on the specification. In terms of magnitude, the coefficients are similar to other published findings in Uganda (Larson et al. 2014).
On the other hand, there is no evidence of the IR when we define maize yield on the basis of sub-plot crop cutting, full-plot crop cutting, or remote sensing-based crop production estimates. In MAPS II, using full-plot crop cutting maize yield, the GPS-based plot area coefficient is statistically indistinguishable from zero, irrespective of cultivation status, suggesting constant returns to scale (CRS). In specifications that exploit the panel nature of the data, the coefficient of interest at times changes direction, but again is insignificant across all specifications and analysis sample definitions, with the exception of a marginally significant, positive coefficient under the parcel-panel column. Similarly, in MAPS I, remote sensing estimates, on average, suggest a marginally significant positive return to land area, but this result is not robust: the coefficient loses significance when the sample is disaggregated by cultivation status, and in the pure stand domain, where we have a higher degree of confidence in remotely sensed yields, it is near zero in magnitude and statistically insignificant. Taken together, the findings based on maize yields that are measured with objective survey methods support constant, as opposed to decreasing, returns to scale.
These conclusions hold true (1) in the expanded sample with the households from the two districts, Serere and Sironko, that are excluded from the analysis sample due to lack of remote sensing and variety identification data; (2) if the top and bottom 5 percent of the plots in terms of maize yield are excluded from the analysis sample; and (3) if the analysis sample is limited to households that cultivate only one maize plot. These robustness checks, which are available upon request, work to ensure that the results (1) hold across a greater geographic area in the Eastern region; (2) are not being driven by outliers in yield data; and (3) in part rule out the theory that errors in self-reported production are driven by difficulty in reporting production at the plot level, rather than the (maize) farm level.
Further, as documented in the Appendix Table A3, while the IR persists at the mean and throughout the distribution of plot-level maize yields anchored in farmer-reported production, CRS is observed throughout the distribution of objective yield measures. This finding is in stark contrast with the new evidence put forth by Savastano and Scandizzo (2017), who, based on farmer-reported panel survey data, provide support for the IR among the sample of farmers above the median land productivity measure.
Contrary also to the findings by Bevis and Barrett (2017), who use the perimeter-to-area ratio as a proxy for edge effect and find plot edges to be more productive than the interior, the results summarized in Table 4, and reported in full in the Appendix Tables A4.1-A4.3, suggest that plot edges are generally not more or less productive than the interior of the plot, and when significant, the direction of the coefficient suggests the plot edges are less productive. This finding is robust to an alternative definition of the edge effect, namely a 1m buffer along the plot edge as opposed to the 4m buffer presented here. The support for CRS, as opposed to the IR, in the MAPS II estimations that use the full-plot crop cut yield as the dependent variable raises further questions on the viability of the edge effect hypothesis in our context.
Finally, to illustrate the difference in plot area coefficients across specifications, Figures 3, 4, and 5 present the plot area coefficients with 95 percent confidence intervals for the overall, pure stand, and intercropped sample, respectively. The confidence intervals around the estimated coefficients originating from regressions that use sub-plot crop cutting-based yield estimates are considerably tighter than the comparable confidence intervals associated with the plot area coefficients estimated from regressions that use self-reported production-based yield estimates. The following section examines the farmer-reported maize production data more closely and seeks to identify the correlates of farmer over-estimation.

Drivers of Farmer Over-Estimation
Before addressing the correlates of farmer over-estimation and potential sources of systematic error, we first address the feasibility of correcting this error with the use of correction factors. In order for correction factors to be an appropriate course of action to improve the quality of self-reported production data, the error must be consistent across time and within households. Figure 6, however, illustrates that this is not the case.
The pairwise correlation of the ratio of self-reported to sub-plot crop cut production estimates from MAPS I to MAPS II is 0.002, not statistically distinguishable from zero. Confidence in the consistency of self-reported measurement error in crop production is further degraded by the lack of predictive power from year to year. Table 5 presents the results of simple OLS regression of MAPS II yields on MAPS I yields, using both sub-plot crop cut production (columns 1, 2, and 3) and self-reported production (columns 4, 5, and 6).
When relying on the objective measure of production in the numerator, MAPS I yield is a significant predictor of MAPS II yields, particularly in the sample in which both plots were from the same parcel. This reflects farmer ability, the quality of the land, agricultural practices employed, etc. Conversely, self-reported production-based yields in MAPS I are not a statistically significant predictor of yields in MAPS II. The correlation of self-reported yields between MAPS I and II on the parcel panel sample is only 0.012, while the correlation between sub-plot crop cutting yields is 0.226. The inconsistency of measurement error in self-reported production estimates observed in Figure 6 and Table 5 severely limits the potential for utilizing "corrected" self-reported production data in productivity analysis.
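The persistence check behind this argument can be sketched in a few lines. The following is a minimal illustration, not the estimation code used for Figure 6 or Table 5: the function names and the four-household data set are hypothetical, and the numbers are chosen only to show what an uninformative Round I error ratio looks like.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

def error_ratio(self_reported_kg, crop_cut_kg):
    """Ratio of self-reported to crop-cut production, plot by plot."""
    return [sr / cc for sr, cc in zip(self_reported_kg, crop_cut_kg)]

# Hypothetical production figures (kg) for four households seen in both rounds.
ratio_round1 = error_ratio([300, 150, 400, 90], [200, 100, 200, 90])
ratio_round2 = error_ratio([120, 260, 210, 150], [100, 130, 140, 100])

# A household-specific correction factor taken from Round I only helps in
# Round II if the two ratios move together; here they do not.
persistence = pearson(ratio_round1, ratio_round2)
print(f"correlation of error ratios across rounds: {persistence:.3f}")
```

In this made-up example every household over-reports in at least one round, yet the Round I ratio says nothing about the Round II ratio, which is the situation in which a fixed correction factor cannot work.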
If the over-estimation of maize yields based on farmer-reported maize production were uniform across the distribution of plot sizes, it would have no influence on the observed relationship between land productivity and plot area. Rather, the differences in the plot area coefficients suggest that there is a systematic bias in farmer-reported maize production. To address whether the deviation between crop cutting and farmer estimates varies across the distribution of plot areas, Figure 7 presents the mean deviation in yields (as measured by self-reported yield minus crop-cutting yield) by GPS-based plot area quintile for MAPS I and MAPS II. A clear downward trend exists, in which farmers more greatly over-estimate yields on smaller plots, with the degree of over-estimation decreasing with plot area. This systematic bias would have direct consequences on the IR.
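A minimal sketch of this mechanism follows. It is an illustration under stated assumptions: the ten plots are hypothetical, the bivariate log-log regression stands in for the much richer specifications estimated in the paper, and the helper names are invented for this example. It computes the mean yield deviation by area quintile and compares the plot area slope under uniform and area-dependent over-reporting.

```python
from math import log

def quintile_labels(areas):
    """Assign each plot to an area quintile (1 = smallest) by rank."""
    n = len(areas)
    order = sorted(range(n), key=lambda i: areas[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // n + 1
    return labels

def mean_deviation_by_quintile(areas, sr_yield, cc_yield):
    """Mean of (self-reported yield - crop-cut yield) within each quintile."""
    sums, counts = {}, {}
    for q, sr, cc in zip(quintile_labels(areas), sr_yield, cc_yield):
        sums[q] = sums.get(q, 0.0) + (sr - cc)
        counts[q] = counts.get(q, 0) + 1
    return {q: sums[q] / counts[q] for q in sorted(sums)}

def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

# Hypothetical plots: true (crop-cut) yield is constant in area, i.e. CRS.
areas_ha = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.40, 0.60, 1.00]
cc_yield = [1800] * 10                      # kg/ha
sr_uniform = [2700] * 10                    # same 50% over-report everywhere
sr_nonuniform = [4000, 3600, 3200, 3000, 2600,
                 2400, 2200, 2100, 1900, 1800]  # over-report shrinks with area

deviation_by_quintile = mean_deviation_by_quintile(areas_ha, sr_nonuniform, cc_yield)
log_area = [log(a) for a in areas_ha]
slope_cc = ols_slope(log_area, [log(y) for y in cc_yield])
slope_uniform = ols_slope(log_area, [log(y) for y in sr_uniform])
slope_nonuniform = ols_slope(log_area, [log(y) for y in sr_nonuniform])
print(deviation_by_quintile)
print(slope_cc, slope_uniform, slope_nonuniform)
```

With these invented numbers the slope is zero for the crop-cut yield and for the uniformly inflated yield, and negative only when over-reporting declines with plot area, mirroring the pattern described for Figure 7.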
This trend is consistent with evidence on the systematic nature of measurement error in farmer self-reported estimates of plot area. Carletto et al. (2013) find that farmers overestimate plot areas relative to GPS-based area measurements more significantly on smaller plots, while only slightly overestimating (or underestimating) the area of larger plots. The collection of both GPS-based area measurement and farmer self-reported area in MAPS I and II allows for analysis of the relationship between the measurement error in self-reported crop production and plot area. As expected, the degree of measurement error observed in these two variables is significantly and positively correlated, with correlation coefficients of 0.437 and 0.535 in MAPS I and II, respectively.
To explore another potential source of farmer over-estimation of yield, namely the persistence of rounding in farmer-reported production estimates, Figure 8 presents a histogram of reported quantities for the primary production units, kilograms and 100 kilogram sacks, for MAPS I and MAPS II, as well as the maize-producing UNPS 2015/16 sample residing in the Eastern region and reporting for the first season of 2015 (overlapping with the MAPS I reference period). There is clear evidence of heaping at common intervals, such as 50 kg, 100 kg, 200 kg, and 300 kg, and 1 sack, 2 sacks, and 10 sacks, which may explain, at least in part, the trend in greater over-estimation on smaller plots. Rounding of self-reported production quantity on plots with lower production may have a more severe impact in terms of percent of total production, therefore resulting in greater bias.
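The heaping indicator and the percentage argument can be made concrete with a short sketch. The set of heaped values follows the definition given in the table notes (100, 200, 300, 400, or 500 kg, or 1, 2, 3, 4, 5, or 10 sacks of 100 kg); the function names, the unit codes, and the example quantities are hypothetical.

```python
# Heaped values as listed in the table notes; unit codes are assumptions.
HEAPED_KG = {100, 200, 300, 400, 500}
HEAPED_SACKS_100KG = {1, 2, 3, 4, 5, 10}

def is_heaped(quantity, unit):
    """True if a single reported quantity sits on a heaping point."""
    if unit == "kg":
        return quantity in HEAPED_KG
    if unit == "sack_100kg":
        return quantity in HEAPED_SACKS_100KG
    return False

def plot_is_heaped(reports):
    """A plot counts as heaped if at least one harvest condition is heaped.

    `reports` is a list of (quantity, unit) pairs, one per harvest condition.
    """
    return any(is_heaped(q, u) for q, u in reports)

def pct_error(reported_kg, true_kg):
    """Reporting error as a percent of true production."""
    return 100.0 * (reported_kg - true_kg) / true_kg

# The same 20 kg rounding error weighs far more on a small harvest.
small_plot_error = pct_error(100, 80)     # 25.0 percent
large_plot_error = pct_error(1000, 980)   # about 2 percent
print(small_plot_error, round(large_plot_error, 2))
```

The example only illustrates the arithmetic of the argument; it does not assume that farmers always round upward.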
The results of the RIF estimations, as described in section 4, are presented in Table 6. Plot area is indeed negatively correlated with farmer over-estimation in MAPS I across the entire distribution of over-estimation, while area is only significantly different from zero (and negative) in the upper tail for MAPS II and the household panel. Supporting the theory that subjective estimates of production are complicated by multiple harvest periods and crop conditions, the number of conditions in which the farmer reports overall harvest is positively correlated with over-estimation in MAPS I and the household panel (insignificant in MAPS II except for the least productive decile). Also exacerbating yield over-estimation are the farmer-reported seed, household labor and hired labor input quantities, which may, therefore, suffer from the same subjective sources of measurement error. This theory is supported by the correlation of measurement error observed between self-reported plot area and crop production discussed above, and signals potential concern for production function estimations more broadly.
Finally, the enumerator-assessed percent damage in the crop cut sub-plot, which could be taken as a proxy to the damage on the plot as a whole, is significantly and positively correlated with farmer over-estimation, suggesting farmers may not be taking into account crop damage when estimating plot-level production. While some may theorize that farmer over-estimation originates in the inability of farmers to distinguish between the harvest on multiple plots (that is, potentially reporting the production of multiple plots on a single plot), the results in Table 6 suggest otherwise. Whether the household cultivates more than one maize plot or not has no bearing on the degree of farmer over-estimation.

Conclusions
Based on a two-round household panel survey conducted in Eastern Uganda to test the relative accuracy of subjective approaches to data collection vis-à-vis objective survey methods for maize yield measurement, soil fertility assessment, and maize variety identification, we provide unambiguous support for the sensitivity of the plot-level inverse scale-productivity relationship to the choice of the method by which maize production and yield (anchored in GPS-based plot area measurement) are computed.
While farmer-reported production-based plot-level maize yield regressions consistently lend support to the inverse scale-productivity relationship, in magnitudes that are similar to previously published findings on Uganda and other African settings, the comparable regressions estimated with maize yields based on sub-plot crop cutting, full-plot crop cutting, and remote sensing point towards constant returns to scale (CRS). In view of the competing hypotheses for the IR, the regressions control for objective measures of soil fertility, maize genetic heterogeneity, and edge effects at the plot-level; a rich set of plot, household and plot manager attributes; as well as household and parcel fixed effects in select specifications. The existence of the IR anchored in the use of farmer-reported plot-level maize production is also shown throughout the yield distribution, while CRS is documented throughout the distribution of maize yields based on crop cutting and remote sensing variants.
Our core finding is driven by persistent over-estimation of farmer-reported maize production and yield vis-à-vis their crop cutting-based counterparts, particularly in the lower half of the plot area distribution. The analysis (1) points to the rounding of maize production as a key factor in farmer over-estimation, (2) suggests that farmers do not consider the degree of crop damage when reporting production, and (3) provides evidence for multiple harvests (and the accompanying heterogeneity in harvested crop conditions) contributing to over-estimation. While some may argue that farmer over-estimation is driven by the inability of farmers to distinguish production at the plot level, rather than farm level, our results also serve to refute this claim.
Though the findings contribute to a larger (and renewed) body of literature questioning the inverse scale-productivity relationship based on omitted explanatory variables or alternative formulations of the agricultural productivity measures, the analysis, together with Desiere and Jolliffe (2017), is among the first documenting how errors in self-reported survey data on production mediate the existence of the IR. The consistency in the findings across our study and that of Desiere and Jolliffe (2017), which has a focus on Ethiopia, is noteworthy.
Taking into account the similarities in heaping in self-reported maize production information across our study and the Eastern region sub-sample of the UNPS for the same agricultural season, our findings emphasize the need for sustained focus on the improvement of crop production and yield measurement in the context of household and farm surveys that solicit farmer-reported production information and that capture farming at similar scales as well as pervasive use of nonstandard measurement units, and heterogeneity in harvested crop conditions that could be reported across and within households.
Although we use the official maize unit-condition-state specific conversion factors that are further augmented with MAPS-based measurements that provide additional nuance to expression of unshelled maize in shelled equivalent terms, the quality of conversion factors that are used for computation of farmer-reported maize production in kg-equivalent terms may mediate the accuracy of the self-reported maize production. However, there are open empirical questions regarding the extent that conversion factors should be specific in spatial and temporal terms, and whether an improved set of conversion factors, including through the introduction of non-standard measurement units that recognize the variation in sizes for specific units, would be enough to overcome other challenges that may plague self-reported production information, as reviewed in Section 2.
Finally, given the absence of the IR based on the objective productivity and plot area measures in our sample, the results provide further support for promoting a policy environment that reduces the scope for further subdivision of land and that prioritizes investments in land titling and land market development. It is important to note, however, that our findings do not suggest a broad development policy focus shift away from the needs of smallholder farming households since there are social and economic reasons for why one inherently cares about the improvement of living standards across a large segment of the population relying on farming. And to the extent that cultivating small plot(s) is positively correlated with poverty and that the smallest plots are associated with the most upward bias, the actual yields attained by the poorest may be less than previously estimated.

SR vs CC   ***  ***  ***  ***  ***  ***
SR vs RS   ***  ***  ***  N/A  N/A  N/A
CC vs FP   N/A  N/A  N/A  ***  ***  ***
CC vs RS   ***  ***  ***  N/A  N/A  N/A
Notes: The sample is limited to households with crop cutting in both MAPS I and MAPS II. MAPS II self-reported figures further exclude observations subject to full-plot harvest. ***/**/* denote statistical significance at the 1/5/10 percent level, respectively. The distributional differences are assessed based on the Kolmogorov-Smirnov tests of the equality of distributions.
(3) edge effects (specifications 2, 4, and 6). MAPS II regressions based on self-reported maize yield exclude observations subject to full-plot harvest. ***/**/* denote statistical significance at the 1/5/10 percent level, respectively.
‡ Sample excludes plots that were subject to full-plot crop cut in MAPS II. † denotes a dummy variable. ***/**/* denote statistical significance at the 1/5/10 percent level, respectively. Production indicated as "heaped" if at least one condition of harvest was reported as 100, 200, 300, 400, or 500 kg, or 1, 2, 3, 4, 5, or 10 100 kg sacks.
Overall R² reported for panel specifications.
Table A4.1.

Figure 4. Selected Plot Area Coefficients w/ 95% Confidence Intervals - Pure Stand Plots
Notes: SR, CC, RS, and FP stand for self-reporting, sub-plot crop cutting, remote sensing, and full-plot crop cutting, respectively. MAPS II SR estimates are based on the plot sample net of those subject to full-plot crop cutting. Household and Parcel Panel are defined as in Table A4.2.

Figure 5. Selected Plot Area Coefficients w/ 95% Confidence Intervals - Intercropped Plots
Notes: SR, CC, RS, and FP stand for self-reporting, sub-plot crop cutting, remote sensing, and full-plot crop cutting, respectively. MAPS II SR estimates are based on the plot sample net of those subject to full-plot crop cutting. Household and Parcel Panel are defined as in Table A4.3.
† denotes a dummy variable. ‡ Incidence of maize sales of any quantity, of maize-growing households. Sales data not available in MAPS I or MAPS II due to timing of survey. *** denotes statistical significance of the mean difference at 1 percent level.

… If the parcel that was selected for crop cutting in 2015 is still in household's possession and has at least one intercropped maize plot, select a maize plot at random that is of intercropped cultivation status among the intercropped maize plots on the same parcel from 2015.
… If the parcel that was selected for crop cutting in 2015 either is not in household's possession OR is in household's possession BUT does not have any plots that is cultivated with maize, select an intercropped plot at random among all the intercropped plots that are being cultivated by the household in 2016.
If the household was of intercropped cultivation status in 2015, and the household is of not intercropped cultivation status and is of pure stand cultivation status in 2016 in accordance with the definition above:
… If the parcel that was selected for crop cutting in 2015 is still in household's possession and has at least one pure stand maize plot, select a maize plot at random that is of pure stand cultivation status among the pure stand maize plots on the same parcel from 2015.
… If the parcel that was selected for crop cutting in 2015 either is not in household's possession OR is in household's possession BUT does not have any plots that is cultivated with maize, select a pure stand plot at random among all the pure stand plots that are being cultivated by the household in 2016."

Pre-measured 4m x 4m PVC pipe: A set of PVC pipes that are pre-measured to create a 4x4 meter square will be provided to each enumerator to ensure the crop-cut area is precisely 4x4 meters.
Pre-measured 2m x 2m PVC pipe: A set of PVC pipes that are pre-measured to create a 2x2 meter square will also be provided to each enumerator to ensure the second crop-cut area is precisely 2x2 meters.
Sticks: These will be used to mark the four corners of the areas selected for crop cutting. Eight sticks will be used to mark the corners of the 4x4 meter subplot and four sticks to mark the corners of the 2x2 meter subplot.
Measuring Tape: This is a distance-measuring instrument marked in metric-units (segments), which will be used to determine the location of the areas in the plot.
Bags, Barcodes, Writing Materials: Each quadrant's harvest will be stored in bags that will be provided for sample transport. Each bag will be tagged with a water-resistant barcode sticker, whose duplicate will be placed inside the bag. You will be provided with pen and pencils for note taking.
Industrial Digital Weighing Scale: This will be used to weigh the harvested maize at the time of harvest (in grain form). Each of the five 2x2m quadrants must be weighed separately.

Procedure for Crop Cutting
We will be conducting crop cutting on a 4m x 4m subplot AND a 2m x 2m subplot of the maize plot. However, we will divide the 4mx4m area into four 2mx2m squares (also called quadrants). Therefore, there will be a total of FIVE 2m x 2m quadrants. The harvest of each quadrant will be recorded separately. Here, we describe in further detail each of the four main aspects to the crop cutting exercise.
You will first construct the 4m x 4m subplot by following steps 1 and 2 below. Only after demarcating the 4m x 4m subplot will you repeat steps 1 and 2 for the 2m x 2m subplot. The 4x4 and 2x2 subplots must not overlap.

1) Crop Cutting Area Selection:
a. Use Random Number Table #1 to identify the corner from which you will start. Use the first number in the random number table that matches one of the corners of the plot. The corner in which you started the area measurement, the northwest corner, is corner #1. Corner #2 is the next corner of the plot, moving around the plot clockwise.
b. Measure the distance of the two sides along the selected corner with the measuring tape. Identify which is the longer side and which is the shorter side.
c. Take the bearing from the start corner down the shorter side. Note this in your notebook.
d. Use the Random Number Table #2 provided for this household. The first number should be the number of meters that you will walk along the length of the longer side of the plot. If the first number is larger than the length of the side, choose the next random number (and so on, until you find a number that is less than the length of the side). For example, if the length of the longer side is 25 meters and the first random number in the list is 28, move on to the next number.
e. Beginning at your starting point and continuing along the longer side of the plot, walk the number of meters indicated by your random number.
f. Turn into the plot so that your bearing is the same as the bearing you measured down the shorter side of the plot. This means you will be entering the plot parallel to the shorter side. Choose the next random number from Random Number Table #2 that is shorter than the length of the shorter side and walk the number of meters indicated by this second random number. You should be walking in a direction that is parallel to the shorter edge of the plot. Walk in a straight line. Try not to veer to the right or left to avoid shrubs or wet spots.
g. The corner of the crop cutting subplot is located where your foot lands on the last step: this is point A.
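The random-number rule in steps d through f can be sketched as follows. This is a minimal illustration, assuming the random number table is a simple list read in order; the function names and the example table are hypothetical and not part of the field protocol. The sketch does not check whether the resulting subplot actually falls inside the plot, which the protocol handles separately by re-drawing.

```python
def first_usable(random_numbers, side_length_m, start=0):
    """Return (index, value) of the first table entry, from `start` onward,
    that is less than the side length; larger entries are skipped."""
    for i in range(start, len(random_numbers)):
        if random_numbers[i] < side_length_m:
            return i, random_numbers[i]
    raise ValueError("no usable random number left in the table")

def subplot_offsets(random_numbers, side_a_m, side_b_m):
    """Meters to walk along the longer side, then into the plot parallel
    to the shorter side, to reach point A."""
    longer, shorter = max(side_a_m, side_b_m), min(side_a_m, side_b_m)
    i, along_longer = first_usable(random_numbers, longer)
    # The second draw continues with the next entry in the same table.
    _, into_plot = first_usable(random_numbers, shorter, start=i + 1)
    return along_longer, into_plot

# Hypothetical table; sides of 25 m and 12 m as in the example in step d.
table = [28, 17, 22, 9, 4]
offsets = subplot_offsets(table, 25, 12)
print(offsets)  # 28 is skipped (too long), then 22 is skipped for the short side
```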

2) Crop Cutting Subplot Demarcation:
a. At point A, insert the first stick firmly into the ground, then turn your face to the east and lay the PVC pipe square in front of you (if you do not have PVC pipe with you, measure 4 meters directly to the east). One side of the PVC square should go from Point A to the east, which we call Point B. From Point B, the next corner of the PVC pipe should be to the north where we put Point C (if you do not have PVC pipe with you, measure 4 meters directly to the north).
b. With the PVC square on the ground, insert sticks exactly at each corner.
c. Tie a rope around all four sticks. Carefully dis-assemble and remove the PVC pipe square, leaving only the rope. The rope will stay on the subplot until the time of harvest.
d. In order to make sure that the subplot size is correct, check to make sure that the diagonal line (Line A-C) is: (1) 5.66 meters on the 4m x 4m subplot and (2) 2.828 meters on the 2m x 2m subplot.
Note: If the random numbers obtained from the random table for long and short sides of the plot do not fall in the crop plot area, drop both random numbers and start over again. Each time when one or both of the random numbers fail to fall in the plot, drop both and start again until both random numbers fall on the plot. Also, if the 2x2 subplot overlaps with the 4x4 subplot, you must drop the random numbers and start again on the 2x2 subplot.
If there is an obstacle in one or more of the crop-cutting subplots, such as a large tree stump, a boulder, large ant hill, etc. re-select the subplot by starting with a new random corner.
If in one or more of the crop-cutting subplots there is maize damage, DO NOT re-select the subplot. Leave it as is and we will record the damage in the crop-cut questionnaire.
e. FOR THE 4x4 SUBPLOT ONLY: The last step is to divide the 4m x 4m subplot into 4 equal quadrants. Measure 2m from each corner and insert a new stick. These new sticks will mark the middle of the rope on each side. Next, tie a piece of rope between the new sticks so that there are four (4) equal quadrants as in the example below. These four quadrants will be called quadrant A, B, C, and D. When conducting the harvest, you will need to keep the crop from each quadrant separate.
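The diagonal check in step d follows from the Pythagorean theorem: a square with side s has a diagonal of s times the square root of 2. A minimal sketch is given below; the 5 cm tolerance is an assumption made for illustration only, since the protocol does not state an acceptable margin.

```python
from math import hypot

def expected_diagonal_m(side_m):
    """Diagonal of a square subplot with the given side length."""
    return hypot(side_m, side_m)

def diagonal_ok(measured_m, side_m, tolerance_m=0.05):
    """True if the measured A-C diagonal is within the (assumed) tolerance."""
    return abs(measured_m - expected_diagonal_m(side_m)) <= tolerance_m

print(round(expected_diagonal_m(4), 2))   # 4m x 4m subplot
print(round(expected_diagonal_m(2), 3))   # 2m x 2m subplot
```

A diagonal that is too short or too long indicates that the staked shape is a rhombus rather than a square, so its area would differ from the intended 16 or 4 square meters.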

3) Harvest of Demarcated Section - Completed at the time of harvest
With the consent of the farmer, harvest all of the maize contained within the demarcated plot area, keeping the crop from each quadrant separate. Count the number of plants and cobs that are harvested from each quadrant. Once harvested and shelled, the grain should be weighed carefully using the digital scales, and the data recorded in the Module M of the Crop-Cutting Questionnaire. Remember, the crop from each quadrant must be weighed separately. The crop cut samples will be picked up on weekly supervision visits and delivered to the central drying location in Kampala. A separate team will then dry the crop for an additional time, weigh it again at a later date, and capture the moisture content at the time of the final weighing.

Full Plot Crop Cutting in MAPS II
Out of the initial target of 540 households, half of households in each of the pure stand and intercropped domains in each MAPS Round I EA were selected at random, prior to the start of the MAPS II fieldwork, to be subject to full-plot crop cutting, in addition to the 8x8m sub-plot crop cutting. This yielded a pre-full-plot crop cut sample of 282 households. Given the attrition dynamics in Round II and 31 households that were selected for full-plot crop cutting but that had to harvest their only plot prior to the crop cutting visit (with the exception of the crop cut sub-plot), the final sample that was subject to full plot crop cutting was composed of 214 households.
At the time of the harvest, the 8x8 crop cut sub-plot was harvested first, and both unshelled and shelled weights were taken, alongside shelled grain moisture readings for each of the 4x4m quadrant harvests. Subsequently, the rest of the plot was harvested for unshelled and shelled weight measurements and shelled grain moisture measurements, following the steps outlined below. For capturing unshelled and shelled weights tied to full-plot crop cut harvests, we used high-accuracy, digital, industrial HIWEIGH scales that were procured through and calibrated by the Uganda National Bureau of Standards. Each scale had a maximum load of 300 kilograms and a readability of 0.01 grams. We used DICKEY-john mini GAC moisture meters that were borrowed from the National Crops Resources Research Institute (NaCRRI). Each of the 3 survey teams had one scale and one moisture meter.
Due to different planting times, farmers were harvesting their grain at different times. Instead of concentrating on one EA at a time, the teams adopted a system of visiting an average of 2 to 3 EAs per day during the crop cutting fieldwork period, and conducting crop cuts on an average of 3 sampled plots across harvest-ready households. Each household was allocated, on average, four 100 kilogram bags to facilitate the full plot harvest. Every plot had 2 crop assistants recruited from the associated household to assist in the full-plot crop cut. Every crop assistant was paid 5000 Ugandan Shillings, approximately 1.5 USD. Further, each full-plot crop cut household received a tarpaulin (used for maize drying), in addition to a hoe and a panga knife that was provided to all MAPS II households.

[...] then constructed for the share of the subplot that fell within a 4-meter internal buffer of the plot boundary. A separate variable is computed considering a 1-meter plot buffer for MAPS II, given the increase in the crop sub-plot area to 64 m² in this round.
In the case of MAPS I, since there are two crop cut sub-plots (a 4x4m sub-plot and a separate 2x2m sub-plot) and each may have a different share of the subplot within the buffer zone, an aggregated variable is necessary. This variable is computed as the sum of the shares of the subplots within the buffer zones, weighted by the share of the total crop cut area of 20 m² that a given crop cut sub-plot accounted for. The indicator for MAPS II is simply the share of the 8x8 meter subplot in the buffer zone.
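One reading of this aggregation, consistent with the description above, is an area-weighted average: (16/20) times the buffer share of the 4x4m sub-plot plus (4/20) times the buffer share of the 2x2m sub-plot. The sketch below implements that reading; the function names are hypothetical, and the original equation is not reproduced here, so the exact form should be treated as an assumption.

```python
AREA_4X4_M2 = 16.0
AREA_2X2_M2 = 4.0

def _check_share(share):
    if not 0.0 <= share <= 1.0:
        raise ValueError("a share must lie between 0 and 1")
    return share

def edge_share_maps1(share_4x4, share_2x2):
    """Area-weighted share of the MAPS I crop cut area inside the buffer.

    Weights are each sub-plot's part of the 20 m2 total crop cut area
    (assumed reading of the aggregation described in the text).
    """
    total_m2 = AREA_4X4_M2 + AREA_2X2_M2
    weighted = (AREA_4X4_M2 * _check_share(share_4x4)
                + AREA_2X2_M2 * _check_share(share_2x2))
    return weighted / total_m2

def edge_share_maps2(share_8x8):
    """MAPS II indicator: share of the single 8x8m sub-plot in the buffer."""
    return _check_share(share_8x8)

# Hypothetical plot: a quarter of the 4x4 and all of the 2x2 lie in the buffer.
example_share = edge_share_maps1(0.25, 1.0)
print(example_share)
```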