Is Dirt Cheap? The Economic Costs of Failing to Meet Soil Health Requirements on Smallholder Farms

Agricultural productivity is hindered in smallholder farming systems due to several factors, including farmers’ inability to meet crop-specific soil requirements. This paper focuses on soil suitability for maize production and creates multidimensional soil suitability profiles of smallholder maize plots in Uganda, while quantifying forgone production due to cultivation on less-than-suitable land and identifying groups of farmers that are disproportionately impacted. The analysis leverages the unique socioeconomic data from a subnational survey conducted in Eastern Uganda, inclusive of plot-level, objective measures of maize yields and soil attributes. Stochastic frontier models of maize yields are estimated within each soil suitability class to understand differences in returns to inputs, technical efficiency, and potential yield. Only 13 percent of farmers are cultivating soil that is highly suitable for maize production, while the vast majority are cultivating only moderately suitable plots. Farmers cultivating highly suitable soil have the potential to increase their observed yields by as much as 86 percent, while those at the opposite end of the suitability distribution (with marginally suitable land) operate closer to the production frontier and can only increase yields by up to 59 percent, given the current technology set. There is heterogeneity in potential gains across the wealth distribution, with poorer households facing more heavily constrained potential. Assuming no change in technologies and management practices used by Ugandan farmers, there are limited economic gains tied to closing suitability class-specific productivity gaps, or even at the extreme reaching the average potential productivity levels observed in the high suitability class. JEL: O13, Q12, Q18.


Introduction
Agriculture is central to rural livelihoods in Sub-Saharan Africa, where smallholder family farming contributes up to 69 percent of rural household incomes (Davis et al., 2017), with direct effects on household consumption and nutrition outcomes (Azzarri et al., 2015; Dillon et al., 2017; Kirk et al., 2018; Slavchevska et al., 2015). In view of the importance of agriculture for farming households and the evidence regarding the disproportionate reduction in poverty associated with growth in the agricultural sector vis-à-vis other sectors, increasing agricultural productivity has been a long-standing goal of African governments. Nevertheless, the observed yields for staple crops, such as cereals, remain significantly lower than potential yields, especially in rain-fed areas (Lobell et al., 2009). Research has also asserted that (i) the large dispersion in agricultural productivity among African smallholders is driven in part by unobserved heterogeneity in land quality (Gollin and Udry, 2021), and (ii) poor crop yields in the region are driven in part by depletion of soil nutrients, which, besides its direct impact on crop yields, also adversely affects the effectiveness of non-land inputs (Berazneva et al., 2018).
The micro-level relationships between crop yields and a range of both climatic and non-climatic factors have been studied extensively in African smallholder farming systems, utilizing socioeconomic and agricultural survey data which, if georeferenced, have also been integrated with third-party geospatial data sources.² Despite the rich evidence base on the drivers of, and constraints to, crop yields, research on the impact of soil fertility on farm- and plot-level agricultural productivity outcomes, including crop yields, has been relatively limited.
The knowledge gaps have been mainly due to (i) systematic measurement errors in farmers' subjective soil quality assessments (Berazneva et al., 2018; Carletto et al., 2017); (ii) the lack of integration of objective soil sampling and testing into the household and farm surveys that are critical for understanding the drivers of agricultural production and productivity at the micro level; and (iii) the mismatch between the scale of African smallholder farming and the spatial resolution of publicly available, continent-wide geospatial data on soil properties, most notably those generated at 250-meter spatial resolution as part of the Africa Soil Information Service (AFSIS) initiative (Hengl et al., 2015).
A related strand of research comprises geospatial assessments of the suitability of growing conditions for specific crops. These assessments have been made available at the level of relatively aggregated geographic areas, leveraging primarily geospatial data that may or may not be complemented with ground data on soil properties (Abd-Elmabod et al., 2019; Ahamed et al., 2000; Hall et al., 1992). A prominent example of these efforts is the Global Agro-Ecological Zones (GAEZ) initiative, which makes available global geospatial datasets on crop suitability and attainable yield for 53 crops but at approximately 9-kilometer spatial resolution.³ While the GAEZ data may be useful for assessing crop suitability across expansive geographies, the coarse resolution of this geospatial product limits its use for farm- or plot-level analyses of agricultural productivity in Sub-Saharan African contexts.
Against this background, our paper leverages unique household survey data collected in Uganda, inclusive of plot-level, objective measures of maize yields and soil attributes, to fill knowledge gaps regarding the linkages between soil fertility and smallholder agricultural productivity, both on the whole and within different farmer subpopulations that are defined by socioeconomic characteristics. In doing so, we additionally provide operational insights regarding the integration of objective soil testing into large-scale household surveys, and present empirical evidence regarding the shortcomings of existing geospatial data on soil attributes vis-à-vis plot-level soil sampling and analysis. The data originate from a methodological survey experiment conducted in Eastern Uganda, the top maize-producing region in Uganda, where maize is a primary staple crop.
Specifically, our analysis (i) estimates the maize-specific soil suitability profile of each maize plot based on plot-level measures of soil attributes and assigns each plot to one of four suitability classes that are anchored in the GAEZ definitions; (ii) demonstrates the heterogeneity, across the soil suitability classes, in observed versus potential levels of productivity through the estimation of stochastic frontier functions of crop cutting-based maize yields; (iii) quantifies production and income gains from closing suitability class-specific productivity gaps, for the sample as a whole and for various farmer sub-populations; and (iv) documents whether the use of 250-meter resolution AFSIS geospatial data on soil attributes changes our conclusions regarding the relationships between soil fertility and maize yields.
Measuring agricultural productivity on plots of varying levels of maize-specific soil suitability, and the potential gains in productivity on those plots, allows for a more thorough understanding of the ability of agriculture alone to generate significant income gains. And while some existing work integrates ground-based measures of realized crop production, the current literature fails to adequately address the linkages between soil suitability, observed versus potential crop yields, and other household- and community-level factors influencing productivity outcomes.⁴
The results indicate that despite the standing of the Eastern Region as Uganda's leading maize-producing region, only 13 percent of farmers are cultivating soil that is highly suitable for maize production, while the vast majority are cultivating only moderately suitable lands. The key soil health deficiencies differ across suitability classes, suggesting that soil-based interventions need to be carefully considered for the specific suitability profiles in which they take place. The relationship between soil suitability, observed yields, and yield potential is positive and significant. The findings suggest that farmers cultivating highly suitable land have the potential to increase their observed yields by as much as 86 percent, up to 3,009 kg/ha, while those at the opposite tail of the suitability distribution, those with marginally suitable land, operate closer to the efficient frontier and thus can only seek to increase observed yields by up to 59 percent, or 1,315 kg/ha. Furthermore, the analysis reveals heterogeneity in potential gains across the wealth distribution, with poorer households facing more heavily constrained potential.
The stochastic frontier estimations are sensitive to the use of geospatial AFSIS soil data vis-à-vis the plot-level soil measurements, and this is in part driven by the AFSIS data failing to distinguish between soil suitability classes to the same degree as, and in a consistent manner with, the plot-level soil data, with 19 percent of plots assigned a different suitability class when using the AFSIS data vis-à-vis the plot-level data. On the whole, assuming no change in technologies and management practices used by Ugandan farmers, there are limited economic gains tied to closing suitability class-specific productivity gaps, or even at the extreme reaching the average potential productivity levels observed in the high suitability class.
The paper is organized as follows: Section 2 describes the context of Ugandan smallholder farming, Section 3 describes the data, Section 4 presents the empirical methodology, Section 5 discusses the key results, and Section 6 concludes.

Country Context
Uganda has a population of 45.7 million, 73 percent of whom reside in rural areas.⁵ The share of rural population falling below the national poverty line stands at 23.4 percent, twice the level observed in urban areas (UBOS, 2021). In rural areas, 78 percent of the working population is employed in agriculture (UBOS, 2021), and agricultural income makes up 67 percent of total rural household income (Davis et al., 2017). As such, the Government of Uganda has long recognized the role of increased agricultural productivity as an important driver in generating wealth and alleviating poverty (GoU, 2013; 2015; MAAIF, 2013).
In Eastern Uganda, the primary maize-growing region in the country and the region of analysis here, maize accounts for the highest share of crop income (World Bank, 2016), and no more than 40 percent of maize-growing households sell any maize.⁶ Eastern Uganda, following Northern Uganda, is also the region with the highest concentration of the country's poor, with a poverty rate of 26 percent (UBOS, 2021). In the analysis sample, as discussed in the subsequent section, nearly 52 percent of all parcels of land owned or cultivated by the household were inherited or allocated by family or local leaders, suggesting that there is limited mobility of land across households.

Data
The majority of the analysis that follows relies on household survey data collected through the Methodological Experiment on Measuring Maize Productivity, Soil Fertility, and Variety (MAPS), and the related plot-level soil sample testing results. These plot-level soil analyses are complemented by, and compared to, geospatial soil data extrapolated from the Africa Soil Information Service (AFSIS).

3.1. MAPS
The Methodological Experiment on Measuring Maize Productivity, Soil Fertility, and Variety (MAPS) is a two-round household panel survey aimed at testing alternative methods of measuring maize production and key agricultural inputs, including soil fertility, maize variety, and plot area.⁷ The resulting MAPS dataset includes a unique collection of objectively measured variables paired with data on household socioeconomics, demographics, and agricultural practices. MAPS Round I was fielded in 2015, and Round II was implemented in 2016. As the second round of the study did not include soil analysis, in this paper we utilize only MAPS Round I, which collected detailed data on the first (and the main) agricultural season of the calendar year.
In order to ensure high quality data collection and supervision, the MAPS sampling design was limited in its geographic scope. The sampling for MAPS Round I was completed in a multi-stage process. First, three strata were identified in the primary maize-growing regions of Eastern Uganda, namely Serere district, Sironko district, and a 400 km² area spanning Iganga and Mayuge districts. From each stratum, enumeration areas (EAs) were randomly selected with probability proportional to size (15 each from Serere and Sironko, and 45 from the Iganga/Mayuge stratum). In each selected enumeration area, a full household listing was conducted as part of the study, identifying households that cultivated at least one maize plot and whether they had pure stand and/or intercropped plots. Finally, 12 households were selected from each enumeration area, with an effort to have an even split of pure stand versus intercropping maize households. Due to the low incidence of pure stand maize plots, and cases in which plots identified as pure stand in the household listing phase were intercropped at the time of the first interview, the final sample was made up of 385 pure stand maize plots and 515 intercropped maize plots (43 percent and 57 percent, respectively). Therefore, the sample comprises 900 maize plots, each one from a different household.
7 MAPS was implemented through a collaboration between the World Bank's Living Standards Measurement Study (LSMS), the Uganda Bureau of Statistics, the World Agroforestry Centre, the CGIAR Standing Panel on Impact Assessment (SPIA), and Stanford University, with generous support from the UK government. It is part of a larger methodological research agenda undertaken by the World Bank's LSMS, aimed at identifying improved methods of agricultural and household data collection using more objective, yet scalable, methods.
The MAPS fieldwork was implemented by the Uganda Bureau of Statistics, with technical and training support from the World Bank Living Standards Measurement Study (LSMS). Each household was visited three times: for a post-planting interview, a crop-cutting visit, and a post-harvest interview. The post-planting visit involved the administration of a questionnaire and the GPS-based plot area measurements, the demarcation of crop-cutting subplots, and the collection of soil samples (discussed below) on the randomly selected maize plot. The post-planting questionnaire included a standard individual-level module on household composition and basic characteristics (age, gender, education, etc.), a durable assets module, a farming assets module, questions on the use and availability of agricultural extension services, and finally parcel- and plot-level details.⁸ The plot-level modules made up the bulk of the post-planting questionnaire, with questions on tenure status, cultivation status, which household members manage the plot, what farm implements were used, what farm management practices were employed (for example, tillage, crop rotation, etc.), post-planting labor inputs, and most importantly, farmer assessment of plot area, soil quality, and seed usage. It is critical to note that farmer assessment was made prior to any objective measurement so as to not influence the farmer response.⁹ In the second visit, the crop-cutting visit, enumerators harvested the demarcated subplots which were set during the post-planting visit in order to obtain objectively measured production quantities for the crop-cutting subplots, which are subsequently extrapolated to the full plot area. The final household visit took place following completion of all maize harvests. At this time, farmers were administered an additional questionnaire, which asked for the estimated total maize production per plot as well as fertilizer inputs and harvest labor inputs.
8 Smallholder agricultural questionnaires in Uganda are structured such that there is a parcel of land, and within that parcel there may be multiple plots. The level of interest in this paper is the plot. In MAPS, a parcel was defined as "a contiguous piece of land with identical (uniform) tenure and physical characteristics. It is entirely surrounded by land with other tenure and/or physical characteristics or infrastructure e.g. water, a road, forest, etc." A plot was defined as "a contiguous piece of land within a parcel on which a specific crop or a crop mixture is grown. A parcel may be made up of one or more plots."
9 Because MAPS was a small-scale methodological validation study, great care was taken to ensure that there were no missing values for the key variables; therefore, there are no concerns of missing data. There were, however, circumstances that required the sample to be restricted from 900 to 840. Plots which did not have any soil fertility measurement (due to mismatching of soil sample labels) or no crop-cutting (due to non-compliance of households) are excluded. The missingness of soil measurement is likely independent of production on the plot as the missingness stems from errors by the enumerator or laboratory. It could be argued, however, that non-compliance by the household (in which they harvest the crop-cutting subplot before the enumerator's arrival) could be a systematic problem in which households with fewer resources cannot afford to forgo the maize on the crop-cutting subplot.
In what follows, we provide more details on the methods used for data collection in domains that are central to our research.

Soil fertility:
Soil fertility testing was conducted by the World Agroforestry Centre (ICRAF). During MAPS fieldwork, enumerators collected plot-level soil samples from each of the selected plots following a protocol carefully designed to maximize the representativeness of the samples while maintaining feasibility of implementation. From each plot, four samples were collected from the top-soil (0-20cm depth) and combined in the field to create one composite top-soil sample. Additionally, a single sub-soil sample (20-50cm depth) was collected from the center of the plot. After being processed locally, the samples were shipped to ICRAF Nairobi where all samples were subject to spectral soil analysis and approximately 10 percent were subject to conventional wet chemistry testing. A portion of this 10 percent sample was used to calibrate prediction models, while the remainder was used to verify the predictions made from the spectral data. For details, see Shepherd & Walsh (2002). The final results from the soil analysis include key indicators of soil fertility such as pH, texture analysis (percent sand, percent clay), cation exchange capacity, and the concentration of multiple elements and micronutrients, such as carbon, nitrogen, and potassium.

Maize yields:
A 4x4 meter subplot (divided into four 2x2 meter quadrants) and a separate 2x2 meter subplot were laid on the randomly selected maize plot during the post-planting visit following a strict protocol to ensure the location of the subplots was random. The subplots were roped off until harvest, when the enumerators were alerted and completed the harvest with the assistance of the farmer and a local assistant. The shelled maize from each 2x2 meter subplot was weighed and barcoded separately. The maize was then dried by a dedicated team at a central, monitored location until moisture content was in the range of 12 to 14 percent. Once desired dryness was met, the maize was re-weighed, and the dry weight and final moisture content recorded. For analysis, all maize weights have been normalized to 12 percent moisture content.
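The moisture normalization and the per-hectare extrapolation described above can be illustrated with a minimal sketch. It assumes the conventional dry-matter-conserving adjustment, which the text does not spell out; the function names and the example harvest values are hypothetical and are not taken from the MAPS data.

```python
def normalize_to_moisture(weight_kg, moisture_pct, target_pct=12.0):
    """Rescale a grain weight measured at moisture_pct to target_pct moisture.

    ASSUMPTION: dry matter is conserved, so weight * (100 - moisture) is the
    same at both moisture levels. The paper only states that weights were
    normalized to 12 percent moisture content.
    """
    if not 0.0 <= moisture_pct < 100.0:
        raise ValueError("moisture_pct must be in [0, 100)")
    return weight_kg * (100.0 - moisture_pct) / (100.0 - target_pct)


def yield_kg_per_ha(weight_kg, subplot_area_m2):
    """Extrapolate a subplot harvest to kilograms per hectare (1 ha = 10,000 m2)."""
    if subplot_area_m2 <= 0:
        raise ValueError("subplot_area_m2 must be positive")
    return weight_kg * 10_000.0 / subplot_area_m2


# Hypothetical example: 2.2 kg of shelled maize dried to 14 percent moisture,
# harvested from a 4x4 meter (16 m2) subplot.
weight_at_12 = normalize_to_moisture(2.2, 14.0)
example_yield = yield_kg_per_ha(weight_at_12, 16.0)
print(weight_at_12, example_yield)
```

Because the drying protocol targets a 12 to 14 percent range, the adjustment under this assumption is small (at most about 2 percent of the recorded weight), but it keeps yields comparable across plots.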

Plot area:
Following conclusive evidence of systematic bias in farmer estimates of plot area among smallholder farmers in the region (see Carletto et al., 2013; Carletto et al., 2015), MAPS implemented area measurement using a Garmin eTrex 30 handheld GPS device. Both the area and the raw GPS track outline were stored.
Input rates are relatively low, with 15 percent (4 percent) of plots being treated with inorganic fertilizer (pesticide). Average maize yields, as measured via crop-cutting, are 1,068 kg/ha. Socio-economic indicators, plot manager characteristics, and agricultural variables are also included in Table 1, as they will be relevant for the analysis that follows.

3.2. AFSIS
Geospatially-derived soil data is more widely available to researchers and policy makers than plot-level soil sampling linked to household surveys. Yet, geospatial data in this realm is of coarser granularity than what would be observed at the plot level. In order to understand the implications of relying on geospatial soil data in cases where plot-level sampling is unavailable, we complement the analysis using MAPS-based soil properties with that using soil data drawn from one of the premier, publicly available geospatial soil databases: the Africa Soil Information Service's AfSoilGrids250m data product.¹⁰ The AfSoilGrids250m product, henceforth referred to simply as AFSIS, utilizes multiple inputs to construct a map of more than 15 key soil properties at 250-meter resolution across the entire African continent. Inputs to the product, including the Africa Soil Profiles database (Leenars et al., […] (Vagen et al., 2010), the GlobeLand30 land cover database, and the SoilGrids 1km predicted values, are joined through the use of 3D regression kriging founded on random forests modeling (Hengl et al., 2015). A layer of geospatial data is produced for each soil property at anywhere from one to six different soil depths.
10 AfSoilGrids250m is a product developed by the World Soil Information (ISRIC) in collaboration with the World Agroforestry Centre (ICRAF), The Earth Institute (Columbia University), and the International Centre for Tropical Agriculture (CIAT).
Because MAPS georeferenced each agricultural plot, it is possible to join the AFSIS data with the center point of the agricultural plot and extract the point estimates of each soil property of interest, at the soil depths of interest. The following soil properties were extracted from AFSIS and utilized in the soil suitability analysis: cation exchange capacity, electrical conductivity, organic carbon, and pH. Each of these properties was available at depths of 0-5cm, 5-15cm, and 15-30cm. Because MAPS soil samples were taken at depths of 0-20cm and 20-50cm, we use a weighted average of the AFSIS 0-5cm and 5-15cm values for comparison with the MAPS 0-20cm samples. Subsoils, MAPS 20-50cm or AFSIS 15-30cm depths, are not utilized. Table 2 provides summary statistics for key soil properties derived from both the MAPS and AFSIS sources.
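As an illustration of the depth aggregation, the sketch below combines the two AFSIS top-soil layers into a single value. The weights are an assumption: the text does not report them, so layer thickness (5 cm and 10 cm) is used here. The function name and the pH values are hypothetical.

```python
def depth_weighted_mean(layers):
    """Thickness-weighted mean of a soil property across depth layers.

    layers: iterable of (top_cm, bottom_cm, value) tuples.
    ASSUMPTION: weights are proportional to layer thickness; the paper only
    states that a weighted average of the 0-5cm and 5-15cm layers was used.
    """
    layers = list(layers)
    total_thickness = sum(bottom - top for top, bottom, _ in layers)
    if total_thickness <= 0:
        raise ValueError("layers must have positive total thickness")
    weighted_sum = sum((bottom - top) * value for top, bottom, value in layers)
    return weighted_sum / total_thickness


# Hypothetical AFSIS pH values extracted at a plot's center point.
afsis_ph_layers = [(0, 5, 6.1), (5, 15, 5.8)]
topsoil_ph = depth_weighted_mean(afsis_ph_layers)
print(topsoil_ph)
```

Under this weighting the 5-15cm layer counts twice as much as the 0-5cm layer. The resulting 0-15cm value still does not cover the full 0-20cm depth sampled in MAPS, so the comparison remains approximate.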

Empirical Approach
Various approaches to agricultural productivity are used in the agricultural literature, depending on research objectives and data availability. Average measures of productivity, including partial and total factor analysis, can be used to create a single statistic, but the methods require high quality crop price data for the monetization of reported production, and such data are often hard to come by in rural agricultural contexts with thin markets. Alternatively, marginal productivity analysis can be conducted with more direct policy-related takeaways. Cobb-Douglas functions, and variations of Cobb-Douglas, are commonly used (Deininger et al., 2007; Sherlund et al., 2002). The limitation of a simple linear production function in this context is that it assumes all farmers to be performing at optimal levels, without explaining the deviations between the observed and attainable (predicted) output levels. In order to allow for the analysis of the heterogeneity in production potential conditional on crop-specific suitability, one of the main objectives of the paper, we use stochastic frontier analysis, which allows for a better understanding of the aforementioned deviations.
The following two-step empirical approach is employed: (1) estimate crop suitability measures at the plot-level; and (2) estimate stochastic frontier models to estimate production frontiers for each class of maize suitability. The contribution of this paper comes from the ability to execute each of these steps on the same sample and from being able to do so with objectively-measured soil properties and crop production. In addition to executing the aforementioned analytical steps using the MAPS plot-level soil data, the steps are replicated using the geospatially-derived soil property data from AFSIS.

4.1. Assigning Maize-Specific Soil Suitability Measures
Estimating aggregate crop suitability measures requires comparing a vector of optimal soil properties against the levels of said properties observed on each plot. Crop suitability cannot be reduced to a single soil property, as several properties affect plant growth simultaneously, and soil property requirements vary by crop. The crop suitability framework set forth by FAO (1976), and illustrated in Figure 1, will serve as the foundation for the suitability classifications at the crop-soil property level. The maize suitability analysis completed here includes pH, cation exchange capacity (CEC), organic carbon, salinity (soil electrical conductivity), and plot slope (percent).¹¹ After identifying the suitability class of each soil property individually, based on the property-specific critical values borrowed from Naidu (2006) and further reviewed and modified with input from the World Agroforestry Centre, we utilize a fuzzy membership method to construct a membership grade for each suitability class, allowing for identification of the suitability class that best approximates the soil sample overall. The fuzzy membership method is commonly employed in land suitability analysis with GIS data (Ahamed et al., 2000; Ceballos-Silva & López-Blanco, 2003; Hall et al., 1992; Kahsay et al., 2018; Kalogirou, 2002). This method is also applicable to the plot-level MAPS data, however, as the data include precise measures of soil parameters that are often extrapolated from lower resolution geospatial data. In this paper, the unit of analysis is the plot rather than the pixel as in geospatial analysis.
11 Multiple variations of the soil suitability framework were created, each containing a different combination of key soil properties. The selected framework was chosen based on its superior predictive power in bivariate regression on yields measured via crop-cutting.
The fuzzy membership method, drawn heavily from Ahamed et al. (2000) and Hall et al. (1992), begins with an identification of the similarity, or Euclidean distance, between the vector of soil properties on each plot, $x$, and the representative vector for a given suitability class. After normalizing values over the interval [0,10] for each property to eliminate unit-sensitivity (following Hall et al., 1992), the "distance measure" is constructed as follows:

$$d_c(x, \lambda_c) = \sqrt{\sum_{j=1}^{p} \left(x_j - \lambda_{cj}\right)^2} \qquad (1)$$

where $x = (x_1, x_2, \ldots, x_p)$ is the vector of soil parameters on a given plot; and $\lambda_c = (\lambda_{c1}, \lambda_{c2}, \ldots, \lambda_{cp})$ is the representative vector of soil properties that corresponds to suitability class $c$.
Equation 1 results in a distance measure for each suitability class, where a higher score reflects greater divergence (less similarity) between the properties on a given plot and the respective suitability class. Subsequently, a membership grade is computed for each suitability class, which indicates the relative fit of a given plot to the specific class, ranging from zero to one, allowing for comparison of fit across suitability classes:
(2) […], where m = 4 (the total number of suitability classes).
Equation 2 results in a plot-level membership grade for each suitability class based on a given crop's representative vectors for each class. Each plot is then assigned the suitability class with the highest membership grade. It is important to note that the method above assumes equal weights for each of the soil properties, which may be a strong assumption considering agronomic needs. However, in the absence of literature upon which to anchor unequal weighting of soil properties for the Ugandan context, we utilize the equal weighting approach and leave exploration of alternative weighting schemes to future work.
In summary, the distance measure is an absolute measure of the difference between the soil properties on a given plot and a specific suitability class, while the membership grade is a relative score, ranging from zero to one, indicating the relative fit of a plot into each suitability class. The membership grades for S1, S2, S3, and N, therefore, sum to one for each plot. Two separate suitability class assignments are constructed for each agricultural plot: one derived from MAPS plot-level soil sample results and one from AFSIS geospatially-derived soil properties.
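The two-step classification can be sketched as follows. The Euclidean distance follows the description above. The membership grade, however, is an assumed inverse-distance normalization, chosen only because it satisfies the stated properties (grades lie between zero and one, sum to one across the four classes, and are highest for the closest class); it may differ from the expression used in the paper. The representative vectors and the plot values are hypothetical placeholders on the normalized [0,10] scale.

```python
import math


def euclidean_distance(x, representative):
    """Equal-weight Euclidean distance between a plot and a class vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, representative)))


def membership_grades(x, representatives, eps=1e-12):
    """Return (grades, distances) for one plot.

    ASSUMPTION: grades are inverse distances normalized to sum to one.
    This is one form consistent with the text, not necessarily the paper's.
    """
    distances = {c: euclidean_distance(x, rep) for c, rep in representatives.items()}
    inverse = {c: 1.0 / max(d, eps) for c, d in distances.items()}
    total = sum(inverse.values())
    grades = {c: value / total for c, value in inverse.items()}
    return grades, distances


def assign_class(grades):
    """Assign the class with the highest membership grade."""
    return max(grades, key=grades.get)


# HYPOTHETICAL representative vectors (order: pH, CEC, organic carbon,
# salinity, slope), already normalized to [0, 10]. Not the paper's values.
REPRESENTATIVES = {
    "S1": [9.0, 9.0, 9.0, 9.0, 9.0],
    "S2": [7.0, 7.0, 7.0, 7.0, 7.0],
    "S3": [4.0, 4.0, 4.0, 4.0, 4.0],
    "N": [1.0, 1.0, 1.0, 1.0, 1.0],
}

plot = [7.5, 6.0, 6.5, 8.0, 7.0]  # hypothetical normalized plot values
grades, distances = membership_grades(plot, REPRESENTATIVES)
assigned = assign_class(grades)
print(assigned, grades)
```

In this sketch `distances["S1"]` plays the role of the distance from highly suitable soil, while `assigned` is the class used to split the sample.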

4.2. Econometric Modeling of Production Frontiers
Aigner, Lovell, and Schmidt (1977) lay out the potential problems in minimizing the sum of squares of a simple production function, such as Cobb-Douglas, in estimating the maximum output for a given level of inputs. The authors argue that this method of estimation inadequately explains observed deviations from the maximum output for given levels of inputs. In their proposed stochastic frontier model, they explain the variation in deviations from the modeled maximum output, or the production frontier, and predict an observation-level measure of technical inefficiency.
Much of the literature on stochastic frontier models assumes a translog production function, in which inputs into the production function are also interacted (see Greene (2008), Sherlund et al. (2002), and Ekbom and Sterner (2008)). This can, however, result in an explosion of parameters to be estimated in the case of many inputs, such as in agricultural models. Rather than the translog function, we assume a log-linear Cobb-Douglas model, following the seminal work of Aigner, Lovell, and Schmidt (1977) and the agricultural examples set forth by Deininger et al. (2007), Kilic et al. (2009), and others. The estimated stochastic frontier model is as follows:

$$\ln y_i = \alpha + \beta' \ln X_i + \varepsilon_i \qquad (3)$$

where $y_i$ is total maize grain output (in kilograms) on plot $i$, and $\alpha$ and $\beta$ are parameters to be estimated. $X_i$ is a vector of traditional economic inputs, including land area, household and hired labor inputs, and inorganic fertilizer usage.¹² As this is a rain-fed agricultural system, rainfall is also controlled for. Plot-specific flowering season rainfall was computed as total rainfall during the 5th to 8th dekads following the onset of seasonal rainfall, using CHIRPS time-series precipitation data (Funk et al., 2015).¹³ The distance measure (from highly suitable soil), $d_{S1}$, is included in $X_i$ for analysis conducted on the full sample, while the membership grades, $\mu_c$, are used to disaggregate the full sample into suitability-class sub-samples upon which the estimations are conducted separately (discussed in more detail below). Because both pure stand and intercropped plots are included in the sample, a dummy for the cropping pattern and a continuous variable for the seeding rate are also included in the $X_i$ vector.¹⁴ Inclusion of indicator variables for administrative districts was attempted but was found to be problematic due to correlation with rainfall and other covariates.
The error term, , is disaggregated into a symmetric disturbance term, , and a non-negative disturbance, . The symmetric disturbance is assumed to be independently and identically distributed with (0, 2 ). It is assumed to be independent of and results from measurement error, climate-related shocks that affect production, and other exogenous shocks. The non-negative term, , represents the technical efficiency of the household cultivating the plot, or the distance from the potential production frontier. It is assumed to be from truncated normal distribution, (0, 2 ), with a zero-lower bound (Aigner, Lovell, and Schmidt, 1977). Furthermore, is modeled as a linear function of variables that are believed to explain a household's technical efficiency or ability (Deininger et al., 2007;Kilic et al., 2009): Zi is a J-vector of covariates used to explain technical efficiency, which includes plot manager age, an indicator for the manager's attainment of primary education, an indicator for whether the plot manager received agricultural extension services, the dependency ratio of the household, and the number of agricultural assets owned by the household. Uncertainty around climatic factors, which may influence farmer behavior with respect to farming practices, is proxied by the coefficient of variation of flowering season rainfall (over the period 1999 -2014), which is also included in Zi. Additional controls were initially included, such as gender of the plot manager and seasonal rainfall shocks, but due to correlation with other covariates and lack of explanatory power in linear regressions, these were ultimately excluded. 
Indicators of access to credit and public infrastructure could theoretically be included in Z, although data on these covariates are not available for this sample. The error term in the inefficiency equation, $w_i$, is assumed to follow a truncated normal distribution with mean zero, truncated at $-(\delta_0 + \sum_{j=1}^{J} \delta_j Z_{ji})$, such that $u_i$ remains non-negative.

13 The onset of the season is defined following the Water Requirement Satisfaction Index (WRSI), such that the season begins when at least 25mm of rain falls in one dekad, and a total of at least 20mm of rain falls in the subsequent two dekads (documentation available here: https://goo.gl/sgmTK8).

14 Pure stand plots are those on which only maize is grown. Intercropped plots are those on which maize and at least one other crop are grown. The "seeding rate" included here is a ratio of the quantity of maize seed used on the plot to the quantity of seed that would have been used had the farmer planted only maize. The seeding rate is, therefore, bounded (0,1] and equals 1 for all pure stand plots. The seeding rate is included in addition to the dummy variable for cultivation pattern because it is believed that some combinations of crops could improve potential maize yields.
Technical efficiency and the parameters from Equation 3 are estimated jointly using maximum likelihood. The model, which substitutes Equations 4 and 5 into Equation 3, is estimated four times for each of the MAPS-based and AFSIS-based soil suitability measures: (i) including all plots and controlling for soil suitability with the inclusion of the distance measure from suitability class S1; (ii) including only plots classified as highly suitable (S1); (iii) including only plots classified as moderately suitable (S2); and (iv) including only plots classified as marginally suitable (S3). Analysis is implemented on a sub-sample basis, rather than in a single model with suitability indicators, as it allows for the estimation of suitability-class-specific production frontiers.
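As a rough illustration of the joint estimation, the following Python sketch writes out the log-likelihood that a maximum likelihood routine would maximize. It assumes the common specification in which the inefficiency term is normal with plot-specific mean (the linear index in the efficiency covariates), truncated below at zero, and in which the composed error is the symmetric disturbance minus the inefficiency term. All function and argument names are hypothetical, the regressors are assumed to be already transformed and to include a constant, and this is not a reproduction of the software used for the paper's estimates.

```python
import math
from typing import Sequence


def _log_norm_cdf(x: float) -> float:
    # Standard normal log-CDF via erfc; adequate for moderately negative x.
    return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))


def composed_error_logpdf(
    eps: float, mu: float, sigma_v: float, sigma_u: float
) -> float:
    """Log density of eps = v - u, with v ~ N(0, sigma_v^2) and
    u ~ N(mu, sigma_u^2) truncated below at zero."""
    sigma2 = sigma_v**2 + sigma_u**2
    sigma = math.sqrt(sigma2)
    mu_star = (sigma_v**2 * mu - sigma_u**2 * eps) / sigma2
    sigma_star = sigma_u * sigma_v / sigma
    return (
        -0.5 * math.log(2.0 * math.pi)
        - math.log(sigma)
        - 0.5 * ((eps + mu) / sigma) ** 2
        + _log_norm_cdf(mu_star / sigma_star)
        - _log_norm_cdf(mu / sigma_u)
    )


def frontier_loglik(
    log_y: Sequence[float],
    X: Sequence[Sequence[float]],  # frontier regressors, constant included
    Z: Sequence[Sequence[float]],  # efficiency covariates, constant included
    beta: Sequence[float],
    delta: Sequence[float],
    sigma_v: float,
    sigma_u: float,
) -> float:
    """Sum of plot-level log densities for given parameter values."""
    total = 0.0
    for y_i, x_i, z_i in zip(log_y, X, Z):
        eps_i = y_i - sum(b * x for b, x in zip(beta, x_i))
        mu_i = sum(d * z for d, z in zip(delta, z_i))
        total += composed_error_logpdf(eps_i, mu_i, sigma_v, sigma_u)
    return total
```

In practice, this function would be passed (with its sign flipped) to a numerical optimizer over the frontier coefficients, the efficiency coefficients, and the two standard deviations, once for the full sample and once for each suitability-class sub-sample.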
Technical efficiency (TE) scores are computed based on the conditional distribution of $u_i$ given $\varepsilon_i$, following Battese and Coelli (1988), whereby the technical efficiency on plot i is defined as:

$$TE_i = E\left[\exp(-u_i) \mid \varepsilon_i\right] \qquad (6)$$

The technical efficiency scores are then used to compute potential production and productivity for the given level of inputs. 15
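The conditional expectation behind the TE score has a standard closed form when the inefficiency term is a truncated normal and the symmetric disturbance is normal. The Python sketch below evaluates that closed form for a single plot. The names are hypothetical, the parameter values would come from the estimated model, and the final helper assumes that potential output is obtained by dividing observed output by the TE score, which is the usual convention but is not spelled out in this section.

```python
import math


def _norm_cdf(x: float) -> float:
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def technical_efficiency(
    eps: float, mu: float, sigma_v: float, sigma_u: float
) -> float:
    """E[exp(-u) | eps] for eps = v - u, with v ~ N(0, sigma_v^2) and
    u ~ N(mu, sigma_u^2) truncated below at zero."""
    sigma2 = sigma_v**2 + sigma_u**2
    # Mean and standard deviation of the (pre-truncation) normal for u | eps.
    mu_star = (sigma_v**2 * mu - sigma_u**2 * eps) / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    ratio = _norm_cdf(mu_star / sigma_star - sigma_star) / _norm_cdf(
        mu_star / sigma_star
    )
    return ratio * math.exp(-mu_star + 0.5 * sigma_star**2)


def potential_output(observed_output: float, te_score: float) -> float:
    """Assumed convention: frontier output equals observed output / TE."""
    return observed_output / te_score
```

A more negative composed residual implies a larger expected shortfall from the frontier and therefore a lower score; scores always lie strictly between 0 and 1.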

5.1. Maize-Specific Suitability Measures
The fuzzy set membership method described above was implemented using both the MAPS plot-level soil samples and the AFSIS geospatially-derived soil data. The results of the maize-specific soil suitability classification exercise are summarized in Table 3. Using the MAPS plot-level soil samples as the basis for classification, 13 percent of plots are considered highly suitable, the majority (75 percent) are considered moderately suitable, and the remaining 12 percent are considered only marginally suitable. Note that classification into a specific group does not suggest that the plot-level soil properties fit that category in full. Rather, they are most closely aligned with that class relative to the other classes. No plots were classified as not at all suitable, in line with expectations as these are all maize-growing plots. Benchmarked against the preferred method of plot-level soil testing, there is evidence that the geospatial data fails to adequately distinguish soil suitability levels. AFSIS-based classification results in more intense clustering of observations in the central, moderately suitable class (88 percent), with only 7 percent classified as highly suitable and 6 percent as marginally suitable. While 81 percent of observations are mapped to the same suitability class regardless of whether MAPS or AFSIS soil data informs the classification, the suitability class of 19 percent of plots varies with the source of soil data utilized (see Figure 2), which will have economic implications for the perceived production frontiers for these households. A unique feature of this method is the ability to identify specific constraints to suitability for each class. According to MAPS-based suitability classifications, cation exchange capacity is an overarching limiting factor, with 43 percent of plots being classified as not suitable for that particular property. The perceived constraints differ when AFSIS-based classifications are made.
For example, AFSIS-based classifications suggest that only 3 percent of plots have levels of cation exchange capacity that are classified as not suitable (compared to 43 percent when using MAPS-based data). Annex Table A1 identifies the limiting factors for each MAPS-based suitability class separately, enabling an assessment of what interventions would be most effective in increasing the suitability of plots from one level to the next.
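The assignment rule and the MAPS-AFSIS comparison described above can be sketched in a few lines of Python. The sketch assumes that each plot already has one membership grade per suitability class from each soil data source, and that a plot is assigned to the class with the largest grade. The grades shown are hypothetical illustration values, not survey data, and the fuzzy membership functions themselves are not reproduced here.

```python
from typing import Dict, List, Sequence


def assign_class(grades: Dict[str, float]) -> str:
    """Assign the class with the largest membership grade.

    Ties are resolved in favor of the class listed first (an assumption).
    """
    return max(grades, key=lambda name: grades[name])


def agreement_rate(a: Sequence[str], b: Sequence[str]) -> float:
    """Share of plots assigned to the same class by two data sources."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical membership grades for four plots (illustration only).
maps_grades: List[Dict[str, float]] = [
    {"S1": 0.60, "S2": 0.30, "S3": 0.10},
    {"S1": 0.20, "S2": 0.50, "S3": 0.30},
    {"S1": 0.10, "S2": 0.35, "S3": 0.55},
    {"S1": 0.25, "S2": 0.45, "S3": 0.30},
]
afsis_grades: List[Dict[str, float]] = [
    {"S1": 0.40, "S2": 0.50, "S3": 0.10},
    {"S1": 0.15, "S2": 0.60, "S3": 0.25},
    {"S1": 0.10, "S2": 0.40, "S3": 0.50},
    {"S1": 0.20, "S2": 0.55, "S3": 0.25},
]

maps_classes = [assign_class(g) for g in maps_grades]
afsis_classes = [assign_class(g) for g in afsis_grades]
agreement = agreement_rate(maps_classes, afsis_classes)
```

Applied to the full sample, the analogous agreement share is the 81 percent reported above, with the remaining 19 percent of plots changing class depending on the soil data source.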

The suitability classifications are consistent with expectations with respect to agricultural productivity. Figure 3 illustrates the distribution of maize yields (kg/ha) by suitability class. Using MAPS-based classifications, highly suitable plots realized an average of 1,614 kg/ha, while moderately and marginally suitable plots realized an average of 1,015 and 828 kg/ha, respectively (1,497 kg/ha, 1,053 kg/ha, and 789 kg/ha using AFSIS-based classifications). Note that the existence of highly suitable soil does not, in itself, result in high maize yields, but rather increases the upper bound and incidence of high maize yields.

Figure 3. Productivity by Suitability Class; MAPS- and AFSIS-based classifications. S1, S2, and S3 indicate highly suitable, moderately suitable, and marginally suitable soil for maize production, respectively.

5.2. Estimation of Production Frontiers
The results of the stochastic frontier analyses are reported in Table 4. The MAPS- and AFSIS-based overall estimations offer a fairly consistent understanding of output elasticities for most variables, suggesting that on average the geospatial soil data may be comparable to, and potentially a substitute for, plot-level soil data. However, several findings temper the enthusiasm for the use of geospatial data as an acceptable substitute for plot-level soil data. The coefficients on soil suitability in these overall specifications, measured as the distance from the S1 vector, are both negative, thereby suggesting that the closer a plot's soil is to the optimal, the greater its production, as expected. The MAPS-based soil data exhibits a stronger relationship with production than does the AFSIS data, and the OLS regression replicating the production function portion of the stochastic frontier model (see Annex Table A2) reveals that the coefficients on soil suitability (as measured by distance from S1) are statistically different across MAPS and AFSIS specifications. 16 Seasonal rainfall is only a moderately statistically significant input into production when controlling for plot-level soil suitability, but it is a strong and significant predictor in the AFSIS model. The latter finding runs contrary to expectations, as the plot-level soil data does not directly incorporate climatological variables while the AFSIS data does. A final notable difference in output elasticities across the overall specifications is that of inorganic fertilizer. Under the AFSIS model the application of inorganic fertilizer does not yield any statistically significant gains in production, contrary to expectations and findings in the MAPS-based model. This raises the first caution against the use of geospatial data as a substitute for plot-level soil data.
16 A test of the difference in coefficients across the overall MAPS and AFSIS specifications, columns 1 and 2 in Annex Table A2, indicates that the coefficients (-0.082 and -0.050, respectively) are significantly different from each other at the 1 percent level. The test of difference in coefficients is implemented through seemingly unrelated estimation of the two OLS models, followed by a Wald test of equality of the specific coefficients.

Due to the small sample sizes and evidence of inconsistent matching of suitability classification relative to the MAPS plot-level soil sample classification, results are not reported in Table 4 for highly suitable (S1) or marginally suitable (S3) categories for AFSIS. Analysis on these categories, available in annex tables for all tables from this point forward, revealed highly sensitive results contrary to expectations, including those expectations rooted in agronomic science, and, depending on the specification, implied potential yields far greater than agronomically feasible (for S1 plots). These unattainable potential yields, which are a function of low technical efficiency scores, support the descriptive finding that the geospatial data fails to sufficiently and consistently distinguish between maize-specific soil suitability relative to ground-based measures. 17 Coefficients in the overall specification are generally in line with those reported for moderately suitable (S2) plots, both for MAPS- and AFSIS-based specifications.
While AFSIS results for S1 and S3 are not reported here due to the sensitivity and inconsistency of findings (see Annex Table A3 for full reporting), there are takeaways from the comparison of MAPS S1 and S3 categories. Most apparent is the insignificant effect of cultivation pattern on S3 plots. In this category, neither the binary indicator on pure stand cultivation nor the intercropping seed rate is significant, suggesting that production on S3 plots is unchanged with cultivation pattern. On S1 plots, however, returns to cultivating pure stand maize are positive and significant, and higher in magnitude than the returns to pure stand cultivation on S2 plots. Related to seeding, the S3 specification exhibits a negative and significant coefficient on quantity of seed planted. A negative coefficient on seed application is contrary to expectations. One potential explanation could be over-seeding on these lower-suitability plots in an attempt to encourage greater production, while serving only to crowd out successful plants.
Across all specifications reported, technical inefficiency is largely unexplained by observable factors. While not statistically significant, coefficients on the technical inefficiency predictors are generally in the expected direction (Table 4). For example, the manager's completion of primary school and count of agricultural assets both exhibit negative (but insignificant) coefficients in the overall specification, thereby hinting at a reduction in technical inefficiency. In the MAPS S2 specification, manager education has a positive (and insignificant) coefficient, but its value is very near zero. The coefficient on manager's receipt of agricultural extension services is positive (but insignificant in all but the MAPS S3 specification), suggesting that use of extension services is correlated with increased technical inefficiency. It is conceivable, however, that this relationship is driven by those with lower technical efficiency self-selecting into the use of extension services. Only in MAPS S3 do we see statistically significant coefficients on household or manager characteristics: that on the household dependency ratio, which suggests that technical inefficiency on S3 plots is reduced in households with a greater dependency ratio, likely through labor channels; and that on the use of extension services. Uncertainty around rainfall patterns, proxied by the coefficient of variation in flowering season rainfall, has no significant relationship with technical efficiency in the majority of specifications. It does, however, exhibit a positive and statistically significant (at the 10 percent level) coefficient for AFSIS S2 plots, suggesting that increased uncertainty results in decreased technical efficiency.
The lack of predictive power of observable household and manager characteristics on technical inefficiency is consistent with the results of separate productivity analysis conducted via ordinary least squares regression (available in Annex Table A4), in which none of the household or manager characteristics has a statistically significant relationship with productivity, with the exception of agricultural asset counts. It is conceivable that the use of crop-cutting-based production measurement, rather than farmer-estimated production as is most commonly utilized in smallholder agricultural analysis, results in a reduced effect of these manager characteristics. By using objectively measured production data, we eradicate the noise and/or bias associated with the plot manager's estimate of production, which may be correlated with manager's education, experience, exposure to extension services, etc. However, there is little empirical evidence to support this theory, at least within this particular dataset, as Gourlay et al. (2019) find that observable manager characteristics have very little explanatory power in yield overestimation (measured as self-reported production-based yield minus crop-cutting-based yield).
Stochastic frontier analysis was also executed on subpopulations of interest, including plots with female managers, plots with male managers, and pure stand plots. Results of these analyses are available in Table 5. As the sample sizes of these subpopulations are small, it is difficult to draw conclusions on the S1 and S3 classifications, so only the overall and S2 classifications are reported. Output elasticities across plot manager gender are similar, but male-managed plots experience greater returns to soil suitability and only male-managed plots have positive and significant returns to inorganic fertilizer application on average. Additionally, the impact of weather-related uncertainty varies by plot-manager gender, with only male-managed plots exhibiting a positive and significant relationship between uncertainty and technical inefficiency (while the converse is true with respect to current season rainfall inputs in the production function). Output elasticities on pure stand plots are in line with those observed across the full sample, with the exception that inorganic fertilizer application has greater impact on the production of pure stand plots.

5.3. Technical Efficiency, Potential Gains, and Economic Implications
Technical efficiency scores, representing the distance from the potential production frontier, are computed in accordance with Equation 6. Figure 4 presents the distribution of technical efficiency scores under each suitability class for both MAPS- and AFSIS-based suitability classifications, while Table 6 summarizes the scores, the potential production (in kilograms), and the potential yields (kilograms/hectare). As with the stochastic frontier analysis, AFSIS results are only reported for the overall sample and S2 classification (see Annex Table A5 for technical efficiency scores of S1 and S3 classes).
Farmers cultivating MAPS-based S3 plots exhibit the highest technical efficiency scores, indicating they are operating most closely to the production frontier given their soil suitability. Farmers cultivating S1 and S2 plots have higher realized yields and lower technical efficiency scores, suggesting they have the potential to achieve the most significant gains in their maize yields vis-a-vis their already superior observed levels. T-tests reveal that while the technical efficiency scores of S1 and S2 plots are not different to a statistically significant degree, the scores between S1 and S3, and between S2 and S3, are statistically different at the 10 percent level. The resulting potential output per plot and potential output per hectare are significantly different at the 1 percent level, between all classifications.
The gap between mean realized yield and potential yield is 1,394 kg/ha, or 86 percent of the realized mean yield, on S1 plots. S2 plots only have the potential to increase yields by 69 percent on average, while S3 plots are constrained to 59 percent yield growth from the 828 kg/ha realized average. Figure 5 presents the potential yield gains for each of the MAPS-based classifications and AFSIS S2 plots, illustrating the greater level and percentage increase in potential yields for more highly suitable plots.

To assess the potential production and income gains under various frontier attainment scenarios, Table 7 presents potential gains in terms of both kilograms and monetary values, for the specific plot as well as household level estimates. Household estimates are derived based on total area cultivated with maize by the household, which itself is imputed via multiple imputation methods to adjust for bias in self-reported area estimates. 18 A key assumption underlying the household level estimates is that the household experiences the same level of productivity on all maize plots. The reported USD values of potential maize production gains are based on the November 2015 Famine Early Warning Systems Network (FEWSNET) Uganda Price Bulletin kilogram unit price, and do not account for any additional expenditures required to attain the increased level of productivity. 19 If suitability-class-specific productivity gaps were closed such that all farmers were operating on their respective MAPS-based productivity frontier, the best-case scenario given soil constraints, households could produce an additional 208 kg, or USD 90, per bi-annual agricultural season, on average. 20 Households classified as S3 under MAPS, assuming the suitability level is the same across all maize plots in the household, only have the potential to earn an additional USD 56 per agricultural season, while those in the highly suitable (S1) category can earn USD 171 more.
Reaching this MAPS-based class-specific frontier has asymmetric benefits for female- and male-headed households, with male-headed households earning USD 99 on average and female-headed households only USD 73.
The benefits of operating at this frontier also differ across the wealth distribution, with those in the poorest tercile having the potential to earn USD 77 per season, USD 28 less than those in the richest tercile. If soil constraints were addressed such that all households were able to operate at the S1 frontier, households on average could increase production by 549 kg, or USD 237, per bi-annual agricultural season. In this optimistic scenario, asymmetric benefits are still observed, particularly by gender of the household head: female-headed households, despite reaching the S1 frontier, still stand to gain less production than male-headed households.
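The yield-gap and household-gain figures reported above can be cross-checked with simple arithmetic. The Python sketch below uses only the rounded values quoted in the text, so results match the reported figures only approximately. The per-kilogram price is backed out from the reported pair of 208 kg and USD 90 and is an implied value, not the FEWSNET price itself, and the variable names are hypothetical.

```python
# Mean realized yields (kg/ha) under MAPS-based classification, from the text.
realized_s1 = 1614.0
realized_s3 = 828.0

# S1: reported gap between realized and potential yield (kg/ha).
s1_gap = 1394.0
s1_gain_share = s1_gap / realized_s1      # about 0.86 of realized yield
s1_potential = realized_s1 + s1_gap       # potential yield on S1 plots

# S3: reported potential gain of 59 percent over the realized mean.
s3_potential = realized_s3 * (1 + 0.59)

# Potential-yield penalty of cultivating S3 rather than S1 land.
penalty_kg_ha = s1_potential - s3_potential
penalty_share = penalty_kg_ha / s3_potential

# Implied unit price (USD/kg) from the class-specific frontier scenario,
# applied to the 549 kg gain in the S1-frontier scenario.
implied_price = 90 / 208
s1_frontier_gain_usd = 549 * implied_price

print(f"S1 gain share: {s1_gain_share:.1%}")
print(f"S1 vs S3 potential penalty: {penalty_kg_ha:.0f} kg/ha ({penalty_share:.0%})")
print(f"Implied price: {implied_price:.3f} USD/kg")
print(f"S1-frontier household gain: USD {s1_frontier_gain_usd:.0f}")
```

Small differences from the reported values (for example, in the S1 versus S3 penalty) are expected because the inputs here are rounded.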
Finally, the MAPS stochastic frontier analysis and the resulting production frontier are anchored in the current management practices and technology set used by farmers. We make an additional simulation to capture the potential economic gains that can be achieved by operating within a hypothetical, high-input use scenario, as depicted in the geospatial data on potential agricultural yields that are disseminated by the Global Agro-Ecological Zone (GAEZ) initiative. The geospatial, modeled GAEZ data on potential crop yields factor in crop growth cycles, climate factors such as rainfall and temperature, soil moisture levels, and geospatially-derived soil properties, among other factors. For each MAPS plot, the GAEZ potential maize yield under the high-input scenario was extracted based on the GPS coordinates of the plot centroid. 21 If the GAEZ high-input potential yields (as estimated for S1 plots) were to be attained on the sampled plots, the economic gains would increase dramatically, with households producing up to USD 880 per season on average. In this scenario, those in the richest tercile would enjoy greater gains over their current production levels than those in the poorest tercile. Female-headed households, on average, if operating at GAEZ high-input potential and irrespective of factors other than land quality and climatic covariates, still suffer from lower gains than male-headed households, with potential gains capped at USD 729 per agricultural season compared to USD 918.

Conclusions
In this paper we estimate a multi-dimensional measure of maize-specific soil suitability based on existing standards, across a sample of approximately 900 households, spanning 4 districts in Eastern Uganda, the leading maize-producing region in the country. This is made possible by collecting and laboratory-testing plot-level soil samples following international best practices in the context of a methodological household survey experiment. In addition to the plot-level soil data, analysis of maize-specific soil suitability is replicated using publicly available geospatial soil data. This research provides a greater understanding of both the heterogeneous productivity constraints and the potential maize-based production and income gains, across crop-specific soil suitability profiles.
Classifying the sampled plots into three suitability classes, namely highly-suitable, moderately-suitable, and marginally-suitable, and leveraging plot-level crop-cutting-based maize yields allows for comparison of the distributions of observed maize yields by suitability class. We then extend this analysis by estimating stochastic frontier models of maize yield separately for each suitability class, using both the MAPS-based and AFSIS-based suitability classifications, to understand differences in (i) returns to factors of production, (ii) technical efficiency, and (iii) potential yield measures. Compared to observed yields, the potential yield estimation provides a unique overview of maximum yield gains that can be achieved in each suitability class by increasing the efficiency with which the current set of inputs into agricultural production are utilized. Pairing the household survey data with potential yield estimates from the FAO's GAEZ database allows for an estimate of production gains if the technology set, or intensity of input use, is dramatically improved.
The results clearly illustrate the production penalties for cultivating maize on land that is not highly suitable for maize production, particularly when using MAPS plot-specific soil samples. The use of AFSIS geospatially-derived soil data provided a close approximation of the MAPS-based results on the overall sample but failed to distinguish between soil suitability classes to the same degree as, and in a consistent manner with, the MAPS plot-level soil data. The MAPS-based analysis reveals that farmers cultivating only marginally suitable land are operating with higher technical efficiency and, thus, have less room for improvement than farmers cultivating more agronomically suitable land, given the condition of their soil. This result has implications for agriculture-based poverty reduction and food security policies. Effectively, by cultivating maize on land that is only marginally suitable rather than highly suitable, farmers limit their production potential by as much as 1,694 kg/ha, or 129 percent. Extrapolating the potential yields to the household level, based on multiply-imputed total maize area per household, suggests that given the current set of inputs and soil constraints households only have the potential to increase the value of production by USD 90 per bi-annual season. Assuming equal production in both agricultural seasons, and given the average household size of 6.12 persons, this translates into a gain of USD 0.08 per capita per day on average, not considering additional expenditures that may be required to reach that production frontier. Those cultivating marginally-suitable soils can hope to earn only an additional USD 0.05 per capita per day. If soil constraints were addressed such that all households operated on highly-suitable soils, potential gains would increase to USD 0.21 per capita per day on average.
Enhancing the technology set and achieving the GAEZ high-input use potential yield on highly-suitable soil would increase gains to USD 0.79 per capita per day. Although these estimates of potential economic gains from agricultural production are only for maize, maize makes up 66 percent of cultivated land across all households.
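The conversion from seasonal household gains to per capita daily gains can be reproduced directly from the figures in the text. The Python sketch below assumes two equal agricultural seasons per year, the reported average household size of 6.12 persons, and a 365-day year; the function name and scenario labels are hypothetical.

```python
def per_capita_daily_gain(
    gain_per_season_usd: float,
    seasons_per_year: int = 2,
    household_size: float = 6.12,
    days_per_year: int = 365,
) -> float:
    """Convert a household's seasonal gain (USD) to USD per capita per day."""
    annual_gain = gain_per_season_usd * seasons_per_year
    return annual_gain / (household_size * days_per_year)


# Seasonal household gains (USD) reported in the text, by scenario.
scenarios = {
    "class-specific frontier, all households": 90,
    "class-specific frontier, marginally suitable (S3)": 56,
    "S1 frontier, all households": 237,
    "GAEZ high-input potential": 880,
}

daily_gains = {
    name: round(per_capita_daily_gain(usd), 2) for name, usd in scenarios.items()
}

for name, value in daily_gains.items():
    print(f"{name}: USD {value:.2f} per capita per day")
```

Under these assumptions the four scenarios round to the per capita daily figures quoted in the text.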
The findings hint that realizing agricultural production potential alone, given the current set of inputs and soil constraints, may not be sufficient for significant welfare gains. In order for agriculture to act as a key mechanism for poverty reduction, policies can include (i) significantly boosting the quantity and quality of inputs used by smallholder farmers, and (ii) implementing crop-specific agricultural interventions based on high-resolution soil data with the aim of increasing crop-specific soil suitability. Addressing specific soil deficiencies that render the land sub-optimally-suitable for a given crop, which can be identified with this dataset for example, can result in gains in agricultural productivity and associated income. Future research may include the analysis of interventions aimed at triggering a shift from one soil-suitability class to the next, and whether, considering the costs required for the shift, that would result in net gains for smallholder farmers. Additional work, in partnership with agronomic specialists, should be conducted on the application of an unequal weighting scheme in the suitability classification framework to confirm whether the results are sensitive to the application of equal weights to individual soil properties in this context.
From a methodological perspective, the experience of the MAPS study highlights the analytical value of integrating objective soil measurement into household surveys, while at the same time shedding light on the scalability of the current approach. The adoption of these methods in large-scale household surveys, including those conducted by national statistical offices, will likely require, or at least benefit from, more scalable tools, such as in situ soil sensors (including handheld devices) that provide real-time measures of soil attributes during the fieldwork and increase the timeliness of data collection while reducing reliance on laboratories and overall costs. However, these tools require validation in the field, especially for use in smallholder farming systems. A related approach to facilitate the cost-effective scale-up of objective soil measurement in household surveys is through reliance on sub-sampling and imputation. In this case, soils can be objectively analyzed for an intelligently-selected sub-sample of agricultural plots and imputation methods can be leveraged to predict soil attributes for the remainder of the sample, with a model that is trained on the sample with objective soil measures, complementary survey data (including subjective assessment of soil characteristics), and publicly available geospatial soil data. The validation of this approach can also be a focus of future methodological research. To the extent that more scalable approaches are developed and integrated into recurrent household surveys in low-income contexts, including longitudinal surveys, the resulting data would not only enhance the scope and accuracy of the research based on these data, but also inform downstream remote sensing applications, including on soil mapping, that would benefit from georeferenced, ground-truth measures of soil attributes.