Collecting the Dirt on Soils: Advancements in Plot-Level Soil Testing and Implications for Agricultural Statistics

Much of the current analysis on agricultural productivity is hampered by the lack of consistent, high-quality data on soil health and how it is changing under past and current management. Historically, plot-level statistics derived from household surveys have relied on subjective farmer assessments of soil quality or, more recently, publicly available geospatial data. The Living Standards Measurement Study of the World Bank implemented a methodological study in Ethiopia, which resulted in an unprecedented data set encompassing a series of subjective indicators of soil quality as well as spectral soil analysis results on plot-specific soil samples for 1,677 households. The goals of the study, which was completed in partnership with the World Agroforestry Centre and the Central Statistical Agency of Ethiopia, were twofold: (1) evaluate the feasibility of integrating a soil survey into household socioeconomic data collection operations, and (2) evaluate local knowledge of farmers in assessing their soil quality. Although a costlier method than subjective assessment, the integration of spectral soil analysis in household surveys has potential for scale-up. In this study, the first large-scale study of its kind, enumerators spent approximately 40 minutes per plot collecting soil samples, not a particularly prohibitive figure given the proper timeline and budget. The correlation between subjective indicators of soil quality and key soil properties, such as organic carbon, is weak at best. Evidence suggests that farmers are better able to distinguish between soil qualities in areas with greater variation in soil properties. Descriptive analysis shows that geospatial data, while positively correlated with laboratory results and offering significant improvements over subjective assessment, fail to capture the level of variation observed on the ground.
The results of this study give promise that soil spectroscopy could be introduced into household panel surveys in smallholder agricultural contexts, such as Ethiopia, as a rapid and cost-effective soil analysis technique with valuable outcomes. Reductions in uncertainties in assessing soil quality and, hence, improvements in smallholder agricultural statistics, enable better decision-making.


Introduction
"Noting that soils constitute the foundation for agricultural development, essential ecosystem functions and food security and hence are key to sustaining life on Earth," the UN General Assembly declared 2015 the International Year of Soils (A/RES/68/232). 1 The recent increased attention afforded to soil health is for naught, however, if soil health measurements are inaccurate or of inadequate resolution. This is especially critical in the face of increased variability in weather conditions brought on by climate change.
Renewed interest in increasing agricultural productivity to meet food security needs and increasing resilience of agricultural systems in developing countries, especially in Sub-Saharan Africa, makes understanding soil fertility constraints and trends ever more important.
Much of the current analysis on agricultural productivity is hampered by the lack of consistent, high quality data on soil health and how it is changing under past and current management. This is beginning to change, however. As soil testing methods become increasingly rapid and affordable, data constraints lessen. In Ethiopia, for example, an innovative national-scale soil mapping operation is underway. The Ethiopia Soil Information Service (EthioSIS) project, supported by the World Bank-funded Agricultural Growth Program and implemented by the Ethiopian Agricultural Transformation Agency, has begun to reveal its value (World Bank, 2016). 2 Although the project has yet to be completed at full scale, knowledge acquired through EthioSIS and disseminated by extension agents has already led to the reformulation of critical inputs and substantial increases in wheat yields (Sawa, 2016). The early successes of EthioSIS illustrate the potential agricultural gains that can be unlocked by improving the detail and geographical scope of soil data.
With an ever-expanding population and finite land resources, soils will become more and more taxed as we strive to produce sufficient food to meet the needs of the world population. Not only will there be a need to increase food production to accommodate the growing population, but at present roughly 795 million people are estimated to be undernourished, 98 percent of whom are in developing regions (FAO et al., 2015). Land, although of finite quantity, can be used more productively, as evidenced by startling yield gaps observed across the world. The magnitude of yield gaps varies significantly across crops and context. For example, Lobell et al. (2009) clearly illustrate the variation in maize yield gaps, as average tropical lowland maize yields in Africa are less than 20 percent of yield potential, while tropical lowland maize yields reach approximately 40 percent of yield potential on average in East and Southeast Asia.
Rice exhibits consistently smaller yield gaps, with average rice yields exceeding 80 percent of yield potential in Bangladesh, Indonesia, and Nepal, among others (Lobell et al., 2009). Poor soil health is commonly used to explain, at least partially, these yield gaps (Cassman, 1999; Lobell et al., 2009; Tittonell et al., 2008).
Several methods for closing yield gaps have been identified by the scientific community. According to the FAO, the use of sustainable soil management techniques, such as zero tillage and agroforestry, could boost food production by as much as 58 percent (FAO, 2015). Improved crop varieties and chemical inputs have been shown to improve productivity and/or resilience substantially (Cassman, 1999; Duflo et al., 2008). Additionally, Kumar and Quisumbing (2011) draw positive linkages between improved varieties and nutritional status. However, the uptake of such practices has been slow, particularly in Sub-Saharan Africa. Marenya and Barrett (2009) suggest that farmer demand for fertilizer varies with soil carbon level, with higher carbon content plots achieving greater marginal product of fertilizer, which implies that soil quality affects the adoption of fertilizer use. As will be illustrated in this paper, relying on subjective farmer assessments of soil quality as a proxy for carbon content may provide data unsuitable for use in targeting of fertilizer adoption programs.
Productivity has also been observed to vary with farm size. Soil quality has long been argued to explain the inverse farm-size productivity puzzle, which suggests that small farms are more productive than larger farms (Bhalla and Roy, 1988; Barrett et al., 2010; Carletto et al., 2013; Carletto et al., 2015; Lamb, 2003; Tatwangire and Holden, 2013). Despite the results from Barrett et al.'s (2010) experimental study, which concluded that soil properties did not explain away the inverse productivity relationship, much research suggests that the omission of high-quality data on soil properties is at least partially responsible for the inverse relationship (for example, Bhalla and Roy (1988)).
Yield gaps and the quantity of crop production are not the only concerns related to soils, food security, and nutrition. The quality of food produced can vary, and lack of micronutrients can lead to hidden hunger (Cakmak, 2002; FAO, 2015). With a direct link between the micronutrient content found in crops and the soils from which they grew, soil health measurement and monitoring could lead to the identification and, ideally, prevention of micronutrient malnutrition.
Agricultural analysis is multidimensional. Knowing the quantity of production alone, or even productivity, is not sufficient to analyze determinants of strong yields, estimate adoption of sustainable or improved farm management practices, or establish causal links between agriculture and nutrition. These data, however, are most readily available in household-level surveys with a focus on agriculture, such as the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA; www.worldbank.org/lsms). Historically, plot-level soil statistics derived from household surveys have relied on subjective farmer assessments of soil quality or on linking with soil raster data (when plots are geo-referenced). Direct systematic measurement of soil fertility as part of a large-scale household-level data collection operation has rarely been attempted due to the high costs of soil sampling and analysis.
Recently developed rapid low-cost technology for assessing soil characteristics using infrared spectroscopy, however, has increased the potential for direct soil fertility characterization in large studies.
The value of soil data is unquestionable, but the sources, quality, and resolution of such data vary widely.
And while platforms like EthioSIS provide invaluable information on soil from an agronomic perspective, having soil data integrated with household-level or plot-level data on input use, farm management practices, agricultural labor, agricultural production, and household socioeconomic characteristics holds extensive analytical value. Soil data from household and farm surveys also provide great opportunities for validation of information obtained through other means. However, the quality of the subjective soil data that are most often found with such inclusive agricultural household surveys has rarely been validated. In this paper, we seek to compare subjective farmer assessment of plot-level soil quality against objective laboratory analyses, by utilizing the data purposively collected for methodological validation under the LSMS Methodological Validation Program. The analysis uses a unique plot-level data set collected by the Living Standards Measurement Study (LSMS) of the World Bank in collaboration with the World Agroforestry Centre (ICRAF) and the Central Statistical Agency of Ethiopia, with funding from UK Aid. The data set consists of a menu of subjective farmer-estimated indicators of soil quality and results from objective conventional and spectral soil tests. By comparing subjective and objective measures of soil properties, this paper analyzes the impacts of relying on subjective farmer estimates of soil quality for policy-based decision making. Results from the methodological experiment data suggest that smallholder farmers are unable to clearly discriminate between soil fertility levels, which we hypothesize may partially explain the slow adoption of improved agricultural practices and inputs often observed in Africa.
Building on the few previously existing studies, such as those by Dawoe et al. (2012), Desbiez et al. (2004), Odendo et al. (2010), and Gray and Morant (2003), we aim to validate the use of subjective soil quality indicators against objective measures. Specifically, we compare a multidimensional farmer assessment of soils with plot-level soil analysis conducted using conventional and spectral testing, similar to the data used by Marenya and Barrett (2009).

The remainder of the paper is organized as follows. Section 2 details the specific subjective and objective soil data collected in the Ethiopia Land and Soil Experimental Research (LASER) project and provides descriptive statistics on each. Analytical comparison of the measurement methods is explored in Section 3, with an emphasis on the ability of respondents with various characteristics to more or less accurately assess the quality of their soils against the objective benchmark. Section 4 concludes.

LASER Study
In an effort to collect the highest quality data possible in a large-scale household survey context, the Living Standards Measurement Study has prioritized methodological research in recent years through implementation of the LSMS Methodological Validation Program. With the aim of identifying the magnitude and (potential) systematic nature of measurement error associated with various measurement methods, and with financial support from UK Aid, the LSMS has designed several methodological experiments focused on key aspects of agricultural analysis, including soil fertility. Such methodological experiments strive to find balance between quality and scalability, and ultimately implement the most appropriate methods in future surveys.
Nationally representative LSMS-ISA surveys commonly include basic subjective questions on soil fertility, often asked to the head of household or plot-manager. Additionally, when plots are georeferenced, indicators of soil health such as nutrient availability, toxicity, and salinity are derived from outside sources including the Harmonized World Soil Database and provided as supplementary data along with the full LSMS-ISA data set. However, in order to know how well the subjective assessments of soil quality correlate with true soil fertility measures, and whether there are any systematic measurement biases based on topography or respondent characteristics, the subjective measures must be taken alongside objective, plot-level measures. This was the motivation behind the Land and Soil Experimental Research (LASER) project.
Data collection for the LASER study was conducted in 3 zones of the Oromia region in Ethiopia (refer to Figure 1). Oromia region was selected because it represents a large area of Ethiopia and encompasses areas with great variation in rainfall, elevation, and agroecological zones. In total, 85 enumeration areas (EAs) were randomly selected using the Central Statistical Agency of Ethiopia's Agricultural Sample Survey (AgSS) as the sampling frame. Within each EA, 12 households were randomly selected from the AgSS household listing completed in September 2013.
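The two-stage selection described above can be illustrated with a minimal Python sketch. It assumes simple random sampling without replacement at both stages; the source does not say whether selection was stratified or weighted. The sampling frame, the identifiers, and the function name `draw_two_stage_sample` are hypothetical stand-ins, not the AgSS listing itself.

```python
import random


def draw_two_stage_sample(frame, n_eas, n_households, seed=0):
    """Select n_eas enumeration areas, then n_households within each.

    `frame` maps an EA identifier to the list of household identifiers
    listed in that EA (a hypothetical stand-in for the AgSS listing).
    Assumes simple random sampling at both stages.
    """
    rng = random.Random(seed)
    # Stage 1: draw enumeration areas without replacement.
    selected_eas = rng.sample(sorted(frame), n_eas)
    sample = {}
    for ea in selected_eas:
        households = frame[ea]
        # Stage 2: take every household if the listing is shorter than the target.
        k = min(n_households, len(households))
        sample[ea] = rng.sample(households, k)
    return sample


# Hypothetical frame: 300 EAs with 30 listed households each.
frame = {
    f"EA{i:03d}": [f"EA{i:03d}-HH{j:02d}" for j in range(30)]
    for i in range(300)
}
sample = draw_two_stage_sample(frame, n_eas=85, n_households=12)
print(len(sample), "EAs selected")
```

Fixing the seed makes the draw reproducible, which is convenient when the selected household list has to be shared with field teams.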
Fieldwork was conducted in multiple waves. Post-planting activities were conducted from September to December 2013. Post-harvest activities were conducted from January to March 2014. Crop-cutting was conducted at any point during this period when the maize was deemed ready for harvest by the respondent. The post-planting, crop-cutting, and post-harvest questionnaires were administered using computer-assisted personal interviewing.

Farmer Subjective Assessment
Prior to the collection of physical soil samples, a series of subjective plot-level questions was administered to the self-identified 'best informed' household member on each plot. These questions ranged from a categorical coded-response "what is the soil quality of [field]?" to questions on soil color, texture, and type (clay, sand, loam, etc.). It is worth noting that the subjective questions were administered at the dwelling, not upon direct respondent observation of the soils, as the study was aimed at assessing farmer knowledge for larger-scale surveys that may not allow for visitation of each plot.
Refer to Annex I for the relevant portion of the questionnaire instrument. While subjective assessments of soil quality are both cost- and time-efficient, the quality of results may be questionable. Summary statistics of the subjective questions included in the LASER study, found in Table 1, reveal little discrimination by respondents. 3 The table focuses on the sample of plots for which spectral soil analysis was completed, as this is the sample that will be compared to the laboratory results in subsequent sections, but there is also a column for the full sample of plots. When asked about the quality of soil on a particular plot, 94 percent of all plots were reported to have either good or fair soil (35 percent good, 59 percent fair). On the whole, only 6 percent of plots were reported to have poor soils.
This skewed distribution holds across administrative zone and agroecological zone, with no more than 8 percent of plots in a single agroecological or administrative zone reported as poor. This finding is not unique to the LASER data set. In nationally representative LSMS-ISA surveys from Uganda (2013-14), Malawi (2013), and Tanzania (2012-13), only 3 percent, 12 percent, and 6 percent of plots were reported as having poor soil, respectively (UBOS, 2013; Malawi NBS, 2013; Tanzania NBS, 2012).
Table notes: * Categories "White/Light" and "Yellow" were combined for analysis. ° Categories "Very Fine" and "Fine" were combined for analysis, as were "Coarse" and "Very Coarse".
Subjective assessments of soil fertility also suffer from a lack of intra-household variation. Of households with more than one cultivated field in the sample, 63 percent reported the same soil quality on all plots.
Similarly, 71 percent reported the same soil type, 73 percent reported the same soil color, and 68 percent the same soil texture. This is striking, especially given the high number of fields cultivated per household. Descriptive analysis suggests that farmers use soil color and texture as indicators of soil quality. As observed in Figure 3, self-reported dark and fine-textured soils were categorized as good soils while red and coarse-textured soils were more frequently categorized as poor soils. While the more specific subjective questions, such as texture and color, appear to be correlated with the overall quality assessments, the value of these questions in terms of correlation with objectively measured soil properties, believed to be the truest measure, remains to be analyzed. Section 3 will explore these correlations.
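The intra-household shares reported above (the percentage of multi-plot households giving the same answer on every plot) can be computed along the following lines. This is a minimal sketch on made-up records; the field names `household_id`, `quality`, and `color` and the function name are illustrative assumptions, not the LASER variable names.

```python
from collections import defaultdict


def share_uniform_households(plots, field):
    """Share of multi-plot households giving one answer for `field` on all plots."""
    answers = defaultdict(list)
    for plot in plots:
        answers[plot["household_id"]].append(plot[field])
    # Only households with more than one plot can show variation.
    multi = [vals for vals in answers.values() if len(vals) > 1]
    if not multi:
        return float("nan")
    uniform = sum(1 for vals in multi if len(set(vals)) == 1)
    return uniform / len(multi)


# Hypothetical plot records (not LASER data).
plots = [
    {"household_id": 1, "quality": "good", "color": "black"},
    {"household_id": 1, "quality": "good", "color": "red"},
    {"household_id": 2, "quality": "fair", "color": "red"},
    {"household_id": 2, "quality": "poor", "color": "red"},
    {"household_id": 3, "quality": "fair", "color": "black"},
]
print(share_uniform_households(plots, "quality"))  # 0.5
print(share_uniform_households(plots, "color"))    # 0.5
```

Household 3 has a single plot and is excluded from the denominator, mirroring the restriction to households with more than one cultivated field.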

Objective Data
Soil samples were collected from up to two randomly selected plots per household (where applicable, one pure-stand maize plot was selected for crop-cutting). The in-field sampling protocol was designed with ICRAF, adapting the Land Degradation Surveillance Framework of the African Soil Information Service to fit the smallholder farm structure. 4 From each selected plot, two samples were tested: (1) a composite sample collected from four points within the plot at 0-20 cm depth following the layout in Figure 4 (referred to as topsoil), and (2) a single sample from the center of the plot at 20-50 cm depth (referred to as subsoil). Field staff were trained by ICRAF personnel to ensure comparability of field protocols. A thorough explanation of the soil collection, processing, and analysis protocols followed in LASER is found in the guidebook by Aynekulu et al. (2016).
Soil samples were delivered to local processing laboratories within five days of collection to prevent decomposition of organic matter. Local laboratories, which were also trained on ICRAF protocols, were responsible for drying, grinding, sieving, and weighing the samples. After processing, samples were shipped to ICRAF laboratories in Nairobi, Kenya for analysis. All analyses completed by ICRAF were done following African Soil Information Service (AfSIS) protocols so as to ensure comparability of results with separate pre-existing and ongoing research in the region. On average, soil sample collection took approximately 40 minutes per field. In a replication study, also by the LSMS, this time was reduced by incorporating implementation lessons from LASER, such as using barcoded labels rather than handwritten labels (see the guidebook by Aynekulu et al. (forthcoming) for details).
4 For more information on the Land Degradation Surveillance Framework see http://www.africasoils.net/data/ldsfdescription.

Figure 3. Farmers use soil texture (left) and color (right) as indicators of soil quality.
Two objective measures were employed by ICRAF laboratories. Conventional soil analysis (CSA), which includes traditional wet chemistry methods for soil nutrient extraction and some basic soil physical analyses, was conducted on 10 percent of samples (n=361). Conventional analysis, while often regarded as the gold standard in soil analysis, is expensive and destructive in nature. Spectral soil analysis (SSA), or soil infrared spectroscopy, the second set of tests conducted under the LASER study, is significantly less expensive and non-destructive, allowing for multiple tests over time.
Soil infrared spectroscopy (IR) is an emerging technology that makes large area sampling and analysis of soil health feasible (AfSIS, 2014; Shepherd and Walsh, 2007) and overcomes the current impediments of high spatial variability of soil properties and high analytical costs, which are key challenges in monitoring soil health at a large scale (Conant et al., 2011). A review by Bellon-Maurel and McBratney (2011) showed an exponential increase in the use of near infrared (NIR) and mid-infrared (MIR) reflectance spectroscopy for soil analysis. Because spectral analysis is rapid, it greatly increases the quantity of soil samples that can be processed while also expanding the number of fundamental soil properties that can be simultaneously predicted with little increase in analytical costs. This reduces errors in quantifying soil carbon and other key properties that are often caused by spatial heterogeneity of soils. Infrared data can be integrated with geostatistical data (Cobo et al., 2010), remote sensing data and topographic information for digital soil mapping at the landscape level (Croft et al., 2012). Rossel et al. (2014), for instance, used infrared data to develop a soil carbon map of Australia.
The suite of spectral analyses includes the following tests: mid-infrared diffuse reflectance spectroscopy (MIR), laser diffraction particle size distribution analysis (LDPSA), x-ray methods for soil mineralogy (XRD), and total element analysis (TXRF). MIR and LDPSA spectral tests were conducted on all top- and sub-soil samples (n=3,611), while the x-ray tests, XRD and TXRF, were conducted on the same 10 percent on which conventional testing was executed. Ultimately, approximately 50 variables were predicted for each top and subsoil sample, containing both chemical and physical soil properties.

Predictions of Soil Properties from Spectra
Following the methods designed by Shepherd and Walsh (2002), the results of the CSA were used to predict soil properties onto the full sample based on the spectral signatures, an example of which is found in Annex II. Figure 5 illustrates the predictive power of the mid-infrared spectroscopy on key soil properties, while Table 2 summarizes selected predicted properties, disaggregated by top- and sub-soil.
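The calibrate-then-predict logic can be illustrated with a deliberately simplified Python sketch: a model is fitted on the reference subsample that has laboratory values, applied to every sample, and checked by the correlation between predicted and measured values on the reference subsample. The sketch swaps in a one-predictor least-squares line on synthetic data, which is an assumption made for brevity; it is not the Shepherd and Walsh procedure, which works with full spectra and multivariate calibration. All variable names and numbers below are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(42)

# Synthetic stand-in data: one spectral feature per sample and a carbon
# value that depends on it plus noise (not LASER measurements).
n_total = 400
absorbance = [rng.uniform(0.2, 1.0) for _ in range(n_total)]
lab_carbon = [0.5 + 4.0 * a + rng.gauss(0, 0.15) for a in absorbance]

# Only about 10 percent of samples receive conventional (wet chemistry) analysis.
reference_ids = rng.sample(range(n_total), n_total // 10)
ref_x = [absorbance[i] for i in reference_ids]
ref_y = [lab_carbon[i] for i in reference_ids]

# Calibrate on the reference subsample, then predict onto the full sample.
intercept, slope = fit_line(ref_x, ref_y)
predicted = [intercept + slope * a for a in absorbance]

# Agreement between predicted and measured values on the reference subsample.
rho = pearson([predicted[i] for i in reference_ids], ref_y)
print(f"correlation on reference sample: {rho:.3f}")
```

A correlation computed on the same samples used for calibration tends to be optimistic; holding out part of the reference subsample gives a more conservative check.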
The predictive models are successful in that, of the variables predicted, the lowest correlation between predicted value and actual value (using the reference sample upon which CSA was conducted) was 0.946 (prediction of zinc concentration using Mehlich 3 method). The highest correlation was in the prediction of aluminum concentration by TXRF, with a rho of 0.989. Key soil properties such as total carbon (percent), total nitrogen (percent), clay (percent), and pH were strongly predicted with correlation coefficients of 0.984, 0.983, 0.988, and 0.985, respectively. The near-perfect predictions lend confidence to our assumption that laboratory results obtained through spectral analysis are strong proxies for true measures. (Lorenz and Lal, 2005). Levels of all presented properties are significantly different between top- and sub-soil at the 1 percent level, with the exception of sand percentage, which is significant only at the 10 percent level. In addition to variation across soil depths, levels of key soil properties vary across administrative zone. Figure 6 illustrates the distribution of total carbon and pH by administrative zone.
Carbon levels are highest in the West Arsi zone, followed by East Wellega and Borena (means across zones significantly different at the 1 percent level). High carbon and pH variability is observed in Borena, likely due to the great variation in agroecological zones enclosed within its borders. East Wellega has more acidic soils, which could be suitable for maize production (FAO, 1983).
Figure 5. Mid-infrared spectroscopy strongly predicts multiple soil properties

Comparison of Methods
Given the complexity of soil and the varying needs of different crops and agricultural systems, assessing the overall quality of soil at an objective level can be difficult in itself. Comparing categorical subjective questions to the array of objective measurements and evaluating how well those subjective data reflect the true soil quality is even more challenging. To simplify the process, we first analyze the ability of subjective questions to predict soil carbon levels, a proxy for overall soil health. We attempt to explain which respondent and plot characteristics improve the ability of subjective questions to accurately (or relatively more accurately) assess soil quality. Subsequently, in order to incorporate more of the rich laboratory data and better capture the complex nature of the soil, we construct two variations of soil quality indices. Basic OLS regression is then used to identify which subjective questions, if any, significantly predict changes in the soil quality indicators. Finally, to reinforce the value of plot-level spectral analysis, the spectral results are briefly compared with publicly available geospatial data. All analyses are conducted using top soil (0-20 cm depth) measurements unless otherwise specified. 5

Carbon as a proxy for overall soil quality
Carbon content is often considered to be the best single indicator of soil quality (IIASA/FAO, 2012).
Higher levels of organic carbon indicate greater soil fertility and more optimal soil structure (IIASA/FAO, 2012). Carbon is also highly correlated with other key properties such as total nitrogen (with rho of 0.974 in this data set). Do farmer assessments of overall soil quality reflect carbon levels?
Descriptive analysis reveals little relation between organic carbon content (percent) and the respondent's assessment of the soil as poor, fair, or good. As seen above, 42 percent, 53 percent, and 5 percent of the household respondents classified the status of their soil as good, fair, and poor, respectively. T-test results provide weak evidence of distinction between organic (or acidified) carbon content. 6 In plots with reportedly good or fair soils, there is a greater organic carbon content than in plots reported with poor soils (difference is statistically significant at the 10 percent level). There is no significant difference in organic carbon content on plots with good and fair soils. The significant difference in organic carbon content on good and poor soils (3.36 percent and 3.10 percent, respectively) is consistent with other studies, such as Desbiez et al. (2004) and Mtambanengwe and Mapfumo (2005), but with those studies finding a greater divergence in organic carbon content between categories. 7 To better illustrate the distribution of carbon levels across self-reported soil quality categories, Figure 7 presents a scatter plot relating organic carbon, clay and silt content, and self-reported quality category (left) and box plots of carbon levels disaggregated by self-reported quality category (right). The scatter plot reveals that the soils reported as poor are not concentrated in areas with low carbon levels, but rather seemingly randomly distributed. This suggests that the local assessment on overall soil quality may not be a robust method for mapping soil quality and making decisions on potential interventions like fertilizer recommendations to improve land productivity.
5 The regression analysis found in Section 3.2 was also conducted using sub-soil results. For brevity, the results are not reported here. The findings using sub-soils are largely consistent with those using top soils; however, subjective indicators appear to be slightly better predictors of the top-soil soil quality indices. Sub-soil results are available from the authors.
6 No significant difference is found between total carbon content in plots reported as good, fair, and poor. However, correlation between total carbon and acidified carbon among top soil samples in the LASER data is very high (0.9851).
7 Both Desbiez et al. (2004) and Mtambanengwe and Mapfumo (2005) used a binary classification of plots, rather than 'good', 'fair', and 'poor'.

Figure 7. Scatter plot of organic carbon and clay/silt (%) by self-reported soil quality (left), and box plots of organic carbon by self-reported soil quality (right).
Disaggregating the self-reported indicators by respondent, geographic, and plot characteristics reveals slightly more explanation. Splitting the data into two age categories above and below 40 years (excluding the 68 observations in which the plot manager was not the respondent) shows that the younger respondents were able to differentiate between poor and good soils (p < 0.05), and between fair and poor soils (p < 0.01), but not between good and fair soils, where we define successful differentiation as a relative measure (higher carbon levels in reportedly better soils). There is no significant difference in total organic carbon levels across the three self-reported soil quality categories for the respondent age group of greater than 40 years. One might expect farmer age to be inversely correlated with education and literacy, but when disaggregating by manager literacy, there is no significant difference in organic carbon levels between subjective soil quality categories. Disaggregation by manager (and respondent) sex yields less insight. Neither male nor female manager assessments of overall soil quality discriminate by carbon level.
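The pairwise comparisons of organic carbon across self-reported quality categories rest on two-sample t-tests. The Python sketch below computes the test statistic and approximate degrees of freedom for each pair of categories. It assumes the unequal-variance (Welch) form, since the source does not state which variant was used, and the carbon values are invented for illustration only.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical organic carbon values (percent) by self-reported category.
carbon = {
    "good": [3.6, 3.1, 3.4, 3.9, 3.2, 3.5],
    "fair": [3.3, 3.0, 3.6, 3.4, 3.1, 3.7],
    "poor": [3.0, 3.4, 2.8, 3.2, 3.1],
}

for a, b in combinations(carbon, 2):
    t, df = welch_t(carbon[a], carbon[b])
    print(f"{a} vs {b}: t = {t:.2f}, df = {df:.1f}")
```

The p-value then follows from the t distribution with `df` degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind` with `equal_var=False` returns it directly. Disaggregation by age, literacy, or sex amounts to running the same comparison within each subgroup.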
Geographic characteristics, particularly the variation in soils in the immediate vicinity of the household, may play a role in the correlation between subjective assessments of overall soil quality and objectively measured indicators. Overall quality is a highly subjective and relative measure and thus, it is likely to vary with the reference set available to the farmer. That is, in areas with greater variation in soil properties, a farmer may be better able to distinguish between plots that have good, fair, and poor soil because they have a wider range of soils against which they can make comparisons. This theory is supported by the results in Table 3.
Limiting the sample to the enumeration areas in the highest and lowest quartiles of variance in organic carbon content indeed suggests that subjective assessment of overall soil quality better approximates organic carbon content in areas with greater variation. In the enumeration areas in the highest quartile of variance in organic carbon content, statistically significant differences are observed in the carbon content of soils reported as good and poor, fair and poor, and, to a lesser degree, good and fair. In enumeration areas with the lowest variance, not only are the differences only marginally significant, but reportedly poor soils have a higher mean carbon content than soils reported as good and fair. Breaking down the sample by administrative zone lends some support to the idea that variation affects the ability of farmers to rate the overall quality of their plots: Borena, the zone with the highest variance, is the only zone in which any significant difference is observed between the three categories, albeit with weak statistical significance.
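The subsetting step described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the plot records are invented, and the choice of population variance and of the default quartile method in Python's statistics module are assumptions. The sketch computes the variance of organic carbon within each enumeration area, keeps the areas at or below the first quartile and at or above the third quartile, and compares mean carbon by self-reported category in each subset.

```python
from collections import defaultdict
from statistics import mean, pvariance, quantiles

# Hypothetical records: (enumeration_area, reported_quality, organic_carbon_pct)
plots = [
    ("EA1", "good", 4.0), ("EA1", "fair", 2.5), ("EA1", "poor", 1.0),
    ("EA2", "good", 2.1), ("EA2", "fair", 2.0), ("EA2", "poor", 2.2),
    ("EA3", "good", 3.0), ("EA3", "fair", 2.4), ("EA3", "poor", 1.8),
    ("EA4", "good", 1.5), ("EA4", "fair", 1.7), ("EA4", "poor", 1.3),
]

def ea_variances(records):
    """Variance of organic carbon within each enumeration area."""
    by_ea = defaultdict(list)
    for ea, _, carbon in records:
        by_ea[ea].append(carbon)
    return {ea: pvariance(values) for ea, values in by_ea.items()}

def split_by_variance_quartile(records):
    """Return (low, high): plots in areas at or below the first quartile
    and at or above the third quartile of within-area carbon variance."""
    var = ea_variances(records)
    q1, _, q3 = quantiles(var.values(), n=4)
    low = [r for r in records if var[r[0]] <= q1]
    high = [r for r in records if var[r[0]] >= q3]
    return low, high

def mean_carbon_by_category(records):
    """Mean organic carbon for each self-reported quality category."""
    by_cat = defaultdict(list)
    for _, category, carbon in records:
        by_cat[category].append(carbon)
    return {category: mean(values) for category, values in by_cat.items()}

low, high = split_by_variance_quartile(plots)
print("high variance:", mean_carbon_by_category(high))
print("low variance:", mean_carbon_by_category(low))
```

In the invented data, the high-variance subset orders the categories as expected, while in the low-variance subset the reportedly poor plot has the highest carbon content, mirroring the pattern described in the text. A formal comparison would add a test of the difference in means, which is omitted here.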
Farm management practices and property rights may have implications for the ability of respondents to assess overall soil quality. Although there is no significant difference in organic carbon levels between plots that received and did not receive fertilizer (organic or inorganic), there is a difference in the relationship between subjective quality assessments and carbon content. On plots on which fertilizer was not used, respondents are better able to distinguish between lower and higher organic carbon levels. On these plots there is a significant difference in carbon levels between plots identified as good and poor, and fair and poor, but not between good and fair. No significant difference is found between plots of different classifications on which fertilizer was used. Similarly, plots for which the household holds a title are assessed more appropriately, again with a significant distinction between good and poor, and fair and poor, but not between good and fair. There was no significant distinction on plots without a title, which is potentially explained by reduced knowledge of plots that are not owned and perhaps have not been farmed by the respondent over multiple growing periods.
Descriptive analysis suggests that on the whole, farmers do not do well at assessing overall soil quality, at least in terms of carbon content. Above, Figure  concentration of sand as opposed to silt and clay.
The differences in sand concentration between all three categories are significantly different from zero, but the levels run in an unexpected direction: reportedly fine soils have 12.4 percent sand on average, while soils reported as between coarse and fine have 11.0 percent sand and reportedly coarse soils have 15.1 percent sand. Theoretically, soil texture does have an impact on objective soil quality, with sandy soils having less nutrient holding capacity. The impact of soil texture on soil quality indices is explored in the next section.

Soil quality indices
While carbon is commonly used as a proxy for soil fertility, it may not be the primary limiting factor of soils in the sample. To achieve a more dynamic measure of soil quality, two indices are created. The indices were constructed following the guidance set forth by Mukherjee and Lal (2014) in their comparison of three approaches to soil quality indices. A simple additive and a weighted additive approach were utilized. 8 The indices proposed by Mukherjee and Lal include three components: root development capacity, water storage capacity, and nutrient storage capacity. Data are only available for the construction of the nutrient storage component, which accounts for 40 percent of the complete weighted additive SQI. Therefore, results presented here only indicate constraints related to nutrient storage capacity.
Mukherjee and Lal use their expertise and existing literature to assign linear scores to relevant soil properties ranging from 0 to 3 based on the constraint posed by the level of the specific property (Mukherjee and Lal, 2014). These linear scores are summed to create the simple additive soil quality index (SA SQI). While Mukherjee and Lal assign scores for a multitude of soil properties, data in the LASER study allow for the inclusion of pH, organic carbon content (percent), total nitrogen content (percent), and electrical conductivity, the properties that together make up the nutrient storage capacity component. Unlike the weighted additive index (discussed below), the SA SQI is not normalized on the sample and therefore provides an indicator of overall soil quality that is not relative to the study sample.
The SA SQI ranges from 0 to 7, with a mean of 4.61.
The weighted additive index, referred to henceforth as the WA SQI, was constructed by assigning linear scores to the relevant soil properties (pH, soil electrical conductivity, organic carbon (percent), and total nitrogen (percent)), normalizing the scores for each individual property over the sample, and then applying the indicated weights and summing the scores. 9 The linear scores for each included property ranged from 0 to 1. For soil properties in which a higher value is more beneficial (carbon and nitrogen), scores were determined by dividing all observations by the highest value in the sample; for properties in which a lower value is preferred, the lowest value in the sample was divided by each observation. Soil electrical conductivity and pH have an optimal range, and these were treated as such. 10 This method follows Mukherjee and Lal (2014), who build on Karlen and Stott (1994) and Fernandes et al. (2011). WA SQI scores range from 0.32 to 0.86, with a mean of 0.46. Table 4 presents the correlation matrix of the three abovementioned soil quality indices, as well as the organic carbon content. All correlation coefficients are significant at the 1 percent level.
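The scoring logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the sample values, the optimal ranges for pH and electrical conductivity, and the equal weights are all hypothetical placeholders, and the treatment of values outside an optimal range (scored as the ratio to the nearer bound) is an assumption made only for this sketch.

```python
def score_more_is_better(values):
    """Divide each observation by the sample maximum (best value scores 1)."""
    top = max(values)
    return [v / top for v in values]

def score_less_is_better(values):
    """Divide the sample minimum by each observation (best value scores 1)."""
    low = min(values)
    return [low / v for v in values]

def score_optimum_range(values, lower, upper):
    """Score 1 inside [lower, upper]; outside, use the ratio to the nearer
    bound. This treatment is an assumption made for the sketch."""
    scores = []
    for v in values:
        if v < lower:
            scores.append(v / lower)
        elif v > upper:
            scores.append(upper / v)
        else:
            scores.append(1.0)
    return scores

def weighted_additive_sqi(scored, weights):
    """Weighted sum of the per-property scores for each plot."""
    n_plots = len(next(iter(scored.values())))
    return [sum(weights[name] * scored[name][i] for name in weights)
            for i in range(n_plots)]

# Hypothetical values for three plots.
carbon = [1.0, 2.0, 4.0]    # organic carbon, percent
nitrogen = [0.1, 0.2, 0.4]  # total nitrogen, percent
ph = [5.0, 6.5, 8.0]
ec = [0.1, 0.5, 1.0]        # electrical conductivity, arbitrary units

scored = {
    "carbon": score_more_is_better(carbon),
    "nitrogen": score_more_is_better(nitrogen),
    "ph": score_optimum_range(ph, lower=6.0, upper=7.5),  # hypothetical range
    "ec": score_optimum_range(ec, lower=0.2, upper=0.8),  # hypothetical range
}
# Placeholder equal weights; the weights applied in the study are not shown here.
weights = {"carbon": 0.25, "nitrogen": 0.25, "ph": 0.25, "ec": 0.25}

wa_sqi = weighted_additive_sqi(scored, weights)
print(wa_sqi)
```

Because each property is scored relative to the best value in the sample, the resulting index is a relative measure: adding or removing plots changes the scores of all other plots.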

Soil quality indices: Regression analysis
In order to determine how well subjective soil indicators are correlated with objective measures, including the soil quality indices and organic carbon content, basic ordinary least squares regression analysis is conducted. The primary objective of the regression analysis is to determine how well subjective soil assessments predict soil quality index measures, and which subjective questions perform best. The following model is estimated:

$$SQI = \alpha + X\beta + \varepsilon$$

where $SQI$ is one of the two soil quality indices defined above, $\alpha$ is a constant, $X$ is a matrix of subjective soil indicators, $\beta$ is the corresponding vector of coefficients, and $\varepsilon$ is a random error term with the usual desirable characteristics. Organic carbon content is also run as a dependent variable for robustness. While several factors, such as plot slope and various agricultural practices, may influence the soil quality on the plot, those covariates are excluded from the simple model presented here. The objective is not to analyze the determinants of soil quality but rather to determine how well subjective measures of soil quality predict true measures (as proxied by laboratory results). The models are first run with individual subjective indicators in order to identify how well each variable predicts the index score independently, then with all subjective variables, in order to analyze the predictive power of the subjective indicators as a whole. Note that subjective soil texture was included rather than type, as the "other, specify" category of the soil type question consisted primarily of soil colors and as such, correlation between soil type and color was a concern.
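The specification described above, an index regressed on a constant and a set of categorical subjective indicators, can be sketched in a few lines. The example is a simplified illustration with invented data and a single indicator (self-reported overall quality, dummy-coded with 'good' as the omitted category); the choice of base category and all names are assumptions, and no standard errors are computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(y, X):
    """Return (coefficients, R-squared); a constant is prepended to X."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot

def quality_dummies(category):
    """Dummy-code reported quality; 'good' is the omitted base category."""
    return [1.0 if category == "fair" else 0.0,
            1.0 if category == "poor" else 0.0]

# Hypothetical plot-level data.
reported = ["good", "good", "fair", "fair", "poor", "poor"]
sqi = [0.50, 0.54, 0.48, 0.44, 0.40, 0.44]

coef, r_squared = ols(sqi, [quality_dummies(c) for c in reported])
print("constant, fair, poor:", coef)
print("R-squared:", r_squared)
```

With only dummy variables on the right-hand side, the constant equals the mean index of the base category and each coefficient is the difference in means relative to it, which is why the sign of the coefficients on 'fair' and 'poor' is the quantity of interest.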
Results of the regression analysis are presented in Table 5. Immediately evident is the low explanatory power of the subjective indicators of soil quality, as expressed by the R², which ranges from 0.002 to 0.060. Specifications (1), (2), and (3), which look at the individual subjective indicators separately, suggest that soil color explains more of the variation in the soil quality indices and carbon content than do the subjective questions on overall quality and texture. The direction of the coefficients on red and white/light soils is as expected: they have a lower soil quality index score or organic carbon content than black soils. Coarse soils would be expected to have lower levels of nutrient availability, and therefore lower soil fertility, and this is reflected in the results, albeit with limited magnitude in the WA SQI model.
The descriptive analysis of the subjective assessment of overall soil quality inspired little confidence in its relationship with objective soil quality, at least in this particular sample. This sentiment is reflected in the regression analysis. The subjective assessment of overall soil quality had no significant relationship with the WA SQI when self-reported soil color and texture were controlled for. In the models of the SA SQI and organic carbon, the results in specification (4) suggest that soils reported as fair are of greater quality than those reported as good.
The results presented in Table 5 do not control for inter-household differences. Within the full sample, and not controlling for differences across households, the subjective indicators of soil quality do not exhibit strong predictive power of the soil quality indices or organic carbon content, but there is some relationship. Looking strictly at intra-household effects by including household fixed effects (and limiting the sample to households which had top soil samples for two plots) suggests that within households, subjective indicators have even less relationship with soil quality indices (refer to the R² values). This is potentially associated with the lack of intra-household variation of subjective indicators illustrated previously. Although the amount of variation in the soil quality indices and carbon content explained by the subjective indicators falls with the inclusion of fixed effects, there is one positive outcome. The coefficients on subjective overall soil quality gain statistical significance and move in the right direction, with self-reported poor soil possessing a negative coefficient (marginally significant in the WA SQI and carbon models, not significant in the SA SQI model), suggesting that plots may be ranked appropriately within households. Overly positive conclusions on the ability of subjective questions to reflect soil quality should not be drawn from this result, however, given the lack of intra-household variation observed and the low magnitude of the coefficients.
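The role of intra-household variation in a fixed effects specification can be illustrated with the within transformation, in which household means are subtracted from each variable before estimation. The sketch below is a simplified illustration with invented data and a single binary regressor (whether the plot was reported as poor); it is not the estimation code used for the study, and all names are hypothetical.

```python
from collections import defaultdict

def demean_within(groups, values):
    """Subtract each household's mean from its own observations."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]

def within_slope(groups, y, x):
    """Slope of y on a single regressor x after removing household means."""
    y_w = demean_within(groups, y)
    x_w = demean_within(groups, x)
    sxx = sum(v * v for v in x_w)
    if sxx == 0:
        raise ValueError("no within-household variation in x")
    return sum(a * b for a, b in zip(x_w, y_w)) / sxx

# Hypothetical data: three households with two sampled plots each.
household = ["h1", "h1", "h2", "h2", "h3", "h3"]
reported_poor = [0, 1, 0, 0, 1, 0]  # 1 if the plot was reported as poor
sqi = [0.50, 0.44, 0.60, 0.62, 0.38, 0.42]

print(demean_within(household, reported_poor))
print(within_slope(household, sqi, reported_poor))
```

Household h2 rates both plots the same way, so its demeaned regressor is zero for both plots and it contributes nothing to the estimate. This is the sense in which a lack of intra-household variation in the subjective indicators limits what a fixed effects model can identify.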
Several differences are observed in the relationship between subjective indicators and the various soil quality indices. The SA SQI, which is not normalized on the sample, exhibits a weaker relationship with subjective indicators than the WA SQI, particularly when household fixed effects are included. This is likely explained by the fact that the WA SQI is normalized on the sample and, therefore, is more a measure of relative soil quality.

Spectral Analysis & Geospatial Data
To provide further confidence in the value of conducting spectral soil analysis on plot-level soil samples, a brief comparison is made with publicly available geospatial data. Admittedly, comparison may be made with more than one source of geospatial data. However, the AfSIS data have among the highest resolutions currently available in public data sets (250m) for Ethiopia (for details see Hengl et al., 2015).
They also may be the most comparable to the LASER data in a methodological sense, considering that both were produced by ICRAF. For these reasons, the comparisons made here may present an upper bound of comparability, at least in this particular context. Values were extracted from the AfSIS geospatial data set using the GPS coordinates of the specific plots. Comparison is made between organic carbon content (percent) as measured by plot-level spectral testing and that indicated in the AfSIS map.
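Extracting a gridded value at a plot location amounts to finding the raster cell that contains the plot's coordinates. The sketch below illustrates that lookup on a toy grid; it assumes the plot coordinates have already been projected into the raster's coordinate system, and the grid values, origin, and coordinates are invented. In practice a GIS or raster library would handle projection and file access.

```python
import math

def cell_index(x, y, x_origin, y_origin, cell_size):
    """Row and column of the cell containing (x, y).
    The origin is the top-left corner of the grid."""
    col = math.floor((x - x_origin) / cell_size)
    row = math.floor((y_origin - y) / cell_size)
    return row, col

def extract_value(grid, x, y, x_origin, y_origin, cell_size=250.0):
    """Value of the grid cell containing (x, y), or None outside the extent."""
    row, col = cell_index(x, y, x_origin, y_origin, cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None

# Hypothetical 2 x 2 grid of organic carbon (percent) with 250 m cells,
# top-left corner at (0, 500) in projected metres.
carbon_grid = [
    [1.8, 2.1],
    [2.4, 1.5],
]

plot_a = extract_value(carbon_grid, 100.0, 450.0, x_origin=0.0, y_origin=500.0)
plot_b = extract_value(carbon_grid, 120.0, 400.0, x_origin=0.0, y_origin=500.0)
plot_c = extract_value(carbon_grid, 300.0, 100.0, x_origin=0.0, y_origin=500.0)
print(plot_a, plot_b, plot_c)
```

Plots A and B fall in the same 250 m cell and therefore receive the same map value regardless of any difference between them on the ground, which is one reason gridded data may capture less variation than plot-level samples.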

Conclusions
Knowledge of soil quality indicators and overall health is becoming increasingly important as food security issues become more pressing and climate change threatens to change the face of agriculture. Soil health, both perceived and actual, can have impacts on the targeting and uptake of improved agricultural practices, which can improve both the quality and quantity of food produced. For instance, Marenya and Barrett (2009) show that fertilizer effectiveness, and in turn, demand, is dependent on soil carbon content.
However, results of the LASER study suggest that subjective soil quality indicators fail to effectively reflect true levels of organic carbon, thereby limiting the value of subjective assessments of soil quality in policy making.
Certainly, asking a farmer to categorically rate overall soil quality has limited benefit. This particular subjective soil quality question does not successfully distinguish between soil carbon levels or predict soil quality index scores, at least within this sample in Ethiopia. Subjective questions on soil color and texture were more effective in predicting soil quality index scores and organic carbon content, although the low explanatory power of these variables leaves much to be desired. The value of subjective soil quality indicators is further questioned by the severe lack of intra-household variation observed. Further research validating different subjective questions, potentially formulated with soil scientists, may yield more optimistic results. However, the questions included in the LASER study are those that have been historically included in LSMS-ISA surveys in multiple countries.
From a fieldwork implementation standpoint, the experience of the LASER study gives promise that the integration of soil spectroscopy into socioeconomic household panel surveys is feasible. The methodology is a relatively rapid and cost-effective soil measurement technique that could unlock further understanding of the effects of farm management practices and changes in soil health over time. Detailed guidance on implementation strategies and protocols implemented in the LASER study can be found in Aynekulu et al. (2016).
Despite the weak correlation observed here between laboratory analysis and subjective assessment, several studies find subjective assessments of soil quality to be a significant determinant of plot-level productivity (for example, Carletto et al., 2013). This suggests that if subjective soil quality assessments are not capturing true soil properties, they must be capturing something else relevant to agricultural production. As a potential explanation for this, we echo the sentiments of Tittonell et al. (2008) and others, who suggest that farmers have a 'holistic' view of soils, and that rather than assessing the soil properties explicitly, they often incorporate other components such as overall agricultural productivity and likelihood of crop theft. Such a view would indeed render subjective assessments of soil quality significant predictors of agricultural productivity, while leaving true soil quality largely omitted.
Additional research is needed (and ongoing) to determine the effects of including these objectively measured soil properties in productivity analysis as opposed to, or in addition to, subjective assessments.
Additionally, while a brief comparison of plot-level spectral analysis and AfSIS geospatial data was included to illustrate the ability of plot-level analysis to capture a greater degree of variation within small areas, further research in this arena would be valuable, including, for instance, a comparison of LASER results with the EthioSIS national soil map. Geospatial data on soil quality have been recently compared with subjective data by Kelly and Anderson (2016), who find a similar pattern in that farmers are often over-optimistic about the fertility of their soils with respect to the Harmonized World Soil Database. This line of work could be extended to include plot-level soil analysis and further validate the need for objective plot-level analysis.
Ethiopia is poised to benefit greatly from advancements in soil testing, particularly with the rollout of projects like EthioSIS combined with the upscaling of data collection efforts at the farm household level.
The results of the LASER study, which bring subjective estimates of soil quality under scrutiny and point to the need for more direct, yet practical, soil measurements, show the potential value of the complementarities between platforms like EthioSIS and household-level data collection, through which accurate soil information can be made available as part of rich data sets on the socioeconomic conditions and farming practices of farming units. Soil data collection through household and farm surveys may also provide a much-needed vehicle to ground-truth remote sensing information and calibrate soil models. In this vein, fostering stronger linkages between national EthioSIS soil data and surveys like the Ethiopian Rural Socioeconomic Survey, a household panel survey supported by the LSMS-ISA, offers great opportunities from the research and operational perspectives.
Evidence from the Ethiopia LASER study suggests that subjective farmer assessments of soil quality poorly explain objective laboratory results and lack intra-household variation. Spectral analysis has been proven to near-perfectly predict key soil parameters as measured by conventional wet chemistry methods while providing highly detailed data that can be useful in policy aimed at increasing agricultural output, such as fertilizer input programs and identifying optimal crop selection, as well as agricultural productivity analysis. Improving agricultural statistics by reducing the uncertainties in soil quality assessment via objective measurement can enable better decision-making, both at micro and macro levels.