Income Diversification Patterns in Rural Sub-Saharan Africa: Reassessing the Evidence

Is Africa's rural economy transforming as its economies grow? This paper uses comparable income aggregates from 41 national household surveys from 22 countries to explore the extent of income diversification among rural households in Sub-Saharan Africa, and to look at how income diversification in Sub-Saharan Africa compares with other regions, taking into account differences in levels of development. The paper also seeks to understand how geography drives income diversification, focusing on the role of agricultural potential and distance to urban areas. The countries in the African sample have higher shares of on-farm income (63 versus 33 percent) and lower shares of nonagricultural wage income (8 versus 21 percent) compared with countries of other regions. Specialization in on-farm activities continues to be the norm in rural Africa (52 percent of households, versus 21 percent in other regions). In terms of welfare, specialization in nonagricultural income-generating activities stochastically dominates farm-based strategies in all of the countries in our African sample. Crop income is still important for welfare, however, and even at higher levels of household income, crop activities continue to play an important complementary role. Regardless of distance and integration in the urban context, when agro-climatic conditions are favorable, farming remains the occupation of choice for most households in the African countries for which the study has geographically explicit information. When urban integration is low and agricultural conditions more difficult, the picture is mixed, with households more likely to engage more fully in nonfarm activities in Niger and Malawi, but less likely to do so in Uganda and Tanzania.


I. Introduction
As a share of aggregate output, agriculture declines with overall growth in GDP per capita as countries undergo the structural transformation that accompanies economic development (Chenery and Syrquin, 1975). In rural areas of developing countries, the relative reduction of the importance of agriculture and the expansion in rural non-farm (RNF) activities and income diversification are likely features of the process of economic development. Growth in RNF activities cannot be seen in isolation from agriculture, however, as both are linked through investment, production, and consumption throughout the rural economy, and in relation to urban centers, and both form part of complex livelihood strategies adopted by rural households. Better incentives for agriculture during the past decade, via the improvement of the policy environment and better terms of trade, provide a more conducive environment for higher agricultural growth and an opportunity for the much awaited structural transformation in Africa (Binswanger-Mkhize, McCalla, and Patel 2010). According to the IMF (2012), structural transformation has been taking place in Africa, but at a slow pace.
A rather large body of literature has developed over the last 20 years investigating the extent and determinants of rural household income diversification in the developing world, the importance and features of rural non-farm income and employment, and the determinants of households' participation in and returns to different income-generating activities (FAO, 1998; Barrett et al., 2001; Haggblade et al., 2007; Winters et al., 2009; Davis et al., 2010; Winters et al., 2010). The 2007 World Development Report on agriculture and the 2011 IFAD Rural Poverty Report also devoted much attention to these themes. A major conclusion of these studies is that rural household income diversification is the norm rather than the exception, and that while endowments (physical, human, natural capital) and wealth play a role in driving engagement in different economic activities, some degree of diversification off the farm is common at all levels of welfare. Due to data limitations, however, the question remains as to whether this is occurring in Africa, a latecomer to the process of structural transformation. Conventional wisdom would have it that rural households in Sub-Saharan Africa are primarily employed in agriculture, with relatively little agricultural wage labor, and even less non-agricultural wage labor due to limited industrialization.
Less touched in the literature is the role of geography in determining rural income diversification patterns. Why does location matter? Deichmann et al. (2008) identify two main strands of literature that help frame the arguments around location and income diversification. First, one key empirical regularity of the rural farm/non-farm employment (and income) literature is that at very low levels of development, non-farm activities tend to be closely related to agriculture. When agricultural growth starts taking off (e.g. due to technical change), so does the non-farm economy, thanks to the backward and forward linkages from agriculture.
Such growth patterns are likely not to be location neutral, as the potential for agricultural growth (e.g. agro-climatic conditions) and demand for agricultural products are not randomly allocated across space. Over time endogenous sectoral growth biases may play a role, as infrastructure and other investments may tend to locate where growth is occurring, leading to increased spatial disparities in growth patterns. In Latin America this has attracted considerable attention for instance in the context of the debate on the 'territorial approach' to rural development (De Ferranti et al., 2005). As sectoral policies are likely to have differential impacts across space, explicitly incorporating spatial issues into policy design can help counter territorial distortions in development patterns.
The second key strand of literature is the new economic geography debate, which focuses on the extent to which geography, as opposed to institutions, explains differential development outcomes. One main tenet of that debate is that even if soil quality and climate were the same everywhere, location would still matter. On the one hand, dispersion of economic activities occurs as firms tend to locate in areas with lower wages, and the production of non-tradable goods and services locates close to demand. Activities connected to non-mobile inputs (such as agricultural land) are by definition going to be spread over space to some extent. On the other hand, agglomeration pushes businesses to locate close to consumers or to the source of raw material. Businesses depending on mobile inputs but with higher transport costs for their outputs would tend to have the highest gains from concentrating in particular locations.
Moreover, the location of economic activities across space may be nonlinear. Fafchamps and Shilpi (2003) find for instance that in Nepal agricultural wage employment is concentrated in rural areas close enough to cities to specialize in high-value horticulture, but not so close as to be taken over by unskilled 'urban' wage labor opportunities. Non-linearities may also be relevant when city size is found to matter for engagement in non-farm activities (Fafchamps and Shilpi, 2003) or for poverty reduction (Christiaensen et al., 2013). Also, specialization may need a particular market size or specific types of markets to kick in (Fafchamps and Shilpi, 2004).
Agricultural potential and distance may interact in determining locational advantage, occupational choices and returns to economic activities. Yamano and Kijima (2010) for Uganda and Deichmann et al. (2008) for Bangladesh both hint at ways in which the role of agricultural potential in determining household productive choice changes between more or less connected areas. Finally, different patterns of urbanization (megacities versus growth in small towns) may be associated with, or drive, development outcomes.
Taken together, these arguments and evidence make clear that exogenous physical location, the interaction between sectors (and factor markets), and endogenous policy choices (infrastructural as well as sectoral) combine in complex ways, making it far from straightforward to predict the spatial location of economic activities in rural areas.
Taking advantage of newly available data, this paper seeks to gauge the extent of income diversification among rural households in Sub-Saharan Africa and how this diversification compares with that in other regions, taking into account differences in levels of development. We also seek to understand the role of geography in the level of income diversification, focusing on the role of agricultural potential and distance to urban areas.
To answer these questions, we use comparable income aggregates from 41 national household surveys with good quality income data from 22 countries from all developing regions, constructed as part of FAO's Rural Income Generating Activities (RIGA) project. The initial exploration of the RIGA database (Winters et al., 2009; Davis et al., 2010; Winters et al., 2010) highlighted a number of regularities concerning households' patterns of income diversification in developing countries, and hinted at a specificity of the Sub-Saharan African countries included in that database, which stood out as the only countries for which specialization in farming, as opposed to holding a diversified income portfolio, was the norm.
That analysis, however, was based on data for only four countries in Sub-Saharan Africa (Madagascar, Malawi, Nigeria, Ghana). This paper takes advantage of more recent data from some of the same countries and data on an additional five countries (Ethiopia, Kenya, Niger, Tanzania, Uganda), collected as part of the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) 1 project. This new set of countries accounts for 51 percent of the Sub-Saharan African (SSA) population in 2012, as opposed to 26 percent in the initial RIGA sample. While caution is still warranted in treating this sample as representative of SSA as a whole, its coverage is arguably much more complete. We also take advantage of two defining features of the LSMS-ISA datasets, the geo-referencing of households and the focus on agricultural activities, in order to analyze the role of geography in income diversification.
The paper continues as follows. In Section II, we present and describe the construction of the RIGA database. In Section III, we analyze the participation of rural households in income-generating activities and the share of income from each activity in household income, over all households and by expenditure quintile. We then move from the level of rural space to that of the rural household, examining patterns of diversification and specialization in rural income-generating activities, again over all households, and by expenditure quintile. We also use measures of stochastic dominance to characterize the relationship between types of income and income-generating strategies and welfare. In Section IV, we examine the role of location in income generation strategies in a multivariate framework, and we conclude in Section V.

II. The Data
The RIGA database
The RIGA database is constructed from a pool of several dozen Living Standards Measurement Studies (LSMSs) and from other multi-purpose household surveys made available by the World Bank through a joint project with the FAO. 2 The most recent additions are the LSMS-ISA project countries (see complete list in Table 1). Each survey is representative for both urban and rural areas; only the rural sample was used for this paper. 3 While clearly not representative of all developing countries, or all of Sub-Saharan Africa, the list does cover a significant range of countries, regions, and levels of development and has proven useful in providing insight into the income-generating activities of rural households in the developing world. 4
2 Information on the RIGA database can be found at: http://www.fao.org/economic/riga/en/.
3 Each country has its own definition of rurality, and government definitions that are not comparable across countries may play some part in explaining cross-country differences. While we recognize that variation in country-specific definitions of rural may explain observed differences in income composition, the available survey data do not allow for straightforward construction of an alternative measure across all countries. We thus use the government definition of what constitutes rurality. Further, rurality is identified via household domicile, not the location of the job; a number of labor activities identified as rural may actually be located in nearby urban areas.
4 Details of the construction of the income aggregates can be found in Carletto et al. (2007).

Table 1. Countries included in the analysis
Following Davis et al. (2010), income is allocated into seven basic categories: (1) crop production; (2) livestock production; (3) agricultural wage employment; (4) non-agricultural wage employment; (5) non-agricultural self-employment; (6) transfers; and (7) other. 5 Non-agricultural wage employment and non-agricultural self-employment income have been further disaggregated by industry using standard industrial codes, though we do not take advantage of this disaggregation in this study.
The seven categories of income are aggregated into higher level groupings depending on the type of analysis. One grouping distinguishes between agricultural (crop, livestock, and agricultural wage income) and non-agricultural activities (non-agricultural wage, non-agricultural self-employment, transfer, and other income), and in a second, crop and livestock income are referred to as on-farm activities, non-agricultural wage and self-employment income as non-farm activities, and agricultural wage employment, transfer, and other income are left as separate categories. Finally, we also use the concept of off-farm activities, which includes all non-agricultural activities plus agricultural wage labor.
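As an illustration of the groupings just described, the following minimal sketch maps the seven income categories into the higher-level aggregates. The category labels (`crop`, `ag_wage`, and so on) and the `aggregate` helper are hypothetical names chosen for this example and are not taken from the RIGA database.

```python
# Seven basic income categories named in the text (labels are illustrative).
CATEGORIES = ["crop", "livestock", "ag_wage", "nonag_wage",
              "nonag_self", "transfer", "other"]

# First grouping: agricultural versus non-agricultural activities.
AG_NONAG = {
    "agricultural": ["crop", "livestock", "ag_wage"],
    "non_agricultural": ["nonag_wage", "nonag_self", "transfer", "other"],
}

# Second grouping: on-farm, non-farm, and three categories kept separate.
FARM_NONFARM = {
    "on_farm": ["crop", "livestock"],
    "non_farm": ["nonag_wage", "nonag_self"],
    "ag_wage": ["ag_wage"],
    "transfer": ["transfer"],
    "other": ["other"],
}

# Off-farm: all non-agricultural activities plus agricultural wage labor.
OFF_FARM = AG_NONAG["non_agricultural"] + ["ag_wage"]


def aggregate(income, grouping):
    """Sum a household's income by group; missing categories count as zero."""
    return {
        group: sum(income.get(category, 0) for category in members)
        for group, members in grouping.items()
    }
```

Keeping the groupings as data rather than hard-coded sums makes it straightforward to switch between the aggregations used in different parts of an analysis.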
Income shares can be analyzed as the mean of income shares or as the share of mean income. In the first instance, income shares are calculated for each household, and the mean of the household shares is then taken for each income category. In the second case, income shares are calculated as the share of a given source of income in the total income of a given group of households. The two measures have different meanings. The mean of shares more accurately reflects a household-level diversification strategy, regardless of the magnitude of income. The share of means reflects the importance of a given income source in the aggregate income of rural households in general or for any given group of households. The two measures will give similar results if the distribution of the shares of a given source of income is constant over the income distribution, which is clearly not always the case. If, for example, those households with the highest share of crop income are also the households with the highest quantity of crop income, then the share of crop income in total income (over a given group of households) using the share of means will be greater than the share using the mean of shares. Since the household is our basic unit of analysis, we use the mean of shares throughout this paper.
5 All income is net; agricultural income values all production, both consumed on farm and marketed; transfers include both public and private sources, such as remittances; and other covers a variety of non-labor sources of income, such as rental income or interest from savings.
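The difference between the two measures can be made concrete with a small numerical sketch. The two households and their incomes below are invented for illustration, and the functions assume strictly positive total income and no survey weights, neither of which need hold in actual survey data.

```python
def mean_of_shares(households, source):
    """Average, across households, of each household's own share from `source`."""
    shares = [h[source] / sum(h.values()) for h in households]
    return sum(shares) / len(shares)


def share_of_means(households, source):
    """Share of `source` in the pooled income of the whole group of households."""
    total_source = sum(h[source] for h in households)
    total_income = sum(sum(h.values()) for h in households)
    return total_source / total_income


# Hypothetical data: the household with the larger crop share also has
# the larger amount of crop income.
households = [
    {"crop": 100.0, "nonfarm": 100.0},  # crop share 0.5
    {"crop": 900.0, "nonfarm": 100.0},  # crop share 0.9
]

print(mean_of_shares(households, "crop"))   # household-level average share
print(share_of_means(households, "crop"))   # aggregate share, larger here
```

In this example the share of means exceeds the mean of shares, because the household with the highest crop share also contributes the most crop income to the pooled total.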
To analyze the spatial patterns of income diversification in our sample of LSMS-ISA data sets, we use a set of geo-referenced variables from external data sets that can be linked to household-level data via their GPS attributes (Murray, 2013). These variables are available for a subset of the Sub-Saharan African countries in the dataset: Ethiopia, Malawi, Niger, Nigeria, Tanzania, and Uganda. First, we use an aridity index as a proxy for agricultural potential. The aridity index is defined as the ratio between mean annual precipitation and mean annual potential evapotranspiration, so that a higher value of the index identifies wetter areas. 6 This is a purely physical, exogenous indicator that reflects long-term conditions in a locality. We maintain that it is superior for this use to alternatives that embed the profitability or value of agricultural production in a given area, as those incorporate contingent factors such as prices and terms of trade. In this application we value the fact that the aridity index is truly exogenous.
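A minimal sketch of the index as defined above follows. The function name, the millimetre units and the two example localities are assumptions made for illustration; the actual index used in the analysis comes from an external geo-referenced data set.

```python
def aridity_index(mean_annual_precip_mm, mean_annual_pet_mm):
    """Ratio of mean annual precipitation to mean annual potential
    evapotranspiration; higher values identify wetter areas."""
    if mean_annual_pet_mm <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    return mean_annual_precip_mm / mean_annual_pet_mm


# Hypothetical localities: a humid area and a dry area.
humid = aridity_index(1200.0, 1000.0)  # precipitation exceeds PET
dry = aridity_index(300.0, 1500.0)     # precipitation well below PET
print(humid, dry)
```

Because both inputs are long-run climatic averages, the ratio does not move with prices or terms of trade, which is the property the text relies on.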
Second, we proxy market access, distance and agglomeration effects with variables that measure the Euclidean ('crow-fly') distance to cities of 20, 100 and 500 thousand and 1 million inhabitants. We choose this measure because of a concern with the potential endogeneity of travel time measures: roads and travel infrastructure may be built in response to agricultural production or potential (Fafchamps and Shilpi, 2005; Deichmann et al., 2008). The Euclidean distance is independent of travel infrastructure, but provides a good measure of the spatial dispersion of households with regard to urban populations.
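The following sketch shows one way a straight-line distance variable of this kind could be computed from household and city coordinates. It uses the haversine great-circle formula as a stand-in for the 'crow-fly' distance, treats each population figure as a minimum threshold, and uses invented city records; all three are assumptions of this example rather than details confirmed by the source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest_city(hh_lat, hh_lon, cities, min_population):
    """Distance to the closest city with at least `min_population` inhabitants,
    or None if no city in the list is large enough."""
    eligible = [c for c in cities if c["population"] >= min_population]
    if not eligible:
        return None
    return min(great_circle_km(hh_lat, hh_lon, c["lat"], c["lon"])
               for c in eligible)


# Hypothetical cities and a household located at the origin.
cities = [
    {"name": "small town", "lat": 0.0, "lon": 1.0, "population": 30_000},
    {"name": "large city", "lat": 0.0, "lon": 3.0, "population": 1_200_000},
]
for threshold in (20_000, 100_000, 500_000, 1_000_000):
    print(threshold, distance_to_nearest_city(0.0, 0.0, cities, threshold))
```

Computing one distance per population threshold yields a set of variables that can capture both proximity to any town and proximity to a large agglomeration.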

III. The diversification of income sources in rural Sub-Saharan Africa
We look first at household participation 7 in and the share of income from rural income-generating activities (participation rates can be found in Table 2, while shares can be found in Table 3). First and foremost, nearly all rural households in the countries of our sample are engaged in own-account agriculture. This is true both in Africa (92 percent on average) and in other regions (85 percent), with some form of on-farm activity persisting even at higher levels of GDP (Figure 1). While for some households the importance of this participation is relatively minor, since it includes consumption of a few animals or patio crop production, agriculture continues to play a fundamental role in the rural household economic portfolio. In this and the figures that follow, we have added a separate trend line for the African countries in the sample. 8
Overall, the share of non-agricultural income among rural households increases with increasing levels of GDP per capita (Figure 5). The importance of on-farm (crop and livestock) sources of income gradually decreases (Figure 6) as they are replaced by non-agricultural wage income (Figure 7) and public and private transfers (Figure 8). In our sample of African countries the largest share of income from non-farm sources is recorded in Nigeria (40 percent) and the lowest in Ethiopia (6 percent). Transfer income shares are highest in Kenya (19 percent) and lowest in Nigeria (1 percent), and within this range several countries record substantial shares of 9-10 percent, which is consistent with the increasing role of cash transfer programs in Africa and with the documented importance of migrant remittances, both from urban areas and from abroad. Broadly speaking, these values are comparable to the ranges observed in non-African countries.
Lastly, African and non-African countries do not appear to be dissimilar in terms of participation in, and shares of income from, non-agricultural self-employment (Figures 10 and 11), where there does not appear to be any clear association with GDP levels.

The diversification of income sources by wealth status
The previous section illustrated the diversified nature of the rural economies in all the countries of our sample, including those of Sub-Saharan Africa. There is also likely to be significant variation in the returns to the different activities. The available literature shows that, for both agricultural and non-agricultural income-generating activities, there is often a high productivity/high return subsector, confined mostly to privileged, better-endowed groups in high potential areas. These high return activities usually have significant barriers to entry or accumulation in terms of land, human capital, and other productive assets. These entry barriers may prevent more marginalized households from taking advantage of the opportunities offered by the more dynamic segments of the rural economy. The importance of entry barriers may derive from a combination of household inability to make investments in key assets and the relative scarcity of economic activities with low capital requirements in rural areas (Reardon et al., 2000).
In contrast, a low productivity segment usually serves as a source of residual income or subsistence food production and as a refuge for the rural poor. This covers activities such as subsistence agriculture, seasonal agricultural wage labor, and various forms of off-farm self-employment. These typically informal activities may provide a last resort for food security, helping to reduce the severity of deprivation and avoid more irreversible processes of destitution. 9 High and low return activities within farm and nonfarm sectors may feed into each other. For those with few assets, seasonal and insufficient income from subsistence agriculture, and no access to liquidity or credit, poorly remunerated off-farm activities may be the only available option. Households that are able to overcome financial or asset constraints may diversify or specialize in agricultural and non-agricultural activities, depending not only on access to specific assets but also on household demographic characteristics and the functioning of local labor and credit markets. The observed dualism also often appears to be drawn along gender lines, with women more likely to participate in the least remunerated agricultural and non-agricultural activities. Given the existence of both low and high return rural income-generating activities, with varying barriers to access, previous empirical studies have shown a wide variety of results in terms of the relationship of rural income-generating activities, and in particular RNF activities, to poverty (FAO, 1998; Lanjouw, 1999; Elbers and Lanjouw, 2001; Adams, 2001 and 2002; Isgut, 2004; de Janvry, Sadoulet, and Zhu, 2005; Lanjouw and Shariff, 2002).
The country case study literature suggests that households participating in higher-return RNF activities are wealthier and have more upward income mobility (Barrett et al., 2001, among others), a relationship that holds up in cross-country studies and across increasing levels of development (Winters et al., 2010). Recent literature focuses on the dynamics of household participation in RNF activities. Within this literature, households able to accumulate capital, increase adult labor or increase access to credit and savings are found to be more able to access high-return RNF activities. Chawanote and Barrett (2013) find the existence of an "occupational ladder" in rural Thailand in which transitions into the RNF economy are associated with increased income and transitions into farming with reduced income.
To explore the relationship across countries between rural income-generating activities and wealth, for each country we examine activities by expenditure quintile. Figure 12a charts income shares by expenditure quintile for all countries in the African sample. Focusing on on-farm activities, shown in the darkest color, we see a sharp decrease in the share of on-farm income with increasing levels of wealth, dropping from around 50 percent of income in the poorest quintile in most countries to less than 20 percent in the wealthiest quintile. The drop in on-farm sources of income is made up by the increasing importance of off-farm (non-agricultural wage and self-employment) sources of income for wealthier rural households. The clear trend evident from the countries in the African sample is not as clear in the non-African countries in Figure 12b. Here Bangladesh, Bulgaria, Nepal, Pakistan and Tajikistan show the opposite trend: the share of on-farm activities increases with wealth.
9 See Lanjouw and Feder (2001), among others, for a general discussion relevant to non-farm activities, and Fafchamps and Shilpi (2003) for Nepal and Azzarri, Carletto, Davis, Fatchi, and Vigneri (2006) for Malawi, for example, regarding the role of agricultural wage labor.
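The tabulation by expenditure quintile described above can be sketched as follows. The record layout (`expenditure`, `income`) and function names are hypothetical, and the sketch ignores survey weights and assumes positive total income, so it is an illustration of the logic rather than the procedure used for Figure 12.

```python
def assign_quintiles(expenditures):
    """Return a quintile label (1 = poorest, 5 = wealthiest) for each value,
    splitting the ranked observations into five groups of near-equal size."""
    n = len(expenditures)
    order = sorted(range(n), key=lambda i: expenditures[i])
    quintiles = [0] * n
    for rank, i in enumerate(order):
        quintiles[i] = rank * 5 // n + 1
    return quintiles


def mean_share_by_quintile(households, source):
    """Mean of household-level income shares from `source`, by quintile."""
    quintiles = assign_quintiles([h["expenditure"] for h in households])
    sums, counts = {}, {}
    for h, q in zip(households, quintiles):
        total = sum(h["income"].values())
        if total <= 0:
            continue  # shares are undefined without positive total income
        share = h["income"].get(source, 0.0) / total
        sums[q] = sums.get(q, 0.0) + share
        counts[q] = counts.get(q, 0) + 1
    return {q: sums[q] / counts[q] for q in sorted(sums)}
```

Applied to a data set in which poorer households rely more heavily on farming, the returned dictionary would show the on-farm share falling from quintile 1 to quintile 5.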

Figure 12b
On the other hand, participation in, and shares of income from, agricultural wage labor show for the most part a negative correlation with the level of expenditure, for both African and non-African countries. With the exception of those countries that have negligible agricultural wage labor markets, poorer rural households tend to have a higher rate of participation in agricultural wage employment. Similarly, the share of income from agricultural wage labor is more important for poorer households in these countries, and the relationship holds regardless of the level of development.

Diversification and specialization
The results presented thus far suggest that rural households employ a wide range of income-generating activities, though perhaps rural households in the African countries in our sample are more dependent on agriculture than rural households in other countries. The question remains, however, whether households specialize in activities, with diversity in activities across households in the rural space, or whether households themselves diversify income-generating activities. We examine the degree of specialization and diversification by defining a household as specialized if it receives more than 75 percent of its income from a single source and diversified if no single source is greater than that amount. 10,11
Figure 13.
Figure 14.
10 Other definitions of diversification and specialization are possible. Davis et al. (2010) used 100% and 50% of income from a single source as alternative thresholds in order to examine robustness. They find that the extent of diversification is affected by the choice of threshold: it drops to around 10% or less in all cases under the 50% definition of specialization and climbs to around 90% under the 100% definition. The broad patterns by country and by level of welfare, however, did not change with the choice of threshold. Alternative groupings of income categories are also possible, such as joining together agricultural and non-agricultural wage labor, or non-agricultural wage labor and non-agricultural self-employment, which would increase the share of households specializing in these new categories.
11 Note that we are constrained from delving into the details of diversification by how household survey data are often collected. The apparent diversification may derive from aggregation across seasons (with seasonal specialization by households) or across individuals (with specialization by individual household members).
Among rural households in the countries of our African sample, specialization in on-farm activities continues to be the norm (practiced by 52 percent of households on average), with anything between one-third (Kenya) and 83 percent (Ethiopia) of households generating in excess of 75 percent of their income from farming alone (Table 4). Among all countries, with the exception of Niger, a majority of households specialize in on-farm activities. This result is quite different from the non-African households in our sample of countries, where only 21 percent of households on average specialize in farming. Within this group, the majority of households specialize in on-farm activities in only two countries, and diversification is rather the norm (45 percent of households fall in the diversified categories, on average).
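The 75 percent rule used to separate specialized from diversified households can be sketched as below. The function names and the example households are invented for illustration; the sketch assumes non-negative income components with a positive total, and leaves the threshold as a parameter so that alternative cut-offs can be explored.

```python
from collections import Counter


def classify_household(income_by_source, threshold=0.75):
    """Return the name of the source providing more than `threshold` of total
    income, or 'diversified' if no single source does."""
    total = sum(income_by_source.values())
    if total <= 0:
        return None  # shares are undefined without positive total income
    source, amount = max(income_by_source.items(), key=lambda kv: kv[1])
    return source if amount / total > threshold else "diversified"


def strategy_shares(households, threshold=0.75):
    """Share of households in each specialization or diversification class."""
    labels = [classify_household(h, threshold) for h in households]
    labels = [label for label in labels if label is not None]
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Hypothetical households.
households = [
    {"on_farm": 90, "nonag_wage": 10},  # specialized in on-farm
    {"on_farm": 50, "nonag_wage": 50},  # diversified
    {"on_farm": 10, "nonag_wage": 90},  # specialized in non-agricultural wage
    {"on_farm": 80, "transfer": 20},    # specialized in on-farm
]
print(strategy_shares(households))
```

Note that the comparison is strict, so a household with exactly 75 percent of income from one source is counted as diversified, following the "more than 75 percent" wording.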
The relative differences between the African and non-African countries with increasing levels of per capita GDP can be seen in Figures 13 and 14. Rural households in the African countries are clustered above the trend line in the former graph, and below the trend line in the latter.
When rural households in non-African countries do specialize, in a majority of cases this specialization is in on-farm activities, although the percentages become lower the higher the per capita GDP. At higher GDP levels specialization in non-agricultural wage becomes more important (Figure 15), for both African and non-African countries. No distinct association between GDP levels and specialization in agricultural wage or self-employment is apparent for non-African countries, while for African countries the share appears to increase (Figure 16). Taken together, these observations suggest that the main transition is one in which a high reliance on farming gradually gives way to a greater reliance on non-farm wage employment, with non-farm self-employment the activity of choice for a more or less constant share of households as development occurs. This essentially confirms the trends observed in the crude income shares data (Figures 5-11 above).
Interestingly, only one of the African countries in our sample has more than 5 percent of households specializing in transfer income (Kenya, with 9 percent). In non-African countries, observing more than 5 percent of households receiving more than three quarters of their earnings from transfers is not at all uncommon. It is difficult to draw firm conclusions from these observations, as transfer income is a mixed bag of several sources (social protection programs, pensions, migrant remittances and more) with very different institutional and socio-economic determinants. It is however worth flagging how very few African households rely mostly on these sources of income for their livelihoods. Despite widespread migration and expanding social programs, productive occupations are what keep most households afloat.
A rural household may have multiple activities for a variety of reasons: as a response to market failures, such as in credit markets, and thus earning cash to finance agricultural activities, or insurance markets, and thus spreading risks among different activities; failure of any one activity to provide enough income; or different skills and attributes of individual household members. Diversification into rural nonfarm activities can thus reflect activities in either high or low-return sectors. Rural nonfarm activities may or may not be countercyclical with agriculture, both within and between years, and particularly if not highly correlated with agriculture, they can serve as a consumption smoothing or risk insurance mechanism. Thus, the results raise the question of whether diversification is a strategy for households to manage risk and overcome market failures, or whether it represents specialization within the household in which some members participate in certain activities because they have a comparative advantage in those activities.
If the latter is the case and it tends to be the young who are in off-farm activities, diversification may simply reflect a transition period as the household moves out of farm activities.
The empirical relationship between diversification and wealth is thus not straightforward. A reduction in diversification as household wealth increases could be a sign that those at lower income levels are using diversification to overcome market imperfections. Alternatively, a reduction in diversification as household wealth decreases could be a sign of an inability to overcome barriers to entry in a second activity thus indicating that poorer households are limited from further specialization. An increase in diversification as household wealth increases could be a sign of using profitability in one activity to overcome threshold barriers to entry in another activity, or complementary use of assets between activities.
The inability to conceptually sign a priori the correlation between diversification and household wealth status emerges from the data. Figures 17, 18 and 19 explore the relationship between diversification, specialization and household expenditure for the countries in our African sample. The share of rural households with a diversified portfolio of income-generating strategies shows few consistent patterns by wealth status in our sample countries, in both our African and non-African countries (Figures 17a and 17b). A clear pattern emerges, however, among the African countries, in terms of the share of households specializing in on-farm activities. Here, the share of households in most countries decreases with increasing wealth. Conversely, the share of households specializing in self-employment activities and non-agricultural wage labor increases with wealth, at least for those countries where these activities are pronounced, such as Nigeria, Ghana, Malawi and Uganda.

Stochastic dominance analysis
To take a more systematic approach to characterizing the association between sources of income, specialization and welfare, we use measures of stochastic dominance. For four of the African countries, covering six data sets (Malawi 2011; Niger 2011; Tanzania 2009 and 2010; Uganda 2010 and 2011), we first compare sources of income. We then look at income diversification and specialization in income-generating strategies by both total household income and per capita expenditure. We include the latter because expenditure is considered the gold standard for measuring household-level welfare. In each case we calculate pairwise tests of stochastic dominance (see Appendix Figure A1 for an example), and we rank sources of income (or income diversification/specialization) by dominance. We do not include transfer and other income in order to improve the clarity of the presentation. A summary of the analysis can be found in Table 5.
Focusing on household income diversification and specialization in income-generating strategies, by total household income and per capita expenditure, a complementary story emerges, as seen in Figure 21 and again in Table 5. Across all countries, specialization in off-farm activities (non-agricultural wage income and self-employment) stochastically dominates other household income-generating strategies, in terms of both total household income and per capita expenditure. These are followed by on-farm specialization and diversified strategies, then agricultural wage labor, which is clearly associated with the lowest levels of welfare. 12 This pattern is clearest in the case of the distribution of per capita expenditure; with household total income, in some cases (such as Uganda) diversification is not stochastically dominated by specialization in non-agricultural wage and self-employment.
Overall, bringing together the results from the three stochastic dominance analyses, a clear picture emerges. Specialization in non-agricultural income-generating activities stochastically dominates farm-based strategies in all countries. Crop income, however, is still important for welfare, and even at higher levels of household income (or wealth) crop activities continue to play an important complementary, though secondary, role in income generation.
12 The one exception is specialization in agricultural wage labor in Niger, which includes less than one percent of households, but with high incomes.
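The pairwise comparison underlying this ranking can be illustrated with a minimal sketch. It assumes a purely descriptive check of first-order stochastic dominance on empirical distribution functions, without sampling weights or statistical inference, so it is not the test procedure used for Table 5. The function names and the expenditure values are hypothetical and serve only as illustration.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function giving the empirical CDF of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: bisect_right(xs, t) / n


def first_order_dominates(a, b):
    """True if sample `a` first-order stochastically dominates `b`:
    the CDF of `a` lies at or below the CDF of `b` at every observed
    value, and strictly below at least once."""
    fa, fb = ecdf(a), ecdf(b)
    grid = sorted(set(a) | set(b))
    diffs = [fa(t) - fb(t) for t in grid]
    return all(d <= 0 for d in diffs) and any(d < 0 for d in diffs)


def rank_by_dominance(groups):
    """Order strategies by the number of other strategies they dominate."""
    wins = {
        g: sum(first_order_dominates(x, y) for h, y in groups.items() if h != g)
        for g, x in groups.items()
    }
    return sorted(wins, key=wins.get, reverse=True)


# Hypothetical per capita expenditure values (illustration only, not survey data).
groups = {
    "nonag_wage": [120, 150, 180, 240],
    "on_farm": [80, 100, 130, 160],
    "ag_wage": [40, 60, 70, 90],
}
ranking = rank_by_dominance(groups)
```

With real survey data the comparison would also need household weights and a formal test, since two empirical distribution functions can cross by chance.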

Estimation approach
As noted earlier, much of the literature on rural income diversification in developing countries has sought to explain how asset endowments and barriers to entry tend to push or pull different households and individuals into different activities. Location may also be an important factor in determining households' income diversification strategies, but the literature is much more silent on this point, mainly because of the lack of data that would allow spatially explicit analysis. The geo-referenced household data we work with make it possible to start filling that gap.
In what follows our approach is similar to a meta-regression analysis in that (i) for each of the countries analyzed common metrics are used, (ii) explanatory variables for each country have been created in a uniform manner, and (iii) a standard regression model is employed in each case. This approach then minimizes the possibility that differences in results are driven by differences in the variables used or in the empirical approach, and facilitates our ability to compare results across countries.
Our modeling approach is to employ a multinomial logit model (separately for each country) to assess how location is associated with the likelihood that a household diversifies out of farming, controlling for other household characteristics. The choice of the multinomial logit is motivated by the fact that we have several unordered but mutually exclusive categories that we use to characterize household income strategies: a household can either be diversified, or fall within one of six specialization categories 13 . In the multinomial logit, k-1 models are estimated for any outcome consisting of k unordered categories. Parameter estimates are then interpreted with reference to the excluded base category (farm specialization in our case).
For a unit change in the regressor, the logit of the model outcome relative to the reference group is expected to change by its parameter estimate, holding other variables constant (UCLA, 2014). As we use farm specialization as the base category the coefficients on the main variables of interest can be immediately interpreted in terms of association with higher or lower likelihood that a household diversifies or specializes in off-farm activities, compared to being a farm specializer.
In a multinomial logit, given an unordered categorical outcome variable y with k outcomes, and a set of regressors X, one estimates a set of β coefficients for each outcome i from 1 to k-1 corresponding to the standard multinomial logit probability:
$$\Pr(y = i \mid X) = \frac{\exp\left(X\beta^{(i)}\right)}{\sum_{j=1}^{k} \exp\left(X\beta^{(j)}\right)}$$
The model can be solved by Maximum Likelihood methods by setting β(k) equal to 0, and measuring the other coefficients in terms of changes relative to the k reference category (Long and Freese, 2006).
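The normalization described above, with the coefficients of the reference category set to zero, can be illustrated with a minimal sketch. The outcome labels, the single distance regressor and the coefficient values are hypothetical and are not estimates from the paper; the point is only to show that, under this normalization, the log of the probability ratio relative to farm specialization equals the linear index for that outcome.

```python
import math


def mnl_probabilities(x, betas, base="farm_specialization"):
    """Multinomial logit choice probabilities for one household.

    x     : list of regressors, with a leading 1.0 for the intercept
    betas : dict mapping each non-base outcome to its coefficient list
    base  : reference category, whose coefficients are fixed at zero
    """
    index = {base: 0.0}  # beta(k) = 0, so the base linear index is zero
    for outcome, b in betas.items():
        index[outcome] = sum(bi * xi for bi, xi in zip(b, x))
    denom = sum(math.exp(v) for v in index.values())
    return {o: math.exp(v) / denom for o, v in index.items()}


# Hypothetical coefficients on [intercept, distance in km] (illustration only).
betas = {
    "diversified": [0.2, -0.01],
    "nonag_wage": [-0.5, -0.03],
}
p = mnl_probabilities([1.0, 50.0], betas)

# Log-odds of non-agricultural wage specialization relative to the base
# category: equals -0.5 + (-0.03 * 50) = -2.0 up to rounding.
log_odds = math.log(p["nonag_wage"] / p["farm_specialization"])
```

A negative coefficient on distance therefore lowers the log-odds of the outcome relative to farm specialization, which is how the coefficients on the location variables are read in this section.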
Previous studies have discussed the role of other key household characteristics, namely different forms of capital (human, natural, physical, social), and those findings are fairly consistent and robust across studies. One concern with that evidence, however, is the extent to which different levels and composition of assets may in fact be endogenous to decisions regarding the income generation strategy. In this paper the primary interest is to gauge the extent to which truly exogenous factors like climate and distance 14 from urban centers affect households' diversification decisions. We therefore adopted a stepwise approach to model specification, introducing first only a distance variable on the right-hand side (step 1), then adding a quadratic term for distance (step 2), a variable to proxy agricultural potential (step 3), quadratic interactions between distance and agricultural potential (step 4), and finally a set of household-level controls (step 5).
The outcome variables in the multinomial logit are the diversification and specialization categories that have been described in Section II and used for the analysis in Section III.
To gauge the effects of distance, market access and agglomeration we employ the variables described in Section III that measure Euclidean ('crow-fly') distance in kilometers to cities of 20 thousand, 100 thousand, 500 thousand and one million inhabitants. For each of the 5 steps above, we therefore estimated four variants, one for each of the distance variables employed. Agricultural potential is proxied by an aridity index, also described in Section II above. To capture the non-linearities in the relationship between diversification and distance, we introduce both a quadratic term for distance and interaction terms between distance and aridity. This analysis allows us to measure the extent to which location effects (agricultural potential, distance, and their interaction) bear on the choice of income-generating strategies. In specifying our model using distance to urban centers of different sizes, we are also interested in gauging how these relationships may vary when one considers distance to small towns, as compared to distance to mid-size and large cities.
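The stepwise design described above (five nested specifications, each estimated once per distance variable) can be laid out as a small specification grid. The sketch below is only an illustration: all variable names are hypothetical, and the form assumed for the step 4 interactions (aridity interacted with both distance and squared distance) is one possible reading of "quadratic interactions", not a detail confirmed by the text.

```python
from itertools import product

# City-size thresholds for the four distance variables (labels are hypothetical).
CITY_SIZES = ["20k", "100k", "500k", "1m"]

# Placeholder names for the step 5 household-level controls.
CONTROLS = [
    "agri_wealth_index", "nonagri_wealth_index", "infrastructure_index",
    "hh_size", "head_age", "head_female", "n_working_age",
    "share_female_working_age", "head_education", "land_owned",
]


def regressors(step, city):
    """Right-hand-side variables for a given step (1 to 5) and city size."""
    d = f"dist_{city}"
    terms = [d]                                   # step 1: distance only
    if step >= 2:
        terms.append(f"{d}_sq")                   # step 2: quadratic distance
    if step >= 3:
        terms.append("aridity")                   # step 3: agricultural potential
    if step >= 4:                                 # step 4: assumed interaction form
        terms += [f"{d}_x_aridity", f"{d}_sq_x_aridity"]
    if step >= 5:
        terms += CONTROLS                         # step 5: household controls
    return terms


# One specification per (step, city size) pair: 5 x 4 = 20 per country.
specs = {(s, c): regressors(s, c) for s, c in product(range(1, 6), CITY_SIZES)}
```

Building every specification from the same function is one way to keep the regressors uniform across countries, which is the property the meta-regression style comparison relies on.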
The vector of regressors in the least parsimonious specification (step 5) includes a range of additional household characteristics that are known to affect decisions about occupational choice and income diversification: separate agricultural and non-agricultural wealth indexes, and an index of access to basic infrastructure (all calculated using principal component analysis); household demographic and composition characteristics (household size, age and gender of the head, number of working age members, share of female working age adults); and variables to measure key household assets (education of the head, land owned) 15 .
Based on the theoretical and empirical literature reviewed earlier in this paper, we have some clear expectations about the sign of the correlation between household endowments and sectors of specialization, with for instance land being strongly associated with agricultural activities, education strongly associated with nonfarm (particularly wage) activities, and low levels of assets across the board being associated with agricultural wage employment.
To reason about the a priori expectations regarding the association between the key location variables (distance and aridity) and diversification outside of agriculture, it may be useful to think about this schematically with the aid of a two-by-two matrix organized around high/low integration and agricultural potential (Figure 22).
In high potential, high integration 16 areas, one expects both farm and non-farm activities to thrive, with non-farm shares dominating the higher the integration levels. In low potential, highly integrated areas the expectation is clearly for non-farm activities to dominate as people reap off-farm opportunities as farming does not hold much promise given the unfavorable conditions. In low integration/high potential areas the expectation is on the contrary for farming to be relatively more important. Deichmann et al. (2008) find that in Bangladesh high return self and wage employment outside of agriculture tends to decline with distance to the main urban centers, and to decline faster as the agricultural potential increases.
14 Admittedly distance may itself be endogenous as existing employment opportunities clearly play a role in a household's decision on where to live.
15 Summary statistics for the variables are included in the Appendix Table A1.
16 In what follows, we loosely use the term integration as the inverse of distance.
The low-potential, low-integration areas are more difficult to sign a priori, as households will on the one hand have to rely to a large extent on subsistence farming for their own survival, while on the other hand trying to complement the expected meager returns from farming with (possibly equally meager) returns from nonfarm activities, including migration. The distinction between diversification out of necessity and diversification out of choice, proposed for instance by Ellis (2000), is useful in characterizing the situation in these areas.

Figure 22 - Matrix of expected relationship between diversification outside agriculture, agricultural potential, and integration into urban areas
We use a quadratic distance term and interactions between distance and aridity to reflect these expected non-linearities. While we run all our estimates including these terms, whenever the joint significance of the terms is rejected by a chi-squared test we drop stepwise first the interaction and then the quadratic term, and present these results instead.

Results: The impact of distance from urban centers and agricultural potential on household income generation strategies
As summarized in the above discussion, for each of the 4 countries we effectively estimate 5 stepwise specifications of the logit model, each using 4 different city size categories. Space does not allow discussing each of these 80 regressions in detail, and in fact there is no need to. The more parsimonious specifications (results not reported) tend to support expectations, with non-farm specialization less likely at increasing distance from cities, particularly as cities of larger size are considered. We therefore focus the discussion on the presence, extent and direction of non-linearities, and on the regularities and differences we find across countries, across urban centers of different sizes, and by agricultural potential. To convey the main results emerging from the analysis, we use graphs to the extent possible, aiming to show the broad directions and main non-linearities in the variables of interest 17 . Figure 23 depicts how the likelihood of being in the main non-farm specialization/diversification categories (non-agricultural wage specializers and non-agricultural self-employment specializers) changes with distance, separately for areas with high and low agricultural potential, and for distance to cities of different size (20 thousand plus or 1 million plus inhabitants) 18 .
17 The entire set of regressions is available from the authors upon request.
18 Similar graphs for the other activities and for cities of 50 and 100 thousand inhabitants are omitted for reasons of space. They are available from the authors.

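[Figure 22 matrix: expected sign of diversification outside agriculture]
                               Integration: Low    Integration: High
Agricultural potential: High   -                   + (?)
Agricultural potential: Low    + (?)               +
We focus on these as the sectors that identify more univocally engagement in activities outside of agriculture.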
The graphs convey the combined effect of the quadratic and interaction terms that are otherwise difficult to read in a standard table of coefficients. The first result that emerges is that non-linearities are clearly present in most of the estimated relationships. For most countries and 'sectors' of specialization, the role of distance changes markedly with potential and with city size.
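To make concrete why the quadratic and interaction terms are hard to read from a coefficient table, the sketch below evaluates how the log-odds of an off-farm outcome (relative to farm specialization) vary with distance at two levels of agricultural potential. It is an illustration under stated assumptions: a single linear interaction between distance and potential, a potential index in which higher values mean better agricultural conditions, and made-up coefficient values that are not estimates from the paper.

```python
def log_odds(distance, potential, b):
    """Linear index of one outcome relative to the base category."""
    return (b["const"] + b["dist"] * distance + b["dist_sq"] * distance ** 2
            + b["pot"] * potential + b["dist_x_pot"] * distance * potential)


def distance_slope(distance, potential, b):
    """Derivative of the log-odds with respect to distance."""
    return b["dist"] + 2 * b["dist_sq"] * distance + b["dist_x_pot"] * potential


def turning_point(potential, b):
    """Distance at which the slope changes sign (requires dist_sq != 0)."""
    return -(b["dist"] + b["dist_x_pot"] * potential) / (2 * b["dist_sq"])


# Hypothetical coefficients (illustration only).
b = {"const": 0.0, "dist": -0.04, "dist_sq": 0.0001, "pot": 0.5, "dist_x_pot": 0.03}

low_potential, high_potential = 0.2, 1.0
slope_low = distance_slope(50, low_potential, b)    # clearly negative
slope_high = distance_slope(50, high_potential, b)  # close to zero (muted effect)
u_turn_low = turning_point(low_potential, b)        # distance where a U-shape bottoms out
```

With these made-up values the effect of distance at 50 km is negative in low-potential areas and roughly flat in high-potential areas, and a positive quadratic coefficient produces a U-shape. No single coefficient conveys this on its own, which is why the combined effect is plotted.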
Looking at the graphs for these two categories, the impact of distance from cities appears to be a lot more muted in areas of higher agricultural potential. This is shown for instance in the nonagricultural wage graphs for Malawi and Tanzania, and in the self-employment graphs for Malawi and Niger (it is worth recalling that in Niger the share of non-agricultural wage specializers is extremely low, so that it is not surprising not to find high variability for non-agricultural wage in the Niger graphs).
Uganda is the one country that bucks these broad regularities and displays a U-shaped relationship for self-employment and for non-agricultural wage (the latter limited to cities greater than 500 thousand inhabitants) for both high and low potential areas. In Uganda the lines for high and low potential areas tend to cross around the middle range of the distance variable distribution. Agriculture goes hand-in-hand with off farm diversification in highly integrated areas, whereas low potential areas tend to 'dominate' high potential in terms of the odds of households specializing outside of agriculture as distance increases.
In Tanzania, the country in this sample with relatively more abundant land, the difference in potential is only apparent for non-agricultural wage specialization when large cities are considered. Otherwise, similar trends are observed as distance increases, regardless of potential.
An additional observation is that the size of the city from which distance is measured does matter in this type of analysis. The forces pulling households out of agriculture are in general smaller for small (and medium-size) towns. In rural areas close to small towns (20,000 people) one does not observe substantial differences in the pull from off-farm activities, except for self-employment in Niger, where households are in fact more likely to specialize in self-employment where agricultural potential is higher.
The story therefore appears to be one of the interaction of smaller towns and potential being more relevant for explaining diversification (as defined here), with larger towns being more relevant when specialization into non-farm activities is considered. These findings point to quite different dynamics when the role of small towns is considered and when large cities come into play. For small towns, we find support for the hypothesis that high-potential, low-integration areas see less specialization in off-farm activities, the reverse being true for high-integration, low-potential areas. These were the two cells in Figure 22 for which we had clear a priori expectations, but we also found that the role of potential is not particularly strong, at least when the off-farm specialization categories are considered. The two cells where we had unclear expectations were the high-potential, high-integration and the low-potential, low-integration areas. For the former we find that, at least in Tanzania and Uganda, the combination of favorable conditions for agriculture and lower distance from urban centers tends to create the conditions for more households to specialize in off-farm activities. When integration is lower and agricultural conditions more difficult, the picture is mixed, with households more likely to engage more fully in non-farm activities in Niger and Malawi, but less likely to do so in Uganda and Tanzania.
When distance to large cities is considered, the impact of distance in low potential areas is much more marked, as signaled by the relatively steep negative slope found in all countries for either self-employment or non-agricultural wage work, even though a U-shape is still found for Uganda. In other words, the impact of proximity to large cities is highest in low potential areas, as expected. In low-potential, low-integration areas the sign was uncertain a priori and we find that the impact of distance prevails, with the odds of specializing off-farm being lowest in three of the four countries. The exception is Uganda where the attraction of non-farm declines with distance from cities up to a point, but picks up again at higher distances. In high potential areas the effect of distance is much more muted, and the slopes flatter: only in Niger and Uganda do the odds of being specialized off-farm relative to agriculture decline significantly with distance from major cities for households sitting in areas with higher agricultural potential.
All in all, these results point to evidence that appears to be broadly consistent with the predictions of the theory. There is no sign of African households adopting income generation strategies that differ from those observed elsewhere in terms of their relationship to basic exogenous determinants such as agricultural potential and distance from urban centers. The fact that in high potential areas the odds of being specialized off-farm are pretty much unchanged regardless of distance is also compatible with the observation earlier in the paper that farming still dominates African rural areas. Regardless of distance and integration in the urban context, when climate is favorable, farming remains the occupation of choice for most.
(Dawson, 2014) for making publicly available the Excel program to produce these graphs.

Conclusion
Is Africa's rural economy transforming as its economies grow? Or is it trapped in some sort of peculiar natural-resource based growth pattern that may prove unsustainable in the long run? And in particular, is there evidence of the share of agriculture in the economy decreasing following the familiar secular pattern followed by the vast majority of the countries now enjoying middle and high-income status? The analysis in this paper has attempted to look at the evidence coming from micro-data to respond to some of these questions from the perspective of the rural economy.
The analysis of the income-generating activities of rural households based on a large cross-country dataset paints a clear picture of multiple activities across rural space and diversification across rural households. This diversification holds across countries at all levels of development and in all four continents, though less so in the African countries included in the sample. Bearing in mind the caveat that our sample is not representative of the whole of Sub-Saharan Africa, the evidence suggests that African patterns of household-level income diversification have the potential to converge towards patterns similar to those observed in other developing regions. While African households are still generally more likely to specialize in farming compared to households in other regions, once the level of GDP is controlled for, the shares of income from and participation in non-agricultural activities are not far off from those found elsewhere.
For most countries outside Africa-generally with higher levels of GDP-the largest share of income stems from off-farm activities, and the largest share of households have diversified sources of income. However, for the African countries in the sample most income still derives from on farm sources of income. For both African and non-African countries, diversification may function as a household strategy to manage risk and overcome market failures, or represent specialization within the household deriving from individual attributes and comparative advantage. Therefore diversification can be into either high or low-return sectors, reflect push or pull forces, and represent a pathway out of poverty or a survival strategy.
Specialization in on farm income-generating strategies is the norm among the African countries in the sample. Nevertheless, agricultural-based sources of income remain critically important for rural livelihoods in all countries, in terms of both the overall share of agriculture in rural incomes and the large share of households that still specialize in agricultural and on-farm sources of income. While the nature of the diversification response will vary by a given household, in each country, African and non-African, overall greater reliance on non-farm sources of income is associated with greater wealth. In almost all cases, wealthier households in rural areas have a higher level of participation in, and greater share of income from, non-farm activities. Similarly, wealthier households have a larger share of specialization into non-agricultural wage.
Conversely, agricultural sources of income are generally most important for the poorest households. Income from crop and livestock activities, as well as from agricultural wage labor, represents a higher share of total income for poorer households in almost all countries. Furthermore, a higher share of households specializing in on-farm activities, and particularly agricultural wage employment, is found at the low end of the wealth distribution.
The results offered here suggest the need to carefully consider how to promote rural development. While the diversification of rural households clearly indicates the need to look beyond agriculture in rural development policies, the overall importance of agriculture, particularly for poorer households, suggests that the promotion of both rural nonfarm (RNF) and agricultural activities needs to form part of any strategy. Policy makers must also pay attention to the likelihood that barriers to entry may limit the ability of poor households to take advantage of opportunities. The links between certain assets and activities imply that due consideration be given to those assets, or combination of assets, which will ensure broad growth in the rural economy. This complexity means that a particular policy is unlikely to fit different situations across countries and even within regions in a given country, and that location-specific policies are necessary.
The spatial analysis of the factors that drive specialization away from on-farm activities shows how the constraints to off-farm specialization are likely to differ between high- and low-potential and high- and low-integration areas. Also, small and large urban centers are likely to exert different influences on the transformation of the rural economy. While this adds complexity to the formulation of policies to promote rural non-farm growth, it also testifies to a series of trends that are not uncommon in other countries, and suggests that, after all, the African specificity in terms of higher incidence of farming activities may be due more to a GDP-level effect than to a different response by households to the incentives and opportunities coming from agricultural and non-agricultural growth.