WPS7655

Policy Research Working Paper 7655

Jobs in the City
Explaining Urban Spatial Structure in Kampala

Arti Grover Goswami
Somik V. Lall

Social, Urban, Rural and Resilience Global Practice Group
April 2016

Abstract

This paper examines the spatial organization of jobs in Kampala, the capital city of Uganda, and applies the Lucas and Rossi-Hansberg (2002) model to explain the observed patterns in terms of the agglomeration forces and the commuting costs of workers. The paper suggests that: (i) Economic activities are concentrated in the downtown, beyond which employment is spatially dispersed. (ii) Geographically weighted regressions identify five potential subcenters in 2011; however, none of these contribute significantly to employment. When explaining the variation in employment density across localities in Kampala, the research highlights that (i) density falls by 23.5 percent per kilometer increase in distance from the nearest potential subcenter; (ii) an increase in local production externalities of 10 percent increases density by 3.7 percent; and (iii) production externalities in Kampala’s potential subcenters are too weak to have any significant impact even on nearby tracts.

This paper is a product of the Social, Urban, Rural and Resilience Global Practice Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at agrover1@worldbank.org and slall1@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished.
The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Jobs in the City: Explaining Urban Spatial Structure in Kampala

Arti Grover Goswami and Somik V. Lall

JEL Classification: R12; R13; R14; R3; R52

Key Words: urban spatial structure; monocentric; subcenter; mixed land use; agglomeration economies; transport infrastructure

Acknowledgements: This research is a part of the papers being commissioned for a Regional study on Spatial Development of African Cities, and has been sponsored by the UK DFID through the MDTF for Sustainable Urban Development TF071544. The authors appreciate valuable advice from Gilles Duranton on key empirical issues. The authors thank Elizabeth Schroeder for sharing the Uganda business registry (UBR) 2002 data, Rachel Sebudde for sharing the UBR 2011 data, Mark Iliffe, Katie McWilliams and Sarah Elizabeth Santos for GIS assistance, Nicholas J. Cox and Ernesto Calvo for help with GWR estimations, and Paulo Avner and Marguerite Duponchel for their generous help with data collection. The authors also appreciate UBOS for sharing the GIS maps on administrative boundaries for the 2012 population and housing census.

Section 1: Introduction

Historically, cities around the world evolved from a random spatial organization to a single nucleus of economic activity. A few decades later, monocentric cities gave way to dispersion again except that this time cities spread to multiple business centers in a more systematic manner.
For example, cities such as Chicago, New York and Boston in the United States, Barcelona and Paris in Europe or Bogota in Colombia progressed from cities with a single business district to cities with multiple nuclei of activity (Anas et al., 1998; Clark, 2000). Certain pre-conditions determine such an orderly distribution of economic activity to multiple centers.1 Specifically, the observed urban spatial structure of a city is the result of a trade-off between the cost of commuting to business centers and the potential benefits of agglomeration.2 Most work in this field is based on evidence from cities in developed countries, which, although helpful, does not provide enough information on the nature and extent of these centrifugal and centripetal forces in the context of cities in developing countries. Focusing on the capital city of Uganda (in Africa), this paper identifies the spatial structure of jobs in Kampala and explains the observed configuration using the theoretical underpinnings of the Lucas and Rossi-Hansberg (2002) model.

The declining role of central business districts (CBD) and the decentralization of economic activity are well documented for modern cities (e.g. Mills, 1972). Evidence, mostly from cities in developed countries, suggests that as cities grow in size, the CBD loses its primacy, and the city transforms into a polycentric structure with clusters of activities spread within the built-up area (Bertaud, 2003). For example, a study on cities around the world reveals that the share of jobs in the CBD declined significantly from 25.4% in 1960 to 16.2% in 1990 (Kenworthy and Laube, 1999). More recent research suggests that for the 50 largest metropolitan areas in the U.S. in the year 2000, the average share of jobs in the CBD had declined to 10.8±3.1%, from 25% in the 1960s (Angel and Blei, 2015a).3 What explains such a spatial pattern of evolution in economic activity?
Equilibrium urban spatial structure evolves due to the tension between the "centripetal" forces that tend to pull firms and households into agglomerations and the "centrifugal" forces that tend to disintegrate such agglomerations. The neo-classical theory models the centripetal forces for agglomeration as pure external economies and the centrifugal forces as arising from the need to commute to the CBD within each city. Innovations in transport technology that lowered the cost of commute explain the emergence of the monocentric pattern of spatial organization, while production externalities primarily justify the non-monocentric spatial organization.4 Firms benefit from agglomeration because clustering generates externalities either due to "pure" external economies (e.g. spillover of knowledge between proximate firms), or due to market-size effects (e.g. labor market pooling or linkages between upstream and downstream industries).5 The literature suggests that in the presence of production externalities, firm productivity depends not only on its own location but also on the distance from other firms. Physical constraints on land, however, mean that not all firms can locate close to each other. As firms move away, the benefits from interaction between firms decline with distance. The faster external effects decay over space, the greater the probability of having multiple pure business areas in a city, so that firms can reap the benefits of production externalities.6 Research suggests that when commuting costs are low, a Mills-like monocentric city prevails if external effects are largely local, while a faster pace of spatial decay promotes the formation of a polycentric city. By contrast, a higher cost of commute coupled with weak production externalities results in a mixed land use structure. In sum, a non-monocentric pattern of spatial organization is a result of the interplay between the cost of commute and production externalities.

1 Smaller cities in the US such as Milwaukee in the Mid-west or Buffalo in Up-state New York remain largely monocentric while others like South Florida, Dallas and Detroit have shown a pattern of urban development with dispersed employment (e.g., Lang, 2003; Gordon and Richardson, 1996).
2 See for example Fujita (1988), Lucas and Rossi-Hansberg (2002), Berliant et al. (2002), Berliant and Wang (2008), Fujita and Ogawa (1982) and Chatterjee and Eyigungor (2014).
3 However, this decline is not observed equally across all the cities in the world. For example, the same study reveals that in Tokyo the share of jobs in the CBD in fact increased by 2% during the same period.
4 See Mills (1972) for a monocentric model; for models explaining a non-monocentric pattern, see Fujita and Ogawa (1982) and Ogawa and Fujita (1979, 1980).

The empirical literature on the determinants of urban spatial structure, even for developed countries, is limited and lags behind the theoretical models. Studies explaining subcenter formation (e.g. McMillen and Smith, 2003) focus solely on the importance of the cost of commute without any role for production externalities. Furthermore, these studies only tend to explain polycentricism without considering the possibility of a mixed land use pattern, which is predicted in the theoretical literature and observed in practice. In this regard, our paper makes two significant contributions to the existing literature. First, we extend the existing methodologies, such as the strategies for fitting monocentric models or identifying subcenters, to ascertain the urban spatial structure in a developing country city (Kampala, the capital of Uganda).
Second, we explain the observed random pattern of employment dispersion in Kampala as a function of not only the cost of commute to the nearest potential subcenter but also the extent of production externalities. In spite of a growing body of research in this field, we are not aware of any study that conducts such an analysis for an African city.

How do cities in Africa organize spatially? Are they monocentric, do they have multiple centers of business activity, or is activity dispersed rather randomly? How do they compare with their developed country counterparts? What determines the difference in their pattern of spatial development? Our paper seeks to answer some of these questions using the census of business establishment data from 2011 and 2002.

We begin by identifying the pattern of employment in Kampala across various sectors, firm sizes and firm ages. Our descriptive analysis suggests that: (i) firms in Kampala are increasingly involved in services that have a lower tradability potential, and this share has increased in the past decade; (ii) most jobs in Kampala are created by young and small establishments; and (iii) the 2002 census indicates that the participation of women is largely in services with lower tradability, and likewise the 2011 census suggests that these are also the sectors with the largest share of female owned establishments.

5 Duranton and Puga (2004) summarize the gains from agglomeration in terms of sharing, matching, and learning effects. Sharing effects include the gains from a greater variety of inputs and industrial specialization, the common use of local indivisible goods and facilities, and the pooling of risk; matching effects correspond to improvement of either the quality or the quantity of matches between firms and workers; learning effects involve the generation, diffusion, and accumulation of knowledge. Agglomeration economies explain the existence of cities.
This is particularly important given the growing evidence about the importance of such agglomeration economies. For a more recent survey of the evidence on agglomeration economies, see Combes and Gobillon (2015). For a more detailed exposition of the implications of introducing agglomeration economies in a monocentric city model, see Duranton and Puga (2014).
6 Also see Berliant and Wang (2008).

The main prediction of a monocentric model in a static framework is that employment density falls smoothly as distance from the CBD increases. A linear regression of the log of employment density on distance to the CBD in Kampala suggests that although employment density declines with distance from the CBD, the fit is reasonable only within 3 kilometers of the CBD; beyond that distance the R-squared falls below 0.08. This result holds for the linear model as well as for the more sophisticated spline version of the monocentric model. These results imply that while the CBD in Kampala is important, the city does not follow the standard comparative static predictions of a monocentric model. These findings are a first indication of the possibility of dispersed employment in most parts of the city. Such a pattern of spatial development is, however, not unusual for a city with low income levels.7

Next, we identify the potential subcenters of employment in Kampala using the geographically weighted regressions (GWR) suggested in McMillen (2001). The GWR technique produces a smooth function of employment density by placing greater weight on nearby observations. By definition, potential subcenters are sites with significantly higher employment density than neighboring sites. This regression identifies subcenters as clusters of sites with positive residuals.
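The monocentric test described above amounts to regressing the log of employment density on distance to the CBD and inspecting the slope and the fit. A minimal sketch in Python, run on synthetic tracts rather than the UBR data, is given below; the function name `fit_density_gradient`, the simulated gradient of 0.3 per kilometer and the noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fit_density_gradient(distance_km, density):
    """OLS fit of ln(density) = a + b * distance; returns (a, b, r_squared).

    Tracts with zero density must be dropped or handled separately,
    because the logarithm is undefined for them.
    """
    x = np.asarray(distance_km, dtype=float)
    y = np.log(np.asarray(density, dtype=float))
    b, a = np.polyfit(x, y, 1)          # polyfit returns slope first, then intercept
    resid = y - (a + b * x)
    r2 = 1.0 - resid.var() / y.var()    # share of variance explained by distance
    return a, b, r2

# Hypothetical tracts (assumption): density decays with distance, plus noise.
rng = np.random.default_rng(0)
dist = rng.uniform(0.2, 10.0, size=300)
dens = np.exp(6.0 - 0.3 * dist + rng.normal(0.0, 0.5, size=300))

# Fit on all tracts, then only on tracts within 3 km of the CBD,
# mirroring the comparison of fit across distance bands.
a, b, r2 = fit_density_gradient(dist, dens)
inner = dist <= 3.0
a_in, b_in, r2_in = fit_density_gradient(dist[inner], dens[inner])
print(f"all tracts: slope={b:.3f}, R2={r2:.2f}")
print(f"within 3 km: slope={b_in:.3f}, R2={r2_in:.2f}")
```

Comparing the R-squared across distance bands in this way is one simple diagnostic of where the monocentric gradient stops describing the data; the paper's own spline and GWR estimations are more involved.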
Using the geographically weighted regressions and the contiguity matrix approach suggested in McMillen (2001, 2003), we identify five potential subcenters in Kampala in 2011 in addition to the CBD. However, given that the concentration of employment in all subcenters is extremely low, we are inclined to believe that Kampala does not have any other concentrated center of economic activity beyond the first 3 kilometer radius of the CBD. In fact, it appears that most land, besides that occupied by the firms in this part of the CBD, is being used for mixed purposes.

Using the census of business establishment data from 2002 and 2011, our results show that Kampala has a very concentrated CBD and that the share of the CBD in total employment has increased from 14% to 18%, while its share in firm count has increased from 19.5% to 20.5%. This pattern is in contrast to the changes in most developed-country cities since the 1970s, where the dominance of the CBD in employment has declined over the past decades (e.g. Angel and Blei, 2015a). Further, our results reveal that the aggregate contribution of potential subcenters to Kampala’s employment is below 2.5%, thereby suggesting that none of the potential subcenters in Kampala is a significant site for job creation.

In determining the covariates of urban spatial structure, our base estimations find that (i) while distance to the CBD matters for employment density, it is the distance to the nearest potential subcenter that can potentially accelerate localized job creation. In Kampala, the significance of the latter is lost once we control for local fixed effects and tract traits, suggesting the heightened role of local traits in determining employment density. (ii) The production externalities of the tract itself are positively and critically associated with employment density, while those of the potential subcenter are insignificant in affecting the distribution of employment across tracts in Kampala.
This suggests that mixed land use in Kampala could be explained by the weak nature of agglomeration forces in the potential subcenter locations. (iii) Surprisingly, most labor markets in Kampala are localized: the presence of slums and very low income groups in tracts is correlated with higher employment density and mixed land use. (iv) As expected, the presence of a transportation network is also associated with a higher concentration of jobs.

7 For instance, see Bertaud and Malpezzi (2014), who compare the urban form of 57 cities around the world. According to their work, although cities roughly follow the urban density model, a large number of them deviate from a standard declining gradient as well. For instance, Kampala’s density gradient is similar to that observed in Marseille in 1990, although the explanatory power of the CBD is much lower. Beyond 3 km, the gradient and fit of Kampala are comparable to those found in Abidjan.

These results survive a number of robustness checks. First, we estimate the model using the Heckman (1976) correction to address the problem of zero employment density observed in certain tracts. Second, we suspect that production externalities could be endogenous. To address this issue, we use a variety of instruments such as lagged production externalities, distance to universities and distance to schools. Third, we postulate that production externalities could affect the pattern of land use in Kampala non-linearly. This issue is addressed by estimating the model semi-parametrically. Fourth, we show that our results hold even after controlling for the historical persistence in employment density. Fifth, we also validate our results for higher rates of attenuation of externalities, which could be appropriate for an African city, where production externalities are weaker.
Finally, we estimate an extended version of the Lucas and Rossi-Hansberg model that additionally controls for the employment density and production externalities in the nearest potential subcenter sites. This specification deals with the noted concerns about excluding wages from our base specification. Our findings suggest that production externalities in Kampala’s subcenters are too weak to have any significant impact on even the nearby tracts.

Urban spatial structure has profound implications for the efficiency of a city. An understanding of the urban morphology of a city is important to inform policy makers about what can and should be done, in terms of public plans and investments in transport infrastructure and also regulatory reforms. In the case of developed countries, at least, studies have shown that policies that enhance overall regional connectivity and those that permit speedier and longer commuting to take advantage of metropolitan-wide economic opportunities play a positive role in making a city productive (Angel and Blei, 2012a; 2012b). Policies that remove impediments to the locational mobility of residences and workplaces for all income groups need to be supported so that they can easily relocate to be within tolerable commute range of each other.8

The rest of the paper is organized as follows. Section 2 describes the conceptual framework for understanding urban morphology using the Lucas and Rossi-Hansberg (2002) model. Section 3 discusses the data requirements and sources used for implementing this study, and also presents some descriptive results on employment density patterns across various dimensions. Section 4 discusses the empirical issues and estimation strategy, while Section 5 tests the monocentric model for Kampala and identifies the potential subcenters.
Section 6 presents the results from an empirical application of the Lucas and Rossi-Hansberg (2002) model where we explain the mixed land use in Kampala by appealing to the opposing push and pull factors, that is, the costs of commute and production externalities. The paper ends with concluding remarks in Section 7.

8 See Angel and Blei (2015a, 2015b).

Section 2: Conceptual Framework

This section presents a synopsis of the Lucas and Rossi-Hansberg (2002) model to set the framework for explaining the observed urban form in Kampala. The model argues that the distribution of business and residential land, wages, and land rents is the result of the trade-off between spatial production externalities and commuting costs.

Let the production per unit of land at location x be given by:

$$q(x) = A\, z(x)^{\gamma}\, n(x)^{\alpha}$$

where $A$ is the technology specific productivity parameter, $n(x)$ is the employment density (that is, labor employed per unit of land), $z(x)$ is the production externality at location x, and the exponents $\gamma$ and $\alpha$ are the elasticities of output with respect to the externality and to employment density. Lucas and Rossi-Hansberg (2002) propose that the production externality at location x, $z(x)$, is a linear function of the employment density of all nearby locations, y. However, as firms move away from location x, this external effect decays at a rate governed by the parameter $\delta$. That is, production externalities decay as the distance between firms increases.9 Mathematically, the production externality in a given tract x, $z(x)$, referred to as the agglomeration index in our paper, is given as:

$$z(x) = \delta \int e^{-\delta\, d(x,y)}\, \theta(y)\, n(y)\, dy \qquad (1)$$

where $\theta(y)$ is the fraction of land used in production at location y and $d(x,y)$ is the distance between locations x and y. Integration is done over each possible location in the city, although the extent of this effect diminishes sharply with increased distance. The presence of production externalities implies that a firm’s productivity at any site is governed not only by internal factors but also by the concentration of employment (and other firms) at neighboring sites.
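With tract-level data, the agglomeration index has to be approximated by a sum over tracts. The sketch below assumes the exponential-decay form of Lucas and Rossi-Hansberg (2002), discretized as z_i = δ Σ_j exp(−δ d_ij) θ_j n_j a_j, where a_j is tract area and d_ij is the distance between tract centroids. The function name, the use of planar coordinates in kilometers and the three-tract example are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def agglomeration_index(coords_km, density, prod_share, area_km2, delta):
    """Discrete agglomeration index for each tract.

    coords_km  : (N, 2) planar centroid coordinates in km (assumption)
    density    : (N,) employment density n_j (workers per km^2)
    prod_share : (N,) fraction of tract land used in production, theta_j
    area_km2   : (N,) tract areas a_j, used as integration weights
    delta      : decay parameter; it also scales the level of the index
    """
    xy = np.asarray(coords_km, dtype=float)
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # pairwise centroid distances d_ij
    mass = (np.asarray(prod_share, dtype=float)
            * np.asarray(density, dtype=float)
            * np.asarray(area_km2, dtype=float))  # theta_j * n_j * a_j
    return delta * (np.exp(-delta * dist) @ mass)

# Hypothetical example: three tracts on a line at 0, 1 and 5 km.
coords = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
z_slow = agglomeration_index(coords, [500, 200, 50], [0.6, 0.3, 0.1], [1.0, 1.0, 1.0], delta=0.5)
z_fast = agglomeration_index(coords, [500, 200, 50], [0.6, 0.3, 0.1], [1.0, 1.0, 1.0], delta=5.0)
print(z_slow, z_fast)
```

Because the own-tract distance is zero, each tract's own employment enters with full weight; with a larger δ the contribution of neighbors vanishes quickly, which is the sense in which externalities are "local".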
The central business district of a city illustrates this phenomenon in the most obvious way, where firms agglomerate to take advantage of such externalities. Each worker seeks to reach a minimum reservation utility, which is a Cobb-Douglas function of land and other goods. The rent for land enters the objective functions of both the firms and the households. Optimization in the Lucas and Rossi-Hansberg model gives the equilibrium employment density at location x as:

$$n(x) = \left[\frac{\alpha A\, z(x)^{\gamma}}{w(x)}\right]^{\frac{1}{1-\alpha}} \qquad (2)$$

where $w(x)$ is the wage income of workers at location x, $z(x)$ is the production externality, and $\gamma$ and $\alpha$ are the output elasticities of the externality and of employment density. The equation suggests that as production externalities increase so does employment density. However, a rise in wages (and land rents) implies an increase in congestion externalities and thus discourages further concentration. When the cost of commuting to the city center is high, the proximity of firms and workers is mutually beneficial and therefore the city is more decentralized. When the cost of commuting to subcenters is high, wage income is reduced by commuting to this business location. Hence workers prefer to live close to their jobs, and a mixed land use pattern emerges. If the cost of commute to subcenters is low, a polycentric structure emerges. By contrast, when commute costs are low overall, a traditional monocentric city emerges.10 In fact, for many models, polycentricism and dispersed land use are the result of the interplay between agglomeration diseconomies in the CBD and the costs related to transport infrastructure.

9 In the Lucas and Rossi-Hansberg (2002) formulation, δ affects not only the rate at which the production externality decays with distance but also the level of external economies.
10 Many studies reveal that the degree of decentralization is strongly related to the spread of automobiles (e.g. Anas et al., 1998; Glaeser and Kahn, 2004; White, 1976; Steen, 1986; and Sullivan, 1986).

Our econometric analysis begins with estimation of equation (2).
However, since we do not have information on wages, we also estimate an extended model. Although the Lucas and Rossi-Hansberg (2002) model presents residential and business areas as dichotomous zones of specified activity, in reality there are often gradual differences between predominantly residential and predominantly business areas. Since subcenters are statistically identified sites of local peaks in employment density, we compare the density of any tract with that of the nearest peak in employment density. This ratio is our measure of land use. A lower relative density in a tract vis-à-vis the nearest subcenter suggests that the tract is primarily residential, while a higher density perhaps reflects a mixed land use pattern. The latter is so because even the peak density tract is unable to sustain a large proportion of local employment opportunities.

In our extended model, we thus study the dispersion of employment density across locations in Kampala by comparing the employment density at a given location with that of the nearest (potential) subcenter. Let us label the nearest subcenter location as s. Our object of interest is then the ratio of employment density at location x relative to that in the subcenter s. Using (2), this ratio is given as:

$$\frac{n(x)}{n(s)} = \left[\frac{z(x)}{z(s)}\right]^{\frac{\gamma}{1-\alpha}} \left[\frac{w(s)}{w(x)}\right]^{\frac{1}{1-\alpha}}$$

where $n(x)$, $z(x)$ and $w(x)$ are the employment density, production externalities and wages at location x; $n(s)$, $z(s)$ and $w(s)$ are the corresponding quantities in the nearest subcenter s; and $\gamma$ and $\alpha$ are the output elasticities of the externality and of employment density. Lucas and Rossi-Hansberg contend that a worker living in location x and commuting to the subcenter s would have the following earnings available to spend at his home location x:

$$w(x) = e^{-\kappa\, d(x,s)}\, w(s)$$

where $\kappa\, d(x,s)$, with $\kappa$ the commuting cost per unit of distance and $d(x,s)$ the distance between x and s, can be visualized as the cost of commute from location x to location s or the loss of labor time in commuting from location x to s. Conversely, if workers from subcenter s commute to location x, this relationship is inverted, that is, $w(s) = e^{-\kappa\, d(x,s)}\, w(x)$.
Lastly, if people do not commute from one tract to another location for their employment, that is, if households co-locate with their places of work, then we cannot infer anything about the relationship between wages in the two locations. In fact, in the Lucas and Rossi-Hansberg (2002) model, equilibrium wages in a mixed land use site are only a function of production externalities. Thus, in a mixed land use case the cost of commute between the two locations would not be significant in explaining the variations in the ratio of employment density. Substituting the wage relationship from the equation above, specifically for the case where a worker lives in location x and commutes to the subcenter s, we get:

$$\frac{n(x)}{n(s)} = \left[\frac{z(x)}{z(s)}\right]^{\frac{\gamma}{1-\alpha}} e^{\frac{\kappa\, d(x,s)}{1-\alpha}} \qquad (3)$$

where $n$ denotes employment density, $z$ production externalities, $\kappa\, d(x,s)$ the cost of commuting between x and s, and $\gamma$ and $\alpha$ the output elasticities of the externality and of employment density.

From equation (3), it may be inferred that if a tract is primarily residential and workers from this tract commute to the subcenter, then an increase in the cost of commute from tract x to subcenter s would increase the density of tract x relative to the subcenter. Analogously, if tract x is primarily a business location, and workers from subcenter s commute to x for their jobs, then an increase in the cost of commute to tract x would decrease the relative density of tract x, as workers would prefer to find jobs close to their residences in tract s. Lastly, if tract x is a mixed land use one, where workers co-locate with their places of work, the cost of commute would not be a significant driver of the employment density ratio. With respect to production externalities, equation (3) suggests that own externalities positively affect the relative employment density of the tract, while higher agglomerative forces in the subcenter tend to reduce the relative density.

The main results of the model can be summarized as below:
1. With any rate of attenuation of production externalities, a CBD bordered with residential land use emerges if the cost of commute is extremely low.11
2. Mixed land use begins to appear when commute costs increase to a moderate level, and in fact if the cost of commute is very high, the entire city could be devoted to mixed use.
3. A decline in production externalities reduces the size of the pure business district. If combined with a high cost of commute, business areas could be barely seen as spikes at the city edge.

Section 3: Data

This section describes the data requirements for identifying and explaining the urban spatial structure in Kampala. In addition, we also present a few facts about employment density patterns in the city.

3.1 Identifying urban spatial structure

To ascertain the urban spatial structure in Kampala, we need to test the alternative models of city organization. For testing the monocentric model, we need information on employment density and the distance from the CBD. Employment by establishment and the geographical coordinates of establishments are extracted from the Uganda Business Registry in the 2011 Census of Business Establishments (COBE). COBE is a nationwide census that the government of Uganda has undertaken three times. The one conducted in 1989-90 was not published at the time due to limited resources. The 2001-02 wave, which we refer to as the 2002 census in the paper, covered 163,321 business enterprises across the country, of which 55,448 belong to Kampala. By comparison, the UBR 2010-11 census, referred to as 2011, covered about three times as many enterprises: 458,106, of which 133,663 belong to Kampala.
These additional establishments in the recent census reflect not only the scale of net business formation since the 2002 census, but could also be accounted for by the additional coverage of all commercial farms and micro agribusinesses in the 2011 census.12 For each enterprise in the registry, UBR provides information on the official name and identity of the enterprise, its exact location (in terms of GIS coordinates), a description of its main activity in terms of a four digit International Standard Industrial Classification (ISIC) code, the number of persons engaged in the enterprise on the date of the census and the year of establishment of the enterprise. Additionally, the 2002 census separately provides the count of male and female employees in each establishment, while the 2011 census offers information on the gender of the owner of the establishment.

11 The intuition for this result is that the higher the cost of commute, the more likely a mixed land use pattern, in which people live close to their places of work to economize on these costs.
12 In contrast to other sectors, only formal business activities were covered for the agricultural sector in the 2002 census of UBR. The 2011 UBR covered both formal and informal businesses in the agricultural sector.

To geographically locate the establishments listed in the UBR data, we obtained GIS maps from UBOS that provide administrative boundaries at various layers. The city of Kampala is divided into five sub-counties and each sub-county is disaggregated into parishes. There are 97 parishes in all in the 2012 population and housing census conducted by UBOS. Each parish is further divided into villages for administrative data collection purposes, and these villages are disaggregated into one or more enumeration areas. There are 861 villages and 3,265 enumeration areas, respectively, in the 2012 population and housing census.
For most covariates, the village level is the lowest level of disaggregation for which we have data available. Following the literature on urban spatial forms, we refer to the administrative unit of villages in Kampala as tracts for the rest of our paper. Using the GIS codes obtained for establishments from the UBR census, a spatial join was applied in which attributes were assigned to the UBR data based on the enumeration area in which each business was located. The latter information was in turn obtained from UBOS GIS maps. Using each establishment’s GIS codes on the map of Kampala, we were able to obtain the sub-county, parish and village (tract) corresponding to each establishment. We then aggregate total employment by tract and obtain employment density using area information from UBOS GIS maps.

To obtain the distance from the CBD, we determine the centroid of the tract that is centrally located in the official definition of the CBD in Kampala by the Kampala Capital City Authority (KCCA). Using the GIS maps of tracts in Kampala, we then obtain the distance to the CBD as the minimum distance from the centroid of a tract to the centroid of the selected CBD. It is also noteworthy that in our work we experiment with different CBD tracts, but our results are insensitive to this choice.

In the next alternative set-up, we want to evaluate the degree of polycentrism in Kampala. The only data required for identifying subcenters as per the GWR technique are the employment density of a tract and the geographic coordinates of the centroid of this tract.

3.2 Explaining urban morphology

According to the Lucas and Rossi-Hansberg (2002) model, the main forces explaining the pattern of land use are the cost of commute and the extent of production externalities. Most workers in a city such as Kampala walk to work; it therefore makes sense to proxy the cost of commute by the distance to the nearest subcenter.
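The commute-cost proxy reduces to a nearest-neighbor distance between centroids. A minimal sketch follows, assuming centroids are stored as latitude and longitude in decimal degrees and taking the great-circle (haversine) distance as the straight-line measure; the function names and the coordinates in the example are hypothetical and are not taken from the UBOS or KCCA maps.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))

def nearest_subcenter(tract_lat, tract_lon, sub_lat, sub_lon):
    """For each tract centroid, return (index, distance_km) of the closest subcenter."""
    t_lat = np.asarray(tract_lat, dtype=float)[:, None]
    t_lon = np.asarray(tract_lon, dtype=float)[:, None]
    s_lat = np.asarray(sub_lat, dtype=float)[None, :]
    s_lon = np.asarray(sub_lon, dtype=float)[None, :]
    d = haversine_km(t_lat, t_lon, s_lat, s_lon)   # (tracts, subcenters) matrix
    idx = d.argmin(axis=1)
    return idx, d[np.arange(d.shape[0]), idx]

# Hypothetical centroids (assumption), roughly in the latitude/longitude range of Kampala.
idx, dist_km = nearest_subcenter([0.30, 0.35], [32.58, 32.60],
                                 [0.31, 0.36], [32.58, 32.61])
print(idx, dist_km)
```

The same function gives the distance to the CBD if the CBD centroid is passed as the only "subcenter". For a city of Kampala's extent, a projected coordinate system with Euclidean distances would give nearly identical results.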
We use the straight-line distance from the centroid of a tract to the centroid of the potential subcenter as a proxy for the cost of commute. Production externalities are calculated as per equation (1), where we use the land use maps provided by the Kampala Capital City Authority (KCCA) to calculate the share of land devoted to commercial or business use in a given tract. Besides these variables of interest, we control for time-invariant observable tract characteristics such as proximity to parks, railroads, roads and public transit/bus stations. These variables are calculated using OpenStreetMap (OSM) data. Additionally, we control for accessibility to various amenities such as health centers, which are obtained from KCCA maps. Our instruments, the distance to the nearest university and schools, are also obtained from KCCA. Finally, KCCA also provided information on the following variables in their land use maps: tract area used for transportation, tract area used for residential purposes, open space in a tract, tract area in slums, tract area in wetlands, tract area with irregular terrain and tract area devoted to lakes. These data on land use, income, slums and wetlands were extracted from GIS maps by intersecting the respective maps with the 2011 administrative boundaries maps.

3.3 Some Descriptive Facts on Urban Spatial Structure in Kampala

We begin by slicing Kampala's establishment-level data by industry to map the changes in the allocation of establishments, employment and entrepreneurship over the last decade. Tables 1a and 1b present the firm and employment counts and their shares in 2011 and 2002, respectively, for each sector, namely agriculture, manufacturing and services.
The 2011 census additionally allows us to infer the distribution of establishments that have female owners, while the 2002 census shows the split in employment by gender.13 Since services account for more than 85% of employment and 92% of the firm count in both waves of UBR, we present figures for several disaggregated services sectors. These tables suggest that retail trade, repair services, and hotels and restaurants together account for nearly half of employment and 70% of the firm count. Over the last decade, the share of firms and employment in services with lower tradability potential has increased, albeit marginally. This increase has come at the expense of a decline in the share of employment in the more dynamic tradable services sector. In terms of the gender composition of industry, table 1 reveals that women-owned enterprises are dominant in services with lower tradability, such as retail trade and repairs and hotels and restaurants; these were also the sectors that absorbed a large proportion of female labor in 2002.

3.4 Decomposition of change in Employment density: 2002-2011

Table 2 presents our first tract-level analysis. Following the productivity growth decomposition work of Baily et al. (1992), Griliches and Regev (1995), and Foster et al. (2001), we decompose the observed changes in aggregate employment density from 2002 to 2011 into “within” tract changes in employment density (i.e., average growth in employment density for tracts weighted by their initial employment shares) versus “between” changes across tracts in activity (i.e., relocation of activity from tracts with low initial employment density to tracts with high initial employment density).14 Appendix Box B.1 summarizes the methodology for this decomposition.
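As an illustration of this kind of decomposition, the sketch below implements the textbook Griliches and Regev (1995) split for a balanced set of tracts, in which the within term uses period-average shares and the between term uses deviations from the period-average aggregate. It is a sketch under these assumptions, not a reproduction of Appendix Box B.1, and the shares and densities are made-up numbers.

```python
def griliches_regev(s0, y0, s1, y1):
    """Split the change in Y = sum_i s_i * y_i into within and between terms.

    s0, s1: tract employment shares in the two periods (each sums to 1)
    y0, y1: tract employment densities in the two periods
    """
    total0 = sum(s * y for s, y in zip(s0, y0))
    total1 = sum(s * y for s, y in zip(s1, y1))
    avg_total = 0.5 * (total0 + total1)
    # Within: change in each tract's density, weighted by its average share.
    within = sum(0.5 * (a + b) * (d - c) for a, b, c, d in zip(s0, s1, y0, y1))
    # Between: change in share, weighted by the tract's average density
    # relative to the average aggregate density.
    between = sum((b - a) * (0.5 * (c + d) - avg_total)
                  for a, b, c, d in zip(s0, s1, y0, y1))
    return within, between, total1 - total0

# Made-up example with three tracts.
within, between, total = griliches_regev(
    s0=[0.5, 0.3, 0.2], y0=[100.0, 50.0, 10.0],
    s1=[0.6, 0.25, 0.15], y1=[120.0, 40.0, 12.0],
)
```

In this form the two terms add up exactly to the aggregate change, which is why the technique has no separate covariance component.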
Using the technique of Griliches and Regev (1995) in panel B of table 2, we find that one-third of the increase in overall employment density in the last decade is accounted for by the “between tracts” component; that is, employment in Kampala is also being significantly reallocated across tracts, implying an increased dispersion of economic activity.15 Using an alternative methodology in panel C, we find that both the within and between components have served to reduce employment density in Kampala, again indicating that on average employment became even more dispersed over the last decade. Interestingly, the covariance term is positive, meaning that the increase in employment density is accounted for by only a few tracts that were growing more than the national average. This could perhaps be the central business district, and thus we witness an increase in the importance of the CBD over the last decade. For the rest of Kampala, employment density has declined and employment is even more dispersed than it was a decade ago.

This exercise sets the stage for testing the available alternative models of urban spatial structure. Specifically, we want to examine whether Kampala is characterized by a monocentric configuration, a polycentric structure or a mixed pattern of organization of economic activity. We describe our empirical strategy next, followed by the analysis in the subsequent section.

13 These establishments may be co-owned by male owners.

14 Our table presents weighted employment density, weighted by employment share as well as the firm count of the tract. As noted earlier, the coverage of the 2011 UBR data is much wider, which may bias our decomposition into “within” and “between” components if the newly included sectors were drawn from a few select tracts only. These tracts would likely show a larger change in the “within” tract density.

15 This technique and its merits and demerits are outlined in greater detail by Foster et al. (2001). Most importantly, the technique lacks a covariance component, with this feature instead absorbed into both the within- and between-district components. Weighing against this disadvantage, the Griliches and Regev (1995) technique is more robust to measurement error.

Section 4: Empirical Strategy

We conduct our empirical work in two steps. First, aggregating employment from establishment-level data, we obtain the employment density in each tract and ascertain the urban spatial structure of Kampala. Second, we identify the correlates of the observed distribution of employment density across tracts in Kampala using the Lucas and Rossi-Hansberg (2002) formulation. Accordingly, in this section we first discuss the empirical techniques for isolating the urban spatial structure and then the strategy for identifying its determinants. Finally, we shed light on some of the econometric issues.

4.1 Urban spatial structure

The monocentric model predicts that employment density declines smoothly as the distance from the CBD increases.16 We begin by fitting the standard monocentric model:

$$\ln n(x) = \alpha + \beta\, d(x) + \varepsilon(x) \qquad (4)$$

where $\ln n(x)$ is the logged value of the employment density of tract $x$, $d(x)$ is the distance of tract $x$ from the CBD and $\varepsilon(x)$ is the stochastic error term.

Linear and cubic splines are other attractive versions of the monocentric model which have been used by several researchers (e.g. Anderson, 1982). In this approach, the distance variable is split into intervals and a separate linear or cubic function is applied to each region. The function is constrained to be smooth at the boundaries between regions (which are known as “knots”). Distance from the CBD is divided into several intervals and an equation of the following form is estimated:

$$\ln n(x) = \alpha + \beta_1 \big(d(x) - d_0\big) + \beta_2 \big(d(x) - d_1\big) D_1(x) + \beta_3 \big(d(x) - d_2\big) D_2(x) + \beta_4 \big(d(x) - d_3\big) D_3(x) + \varepsilon(x) \qquad (5)$$

where the minimum value of distance from the CBD is $d_0$, while the boundaries between regions are $d_1 = 3$, $d_2 = 6$ and $d_3 = 9$ km; the $D_k(x)$ terms are dummy variables that equal one when $d(x) \geq d_k$.17 The interpretation of the coefficients is simple: $\beta_1$ is the slope coefficient for distance from the CBD between 0 and 3 km; $\beta_2$ is the slope coefficient for distance from the CBD between 3 and 6 km, and so on. Similarly, higher-order splines may be estimated by generating polynomial functions of the distance intervals.18

16 Increasing incomes and urban populations have been shown to cause a decline in the slope of the density gradient over time (McMillen, 2006).

Several approaches have been adopted to identify subcenters. A monocentric model can also aid the identification of subcenters of economic activity. For instance, the McDonald and Prather (1994) approach involves looking for clusters of significant positive residuals from a simple regression of the natural logarithm of employment density on distance from the CBD. Giuliano and Small (1991) suggest that a subcenter can be identified by visual inspection of maps, defining a subcenter as a set of contiguous tracts each having a minimum employment density of 10 employees per acre and, together, having at least 10,000 employees. Their method has been adopted by Anderson and Bogart (2001), Bogart and Ferry (1999), Cervero and Wu (1997, 1998), Small and Song (1994) and in the first stage of the study by McMillen and McDonald (1998). Other statistical procedures for identifying subcenters have been proposed by Craig and Ng (2001), Giuliano and Small (1991) and McDonald (1987). In these models, a reasonable value of employment density is chosen based on local knowledge of the city. In general, the employment density required for subcenter status is likely to be higher in areas with higher overall density levels.
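The Giuliano and Small (1991) rule quoted above (contiguous tracts with at least 10 employees per acre each and at least 10,000 employees together) can be sketched as a connected-components search. The adjacency structure, tract labels and figures below are hypothetical, and the function name is ours rather than anything from the cited studies.

```python
from collections import deque

def giuliano_small_subcenters(density, jobs, neighbors,
                              min_density=10.0, min_jobs=10_000):
    """Group contiguous dense tracts; keep groups whose total jobs reach min_jobs."""
    dense = {t for t, d in density.items() if d >= min_density}
    seen, subcenters = set(), []
    for start in sorted(dense):
        if start in seen:
            continue
        # Breadth-first search over neighbouring tracts that are also dense.
        group, queue = [], deque([start])
        seen.add(start)
        while queue:
            tract = queue.popleft()
            group.append(tract)
            for nb in neighbors.get(tract, ()):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if sum(jobs[t] for t in group) >= min_jobs:
            subcenters.append(sorted(group))
    return subcenters

# Hypothetical tracts: density in employees per acre, jobs in employees.
density = {"A": 12, "B": 15, "C": 4, "D": 30, "E": 11}
jobs = {"A": 6000, "B": 5000, "C": 900, "D": 4000, "E": 3000}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"],
             "D": ["C", "E"], "E": ["D"]}
found = giuliano_small_subcenters(density, jobs, neighbors)
```

Here tracts A and B form a qualifying cluster, while D and E are dense but fall short of the total employment threshold. Both thresholds are exactly the arbitrary, city-specific choices that the text goes on to question.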
Instead of relying on arbitrary cut-offs that require local knowledge of an area, we use McMillen's (2001) non-parametric technique to identify subcenters of employment in Kampala. McMillen (2001) uses a geographically weighted regression (GWR) to detect potential subcenter sites. GWR places more weight on nearby observations when estimating a predicted value for the natural logarithm of employment density at a target site. This procedure returns an estimate of the employment density at each site, which can be used to identify the potential subcenters of a city. Subcenters are those sites that have densities significantly greater than the initial smooth. Said differently, density in a subcenter is considerably higher than in the neighboring sites. Statistically, a site is a potential subcenter if the following holds:

$$\frac{y(x) - \hat{y}(x)}{\hat{\sigma}(x)} > c$$

where $y(x) = \ln n(x)$ is the observed log employment density at site $x$, $\hat{y}(x)$ is the GWR estimate of (log) employment density at site $x$, $\hat{\sigma}(x)$ is the estimated standard error of the prediction, and $c$ is the critical value for a normal distribution. Critical values associated with the 5 percent, 10 percent and 20 percent significance levels are 1.96, 1.64 and 1.28. Clearly, the number of potential subcenter sites increases as $c$ falls.19

17 We attempted fitting the spline function at other appropriate distance bands, and our results are not highly sensitive to this choice. Our categorization represents a uniform split of the total radius of the administrative boundary of the city of Kampala that is used in this study.

18 Other alternatives for estimating a monocentric model include nonparametric and semiparametric estimators such as that used by McMillen (1996); however, these are far more difficult to use and have few advantages when nonlinearity is confined to a single variable.
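The flagging rule can be written in a few lines. This sketch assumes the GWR predictions and their standard errors have already been produced elsewhere; the function name and the numbers are hypothetical.

```python
def flag_potential_subcenters(log_density, gwr_fit, gwr_se, c=1.28):
    """Return indices of sites whose standardized residual exceeds c."""
    flagged = []
    for i, (y, y_hat, se) in enumerate(zip(log_density, gwr_fit, gwr_se)):
        # Sites with a non-positive standard error are skipped.
        if se > 0 and (y - y_hat) / se > c:
            flagged.append(i)
    return flagged

# Made-up values for four sites.
y = [8.0, 6.5, 7.2, 9.1]        # observed log employment density
y_hat = [7.0, 6.4, 7.3, 7.9]    # GWR prediction
se = [0.6, 0.5, 0.5, 0.8]       # standard error of the prediction

at_20_percent = flag_potential_subcenters(y, y_hat, se, c=1.28)
at_10_percent = flag_potential_subcenters(y, y_hat, se, c=1.64)
at_5_percent = flag_potential_subcenters(y, y_hat, se, c=1.96)
```

With these values, two sites pass at c = 1.28, one at c = 1.64 and none at c = 1.96, which mirrors the statement that the number of candidate sites grows as c falls.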
There are second-stage methods available that eliminate sites with trivially small overall employment levels. For instance, the Giuliano and Small (1991) approach suggests that a subcenter is a group of contiguous sites with significantly positive residuals in which total employment exceeds a critical value. The critical value for total employment again introduces an arbitrary element to the subcenter definition. However, the critical value for total employment is less arbitrary than Giuliano and Small's cut-off point for minimum employment density and is less likely to require variation across cities or within a metropolitan area.

19 Since we are only able to use data at the tract level (rather than at the finest geographical scale, that is, the enumeration area), we work with a lower cut-off for c (at the 20% level of significance, which is 1.28). Most literature in the field pertains to developed countries, where there is a problem of plenty rather than paucity in finding potential subcenter sites. These studies have mostly used a cut-off at the 5% level of significance (e.g. McMillen, 2001). In the case of Kampala, choosing a 10% level of significance leaves us no potential subcenters except a few tracts that belong to the CBD and hence do not qualify as subcenters. We therefore work with a lower cut-off so as to retain some potential subcenter sites in Kampala and effectively implement the Lucas and Rossi-Hansberg (2002) model.

4.2 Determinants of urban morphology

Using insights from the conceptual framework (equation 2), we estimate the following base model:

$$\ln n(x) = \alpha + \beta \ln z(x) + \gamma \ln w(x) + \phi' T(x) + \varepsilon(x) \qquad (2)$$

where $n(x)$ is the employment density in tract $x$, $z(x)$ is the measure of production externalities in tract $x$, $w(x)$ is the wage in tract $x$, and $T$ is the vector of controls. All unobservable traits, including tract and nearest subcenter characteristics and land use policies such as zoning laws and non-market regulations, are captured by the error term $\varepsilon(x)$.

The literature suggests many factors contained in the vector $T$ that are key to determining the pattern of equilibrium land use. First and foremost, the attractiveness of the city center is perhaps the most important element (e.g. McMillen and McDonald, 1998). We denote by $A$ the vector that measures accessibility to the CBD, mainly the distance to the CBD. The CBD would be accessible if the tract were close to bus stations, railroads and roads; these measures are also included in $A$. Second, if subcenters are dominant business hubs, then access to subcenters should also appear as a determinant in the density equation. Measures of access to potential subcenters, represented by $S$, primarily include the distance to the nearest subcenter. Potential employment sites also have attractive amenities, such as proximity to health centers and parks, that influence their likelihood of standing as a significant employment center. The proximity to these amenities is captured by the vector $V$. In our specification, we model accessibility to subcenters and proximity to amenities in inverse form because it provides a simple and direct means to determine whether such measures have any influence on densities without having to worry much about multicollinearity problems.20 Additionally, in Kampala a large proportion of people walk to work, and an inverse form is apt for capturing any non-linearity in the cost of commute.21

20 As in McMillen and McDonald (1998), the inverse distance at the subcenter location is arbitrarily set to 4 rather than to infinity.

21 See KCCA (2012).

Other tract traits included in $T$ are the irregularity of the terrain of the tract and the shares of tract area assigned to residential purposes, open grounds, and lakes and wetlands. These factors tend to limit a tract's participation in hosting employment activities. The presence of slums and the percentages of high- and low-income households in a given tract also affect the employment density of the site.
Such measurable idiosyncratic characteristics of the tract are represented by the vector $W$. Thus, $T$ can be summarized as:

$$T = (A, S, V, W)$$

Including both $z(x)$ and $T$ in our specification allows us to distinguish between different sources of agglomeration economies: those that arise purely from access to common infrastructure such as transport and bus stations, and those that are internal to the group of establishments within the tract, that is, production externalities. For instance, if firm location is determined only by accessibility, then we would expect only the accessibility measures to be statistically significant in an employment density regression. Conversely, if production externalities exist, then a firm may bid more for locating in these tracts, independent of the other advantages.

4.3 Econometric Issues

We now highlight some of the econometric issues in estimating equation (2).

4.3.1: Omission of wage information

Our true estimating equation from the Lucas and Rossi-Hansberg (2002) model is equation (2), but we do not have information on wages for each tract in Kampala. Omission of wage information could lead to a bias in the estimated coefficients. In the Lucas and Rossi-Hansberg (2002) model, wages are determined endogenously, perhaps as a function of the distance of the tract to the nearest potential subcenter, $d_s(x)$, and the production externality in the tract:22

$$w(x) = w\big(d_s(x), z(x)\big) \qquad (6)$$

Substituting the above wage equation into equation (2), we get:

$$\ln n(x) = \tilde{\alpha} + \tilde{\beta} \ln z(x) + \Psi' T(x) + u(x) \qquad (7)$$

where $\tilde{\alpha}$, $\tilde{\beta}$ and $\Psi$ combine the parameters of equations (2) and (6), $u(x)$ is the resulting error term, and $T = (A, S, V, W)$. Our specification in equation (7) is very similar to the McMillen and McDonald (1998) model; here we additionally use insights from Lucas and Rossi-Hansberg (2002) because production externalities are conceptually very important for explaining spatial variations in employment density.

4.3.2: Endogeneity concerns

In estimating equation (7), we suspect that our main explanatory variable, production externality, could be endogenously determined with employment density.
Equation (1) builds the agglomeration index as a function of employment density in all the neighboring sites. Local interaction may be determined by the density of the given tract itself because, for instance, both are deeply influenced by geographical or locational features. Due to the endogeneity of production externalities, any estimated effect of this variable is likely to be biased, and one needs instruments that are correlated with externalities in the current period. It is standard in the literature to use as instruments those variables that are correlated with agglomeration but uncorrelated with geographical location (Ellison and Glaeser, 1999). A natural candidate in this regard is the lagged value of the explanatory variable, that is, the agglomeration index. Higher externalities in the past are positively and strongly correlated with current levels of externalities, while these lagged values are not influenced by current levels of employment density. The only lagged agglomeration index available to us at the tract level is that from the UBR 2002 data. Our first set of instrumental variable (IV) results uses this lagged agglomeration index as an instrument, but we acknowledge that a 10-year lag is not sufficiently long: externalities in production 10 years ago could not only have an indirect effect on current levels of density but may also directly affect density in 2011. Given the shortcoming of this instrument, we additionally experiment with alternative instruments such as distance to universities and schools. The idea here is that proximity to universities affects agglomeration forces either through knowledge spillovers from educational institutes or through changes in land use.

22 McMillen and McDonald (1998) contend that most of the spatial variation in wages within suburban areas could be explained by site-specific factors such as distance from the city center.
The identifying assumption is that these spillovers only affect the extent of production externalities but have no direct impact on the motivations for clustering of firms. We discuss the validity and strength of each of these instruments as we estimate our model through an IV technique in Section 6.

4.3.3 Non-linearity of production externalities in affecting urban spatial structure

In the estimating equation (7), we assumed that production externalities linearly impact employment density. However, the exact functional form could be non-linear. For instance, at a low level of production externalities, the impact of a marginal increase in the external economies of the tract may remain negligible, while at a high level of agglomeration such a change in externalities may turn the tract into a potential subcenter site. If this is indeed the case, then a linear regression of employment density would yield inconsistent and biased estimates of the parameters for production externalities. We will therefore check the robustness of our results in (7) by estimating the equation using a semiparametric estimation procedure. This partially linear framework is estimated by employing Robinson's (1988) double residual methodology and using local linear methods to estimate the functional form of production externalities affecting employment density. This method estimates, for each tract in Kampala, a weighted regression based on a Gaussian kernel.

Endogeneity concerns can also be dealt with in a semi-parametric framework in the standard manner. In the first stage, production externalities, measured by the agglomeration index in 2011, are regressed on a relevant instrument and all other independent variables. The residuals obtained in this stage are called first stage residuals. We could alternatively employ a non-parametric estimation procedure to generate first stage residuals as well.
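To make the role of first stage residuals concrete, the sketch below runs the same two steps in a purely linear setting on simulated data. Everything here is an assumption for illustration: the data-generating process, the variable names and the use of simple regressions in place of the semi-parametric second stage. In a linear model, adding the first stage residuals as a regressor gives the same coefficient as two-stage least squares.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def residuals(y, x):
    """Residuals from a simple regression of y on a constant and x."""
    slope = cov(x, y) / cov(x, x)
    intercept = mean(y) - slope * mean(x)
    return [yi - intercept - slope * xi for yi, xi in zip(y, x)]

random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]  # instrument (simulated)
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]  # endogenous regressor
y = [2.0 * xi + 3.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]  # true slope 2.0

# Naive regression of y on x ignores the confounder and is biased upward.
naive_slope = cov(x, y) / cov(x, x)

# First stage: regress x on the instrument and keep the residuals.
v = residuals(x, z)

# Second stage: regress y on x and v. By the Frisch-Waugh theorem, the
# coefficient on x equals the slope after partialling v out of both y and x.
x_res = residuals(x, v)
y_res = residuals(y, v)
control_function_slope = cov(x_res, y_res) / cov(x_res, x_res)
```

The naive slope lands near 3 in this simulation, while the control-function slope recovers a value close to the true 2. The same logic carries over when the second stage is estimated semi-parametrically.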
After obtaining the residuals from the first stage regression, we run a semi-parametric regression of employment density on not only our usual controls but also the residuals obtained from the first stage (see Blundell and Powell, 2003). A semiparametric estimation of equation (7) would confirm whether the effect of production externalities is indeed non-linear.23

4.3.4 Zeros in dependent variable

As would be expected, not all tracts have non-zero employment. In our sample of 861 tracts for the 2011 UBR, 833 have positive employment. Although only a small proportion of tracts in our sample have no recorded employment, especially in comparison with studies such as McMillen and McDonald (1998), in which only 4,887 out of 14,290 quarter sections in 1990 Chicago had non-zero employment, we nonetheless check for bias in OLS estimates that use only non-zero densities. McMillen and McDonald suggest using Heckman's (1976) two-stage maximum likelihood procedure, which produces consistent estimates. The first stage of this estimation is a Probit model of the following equation using all observations:

$$\Pr\big(I(x) = 1\big) = \Pr\big(\lambda' K(x) + u(x) > 0\big)$$

where $I(x)$ is an indicator variable that equals 1 if the tract has positive employment and 0 otherwise, $\lambda$ is the vector of coefficients that determine land use, $K(x)$ is the vector of variables that impact employment density and $u(x)$ is the normally distributed error term. In the second stage, selection-bias-corrected OLS estimates of the following equation are presented for tracts with non-zero densities:

$$\ln n(x) = \mu' K(x) + v(x)$$

where $\mu$ is the vector of coefficients determining the pattern of employment density and $v(x)$ is the i.i.d. error term. We will use Heckman's procedure to check the robustness of our OLS estimates.

4.4 Extended Model

Although the McMillen and McDonald (1998) approach is useful in understanding the dispersion in employment density across tracts in Kampala, it does not explain the extent of mixed land use, if any.
The model was built to explain the variation in employment density for a city like Chicago that had many significant centers of employment. However, such a model is perhaps not suitable for studying a city like Kampala where, as we will see in the following sections, employment is dispersed in an unsystematic manner. Lucas and Rossi-Hansberg (2002), on the other hand, provide a very useful framework for understanding the mixed land use pattern of urban development. Specifically, the relative employment density of a tract vis-à-vis the nearest subcenter would be a useful test of the predictions of the Lucas and Rossi-Hansberg model. For instance, in the Lucas and Rossi-Hansberg formulation, equation 3 suggests that if the tract is primarily a residential location, the cost of commute should have a positive impact on the relative employment density, while the effect of this cost should be insignificant when an average tract is of the mixed land use type. This is because the residents in location x do not commute to the subcenter for employment, and thus the distance to the subcenter is not likely to be crucial.

23 In our estimations, we also consider the possibility that distance to the nearest potential subcenter enters non-parametrically, but this hypothesis did not hold.

In the extended model, we estimate the logged form of equation 3, which relates the relative employment density $n(x)/n(s)$ to the relative production externalities $z(x)/z(s)$ and the cost of commute. In principle, there is no reason to suppose that the parameter measuring the effect of production externalities of the given tract should be the same as that of the nearby potential subcenter. In most of our work on the extended model, we allow for varying parameters for these variables of interest by estimating the following equation:

$$\ln \frac{n(x)}{n(s)} = \alpha + \beta_x \ln z(x) + \beta_s \ln z(s) + \Psi' T(x) + \tilde{\varepsilon}(x) \qquad (8)$$

where $z(s)$ and $n(s)$ are, respectively, the production externalities and employment density of the nearest subcenter. As in our base model, we run similar robustness checks on the OLS estimation of the above equation.
Section 5: Spatial pattern of employment in Kampala

This section identifies the urban spatial structure in Kampala, starting with some descriptive facts and then investigating the issue more deeply using more robust techniques.

5.1 Employment Distribution within the city: Is Kampala Monocentric?

Figure 1 presents the map of employment density in Kampala at the tract level. The map clearly shows that very few tracts outside the contiguous, centrally located business district have a density higher than 4,000 jobs per square kilometer in either census year, 2002 or 2011. Kampala has a very strong core of employment concentration but, nonetheless, it appears that there may be certain pockets with high job concentration as well. A visual inspection of the map suggests that the density of jobs does not fall monotonically with distance from the CBD. To test this more formally, we estimate the standard monocentric model in equation 4. The result of this estimation for 2011 is shown in column 1 of table 3 (Appendix table A.1, column 1 shows this for 2002). The slope gradient in 2011 suggests that tract employment density declines by 34% per km increase in distance from the CBD. Unlike African counterparts such as Johannesburg and Cape Town, where the density gradient is found to be positive (Bertaud and Malpezzi, 2014), density data in Kampala conform to a standard urban density model with a declining density gradient. In fact, these estimates and the value of adjusted R-square are similar to those obtained by McDonald and Prather (1994) for Chicago in 1980 using 1,196 urbanized tracts.24 Interestingly, there has been a marginal decline in the density gradient from 0.38 (table A.1) in 2002 to 0.34 in 2011, a trend that has been widely documented for cities across the world. For example, the average population density gradient for four US metropolitan areas declined from 0.40 in 1954 to 0.31 in 1963 (Mills, 1972).
Many cities in developing countries have also shown a much sharper decline in density gradients over ten-year periods. For instance, the population density gradient in Bombay, a metropolitan city in India, declined from 0.373 in 1881 to 0.325 in 1891 (Brush, 1968). The decline in the density gradient for Kampala may be related to an increase in average household income and urban employment in Kampala over the last decade (UBOS, 2010).25 The marginal decline in the density gradient for Kampala could also reflect the fact that commute costs in Kampala have not changed much over the last ten years.26 This declining trend in density is, however, in contrast to some cities in developing countries, such as Hyderabad in India, where the density gradient increased from 0.09 to 0.324 during the period 1951-61 (Brush, 1968).

24 Their estimates suggest that a 1 mile increase in distance decreased employment density by 13% per square mile (approximately 34% per square km).

Next, we test whether the density gradient in Kampala falls smoothly with an increase in distance from the CBD. Column 2 of table 3 presents the estimation results of equation 4 where we drop tracts within 3 kilometers of the CBD in Kampala. Although the density gradient is still negative, the distance from the CBD has very low explanatory power. The comparative static prediction of a monocentric model, a smoothly declining density gradient, does not seem to be validated for Kampala beyond 3 km of the CBD. This holds true for both years of UBR census data. Our result is somewhat analogous to the case of Chicago in 1980, where distance from the CBD does not explain the variation in the floor area ratio gradient, although in Chicago this happens at a much higher cut-off of 18 miles (McMillen, 2006). Columns 3-6 present a spline estimation of the employment density function following equation 5.
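To see what a one-knot linear spline implies, the sketch below evaluates a piecewise-linear log density profile. The slopes of -0.91 within 3 km and -0.18 beyond are taken from the 2011 one-knot estimates discussed in this section, read here as per-km changes in log density; the intercept of 10.0 and the function name are hypothetical.

```python
def spline_log_density(d_km, intercept, slopes, knots):
    """Piecewise-linear log density; slopes[i] applies on the i-th distance interval."""
    edges = [0.0] + list(knots) + [float("inf")]
    value = intercept
    for slope, lo, hi in zip(slopes, edges[:-1], edges[1:]):
        # Length of the part of [lo, hi] that lies within distance d_km.
        value += slope * max(0.0, min(d_km, hi) - lo)
    return value

def profile(d_km):
    return spline_log_density(d_km, intercept=10.0, slopes=[-0.91, -0.18], knots=[3.0])

drop_first_3km = profile(0.0) - profile(3.0)   # log points lost within 3 km
drop_next_3km = profile(3.0) - profile(6.0)    # log points lost from 3 to 6 km
```

Under these assumptions the log density falls by 2.73 over the first 3 km but only by 0.54 over the next 3 km, which is the sense in which the profile becomes nearly flat beyond the core.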
In the simple version of a linear spline with one knot, shown in column 3, we split the estimation of the density gradient into a section up to 3 km and a section beyond 3 km. For the year 2011, the density gradient is 0.91 for tracts lying within 3 km of the CBD while it is merely 0.18 beyond 3 km, again confirming that most of the decline in economic activity occurs within 3 km of the CBD and that beyond that distance there are no significant centripetal or centrifugal forces that agglomerate or de-concentrate activities. Similar conclusions can be drawn from columns 4 and 5, which estimate a linear spline model with 3 knots,27 and column 6, which estimates a cubic spline model.28 Figures 2a and 2b present a comparison of OLS estimates vis-à-vis splines. Spline estimation seems to capture the urban spatial distribution of employment in Kampala more appropriately. The figures suggest that most economic activity de-concentrates within 3 km of the CBD and that beyond that point employment in the rest of the city disperses rather slowly. The density gradient is nearly horizontal beyond 3 km of the CBD.

In sum, our estimation of a monocentric model for Kampala suggests that the city has a very concentrated nucleus but that beyond 3 km from the CBD it is spatially dispersed and perhaps characterized by mixed land use.

25 Increases in income and a rise in urban population gradually lead to a decline in the density gradient (McMillen, 2006). Comparing population density estimates for Baltimore, Milwaukee, Philadelphia, and Rochester for 1880–1963, Mills (1972) finds support for the argument that the density gradient is flatter when cities have higher populations and incomes and lower commuting costs.

5.2 Is Kampala Polycentric? A Non-parametric approach to subcenter identification

5.2.1 Identification of potential subcenters
Using the monocentric model in equation 4, McDonald and Prather (1994) suggest identifying subcenters as clusters of economic activity where the residual is higher than a given cut-off. Ranking the residuals by size, we retain 25 subcenters, of which only 7 are not contiguous with the CBD tract (in Nakawa and Kawempe sub-counties).29 However, since the fit of the monocentric model is poor beyond 3 km of the CBD, we believe that it is not the most robust technique for identifying subcenters. A non-parametric estimation, geographically weighted regression (GWR), yields 27 subcenters in Kampala for the year 2011. These subcenters are highlighted on the map of Kampala in figure 3.30 Using the matrix contiguity approach of McMillen (2003), we retain 6 subcenters besides the CBD.31 On top of this, we also apply a minimum total employment cut-off for a subcenter to count as a site of significant economic activity. We adopt a conservative approach so as to retain most identified centers of employment and keep this cut-off at 750 jobs. Thus, we eliminate only one subcenter tract using this cut-off (Katale tract, located in Kawempe sub-county and Bwaise II parish).32

26 We infer the cost of commute from the state of roads and transport infrastructure, which has stagnated over the last decade (UBOS, 2012).

27 Column 5 estimates equation 5 considering tracts beyond 3 km of the CBD. The R-square for this estimation is very low, and thus distance from the CBD does not significantly explain the variation in employment density beyond 3 km of the CBD.

28 Although not reported in the paper, we find similar results when using firm density rather than employment.

5.2.2 Subcenter characteristics

We now turn to subcenter characteristics to check whether they qualify as points of significant economic activity. Table 4a presents the aggregate contribution of the CBD, potential subcenters and the remaining tracts towards employment and firm share.
The table also presents their density in employment and firm count. The table shows that the CBD in Kampala has a job density of 82,404 per square kilometer, which is equivalent to 333 jobs per acre, a density that is comparable to, yet lower than, the 397 jobs per acre in Chicago's CBD in 1956 (McDonald and McMillen, 1990). Table 4b presents analogous results for the 2002 UBR census, where a separate GWR estimation was done to identify these subcenters.33 Tables 4a and 4b suggest that over the last decade the share of the CBD in total employment has increased from 14% to 18% while that in firm count has increased from 19.5% to 20.5%. The share of the CBD in total employment is similar to that witnessed by large cities in the US during the 1980s. However, since the 1960s, metropolitan cities around the developed world have been witnessing a decline in the importance of the CBD (Kenworthy and Laube, 1999; Angel and Blei, 2015a).

29 The list of names of these subcenters is available on request.
30 GWR was estimated in the software GWR4 using an adaptive Gaussian kernel. Although the subcenters are sensitive to the choice of bandwidth, the cut-off significance level and the type of kernel chosen, we find that in most cases the subcenters identified outside of the central sub-county were consistently drawn as sites of potentially higher employment density vis-à-vis neighboring sites. Specifically, we used the golden section search feature of the GWR software to identify the appropriate bandwidth. Nonetheless, our results were broadly consistent with user-defined ranges (up to 200 neighboring tracts). Once the residuals were obtained, a cut-off of 1.28 was chosen (20% level of significance) to weed out non-significant potential subcenters.
31 In the case of cities in the US, such as Atlanta, Boston, New York, Chicago, and Philadelphia, McMillen (2003) defines two sites as 'contiguous' if they are within 1.25 miles of one another. Since Kampala is small relative to a city in the US, we define two tracts as contiguous if they are within 1.25 km of each other. Using this criterion, we infer that the identified employment centers in the central sub-county are contiguous with each other while none of the identified potential subcenters outside the central sub-county are contiguous, except the ones identified in Nakawa sub-county.
32 Total employment in a given subcenter has been most often used in studies such as McMillen and Smith (2003) and McMillen (2003) to eliminate centers of non-significant employment. Depending on the city and the subcenter location, in most cases a cut-off of 10,000 or 20,000 employees is chosen for cities in the US. In the case of Mexico City, however, Aguilar and Alvarado (2004) applied a minimum cut-off of 5,000 jobs and identified 35 subcenters in the city using 1999 census data. For Kampala, we believe that the minimum cut-off should be lower than 5,000 given the income and employment statistics vis-à-vis the Latin American city. A cut-off of 750 jobs per tract seems extremely low, but it is perhaps justifiable given that the total labor force in Mexico in 1999 was 39.5 million as compared to 13.6 million in Uganda in 2011, while per capita income (as a proxy for development) suggests that Uganda in 2011 was at 1/6th of the level of Mexico in 1999. If we applied half the cut-off used in Mexico City, that is, an aggregate of 2,500 jobs per identified subcenter tract, we would not be left with any subcenter at all. Relative to the local distribution of jobs among non-CBD tracts in Kampala, a tract with 750 employees is above the 90th percentile of tracts. Including the CBD tracts, a cut-off of 750 still lies above the 85th percentile of tracts.
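The screening steps described in section 5.2.1 and footnotes 30 to 32 can be illustrated with a minimal sketch. It assumes each tract already carries a standardized GWR residual, centroid coordinates in kilometers and a job count. The three thresholds come from the text. The field names, the toy data, the use of straight-line centroid distance, and the rule of keeping the highest-residual tract within each contiguous group are illustrative assumptions, not the authors' procedure.

```python
from math import hypot

RESIDUAL_CUTOFF = 1.28  # significance cut-off on residuals (footnote 30)
CONTIGUITY_KM = 1.25    # tracts within this distance count as contiguous (footnote 31)
MIN_JOBS = 750          # minimum employment per subcenter tract (section 5.2.1)


def screen_subcenters(tracts, cbd_xy):
    """Return names of retained subcenter tracts (hypothetical helper).

    tracts: list of dicts with keys name, x, y (km), resid, jobs.
    cbd_xy: (x, y) of the CBD centroid in km.
    """
    # Step 1: keep tracts with a significantly positive residual.
    cands = [t for t in tracts if t["resid"] > RESIDUAL_CUTOFF]

    # Step 2: group contiguous candidates with a small union-find.
    parent = list(range(len(cands)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(cands)):
        for j in range(i + 1, len(cands)):
            dist = hypot(cands[i]["x"] - cands[j]["x"], cands[i]["y"] - cands[j]["y"])
            if dist <= CONTIGUITY_KM:
                parent[find(i)] = find(j)

    groups = {}
    for i, t in enumerate(cands):
        groups.setdefault(find(i), []).append(t)

    kept = []
    for members in groups.values():
        # Groups touching the CBD belong to the CBD, not to a subcenter.
        near_cbd = any(
            hypot(t["x"] - cbd_xy[0], t["y"] - cbd_xy[1]) <= CONTIGUITY_KM
            for t in members
        )
        if near_cbd:
            continue
        # Assumption: one representative tract per group, the largest residual.
        best = max(members, key=lambda t: t["resid"])
        # Step 3: apply the minimum employment cut-off.
        if best["jobs"] >= MIN_JOBS:
            kept.append(best["name"])
    return sorted(kept)


# Toy data, purely hypothetical.
toy_tracts = [
    {"name": "A", "x": 0.5, "y": 0.0, "resid": 2.0, "jobs": 5000},
    {"name": "B", "x": 4.0, "y": 0.0, "resid": 1.5, "jobs": 900},
    {"name": "C", "x": 4.8, "y": 0.0, "resid": 1.9, "jobs": 1200},
    {"name": "D", "x": 0.0, "y": 6.0, "resid": 1.4, "jobs": 600},
    {"name": "E", "x": -5.0, "y": 0.0, "resid": 0.9, "jobs": 3000},
]
print(screen_subcenters(toy_tracts, cbd_xy=(0.0, 0.0)))
```

In this toy example tract A is absorbed by the CBD, B and C form one contiguous group represented by C, D fails the 750-job cut-off, and E fails the residual cut-off.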
In contrast to large American cities where employment subcenters contribute to about 15% of employment in 2000, the contribution of Kampala’s subcenters to employment and establishment counts is abysmally small. Potential subcenters in Kampala account for only about 2.4% of employment and 2% of firms.34 The remaining employment in Kampala, which is about 80% in 2011, is rather dispersed all across the city. This confirms our preliminary finding using the monocentric model that Kampala is characterized by a mixed land use pattern.35 In conclusion, our work on ascertaining the urban spatial structure in Kampala suggests that the city has a concentrated business center, however, beyond the 3 km radius, employment is spatially dispersed. The small contribution to aggregate employment by the identified subcenters also confirms our intuition that Kampala is characterized by a mixed land use pattern. At this level of development and income, we do not expect a city like Kampala to have multiple centers of economic activity. However, the degree of monocentricity in Kampala seems to be lower and economic activity in the city is rather mixed with residential use. This sort of mixed land use, as argued in the literature, is not optimal and speaks volumes about the lack of a spatially organized productive structure. We now turn to explaining the observed pattern of spatial variation in employment density in terms of cost of commute and the extent of production externalities using the framework of the Lucas and Rossi-Hansberg (2002) model. Section 6: Explaining urban morphology in Kampala This section uses variation in employment density across tracts in Kampala to examine the factors that have shaped its urban spatial structure. Beginning with the estimation of the base model, we proceed to address the econometric issues highlighted in section 4. 
34 Employment density is high in subcenters because the tract area under consideration is very small. This is in line with the findings of Angel and Blei (2015a), who report that the average area of subcenters in American cities was 12.80±.260 km. By contrast, the average size of a subcenter tract in Kampala is only 0.21 square km while that of the CBD is only 0.70 square km.
35 Evaluating the contribution of the CBD and potential subcenters by sector, we found that the CBD contributes less to manufacturing employment vis-à-vis its contribution in services. Though the differences across sectors are less stark in Kampala, we contend that this finding is very much in line with the sectoral contribution of metropolitan cities in the US (Duranton and Puga, 2001).

6.1 Base Estimation

Our base estimation is essentially an OLS replica of the McMillen and McDonald (1998) model for Chicago in 1980 and 1990, except that we introduce production externalities in the spirit of the Lucas and Rossi-Hansberg (2002) model. The results of these estimations are reported in columns 1 and 2 of table 5a, where the vector T includes access measures to the CBD and the nearest potential subcenter.36 Access to the CBD is included in linear, square and cubic terms while access to the subcenter is proxied by the inverse of the distance to the nearest potential subcenter.37 The result in column 1 suggests that as the distance to the nearest potential subcenter decreases, employment density of the tract increases. Thus, tracts placed closer to the potential subcenter experience some positive spillovers from the nearest subcenter. Column 2 adds sub-county and parish fixed effects to the specification in column 1. Adding local fixed effects heightens the importance of nearest subcenters. Every one kilometer increase in distance from the nearest subcenter is associated with a decline in employment density of the tract by 33%.
As compared to the study on Chicago in 1980, where McMillen and McDonald (1998) document a 33.9% decline in employment density per mile increase in distance from the nearest subcenter, this decline in Kampala in 2011 is much flatter. In the subsequent specifications 3-12, we introduce production externalities using the agglomeration index of the Lucas and Rossi-Hansberg (2002) model. In columns 3 and 4, we note that the impact of proximity to potential subcenters on employment density is reduced and partly absorbed by externalities in production. In fact, every kilometer increase in distance from the nearest subcenter is associated with a decline in employment density by 23.5% when tract production externalities and local fixed effects are taken into account.38 As for externalities themselves, a 10% increase in pure external economies is associated with a 3.7% increase in employment density when local fixed effects are included (column 4). It is also noteworthy that in this specification distance to the CBD is no longer significant in explaining the variation in employment density across tracts, implying that job matching in Kampala is highly localized. In the specifications for columns 5 and 6 we add a host of access-related distance variables for tracts, while in columns 7 and 8 we add other tract traits. In each of these pairs, we first estimate without local fixed effects and then, in the subsequent specification, include sub-county and parish fixed effects. Our results are robust throughout all these specifications in that production externalities have a significant and positive effect on the employment density of the tract while the cost of commute, proxied by the distance to the nearest potential subcenter, negatively affects employment density. There are a few points worth noting from the results in table 5a. One, the specification in column 8 suggests that the supply of infrastructure matters for the clustering of firms.
The proximity to bus stations and the tract area devoted to infrastructure networks (e.g. roads) have a significant and positive impact on employment density. Although the magnitude of the effect of the transport network is high, the proximity to bus stations has a very small impact. This again points towards a highly localized job market in Kampala with limited scope for commuting via buses.

36 We refer to the retained subcenter tracts as “potential” centers because they account for a very small proportion of economic activity.
37 Appendix table A.2 describes the basic statistics of our variables of interest.
38 Notice that this result supports the theory that bid rents are higher near potential subcenter sites and so is employment density. Firms outbid households in such locations and therefore mixed land use is less prevalent in locations closer to subcenters. The result is akin to the prediction of a falling density gradient from the CBD in the monocentric model. Nonetheless, employment density, by itself, is not a good measure of mixed land use and hence an increase in employment density due to a decline in commute cost is not a true test of the Lucas and Rossi-Hansberg (2002) model.

Two, surprisingly and unlike in cities in developed countries, the presence of slums has a positive and significant impact on employment density, perhaps due to the availability of local labor markets for informal low-skill jobs. To explore this hypothesis further, we control for the proportion of high income, low income and very low income residential areas in a given tract in columns 9 and 10.39 Areas with high income residences have lower employment density, perhaps because richer people can afford to commute to the higher paying jobs in the CBD or subcenters.
By contrast, it is the population in tracts with large proportion of very low income groups that search for local jobs and for whom the cost of commute to nearest subcenter and CBD is a significant determinant of employment choice. Finally, we note that once we add the income groups into the specification, distance to CBD is not significant even when local fixed effects are not included. This implies that the negative effect of distance from CBD on employment density was primarily capturing the impact on tracts with a large proportion of high income households who can actually afford to commute to CBD for better job opportunities. Most poor people walk to work locally and try their best to economize on the cost and time of this commute. Thus, the non-monocentric pattern of employment spread in Kampala is critically explained by the presence of local labor markets and the high cost of commute to the nearest potential subcenters. Most employment in Kampala comes from services that have a lower tradability potential. Do the factors that explain the variation in employment density differ across the two sectors? To this end, we identify potential subcenters in Kampala based solely on sector specific density and re-run the estimations in table 5a separately for manufacturing and services sectors. The main results from this estimation are shown in table 5b.40 A comparison of manufacturing with services suggests that accessibility to CBD as well as the subcenters matter much more in terms of magnitude for manufacturing vis-à-vis services, suggesting that services jobs are more localized relative to those in manufacturing. This fact is reiterated in the size of the coefficients for distance to bus stations and the proportion of tract area devoted to transportation use (mainly roads and walkways). 
Lastly, we note that income groups matter for explaining density variation only in the case of services, again indicating that job markets are localized only for the services sector and not for manufacturing. Finally, the impact from production externalities is larger for the services sector vis-à-vis manufacturing.41

39 The correlation between the slum variable and the proportion of low and very low income households is 0.1 and 0.4 respectively.
40 The four specifications in table 5b for each of the two sectors correspond to columns 7, 8, 9 and 10 in table 5a.
41 Unreported estimations of the base model that decompose the production externalities index by sector (but ignore the spillovers across sectors) confirm that it is mainly the externalities in the services sector that explain the variation in aggregate employment density.

6.2 Robustness Checks on the base model

As a robustness check, we re-estimate equation (7) in the following alternative specifications:

6.2.1 Historical persistence

One of the shortcomings of urban spatial models is that they are static in nature. Research suggests that nothing predicts density better than past density (e.g. McDonald and Bowman, 1979; McDonald, 1979; Brueckner 1986; Anas 1978; Wheaton 1982). Redfearn (2009) provides ample evidence that the local peaks in employment density, that is, the locations of employment centers, have remained persistent over time. This has happened in spite of dramatic changes in transportation costs, communication costs and production technology. As a robustness check, we additionally control for historical persistence of employment density in our base estimation in appendix table A.3 and verify the impact of the cost of commute and production externalities in explaining variation in employment density across tracts in Kampala.
Our results with this additional control confirm that in spite of strong persistence, production externalities and the cost of commute still explain a significant share of the variation in employment density.

6.2.2 Higher rates of decay of production externalities

Production externality for tracts in Kampala is calculated with a decay parameter of 5, which is standard in the literature (see Lucas and Rossi-Hansberg, 2002; Koster and Rouwendal, 2013).42 We experiment with a higher rate of attenuation because services with lower tradability, for which clustering is beneficial only locally (e.g. restaurants; retail trade), are the main activity in Kampala. It is thus possible that the benefits of agglomeration dissipate rapidly in Kampala. We experimented with decay parameters of 10, 15 and 20, and appendix table A.4 presents the results from an estimation where production externalities are calculated using a decay parameter of 20. Here we note two points. One, the magnitude of the impact of an increase in production externalities is considerably smaller vis-à-vis the model with a decay parameter of 5. Two, our results with respect to the cost of commute to the nearest subcenter, the importance of road infrastructure, and the positive impact of slums and low income residential areas on employment density continue to hold in this alternative specification. In fact, the magnitude of the effect of these variables remains comparable across the two models.

6.2.3 Zeros in dependent variable

As explained in section 4, zeros in employment density for 28 tracts in 2011 could lead to biased estimates when the base equation is estimated using OLS. This is because a tract's participation in employment could be strongly related to its ensuing employment density.
This correlation between the density equation and the selection equation, represented by ρ, could be positive if the error term in the density equation positively impacts the error term in the selection equation. On the other hand, if the factors representing the error term in the density equation have a negative impact on the error term of the selection equation, this correlation would be negative.43 Appendix table A.5 presents the results from Heckman's maximum likelihood estimation. Also presented is the Wald test for independence of equations, which tests whether the selection of a tract to participate in employment activity is independent of the actual density. The p-value of the Wald test statistic suggests that we cannot reject the hypothesis that ρ = 0, and hence the selection equation is independent of the density equation. Since ρ is not significantly different from zero, we can infer that the OLS estimates presented in table 5a are not biased by the presence of zeros in the dependent variable.

42 Koster and Rouwendal (2013) make a very interesting application of the Lucas and Rossi-Hansberg (2002) model for studying the variation in land rents.
43 See McMillen and McDonald (1998) for more details.

6.2.4 Other Checks

Some more robustness checks, not reported in the paper, include: (i) examining the impact of production externalities and the cost of commute on firm density rather than employment density, (ii) a different choice of CBD, and (iii) an alternative specification where we measure distances from the edge of the tract rather than the centroid. Our results remain broadly the same across these alternative specifications.

6.3 Instrumental Variable Estimations

We suspect that production externalities in a tract could be determined endogenously with the employment density of that tract.
For instance, it could be the case that a tract's employment density positively impacts the density of neighboring tracts and in turn raises its own externalities. In this case, the bias would be upward. In the reverse case, own employment density could decrease the employment density of neighboring tracts, either due to congestion forces or market constraints in the given tract. In this case, higher employment density of the given tract could lead to a decline in own externalities and thus impart a downward bias to the estimates. We address this concern by estimating equation (7) using the instrumental variable (IV) technique. The identifying assumption in our IV approach is that the problem of higher employment density tracts attracting more (or less) employment opportunities in neighboring tracts can be overcome by focusing on what the employment density in 2011 would have been had it been based on past levels of production externalities. This approach relies on the assumption that the current levels of employment density cannot affect past levels of employment density, which in turn could have an effect on the past levels of production externalities. Although lag identification, that is, the use of lagged values of the endogenous variable as an instrument, is widespread in the social sciences literature, there are nonetheless problems associated with this strategy. For instance, Bellemare et al. (2015) show that “lag identification” is an illusion in that it merely moves the channel through which endogeneity biases causal estimates by “replacing a selection on observables assumption with an equally untestable no dynamics among unobservables” assumption. Likewise, Reed (2014) suggests that lagged identification could be an effective estimation strategy if the lagged values do not themselves belong in the respective estimating equation. In using the lagged values of the agglomeration index, we doubt that this condition necessarily holds.
Given the unchanged economic structure in Kampala over the last decade, it seems unlikely that the sources of production externalities have changed much. Persistence in density is well documented in the urban economics literature and thus it is recommended to use longer lags in any identification strategy (see Ciccone and Hall, 1996 and also Combes et al. 2011 for a survey on such strategies). For our case, a longer lag is not feasible because 2002 is the only year for which we have employment data available.44 We believe that the extent of these externalities would be affected by knowledge spillovers while the possibility of knowledge spillovers by itself does not directly attract firms. We thus experiment with the distance of the centroid of a tract to the nearest university or school as a possible alternative instrument.

44 We considered using a Bartik (1991) style instrument; however, it would be more demanding in terms of the availability and quality of data in Kampala for previous years. A Bartik-type instrument uses industry-level local variations to identify the effect of endogenous variables.

The IV estimation results using the three instruments are summarized in table 6.45 Panel A of table 6 shows only the coefficients for the endogenous variable, the agglomeration index (measuring production externalities), from the base OLS estimation of columns 4, 6, 8, and 10 of table 5a. Panel B shows the reduced-form estimates, where the agglomeration index in the current period in equation (7) is replaced by the respective instruments. The reduced-form estimates for each of the instruments resemble the OLS estimates for many covariates. For each of the instruments, the first-stage relationships are also quite strong. For instance, for the model in column 1 of table 6, the first stage elasticity (p-value) for the lagged values of the agglomeration index is 0.5 (0.00) and the F-statistic is 111.
For other specifications in columns 2-5 of table 6, the first stage elasticity ranges from 0.46 to 0.5 and the F-statistic is never below 85. Likewise, with other instruments, the first stage relationships are strong and so is the F-statistic. For each of our instruments, we also present the statistic from the Cragg-Donald weak identification test and compare it with the critical values provided by Stock and Yogo (2005). For the lagged values of the agglomeration index and the distance from the nearest university, given the strong fit of the first-stage relationships and the high values of the weak identification test statistic, it is not surprising that the IV specifications generally confirm the OLS findings. In most cases, we do not statistically reject the null hypothesis that the OLS and IV results are the same (as per the exogeneity test reported in table 6). By contrast, the distance from schools does not appear to be a strong instrument as per the weak identification test. Moreover, the magnitude of the coefficient estimates for the agglomeration index in 2011 seems to be highly inflated vis-à-vis the OLS estimates when instrumented using distance from the nearest school. This is to be expected, as we do not anticipate much knowledge spillover from schools vis-à-vis universities, which could be a source of research, information and skilled labor flows.

6.4 Semi-parametric estimations

Calculation of the production externalities of a tract is based on the summation of employment density in all neighboring tracts. There is no reason to believe that this summation of employment density of neighboring tracts should linearly impact the density of the given tract. For instance, the impact of an increase in externalities in a primarily residential tract would not be the same as that in a tract that is predominantly business oriented. An OLS estimation of equation (7) rests on strong assumptions about the functional form of the distribution of the error term that are probably not true.
Thus, to validate the robustness of our results we need to estimate the model semi-parametrically. In this framework as well, we account for the possible endogeneity of production externalities. We estimate equation (7) using the procedure outlined in section 4. Given the framework of Lucas and Rossi-Hansberg (2002), we are interested in: one, the coefficient for the distance to the nearest potential subcenter that proxies for the cost of commute; and two, an unknown function that summarizes the important characteristics of the function Ψ and the distribution of the errors in equation (7). These functions are identified from the distributions of the random sample of observations on the dependent variable, independent variables and instrumental variables using standard semi-parametric methods.

45 Detailed results from IV estimation and reduced form estimates are available upon request.

Columns 1-3 of table 7 present the results from a semi-parametric regression using Robinson's (1988) double difference method, while columns 4-6 additionally account for endogeneity of production externalities using lagged values as an instrument and columns 7-9 use the distance from the nearest university as an instrument in a semi-parametric framework. As in the case of the OLS regression of equation (7) presented in table 5a, we find that proximity to a potential subcenter increases employment density of the tract, but not once we control for tract traits that provide inputs into local labor markets, such as the proportion of slum area or very low income groups in a tract. Once these controls are introduced, the labor market and job search are more localized and variation in employment density cannot be explained by the cost of commute to the nearest subcenter.
Finally, we observe that the coefficient estimates for many of the covariates in columns 1-3 in table 7 are comparable to those obtained after addressing endogeneity in a semi-parametric framework when using the lagged agglomeration index (columns 4-6) and distance to the nearest university (columns 7-9). Thus, endogeneity does not seem to be a concern even in a semi-parametric framework.46 Figure 4 presents the results for the agglomeration index estimated using the semi-parametric technique for the fully loaded model in column 3 of table 7.47 In the figure we also show a 95-percent confidence interval obtained by means of a bootstrapping procedure. On average, a 10 percent increase in production externalities leads to an increase in employment density of slightly over 2 percent. These estimates are similar to those obtained using an OLS estimation of equation (7) (directly comparable with column 10 of table 5a).48

6.5 Extended Model

As outlined in section 2, by estimating equation (7) we are explaining variations in employment density, but employment density by itself cannot be used to infer whether residential land use in Kampala is mixed with business use. Additionally, the effect of the cost of commute in equation (7) is confounded by the impact of wages on employment density. This is because wages in the Lucas and Rossi-Hansberg (2002) model could be a function of the cost of commute to the nearest subcenter without the residents actually having to commute to the business area. In an attempt to address this concern, we estimate equation 7 that models the relative employment density in a tract vis-à-vis the nearest potential subcenter as a function of the relative production externalities and the cost of commute.
From the Lucas and Rossi-Hansberg model, we infer that the cost of commute would appear as a significant determinant of the ratio of employment densities if the tract is a residential location and people from this location commute to the subcenter for employment opportunities or vice versa. If, on the other hand, the residents of the tract do not commute to the nearest subcenter, but live close to their work place in the same tract, the cost of commute to the subcenter should not appear as a key determinant of the relative employment densities.

46 The first stage of a semi-parametric IV involves regressing the endogenous variable on the instrument and other independent variables to generate residuals, which are then used in the second stage for IV estimation. We also tried employing non-parametric estimations to generate first stage residuals but that does not alter the qualitative nature of our results.
47 We also considered the possibility that distance to the nearest subcenter enters the estimating equation non-parametrically. However, we could not reject the null hypothesis that it can be approximated by a linear function.
48 Our semi-parametric estimations conduct Härdle and Mammen's (1993) test to evaluate whether the nonparametric fit could be approximated by a polynomial fit of order 2, 3 and 5. The p-values for the test of order 2 are reported in table 7 and these values suggest that we cannot approximate the impact of production externalities by a polynomial of order 2 (or more). The statistic as such is not normally distributed but it can be re-scaled in a way that it can be compared with the quantile of a Normal distribution.
Table 8a presents the OLS estimation results from equation 7, with the condition that the coefficient on the agglomeration index of a given tract is the same as that on the agglomeration index of the nearest subcenter. The table suggests that an increase in the relative production externalities, as measured by the agglomeration index of a tract vis-à-vis the nearest subcenter, positively impacts the relative employment density of the tract. In the full model that controls for all tract traits and includes both the sub-county and parish fixed effects, we find that a 10% increase in the relative agglomeration index of the tract is associated with an increase in relative employment density of 1.3%. By contrast, we do not find any significant impact of the cost of commute on the relative concentration of employment in a tract, a result predicted for cities with mixed land use. In an alternative specification, we allow different coefficient estimates for the agglomeration indices of the subcenter and the tract. Table 8b presents the results from estimating equation 7 without any constraint on the coefficients for the tract's and the subcenter's agglomeration indices. Columns 1-6 present the results for all three sectors together while columns 7-8 make some comparison with the manufacturing sector and columns 9-10 do the same for services. This table (i) confirms the results obtained in table 8a that in a mixed land use pattern, the cost of commute does not explain the observed relative density of a tract vis-à-vis the nearest subcenter. This is because a high cost of commute is the primary reason residents choose to co-locate with their jobs in the first place. (ii) Own agglomeration index has a positive and significant impact on the relative employment density of the tract. In the fully loaded model, a 10% increase in own agglomeration index is correlated with an increase in relative employment density of the tract vis-à-vis the nearest subcenter of 2.6% (column 4 of table 8b).
Overall, we take this as evidence in support of the Lucas and Rossi-Hansberg (2002) model. (iii) Production externalities in Kampala's subcenters are too weak to have any significant impact on relative employment density. To test this further, in columns 5-6 we introduce an interaction between the inverse of the distance to the subcenter and the agglomeration index of the subcenter. Once local fixed effects are included, the interaction between proximity to the subcenter and the latter's agglomeration index is not significant. This implies that even tracts located in close proximity to the subcenter are unable to benefit from the nearest subcenter's production externalities. This result re-emphasizes the small magnitude of externalities in the identified subcenters in Kampala. (iv) When comparing the overall results with sector specific findings in columns 7-10 of table 8b, our work suggests that on average a tract in Kampala does not behave like a mixed land use area when it comes to manufacturing jobs, because the cost of commute has a significant effect on the ratio of employment density. By contrast, for services, the cost of commute does not seem to matter in explaining the variation in the employment density ratio, as shown in column 10 of table 8b. In both sectors, however, the subcenter's production externalities have no role in determining the pattern of land use, indicating that potential subcenters, which mark local peaks in employment density, have no significant advantage over the rest of the tracts in curbing the extent of mixed land use across the city. To address endogeneity concerns in the extended model, we run IV estimation of equation 7 using lagged values of the agglomeration index, the distance from the nearest university and the distance from the nearest school. The result of this regression is presented in table 9. The performance of the alternative set of instruments in this specification is very similar to that obtained in estimating equation 7 (in table 6).
When using the lagged agglomeration index and the distance from the nearest university as instruments, the IV estimates for production externalities are strikingly similar to the OLS estimates. In fact, for all our specifications, we cannot reject the null hypothesis that production externalities are exogenous.49 These estimates reinforce the message from the OLS estimation, namely that production externalities of the subcenter do not affect the relative employment density of tracts in Kampala in any way. Similarly, the cost of commute to the nearest subcenter is not significant in most specifications in the IV estimates either, indicating that Kampala indeed has a mixed land use pattern because the cost of commute is so high that most people choose to live close to their workplace.

Table 10 presents results from semi-parametric estimation of equation 7. Columns 1-3 of this table estimate the equation without accounting for endogeneity, while columns 4-9 present results from semi-parametric IV estimation, using the lagged agglomeration index (columns 4-6) and the distance from the nearest university (columns 7-9) as instruments. The results from OLS estimation continue to hold even when the production externalities of the tract enter in a non-parametric form. Specifically, the mixed land use pattern in Kampala seems to be largely driven by the high cost of commute and the weak nature of production externalities in Kampala’s business centers. Finally, own production externalities have a very strong and positive influence on mixed land use in Kampala. Figure 5 presents the impact of production externalities on relative employment density, suggesting that, on average, a 10% increase in the agglomeration index, which measures production externalities in the tracts, increases the employment density ratio by nearly 2.6%. These estimates are very similar to those obtained from OLS estimation of equation 7 (column 4 in table 8b).
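The OLS-versus-IV comparison discussed in this section can be illustrated with a minimal sketch on synthetic data. It does not use the paper's data and does not reproduce equation 7. The data-generating process, the "true" slope of 0.26, the single-regressor and single-instrument setup, and all function names are assumptions made purely for illustration. The sketch shows why similar OLS and IV estimates are read as consistent with exogeneity: when an unobserved shock drives both the regressor and the outcome, the two estimators diverge, and when there is no such shock they roughly coincide.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV (Wald) estimator using one instrument z for x."""
    return cov(z, y) / cov(z, x)


def simulate(n, beta, confounding, seed=0):
    """Hypothetical data: z plays the role of an instrument (for example a
    lagged index), x the regressor, y the outcome. `confounding` scales an
    unobserved shock that enters both x and y; 0 means x is exogenous."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)            # instrument
        ui = rng.gauss(0, 1)            # unobserved shock
        xi = 0.8 * zi + confounding * ui + rng.gauss(0, 0.5)
        yi = beta * xi + confounding * ui + rng.gauss(0, 0.5)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


if __name__ == "__main__":
    for label, strength in (("endogenous x", 0.5), ("exogenous x", 0.0)):
        z, x, y = simulate(20000, beta=0.26, confounding=strength)
        print(label, round(ols_slope(x, y), 3), round(iv_slope(z, x, y), 3))
```

In applied work the comparison would normally be made with a formal test and with all controls and fixed effects included; the sketch only conveys the logic of the comparison.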
49 Reduced form estimates and the detailed IV regression tables for the extended model are available upon request.

Section 7: Conclusions

Our analysis of the city of Kampala in Uganda, using the census of business establishments data for 2002 and 2011, suggests that Kampala has a very concentrated nucleus of economic activity. However, the comparative static prediction of a smoothly declining gradient of employment density for a monocentric model is obeyed only up to 3 km. There are no other significant peaks in the employment density gradient beyond the city center. The preliminary analysis of a monocentric model suggests that employment in Kampala is spatially dispersed. A more robust non-parametric (GWR) estimation of employment density in the city identifies five potential subcenters in 2011. However, none of these subcenters is a significant center of economic activity, and together they account for less than 2.5% of aggregate employment. Even the CBD contributes less than 18% of city-wide employment. The fact that employment in Kampala is dispersed across the city rather than concentrated in the CBD or the subcenters reflects the weak nature of production externalities and the high cost of commute to business centers.

Our paper finds that the cost of commute to the nearest subcenter and the extent of production externalities explain much of the variation in employment density across tracts in Kampala. In a stripped-down version of the basic model with local fixed effects, every kilometer increase in distance from the nearest subcenter is associated with a 23.5% decline in employment density, while a 10% increase in the agglomeration index measuring production externalities is associated with an increase in density of 3.7%.
The test of the Lucas and Rossi-Hansberg (2002) model is most apparent when we use the ratio of the employment density of a tract to that of the nearest subcenter as the dependent variable. In this extended version of the model, we additionally control for production externalities in the subcenter. The Lucas and Rossi-Hansberg model predicts that in a mixed land use economy people locate very close to their jobs, and therefore distance to the nearest subcenter should not matter for the relative density of the tract vis-à-vis the nearest subcenter. We find this to be true for Kampala; however, when running the model by sector, our results indicate that the cost of commute, on average, affects the pattern of land use for manufacturing jobs but not for services. Since services account for about 85% of employment, the overall results seem to be driven primarily by the dynamics of the services sector. Finally, our estimations suggest that production externalities are critical in determining relative employment density. Specifically, a 10% increase in the agglomeration index of the tract increases the tract’s employment density vis-à-vis the nearest subcenter by 2.5%. However, the externalities in Kampala’s subcenters are too weak to have any significant effect on even the nearby tracts.

The literature suggests that higher commuting costs foster mixed land use by forcing workers and producers to co-locate. Even though mixed areas emerge in equilibrium, they never form an optimal outcome (Rossi-Hansberg, 2004). Given the dominance of mixed land use in Kampala, policy makers should consider measures to close the gap between the equilibrium and the optimal allocation of land.
Our work suggests that separating residential areas from business use could be encouraged by lowering the cost of commute through investments in infrastructure, public transport systems and so on, as well as by bringing about relevant reforms in land use policy that directly affect production externalities.

References

Anas, A. 1978. Dynamics of urban residential growth. Journal of Urban Economics, 5: 66–87.
Anas, A., Arnott, R., Small, K.A. 1998. Urban spatial structure. Journal of Economic Literature, 36: 1426–1464.
Anderson, J.E. 1982. Cubic-spline urban-density functions. Journal of Urban Economics, 12: 155–67.
Anderson, A.N. and W.T. Bogart. 2001. The structure of sprawl: identifying and characterizing employment centers in polycentric metropolitan areas. American Journal of Economics and Sociology, 60: 147–169.
Aguilar, A.G. and C. Alvarado. 2004. La Reestructuración del Espacio Urbano de la Ciudad de México: Hacia la metrópolis multinodal. In: A.G. Aguilar (Coord.), Procesos Metropolitanos y Grandes Ciudades: Dinámicas Recientes en México y Otros países. México DF: Miguel Ángel Porrúa, pp. 265-308.
Angel, S. and A. Blei. 2015a. Commuting and the spatial structure of American cities. Marron Institute of Urban Management, New York University, Working Paper #20.
Angel, S. and A. Blei. 2015b. Commuting and the productivity of American cities. Marron Institute of Urban Management, New York University, Working Paper #19.
Baily, M., C. Hulten, and D. Campbell. 1992. Productivity Dynamics in Manufacturing Plants. Brookings Papers on Economic Activity: Microeconomics: 187-249.
Bartik, T.J. 1991. Who Benefits from State and Local Economic Development Policies? W.E. Upjohn Institute for Employment Research, Kalamazoo, MI.
Bellemare, M.F., T. Masaki, T.B. Pepinsky. 2015. Lagged Explanatory Variables and the Estimation of Causal Effects (February 23, 2015).
Berliant, M., Reed, R., Wang, P. 2006. Knowledge exchange, matching, and agglomeration. Journal of Urban Economics, 60: 69–95.
Berliant, M. and P. Wang. 2008. Urban growth and subcenter formation: A trolley ride from the Staples Center to Disneyland and the Rose Bowl. Journal of Urban Economics, 63(2): 679-693.
Bertaud, A. 2003. The Spatial Organization of Cities: Deliberate Outcome or Unforeseen Consequence? World Development Report 2003: Dynamic Development in a Sustainable World, World Bank.
Bertaud, A. and S. Malpezzi. 2014. The Spatial Distribution of Population in 57 World Cities: The Role of Markets, Planning, and Topography. Unpublished mimeo.
Blundell, R. and J.L. Powell. 2003. Endogeneity in nonparametric and semiparametric regression models. In: Dewatripont, M., Hansen, L.P., Turnovsky, S.J. (Eds.), Advances in Economics and Econometrics: Theory and Applications. Cambridge University Press, Cambridge.
Bogart, W.T. and W.C. Ferry. 1999. Employment centers in greater Cleveland: Evidence of evolution in a formerly monocentric city. Urban Studies, 36: 2099–2110.
Brueckner, J.K. 1986. A switching regression analysis of urban population densities. Journal of Urban Economics, 19(2): 174-189.
Brush, J.E. 1968. Spatial patterns of populations in Indian cities. Geographical Review, 58: 362-391.
Cervero, R. and K. Wu. 1998. Subcentering and commuting: Evidence from the San Francisco Bay area, 1980–1990. Urban Studies, 35: 1059–1076.
Chatterjee, S. and B. Eyigungor. 2014. A Tractable Circular City Model with Endogenous Internal Structure. Federal Reserve Bank of Philadelphia. Paper presented at Conference on Urban and Regional Economics (CURE), Brown University.
Ciccone, A., Peri, G. 2006. Identifying human-capital externalities: theory with applications. Review of Economic Studies, 73: 381–412.
Clark, W.A.V. 2000. Monocentric to polycentric: New urban forms and old paradigm. In A Companion to the City, edited by G. Bridge and S. Watson. Oxford, UK: Blackwell.
Combes, P. and L. Gobillon. 2015. The Empirics of Agglomeration Economies. In the Handbook of Regional and Urban Economics, Volume 5, by G. Duranton, J.V. Henderson, and W.S. Strange, Elsevier.
Craig, S.G. and P. Ng. 2001. Using quantile smoothing splines to identify employment subcenters in a multicentric urban area. Journal of Urban Economics, 49: 100–120.
Combes, P., Duranton, G., Gobillon, L. 2008. Spatial wage disparities: Sorting matters! Journal of Urban Economics, 63: 723–742.
Duranton, G. and D. Puga. 2004. Microfoundations of urban agglomeration economies. In Vernon Henderson and Jacques-François Thisse (eds.) Handbook of Regional and Urban Economics, volume 4. Amsterdam: North-Holland.
Duranton, G. and D. Puga. 2014. The Growth of Cities. In: Handbook of Economic Growth, edition 1, vol. 2, chapter 5: 781-853. Elsevier.
Ellison, G., Glaeser, E.L. 1999. The Geographic Concentration of Industry: Does Natural Advantage Explain Agglomeration? American Economic Review, 89(2): 311-316.
Foster, L., J. Haltiwanger, and C.J. Krizan. 2001. Aggregate Productivity Growth: Lessons from Microeconomic Evidence. In Dean, Edward, Michael Harper, and Charles Hulten (eds.) New Developments in Productivity Analysis. Chicago, IL: University of Chicago Press: 303-372.
Fujita, M. 1988. A monopolistic competition model of spatial agglomeration: A differentiated product approach. Regional Science and Urban Economics, 18: 87-124.
Fujita, M. and H. Ogawa. 1982. Multiple equilibria and structural transition of non-monocentric urban configurations. Regional Science and Urban Economics, 12(2): 161-196.
Glaeser, E., Kahn, M.E. 2004. Sprawl and urban growth. In: Henderson, J.V., Thisse, J.-F. (Eds.), Handbook of Regional and Urban Economics, vol. 4. North Holland, Amsterdam, the Netherlands, pp. 2482–25
Giuliano, G. and K.A. Small. 1991. Subcenters in the Los Angeles region. Regional Science and Urban Economics, 21(2): 163-182.
Giuliano, G., Redfearn, C., Agarwal, A., Li, C., Zhuang, D. 2007. Employment concentrations in Los Angeles, 1980–2000. Environment and Planning A, 39: 2935–2957.
Gordon, P. and H.W. Richardson. 1996. Beyond polycentricity: The dispersed metropolis, Los Angeles 1970–1990. Journal of the American Planning Association, 62: 289–295.
Griliches, Z., and H. Regev. 1995. Productivity and Firm Turnover in Israeli Industry: 1979-1988. Journal of Econometrics, 65: 175-203.
Hardle, W. and E. Mammen. 1993. Comparing nonparametric versus parametric regression fits. Annals of Statistics, 21: 1926-1947.
Heckman, J.J. 1976. The common structure of statistical models of truncation, sample selection, and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 4(5): 475-492.
Imai, H. 1982. CBD hypothesis and economies of agglomeration. Journal of Economic Theory, 28: 275–299.
Kampala Capital City Authority (KCCA). 2012. Updating Kampala Structure Plan and Upgrading the Kampala GIS Unit. Draft Final Report.
Kenworthy, J.R. and F.B. Laube. 1999. An international sourcebook of automobile dependence in cities, 1960-1990. Boulder: University of Colorado Press.
Koster, H. and J. Rouwendal. 2013. Agglomeration, commuting costs, and the internal structure of cities. Regional Science and Urban Economics, 43: 352-366.
Lang, R.E. 2003. Edgeless Cities: Exploring the Elusive Metropolis. Washington, DC: Brookings Institution Press.
Lucas, R.E. Jr., E. Rossi-Hansberg. 2002. On the internal structure of cities. Econometrica, 70: 1445–1476.
Mills, E. 1972. An Aggregative Model of Resource Allocation in a Metropolitan Area. The American Economic Review, 57(2): 197-210.
McDonald, J.F. 1979. An Empirical Test of a Theory of the Urban Housing Market. Urban Studies, 16: 291-297.
McDonald, J.F. 1987. The identification of urban employment subcenters. Journal of Urban Economics, 21: 242-258.
McDonald, J.F. and H.W. Bowman. 1979. Land value functions: a re-evaluation. Journal of Urban Economics, 6: 25-41.
McDonald, J.F. and P.J. Prather. 1994. Suburban employment centres: The case of Chicago. Urban Studies, 31: 201–218.
McDonald, J.F. and D.P. McMillen. 1990. Employment subcenters and land values in a polycentric urban area: the case of Chicago. Environment and Planning A, 22(12): 1561-1574.
McMillen, D.P. 1996. One hundred fifty years of land values in Chicago: A nonparametric approach. Journal of Urban Economics, 40(1): 100–124.
McMillen, D.P. 2001. Nonparametric employment subcenter identification. Journal of Urban Economics, 50: 448–473.
McMillen, D.P. 2003. Identifying subcenters using contiguity matrices. Urban Studies, 40: 57–69.
McMillen, D.P. 2006. Testing for monocentricity. In A. Richard, & D. McMillen (Eds.), A Companion to urban economics. Malden, MA: Blackwell Publishing Ltd. http://dx.doi.org/10.1002/9780470996225.ch8
McMillen, D.P. and J.F. McDonald. 1998. Suburban subcenters and employment density in metropolitan Chicago. Journal of Urban Economics, 43: 157–180.
McMillen, D.P., Smith, S.C. 2003. The number of subcenters in large urban areas. Journal of Urban Economics, 53: 321–338.
Ogawa, H. and M. Fujita. 1980a. Equilibrium land use patterns in a nonmonocentric city. Journal of Regional Science, 20: 455–475.
Ogawa, H. and M. Fujita. 1979. Nonmonocentric urban configurations in two-dimensional space. Working Paper in Regional Science and Transportation, 18 (University of Pennsylvania, Philadelphia, PA).
Redfearn, C. 2009. Persistence in urban form: The long-run durability of employment centers in metropolitan areas. Regional Science and Urban Economics, 39: 224–232.
Reed, W.R. 2014. A Note on the Practice of Lagging Variables to Avoid Simultaneity. Working Paper, University of Canterbury.
Robinson, P.M. 1988. Root-N consistent semiparametric regression. Econometrica, 56: 931-954.
Rossi-Hansberg, E. 2004. Optimal urban land use and zoning. Review of Economic Dynamics, 7: 69–106.
Small, K.A. and S.F. Song. 1994. Population and employment densities: Structure and change. Journal of Urban Economics, 36: 292–313.
Steen, R.C. 1986. Nonubiquitous transportation and urban population density gradients. Journal of Urban Economics, 20: 97–106.
Stock, J.H. and M. Yogo. 2005. Testing for Weak Instruments in Linear IV Regression. In D.W.K. Andrews and J.H. Stock, eds. Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg. Cambridge: Cambridge University Press, pp. 80–108.
Sullivan, A. 1986. A general equilibrium model with agglomerative economies and decentralized employment. Journal of Urban Economics, 20: 55–74.
Uganda Bureau of Statistics (UBOS). 2010. Uganda National Household Survey 2009/10. Socio-Economic Module. UBOS, Kampala.
Uganda Bureau of Statistics (UBOS). 2012. Statistical Abstract, 2012.
White, M. 1976. Firm suburbanization and urban subcenters. Journal of Urban Economics, 3: 323–343.
Wheaton, W.C. 1982. Urban Residential Growth under Perfect Foresight. Journal of Urban Economics, 12: 1–21.

FIGURES

Figure 1
Figure 2a: Linear fit versus spline
Figure 2b: Linear fit versus cubic spline
Figure 3: Subcenters identified in 2011 using GWR
Figure 4: Semiparametric estimations of the base model
Figure 5: Semiparametric estimations of the extended model

TABLES

APPENDIX TABLES

Appendix Box B.1: Decomposition Methodology

Let $E$ denote the employment density of Kampala and $E_x$ the employment density of village $x$; $s_x$ is village $x$’s share of employment in Kampala. By definition, $E = \sum_{x=1}^{X} s_x E_x$, where $x$ indexes villages. Following Foster et al.
(2001), our primary decomposition of changes from 2001 to 2011 takes the form:

$$\Delta E_{01\text{-}11} = \sum_x s_{x,01}\,\Delta E_{x,01\text{-}11} + \sum_x \left(E_{x,01} - E_{01}\right)\Delta s_{x,01\text{-}11} + \sum_x \Delta E_{x,01\text{-}11}\,\Delta s_{x,01\text{-}11}$$

The first term, $\sum_x s_{x,01}\,\Delta E_{x,01\text{-}11}$, is the within component, which represents changes in employment density within villages, with villages weighted by their initial employment shares in Kampala in 2002. Negative values indicate that villages tended to experience a decline in employment density over the last decade. The second term, $\sum_x (E_{x,01} - E_{01})\,\Delta s_{x,01\text{-}11}$, is the between component, which represents changes in employment shares across villages interacted with the initial deviation of villages from the city-wide employment density. Negative values indicate that employment tended to be reallocated towards villages that had lower initial employment density. The third term, $\sum_x \Delta E_{x,01\text{-}11}\,\Delta s_{x,01\text{-}11}$, is the covariance component, which represents the interaction of changes in employment density for villages across the period with changes in employment shares for villages across the period. Positive values indicate that fast growing villages also experienced rising employment density. These three components by definition sum to the total change in aggregate employment density. As we do not consider entry or exit of villages across years, our decomposition in Table 2 requires a balanced panel.
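The three-way decomposition described in this box can be sketched in a few lines of code. The function name `decompose`, the dictionary layout, and the two-village numbers in the usage example are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the paper's data. The sketch assumes a balanced panel and employment shares that sum to one in both periods, which is what makes the three components add up to the total change.

```python
def decompose(initial, final):
    """Split the change in aggregate employment density into within,
    between and covariance (cross) components.

    `initial` and `final` map each village to a (share, density) pair.
    Both periods must cover the same villages (balanced panel), and the
    shares are assumed to sum to one in each period.
    """
    if set(initial) != set(final):
        raise ValueError("decomposition requires a balanced panel of villages")

    # Aggregate density in each period: E = sum_x s_x * E_x
    agg_initial = sum(share * dens for share, dens in initial.values())
    agg_final = sum(share * dens for share, dens in final.values())

    within = between = cross = 0.0
    for village, (share0, dens0) in initial.items():
        share1, dens1 = final[village]
        d_share = share1 - share0
        d_dens = dens1 - dens0
        within += share0 * d_dens                    # initial share x density change
        between += (dens0 - agg_initial) * d_share   # initial deviation x share change
        cross += d_dens * d_share                    # density change x share change

    return {
        "total": agg_final - agg_initial,
        "within": within,
        "between": between,
        "cross": cross,
    }


if __name__ == "__main__":
    # Hypothetical two-village example (illustrative numbers only).
    start = {"a": (0.5, 10.0), "b": (0.5, 20.0)}
    end = {"a": (0.4, 12.0), "b": (0.6, 25.0)}
    print(decompose(start, end))
```

In the hypothetical example, employment shifts towards the initially denser village while both villages become denser, so all three components are positive.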