Economic Geography: Real or Hype?*

Jun Koo
Assistant Professor
Maxine Goodman Levin College of Urban Affairs
Cleveland State University

Somik V. Lall
Senior Economist
Development Research Group
World Bank

Abstract

Economic geography has become a mantra for many economists, geographers, and regional scientists. Many previous studies have tested the importance of economic geography for production activities and found a significant association between them. Most of these studies, however, have not taken into account that economic geography influences location decisions at the firm level. This paper illustrates a potential bias that can arise when firm location choices are not considered in estimating the contribution of economic geography to industry performance. Analysis using microdata of Indian manufacturing firms shows there is an upward bias in the contribution of economic geography to productivity when firm location choices are not considered in the analysis.

World Bank Policy Research Working Paper 3465, December 2004

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors, or the countries they represent. Policy Research Working Papers are available online at http://econ.worldbank.org.

* The research has been partly funded by a World Bank research program grant on "Urbanization and Quality of Life."
The authors can be contacted as follows: Koo: 2121 Euclid Avenue, UR 349, Cleveland State University, Cleveland, OH 44115; Tel 216.687.5597, Fax 216.687.9277, Email jkoo@urban.csuohio.edu. Lall: MC2-621, 1818 H Street NW, World Bank, Washington DC 20433; Email slall1@worldbank.org.

1. Background

The geographic aspect of economic activities has long been of interest to many economists, geographers, planners, and regional scientists. For instance, early location theorists probed the location of industries, land use patterns, and their economic implications (Christaller, 1933; Lösch, 1956; von Thünen, 1826; Weber, 1929). Economic geographers have examined how interactions between increasing returns to scale and geographic location lead to a particular distribution pattern of production activities (Krugman, 1980; Pred, 1966). Analytic difficulties in modeling increasing returns to scale, however, marginalized geography in mainstream economic analysis (Krugman, 1991a). As a result, until recently, geography was largely neglected in economic research.

Economic geography has since been revived and expanded over the past decade, owing to advances in mathematical theories that model increasing returns to scale and economies of spatial agglomeration (Dixit & Stiglitz, 1977; Krugman, 1991b). Agglomeration theory, building on these technical developments, attributes the geographic concentration of firms to cost-saving externalities. Many recent studies have shown that location is indeed an important factor affecting the economic performance of firms and regions (Beeson, 1987; Feser, 2001; Fogarty & Garofalo, 1988; Henderson, 1986; Moomaw, 1981, 1988). These studies have demonstrated that firms can improve their productivity by locating in large urban areas where similar production activities are concentrated and input factors (e.g., workers) are abundant.
In most empirical models, agglomeration is treated as a location-specific externality that can occur within the same industry (localization economies) or across all industries as a consequence of the scale of a city or region (urbanization economies) (Feser, 2001; Henderson, 1986; Moomaw, 1988; Nakamura, 1985). It therefore varies across industries or locations but is invariant across firms within the same industry or location. Such a specification is meaningful and innovative in that it brings into an economic model spatial aspects of economic activities that had been largely ignored. However, it may also introduce a bias arising from a firm's endogenous location decision process. The benefit of locating in a large urban area materializes only if a firm actually chooses to locate there. Firms located in small towns do not benefit from agglomeration economies as much as their counterparts in large cities. Therefore, the agglomeration economies that firms benefit from are a function of firm location choices.

Firms decide their locations to minimize production costs and maximize profit. If a firm is heavily dependent upon natural resources, it will likely locate near those resources to reduce transport costs. On the other hand, if a firm relies heavily on a specialized labor force (i.e., workers with specialized skills), it will likely locate in places where well-educated workers are abundant. Final location choices of profit-maximizing firms may not be fully optimal, because firms often have only limited information on markets for factor inputs and other determinants of production costs. They can nevertheless be expected to come close to the cost-minimizing choice given that limited information. Accordingly, one can expect firm location choices to follow some systematic patterns. In particular, given that there are centrifugal forces (e.g., competition, congestion, pollution, etc.)
as well as well-known centripetal forces in economic geography (i.e., agglomeration economies), more productive firms that can afford a higher cost of doing business are more likely to locate in large urban areas. Firms that rely on outdated technologies or low-skilled workers may not benefit enough to offset the higher cost of doing business in major cities. In other words, a systematic difference in productivity between firms locating in urban and rural areas may arise not only from spatial externalities in large cities but also from firms' voluntary choices of production locations.

The discussion thus far raises an interesting issue about the specification of economic geography. It is well documented that urban firms are more productive than non-urban firms. Agglomeration theory attributes such a productivity gap to spatial externalities created by well-developed buyer-supplier chains, deep labor pools, and knowledge spillovers in large urban areas (Fujita & Ogawa, 1982; Helsley & Strange, 1990; Venables, 1996). However, the productivity gap may result from firm location choices as well. If more productive firms tend to choose urban areas, production function parameter estimates may suffer from a serious selection bias unless the firm location decision process is incorporated into empirical models.

This paper questions the fundamental assumptions of economic geography. If higher productivity of urban firms is indeed associated with individual firms' location decisions, which are made to minimize their production costs, the implications of economic geography derived from most previous studies can be misleading. When proper consideration is not given to this issue, the effects of economic geography on productivity in many empirical studies are likely to be seriously overestimated.
This paper presents a new approach to thinking about the contribution of economic geography to productivity and illustrates this point by estimating simple Cobb-Douglas production functions for 18 2-digit Indian industries as defined by the National Industry Classification (NIC), with and without consideration of firm location choices.

The next section lays out an analytic framework that describes the selectivity issue in the production function estimation and presents an alternative approach that takes into account firm location choices. Section 3 describes the empirical model and hypothesis and Section 4 describes the data and variables. Section 5 discusses concentration patterns of NIC 2-digit Indian industries and their distributions. Section 6 presents the results, and the last section discusses the implications for research and policy.

2. Modeling a Production Function under Self-Selection

To model a production function under the self-selection process of a location decision, consider a simple production function equation (1) and a location decision equation (2) with a latent variable:

\[ O_{ij} = X_{ij} B + u_{ij} \quad (i = 1, 2, 3, \dots, n) \tag{1} \]

\[ I^{*}_{ij} = Z_{ij} R + e_{ij} \quad (j = 1, 2, 3, \dots, m) \tag{2} \]

where $O_{ij}$ is the output of firm i in region j, $X_{ij}$ is a vector of input factors (in log terms) utilized by firm i in region j, $I^{*}_{ij}$ is a latent variable representing firm i's decision to locate in region j, and $Z_{ij}$ is a vector of firm and location characteristics that determine the firm location decision process. Since a firm's location decision is an endogenous process influencing agglomeration economies and the firm's productivity, the level of output is conditional upon not only input factors but also location decisions. Therefore, $O_{ij}$ is observed only if firm i chooses to locate in region j, and, consequently, the observed distribution of $O_{ij}$ is truncated.
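The effect of this truncation can be seen in a small simulation. The sketch below is illustrative only and is not the paper's estimation: it assumes the errors u and e of equations (1) and (2) are standard normal with a hypothetical correlation `rho`, and it treats a firm as choosing the region whenever e > 0. The function name, the correlation value, and the selection rule are all assumptions made for the example.

```python
import random
import statistics


def simulate_selection(n=20000, rho=0.6, seed=0):
    """Return (mean of u over all firms, mean of u over self-selected firms).

    u is the production-function error and e the location-equation error.
    Both are standard normal with correlation rho (a hypothetical value).
    A firm is treated as choosing the region when e > 0, which is a
    simplified stand-in for the latent-variable rule in equation (2).
    """
    rng = random.Random(seed)
    u_all, u_selected = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        # Build u so that corr(u, e) = rho and var(u) = 1.
        u = rho * e + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        u_all.append(u)
        if e > 0.0:
            u_selected.append(u)
    return statistics.fmean(u_all), statistics.fmean(u_selected)


if __name__ == "__main__":
    mean_all, mean_selected = simulate_selection()
    print(f"mean of u, all firms:      {mean_all:.3f}")
    print(f"mean of u, selected firms: {mean_selected:.3f}")
```

Under these assumptions the error averages roughly zero over all firms but is clearly positive among the firms that selected into the region, which is the reason an uncorrected regression on the observed firms can be biased.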
A classic selectivity issue arises as follows:

\[ E(O_{ij} \mid I_{ij} = j) = X_{ij} B + E(u_{ij} \mid I_{ij} = j) \tag{3} \]

where $I_{ij} = j$ represents a firm's decision to locate in region j. Since $E(u_{ij} \mid I_{ij} = j) \neq 0$, the OLS estimation of equation (1) will be biased.

Alternatively, following Maddala (1986), a polychotomous-choice model with m categories can be incorporated into the production function framework to correct the self-selection bias. Consider a profit-maximizing firm's location decision (subscript i is dropped for simplicity):

\[ I = j \iff I^{*}_{j} > \max_{s} I^{*}_{s} \quad (s = 1, 2, 3, \dots, m,\ s \neq j) \tag{4} \]

Let

\[ \varepsilon_{j} = \max_{s} I^{*}_{s} - e_{j} \quad (s = 1, 2, 3, \dots, m,\ s \neq j) \tag{5} \]

Then it follows that

\[ I = j \iff \varepsilon_{j} < Z_{j} R \tag{6} \]

Following Domencich and McFadden (1975), the probability that firm i chooses region j is defined as equation (7):

\[ \Pr(\varepsilon_{j} < Z_{j} R) = \Pr(I = j) = \frac{\exp(Z_{j} R)}{\sum_{s} \exp(Z_{s} R)} \tag{7} \]

Thus, the distribution of $\varepsilon_{j}$ can be written as

\[ F_{j}(\varepsilon) = \frac{\exp(\varepsilon)}{\exp(\varepsilon) + \sum_{s = 1,\, s \neq j}^{m} \exp(Z_{s} R)} \tag{8} \]

Therefore, for each location choice j, we now have the model $O_{ij} = X_{ij} B + u_{ij}$, where $O_{ij}$ can be observed only if $\varepsilon_{j} < Z_{j} R$.

Finally, based on a modified version of Heckman's (1979) two-stage method, we can estimate a production function based on firm location choice behavior. The first-stage estimators from equation (2) are obtained by running a modified version of McFadden's (1974) conditional logit model on firm location choices. After estimating the first-stage location choices specified in equation (7), we can estimate equation (1) with a correction factor derived from the first stage:

\[ O_{j} = X_{j} B - \sigma_{j} \rho_{j} \left[ \frac{\phi\big(J_{j}(Z_{j} R)\big)}{F_{j}(Z_{j} R)} \right] + v_{j} \tag{9} \]

where $\sigma_{j}$ is the standard deviation of $u_{ij}$, $\rho_{j}$ is the correlation coefficient between $u_{ij}$ and $e_{ij}$, $\phi$ is the standard normal density, and $J_{j}(Z_{j} R) = \Phi^{-1}\big(F_{j}(Z_{j} R)\big)$ is the inverse of the standard normal distribution function that transforms non-normal distributions to normal (Lee, 1982).

3. Empirical Model and Hypothesis

To implement the two-stage estimation model proposed in the previous section, we calculate the correction factor as follows.
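Before the specifics, the bracketed term in equation (9) can be sketched numerically. The snippet below is a minimal illustration, assuming the standard form of the Lee (1982) correction in which the logit choice probability of equation (7) is mapped through the inverse standard normal distribution function. The paper describes its procedure as a "modified version" without giving details, so this is not necessarily the authors' exact calculation. The function names and the utility values in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def logit_probs(utilities):
    """Multinomial logit probabilities exp(Z_j R) / sum_s exp(Z_s R), as in equation (7).

    `utilities` holds one index value Z_s R per location alternative.
    """
    shift = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def lee_correction(p_chosen):
    """Correction term phi(Phi^{-1}(P)) / P for the chosen alternative.

    p_chosen must lie strictly between 0 and 1; NormalDist.inv_cdf raises
    an error outside that range.
    """
    std_normal = NormalDist()
    return std_normal.pdf(std_normal.inv_cdf(p_chosen)) / p_chosen


if __name__ == "__main__":
    # Hypothetical index values for rural, non-metro-urban, and metro-urban.
    probs = logit_probs([0.2, 0.5, 1.1])
    for label, p in zip(["rural", "non-metro-urban", "metro-urban"], probs):
        print(f"{label}: P = {p:.3f}, correction = {lee_correction(p):.3f}")
```

The correction term shrinks as the predicted probability of the chosen location rises: a firm observed in a location it was very likely to choose carries little selection information, while an unlikely choice implies a larger adjustment.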
First, a total of 496 districts are categorized as rural, non-metro-urban, or metro-urban areas, and firms are hypothesized to choose their locations among them.[1] We then estimate a conditional logit model by regressing location choices on firm attributes, such as factor intensities, labor productivity, and age, as well as location attributes, such as market access, literacy, and infant mortality rate. The results show that 1) no location-specific attribute significantly affects the odds of choosing a particular location; 2) higher capital intensity increases the odds of locating in metro-urban areas but decreases the odds of locating in non-metro-urban areas; 3) higher labor intensity decreases the odds of locating in non-metro-urban or metro-urban areas; 4) higher labor productivity increases the odds of locating in metro-urban areas; and 5) greater firm age increases the odds of choosing non-metro-urban and metro-urban areas.[2]

[1] Location categories are defined based on population sizes and our judgment.
[2] The estimation results are included in Appendix A.

Based on the correction factor calculated from the first-stage estimation, a simple Cobb-Douglas production function with economic geography variables is estimated as follows:[3]

\[ \ln O_{ij} = \beta_{K} \ln K_{ij} + \beta_{L} \ln L_{ij} + \beta_{E} \ln E_{ij} + \beta_{M} \ln M_{ij} + \sum_{e} \gamma_{e} \ln EG_{ej} + \lambda C_{ij} \tag{10} \]

where O, K, L, E, and M are output, capital, labor, energy, and material, respectively; C is the location correction factor (i.e., Mills ratio) derived from the first-stage location choice model; EG represents economic geography variables, indexed by e; and the $\beta$, $\gamma$, and $\lambda$ terms are the parameters to be estimated.

We develop economic geography variables based on the new economic geography literature (Fujita, Krugman, & Venables, 1999). First, the transportation infrastructure significantly improves access to markets and inter-regional connectivity.
Accordingly, the availability of reliable transportation networks can reduce the unit cost of production and generate consumer surplus, thereby improving productivity and attracting private investment. Two transportation infrastructure-related measures are proposed to capture scale economies from improved market access and transportation networks. Market accessibility reflects the effects of improved access to consumer markets; distance to transport hubs captures the effects of location in transportation networks.

In addition, the model includes industry concentration and urban density variables to capture classic localization and urbanization economies, respectively (Hoover, 1937). Firms located in close proximity to other firms in the same industry often share skilled labor and industry-specific knowledge (i.e., localization economies). They can also benefit from more efficient subcontracting and possibilities for collectively lobbying regulators. On the other hand, firms located in large urban areas can benefit from other sources, such as access to specialized professional services, a large labor pool, and availability of the general infrastructure (i.e., urbanization economies).

[3] We also estimated more complicated specifications (e.g., translog). The difference between models with and without the consideration of location choices was not as clear in more complicated models as in simple models. Although more complicated models are still conceptually sound, a large number of parameters may dilute the effects of the location correction factor. Since the purpose of this paper is to illustrate that the importance of economic geography may be exaggerated when firm location choices are not considered, we report the results from simple Cobb-Douglas production function models.

If the selectivity issue is indeed relevant, the correction factor is expected to be statistically significant.
However, whether incorporating firm location choices into the estimation process will completely wipe out the effects of economic geography is unclear. Part of the measured spatial external economies may be absorbed once the selectivity issue is resolved, and part may be offset by increased costs for labor, land, and transportation. Even so, economic geography may in theory still play a role in improving firm productivity, albeit a smaller one than was believed. Given that more productive firms are likely to locate in large urban areas, we hypothesize that the effects of economic geography variables in the production function estimation are overestimated when firm location choices are not taken into account.

4. Data and Measures

Data. To implement the proposed two-stage estimation model, we use establishment-level data from the 1994 Indian Annual Survey of Industries, conducted by the Central Statistical Office of India. The data include various plant-level attributes such as output, sales, labor, capital, materials, and energy use. These plant-level data are supplemented by district and metropolitan level demographic and economic geography variables that are designed to capture scale economies arising from the concentration of economic activities, such as improved market access and localization/urbanization economies. After deleting records that violate simple accounting principles, a total of 47,324 plants are used for the analysis.

Measures. This study measures traditional input and output variables as follows. Output is defined as the ex-factory value of products manufactured for sale during the accounting year. Capital is often measured by perpetual inventory techniques that require continuous observations of the same plant over time. These techniques, however, are difficult to use with micro-level survey data because sample sizes differ by year and a system for tracking firms over time does not exist. Instead, capital is defined as the gross value of the plant and machinery.
It includes not only the book value of the installed plant and machinery, but also the approximate value of the rented-in plant and machinery. Doms (1992) demonstrated that it is reasonable to define capital as a gross stock. Labor is defined as the total number of employee mandays worked and paid for by the factory during the accounting year. Energy is measured by the total purchase value of fuels, lubricants, electricity, and water consumed in the production process during the accounting year. Material is measured by the total delivered value of all raw materials, components, chemicals, and packing materials that entered into the production process during the accounting year.

Defining economic geography variables, particularly those related to transportation infrastructure, is not as straightforward as defining traditional input and output variables. In this analysis, we use the transport and market access variables developed in Lall et al. (2004), where access to markets is determined by the distance from and the size of market centers around the plant. Market accessibility is defined as

\[ I_{i} = \sum_{j} \frac{S_{j}}{d_{ij}^{\,b}} \tag{11} \]

where $I_{i}$ is the accessibility indicator estimated for location i, $S_{j}$ is a size indicator at destination j (e.g., population, purchasing power, or employment), $d_{ij}$ is a measure of distance between origin i and destination j, and b describes how increasing distance reduces the expected level of interaction. The measure is constructed based on the Indian road network and urban population centers. Lall et al. (2004) also calculated distances (measured by travel times) between district centroids and transport hubs to examine whether a short travel time to transport hubs generates external economies above and beyond the effects of market accessibility. At the industry level, a simple location quotient (LQ) is used to measure localization economies.
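The accessibility indicator of equation (11) and a location quotient can be computed along the following lines. This is a sketch under stated assumptions: the paper does not give the value of b or spell out its location quotient formula, so the default `b=1.0`, the skipping of zero-distance destinations, and the conventional employment-share definition of the LQ used here are assumptions. All function and argument names are hypothetical.

```python
def market_access(sizes, distances, b=1.0):
    """Accessibility indicator I_i = sum_j S_j / d_ij**b, as in equation (11).

    sizes[j] is the size indicator S_j of destination j, and distances[j]
    is the distance d_ij from origin i to destination j. Destinations at
    zero distance are skipped to avoid division by zero (an assumption).
    """
    return sum(s / d ** b for s, d in zip(sizes, distances) if d > 0)


def location_quotient(industry_local, total_local, industry_national, total_national):
    """Conventional location quotient: the industry's local employment share
    divided by its national employment share. Values above 1 indicate that
    the industry is over-represented in the location."""
    local_share = industry_local / total_local
    national_share = industry_national / total_national
    return local_share / national_share


if __name__ == "__main__":
    # Two hypothetical market centers of size 100 and 200 at distances 10 and 20.
    print(market_access([100, 200], [10, 20], b=1.0))
    print(market_access([100, 200], [10, 20], b=2.0))
    # Hypothetical employment counts for one industry in one district.
    print(location_quotient(50, 1000, 200, 10000))
```

A larger b makes distant markets count for less, so the choice of b determines how local the accessibility measure is.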
In addition, this study uses urban population density (i.e., the ratio of the urban population to the urban area of the district) as an indicator for urban scale economies. While many other studies have used urban sizes as a proxy for urbanization economies, we use density because it better reflects spatial concentration.

5. Spatial Industrial Concentration in India

The essence of economic geography is the spatial concentration of economic activities and subsequent economic benefits. Therefore, examining spatial concentration patterns of firms is the first necessary step when investigating economic geography. This section presents a brief overview of spatial industrial concentration in India. We examined spatial concentration patterns of 18 NIC 2-digit Indian industries using a concentration measure that Ellison and Glaeser (1997) recently proposed:

\[ r = \frac{\sum_{i=1}^{M} (s_{i} - x_{i})^{2} - \left(1 - \sum_{i=1}^{M} x_{i}^{2}\right) H}{\left(1 - \sum_{i=1}^{M} x_{i}^{2}\right)(1 - H)} \tag{12} \]

where $s_{i}$ is region i's share of the study industry, $x_{i}$ is the regional share of total employment, and H is the Herfindahl index of the industry's plant size distribution, $H = \sum_{j=1}^{N} z_{j}^{2}$, with $z_{j}$ denoting plant j's share of industry employment.

The Ellison-Glaeser (EG) index has several advantages over other widely used concentration indexes, such as location quotients (LQ) and Gini coefficients. First, the index is developed based on an explicit micro theory because it is derived from firm location choices. Second, the index takes on a value of zero when plant location distribution patterns are random (as opposed to uniform). Therefore, it captures agglomeration above and beyond what we would observe if firm location decisions were random. Third, the index is designed to make comparisons across industries, countries, and over time.

[Table 1 Here]

We calculate the raw concentration measure G, Herfindahl index H, and EG index r for 18 NIC 2-digit Indian industries.
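The three quantities just named can be computed as in the sketch below, which follows equation (12) with the standard Ellison and Glaeser (1997) form in which the regional employment shares enter as squares. The function names and the toy shares in the usage lines are hypothetical, and the inputs are assumed to be shares that each sum to one.

```python
def herfindahl(plant_shares):
    """Herfindahl index H: the sum of squared plant shares of industry employment."""
    return sum(z * z for z in plant_shares)


def raw_concentration(industry_shares, employment_shares):
    """Raw concentration G: the sum over regions of (s_i - x_i)**2."""
    return sum((s - x) ** 2 for s, x in zip(industry_shares, employment_shares))


def eg_index(industry_shares, employment_shares, h):
    """EG index r from equation (12).

    industry_shares[i] is region i's share of the industry (s_i),
    employment_shares[i] is region i's share of total employment (x_i),
    and h is the industry's Herfindahl index.
    """
    g = raw_concentration(industry_shares, employment_shares)
    scale = 1.0 - sum(x * x for x in employment_shares)
    return (g - scale * h) / (scale * (1.0 - h))


if __name__ == "__main__":
    # Toy example: three regions, an industry made up of four plants.
    s = [0.7, 0.2, 0.1]
    x = [0.4, 0.4, 0.2]
    h = herfindahl([0.4, 0.3, 0.2, 0.1])
    print(raw_concentration(s, x), h, eg_index(s, x, h))
```

Subtracting the H term is what separates the index from the raw measure G: an industry with a few very large plants will look geographically concentrated even if its plants were located at random, and the adjustment removes that component.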
Following Ellison and Glaeser's definition of concentration (r<0.02: not very localized, 0.02<=r<=0.05: intermediate, and r>0.05: highly localized), jute textiles, beverages, leather/leather products, miscellaneous food products n.e.c., wood/wood products, textile products, and wool/silk textiles show very high levels of local concentration, whereas non-metallic mineral products, transport equipment/parts, machinery other than transport/electronic/electrical, electronic/electrical machinery/parts/apparatus, rubber/petroleum/coal products, metal products, and paper/paper products are hardly localized. The results indicate that more resource-intensive industries tend to be more locally concentrated. Overall, spatial industrial distribution patterns in India resemble the concentration patterns of the U.S. manufacturing industries that Ellison and Glaeser investigated.

We then examine labor productivity in rural, non-metro-urban, and metro-urban areas. A simple comparison of productivity does not prove any causal relationship between economic geography and productivity differences. It is, however, meaningful since it can highlight important characteristics of firms located in different areas, which might result from location choices. Table 2 illustrates that there is a noticeable difference in labor productivity among firms in rural, non-metro-urban, and metro-urban areas. Firms in large urban areas are substantially more productive than those in rural areas. The difference might be an outcome of economic geography, firm location choices, or both.

[Table 2 Here]

6. Results

To illustrate a potential bias created by the firm location decision process, we estimate two sets of Cobb-Douglas production functions for 18 NIC 2-digit Indian industries: one with the location correction factor derived from firm location choices and the other without it. For both cases, we run simple OLS models with and without regularity restrictions (i.e., monotonicity and quasiconcavity).
Regularity restrictions do not make any substantial difference in overall results. Therefore, this section discusses results from models with regularity restrictions.

A major difference between this paper and others is the inclusion of the location correction factor in the production function estimation, which can reveal a potential selection bias arising from firm location choices. The significance level of the correction factor indicates whether the two-stage estimation process that takes into account firm location choices is indeed necessary. If the correction factor is not statistically significant, firm location choices do not create any estimation bias, which implies that firms' location decisions are effectively random. This is often the case in developing countries, where information on the market is limited: individual firms may make rational decisions with limited information, yet the collective firm location patterns can be close to random. Therefore, a comparison between the corrected model (with the correction factor) and the uncorrected model (without the correction factor) can illustrate a potential selection bias caused by firm location choices.

The correction factor is statistically significant in 15 out of 18 NIC 2-digit Indian industries, indicating a strong selection bias. Among economic geography variables, location quotient and urban density, which represent localization and urbanization economies, show mixed signs. Table 3 shows that, in both corrected and uncorrected models, the location quotient affects output levels negatively in six industries (miscellaneous food products n.e.c., non-metallic mineral products, metal products, textile products, wood/wood products, paper/paper products) and positively in three industries (wool/silk textiles, transport equipment/parts, and leather/leather products).
In addition, urban density affects output levels negatively in five industries (food products, miscellaneous food products n.e.c., chemical/chemical products, wool/silk textiles, and transport equipment/parts) and positively in two industries (jute textiles and textile products). This implies that centrifugal forces as well as centripetal forces of economic geography are in place. Firms are expected to benefit from spatial scale externalities arising from buyer-supplier linkages, a deep labor pool, knowledge spillovers, the availability of specialized services, and a general infrastructure. On the other hand, a significant concentration of economic activities can also cause negative externalities, such as competition, congestion, and pollution, that will increase the cost of doing business.

[Table 3 Here]

The two transportation-related economic geography variables show clearer patterns of association with output levels. In uncorrected models, market access significantly increases output levels in 11 industries (miscellaneous food products n.e.c., beverages, chemical/chemical products, rubber/petroleum/coal products, wool/silk textiles, basic metals/alloys, machinery other than transport/electronic/electrical, electronic/electrical machinery/parts/apparatus, textile products, paper/paper products, leather/leather products); distance to transport hubs significantly decreases output levels in 12 industries (food products, chemical/chemical products, rubber/petroleum/coal products, cotton textiles, wool/silk textiles, basic metals/alloys, metal products, machinery other than transport/electronic/electrical, electronic/electrical machinery/parts/apparatus, transport equipment/parts, paper/paper products, and leather/leather products). An interesting pattern emerges when the correction factor is added to the estimation.
Market access loses its statistical significance in five industries (chemical/chemical products, rubber/petroleum/coal products, electronic/electrical machinery/parts/apparatus, paper/paper products, and leather/leather products), and distance to transport hubs loses statistical significance in two industries (chemical/chemical products, leather/leather products). This implies that the traditional production function estimation, which ignores firm location choices, can produce biased estimates and lead to wrongly rejecting the null hypothesis that a parameter is zero. In addition, the results suggest that, once firm location choices are taken into account, transportation infrastructure in particular may not be as critical as was believed.

As far as the magnitude of parameters is concerned, economic geography variables in uncorrected models have a stronger influence on output levels than those in corrected models. In other words, the absence of the correction factor tends to inflate parameter estimates of the economic geography variables. In particular, when the correction factor is not included, the influence of market access and distance to transport hubs is exaggerated in 11 and 12 out of 18 industries, respectively. When these two variables are statistically significant, they are always overestimated without the correction factor. If we only consider industries with statistically significant correction factors, the importance of market access is overestimated in 10 out of 15 industries, and that of distance to transport hubs is also inflated in 12 out of 15 industries.

The results thus far indicate that the importance of economic geography, particularly the benefit of transportation infrastructure to productivity, is somewhat oversold. Estimates for scale externalities from the transportation infrastructure can be more significantly biased by firm location choices than those for localization and urbanization economies.
The transportation infrastructure is still, however, an important determinant of productivity for many firms and industries since market access and distance to transport hubs still play strong roles in production activities in six and ten industries, respectively, even after controlling for firm location choices. In sum, economic geography may not be hype, but its effects are smaller than typically believed.

7. Conclusion

Economic geography has become a mantra for many economists, geographers, and regional scientists. Many previous studies have tested the importance of economic geography for production activities and found a significant association between them. Methodologically, however, they have not taken into account that economic geography influences firm location choices. In other words, most previous research did not acknowledge that spatial scale economies in large urban areas materialize only after firms make their location decisions accordingly. When the contingent nature of economic geography is ignored, the validity of empirical findings can be seriously questioned.

This paper proposes a new approach to thinking about economic geography and illustrates a potential bias that can arise when firm location choices are not considered as part of economic geography. An analysis using microdata of Indian manufacturing firms shows that when firm location choices are not given proper consideration, the role of economic geography can be overemphasized. This is particularly true for transportation infrastructure. The results indicate that the importance of market access and distance to transport hubs is exaggerated in many industries. Economic geography still matters to many firms and industries even after firm location choices are taken into account as part of economic geography. Its magnitude, however, is not as large as has been believed.
Therefore, policymakers need to exercise caution when interpreting results from previous research and applying them to future regional development strategies.

References

Beeson, P. (1987). Total factor productivity growth and agglomeration economies in manufacturing, 1959-73. Journal of Regional Science, 27, 183-199.
Christaller, W. (1933). Die zentralen Orte in Süddeutschland. Jena: Gustav Fischer.
Dixit, A. K., & Stiglitz, J. E. (1977). Monopolistic competition and optimum product diversity. American Economic Review, 67, 297-308.
Domencich, T., & McFadden, D. (1975). Urban Travel Demand: A Behavioral Analysis. Amsterdam: North-Holland.
Doms, M. E. (1992). Essays on Capital Equipment and Energy Technology in the Manufacturing Sector. Ph.D. Dissertation: Univ. of Wisconsin at Madison.
Ellison, G., & Glaeser, E. L. (1997). Geographic concentration in U.S. manufacturing: A dartboard approach. Journal of Political Economy, 105(5), 889-927.
Feser, E. (2001). A flexible test for agglomeration economies in two US manufacturing industries. Regional Science and Urban Economics, 31, 1-19.
Fogarty, M., & Garofalo, G. (1988). Urban spatial structure and productivity growth in the manufacturing sector of cities. Journal of Urban Economics, 23, 60-70.
Fujita, M., Krugman, P., & Venables, A. (1999). The Spatial Economy: Cities, Regions, and International Trade. Cambridge, MA: MIT Press.
Fujita, M., & Ogawa, H. (1982). Multiple equilibrium and structural transition of non-monocentric urban configurations. Regional Science and Urban Economics, 12, 161-196.
Heckman, J. J. (1979). Sample selection bias as a specification error. Econometrica, 47, 153-161.
Helsley, R., & Strange, W. (1990). Matching and agglomeration in a system of cities. Regional Science and Urban Economics, 20, 189-212.
Henderson, V. (1986). Efficiency of resource usage and city size. Journal of Urban Economics, 19, 47-70.
Hoover, E. M. (1937). Location Theory and the Shoe and Leather Industries. Cambridge, MA: Harvard Univ. Press.
Krugman, P. (1980). Scale economies, product differentiation, and the pattern of trade. American Economic Review, 70, 950-959.
Krugman, P. (1991a). Geography and Trade. Cambridge, MA: MIT Press.
Krugman, P. (1991b). Increasing returns and economic geography. Journal of Political Economy, 99, 483-499.
Lall, S., Shalizi, Z., & Deichmann, U. (2004). Agglomeration economies and productivity in Indian industry. Journal of Development Economics, 73, 643-673.
Lee, L. F. (1982). Some approaches to the correction of selectivity bias. Review of Economic Studies, 49, 355-372.
Losch, A. (1956). The Economics of Location. New Haven, CT: Yale Univ. Press.
Maddala, G. S. (1986). Limited Dependent and Qualitative Variables in Econometrics. Cambridge Univ. Press.
McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Frontiers in Econometrics. New York: Academic Press.
Moomaw, R. L. (1981). Productivity and city size: A critique of the evidence. Quarterly Journal of Economics, 96, 675-688.
Moomaw, R. L. (1988). Agglomeration economies: Localization or urbanization? Urban Studies, 25, 150-161.
Nakamura, R. (1985). Agglomeration economies in urban manufacturing industries, a case of Japanese cities. Journal of Urban Economics, 17, 108-124.
Pred, A. (1966). The Spatial Dynamics of U.S. Urban-Industrial Growth, 1800-1914: Interpretive and Theoretical Essays. Cambridge, MA: MIT Press.
Venables, A. (1996). Equilibrium locations of vertically linked industries. International Economic Review, 37, 341-359.
von Thünen, J. H. (1826). Der isolierte Staat in Beziehung auf Landwirtschaft und Nationaloekonomie. Hamburg: F. Perthes.
Weber, A. (1929). Theory of the Location of Industries. Chicago: Univ. of Chicago Press.

[Table 1] Concentration of Indian Industries

Industry                                                   NIC Code  No. of States       G       H       r
Jute Textiles                                                    25             12   0.548   0.021   0.570
Beverages                                                        22             23   0.313   0.019   0.329
Leather and Leather Products                                     29             17   0.143   0.012   0.146
Miscellaneous Food Products, n.e.c.                              21             24   0.092   0.003   0.098
Wood and Wood Products                                           27             26   0.079   0.007   0.080
Textile Products                                                 26             20   0.066   0.002   0.070
Wool and Silk Textiles                                           24             20   0.058   0.006   0.058
Food Products                                                    20             26   0.043   0.001   0.046
Basic Metals and Alloys                                          33             24   0.053   0.020   0.038
Cotton Textiles                                                  23             21   0.029   0.002   0.030
Chemicals and Chemical Products                                  30             24   0.027   0.002   0.027
Non-Metallic Mineral Products                                    32             26   0.019   0.001   0.019
Transport Equipment and Parts                                    37             22   0.025   0.009   0.018
Machinery other than Transport/Electronic/Electrical             35             22   0.018   0.006   0.013
Electronic and Electrical Machinery, Parts, and Apparatus        36             24   0.018   0.009   0.010
Rubber, Petroleum and Coal Products                              31             24   0.011   0.005   0.007
Metal Products                                                   34             27   0.007   0.004   0.002
Paper and Paper Products                                         28             25   0.006   0.004   0.002
Mean                                                                                 0.083   0.008   0.083

Source: Annual Survey of Indian Industries

[Table 2] Location and Productivity

Location           No. of Firms   Labor Productivity
Rural                    12,378              1,022.7
Non-metro Urban          24,691              1,163.6
Metro Urban              10,255              1,391.2
Total                    47,324              1,176.0

Source: Annual Survey of Indian Industries

[Table 3] Cobb-Douglas Production Function Estimation with Economic Geography Variables*

* Bold represents significant economic geography variables at <0.05; grey scale represents potentially overestimated economic geography variables.

Food Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        4.079     0.134     3.991     0.130
Capital          0.090     0.006     0.092     0.006
Labor            0.250     0.009     0.248     0.009
Energy           0.198     0.008     0.199     0.008
Material         0.431     0.003     0.431     0.003
LQ              -0.001     0.005    -0.002     0.005
Density         -0.055     0.004    -0.055     0.004
Access          -0.021     0.013    -0.017     0.013
Hub             -0.018     0.003    -0.020     0.003
Correction       0.043     0.016

Chemicals and Chemical Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.524     0.184     2.063     0.169
Capital          0.074     0.006     0.079     0.006
Labor            0.256     0.010     0.250     0.010
Energy           0.164     0.008     0.165     0.008
Material         0.529     0.006     0.528     0.006
LQ              -0.004     0.006    -0.006     0.006
Density         -0.013     0.006    -0.011     0.006
Access           0.019     0.017     0.049     0.017
Hub             -0.005     0.003    -0.009     0.003
Correction       0.125     0.020

Miscellaneous Food Products, n.e.c.
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        3.440     0.197     2.991     0.195
Capital         -0.009     0.008    -0.005     0.008
Labor            0.410     0.009     0.413     0.009
Energy           0.161     0.008     0.159     0.009
Material         0.442     0.004     0.442     0.004
LQ              -0.034     0.007    -0.041     0.007
Density         -0.033     0.005    -0.038     0.005
Access           0.077     0.017     0.103     0.017
Hub              0.006     0.005    -0.002     0.005
Correction       0.224     0.022

Rubber, Petroleum and Coal Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.107     0.212     1.773     0.189
Capital          0.080     0.008     0.085     0.008
Labor            0.336     0.014     0.334     0.014
Energy           0.180     0.011     0.178     0.011
Material         0.466     0.007     0.465     0.007
LQ               0.007     0.006     0.006     0.006
Density         -0.003     0.007    -0.002     0.007
Access           0.033     0.020     0.057     0.019
Hub             -0.011     0.003    -0.014     0.003
Correction       0.075     0.022

Beverages
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.689     0.434     2.214     0.386
Capital          0.036     0.012     0.037     0.012
Labor            0.328     0.019     0.326     0.019
Energy           0.170     0.019     0.172     0.019
Material         0.446     0.012     0.444     0.012
LQ               0.009     0.014     0.011     0.014
Density          0.014     0.015     0.019     0.015
Access           0.082     0.041     0.117     0.038
Hub              0.009     0.010     0.005     0.010
Correction       0.122     0.051

Non-metallic Mineral Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.447     0.132     2.390     0.126
Capital          0.039     0.005     0.040     0.005
Labor            0.321     0.010     0.321     0.010
Energy           0.249     0.006     0.248     0.006
Material         0.420     0.005     0.420     0.005
LQ              -0.013     0.004    -0.013     0.004
Density          0.008     0.004     0.007     0.004
Access          -0.001     0.013     0.002     0.013
Hub              0.017     0.004     0.016     0.004
Correction       0.025     0.018
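
One way to read the corrected and uncorrected columns of Table 3 side by side is to compare each coefficient's ratio to its standard error against a two-sided 5% critical value. The sketch below does this for the Access row of the chemicals industry. The 1.96 normal-approximation threshold and the helper names `t_ratio` and `is_significant` are assumptions made for illustration; the paper states only that significance is assessed at <0.05.

```python
# Illustrative significance check using values copied from Table 3
# (chemicals industry, Access row).
# Assumption: two-sided 5% test with the large-sample normal critical value.

CRITICAL_VALUE = 1.96  # assumed threshold, not stated in the paper


def t_ratio(estimate: float, std_err: float) -> float:
    """Return the coefficient divided by its standard error."""
    return estimate / std_err


def is_significant(estimate: float, std_err: float) -> bool:
    """Hypothetical helper: True if |t| exceeds the assumed critical value."""
    return abs(t_ratio(estimate, std_err)) > CRITICAL_VALUE


# (estimate, standard error) pairs from Table 3
access_corrected = (0.019, 0.017)
access_uncorrected = (0.049, 0.017)

for label, (est, se) in [("corrected", access_corrected),
                         ("uncorrected", access_uncorrected)]:
    print(f"{label}: t = {t_ratio(est, se):.2f}, "
          f"significant = {is_significant(est, se)}")
```

Under these assumptions the uncorrected Access coefficient clears the threshold while the corrected one does not, which is the pattern the paper describes as potential overestimation.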
Cotton Textiles
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        4.375     0.187     4.244     0.182
Capital          0.052     0.008     0.054     0.008
Labor            0.219     0.012     0.219     0.012
Energy           0.196     0.010     0.195     0.010
Material         0.431     0.004     0.430     0.004
LQ               0.006     0.007     0.007     0.007
Density         -0.032     0.009    -0.028     0.009
Access           0.001     0.018     0.003     0.018
Hub             -0.019     0.006    -0.021     0.006
Correction       0.070     0.024

Basic Metals and Alloys
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.195     0.156     1.809     0.149
Capital          0.088     0.006     0.095     0.006
Labor            0.192     0.012     0.188     0.012
Energy           0.170     0.008     0.171     0.009
Material         0.529     0.006     0.529     0.006
LQ               0.010     0.005     0.009     0.005
Density          0.002     0.006     0.005     0.006
Access           0.062     0.014     0.080     0.014
Hub             -0.007     0.003    -0.012     0.003
Correction       0.132     0.018

Wool and Silk Textiles
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        3.868     0.253     3.474     0.245
Capital          0.098     0.010     0.106     0.010
Labor            0.204     0.015     0.209     0.015
Energy           0.138     0.013     0.132     0.013
Material         0.462     0.006     0.461     0.006
LQ               0.021     0.006     0.018     0.006
Density         -0.031     0.011    -0.023     0.011
Access           0.052     0.027     0.066     0.027
Hub             -0.019     0.006    -0.030     0.005
Correction       0.182     0.033

Metal Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.707     0.179     2.560     0.160
Capital          0.086     0.007     0.087     0.007
Labor            0.295     0.011     0.296     0.011
Energy           0.142     0.008     0.141     0.008
Material         0.493     0.006     0.492     0.006
LQ              -0.024     0.006    -0.025     0.006
Density         -0.001     0.006     0.001     0.006
Access           0.000     0.017     0.010     0.016
Hub             -0.006     0.003    -0.008     0.003
Correction       0.038     0.021

Jute Textiles
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        1.267     1.115     1.316     0.987
Capital          0.026     0.028     0.026     0.028
Labor            0.255     0.054     0.254     0.054
Energy           0.191     0.044     0.191     0.044
Material         0.472     0.024     0.473     0.024
LQ               0.046     0.029     0.045     0.027
Density          0.126     0.032     0.126     0.032
Access           0.122     0.104     0.119     0.098
Hub             -0.034     0.019    -0.034     0.018
Correction      -0.010     0.110

Machinery other than Transport/Electronic/Electrical
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.328     0.158     2.084     0.144
Capital          0.055     0.006     0.057     0.006
Labor            0.327     0.011     0.329     0.011
Energy           0.145     0.009     0.142     0.009
Material         0.515     0.005     0.513     0.005
LQ              -0.002     0.005    -0.004     0.005
Density         -0.009     0.006    -0.007     0.006
Access           0.037     0.016     0.055     0.016
Hub             -0.017     0.003    -0.020     0.003
Correction       0.066     0.018

Electronic and Electrical Machinery, Parts, and Apparatus
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        0.977     0.350     0.835     0.325
Capital          0.028     0.011     0.028     0.011
Labor            0.362     0.016     0.366     0.015
Energy           0.196     0.015     0.196     0.015
Material         0.421     0.007     0.419     0.007
LQ              -0.024     0.011    -0.026     0.011
Density          0.027     0.016     0.033     0.015
Access           0.245     0.038     0.250     0.038
Hub             -0.002     0.006    -0.003     0.006
Correction       0.043     0.040

Textile Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.453     0.227     2.046     0.187
Capital          0.047     0.008     0.050     0.007
Labor            0.317     0.013     0.320     0.013
Energy           0.148     0.011     0.145     0.011
Material         0.534     0.007     0.532     0.007
LQ              -0.002     0.006    -0.003     0.006
Density         -0.002     0.008     0.000     0.007
Access           0.011     0.024     0.043     0.021
Hub             -0.010     0.004    -0.013     0.004
Correction       0.074     0.023

Wood and Wood Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        4.086     0.279     3.632     0.270
Capital          0.000     0.010     0.001     0.010
Labor            0.330     0.018     0.332     0.018
Energy           0.287     0.014     0.283     0.014
Material         0.361     0.006     0.360     0.006
LQ              -0.019     0.009    -0.015     0.009
Density          0.007     0.007     0.011     0.007
Access          -0.040     0.024    -0.011     0.024
Hub              0.003     0.005    -0.002     0.005
Correction       0.173     0.031

Transport Equipment and Parts
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        3.459     0.269     2.902     0.244
Capital          0.017     0.011     0.026     0.010
Labor            0.368     0.015     0.377     0.015
Energy           0.120     0.014     0.111     0.014
Material         0.509     0.008     0.506     0.008
LQ               0.022     0.008     0.021     0.008
Density         -0.038     0.011    -0.038     0.011
Access           0.010     0.029     0.042     0.028
Hub             -0.021     0.005    -0.027     0.005
Correction       0.141     0.030

Paper and Paper Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        2.668     0.227     2.449     0.201
Capital          0.076     0.008     0.077     0.008
Labor            0.339     0.013     0.340     0.013
Energy           0.130     0.009     0.129     0.009
Material         0.470     0.007     0.469     0.007
LQ              -0.040     0.007    -0.039     0.007
Density          0.007     0.007     0.010     0.007
Access           0.023     0.021     0.039     0.020
Hub             -0.007     0.003    -0.009     0.003
Correction       0.053     0.025

Leather and Leather Products
                 Corrected Model   Uncorrected Model
Variable      Estimate    StdErr  Estimate    StdErr
Intercept        3.143     0.758     2.030     0.647
Capital          0.002     0.019     0.010     0.018
Labor            0.382     0.027     0.387     0.027
Energy           0.239     0.024     0.225     0.023
Material         0.395     0.010     0.394     0.010
LQ               0.029     0.014     0.029     0.014
Density          0.023     0.019     0.007     0.018
Access           0.037     0.081     0.153     0.070
Hub             -0.007     0.010    -0.020     0.009
Correction       0.179     0.064

Appendix A. Location Selection Model Estimation

Variable                                Coefficient   Hazard Ratio
Non-metro-urban                           0.871150*          2.390
Metro-urban                              -0.241080*          0.790
Market Access                             0.000001           1.000
Literacy                                 -0.000280           1.000
Infant Mortality                          0.006510           1.007
Capital intensity × Non-metro-urban      -0.466980*          0.627
Capital intensity × Metro-urban           0.722100*          2.059
Labor intensity × Non-metro-urban        -0.756240*          0.469
Labor intensity × Metro-urban            -0.336060*          0.715
Labor productivity × Non-metro-urban      0.000010           1.001
Labor productivity × Metro-urban          0.000063*          1.001
Age × Non-metro-urban                     0.000942*          1.001
Age × Metro-urban                         0.000869*          1.001

* Significant at <0.05
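
The hazard ratios in Appendix A appear to be the exponentials of the coefficients (for example, exp(0.871150) is approximately 2.390). The sketch below recomputes several of them under that assumption. The paper does not state the formula, and the dictionary `appendix_rows` and the function `hazard_ratio` are illustrative names only.

```python
import math

# Coefficients and reported hazard ratios copied from Appendix A.
# Assumption: hazard ratio = exp(coefficient); the paper does not spell this out.
appendix_rows = {
    "Non-metro-urban": (0.871150, 2.390),
    "Capital intensity x Non-metro-urban": (-0.466980, 0.627),
    "Capital intensity x Metro-urban": (0.722100, 2.059),
    "Labor intensity x Non-metro-urban": (-0.756240, 0.469),
    "Labor intensity x Metro-urban": (-0.336060, 0.715),
}


def hazard_ratio(coefficient: float) -> float:
    """Exponentiate a coefficient to obtain the implied hazard ratio."""
    return math.exp(coefficient)


for name, (coef, reported) in appendix_rows.items():
    computed = hazard_ratio(coef)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

Read this way, a ratio above 1 (such as capital intensity interacted with metro-urban location) indicates a higher likelihood of choosing that location type, and a ratio below 1 a lower one.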