Policy Research Working Paper 9433

In Search of Better Opportunities
Sorting and Agglomeration Effects among Young College Graduates in Colombia

Marigee Bacolod
Jorge De la Roca
María Marta Ferreyra

Latin America and the Caribbean Region
Office of the Chief Economist
October 2020

Abstract

This paper studies the dynamic sorting of workers prior to labor market entry that leads to skill differences across cities of different sizes, as well as its consequences on the estimation of agglomeration effects. Using rich administrative data for young, college-educated workers in Colombia, the paper shows that the most talented and best trained sort to big cities primarily because they attend college there and remain for work. The availability of colleges in an individual’s high school city, parental resources, and high school city size are the most important determinants of the decision to move for college. The relatively less able remain in medium and small cities or move there for work after attending college in big cities. Pre–labor market sorting thus concentrates population and skill in big cities. As a result of this sorting, agglomeration effects are stronger for college than work city size, even after controlling for mediating factors such as individual ability or college selectivity.

This paper is a product of the Office of the Chief Economist, Latin America and the Caribbean Region. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at mferreyra@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues.
An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

In search of better opportunities: Sorting and agglomeration effects among young college graduates in Colombia

Marigee Bacolod*‡
Naval Postgraduate School

Jorge De la Roca*§
University of Southern California

María Marta Ferreyra*¶
The World Bank

Key words: agglomeration effects, regional migration, spatial inequality
JEL classification: J15, R23

* We thank Andrea Franco for her exceptional research assistance. Thanks to Harris Selod for helpful comments and discussions.
‡ Corresponding author. Graduate School of Defense Management, Naval Postgraduate School, 555 Dyer Road, Monterey, CA 93943 (e-mail: mbacolod@nps.edu; website: https://sites.google.com/site/marigeebacolod/).
§ Sol Price School of Public Policy, University of Southern California, 650 Childs Way RGL 326, Los Angeles, CA 90089, USA (e-mail: jdelaroc@usc.edu; website: http://jorgedelaroca.name).
¶ Office of the Chief Economist Latin America and the Caribbean, The World Bank, 1818 H Street, NW, MSN I8-808, Washington, DC 20433 (e-mail: mferreyra@worldbank.org; website: https://sites.google.com/site/mariamartaferreyraphd/).

1. Introduction

Workers earn higher wages in bigger cities. One explanation is the learning advantages that bigger cities provide, as well as the larger array of opportunities for workers to match with a firm.
However, an alternative explanation for the higher observed wages is the sorting of more skilled workers into bigger cities. We know workers in bigger cities exhibit higher educational attainment and tend to work in occupations that demand more skills. Yet, we continue to know little about the contribution of internal migration to the sorting of more skilled workers into bigger cities.

Recent studies that examine differences in skills between workers in big and small cities rely on rich administrative worker-level panel data that follow workers throughout their careers (Combes, Duranton, Gobillon, and Roux, 2012; Carlsen, Rattsø, and Stokke, 2016; Matano and Naticchioni, 2016; De la Roca and Puga, 2017). However, sorting may take place prior to entering the labor force as more skilled individuals may leave their small towns upon college graduation and move to a big city to commence their job search and career. Furthermore, sorting may occur even earlier as high-ability high school graduates may move to a big city to pursue a college degree and remain there after graduation to work. The sorting of skilled individuals into big cities can, therefore, occur prior to labor market entry and may be a dynamic process.

In this paper we study the dynamic sorting that leads to differences in skills across cities of different sizes and its consequences on estimated agglomerative effects for work, college, and high school city, using rich administrative data for young college-educated workers in Colombia. For college graduates who work as employees in the formal sector of the economy, we observe their wages, as well as their college, degree, and major. We also observe a rich set of background characteristics, including parental income and education, as well as individual scores in the mandatory national high school exit exam, Saber 11, which measures academic readiness for higher education (or ‘ability’ for short).
Importantly, we observe the location of an individual’s high school, college, and work sites.

Existing evidence for developed countries shows an important role for the pre-labor market sorting of more educated and skilled individuals into big cities. Ahlin, Andersson, and Thulin (2017) find that individuals with higher grades in high school and more educated parents sort into bigger cities in Sweden, largely as a result of attending college there.[1] Winters (2011) finds that US cities with a high share of college graduates tend to attract more undergraduates who end up developing local human capital and remaining there after graduation. In the closest study to ours, Suhonen (2013) shows that the positive earnings gap between Helsinki and smaller cities in Finland among young college graduates can be fully explained by students’ pre-university characteristics such as high school grades and parental education. Meanwhile, Bosquet and Overman (2019) show the significant role of birthplace on adult wages in the UK, which accounts for 4.2%, or two thirds, of the 6.8% raw elasticity of wages with respect to city size. They conclude that lifetime immobility explains much of the correlation between birthplace and current city size: in their sample of Britons, 43.7% of workers live in the same place where they were born.

[1] Ahlin, Andersson, and Thulin (2017) focus on urban vs. rural differences in Sweden and do not investigate the consequences of such sorting on wages.

To our knowledge, our study is the first that uses data outside of relatively homogeneous developed country settings. In addition to exploring pre-labor market sorting, we investigate its impact on estimated agglomeration effects. Pre-labor market sorting is likely more consequential in a developing than a developed country due to wider disparities in college access and quality between big and small cities, and due to tighter credit constraints that restrict young people’s ability to move for college or work.
Our conjecture is that, in this context, young workers’ migratory decisions prior to labor market entry are largely driven by access to colleges, household income, and individual ability. These decisions may, in turn, generate divergence in wages by city size and a spatial distribution in which the most able workers locate in big cities, while the least able locate in smaller ones.

We find a significant urban wage premium in Colombia, an elasticity of wages with respect to city population of 0.052, in line with the previous literature (Duranton, 2016). More importantly, our estimates suggest a substantial effect on wages for the size of the city where individuals obtain their college degree, which turns out to be relatively more important than the size of the city where they work or finish high school. We find this to be the case even when controlling for various mediating factors such as individual ability, college selectivity, parental background, and college major.

Potential reasons for the relative importance of college city size come directly from the micro-foundations of agglomeration: matching, learning, and the importance of networks (Duranton and Puga, 2004; Puga, 2010). Thicker markets in bigger cities generally provide more and better matches with college students’ skills and preferences, whether it be in college major choices or urban amenities. Bigger cities also provide more settings for learning, experimentation, and knowledge spillovers, such as those arising from additional summer internships or professional interactions. Further, they provide larger and denser networks that facilitate, for instance, a job search.

Unobserved ability and pre-labor market sorting may also drive this observed college city size premium (Faggian and McCann, 2006; Suhonen, 2013).
To investigate this and test our hypothesis, we explore the determinants of the sequential migration choices of individuals in our sample: whether to move for college to a bigger city, and then again for work to a smaller or bigger city. Given endogeneity concerns, we instrument for the mobility decision for college using the extent of college availability in the city where an individual lived during her high school years.

The data reveal substantial mobility frictions, as 67 percent of individuals in our sample do not move at all, 82 percent do not move for college, and 55 percent of those who move for college do not move for work.[2] According to our estimates, individuals who attend high school in cities with little college availability are more likely to move for college, as are individuals from small high school cities, even controlling for local college availability. Regardless of whether they move for college, individuals from small and medium high school cities are more likely than others to move after college for work, particularly to a bigger city. Further, the factors that make an individual more likely to move for college (such as attending a highly selective college, or having some unobserved traits) also make her more likely to stay in the college city for work.

[2] The proportion of college graduates who do not move at all is higher than the one found in the UK by Bosquet and Overman (2019), but in line with the low rates of college mobility across macro-regions in Italy provided by Brunello and Cappellari (2008).

Through this sorting, population quantity and quality concentrate into big cities. Prior to moving for college, the spatial distribution of ability, as proxied by Saber 11, among high school graduates in our data is already more favorable in big than in smaller cities. Migration only intensifies this advantage, as big cities attract migrants (population quantity) who are highly talented (population quality).
Meanwhile, the relatively less able either remain in small and medium cities, or move there from bigger ones.

This sorting, along with agglomeration effects, is reflected in the distribution of wages across individuals and over space. We find that, in cities of every size, migrants earn more than non-movers.[3] In big cities, top earnings accrue to individuals who move there for college and stay. In medium and small cities, they accrue to individuals who arrived there for work, having gone to college in bigger cities. Importantly, individuals who move to bigger cities after college tend to earn less than those who move to smaller cities.

It is precisely the individuals who move for work after college who separately identify the college and work city size agglomeration effects. Those who move to smaller cities for work are relatively high-ability individuals who attend college in big cities, where they train at selective, private institutions. Although they are not as able as workers in big cities, they are positively selected into their smaller work cities, where they are the most able, highest-earning workers. In contrast, individuals moving to bigger cities for work are not necessarily the most able and have attended less-selective colleges in small and medium cities. Although they are more able than their peers who do not move out of their college city, they are not positively selected into their big work cities. Taken together, the two types of after-college move help explain our finding of a much higher elasticity of earnings with respect to college than work city size.

This finding highlights the importance of our pre-labor market locational data.
Administrative data sets usually follow an individual from the beginning of her work life and use repeated observations per individual to handle the inherent self-selection into the work city, with identification of (work city) agglomeration effects being driven by migrants (Combes, Duranton, and Gobillon, 2008; D’Costa and Overman, 2014; De la Roca and Puga, 2017). In contrast, our pre-labor market locational data enable us to examine the spatial trajectory leading individuals to their work city, thereby underscoring at what point of this trajectory a large city pays off the most.

Our study furthermore highlights the role of luck. Individuals born in big cities or who grew up there do not need to move in order to attend college, in contrast with those who grew up in small cities with limited college access. Given the spatial disparities in college access and quality in Colombia, policies aimed at addressing poverty and inequality should consider lowering the costs of college mobility for talented high school graduates from small towns and cities.

The rest of this paper is organized as follows. Section 2 describes the data and institutional setting. Section 3 discusses our results, and Section 4 concludes.

[3] See Greenwood (1997) for a survey on evidence of migrants being positively selected relative to non-movers. Borjas, Bronars, and Trejo (1992) show that more educated and productive workers in the US are more likely to migrate regardless of their state of origin. Hunt (2004) finds that migrants are more skilled than non-movers across federal states of West Germany. De la Roca (2017) shows that migrants across Spanish cities are more educated and productive than comparable non-movers in their city of origin.

2. Data

Colombia: Institutional setting

Colombia is a developing economy with a per capita Gross National Income of US$6,180. In terms of per capita GDP, Colombia is below the Latin America and the Caribbean average.
As with most developing countries in the midst of a demographic transition, Colombia is highly urbanized with 71.3% of its population living in urban areas (Ferreyra and Roberts, 2018).

Students in Colombia in their final year of high school are mandated to take the Saber 11, a standardized test covering multiple academic fields similar to high school exit exams in the US. Qualified high school graduates may then enroll into either of two program types offered in Colombia’s higher education system: four- and five-year programs similar to US bachelor’s programs or shorter two- or three-year programs. Higher education institutions (HEIs) that offer bachelor’s programs include universities (henceforth, "colleges"), technological schools, and technical and technological (hereafter, T&T) institutes. In what follows, ‘moving for college’ means, in general, moving to pursue higher education, regardless of the HEI type attended.

Most departments in Colombia have a large public university, with the largest public universities usually being the most selective.[4] Average tuition at public colleges is significantly lower than at private colleges; calculations by Carranza and Ferreyra (2019) show that for an individual with annual family income equal to 12 monthly minimum wages, annual tuition for a bachelor’s program is 19% of income at a public college versus 100% at a private college. In addition to costs, average student ability by college varies substantially within and across the public and private sectors (Carranza and Ferreyra, 2019).

Data sources and sample

Our sample draws from two datasets. The first one is the universe of high school students who took Saber 11 between 2000 and 2009. For these students, we observe test scores (which we normalize by semester) as well as a rich set of characteristics such as gender, ethnic origin, and parental income. Importantly, this data set records the location of the high school attended by the student.
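The normalization of Saber 11 scores by semester can be illustrated with a minimal sketch. The text states only that scores are normalized by semester, so the sketch assumes a z-score within each test semester (subtract the semester mean, divide by the semester standard deviation). The use of the population standard deviation, the semester labels, and the function name `standardize_by_semester` are illustrative assumptions, not the authors' procedure.

```python
from statistics import mean, pstdev


def standardize_by_semester(scores_by_semester):
    """Return z-scores computed separately within each test semester.

    scores_by_semester: {semester label: list of raw scores}
    Assumption: every semester contains at least two distinct scores,
    so the standard deviation is strictly positive.
    """
    standardized = {}
    for semester, scores in scores_by_semester.items():
        mu = mean(scores)
        sigma = pstdev(scores)  # population s.d.; this choice is an assumption
        standardized[semester] = [(s - mu) / sigma for s in scores]
    return standardized


# Hypothetical raw scores for two test semesters.
raw = {"2000-1": [40, 50, 60], "2000-2": [10, 20]}
z = standardize_by_semester(raw)
```

Standardizing within semester makes scores comparable across test dates whose raw scales may differ, which is presumably why a score such as 0.94 can be read as standard deviations above the average test taker.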
The second data set is administrative data from the Labor Observatory of Education on college graduates. For these individuals we observe HEI, degree, and major. For those who work as employees in the formal sector between 2007 and 2013, we also observe wages and work location. The combination of these two datasets yields a dataset of young college graduates (n = 729,726) for whom we observe HEI, degree, major, Saber 11 standardized score, background characteristics, and importantly, location when finishing high school, attending college, and beginning to work.

[4] A department is similar to a US state and is a relatively autonomous country subdivision governed by a Governor. Bogota is the Capital District and not one of the 32 departments.

We first restrict the data to those who obtained a bachelor’s degree (n = 473,775). We further limit the sample to individuals who had graduated from college by 2012 (since the last year we observe a wage is 2013), were aged between 20 and 30 at college graduation, took the Saber 11 test between the ages of 14 and 24, had at least 4 years between taking Saber 11 and completing their bachelor’s degree, and are 35 years of age or younger at the time we observe their wages. These age restrictions ensure we compare individuals at a similar stage in their lifecycle profile and result in 389,749 observations. Our sample is further restricted to exclude those with missing locations and all observations in 2008 because the geographic identifiers in that year were incorrectly recorded (resulting in n = 310,581).
Excluding workers with zero observed wages (either because they are self-employed, long-term unemployed, not in the labor force, or informally employed), we have n = 252,788 individuals.[5] Our next restriction is to exclude workers in the health sector, who are subject to a mandatory rural service requirement for doctors and college graduates in these fields.[6] For ease of interpretation in our results below, we also leave out individuals who moved to a smaller city for college. Only 2.7% of the sample at this stage moves to a smaller city for college, whereas 17.7% of individuals move to a bigger city. Nonetheless, the inclusion of those who move to a smaller city for college does not alter our results. Our final sample comprises n = 212,023 individuals, for whom we use their last recorded annual wage if they are observed in multiple years, and we deflate nominal annual wages using the Colombian CPI (Dec 2008 = 100).

Definition of cities

For the purposes of the analyses that follow, we define a ‘city’ as the metropolitan area constructed from the aggregation of municipalities that are interconnected through commuting flows, as laid out in Duranton (2015). Duranton (2015) takes all pairs of origin and destination municipalities, and designs an algorithm that flags those pairs for which the share of commuters from the origin is above some chosen commuting threshold. Around each ‘urban core’ or central city, the algorithm then identifies the commuting shed as the set of areas whose commuter share is above the commuting threshold. Together, the urban core and the commuting shed thus constitute the metropolitan area. Duranton (2015) also proposes a commuting threshold of 10%, which appears to be reasonable given that Colombian municipalities tend to be fairly large. From here on, when we discuss cities, we are referring to the metropolitan areas as defined by this algorithm.
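The commuting-threshold idea described above can be sketched in a few lines. This is a deliberately simplified, single-pass illustration and not Duranton's (2015) actual algorithm, whose details are not given here: it takes the set of urban cores as given, computes for each other municipality the share of its resident workers commuting to each core, and attaches the municipality to the core with the largest share if that share reaches the threshold. The function name, argument names, and the toy data are all hypothetical.

```python
def commuting_sheds(flows, workers, cores, threshold=0.10):
    """Assign municipalities to metropolitan areas in a single pass.

    flows:     {(origin, destination): number of commuters}
    workers:   {municipality: total resident workers}
    cores:     municipalities treated as urban cores (taken as given)
    threshold: minimum share of an origin's workers commuting to a core
    Returns {core: set of municipalities in its metropolitan area}.
    """
    core_list = sorted(set(cores))  # sorted for deterministic tie-breaking
    metro = {core: {core} for core in core_list}
    for origin, total in workers.items():
        if origin in metro or total <= 0:
            continue
        best_core, best_share = None, 0.0
        for core in core_list:
            # Share of the origin's workers who commute to this core.
            share = flows.get((origin, core), 0) / total
            if share > best_share:
                best_core, best_share = core, share
        if best_core is not None and best_share >= threshold:
            metro[best_core].add(origin)
    return metro


# Toy example: B sends 15% of its workers to core A, C only 5%,
# and D sends 15% to core E.
flows = {("B", "A"): 15, ("C", "A"): 5, ("D", "E"): 30}
workers = {"A": 1000, "B": 100, "C": 100, "D": 200, "E": 500}
areas = commuting_sheds(flows, workers, cores=["A", "E"])
```

With the 10% threshold, B joins the metropolitan area of A and D joins that of E, while C stays outside any metropolitan area. A lower threshold would produce larger commuting sheds, which is why the choice of threshold matters when municipalities are large.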
Next we classify a city as big if it has a population of 2.5 million or more (this would include Bogota, Medellin, and Cali), medium if it has a population of 400,000 to 2.5 million, and small if it has fewer than 400,000 inhabitants.

To define a migratory move across cities, we first create a matrix of road distances for all municipalities. We then define an individual in our data as having moved for college if her high school city is different from her college city and the road distance between those two cities is greater than 50 kilometers. Moves after college for work are defined similarly.

[5] Our data do not allow us to identify the work status of an individual outside formal employee or formal self-employee. We observe some individuals over a few years so we can identify them when working, even if they start their careers in unemployment or out of the labor force.
[6] Thus, mobility after college for health care workers is not voluntary and their salaries are set at the national level.

Descriptive statistics

Table 1: Summary statistics

                                              Mean     S.D.    Min    Max
Individual characteristics
  Standardized Saber 11                      0.940    1.099   -8.2    7.1
  Female                                     0.573    0.495    0.0    1.0
  Age                                       26.591    2.414   21.0   35.0
  Ethnic origin                              0.010    0.099    0.0    1.0
  Birth order #1                             0.489    0.500    0.0    1.0
  Birth order #2                             0.276    0.447    0.0    1.0
  Birth order #3 or higher                   0.235    0.424    0.0    1.0
Parental characteristics
  Very high household income                 0.212    0.409    0.0    1.0
  High household income                      0.196    0.397    0.0    1.0
  Moderate household income                  0.219    0.413    0.0    1.0
  Low household income                       0.287    0.452    0.0    1.0
  Very low household income                  0.086    0.281    0.0    1.0
  Mother, college degree                     0.305    0.460    0.0    1.0
  Mother, associates degree                  0.210    0.407    0.0    1.0
  Mother, high school degree                 0.279    0.448    0.0    1.0
  Mother, less than high school degree       0.207    0.405    0.0    1.0
Educational background
  Attended college                           0.834    0.372    0.0    1.0
  Attended a technological school            0.158    0.364    0.0    1.0
  Attended a T&T institute                   0.009    0.094    0.0    1.0
  Attended a public college                  0.386    0.487    0.0    1.0
  Attended a public high school              0.427    0.495    0.0    1.0
  In a big city during high school           0.504    0.500    0.0    1.0
  In a medium city during high school        0.205    0.404    0.0    1.0
  In a small city during high school         0.291    0.454    0.0    1.0
Higher education access
  # of public colleges within 50km           3.458    2.685    0.0    8.0
  # of private colleges within 50km         11.632   10.779    0.0   26.0
  # of technological schools within 50km    26.302   27.933    0.0   64.0
  # of T&T institutes within 50km           15.295   16.595    0.0   39.0

Notes: Number of observations is 212,023. T&T stands for technical and technological.

Table 1 reports summary statistics for our sample of college graduates in the formal sector. Our sample consists of young workers (average age is 26.6), the majority female (57%), 49% of whom are first-born offspring in their families. The average ability score (Saber 11) is 0.94 standard deviations above the national average. While a fifth (21.2%) of individuals come from households with very high income (i.e., annual income is more than five times the annual minimum wage), only 8.6% come from households with very low income (i.e., with annual incomes below the annual minimum wage). A high share (30.5%) of individuals also have mothers with a college degree.
These characteristics are what we would expect given that the individuals in our sample are all college graduates in a developing country setting. We note, too, that half of individuals are in a big city during high school, while under a third (29%) are in a small city at that time. The majority attended a private high school, with only 43% having attended a public high school. On average, the individuals in the sample have the following HEIs available to them within 50 km of their high school city: 3.5 public colleges, 11.6 private colleges, 26.3 technological schools, and 15.3 T&T institutes. The majority (83.4%) also attended a college for higher education, while 15.8% obtained their degree from a technological school and less than one percent from a T&T institute.

Table 2: Spatial distribution of HEI types in Colombia

                              Big cities   Medium cities   Small cities
HEI categories
  Public colleges                      9              11             38
  Private colleges                    41              27             12
  Technological schools               97              20             24
  T&T institutes                      56              15             19
Public colleges by quality
  Tier 1 (highest)                     6               5              3
  Tier 2                               2               5              7
  Tier 3                               0               1              6
  Tier 4                               0               0              3
  Tier 5                               1               0              9
Private colleges by quality
  Tier 1 (highest)                    10               1              0
  Tier 2                               8               3              0
  Tier 3                               9               6              3
  Tier 4                               9               9              4
  Tier 5                               4               8              4

Notes: HEI stands for higher education institution and T&T for technical and technological.

Table 2 displays the spatial distribution of all HEI types and private/public colleges by quality across Colombia. The majority of public colleges are actually located in small cities rather than in big cities (using the definition of ‘cities’ as metropolitan areas outlined above). Recall that each department in Colombia has one large public college with multiple branches. Private colleges tend to be concentrated in big cities, with only 15% of private colleges in small cities. Technological schools and T&T institutes also tend to be concentrated in big cities relative to small cities.
These numbers show an uneven distribution of HEI types across the country, with more variety concentrated in big cities.

To characterize HEI quality, we classify the selectivity of public and private colleges into five quintiles. Using the universe of Saber 11 test scores in 2002, we first generate an average Saber 11 by HEI. Tier 1 colleges have an average Saber 11 in the upper 20% among all colleges, tier 2 in the 60-80% range, and so on. We use scores from 2002, an earlier period, to avoid endogeneity concerns in our analysis.

Turning now to the spatial distribution of HEI quality, table 2 also shows that the highest quality HEIs are concentrated in big cities. Of the 14 tier 1 public colleges, six are located in big cities, five in medium cities, and only three are in small cities. Top tier private colleges are nearly all concentrated in big cities, with not a single tier 1 private college in any small city. Meanwhile, the lowest quality HEIs are disproportionately located in small cities, with nine out of the ten bottom tier public colleges located in small cities. We anticipate that, given this spatial distribution in the quantity and quality of HEIs in Colombia, individuals in small cities would seek to attend college in big cities, while for the most part only the most talented or those with sufficient means would be able to move for college.

Our sample vs. Colombian population

We compare demographics in our sample to those in the Socio-Economic Database for Latin America and the Caribbean (SEDLAC), a nationally representative household survey of Colombia. We restrict the SEDLAC data to workers with a bachelor’s degree employed in the formal sector who are aged 20-35. Our sample is remarkably similar to the SEDLAC sample. For instance, in the SEDLAC sample 55.5% are female (vs. 57% in our sample), the average age is 29 (vs. 26.6 in our sample), and raw average annual wages are 1,692,957 Colombian pesos (vs. 1,647,482 in our sample).
Most importantly, the spatial distribution of the SEDLAC sample is similar to ours: 62.5% are in a big city for work (vs. 64.5% in our sample), 22% are in a medium city (vs. 17.2% in our sample), and 15.4% are in a small city (vs. 18.2% in our sample).

We also use SEDLAC data to examine labor force participation and incidence of work in the formal sector among young college graduates. The majority (78%) of those aged 20-35 with a bachelor’s degree in Colombia are employed in the formal sector, only 2.5% are in the informal sector, while 20% are not employed. It is notable that of the 20% not employed, the proportion who are female is quite high at 75.8%, while the proportion of female workers among those who are active is 57% (as in our sample). Further, conditional on being employed, women are just as likely to work in the formal or the informal sector. Again, importantly for our purposes, formal and informal workers are similarly distributed across city sizes. In particular, 55, 21, and 23 percent of formal workers are located in big, medium, and small cities, whereas the corresponding percentages are 53, 19, and 28 percent for informal workers.

We are thus confident that our sample is highly representative of the population of interest, that is, young college graduates aged 20-35 in Colombia. Although Colombia has a large informal sector, ranging between 55% and 59% in 2009–2013 (International Labour Organization, 2014), the incidence of informality is substantially lower in our sample. As a result, we observe the great majority of college graduates when they start their careers.

3. Results

Agglomeration effects

Our first finding is that there is a work city size wage premium in Colombia.
In column (1) of table 3, we estimate an elasticity of wages with respect to city population of 0.052 for college graduate workers, in line with previous estimates for the more general population in Colombia of 0.054 (Duranton, 2016).[7] In columns (2) and (3), we instead examine how wages vary based on the cities where individuals went to college or high school, respectively. In column (2) we obtain a higher elasticity of 0.08 for college city size, while in column (3) we obtain a lower elasticity of 0.04 for high school city size. In column (4) we control for work city, college city, and high school city sizes.

Table 3: Agglomeration effects for college graduates

Dependent variable: log wage
                                    (1)          (2)          (3)          (4)
Log work city size               0.0515                                 0.0214
                            (0.0105)***                              (0.0112)*
Log college city size                         0.0796                    0.0714
                                         (0.0051)***               (0.0077)***
Log high school city size                                  0.0399      -0.0069
                                                      (0.0067)***     (0.0047)
Female                          -0.1006      -0.0967      -0.1021      -0.0966
                            (0.0053)***  (0.0049)***  (0.0049)***  (0.0048)***
Ethnic origin                   -0.1104      -0.0787      -0.0998      -0.0763
                            (0.0281)***  (0.0258)***  (0.0371)***  (0.0258)***
Technological school            -0.0842      -0.1022      -0.0839      -0.1026
                            (0.0228)***  (0.0227)***  (0.0222)***  (0.0217)***
T&T institute                   -0.2235      -0.2118      -0.2231      -0.2132
                            (0.0317)***  (0.0441)***  (0.0339)***  (0.0439)***
Experience                       0.1128       0.1106       0.1145       0.1097
                            (0.0047)***  (0.0060)***  (0.0055)***  (0.0059)***
Age indicators                      Yes          Yes          Yes          Yes
Year indicators                     Yes          Yes          Yes          Yes
2-digit sector indicators           Yes          Yes          Yes          Yes
Observations                    190,817      190,817      190,817      190,817
R2                               0.2183       0.2313       0.2128       0.2336

Notes: All specifications include a constant. Coefficients are reported with robust standard errors in parentheses, which are clustered by urban area of work. ***, **, and * indicate significance at the 1, 5, and 10 percent levels. T&T stands for technical and technological.
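The column (4) specification of table 3 can be written compactly as a log-log wage regression. The notation below is introduced purely for illustration and is an assumption, not the paper's own: $w_i$ is the real annual wage of individual $i$; $N^{W}_{i}$, $N^{C}_{i}$, and $N^{H}_{i}$ are the populations of her work, college, and high school cities; $X_i$ collects the individual controls listed in the table (female, ethnic origin, HEI type, experience); and $\delta_{a(i)}$, $\tau_{t(i)}$, and $\sigma_{s(i)}$ are the age, year, and 2-digit sector indicators.

```latex
\ln w_i = \alpha
  + \beta_W \ln N^{W}_{i}
  + \beta_C \ln N^{C}_{i}
  + \beta_H \ln N^{H}_{i}
  + X_i'\gamma
  + \delta_{a(i)} + \tau_{t(i)} + \sigma_{s(i)}
  + \varepsilon_i
```

Because both wages and city sizes enter in logs, each $\beta$ is read as an elasticity. Columns (1) to (3) would correspond to including only one of the three city size terms at a time, while in column (4) the reported estimates are $\hat\beta_W = 0.0214$, $\hat\beta_C = 0.0714$, and $\hat\beta_H = -0.0069$.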
We find that the elasticity with respect to work city size shrinks from 0.052 to 0.021 and becomes only marginally significant, while the elasticity with respect to college city size remains relatively high at 0.071 and the elasticity with respect to high school city size becomes almost zero. Thus, agglomeration effects seem to arise from college city size rather than work city size.[8]

[7] It is worth noting that for many workers we observe wage data for more than one year (usually two or three years). In all estimations we restrict the analysis to the last observed wage (often the second or third year after entering the labor market), to avoid capturing an internship spell that takes place immediately upon graduating from college. Given the reduced number of annual observations per worker and the sizeable number of workers with only one wage observation, we cannot include worker fixed effects in our estimations.
[8] We are aware that when we simultaneously include the three location variables, the coefficients on city sizes are estimated exclusively on the basis of migrants. It is not our purpose to provide causal estimates for each of these city sizes. At this point, we want to underscore the relatively larger effect that college city size has on wages. To unfold potential drivers for this large magnitude, we will later analyze location trajectories of migrants, with special emphasis on those who move for work to bigger and smaller cities.

As table 4 indicates, the college city size agglomeration effect persists even as we consider mediating factors that might explain it. First, we consider that the best colleges are generally located in big cities. Thus, agglomeration returns to college city size might simply reflect college selectivity.
In column (2) we control for college selectivity linearly (average Saber 11 in college), whereas in column (3) we use tier indicators for college quality.9 We find that, as expected, workers who obtain their college degree from more selective institutions earn significantly more than others, regardless of how selectivity is measured. For instance, workers who graduate from tier 1 colleges with the highest scores earn 28.7 percent ($e^{0.2521} - 1$) more than those who graduate from the lowest quality colleges (i.e., the tier 5 omitted category). However, even conditional on these measures of college quality, the college city size premium survives, with an elasticity of 0.050 to 0.059. Second, we consider wage specifications controlling for individual ability (Saber 11) in columns (4) to (6). As expected, more able workers earn more. Nonetheless, the college city size wage premium persists and remains high, at 0.049 to 0.054, conditional on both individual ability and college quality.10

Our findings raise the question of what explains the substantial returns to college city size. As discussed in the introduction, the micro-foundations of agglomeration (matching, learning, and networking) provide some possible explanations (Duranton and Puga, 2004; Puga, 2010). In big cities, which have a greater number and variety of higher education offerings, students are more likely to find a good match to their abilities and preferences. Further, these cities provide a better setting for knowledge spillovers, as well as more extensive networks for job search. Unobserved ability might also be an additional driver of the observed college city size premium, as big cities might attract individuals with the ability to thrive in them (Glaeser and Maré, 2001; Combes, Duranton, and Gobillon, 2008).

These results indicate the importance of college city size for a worker’s productivity, more so than work city size.
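The conversion from a log-wage coefficient to a percentage premium, used above for tier 1 colleges, can be checked with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical, and the "doubling" calculation is our own way of reading the city size elasticities from column (4) of table 3, not a computation reported in the paper.

```python
import math

def indicator_premium_percent(coef: float) -> float:
    """Percentage wage difference implied by an indicator-variable
    coefficient in a log-wage regression: (e^coef - 1) * 100."""
    return (math.exp(coef) - 1.0) * 100.0

def doubling_effect_percent(elasticity: float) -> float:
    """Percentage wage change associated with doubling city size,
    given a log-log elasticity (illustrative reading, see lead-in)."""
    return (2.0 ** elasticity - 1.0) * 100.0

# Tier 1 coefficient from column (3) of table 4.
tier1 = indicator_premium_percent(0.2521)

# Elasticities from column (4) of table 3.
work_doubling = doubling_effect_percent(0.0214)
college_doubling = doubling_effect_percent(0.0714)

print(f"Tier 1 premium: {tier1:.1f} percent")
print(f"Doubling work city size: {work_doubling:.2f} percent")
print(f"Doubling college city size: {college_doubling:.2f} percent")
```

Under this reading, doubling college city size is associated with a wage difference more than three times as large as doubling work city size, which is the contrast the text emphasizes.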
However, these estimates are primarily identified off of movers, since non-movers have the same high school, college, and work city. Since migration choices thus drive the identification of agglomeration effects, we now turn to characterizing the possible migration paths.

9 As explained earlier, we do not use our sample to calculate average Saber 11 by college. We instead rely on the universe of test scores back in 2002, an earlier period, in order to avoid endogeneity concerns.

10 In appendix table A.1 we also consider additional mechanisms, such as parental education, household income, birth order, college major, as well as whether the high school or college attended are public or private. The college city size premium persists at 0.035–0.036 across these specifications.

Migration paths: descriptive analysis

Since an individual in our sample could move at two points in time (first for college, and then for work), we consider the sequential mobility decisions of the young workers in our data as depicted in figure 1. An individual could choose to remain in her high school city and attend college there, or move for college to a city larger in size than her high school city. Conditional on this choice, she could then stay in that same city or move again for work, to a city bigger or smaller than her college city. These sequential choices give rise to six possible combinations. In what follows,
we use the terms ‘migration paths’ for these combinations; ‘college (work) movers’ and ‘college (work) stayers’ for those who move and do not move for college (work), respectively.

Table 4: Mechanisms on agglomeration effects for college graduates
Dependent variable: log wage

                              (1)          (2)          (3)          (4)          (5)          (6)
Log work city size            0.0215       0.0191       0.0198       0.0178       0.0172       0.0173
                              (0.0114)∗    (0.0106)∗    (0.0105)∗    (0.0105)∗    (0.0103)∗    (0.0100)∗
Log college city size         0.0723       0.0503       0.0587       0.0597       0.0494       0.0535
                              (0.0078)∗∗∗  (0.0083)∗∗∗  (0.0091)∗∗∗  (0.0080)∗∗∗  (0.0079)∗∗∗  (0.0090)∗∗∗
Log high school city size     -0.0067      -0.0068      -0.0075      -0.0099      -0.0093      -0.0097
                              (0.0046)     (0.0043)     (0.0045)∗    (0.0038)∗∗∗  (0.0038)∗∗   (0.0039)∗∗
Average Saber 11 in college                0.1582                                 0.0936
                                           (0.0115)∗∗∗                            (0.0106)∗∗∗
Tier 1 (highest quality)                                0.2521                                 0.1654
                                                        (0.0353)∗∗∗                            (0.0293)∗∗∗
Tier 2                                                  0.0760                                 0.0390
                                                        (0.0343)∗∗                             (0.0336)
Tier 3                                                  0.0500                                 0.0398
                                                        (0.0259)∗                              (0.0255)
Tier 4                                                  0.0498                                 0.0474
                                                        (0.0221)∗∗                             (0.0190)∗∗
Standardized Saber 11                                                0.1100       0.0863       0.0888
                                                                     (0.0062)∗∗∗  (0.0042)∗∗∗  (0.0035)∗∗∗
Female                        -0.0957      -0.0808      -0.0846      -0.0603      -0.0591      -0.0600
                              (0.0045)∗∗∗  (0.0054)∗∗∗  (0.0050)∗∗∗  (0.0053)∗∗∗  (0.0058)∗∗∗  (0.0057)∗∗∗
Ethnic origin                 -0.0740      -0.0174      -0.0333      -0.0120      0.0081       0.0031
                              (0.0265)∗∗∗  (0.0251)     (0.0227)     (0.0215)     (0.0222)     (0.0202)
Technological school          -0.1002      -0.0093      -0.0231      -0.0444      -0.0026      -0.0092
                              (0.0179)∗∗∗  (0.0173)     (0.0149)     (0.0146)∗∗∗  (0.0180)     (0.0126)
t&t institute                 -0.4361      -0.2953      -0.3192      -0.3181      -0.2602      -0.2686
                              (0.0689)∗∗∗  (0.0750)∗∗∗  (0.0603)∗∗∗  (0.0606)∗∗∗  (0.0698)∗∗∗  (0.0584)∗∗∗
Experience                    0.1110       0.1074       0.1067       0.0994       0.0997       0.0990
                              (0.0058)∗∗∗  (0.0043)∗∗∗  (0.0041)∗∗∗  (0.0044)∗∗∗  (0.0040)∗∗∗  (0.0037)∗∗∗
Observations                  184,719      184,719      184,719      184,719      184,719      184,719
R2                            0.2345       0.2565       0.2567       0.2660       0.2723       0.2744

Notes: All specifications include a constant and indicator variables for age, year and 2-digit economic sector. We use standardized Saber 11 scores to calculate averages for each college. Tier 5 (lowest quality) is the omitted category. Coefficients are reported with robust standard errors in parentheses, which are clustered by urban area of work. ∗∗∗, ∗∗, and ∗ indicate significance at the 1, 5, and 10 percent levels. t&t stands for technical and technological.

Table 5 reports summary statistics for each of the six migration paths, conditional on whether they did not move for college (columns 1 to 3) or did move for college (columns 4 to 6). The existence of mobility frictions is clear: 67 percent of individuals in our sample do not move at all; 82 percent do not move for college; and 55 percent of those who move for college do not move for work. Relative to college stayers, college movers are slightly more able and have attended public high schools at higher rates. Most college movers (81 percent, weighted average across columns 4 to 6) come from a small high school city, whereas most college stayers (61 percent, weighted average across columns 1 to 3) come from a big high school city. College movers are more likely than college stayers to attend tier 1 (highest quality) colleges and earn higher salaries, thus providing a first indication of returns to (college) mobility.

Table 5: Summary statistics by migration path

                                        Did not move for college        Moved for college
                                        Stayed    Move      Move        Stayed    Move      Move
                                                  bigger    smaller               bigger    smaller
                                                  city      city                  city      city
                                        (1)       (2)       (3)         (4)       (5)       (6)
Individual and parental characteristics
Standardized Saber 11                   0.967     0.846     0.750       1.077     0.854     0.758
Female                                  0.587     0.554     0.552       0.556     0.508     0.514
Very high household income              0.226     0.160     0.139       0.233     0.128     0.199
High household income                   0.208     0.168     0.170       0.181     0.173     0.166
Moderate household income               0.229     0.205     0.226       0.179     0.182     0.198
Low household income                    0.279     0.331     0.333       0.264     0.321     0.289
Very low household income               0.058     0.136     0.131       0.143     0.196     0.147
Mother, college degree                  0.313     0.255     0.223       0.360     0.256     0.299
Mother, associates degree               0.221     0.195     0.200       0.180     0.188     0.181
Mother, high school degree              0.276     0.306     0.307       0.248     0.295     0.280
Mother, less than high school degree    0.190     0.243     0.270       0.211     0.261     0.241
Attended a public high school           0.362     0.558     0.532       0.538     0.666     0.570
Lifetime city sizes
In a big city during high school        0.691     0.111     0.407       0.041     0.003     0.027
In a medium city during high school     0.187     0.415     0.251       0.179     0.097     0.149
In a small city during high school      0.122     0.474     0.342       0.780     0.900     0.825
In a big city during college            0.703     0.117     0.419       0.733     0.185     0.477
In a medium city during college         0.190     0.424     0.255       0.171     0.449     0.294
In a small city during college          0.107     0.459     0.326       0.096     0.366     0.229
In a big city during work               0.707     0.803     0.048       0.733     0.833     0.044
In a medium city during work            0.188     0.114     0.118       0.170     0.115     0.171
In a small city during work             0.105     0.083     0.834       0.097     0.052     0.785
Higher education
Attended a public college               0.343     0.583     0.544       0.327     0.626     0.409
Tier 1 (highest quality)                0.411     0.345     0.381       0.469     0.419     0.389
Tier 2                                  0.222     0.238     0.214       0.196     0.166     0.196
Tier 3                                  0.171     0.143     0.146       0.156     0.130     0.171
Tier 4                                  0.124     0.141     0.131       0.110     0.166     0.122
Tier 5                                  0.071     0.134     0.128       0.070     0.120     0.122
Arts major                              0.054     0.026     0.021       0.041     0.023     0.034
Education major                         0.118     0.135     0.250       0.072     0.130     0.130
Social science major                    0.185     0.150     0.168       0.200     0.149     0.207
Economics & business major              0.313     0.273     0.178       0.292     0.181     0.217
Engineering major                       0.295     0.367     0.311       0.353     0.449     0.354
Math & science major                    0.024     0.031     0.043       0.030     0.045     0.032
# of public colleges within 50km        4.406     1.534     3.177       1.088     0.687     0.882
# of private colleges within 50km       15.494    3.514     10.364      2.080     0.632     1.361
Log wages                               14.086    14.156    14.157      14.183    14.181    14.070
Observations                            142,361   18,350    12,796      21,218    5,810     11,488

Figure 1: Migration paths

When comparing among the six migration paths, we note that, regardless of whether they moved for college or not, work stayers have higher ability, and attend tier 1 and private colleges at a higher rate than work movers. They are more likely to study economics & business, arts, or social sciences.
About three quarters of them work in big cities. A group of work stayers, those who are also college movers, earn the highest average log wage (14.183) among all migration paths.11 In contrast, work stayers who do not move for college (i.e., individuals who never move) earn the second lowest average wage, although this average masks great heterogeneity by city size: within this group, those in big, medium, and small cities earn an average log wage of 14.177, 13.902, and 13.804, respectively. This heterogeneity suggests that, while never moving away from a big city may deliver a premium, never moving away from a small or medium city may do the opposite.

Regardless of whether they moved for college or not, those who move to a bigger city for work are more able than those who move to a smaller one. They are also more likely to study engineering and attend a public college. Further, they earn high wages. Among those who move for work to smaller cities after having also moved for college, about half are ‘return movers,’ who moved from a small high school city to a larger one for college, and then return to their high school city. The average wages of those who moved for college and then to a smaller city for work are the lowest of all migration paths. We will further characterize these migrants in our analyses below.

11 They also come from affluent backgrounds. For instance, relative to all groups, they are more likely to have a very high household income (i.e., more than five times the annual minimum wage) and to have a mother with a bachelor’s degree.

Mobility and the spatial distribution of human capital

We now turn to characterizing the spatial distribution of human capital given the six migration paths. Big cities attract young college graduate workers. While only 3 percent of college movers come from a big high school city, 54 percent of them end up working in a big city. And, while 61 percent of college stayers come from a big high school city, 67 percent of them end up working in a big city. The trajectory towards the big city, however, differs among migration paths. College movers who stay for work come mostly from small cities and move to big cities in one single step, when moving for college. In contrast, college movers who move to a bigger city for work arrive there through a progression: first they move to a college city larger than their high school city, but not necessarily one of the three big cities, and then they move to a big city for work. This is the group with the second highest average log wages (14.181), which provides further evidence on the returns to mobility.

Not only do big cities absorb a large number of graduates; they also attract the most skilled ones. For example, those who move for college and stay there for work are the group with the highest average ability and wages, and the most likely to attend tier 1 colleges. Meanwhile, the average ability of those who move to smaller cities for work is the lowest of all groups, consistent with their having the lowest average earnings, as noted above.

Figure 2 provides further evidence of the concentration of high ability individuals in big cities. The red lines display the ability distribution for high school graduates from small, medium, and big cities. They reveal a mild overlap in the ability distributions for individuals from big and medium high school cities, yet the distribution of ability in big cities lies clearly to the right of that in small cities. Since we proxy ability through a measure of college academic readiness, the ‘ability premium’ of big cities might be due, for instance, to better k-12 schools and more educated, higher income parents. The blue lines show the distribution of ability by work city size. Among workers, the most able are in big cities, with the ability distribution of workers in big cities shifted to the right of that in medium cities.
Further, the ability distribution of those who work in small cities lies markedly to the left of all others. Since the red lines reflect the distribution that would prevail without mobility, while the blue lines reflect the distribution that results from mobility decisions, their contrast illustrates how mobility leads to a concentration of high-ability individuals in big cities and of lower-ability individuals in small ones. Big cities start out with an advantage in terms of ability and further attract high-ability individuals. In contrast, small cities start at a disadvantage in terms of ability. Not only do they lose high-ability individuals, but they also attract low-ability migrants. The same holds for medium-sized cities: while they start out with an ability distribution fairly similar to that of big cities, through migration they lose high-ability individuals and attract relatively low-ability ones, from both small and big cities.

Determinants of pre-labor market sorting: econometric analysis

Next we explicitly estimate the migration choices depicted in figure 1. Our ultimate goal is to estimate a multinomial probit (mnp) model for the work migration decision, conditional on the college migration decision. Since the unobserved elements that affect the latter may also affect the former, we will account for the endogeneity of the college migration decision through a control function (cf) and estimate a mnp-cf model. To build intuition for the mnp-cf model, we first examine the college migration decision. Then we explore the work migration decision separately for college movers and non-movers. We conclude with the estimation of the mnp-cf model, which pools college movers and non-movers.
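As a rough illustration of the control-function idea described above, the sketch below computes the standard generalized residual of a probit, y·λ(index) − (1 − y)·λ(−index), where λ = φ/Φ is the inverse Mills ratio. It uses only the Python standard library. The function names and the index values are hypothetical and chosen for illustration; the sketch does not reproduce the paper's estimation, only the residual that a first-stage probit would feed into a second stage.

```python
import math

def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z: float) -> float:
    """Standard normal cdf, written with erfc for accuracy in the left tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def inverse_mills(z: float) -> float:
    """Inverse Mills ratio, pdf(z) / cdf(z)."""
    return norm_pdf(z) / norm_cdf(z)

def generalized_residual(moved: int, index: float) -> float:
    """Probit generalized residual for a binary outcome `moved` (0 or 1)
    and a fitted linear index (hypothetical values in this sketch)."""
    if moved == 1:
        return inverse_mills(index)
    return -inverse_mills(-index)

# Hypothetical (outcome, fitted index) pairs.
examples = [(1, 1.5), (1, -1.0), (0, 1.0), (0, -1.5)]
for moved, index in examples:
    r = generalized_residual(moved, index)
    print(f"moved={moved}, index={index:+.1f}, residual={r:+.4f}")
```

The residual is positive for movers and negative for non-movers, and it is largest in absolute value when the observed choice is the one the index predicted to be unlikely (for example, a mover with a strongly negative index).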
Figure 2: Ability distributions by high school and work location. [Density plot. Horizontal axis: standardized Saber 11. Series: big, medium, and small high school city; big, medium, and small work city.]

Determinants of college move

We estimate a probit model of the likelihood to move for college:

$$P(\text{Move college}_i = 1) = \Phi(\gamma_x X_i + \gamma_z Z_i) \tag{1}$$

where $\text{Move college}_i = 1$ if individual $i$ moved for college to a city larger in size than her high school city, 0 otherwise; $\Phi(\cdot)$ is the cdf of the standard normal distribution; $X_i$ denotes a vector of student characteristics as of the end of high school; and $Z_i$ is a vector of variables associated with the availability of higher education institutions in or near $i$’s high school city. Later on, in the estimation of the mnp-cf, $Z$ will serve as our exclusion restriction, as it is reasonable to assume that the availability of higher education institutions may affect the college migration choice but not the work migration choice. In particular, $Z$ contains the number of public and private colleges available within 50 km of the high school city, and their interactions with student ability and household income. We include these interactions because the mere existence of heis within that radius does not necessarily give the student the choice of attending them. In effect, she must be admitted by the hei, which depends on her ability. She must also be able to pay for it, which depends on her household income.

In table 6, columns (1) and (2) report parameter estimates, while columns (1a) and (2a) report the marginal effects corresponding to columns (1) and (2), respectively. In this discussion, and others, we focus on coefficients that are significantly different from zero. Estimates from column (1) are consistent with our descriptive comparison of college movers and non-movers above.
According to our estimates, individuals who are more able, have more educated mothers, or attended high school in small or medium cities are more likely to move for college. Individuals from households with very high income are the most likely to move. Further, those with fewer public or private colleges available within 50 km of their high school city are more prone to move. The availability of public colleges is a stronger deterrent to move than that of private colleges, given that tuition is generally much lower at public than private colleges.

Marginal effects in column (1a) indicate that individuals from small cities are significantly more likely to move for college, by about 30 percentage points. Individuals from the highest income households are also 7 percentage points more likely to move for college than those from the lowest income households. The availability of an additional public college decreases the probability of moving by 3.8 percentage points, while the effect of a private college is weaker (2.4 percentage points). All these margins are greater in magnitude than the effects of other characteristics, including ability. Nonetheless, among all determinants, high school city size is particularly strong, even when controlling for college availability. For example, about eight public colleges would be needed in a small city to render an individual from such a city indifferent between moving and not. In other words, individuals from small cities perceive a value in moving for college that goes beyond the mere access to college.

Relative to column (1), estimates in column (2) add interactions of college availability with ability and household income. The interaction of ability and number of public colleges is not significantly different from zero.
The interaction of ability and number of private colleges indicates that higher ability, holding the number of private colleges constant, makes people less inclined to move for college, presumably because it raises their chances of admission to those colleges. The interactions of household income and number of public colleges show that the existence of public colleges is less of a mobility deterrent for individuals from high income households than for their lower-income counterparts, likely because they can afford to move. In contrast, the interactions of household income and number of private colleges show that the availability of private colleges is more of a mobility deterrent for individuals from high income households than for their lower-income counterparts, likely because they can afford local private colleges and stay. Column (2a) reports the marginal effects associated with column (2). The marginal effects that change the most relative to column (1a) are those related to income.

To highlight the relative roles of college availability, household income, and individual ability, we plot predicted probabilities of moving for college in figure 3. The figure’s panels show the predicted probability of moving as a function of the number of public or private colleges available, for individuals of various levels of household income and ability. In general, the panels show the decline in the probability of moving for college as the number of available public or private colleges increases, although with differences at the extensive and intensive margins. In particular, these probabilities decline most steeply when going from zero to one private or public college, while they flatten out beyond three or four colleges. Interestingly, the deterrent effect of college availability on mobility varies more with household income than with ability, as is clear from comparing the upper and lower panels.
Regardless of the number of private or public colleges available, higher ability individuals are marginally more likely to move because they are admitted by more institutions. As the upper panels show, individuals from the bottom of the household income distribution (dashed blue line) have a probability of moving higher than 0.6 if they attended high school in a city without any public college within 50 km; this probability declines to 0.5 with one public college, and declines further with additional public colleges. In contrast, the probability of moving for individuals in the highest income quintile (orange solid line) is only about 0.35 when their high school city has no public college, and the probability declines only slightly (to about 0.3) when one public college is present. These individuals are less responsive to the number of public colleges because they can afford private ones, should they exist in the high school city. A cross-over in the moving probabilities for the highest and lowest incomes happens when three public colleges are available, at a moving probability of about 0.25. When more than three public colleges are available, individuals from the highest income households move at a higher rate than all others, partly because they can afford to do so.

Table 6: Determinants of college move
Dependent variable: move for college

                                              Probit                      Marginal effects
                                              (1)           (2)           (1a)          (2a)
# of public colleges                          -0.3464       -0.5674       -0.0376       -0.0449
                                              (0.1215)∗∗∗   (0.1193)∗∗∗   (0.0160)∗∗    (0.0174)∗∗∗
# of private colleges                         -0.2188       -0.1409       -0.0238       -0.0290
                                              (0.0523)∗∗∗   (0.0586)∗∗    (0.0084)∗∗∗   (0.0088)∗∗∗
Standardized Saber 11                         0.1711        0.2140        0.0186        0.0113
                                              (0.0200)∗∗∗   (0.0249)∗∗∗   (0.0049)∗∗∗   (0.0027)∗∗∗
Very high household income                    0.5104        0.5286        0.0711        0.0166
                                              (0.0633)∗∗∗   (0.1155)∗∗∗   (0.0172)∗∗∗   (0.0147)
High household income                         0.1081        0.0267        0.0109        -0.0307
                                              (0.0660)      (0.0892)      (0.0067)      (0.0150)∗∗
Moderate household income                     -0.0318       -0.0992       -0.0028       -0.0301
                                              (0.0564)      (0.0713)      (0.0052)      (0.0127)∗∗
Low household income                          -0.0745       -0.1526       -0.0064       -0.0163
                                              (0.0345)∗∗    (0.0486)∗∗∗   (0.0036)∗     (0.0102)
# of public colleges × Saber 11                             0.0222
                                                            (0.0252)
# of private colleges × Saber 11                            -0.0173
                                                            (0.0068)∗∗
# of public colleges × very high hh income                  0.3369
                                                            (0.1276)∗∗∗
# of public colleges × high hh income                       0.2658
                                                            (0.0882)∗∗∗
# of public colleges × moderate hh income                   0.1613
                                                            (0.0567)∗∗∗
# of public colleges × low hh income                        0.1175
                                                            (0.0467)∗∗
# of private colleges × very high hh income                 -0.1367
                                                            (0.0318)∗∗∗
# of private colleges × high hh income                      -0.1024
                                                            (0.0218)∗∗∗
# of private colleges × moderate hh income                  -0.0599
                                                            (0.0148)∗∗∗
# of private colleges × low hh income                       -0.0320
                                                            (0.0119)∗∗∗
Mother, college degree                        0.3069        0.3338        0.0351        0.0447
                                              (0.0496)∗∗∗   (0.0478)∗∗∗   (0.0104)∗∗∗   (0.0104)∗∗∗
Mother, associates degree                     0.1035        0.1004        0.0100        0.0112
                                              (0.0381)∗∗∗   (0.0364)∗∗∗   (0.0043)∗∗    (0.0046)∗∗
Mother, high school degree                    0.0375        0.0400        0.0034        0.0042
                                              (0.0243)      (0.0229)∗     (0.0024)      (0.0026)
Medium city during high school                0.7574        0.5871        0.1175        0.0964
                                              (0.3258)∗∗    (0.2484)∗∗    (0.0657)∗     (0.0520)∗
Small city during high school                 1.6529        1.4641        0.3116        0.2911
                                              (0.3614)∗∗∗   (0.2944)∗∗∗   (0.0950)∗∗∗   (0.0825)∗∗∗
Observations                                  212,023       212,023       212,023       212,023
Pseudo R2                                     0.4351        0.4453

Notes: All specifications include a constant and indicators for female, ethnic origin, public high school, birth order, age and year. Number of technological schools and t&t institutes within 50km are also included as controls. Coefficients and marginal effects are reported with robust standard errors in parentheses, which are clustered by urban area of high school. ∗∗∗, ∗∗, and ∗ indicate significance at the 1, 5, and 10 percent levels. Marginal effects are evaluated at the mean values of the independent variables. Predicted probabilities for missing interactions in column (2a) are shown in figure 3. ‘Mother, less than high school’, ‘Very low household income,’ and ‘In a big city during high school’ are the excluded categories.

Figure 3: Marginal effects of college availability. [Four panels, each plotting the probability of moving for college (vertical axis) against the number of public or private colleges (horizontal axis). Panel (a): availability of public colleges and income. Panel (b): availability of private colleges and income. Panel (c): availability of public colleges and ability. Panel (d): availability of private colleges and ability. Income series: very low, low, high, and very high household income. Ability series: low, median, and high ability.]
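The role of the income interactions can be made concrete with simple arithmetic on the column (2) coefficients of table 6. The sketch below adds each income interaction to the base coefficient on the number of colleges, which gives the slope of the probit index (not of the probability) with respect to one more college, evaluated at a standardized Saber 11 score of zero so that the ability interaction drops out. It also computes the ratio of two column (1a) marginal effects, which is one reading consistent with the "about eight public colleges" statement in the text. Variable and function names are hypothetical.

```python
# Column (2) of table 6: base coefficients on the number of colleges.
# "Very low" household income is the omitted category (interaction = 0).
PUBLIC_BASE = -0.5674
PRIVATE_BASE = -0.1409

PUBLIC_BY_INCOME = {
    "very low": 0.0, "low": 0.1175, "moderate": 0.1613,
    "high": 0.2658, "very high": 0.3369,
}
PRIVATE_BY_INCOME = {
    "very low": 0.0, "low": -0.0320, "moderate": -0.0599,
    "high": -0.1024, "very high": -0.1367,
}

def index_slopes(base: float, interactions: dict) -> dict:
    """Slope of the probit index per additional college, by income group,
    at standardized Saber 11 = 0 (an assumption of this sketch)."""
    return {group: base + coef for group, coef in interactions.items()}

public_slopes = index_slopes(PUBLIC_BASE, PUBLIC_BY_INCOME)
private_slopes = index_slopes(PRIVATE_BASE, PRIVATE_BY_INCOME)

# Column (1a): marginal effect of a small high school city divided by the
# absolute marginal effect of one additional public college.
break_even_public_colleges = 0.3116 / 0.0376

for group in PUBLIC_BY_INCOME:
    print(f"{group:>9}: public {public_slopes[group]:+.4f}, "
          f"private {private_slopes[group]:+.4f}")
print(f"Break-even number of public colleges: {break_even_public_colleges:.1f}")
```

The index slopes for public colleges shrink toward zero as income rises, while those for private colleges become more negative, matching the pattern described for the upper panels of figure 3. Because these are index slopes, they indicate direction and relative strength only; the probability effects depend on where each individual sits on the normal cdf.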
Taken together, our findings indicate that the decision regarding the first move (for college) is heavily affected by college access, as given by the combination of college availability and household income, and by high school city size. We now turn to the second move (for work).

Determinants of work move: naive estimates

Whether individuals move or not for work is likely related to whether they moved for college, not only because the same factors (such as ability) may affect both, but also because where they went to college may affect whether they move for work. For example, a student who moved to a big city for college (maybe to attend a top tier college) may find work there and stay. In contrast, a student who moved to a medium-sized city for college may want to work in a big city and hence move there. Thus, in this subsection we examine the work location decision separately for college movers and non-movers. We use the term ‘naive’ to refer to these estimates because they fail to correct for the self-selection of individuals into the college mover or non-mover group. Nonetheless, these naive estimates provide a first characterization of post-college sorting.

We estimate the following multinomial probit models:

$$P(Y_i = j \mid \text{Move college}_i = 0) = F(\beta X_i + \delta X_i^{*}) \tag{2}$$

$$P(Y_i = j \mid \text{Move college}_i = 1) = F(\beta X_i + \delta X_i^{*}) \tag{3}$$

where $Y_i$, the work location, can take on values 0, 1, and 2, corresponding to the choices of staying in the college city, moving to a bigger city, and moving to a smaller city, respectively. While $X_i$ includes the same background characteristics included in the college move probit, $X_i^{*}$ consists of variables related to the individual’s higher education experience, including college selectivity and type (public or private), and college major. We reiterate that the estimation is separate for college movers and non-movers. Table 7 shows the marginal effects for college non-movers (columns 1a, 1b, and 1c) and movers (columns 2a, 2b, and 2c).
Recall that 57 percent of college movers and 62 percent of college non-movers attend college in a big city. These figures rise to 82 and 84 percent for college movers and non-movers, respectively, when we include both big and medium cities.

In terms of the work migration decision, table 7 shows some common patterns among college movers and non-movers. College graduates are more likely to stay in their college city if they attended a private college (presumably because it provides better labor market connections than a public one); studied economics & business, arts, or social sciences (which may provide better local labor market connections than other fields); and come from a big high school city (as individuals who grew up in big cities may have a preference for them). They are more likely to move to a bigger city if they attended a public college (perhaps the one in their department’s capital) and grew up in a medium or small city (as they may perceive job opportunities to be better in bigger cities).

At the same time, table 7 shows some important differences in the work migration decision of college movers and non-movers. Several individual characteristics affect the work location decision of college movers but not that of college non-movers. This is the case, for instance, of ability and college tier. Among college movers, the more able students, and those who attend high-quality tier 1 and 2 colleges, are more likely to stay in their college city. Further, characteristics such as household income affect college movers and non-movers differently.

Overall, the findings from our naive mnp estimations provide a first characterization of the determinants of post-college sorting.
Table 7: Multinomial probit estimation of moving for work, marginal effects

                                Did not move for college                Moved for college
                                Stayed       Move         Move         Stayed       Move         Move
                                             bigger       smaller                   bigger       smaller
                                             city         city                      city         city
                                (1a)         (1b)         (1c)         (2a)         (2b)         (2c)
Standardized Saber 11           -0.0014      0.0050       -0.0037      0.0442       -0.0080      -0.0362
                                (0.0044)     (0.0028)∗    (0.0021)∗    (0.0078)∗∗∗  (0.0064)     (0.0047)∗∗∗
Very high household income      0.0063       0.0112       -0.0175      -0.0721      -0.0104      0.0825
                                (0.0089)     (0.0092)     (0.0049)∗∗∗  (0.0130)∗∗∗  (0.0134)     (0.0136)∗∗∗
High household income           0.0223       -0.0087      -0.0135      -0.0442      -0.0001      0.0443
                                (0.0090)∗∗   (0.0063)     (0.0049)∗∗∗  (0.0197)∗∗   (0.0158)     (0.0092)∗∗∗
Moderate household income       0.0283       -0.0145      -0.0138      -0.0344      -0.0230      0.0575
                                (0.0081)∗∗∗  (0.0072)∗∗   (0.0039)∗∗∗  (0.0147)∗∗   (0.0098)∗∗   (0.0097)∗∗∗
Low household income            0.0295       -0.0128      -0.0167      -0.0226      -0.0122      0.0347
                                (0.0069)∗∗∗  (0.0063)∗∗   (0.0034)∗∗∗  (0.0114)∗∗   (0.0078)     (0.0097)∗∗∗
Public high school              -0.0193      0.0058       0.0134       0.0353       -0.0055      -0.0298
                                (0.0034)∗∗∗  (0.0033)∗    (0.0023)∗∗∗  (0.0150)∗∗   (0.0094)     (0.0101)∗∗∗
Public college                  -0.0335      0.0210       0.0126       -0.1869      0.1579       0.0290
                                (0.0077)∗∗∗  (0.0068)∗∗∗  (0.0042)∗∗∗  (0.0388)∗∗∗  (0.0284)∗∗∗  (0.0167)∗
Tier 1 (highest)                0.0029       0.0010       -0.0039      0.1723       -0.0872      -0.0851
                                (0.0171)     (0.0132)     (0.0078)     (0.0709)∗∗   (0.0503)∗    (0.0388)∗∗
Tier 2                          -0.0240      0.0232       0.0008       0.1268       -0.0733      -0.0535
                                (0.0161)     (0.0146)     (0.0071)     (0.0466)∗∗∗  (0.0328)∗∗   (0.0367)
Tier 3                          0.0003       0.0039       -0.0042      0.0863       -0.0459      -0.0404
                                (0.0119)     (0.0102)     (0.0063)     (0.0569)     (0.0476)     (0.0277)
Tier 4                          0.0021       0.0060       -0.0081      0.0440       0.0262       -0.0702
                                (0.0157)     (0.0117)     (0.0077)     (0.0859)     (0.0648)     (0.0412)∗
Arts major                      0.1049       0.0003       -0.1053      0.1422       -0.0048      -0.1374
                                (0.0176)∗∗∗  (0.0155)     (0.0169)∗∗∗  (0.0489)∗∗∗  (0.0383)     (0.0400)∗∗∗
Education major                 0.0691       -0.0257      -0.0434      0.0561       -0.0113      -0.0448
                                (0.0254)∗∗∗  (0.0092)∗∗∗  (0.0263)∗    (0.0565)     (0.0318)     (0.0551)
Social science major            0.0769       -0.0044      -0.0725      0.1066       -0.0092      -0.0974
                                (0.0178)∗∗∗  (0.0074)     (0.0155)∗∗∗  (0.0541)∗∗   (0.0190)     (0.0472)∗∗
Economics & business major      0.1174       -0.0111      -0.1063      0.1750       -0.0159      -0.1590
                                (0.0178)∗∗∗  (0.0075)     (0.0165)∗∗∗  (0.0479)∗∗∗  (0.0195)     (0.0412)∗∗∗
Engineering major               0.0504       0.0160       -0.0663      0.0768       0.0407       -0.1175
                                (0.0210)∗∗   (0.0096)∗    (0.0166)∗∗∗  (0.0594)     (0.0267)     (0.0425)∗∗∗
Math & science major            0.0274       0.0014       -0.0288      0.0752       0.0215       -0.0966
                                (0.0212)     (0.0113)     (0.0176)     (0.0558)     (0.0278)     (0.0390)∗∗
Medium city during high school  -0.2721      0.2506       0.0215       -0.1873      0.2420       -0.0547
                                (0.0475)∗∗∗  (0.0507)∗∗∗  (0.0117)∗    (0.0676)∗∗∗  (0.0965)∗∗   (0.0360)
Small city during high school   -0.4079      0.3639       0.0440       -0.1411      0.1486       -0.0075
                                (0.0464)∗∗∗  (0.0530)∗∗∗  (0.0116)∗∗∗  (0.0548)∗∗   (0.0532)∗∗∗  (0.0138)
Observations                    167,539 (columns 1a–1c)                37,689 (columns 2a–2c)

Notes: All specifications include a constant and indicators for female, ethnicity, birth order, maternal education, age and year. Marginal effects are evaluated at the mean values of the independent variables and reported with robust standard errors in parentheses, which are clustered by urban area of college. ∗∗∗, ∗∗, and ∗ indicate significance at the 1, 5, and 10 percent levels. In columns (1a)–(1c), the sample includes individuals who did not move for college and the dependent variable takes value zero if the individual stayed in the city, value one if she moves for work to a bigger city, and value two if she moves for work to a smaller city. In columns (2a)–(2c), the sample includes individuals who moved for college and the dependent variable is defined as in (1a)–(1c). ‘Mother, less than high school’, ‘Very low household income,’ ‘Tier 5 (lowest quality),’ ‘Agriculture & veterinary major’ and ‘In a big city during high school’ are the excluded categories.

For both college movers and non-movers, the important drivers are high school city size, college major, and college type (public or private). Among college movers, those with high ability, and who attended top tier or private colleges, are more likely to stay in their college city.
These results echo the raw summary statistics in table 5 above; we will return to them later.

Determinants of work move: correcting for endogeneity

We now turn to analyzing the work migration decision for the full sample (including college movers and non-movers). We wish to estimate the following mnp:

$$P(Y_i = j) = F(\beta X_i + \alpha \, \text{Move college}_i) \tag{4}$$

To allow for the endogeneity of Move college, we employ the cf approach outlined by Wooldridge (2015) for non-linear models with an endogenous binary explanatory variable. Estimation proceeds in two stages. In the first stage, we estimate a probit model of the likelihood to move for college. This is the same model that we already estimated in equation 1. Recall that our exclusion restriction, Z, is the number of higher education institutions by type available within 50km of the high school city of person i and its interactions with i’s ability and household income. Using the first-stage parameter estimates, we estimate the "generalized residuals" (for brevity, residuals) from the first-stage probit:

$$r_i = \text{Move college}_i \, \lambda(\hat{\gamma}_x X_i + \hat{\gamma}_z Z_i) - (1 - \text{Move college}_i) \, \lambda(-\hat{\gamma}_x X_i - \hat{\gamma}_z Z_i) \tag{5}$$

where $\lambda(\cdot) = \phi(\cdot)/\Phi(\cdot)$ is the inverse Mills ratio. The residuals capture the elements not included in the first-stage probit that could make an individual more likely to move for college and could also affect her work location decision (making her more likely, for instance, to move for work). Such elements include her college major, the selectivity of her institution, her socio-emotional skills, and her adaptability to new circumstances. Note that, while we have controlled for some of these elements (such as college major) in our naive mnp models, we have not controlled for other, unobserved elements (such as her socio-emotional skills).

In the second stage, we use the estimated residuals, $r_i$, and estimate the following mnp-cf model:

$$P(Y_i = j) = F(\beta X_i + \alpha \, \text{Move college}_i + \delta r_i) \tag{6}$$

Intuitively, the inclusion of the residuals controls for the part of the college move decision that is not captured by X or Z and depends on unobserved individual characteristics. Once this regressor is included, Move college can be seen as exogenous.

Table 8 reports mnp-cf estimates, and table 9 reports the corresponding marginal effects. To highlight the role of the cf approach, columns (1a) and (1b) in table 8 report estimates without the cf (and hence without the residuals), and columns (1a)–(1c) in table 9 report the corresponding marginal effects. Regardless of the cf inclusion, estimates and marginal effects indicate that individuals who moved for college are less likely to stay in their college city. This finding is consistent with the fact that only 55 percent of college movers stay in their college city, in contrast with 82 percent of college non-movers. It reflects a certain inertia on the part of college non-movers (given that they did not move the first time, they do not move the second time either), as well as a willingness to move again on the part of college movers (given that they moved the first time, they are willing to move the second time as well). Individuals who moved for college are also more likely to move to a smaller city, which is partly a reflection of ‘return moves.’

Table 8: Multinomial probit estimation of moving for work, control function approach

                                Moved to      Moved to      Moved to      Moved to
                                bigger city   smaller city  bigger city   smaller city
                                (1a)          (1b)          (2a)          (2b)
Moved for college               -0.3498       0.7930        0.5339        0.9146
                                (0.3452)      (0.1275)∗∗∗   (0.2522)∗∗    (0.1600)∗∗∗
Standardized Saber 11           0.0658        -0.0436       0.0333        -0.0460
                                (0.0409)      (0.0211)∗∗    (0.0586)      (0.0252)∗
Low household income            -0.1564       -0.0911       -0.1226       -0.0890
                                (0.0424)∗∗∗   (0.0316)∗∗∗   (0.0430)∗∗∗   (0.0328)∗∗∗
Moderate household income       -0.1983       -0.0761       -0.1823       -0.0757
                                (0.0607)∗∗∗   (0.0425)∗     (0.0630)∗∗∗   (0.0446)∗
High household income           -0.1383       -0.0940       -0.1495       -0.0949
                                (0.0715)∗     (0.0563)∗     (0.0774)∗     (0.0587)
Very high household income      -0.0729       -0.0813       -0.1612       -0.0875
                                (0.1073)      (0.0573)      (0.1778)      (0.0602)
Mother, high school degree      0.0660        0.0196        0.0554        0.0185
                                (0.0216)∗∗∗   (0.0220)      (0.0235)∗∗    (0.0247)
Mother, associates degree       0.0956        0.0327        0.0667        0.0302
                                (0.0298)∗∗∗   (0.0286)      (0.0313)∗∗    (0.0301)
Mother, college degree          0.1032        0.0103        0.0359        0.0049
                                (0.0443)∗∗    (0.0341)      (0.0754)      (0.0360)
Medium city during high school  1.7711        0.6841        1.6389        0.6659
                                (0.4415)∗∗∗   (0.0973)∗∗∗   (0.6895)∗∗    (0.1493)∗∗∗
Small city during high school   2.2315        1.0029        1.7392        0.9398
                                (0.4371)∗∗∗   (0.0868)∗∗∗   (0.6329)∗∗∗   (0.1658)∗∗∗
Public high school              0.1214        0.1152        0.1274        0.1161
                                (0.0286)∗∗∗   (0.0295)∗∗∗   (0.0337)∗∗∗   (0.0336)∗∗∗
Female                          -0.1289       -0.1306       -0.1079       -0.1290
                                (0.0208)∗∗∗   (0.0138)∗∗∗   (0.0242)∗∗∗   (0.0174)∗∗∗
Ethnic origin                   0.2617        0.3770        0.2284        0.3739
                                (0.1659)      (0.1178)∗∗∗   (0.2023)      (0.1187)∗∗∗
Birth order #2                  0.0201        0.0197        0.0136        0.0192
                                (0.0133)      (0.0104)∗     (0.0145)      (0.0101)∗
Birth order #3 or higher        0.0621        0.0783        0.0373        0.0763
                                (0.0261)∗∗    (0.0273)∗∗∗   (0.0340)      (0.0319)∗∗
Residual, moved for college                                 -0.6099       -0.0834
                                                            (0.2108)∗∗∗   (0.0971)
Observations                    212,023                     212,023

Notes: All specifications include a constant and indicators of age and wave year. Coefficients are reported with robust standard errors in parentheses, which are clustered by urban area of college. ∗∗∗ , ∗∗ , and ∗ indicate significance at the 1, 5, and 10 percent levels. Columns (2a)–(2b) report residuals of a first stage estimation on the probability of moving for college, as shown in table 6. In columns (2a)–(2b), standard errors are bootstrapped with 200 iterations. ‘Mother, less than high school’, ‘Very low household income,’ and ‘In a big city during high school’ are the excluded categories.

In accordance with our naive mnps, the estimates and marginal effects also show that individuals who come from a big high school city are more likely to stay than others, whereas those who come from a medium or small city are more likely than others to move to a bigger city.
Moreover, high school city size has a quantitatively larger effect than that of other variables: attending high school in a small city makes an individual 31 percentage points less likely to stay in the college city, 23 percentage points more likely to move to a bigger city, and 8.1 percentage points more likely to move to a smaller city. Effects for individuals from medium-sized high school cities are of the same direction and similar magnitudes. Given that 82 percent of individuals do not move for college, the marginal effects in table 9 are quite similar to those of table 7 for college non-movers. Thus, our results are broadly consistent with the naive estimates. One important difference, however, pertains to the role of ability. In the overall sample of table 9, ability has a very small effect, making individuals slightly less likely to move to a smaller city. Nonetheless, as we saw with our naive estimates, more able college movers are more likely to stay. Controlling for the endogeneity of college move delivers important information. In column (2a) of table 8, the coefficient on the residual for move to bigger city is significantly different from zero, which is evidence of endogeneity in the decision of moving to a bigger city relative to staying in the college city. The negative coefficient indicates that the individual unobserved characteristics that make the person more likely to move for college (such as her intended college major, college selectivity, and unobserved skills) also make her more likely to stay in the college city and less likely to move to a bigger city. As a result, the inclusion of the residual lowers even more the probability that college movers will stay, and makes them more (rather than less) likely to move to a bigger city. Taken together, estimates from our various mnp models help us characterize the post-college sorting. 
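As an aside on the estimation, the first stage of the control function procedure in equations 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the data are simulated, the function names (`inverse_mills`, `generalized_residual`, `fit_probit`) are hypothetical, and the single variables `x` and `z` merely stand in for the controls X and the excluded instrument Z. The multinomial probit second stage is not implemented here; it would simply add `r` as an extra regressor next to the move-for-college indicator.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z), computed in logs for numerical stability."""
    return np.exp(norm.logpdf(z) - norm.logcdf(z))


def generalized_residual(moved, index):
    """Equation (5): r = D * lambda(index) - (1 - D) * lambda(-index)."""
    return moved * inverse_mills(index) - (1 - moved) * inverse_mills(-index)


def fit_probit(W, moved):
    """First-stage probit of the move indicator on W = [1, X, Z] by maximum likelihood."""

    def neg_loglik(gamma):
        index = W @ gamma
        return -np.sum(moved * norm.logcdf(index) + (1 - moved) * norm.logcdf(-index))

    def neg_score(gamma):
        # The probit score is W' r, where r is the generalized residual.
        return -W.T @ generalized_residual(moved, W @ gamma)

    start = np.zeros(W.shape[1])
    return minimize(neg_loglik, start, jac=neg_score, method="BFGS").x


# Simulated data (hypothetical; not the paper's sample).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)  # stands in for the controls X
z = rng.normal(size=n)  # stands in for the excluded instrument Z
u = rng.normal(size=n)  # unobserved determinant of the college move
moved = (-0.3 + 0.5 * x - 0.8 * z + u > 0).astype(float)

W = np.column_stack([np.ones(n), x, z])
gamma_hat = fit_probit(W, moved)
r = generalized_residual(moved, W @ gamma_hat)  # regressor for the second stage
```

By construction the residual is positive for movers and negative for non-movers, and it averages to approximately zero at the probit estimates because the first-stage model includes a constant. A significant coefficient on `r` in the second stage is then read as evidence of endogeneity.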
The main determinant of the decision to stay in the college city or move to a bigger city is high school city size, which shows a tendency to ‘go big,’ either by staying in a big college city or by moving to a bigger city. The characteristics (such as college tier) that make a person more inclined to move for college also make her more inclined to stay in the college city. Although higher ability does not affect work location decisions for college non-movers, it makes college movers more likely to stay in the big city and less likely to move to smaller cities. These findings are consistent with our previous observations that migration makes individuals concentrate in big cities. While we have seen before that college movers who stay for work are the highest ability, highest-earning migrant group, our mnp models indicate that these individuals are positively selected in terms of unobserved characteristics—which, in turn, may help explain their high earnings. Through migration, individuals with the highest observed and unobserved ability, and best college training, sort into big cities, while others sort into medium or small cities. Returns by migration path As a result of the sorting analyzed in the previous section, wages differ across migration paths. While table 5 provided us with a first glimpse of wage differences among these paths, we now undertake a more detailed analysis. Our object of study is the wage net of basic demographic characteristics and sector of occupation. 
This ‘wage residual’ varies across individuals because of differences in characteristics that affect sorting (such as ability, college quality, and unobserved characteristics), and because of work city size. As such, the wage residual reflects both self-selection and agglomeration returns.

Table 9: Marginal effects of the determinants of moving for work, control function approach

                                Stayed        Moved to      Moved to      Stayed        Moved to      Moved to
                                              bigger city   smaller city                bigger city   smaller city
                                (1a)          (1b)          (1c)          (2a)          (2b)          (2c)
Moved for college               -0.0581       -0.0534       0.1115        -0.1473       0.0378        0.1095
                                (0.0324)∗     (0.0226)∗∗    (0.0118)∗∗∗   (0.0576)∗∗    (0.0450)      (0.0198)∗∗∗
Standardized Saber 11           -0.0008       0.0079        -0.0071       0.0023        0.0045        -0.0067
                                (0.0072)      (0.0057)      (0.0019)∗∗∗   (0.0110)      (0.0106)      (0.0017)∗∗∗
Low household income            0.0243        -0.0155       -0.0089       0.0213        -0.0120       -0.0093
                                (0.0073)∗∗∗   (0.0055)∗∗∗   (0.0040)∗∗    (0.0086)∗∗    (0.0070)∗     (0.0045)∗∗
Moderate household income       0.0261        -0.0200       -0.0061       0.0248        -0.0185       -0.0063
                                (0.0101)∗∗∗   (0.0075)∗∗∗   (0.0052)      (0.0121)∗∗    (0.0096)∗     (0.0057)
High household income           0.0231        -0.0135       -0.0096       0.0242        -0.0147       -0.0095
                                (0.0112)∗∗    (0.0076)∗     (0.0069)      (0.0123)∗∗    (0.0085)∗     (0.0081)
Very high household income      0.0159        -0.0066       -0.0093       0.0244        -0.0161       -0.0083
                                (0.0104)      (0.0113)      (0.0083)      (0.0118)∗∗    (0.0146)      (0.0103)
Mother, high school degree      -0.0076       0.0064        0.0013        -0.0067       0.0054        0.0013
                                (0.0034)∗∗    (0.0026)∗∗    (0.0029)      (0.0040)∗     (0.0041)      (0.0037)
Mother, associates degree       -0.0117       0.0093        0.0024        -0.0090       0.0064        0.0026
                                (0.0049)∗∗    (0.0040)∗∗    (0.0037)      (0.0056)      (0.0060)      (0.0044)
Mother, college degree          -0.0098       0.0106        -0.0007       -0.0035       0.0036        -0.0001
                                (0.0079)      (0.0071)      (0.0042)      (0.0124)      (0.0136)      (0.0053)
Medium city during high school  -0.3099       0.2752        0.0347        -0.2847       0.2455        0.0391
                                (0.0501)∗∗∗   (0.0550)∗∗∗   (0.0135)∗∗    (0.0765)∗∗∗   (0.0822)∗∗∗   (0.0197)∗∗
Small city during high school   -0.3916       0.3276        0.0640        -0.3096       0.2287        0.0809
                                (0.0375)∗∗∗   (0.0423)∗∗∗   (0.0136)∗∗∗   (0.0777)∗∗∗   (0.0822)∗∗∗   (0.0225)∗∗∗
Public high school              -0.0233       0.0106        0.0127        -0.0239       0.0112        0.0127
                                (0.0063)∗∗∗   (0.0042)∗∗    (0.0036)∗∗∗   (0.0106)∗∗    (0.0080)      (0.0047)∗∗∗
Female                          0.0257        -0.0111       -0.0146       0.0237        -0.0088       -0.0149
                                (0.0031)∗∗∗   (0.0035)∗∗∗   (0.0023)∗∗∗   (0.0060)∗∗∗   (0.0058)      (0.0027)∗∗∗
Ethnic origin                   -0.0720       0.0210        0.0510        -0.0683       0.0168        0.0515
                                (0.0304)∗∗    (0.0163)      (0.0169)∗∗∗   (0.0350)∗     (0.0248)      (0.0143)∗∗∗
Birth order #2                  -0.0039       0.0017        0.0022        -0.0033       0.0010        0.0023
                                (0.0019)∗∗    (0.0014)      (0.0013)∗     (0.0022)      (0.0019)      (0.0013)∗
Birth order #3 or higher        -0.0142       0.0050        0.0092        -0.0118       0.0024        0.0095
                                (0.0054)∗∗∗   (0.0031)      (0.0032)∗∗∗   (0.0076)      (0.0046)      (0.0039)∗∗
Residual, moved for college                                               0.0613        -0.0628       0.0015
                                                                          (0.0241)∗∗    (0.0214)∗∗∗   (0.0076)
Observations                    212,023                                   212,023

Notes: All specifications include a constant and indicators of age and wave year. Marginal effects are reported with robust standard errors in parentheses, which are clustered by urban area of college. ∗∗∗ , ∗∗ , and ∗ indicate significance at the 1, 5, and 10 percent levels. Columns (2a)–(2b) report marginal effects of residuals of a first stage estimation on the probability of moving for college, as shown in table 6. In columns (2a)–(2b), standard errors are bootstrapped with 200 iterations. ‘Mother, less than high school’, ‘Very low household income,’ and ‘In a big city during high school’ are the excluded categories.

[Figure 4: Agglomeration returns by migration path. Vertical axis: residual log wage (wage adjusted for individual and job characteristics). Horizontal axis: city population (log scale), from 100,000 to 8,000,000. Series: Non-mover; Work smaller; Work bigger; Moved college, stay; Moved college, work smaller; Moved college, work bigger.]

Figure 4 depicts the non-parametric wage residual profiles of each migrant type by work city size.
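Profiles of this kind are commonly built in two steps: log wages are first residualized on a set of controls, and the residuals are then smoothed against log city size with a kernel-weighted local polynomial. The Python sketch below illustrates both steps on simulated data. It is not the authors' code. The simulated variables, the local-linear (degree one) fit, and the bandwidth of 0.8 log points are assumptions made for illustration; only the Epanechnikov kernel is taken from the paper.

```python
import numpy as np


def ols_residuals(y, X):
    """Residuals from an OLS regression of y on X (X should include a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def local_linear(x, y, grid, bandwidth):
    """Kernel-weighted local linear fit of y on x, evaluated at each grid point."""
    fitted = np.empty(len(grid))
    for k, x0 in enumerate(grid):
        w = epanechnikov((x - x0) / bandwidth)
        A = np.column_stack([np.ones_like(x), x - x0])
        Aw = A * w[:, None]
        # Weighted least squares normal equations: (A' W A) coef = A' W y.
        coef = np.linalg.solve(A.T @ Aw, Aw.T @ y)
        fitted[k] = coef[0]  # the intercept is the fitted value at x0
    return fitted


# Simulated data (hypothetical; not the paper's sample).
rng = np.random.default_rng(1)
n = 20000
log_size = rng.uniform(np.log(100_000), np.log(8_000_000), size=n)
female = rng.integers(0, 2, size=n).astype(float)  # stand-in for the controls
log_wage = 14.0 + 0.05 * log_size - 0.10 * female + rng.normal(scale=0.3, size=n)

# Step 1: wage residual net of controls (city size is deliberately left out).
controls = np.column_stack([np.ones(n), female])
wage_residual = ols_residuals(log_wage, controls)

# Step 2: smooth the residuals against log city size.
grid = np.linspace(np.quantile(log_size, 0.1), np.quantile(log_size, 0.9), 25)
profile = local_linear(log_size, wage_residual, grid, bandwidth=0.8)
```

In an application with several migration paths, step 2 would be run separately on the observations of each path, giving one smoothed profile per path on a common grid.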
To calculate these residual wages, we recover residuals from a regression of log wages on the same controls used in table 3: basic demographics (gender, ethnicity, age), work experience, type of hei attended, two-digit economic sector indicators and year. We then plot the local polynomial regression (using an Epanechnikov kernel) of residual wages on work city sizes for each migration path. For a given migration path, the wage profile shows the behavior of wage residuals relative to city size; for a given city size, the wage profiles show the migration paths that obtain the highest and lowest wage residuals. For each city size category (big, medium, small), this gives us the upper and lower envelope of wage residuals.

Several key points are noteworthy from these profiles. First, for a given migration path, wage residuals are generally higher in bigger cities, either because of agglomeration effects or because of the individual characteristics that lead to a positive self-selection into bigger cities. Second, the upper envelope consists of movers (rather than non-movers) for all migration paths. In other words, the highest wage residuals do not accrue to the non-movers who attended high school there, but to individuals who moved into the city for college or work. Third, in medium and small cities, the lower envelope consists of non-movers. While non-movers are not at the lower envelope in big cities, they are not at the upper envelope either. All in all, these plots illustrate why individuals migrate: to reach better opportunities.

We now focus on the specific migration paths that lie at the upper and lower envelope for each city size category. To facilitate this analysis, table 10 compares individuals from each migration path who work in big cities (top panel), medium cities (middle panel), and small cities (bottom panel).

We begin with the upper envelope. In big cities, this consists of individuals who moved there for college and stayed for work.
These individuals are positively selected, as they have the highest ability, the most educated parents and are the most likely to have attended a top tier or private college. Further, among all individuals from this particular migration path, it is the ‘best’ ones who end up in big cities (as can be noticed when comparing characteristics in column 2a across the three panels). Meanwhile, in medium and small cities the upper envelope (and the line immediately below it) in figure 4 consists of individuals who moved there for work while coming from larger college cities. In medium and small work cities, these individuals are positively selected: they have the highest ability and come from the most advantaged background in terms of parental education, household income and high school education. They are also the most likely to have attended a top tier or private college.

It is interesting to note that individuals who move for work to bigger cities (regardless of whether they have moved for college before) are not at the upper envelope. In every migration path and city size, they are negatively selected, as they have the lowest ability, the least educated and affluent parents, and are the least likely to have attended a top tier or private institution.

We now turn to the lower envelope. In big cities, as shown in figure 4, this corresponds to those who moved for college first and then to bigger cities for work. Recall that these individuals move to progressively bigger cities between high school and work. In big cities, these individuals are the least able and have the lowest odds of attending a private high school or college, graduating from a tier 1 college, or having affluent or educated parents. In medium and small cities, the lower envelope corresponds to non-movers. They are intermediately selected relative to migrants: ‘better’ than those coming from smaller cities, yet ‘worse’ than those coming from bigger cities.
They stand out for their share of females (59 and 63 percent in medium and small cities, respectively), which is the highest of all migration paths and city size categories. Since our wage residuals capture both agglomeration effects and self-selection into migration paths, we now illustrate the relative roles of ability and high school city (which affect self-selection) and work city size (which determines agglomeration effects) by depicting wage residuals for low- and high-ability individuals in figure 5. In panels (a) and (b) of this figure, we show residual wages for individuals in the 25th and 75th percentiles, respectively, of the ability distribution in our sample. As expected, at every city size wages are higher for the more able workers. To illustrate the role of high school city size, consider a low-ability individual from the largest city, Bogota (on the blue line in panel a). Drawing a horizontal line from there to the blue line in panel (b), we find that she ‘earns’ (in the sense of a wage residual) the same as a high-ability individual from a city of about 700,000 people. In other words, living in Bogota compensates the low-ability individual for her lack of ability. In contrast, the high ability individual suffers a ‘disadvantage’ relative to her lower ability counterpart for not living in a big city. 
If she had lived in Bogota, her salary would have been about 25 percent higher (on the blue line in panel b, compare wage residuals at city sizes equal to 700,000 and Bogota’s). In other words, high school city (perhaps more broadly, place of origin) is highly important. This disadvantage may be more salient in a setting such as Colombia, where individuals face tight mobility constraints.

Mobility, of course, can lessen the importance of place of origin. For example, a low-ability person from a small city (of about 100,000 people) can increase her earnings by about 30 percent if she moves to a big college city and stays there (panel a). Even more dramatically, a high-ability person from a small city can increase her earnings by about 50 percent when allowed to move to a big college city and stay there. From an efficiency standpoint, then, mobility is crucial to maximizing an individual’s earning potential.

Table 10: Summary statistics by migration path and work city size

                                Did not move for college                  Moved for college
                                Stayed        Move bigger   Move smaller  Stayed        Move bigger   Move smaller
                                              city          city                        city          city
                                (1a)          (1b)          (1c)          (2a)          (2b)          (2c)
In a big work city
Standardized Saber 11           1.089         0.957         1.223         1.218         0.895         1.365
Very high household income      0.269         0.187         0.290         0.288         0.144         0.542
High household income           0.223         0.183         0.265         0.191         0.185         0.139
Moderate household income       0.228         0.215         0.235         0.175         0.181         0.107
Low household income            0.247         0.312         0.195         0.238         0.308         0.169
Very low household income       0.033         0.104         0.015         0.109         0.182         0.044
Mother, college degree          0.343         0.287         0.379         0.412         0.273         0.579
Public high school              0.286         0.507         0.264         0.486         0.643         0.266
Private college                 0.701         0.420         0.673         0.724         0.406         0.769
Tier 1 (highest)                0.475         0.268         0.565         0.526         0.335         0.617
Log wage                        14.177        14.213        14.322        14.280        14.218        14.185
Observations                    100,617       14,736        614           15,550        4,838         504

In a medium work city
Standardized Saber 11           0.805         0.517         1.080         0.818         0.711         1.106
Very high household income      0.160         0.072         0.257         0.109         0.057         0.420
High household income           0.197         0.133         0.222         0.186         0.132         0.195
Moderate household income       0.236         0.181         0.207         0.191         0.193         0.156
Low household income            0.320         0.394         0.255         0.309         0.382         0.164
Very low household income       0.087         0.219         0.058         0.205         0.237         0.066
Mother, college degree          0.288         0.156         0.352         0.261         0.195         0.512
Public high school              0.474         0.720         0.343         0.619         0.725         0.287
Private college                 0.551         0.259         0.617         0.552         0.249         0.798
Tier 1 (highest)                0.267         0.160         0.425         0.338         0.183         0.473
Log wage                        13.902        13.942        14.234        13.952        13.989        14.100
Observations                    26,725        2,098         1,511         3,604         668           1,968

In a small work city
Standardized Saber 11           0.440         0.223         0.676         0.473         0.512         0.648
Very high household income      0.062         0.020         0.113         0.041         0.030         0.132
High household income           0.124         0.071         0.157         0.100         0.082         0.162
Moderate household income       0.221         0.143         0.229         0.188         0.168         0.213
Low household income            0.418         0.430         0.352         0.383         0.391         0.323
Very low household income       0.174         0.335         0.148         0.289         0.329         0.170
Mother, college degree          0.151         0.086         0.196         0.143         0.122         0.237
Public high school              0.674         0.828         0.574         0.782         0.898         0.649
Private college                 0.259         0.118         0.364         0.275         0.060         0.500
Tier 1 (highest)                0.116         0.102         0.275         0.086         0.042         0.265
Log wage                        13.804        13.900        14.136        13.860        14.020        14.057
Observations                    15,019        1,516         10,671        2,064         304           9,016

[Figure 5: Agglomeration returns by migration path and skill. Panel (a): low ability (25th percentile on Saber 11). Panel (b): high ability (75th percentile on Saber 11). Vertical axis in both panels: residual log wage (wage adjusted for individual and job characteristics). Horizontal axis: city population (log scale), from 100,000 to 8,000,000. Series: Non-mover; Moved college, stay.]

Back to agglomeration effects

As a final note, we return to our findings in table 3 of relatively larger agglomeration effects for college than work city.
These effects are separately identified by those who move for work after college, regardless of whether they have moved for college before. We draw on two insights from our preceding analysis of sorting and returns by migration path.

First, those who move to smaller cities for work are relatively high-ability individuals who attend college in big cities, where they train at selective, private institutions. Although they are not as able as workers in big cities, they are positively selected into their smaller work cities, where they are the most able, highest-earning workers. Therefore, they might lower the elasticity of earnings with respect to work city size (when observed working in smaller cities earning relatively high wages), and might raise the elasticity of earnings with respect to college city size (when assigned their college location in a big city).

Second, individuals moving to bigger cities for work are not particularly able and have attended less-selective colleges in small and medium cities. They are not positively selected into their big work cities, where they are not the most able or highest-earning workers. Therefore, it is unlikely they could generate an upward bias in the elasticity of earnings with respect to work city size (when observed working in big cities with average or below average wages). Taken together, these two insights explain our finding of a larger elasticity of wages with respect to college than work city.

This finding highlights the importance of our pre-labor market locational data. Administrative datasets usually follow an individual from the beginning of her work life and use repeated observations per individual to handle the inherent self-selection into the work city, with identification of (work city) agglomeration effects being driven by migrants (Combes, Duranton, and Gobillon, 2008; D’Costa and Overman, 2014; De la Roca and Puga, 2017).
In contrast, our pre-labor market locational data enable us to examine the spatial trajectory leading individuals to their work city, thereby underscoring at what point of this trajectory a large city pays off the most.

4. Summary and conclusions

Our study examines the dynamic sorting that leads to differences in skills across cities of different sizes, and the consequences of this pre-labor market sorting on estimated agglomeration effects. We find an elasticity of wages with respect to city population of 0.052, in line with the previous literature. In addition, we find a substantial effect on wages for college city size; this effect is much larger than that of high school or work city size. We also document substantial returns to mobility, particularly with moves to attend college in big cities.

Turning to the determinants of pre-labor market sorting, particularly with respect to the choice to move for college, we find that ability (as proxied through Saber 11, a national high school exit exam) is not a key determinant for mobility in Colombia. Instead, we find that access to college opportunities and household income are major drivers of pre-labor market sorting. Individuals from small cities also perceive a value in moving for college that goes beyond mere access to college, as those from small cities are disproportionately more likely to move for college.

We show the economic consequences of this pre-labor market sorting in terms of worker earnings by city size. In line with our hypothesis, in the big cities the highest wages are for those who moved there for college and stayed in that city for work. In the small and medium cities, the highest wage earners are workers who moved there for work having attended college in a larger city. These workers are on average less able compared to those who work in big cities, but are relatively more able than the ones who never moved from small cities and, thus, are the highest earners in the small cities.
This last finding is suggestive of a comparative advantage story: the most able migrants attain labor market success in the big city, while the relatively less able who attended college in the big city move to a small city for work, where they are the highest earners.

Altogether, our findings illustrate the dynamic sorting of skilled workers prior to labor market entry in a developing country setting, with the most talented sorting to big cities as a result of being there for high school or of attending college there. Our study also highlights the role of luck. Those born in big cities such as Bogota, or who happen to be in a big city for high school, do not have to move to access college, while those who spent their youth in small cities with limited access to public colleges have to move if they want to attain a college degree. More important, even when young individuals in big cities are of relatively low ability, their labor market earnings can exceed those of high-ability individuals in small cities. These individuals in small cities can raise their future earnings by moving to bigger cities for college, although this career path may not be accessible to everyone. Policies expanding college access, particularly for students from low income families, might hence mitigate spatial inequality in Colombia.

References

Ahlin, Lina, Martin Andersson, and Per Thulin. 2017. Human capital sorting: The “when” and “who” of the sorting of educated workers to urban regions. Journal of Regional Science 58(3): 581–610.

Borjas, George J., Stephen G. Bronars, and Stephen J. Trejo. 1992. Self-selection and internal migration in the United States. Journal of Urban Economics 32(2): 159–185.

Bosquet, Clément and Henry G. Overman. 2019. Why does birthplace matter so much? Journal of Urban Economics 110: 26–34.

Brunello, Giorgio and Lorenzo Cappellari. 2008. The labour market effects of Alma Mater: Evidence from Italy. Economics of Education Review 27(5): 564–574.
Carlsen, Fredrik, Jørn Rattsø, and Hildegunn E. Stokke. 2016. Education, experience, and urban wage premium. Regional Science and Urban Economics 60: 39–49.

Carranza, Juan Esteban and María Marta Ferreyra. 2019. Increasing higher education access: Supply, sorting, and outcomes in Colombia. Journal of Human Capital 13(1): 95–136.

Combes, Pierre-Philippe, Gilles Duranton, and Laurent Gobillon. 2008. Spatial wage disparities: Sorting matters! Journal of Urban Economics 63(2): 723–742.

Combes, Pierre-Philippe, Gilles Duranton, Laurent Gobillon, and Sébastien Roux. 2012. Sorting and local wage and skill distributions in France. Regional Science and Urban Economics 42(6): 913–930.

D’Costa, Sabine and Henry G. Overman. 2014. The urban wage growth premium: Sorting or learning? Regional Science and Urban Economics 48: 168–179.

De la Roca, Jorge. 2017. Selection in initial and return migration: Evidence from moves across Spanish cities. Journal of Urban Economics 100: 33–53.

De la Roca, Jorge and Diego Puga. 2017. Learning by working in big cities. Review of Economic Studies 84(1): 106–142.

Duranton, Gilles. 2015. A proposal to delineate metropolitan areas in Colombia. Revista Desarrollo y Sociedad 75: 223–264.

Duranton, Gilles. 2016. Agglomeration effects in Colombia. Journal of Regional Science 56(2): 210–238.

Duranton, Gilles and Diego Puga. 2004. Micro-foundations of urban agglomeration economies. In J. Vernon Henderson and Jacques-François Thisse (eds.) Handbook of Regional and Urban Economics, volume 4. Amsterdam: Elsevier, 2063–2117.

Faggian, Alessandra and Philip McCann. 2006. Human capital flows and regional knowledge assets: a simultaneous equation approach. Oxford Economic Papers 58(3): 475–500.

Ferreyra, María Marta and Mark Roberts. 2018. Raising the bar for productive cities in Latin America and the Caribbean. Washington, dc: The World Bank.

Glaeser, Edward L. and David C. Maré. 2001. Cities and skills. Journal of Labor Economics 19(2): 316–342.
Greenwood, Michael J. 1997. Internal migration in developed countries. In Mark R. Rosenzweig and Oded Stark (eds.) Handbook of Population and Family Economics, volume 1B. Amsterdam: Elsevier, 647–720.

Hunt, Jennifer. 2004. Are migrants more skilled than non-migrants? Repeat, return, and same-employer migrants. Canadian Journal of Economics 37(4): 830–849.

International Labour Organization. 2014. Trends in informal employment in Colombia: 2009–2013. Report, International Labour Organization, forlac Programme for the Promotion of Formalization in Latin America and the Caribbean.

Matano, Alessia and Paolo Naticchioni. 2016. What drives the urban wage premium? Evidence along the wage distribution. Journal of Regional Science 56(2): 191–209.

Puga, Diego. 2010. The magnitude and causes of agglomeration economies. Journal of Regional Science 50(1): 203–219.

Suhonen, Tuomo. 2013. Are there returns from university location in a state-funded university system? Regional Science and Urban Economics 43(3): 465–478.

Winters, John V. 2011. Why are smart cities growing? Who moves and who stays. Journal of Regional Science 51(2): 253–270.

Wooldridge, Jeffrey M. 2015. Control function methods in applied econometrics. Journal of Human Resources 50(2): 420–445.

Appendix A. Additional mechanisms on agglomeration effects

Table A.1: Additional mechanisms on agglomeration effects for college graduates

Dependent variable: Log wage
                                (1)           (2)           (3)           (4)           (5)
Log work city size              0.0173        0.0163        0.0152        0.0151        0.0151
                                (0.0100)∗     (0.0100)      (0.0099)      (0.0098)      (0.0099)
Log college city size           0.0535        0.0356        0.0349        0.0351        0.0358
                                (0.0090)∗∗∗   (0.0129)∗∗∗   (0.0109)∗∗∗   (0.0106)∗∗∗   (0.0102)∗∗∗
Log high school city size       -0.0097       -0.0118       -0.0078       -0.0068       -0.0069
                                (0.0039)∗∗    (0.0039)∗∗∗   (0.0039)∗∗    (0.0037)∗     (0.0038)∗
Tier 1 (highest quality)        0.1654        0.2079        0.1987        0.1917        0.1820
                                (0.0293)∗∗∗   (0.0343)∗∗∗   (0.0340)∗∗∗   (0.0321)∗∗∗   (0.0305)∗∗∗
Tier 2                          0.0390        0.0567        0.0609        0.0582        0.0536
                                (0.0336)      (0.0309)∗     (0.0340)∗     (0.0330)∗     (0.0325)∗
Tier 3                          0.0398        0.0280        0.0099        0.0119        0.0137
                                (0.0255)      (0.0254)      (0.0220)      (0.0218)      (0.0222)
Tier 4                          0.0474        0.0357        0.0267        0.0313        0.0339
                                (0.0190)∗∗    (0.0260)      (0.0295)      (0.0282)      (0.0268)
Standardized Saber 11           0.0888        0.0878        0.0726        0.0697        0.0670
                                (0.0035)∗∗∗   (0.0033)∗∗∗   (0.0039)∗∗∗   (0.0037)∗∗∗   (0.0035)∗∗∗
Public high school                            -0.0165       -0.0121       0.0032        0.0118
                                              (0.0053)∗∗∗   (0.0053)∗∗    (0.0055)      (0.0050)∗∗
Public college                                -0.1331       -0.1132       -0.1016       -0.0885
                                              (0.0355)∗∗∗   (0.0268)∗∗∗   (0.0246)∗∗∗   (0.0228)∗∗∗
Mother, college degree                                                    0.0736        0.0513
                                                                          (0.0111)∗∗∗   (0.0106)∗∗∗
Mother, associates degree                                                 0.0370        0.0332
                                                                          (0.0081)∗∗∗   (0.0083)∗∗∗
Mother, high school degree                                                0.0187        0.0172
                                                                          (0.0051)∗∗∗   (0.0050)∗∗∗
Birth order #2                                                            0.0089        0.0084
                                                                          (0.0051)∗     (0.0050)∗
Birth order #3 or higher                                                  0.0248        0.0222
                                                                          (0.0055)∗∗∗   (0.0055)∗∗∗
Very high household income                                                              0.0652
                                                                                        (0.0125)∗∗∗
High household income                                                                   0.0000
                                                                                        (0.0067)
Moderate household income                                                               -0.0103
                                                                                        (0.0066)
Low household income                                                                    -0.0107
                                                                                        (0.0061)∗
Major of study indicators       No            No            Yes           Yes           Yes
Observations                    184,719       184,719       184,719       184,719       184,719
R2                              0.2744        0.2827        0.2988        0.3002        0.3017

Notes: All specifications include a constant and indicator variables for age, year and 2-digit economic sector. Additional controls are experience and indicators for female, ethnicity, technological school and technical and technological institute. ‘Tier 5 (lowest quality),’ ‘Mother, less than high school’ and ‘Very low household income’ are the omitted categories. Coefficients are reported with robust standard errors in parentheses, which are clustered by urban area of work. ∗∗∗ , ∗∗ , and ∗ indicate significance at the 1, 5, and 10 percent levels.
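As a reading aid for table A.1: because wages and city sizes both enter in logs, each city-size coefficient is an elasticity, so multiplying city size by a factor k is associated with wages changing by a factor k raised to that coefficient. The short Python sketch below applies this to the column (1) coefficients. The helper name `pct_wage_change` is hypothetical, and the resulting figures are conditional associations implied by the reported estimates, not additional results from the paper.

```python
def pct_wage_change(elasticity, size_ratio):
    """Percent wage change implied by a log-log elasticity when city size is multiplied by size_ratio."""
    return 100.0 * (size_ratio**elasticity - 1.0)


# Column (1) of table A.1: doubling city size, holding the other regressors fixed.
doubling_college = pct_wage_change(0.0535, 2.0)  # log college city size
doubling_work = pct_wage_change(0.0173, 2.0)     # log work city size
print(f"college city doubled: {doubling_college:.2f}%")
print(f"work city doubled:    {doubling_work:.2f}%")
```

Under these estimates, a doubling of college city size is associated with wages that are roughly 3.8 percent higher, against roughly 1.2 percent for a doubling of work city size.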