Policy Research Working Paper                   9990




          Spatial Misallocation, Informality,
             and Transit Improvements
                    Evidence from Mexico City

                             Román D. Zárate




Development Economics
Development Research Group
March 2022


  Abstract
 This paper proposes a new mechanism to explain resource misallocation in developing countries: the high
 commuting costs within cities that prevent workers from accessing formal employment. To test this
 mechanism, the paper combines a rich collection of microdata and exploits the opening of new subway
 lines in Mexico City. The findings show that transit improvements reduce informality by 7 percent in
 areas near the new stations. The paper develops a spatial model that accounts for the direct effects of
 infrastructure in perfectly efficient economies and for allocative efficiency. Changes in allocative
 efficiency driven by workers’ reallocation to the formal sector amplify the gains by 20–25 percent.




 This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the
 World Bank to provide open access to its research and make a contribution to development policy discussions around the
 world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The author may
 be contacted at rzaratevasquez@worldbank.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
     Spatial Misallocation, Informality, and Transit Improvements:
                                  Evidence from Mexico City∗

                                            Román David Zárate†

                                                   World Bank




      Keywords: Informality, allocative efficiency, urban transit infrastructure.




   ∗
     I am extremely grateful to my advisors Ben Faber, Cecile Gaubert, and Andrés Rodríguez-Clare for their con-
tinuous support and guidance in this project. I also want to thank my discussants Clare Balboni, Gabriel Ulyssea,
and Alejandro Molnar, and seminar participants at the Dallas Fed, McGill University, the World Bank, ITAM, Uni-
versidad de los Andes, Universidad del Rosario, PUC Rio, Vancouver School of Economics, Nottingham University,
the Macroeconomics Conference at Oxford University, the Online Urban Economics Seminar, the NBER SI in Urban
Economics, the Cities and Development Workshop, Hitotsubashi University, and the 2021 ASSA meetings for very
useful suggestions. I also want to thank David Atkin, Kirill Borusyak, David Card, Andrés González-Lira, Marco
González-Navarro, Patrick Kennedy, Chris Severen, Joaquín Klot, Isabela Manelici, Pablo Muñoz, Mathieu Pede-
monte, Darío Tortarolo, Nick Tsivanidis, Jose P. Vásquez, Román Andrés Zárate, and Isabel Hincapie for very helpful
comments and discussions. I thank the National Institute of Statistics and Geography (INEGI) and especially Dr.
Natalia Volkow for granting me access to the data. Financial support from the Clausen Center at UC Berkeley is
gratefully acknowledged. The views expressed are only those of the author and they do not reﬂect the views of INEGI.
   †
     World Bank, DEC TI. E-mail: rzaratevasquez@worldbank.org. The ﬁndings, interpretations, and conclusions
expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the World
Bank and its aﬃliated organizations, or those of the Executive Directors of the World Bank or the governments they
represent.
1        Introduction

Poor transportation infrastructure is a common characteristic of cities in developing countries. For
instance, in Mexico City, it takes a typical low-skilled worker approximately two to three hours to
commute to work in the center of the city. In recent decades, governments around the world have
spent billions of dollars on infrastructure projects to facilitate commuting. Recent research exam-
ines the aggregate gains from public transit improvements, assuming perfectly eﬃcient economies.
However, perfectly competitive models may fail to capture key features of developing economies,
where labor market frictions and other economic distortions are salient.1 In this paper, I study the
economic impacts of transit infrastructure in Mexico City, while considering both its direct eﬀects
in perfectly eﬃcient economies and the role played by distortions on allocative eﬃciency.
        Labor market informality is one of the most signiﬁcant sources of distortions in low- and middle-
income countries, with important implications for aggregate eﬃciency. Within developing countries,
50 to 60 percent of total employment is informal. Informal ﬁrms are less productive than formal ones,
avoid paying taxes, and do not make social security contributions to their workers.2 As a result,
the informal sector creates input wedges that cause factor misallocation, which ultimately lowers
aggregate total factor productivity (TFP) (Banerjee and Duﬂo, 2005; Hsieh and Klenow, 2009;
Restuccia and Rogerson, 2008). These intersectoral distortions between the formal and informal
sectors imply that any policy or shock that impacts informality may have ﬁrst-order eﬀects on
aggregate welfare through an allocative eﬃciency margin.3
        This study explores the link between transit improvements, informality, and aggregate eﬃciency
at the city level. I test whether infrastructure projects that facilitate transit within a city improve
allocative eﬃciency by reallocating workers from the informal to the formal economy. As a result,
aggregate gains from these projects can be larger relative to those demonstrated by standard urban
models that assume perfectly eﬃcient economies. The core intuition is that in cities in developing
countries, workers in remote locations prefer to work in low-paid informal jobs near their home
rather than incurring the high cost of commuting to formal employment. Transit improvements
may thus provide better access to formal jobs, leading to an expansion of the formal sector and a
more eﬃcient labor allocation.
        The paper makes two main contributions. First, I combine rich administrative microdata with
a transit shock to provide new empirical evidence on the eﬀect of urban transit improvements on
worker reallocation across the formal and informal sectors. Second, I rationalize these results through
the lens of a quantitative spatial model. To this end, I extend recent work (Ahlfeldt et al., 2015;
    1
     See e.g., Atkin and Khandelwal (2019) for a recent review of market distortions in the context of the gains from
market integration, and Busso et al. (2012); Levy (2018) for the eﬀect of distortions on total factor productivity in
the Mexican context.
   2
     See Gollin (2002, 2008) and La Porta and Shleifer (2008, 2014) for the relationship between the prevalence of the
informal sector and economic development.
   3
     See, e.g., McCaig and Pavcnik (2018) and Dix Carneiro et al. (2018) for the case of trade policies and their eﬀect
on the informal economy and the aggregate gains from trade. The former paper studies the eﬀect of the Free Trade
Agreement between the US and Vietnam, and the latter the impact of the Brazilian trade liberalization episode.


Allen et al., 2015; Tsivanidis, 2019) by adding intersectoral distortions and factor misallocation to
an urban framework. Following Baqaee and Farhi (2020), I provide a formula that decomposes the
welfare gains from transit developments into a “direct” eﬀect and an allocative eﬃciency term. This
latter term captures two diﬀerent components: factor misallocation and agglomeration forces when
they diﬀer between the formal and informal sectors.4
    Mexico City constitutes a relevant and informative case study for several reasons. First, it has
a dense concentration of economic activity and is home to around 8.9 million people. Second, the
Mexican case is typical of developing countries, especially in Latin America, where more than 50%
of the urban labor force and 70% of business establishments are informal.5 Furthermore, the city
constructed a new primary subway line in the early 2000s, connecting remote areas in the north
with the center of the city. The entire line was not initially planned by the Government and there
were multiple delays in its construction, suggesting that the opening dates were uncorrelated with
local demand and supply shocks. Moreover, Mexico City collects unique data that I use to estimate
the impact of transit improvements on informality. Throughout, I use the standard deﬁnition of
informality: a worker is informal if he/she does not receive social security beneﬁts based on the
contractual relationship with his/her employer.
    At the center of the analysis is a rich collection of administrative microdata. I observe the
geography of jobs and worker residences for both the formal and informal sectors at the highly granular
census-tract level. In the analysis, I use four main sources of data. First, I use confidential microdata
from several reports of the Mexican Economic Census, covering the universe of business establish-
ments located in Mexico City. Second, I use the microdata of the Mexican Population Census to
determine the residence of both formal and informal workers. Third, I use detailed information on
the transportation network in Mexico City, which I complement with transportation diaries (origin-
destination survey data). Additionally, I use the 2015 Intercensal Survey to construct commuting
and trade ﬂows at the city level for both sectors. I also use standard household survey data to
calibrate some of the parameters of the model.
    In the ﬁrst part of the paper, I document three empirical ﬁndings that suggest a negative rela-
tionship between the accessibility of jobs and informality.
    First, I exploit cross-sectional variation among informal vs. formal workers to show that informal
workers spend less time commuting, and work closer to home relative to their formal counterparts.
For instance, informal workers spend 40% less time commuting on average. This implies that informal
workers are more sensitive to commuting costs than formal workers. These ﬁndings are robust to
controlling for diﬀerent sets of ﬁxed eﬀects and individual characteristics.
    Second, I compare the residence of informal and formal workers. I document that formal jobs
concentrate in the west and center of the city, where most economic activity takes place. By contrast,
   4
     See Bartelme et al. (2019) for recent work that studies the eﬀect of optimal policies when agglomeration exter-
nalities diﬀer across sectors or locations.
   5
     See Perry et al. (2007) and Ulyssea (2018) who document informality rates in Latin America. In this region, the
informal economy varies from 35% in Chile to 80% in Peru.


most informal workers reside in the east and the periphery of the city.
       Third, I exploit the construction of a new subway line that connected remote locations with the
center of Mexico City to provide causal evidence that transit infrastructure leads to a decrease in
informality rates. Speciﬁcally, I estimate a series of diﬀerence-in-diﬀerences speciﬁcations that use
variation in access to new transit. These speciﬁcations control for initial characteristics of census
tracts and capture changes in informality after the transit shock in locations close to the new subway
line. The key identiﬁcation assumption is that the opening dates of these new commuting links were
unrelated to other local demand- or supply-side shocks that aﬀected locations near the new line.
This assumption is supported by the decades-long planning horizon in which part of the line was
included, and several unexpected and multi-year delays in the opening schedule. I further corroborate
this assumption by documenting no apparent pre-trends among the most aﬀected locations in the
preceding periods.
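
As a minimal sketch of the difference-in-differences logic described above, the following Python snippet computes a two-group, two-period estimate on synthetic data. It is not the paper's specification: the tract counts, informality levels, common trend, and the effect size of -0.03 are all made-up values for illustration, and the function name `did_estimate` is hypothetical. The sketch also omits the controls for initial tract characteristics.

```python
import random
from statistics import fmean


def did_estimate(rows):
    """Two-by-two difference-in-differences from cell means.

    rows: list of (outcome, near_new_line, post_period) tuples.
    """
    def cell_mean(near, post):
        return fmean(y for y, n, p in rows if n == near and p == post)

    change_near = cell_mean(True, True) - cell_mean(True, False)
    change_far = cell_mean(False, True) - cell_mean(False, False)
    return change_near - change_far


# Synthetic tracts (illustrative numbers only, not the paper's data).
rng = random.Random(0)
TRUE_EFFECT = -0.03   # assumed drop in the informality rate near the line
COMMON_TREND = -0.01  # assumed city-wide change shared by all tracts

rows = []
for tract in range(1000):
    near = tract < 500                # first half of tracts is "treated"
    level = rng.gauss(0.50, 0.05)     # time-invariant tract informality level
    for post in (False, True):
        y = (level
             + COMMON_TREND * post
             + TRUE_EFFECT * (near and post)
             + rng.gauss(0.0, 0.02))  # idiosyncratic noise
        rows.append((y, near, post))

estimate = did_estimate(rows)
print(f"DiD estimate: {estimate:.4f} (assumed true effect: {TRUE_EFFECT})")
```

Because the tract-level term and the common trend difference out, the estimate recovers the assumed effect up to sampling noise. This is the sense in which the identification rests on the absence of differential shocks near the new line.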
       The main ﬁnding is that informality rates decrease in locations close to the new subway stations.
I ﬁnd that the ratio of formal to informal residents increases by approximately 6%-7% in locations
close to the new stations after the shock. Similarly, workers’ informality rates decrease by 2 to 4
percentage points after the construction of the new line, and ﬁrms’ informality rates decrease by 1 to
3 percentage points. These estimates represent a 6.2% decrease in workers’ informality rates, using
the average informality rate in the baseline year as a benchmark. I also construct bounds of the
eﬀect to control for the sorting of workers using retrospective questions. In particular, even under
the extreme assumption that all workers who moved from the central areas to the treated locations
were formal in the pre-period, the eﬀect is still 4.5%.
       To check the robustness of the diﬀerence-in-diﬀerences speciﬁcation, I compare the new line with
similar planned metro lines that were not completed over this period for unrelated reasons, using
an expansion plan from 1980. Reassuringly, this robustness check yields similar estimates to the
baseline speciﬁcation. Another potential concern with identiﬁcation is a change in the composition
of households in areas close to the new stations. To examine this, I also estimate the diﬀerence-in-
diﬀerences speciﬁcation using household characteristics as dependent variables. I ﬁnd that the transit
shock did not lead to changes in the composition of households based on observable characteristics.6
To control for sorting, I also estimate bounds of the effect, assuming that all workers who
moved to the locations that experienced the shock were formal.
       To calculate and decompose the welfare gains from transit improvements, I build a quantitative
model with multiple sectors and wedges that captures the three empirical ﬁndings. The model allows
me to quantify the aggregate eﬀects of new infrastructure, while considering its additional impact
on factor allocation. Following Baqaee and Farhi (2020), I provide a formula that decomposes the
welfare eﬀects of any trade/commuting costs shock into two diﬀerent components: a “direct” eﬀect
term and an allocation term.7 Intuitively, the sign of this additional impact will depend on whether
   6
     Moreover, the quantitative model accounts for employment and location decisions within the city. Thus, I consider
the change in household characteristics after the transit shock based on unobservables.
   7
     This approach provides the main intuition for the allocation channel, but it is a first-order approximation. To


the shock reallocates workers to sector-locations with larger wedges or stronger agglomeration forces.8
The logic is similar to that in Hsieh and Klenow (2009) who show that sectors with larger wedges
are too small relative to the ﬁrst-best allocation. If a shock reallocates workers to ﬁrms bearing
higher distortions, the aggregate welfare eﬀects are larger.
    I calibrate the model using structural relationships. The key parameter to estimate is the labor
supply elasticity across sectors, which governs the reallocation of workers from the informal to the
formal sector. I follow Tsivanidis (2019) and calculate measures of market access for residents and
ﬁrms by sector. To recover this key elasticity, I exploit variation across locations after the shock by
running a triple-diﬀerence estimator that associates changes in labor allocation between the formal
and informal sectors with changes in market access. I ﬁnd that the estimates for the labor supply
elasticity parameter, around 1.9, are consistent with the theoretical assumptions of the model and
the data. Intuitively, if a transit shock connects workers to better formal jobs relative to informal
jobs, workers reallocate from the informal to the formal sector, generating additional welfare gains.
    Moreover, to calibrate the wedges, I use two diﬀerent approaches that yield similar results. First,
I follow Hsieh and Klenow (2009) and use the inverse of the labor and capital share relations. Under
the assumption that all ﬁrms within the same sector use the same production function, diﬀerences
in these shares capture the wedges. Second, I use the analysis from Levy (2018) that documents
diﬀerences in taxes, subsidies, and other distortions between formal and informal establishments in
the Mexican economy. I assume a constant wedge for formal ﬁrms considering all these distortions
and a wedge of zero for the informal ones.
    Next, I quantify and decompose the welfare gains from the transit shock by varying trade/commuting
costs in the GE model. I find that the allocative efficiency margin drives a significant
fraction of the total gains. I am able to run counterfactuals from an initial equilibrium by inverting
the model and recovering scale parameters. I compute the counterfactuals using the estimates of
the key elasticities and the initial equilibrium conditions (Dekle et al., 2008). The results suggest
that the new subway line increased welfare by around 1.8%. I ﬁnd that the direct eﬀects explain
approximately 79% of the total gains, while the reallocation of workers from informal to formal
ﬁrms explains 18% and the remaining 2% are driven by diﬀerences in agglomeration between the
two sectors.9 The counterfactual analysis also suggests that the reductions in commuting and trade
costs account for a similar amount of the total gains.
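
The decomposition above can be restated as simple arithmetic. The Python sketch below takes the reported total gain of around 1.8% and the reported shares (79%, 18%, and 2%, which sum to 99%, presumably because of rounding) and converts them into percentage-point contributions. The variable names are illustrative, and the "amplification" ratio is only one possible way to relate the allocative terms to the direct effect.

```python
# Reported figures from the text: total welfare gain and its decomposition.
total_gain_pct = 1.8
shares = {
    "direct": 0.79,         # direct effect
    "reallocation": 0.18,   # informal-to-formal worker reallocation
    "agglomeration": 0.02,  # differences in agglomeration across sectors
}

# Contribution of each term in percentage points of welfare.
contributions = {name: share * total_gain_pct for name, share in shares.items()}

# Allocative terms relative to the direct effect alone.
allocative_share = shares["reallocation"] + shares["agglomeration"]
amplification = allocative_share / shares["direct"]

for name, value in contributions.items():
    print(f"{name:>13}: {value:.3f} percentage points")
print(f"allocative terms relative to direct effect: {amplification:.1%}")
```

Under these figures, the allocative terms add roughly a quarter on top of the direct effect, which is in line with the 20–25 percent amplification stated in the abstract.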
    In terms of the cost-beneﬁt analysis, the allocative eﬃciency margin increases net welfare by a
considerable margin. According to oﬃcial documents from the Government of Mexico City (hence-
forth the Government), the net present value of the total cost of a subway line with 20 km and 20
capture actual changes in welfare, the approach rests on the assumption that any reduction in commuting costs is
inﬁnitesimal. Accordingly, I compute the counterfactuals using percentage changes.
   8
     The third term arises in the presence of heterogeneous agglomeration externalities or trade imbalances as in
Fajgelbaum and Gaubert (2020). In the case of the eﬃcient economy, I am assuming trade balances. In the ineﬃcient
economy, the labor wedges create trade imbalances, and the third term arises.
   9
     I compute two counterfactuals. The ﬁrst one allows workers to migrate within the city. The second one holds
constant the population in each census tract. The results of both counterfactuals are very similar.


stations is approximately 0.72% of the total GDP of Mexico City. Since line B increases welfare
by between 1.7% and 1.9%, this represents a net gain of around 2.5 USD for every dollar spent on the
infrastructure. This gain would be lower if we ignored the allocative efficiency margin, as in a
perfectly efficient economy. For example, in the case of migration, the reallocation of workers from
the informal to the formal sector increases the average real income net of the total cost by 26%.
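
The ratio of gains to costs quoted above can be checked with a short calculation. The Python sketch below assumes that the welfare gain, expressed in percent, can be compared directly with the cost expressed as a percentage of city GDP. That equivalence is an assumption of this illustration, not a statement of how the paper computes the figure, and the function name `gain_per_dollar` is hypothetical.

```python
def gain_per_dollar(welfare_gain_pct, cost_pct_of_gdp):
    """Gain per dollar spent, treating both inputs as shares of city GDP."""
    return welfare_gain_pct / cost_pct_of_gdp


COST_PCT_OF_GDP = 0.72  # reported cost of the line as a share of GDP

# Reported lower bound, approximate midpoint, and upper bound of the gain.
ratios = {gain: gain_per_dollar(gain, COST_PCT_OF_GDP) for gain in (1.7, 1.8, 1.9)}

for gain, ratio in ratios.items():
    print(f"welfare gain {gain}% -> {ratio:.2f} USD per dollar spent")
```

The midpoint of the reported range, 1.8%, gives 1.8 / 0.72 = 2.5, which matches the figure in the text.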
       I run other counterfactuals in which I simulate diﬀerent policies that the Government can imple-
ment to reduce informality rates. The results suggest that transit infrastructure can be an eﬀective
policy tool to reduce informality by connecting informal workers with formal jobs. For example,
to reduce informality rates by 0.5% at the aggregate level, the Government needs to reduce the
ﬁxed cost of entry to the formal economy by more than 7% or increase the ﬁxed cost of entry to
the informal economy by more than 10%. Similarly, I show that connecting informal workers with
formal jobs is more eﬃcient than implementing de-agglomeration policies that reallocate ﬁrms to the
outskirts. I also show that in the case of constructing central transit lines, the allocative efficiency
margin would explain a lower fraction of the gains than for line B.
       Overall, the ﬁndings suggest that it is important to consider the role of the allocative eﬃciency
margin in the optimal allocation of infrastructure. Recent papers such as Fajgelbaum and Schaal
(2017), Balboni (2019), and Santamaría (2020) have estimated infrastructure misallocation in
spatial general equilibrium models. My results suggest that when a social planner decides where to
allocate infrastructure, there are also ﬁrst-order eﬀects driven by the resource misallocation compo-
nent that are economically important.

Related Literature

This paper contributes to diﬀerent strands of the literature. The ﬁrst is the economic geography
and urban economics literature, which has assessed the economic impacts of urban infrastructure.
The second is the macro-development literature, which has studied the main drivers of the informal
economy, including the eﬀect of allocative eﬃciency on TFP. This latter strand is related to a large
literature on international economics that has estimated the impact of trade reforms on allocative
eﬃciency in the presence of domestic distortions.
       First, a new strand of literature has explored the impact of transit infrastructure within cities
(Ahlfeldt et al., 2015; Baum-Snow, 2007; Gonzalez-Navarro and Turner, 2018; Heblich et al., 2018;
Monte et al., 2018; Tsivanidis, 2019). For example, Tsivanidis (2019) assesses the welfare and
distributional eﬀects of a new bus rapid transit system in Bogotá, and Heblich et al. (2018) study
the economic consequences of the subway in London. My paper adds to this literature by examining
the eﬀect of transit infrastructure on allocative eﬃciency. I depart from standard urban economic
models by adding distortions and resource misallocation.10
       This paper also contributes to a literature on the role of factor misallocation in lowering ag-
  10
   Another type of distortion in the context of an urban model is studied by Pérez Pérez (2018) who assesses the
impact of minimum wage on aggregate employment and commuting patterns across US cities.



gregate TFP (Banerjee and Duﬂo, 2005; Hsieh and Klenow, 2009; Restuccia and Rogerson, 2008).
These studies have shown that the dispersion in distortions across ﬁrms and sectors generates factor
misallocation, and more so in developing than in advanced economies. In the particular case of
Mexico, Busso et al. (2012) show that if workers reallocate from the informal to the formal sector by
eliminating wedges, TFP increases by approximately 50%. Other studies have aimed to understand
the main causes of the large levels of resource misallocation in developing countries. Some of the
primary explanations consist of regulations, markups, and the wedges caused by the informal sector.
Similarly, other papers such as Fajgelbaum et al. (2019) and Hsieh and Moretti (2019) have shown
that state taxes and housing restrictions generate spatial misallocation in the US.
   Third, my work also contributes to a strand of the international economics literature that studies
gains from trade through the allocative eﬃciency channel. This literature was recently reviewed by
Atkin and Khandelwal (2019), who discuss the role of distortions on the aggregate gains from market
integration. Most of these articles have explored the response of markups to trade liberalization
episodes or changes in infrastructure (Arkolakis et al., 2019; Asturias et al., 2016; Edmond et al.,
2015; Holmes et al., 2014; Hornbeck and Rotemberg, 2019). Similar to my paper, some studies have
analyzed the effect of intersectoral distortions on welfare (Święcki, 2017), and others the effect of
trade on informality (Dix Carneiro et al., 2018; McCaig and Pavcnik, 2018; McMillan and McCaig,
2019). While this literature focuses on trade reforms that aﬀect labor demand, my paper examines
the impact of commuting and urban trade on aggregate productivity.
   Other studies, such as Moreno-Monroy and Posada (2018) and Suárez et al. (2016), have also
explored the relationship between commuting and informality. They argue that the high commuting
cost to a formal job faced by a large part of the population increases informality rates in developing
countries. My paper investigates this relationship by providing empirical evidence on the relationship
between informality and transit infrastructure. Moreover, through a quantitative model, it measures
the economic impact of infrastructure on allocative eﬃciency by analyzing factor reallocation to the
formal sector.
   The rest of the paper is organized as follows. Section 2 introduces the setting of my study in
Mexico City and describes the transit shock. Section 3 presents the reduced-form evidence of the
eﬀect of commuting on informality. Section 4 develops an urban quantitative model with multi-
ple sectors and intersectoral distortions. Section 5 estimates the main parameters of the model.
Section 6 quantifies and decomposes the welfare gains from transit improvements and runs other
counterfactuals. Section 7 concludes.




2         Institutional Context

2.1        Transit System

In the second half of the twentieth century, Mexico City had severe public transport problems, with
congested main roads and highways, particularly in the downtown area. In 1967, the Government
established a decentralized public oﬃce to build and operate a rapid transit system of underground
trains to facilitate public transportation in Mexico City. Two years later, on September 4, 1969, the
Government inaugurated the ﬁrst line. Today, the system has grown into 12 lines with 195 stations,
for a total length of 128.4 miles. The subway is the largest in Latin America and the second-largest
system in North America after the New York City Subway.
         The Plan Maestro 1985-2010 guided the expansion of the subway. It set the mobility goals
that the transport system needed to satisfy over the long run, based on best practices in urban
development and the operational constraints of the project. The Plan Maestro 1985-2010 underwent
some modiﬁcations from what the Government had initially planned. For example, Line B was
originally Line 10 and experienced extensive changes (Ramírez et al., 2017). These modiﬁcations
responded mainly to changing demand patterns for transportation in Mexico City, which forced the
Government to redesign some lines. Part of my empirical strategy is to compare the unplanned
modiﬁcations to the subway lines with the original and un-executed plans.
         In my empirical strategy, I exploit the construction of line B. This line had the distinct feature
of connecting informal workers in remote areas with jobs in the central business district (CBD) of
Mexico City. It was inaugurated in 2000, and most of it was initially planned as part of Plan Maestro
1985, which reduces potential endogeneity concerns between the opening of the new stations and
local demand/supply shocks. Moreover, the construction of the line also experienced multiple delays
given changes in the regulatory framework and the 1994 ﬁnancial crisis.11 The line is approximately
20 km long and has 21 stations. It connects the metropolitan area of the city with some adjacent
municipalities in the State of Mexico, such as Ecatepec de Morelos and Ciudad Nezahualcóyotl. These areas
are characterized by high poverty rates, low education, and high informality rates.12 As a result,
line B has the distinct feature of connecting informal workers with formal jobs. To date, it is the line
with the fourth-highest number of passengers in the network. The total cost of this line, including
the net present value of service operations, maintenance, and other overheads, was approximately
USD 2,900 million (in 2014 dollars), which represents 0.7% of the total GDP of Mexico City.
         Figure 1 depicts a map of the Mexico City subway system in 2000, highlighting the lines that
I use in my empirical strategy. Line B (purple) connects the northeastern area, including locations
in the State of Mexico, with the center of the city. I also use line C and line 12 for robustness
checks. Line C (green) was planned as a feeder line in the early 2000s, similar to line B; however,
    11
     The initial plan of the Government was to ﬁnish the line in 1997. However, they ﬁnished the construction of the
entire line in 2002.
  12
     In the Appendix, I relate census tract characteristics before the shock to line B to show this result.



the Government never constructed it. Line 12 (red) is the newest subway line in Mexico City and
was opened in 2012.


2.2      Informality

Following Busso et al. (2012), Kanbur (2009), and Levy (2018), I use two definitions of informality.
The ﬁrst is the standard deﬁnition and is based on whether ﬁrms comply with labor regulations. A
worker is deﬁned as informal if the ﬁrm does not pay social security taxes.13 These workers can be
salaried or non-salaried workers. The second deﬁnition of informality covers self-employed workers
and family members that work in a household business. The latter deﬁnition is a more restrictive
one, as it includes only the non-salaried workers of the ﬁrst group.14
       As in most developing countries, informality in Mexico is a signiﬁcant problem. It aﬀects 57%
of the total workforce and 78% of ﬁrms (INEGI). Figure A1 in the Online Appendix compares
informality rates (using the standard deﬁnition) in countries in Latin America and the Caribbean
to the average of the OECD.
       Informality rates in the entire region are very high. The average across the region is 50%, which
is much higher than the OECD average of 17%. Relative to other countries in the region, Mexico
has one of the highest informality rates, and the diﬀerence is even more signiﬁcant when we compare
Mexico to other countries with a similar income level, such as Argentina or Colombia.15
       The presence of the informal sector and the fact that informal ﬁrms avoid paying taxes create
a labor wedge across establishments. According to recent estimates, a ﬁrm that fully complies with
salary regulations is expected to pay social security taxes amounting to 18%-33% of a worker’s wage
(Busso et al., 2012; Levy, 2018) and 20% in sales taxes. These wedges create distortions across firms
that decrease welfare and TFP. Figure A2 in the Online Appendix plots the size and productivity
distribution of diﬀerent deﬁnitions of formal and informal ﬁrms in the Mexican context. Informal
ﬁrms are smaller and less productive than formal ﬁrms. However, due to the presence of labor
wedges (social security taxes), informal ﬁrms are larger and formal ﬁrms smaller, relative to a social
optimum.16 As a result, reallocating workers from the informal to the formal sector may lead to
productivity gains that impact welfare. Diﬀerent studies have examined the gains from removing
the informal sector in Mexico, ﬁnding that TFP would increase by approximately 200% in a world
without these distortions (Busso et al., 2012).
       In the next section, I show how informality rates are unequally distributed across the city. On
the one hand, most formal ﬁrms are usually located in the center. On the other, informal workers
usually live in the periphery and have poor access to formal employment.
  13
      Social security beneﬁts include health care, savings for retirement, social beneﬁts for recreation, and invalidity
allowances.
   14
      The second group is a subset of the ﬁrst group.
   15
      I do not observe the second deﬁnition of informality in other countries.
   16
      Busso et al. (2012) and Levy (2018) study the formal vs. informal sector in the Mexican context, and show that
wedges are larger for formal ﬁrms.



3         Data and Motivating Findings

3.1        Data

My primary unit of observation is the urban census tract (Area Geoestadística Básica in the Mexican
micro-data). I use a sample of approximately 3,500 census tracts from 116 diﬀerent neighborhoods
and 24 diﬀerent municipalities, 16 municipalities of which are in Mexico City and 8 of which are
adjacent municipalities from the State of Mexico.
         The ﬁrst source of information is standard GIS data on the location of the transportation network
and the new transit subway lines. I also use data on roads and highways in Mexico City to calculate
commuting times for diﬀerent transportation modes using the network analysis toolkit from ArcMap.
By merging these datasets, I can estimate commuting/trade costs before and after the transit shock
using a weighted average of travel times across the diﬀerent transportation modes.
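The weighted average of travel times described above can be illustrated with a minimal sketch. The mode shares and travel times below are hypothetical placeholders, not values from the paper, and the function name is introduced only for illustration.

```python
def weighted_commute_time(times_by_mode, weights_by_mode):
    """Weighted average of travel times (in minutes) across transportation modes."""
    total_weight = sum(weights_by_mode[m] for m in times_by_mode)
    if total_weight <= 0:
        raise ValueError("mode weights must sum to a positive number")
    weighted_sum = sum(times_by_mode[m] * weights_by_mode[m] for m in times_by_mode)
    return weighted_sum / total_weight


# Hypothetical travel times (minutes) for one origin-destination pair.
before = {"car": 50.0, "bus": 80.0, "subway": 90.0, "walk": 200.0}
# After the transit shock only the subway time changes (assumed value).
after = dict(before, subway=45.0)
# Assumed mode weights; the paper's actual weights are not reproduced here.
weights = {"car": 0.3, "bus": 0.4, "subway": 0.2, "walk": 0.1}

t_before = weighted_commute_time(before, weights)
t_after = weighted_commute_time(after, weights)
print(round(t_before, 2), round(t_after, 2))
```

In this sketch only the subway entry changes between the two periods, so the fall in the average cost is the subway weight times the reduction in subway travel time.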
         The second source of data is the Mexican Economic Censuses collected by INEGI. This is an
establishment level data set that provides standard information such as sales, value added, number
of workers, salaried workers, social security, and other outcomes. This census is carried out every
ﬁve years starting in 1994. I am able to deﬁne the informal sector at the ﬁrm level using social
security payments as discussed in Section 2. I categorize ﬁrms and workers in four diﬀerent groups
based on labor market regulations. I also calibrate wedges for each location and sector using wage
bill, sales, and social security payments.17
         The third source of information is the Mexican Population Census. This census is carried out
every ten years, and INEGI provided me with the data since 2000. With this information, I am
able to calculate the number of informal, formal, and total residents in each location. In 2000, the
Population Census also reported other variables such as household income and job characteristics
the week before the census interview. Moreover, I use the 2015 Intercensal Survey that provides
information on the workplace, residence, and transportation mode of formal and informal workers
at the municipality/locality level. This data allows me to observe commuting ﬂows in Mexico City
for each sector. I also use the 2017 Origin-Destination Survey collected in the commuting zone area
of Mexico City. I use this data for two purposes. First, I infer trade ﬂows across the city using
trips to restaurants and other types of shops at diﬀerent hours of the day. Second, I discuss some
motivational ﬁndings on commuting patterns.
         Finally, I complement my results with standard household survey data from the Encuesta Na-
cional de Ocupación y Empleo (ENOE). I calibrate some of the parameters of the model using this
data.
    17
    For the Economic Census, I observe data for periods before and after the transit shock, which allows me to test
for parallel trends in my main speciﬁcation.




3.2     Empirical Facts

In this section, I discuss three empirical ﬁndings that show a negative relationship between informal-
ity rates and the accessibility of formal jobs in Mexico City: 1) informal workers are more sensitive
to commuting costs and spend less time commuting; 2) informal workers are located in areas in
which they have poor access to formal employment; and 3) informality rates decrease with transit
improvements that connect informal workers to formal employment.


3.2.1    Cross-sectional Variation

Finding 1: Informal workers spend less time commuting and work closer to their home relative to
formal workers.

To reach the ﬁrst ﬁnding, I use the 2015 Intercensal Survey. With this data, I observe the residence
and workplace and average commuting time of each worker at the municipality level. Exploiting
cross-sectional variation, I compare the average commuting time and the workplace decisions of
informal vs. formal workers. I adopt the standard deﬁnition of informality based on the contractual
relationship of the worker. I also restrict the sample to individuals who worked the week before
the census interview. I run the following linear probability model to test whether informal workers
spend less time commuting:


$$y_i = \beta_0 + \beta_1 \text{Informal}_i + \gamma X_i + \gamma_{l(i)} + \gamma_{n(i)} + \gamma_{m(i)} + \epsilon_i, \tag{3.1}$$

where $y_i$ is a dummy variable that takes the value of 1 if individual $i$ commutes to a different
municipality than the one in which he/she resides, whether he/she works in the central business
district (CBD) of Mexico City, or whether their average commuting time is within some window of
time (i.e., 16 to 30 minutes); $X_i$ is a vector of individual characteristics that includes age, education,
gender, relationship to head of household, and a dummy variable indicating whether the individual
has an African or indigenous background; $\gamma_{l(i)}$ and $\gamma_{n(i)}$ are origin and destination fixed effects;
$\gamma_{m(i)}$ is a transportation mode fixed effect to compare informal vs. formal workers that use the same
transportation mode; and $\epsilon_i$ is the error term of the regression.
   Figure 2 depicts the point estimate and conﬁdence interval of a linear probability model. I
relate the probability that the average commuting time of a worker is within some window of time
with a dummy variable that takes a value of 1 if the worker is informal. As the ﬁgure shows,
informal workers spend less time commuting than formal workers. For instance, the ﬁrst bar shows
that informal workers are more likely to work from their home relative to formal workers by 13
percentage points. Similarly, informal workers are more likely than formal workers to spend 15
minutes commuting. On the other hand, formal workers are more likely to spend 30, 60, or 120
minutes commuting.
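To illustrate how the coefficient on the informality dummy in a linear probability model is read, the following minimal sketch drops the controls and fixed effects of equation 3.1 and uses made-up data. In this stripped-down case the OLS slope equals the difference between the two groups in the share of workers with the outcome. The variable names and the data are hypothetical.

```python
def ols_intercept_slope(x, y):
    """Bivariate OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


# Hypothetical sample: informal = 1 for informal workers, 0 for formal workers.
informal = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical outcome: 1 if the commute falls in a short time window, 0 otherwise.
short_commute = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]

beta_0, beta_1 = ols_intercept_slope(informal, short_commute)

# Group shares, computed directly for comparison with the OLS coefficients.
share_informal = sum(y for x, y in zip(informal, short_commute) if x == 1) / informal.count(1)
share_formal = sum(y for x, y in zip(informal, short_commute) if x == 0) / informal.count(0)
print(beta_1, share_informal - share_formal)
```

The intercept recovers the share among formal workers, and the slope recovers the gap in percentage-point terms, which is how the bars in Figure 2 are interpreted in the text.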
   To provide more evidence of this result, table B1 in the Online Appendix reports the results for


the dummy variables of whether the worker commutes to another municipality and whether he/she
works in Mexico City. The results imply that informal workers spend less time commuting. For
instance, the probability of commuting to a different municipality decreases, on average, by between
8.0 and 25.0 percentage points for informal workers. Similarly, informal workers are between 4.0 and
9.0 percentage points less likely to work in the CBD. In the fourth column, I show that differences in
transportation modes are not driving these eﬀects.
      Overall, the results from Table B1 and Figure 2 suggest that informal workers spend less time
commuting. One potential interpretation of this result is that informal jobs are easier to substitute
across locations than formal ones.18

Finding 2: Most formal jobs are located in the central areas of the city, while most informal workers
reside in the outskirts.

The second ﬁnding is that most formal jobs are available in the center and west of the city, while
informal workers reside in other, less connected areas. As a consequence, workers that cannot afford
the high rents in the center of the city live in outlying areas with poor access to formal jobs.
      Figure 3 presents a heat map of informality rates in Mexico City and adjacent municipalities in
the State of Mexico in terms of jobs and the residence of workers. Panel A in ﬁgure A10 in the
Appendix shows that the west and the center of Mexico City have the highest level of economic
activity.19 Combining these two ﬁgures, we see that informality rates are lower in the west and the
center of the city than in the east and the periphery of the city. This suggests that workers who live
in remote locations usually have poorer access to formal employment.


3.2.2      Diﬀerence-in-Diﬀerences Speciﬁcation

Finding 3: Informality rates decline with transit improvements that improve market access of formal
employment to informal workers.

I now exploit the construction of line B of the subway in Mexico City by estimating a series of
diﬀerence-in-diﬀerences speciﬁcations. I compare locations close to the new subway line with loca-
tions in the rest of Mexico City and test whether those that improved their market access experienced
a change in informality rates after the transit shock while controlling for initial characteristics. One
feature of line B is that it connects remote locations in the State of Mexico, close to Ecatepec de
Morelos, with the city’s center. The identiﬁcation assumption is that the opening of the new stations
is uncorrelated with local demand/supply shocks. The fact that most of the line was planned decades
earlier makes this assumption plausible. Moreover, since the construction of infrastructure may be
endogenous (Redding and Turner, 2015), I include a set of covariates as controls to compare similar
areas. Another potential concern for identification is sorting, that is, a change in the composition of
residents toward those who prefer to work in the formal sector. I show in the next section that household characteristics
 18
      This interpretation is later corroborated by estimating gravity equations.
 19
      Panel B in ﬁgure A10 shows that the labor wedge in these locations is higher.


are not correlated with the opening of line B. Furthermore, I also estimate lower bounds of the eﬀect
using retrospective questions, and in the quantitative framework, I consider this channel by allowing
migration within the city. I use both jobs’ and workers’ informality rates as dependent variables.
    I test ﬁrst for changes in informality in terms of the locations in which the workers live. I use
data from the Population Censuses and estimate the following speciﬁcation relating the transit shock
to the change in the ratio between formal and informal workers:


$$\Delta\left(\ln L_{iF} - \ln L_{iI}\right) = \beta T_i + \gamma X_i + \delta_{s(i)} + \epsilon_i, \tag{3.2}$$

where $L_{is}$ is the number of individuals that live in census tract $i$ and work in sector $s$; $T_i$ is one of
four different treatment variables: log distance in meters, log distance in walking minutes using the
network of roads, a dummy variable indicating whether the closest station is within the 10th percentile
of the Euclidean distance, and a dummy variable indicating whether the closest station is within 25
minutes; $\delta_{s(i)}$ are state or municipality fixed effects;20 and $X_i$ is a vector of census-tract characteristics
that includes distance controls such as the area in square kilometers, the distance to other public
transit stations, a central business district dummy variable, and baseline-year productivity measures
(value added per worker and the number of firms) that capture how good the location is in terms of
jobs. This equation relates the transit shock to the change in the log of the ratio between formal and
informal workers.21 I estimate equation 3.2 for the pool of workers and for different groups based
on skills.22
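As an illustration of how the dependent variable in equation 3.2 is built, the sketch below computes the change in the log formal-to-informal ratio for a handful of made-up census tracts and compares treated and untreated tracts. It omits the controls and fixed effects, so it is only a simplified stand-in for the regression; all counts and names are hypothetical.

```python
import math


def change_in_log_formal_ratio(formal_pre, informal_pre, formal_post, informal_post):
    """Delta(ln L_F - ln L_I) for one census tract between two census years."""
    post = math.log(formal_post) - math.log(informal_post)
    pre = math.log(formal_pre) - math.log(informal_pre)
    return post - pre


def mean(values):
    return sum(values) / len(values)


# Hypothetical tracts: (formal_pre, informal_pre, formal_post, informal_post, treated).
tracts = [
    (400, 600, 460, 580, 1),
    (300, 500, 330, 490, 1),
    (500, 500, 510, 505, 0),
    (350, 450, 355, 452, 0),
]

dy_treated = [change_in_log_formal_ratio(*t[:4]) for t in tracts if t[4] == 1]
dy_control = [change_in_log_formal_ratio(*t[:4]) for t in tracts if t[4] == 0]

# With a dummy treatment and no controls, the OLS coefficient on T_i is the
# difference in the mean change between treated and untreated tracts.
beta_hat = mean(dy_treated) - mean(dy_control)
print(round(beta_hat, 4))
```

Because the outcome is a change in logs, a coefficient of 0.05 would be read as an increase of roughly 5% in the formal-to-informal ratio in treated tracts relative to the comparison group.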
    Table 1 reports the results for diﬀerent speciﬁcations of equation 3.2, while Figure A3 in the
Online Appendix depicts the three point estimates of my preferred specification for the pool of
all workers, low-skilled workers, and high-skilled workers, respectively. Overall, the results imply
that locations close to the new subway line experienced a decrease in workers’ informality rates.
In particular, the ratio of formal to informal individuals increased between 3.0% and 6.9% after
the shock. These results are robust to diﬀerent speciﬁcations, for example, to the use of diﬀerent
deﬁnitions of the treatment variable or to the use of diﬀerent sets of ﬁxed eﬀects or controls. In
addition, in panels C and D, I control for the change in workers’ composition in terms of skills and
report the results only for low-skilled workers. The estimates are very similar to the ones found for
the entire pool of workers. For instance, the ratio between formal and informal low-skilled workers
increased on average between 4.0% and 7.1%. Moreover, in panels E and F, I report the results
restricting the sample to the areas not located in the CBD of Mexico City. The eﬀects should be
larger in these locations since more informal workers live in these areas. Overall, I ﬁnd larger eﬀects
for this speciﬁcation; the ratio between formal and informal workers increased by almost 10% in
these areas.
  20
     For the municipality ﬁxed eﬀects speciﬁcations, I classify locations in the State of Mexico into four diﬀerent
groups: northwestern, northeastern, west central, and east central for a total of 20 municipalities.
  21
     Equation 3.2 corresponds to a structural relationship that I will derive from the model in section 5.
  22
     One caveat of this speciﬁcation is that I cannot test for parallel trends due to data constraints because I cannot
observe the location of informal/formal residents before the 2000 Census.


    Moreover, in table 2 I estimate line B’s eﬀect on the overall log number of individuals and
disentangle the eﬀect from the previous regression between formal and informal workers. In panel
A, I report the results for the pool of workers, while panels B and C report the results for the number
of formal and informal workers. The dependent variable in the ﬁrst and third columns is the log
number of workers. On the one hand, the point estimates suggest that the eﬀect is very small on
the number of individuals. For instance, it is only 1.7% in the case of the pool of workers, and 2.2%
for low-skilled. On the other hand, the second and fourth columns show the estimates for the log
number of formal workers. The results suggest that the locations aﬀected by the shock experienced
an increase in formal workers between 3% and 6%. This eﬀect is larger than the estimate on the
number of individuals. Finally, the third and sixth columns report the results for the log number of
informal workers. In the case of municipality ﬁxed eﬀects, the number of informal workers decreased
by around 3% in the locations that experienced the shock.
    There are two main potential concerns about the interpretation of this effect. First, line B may be
endogenous; I address this concern in section 3.2.3 by comparing line B with planned lines. Second,
worker sorting may be explaining the results. I address this concern in section 3.2.4.
    Next, I use data from the Economic Censuses and test whether the shock also generated an
indirect eﬀect aﬀecting the “treated” location in terms of jobs. I estimate the following speciﬁcation
to study whether areas close to the new subway lines experienced changes in workers’ informality
rates:


$$y_{i,t} = \sum_{\tau = 1994} \beta_\tau T_i + \delta_i + \delta_{s(i),t} + \gamma_t X_i + \epsilon_{i,t}, \tag{3.3}$$

where $y_{i,t}$ is one of the outcomes of interest of census tract $i$ at moment $t$. I estimate equation 3.3
for the following outcomes: the share of informal workers and the share of informal firms. $T_i$ is one
of the four different treatment variables; $\delta_i$ are census tract fixed effects; $\delta_{s(i),t}$ are state-time
or municipality-time specific trends; $\gamma_t \cdot X_i$ are census-tract characteristics-time-specific trends that
include the distance controls; and $\epsilon_{i,t}$ is the error term of the regression. The coefficients of interest are
the parameters $\beta_\tau$, and the baseline year is 1994. Since the line was built in 2000, the placebo for
parallel trends corresponds to 1999. I compute the standard errors with clusters at the census-tract
level.
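The logic of the year-specific coefficients can be sketched with a simplified example. In a balanced panel with a binary treatment and no additional controls, the coefficients of a regression with tract and year fixed effects coincide with differences in group means relative to the baseline year, which is what the code below computes. The data are invented, and the sketch leaves out the municipality-time trends and covariate trends of equation 3.3.

```python
def event_study_coefficients(panel, treated, base_year):
    """Difference-in-means estimates of year-specific treatment coefficients.

    panel: {tract: {year: outcome}}, balanced across tracts and years.
    treated: {tract: bool}.
    """
    def group_mean(flag, year):
        values = [panel[i][year] for i in panel if treated[i] == flag]
        return sum(values) / len(values)

    years = sorted(next(iter(panel.values())))
    # Treated-minus-control gap in each census year.
    gap = {t: group_mean(True, t) - group_mean(False, t) for t in years}
    # Each coefficient is the gap in year t minus the gap in the baseline year.
    return {t: gap[t] - gap[base_year] for t in years if t != base_year}


# Hypothetical shares of informal workers by tract and census year.
panel = {
    "A": {1994: 0.60, 1999: 0.61, 2004: 0.56, 2009: 0.55},
    "B": {1994: 0.50, 1999: 0.50, 2004: 0.47, 2009: 0.46},
    "C": {1994: 0.55, 1999: 0.56, 2004: 0.56, 2009: 0.55},
    "D": {1994: 0.45, 1999: 0.45, 2004: 0.46, 2009: 0.45},
}
treated = {"A": True, "B": True, "C": False, "D": False}

betas = event_study_coefficients(panel, treated, base_year=1994)
for year, beta in betas.items():
    print(year, round(beta, 3))
```

In this made-up example the 1999 coefficient is zero, which is the pattern a parallel-trends placebo looks for, while the post-2000 coefficients are negative.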
    Figure 4 and table B4 in the Online Appendix report the point estimates for the main outcome,
the share of informal workers. I ﬁnd that workers’ informality rates decrease in locations near line
B after the transit shock. I also ﬁnd evidence of parallel trends since the point estimate is small
and not signiﬁcant in 1999. On average, informality rates decrease between 2.0 and 4.0 percentage
points in locations that experienced the shock. The results are similar using the standard deﬁnition
of informality or a stricter deﬁnition of informality that considers only informal and non-salaried
workers that do not have an actual contract with the establishment. Moreover, these eﬀects are
robust to the use of the Euclidean distance, the walking distance using the network of roads, or


dummy variables indicating whether locations are close to the new stations within some range (i.e.,
2100 meters or 25 minutes). Furthermore, in columns ﬁve to eight, I include municipality-time ﬁxed
eﬀects and the results hold, suggesting that even after I compare locations to those within the same
municipality, census tracts closer to new stations experienced a change in informality rates after the
shock.23 Overall, the results suggest that informality rates decrease between 5.0 and 8.0 percent
after the transit shock.
       Table B5 in the Online Appendix reports the results for the share of informal ﬁrms. The results
are similar to the ones for the share of informal workers. In particular, after the transit shock,
informality rates decrease between 1 and 2.5 pp., which corresponds to a 2 to 3 percent decrease in
informality rates when the mean in 1999 is used as a baseline. There are some issues with parallel
trends, since the effects for 1999 are small but significant.24
       In the next two sections, I test the robustness of my results and show that in terms of observed
covariates, there is a negligible change in the composition of households, which is a potential concern
of my identification strategy. I also construct bounds using retrospective questions that ask
respondents where they were living previously.


3.2.3      Robustness Checks

For the robustness checks, I compare locations close to line B of the subway with locations near
subway expansions that the Government planned to build in the 1980s or actually built years later.
In particular, panel b of Figure 1 plots a map of Mexico City highlighting the three lines that I
will compare in this section: line B, which is the infrastructure project that I study; line C,
a feeder line, similar to line B, that was to connect northwestern locations in the State of Mexico
with the center of Mexico City, but was never built; and line 12, which is the latest subway line,
opened in 2012.
       I estimate the same diﬀerence-in-diﬀerences speciﬁcation from equations 3.3 and 3.2. The only
diﬀerence is that the treatment variable corresponds to a dummy variable indicating whether the
centroid of the census tract is within some buﬀer zone of line B (i.e., 1500 meters), and similarly,
the control group consists of locations within some buﬀer zone of line C and/or line 12. I run these
regressions for four diﬀerent buﬀers: 1500, 2000, 2500, and 3000 meters.
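The buffer-based assignment can be sketched as follows. The code assumes that tract centroids and line geometries are already in a projected coordinate system measured in meters, and it approximates each subway line by a polyline. The coordinates, line shapes, and function names are hypothetical and serve only to show the classification rule.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Minimum distance from p to a polyline given as a list of vertices."""
    return min(point_segment_distance(p, line[k], line[k + 1]) for k in range(len(line) - 1))


def assign_group(centroid, treated_line, comparison_line, buffer_m):
    """'treated' near the treated line, 'control' near the comparison line, else None."""
    if distance_to_line(centroid, treated_line) <= buffer_m:
        return "treated"  # a tract inside both buffers is counted as treated here
    if distance_to_line(centroid, comparison_line) <= buffer_m:
        return "control"
    return None


# Hypothetical geometries (meters): a built line and a planned, never-built line.
line_b = [(0.0, 0.0), (4000.0, 0.0), (8000.0, 3000.0)]
line_c = [(0.0, 10000.0), (6000.0, 10000.0)]
centroids = {"t1": (2000.0, 1200.0), "t2": (2000.0, 2600.0), "t3": (3000.0, 9000.0)}

for buffer_m in (1500, 2000, 2500, 3000):
    print(buffer_m, {k: assign_group(c, line_b, line_c, buffer_m) for k, c in centroids.items()})
```

Widening the buffer adds tracts to both groups, which is why the estimates are reported for several buffer sizes.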
       Figure A4 in the Online Appendix depicts the point estimates for the log of the ratio between
formal and informal workers from equation 3.2. I ﬁnd a similar pattern to the previous results.
The log of the ratio between formal and informal workers increases by approximately 10% when I
compare treated locations with census tracts close to the other two lines. As shown in the graph,
this ﬁnding is robust to the use of diﬀerent buﬀer zones and is very stable.
  23
      I also ﬁnd similar results restricting the sample to census tracts with a centroid that is farther than 600 meters
from one of the new stations. I also show that the results are similar if the dummy variable is constructed using a
walking range of 20 minutes (Figure A11).
   24
      One reason that may explain the eﬀects in 1999 is that the Government announced the construction of the line
in 1994.


       In addition, Figure A5 in the Online Appendix depicts the main result from these regressions. I
plot the coeﬃcients for the most restricted deﬁnition of workers’ informality. The main ﬁnding is that
there is a negative relationship between informality rates and transit improvements when locations
that experienced the shock are compared with census tracts close to lines that were planned in
the 1980s. For instance, informality rates for workers decrease on average between 4.0 and 11.0
percentage points, which is a more signiﬁcant eﬀect than the one found previously. This eﬀect
corresponds to a decrease of approximately 15%, using the mean of the control group before the
shock. In most of the specifications, I also find parallel trends, suggesting that treated locations
experienced changes in informality only after the shock.


3.2.4      Households’ Composition and Lower Bound Eﬀects

A potential concern regarding the identiﬁcation strategy from the previous section is that locations
close to the new subway line might experience a change in the composition of households due to
worker sorting.25 For example, high-skilled workers that would prefer to work in the formal sector
might migrate to these census tracts and, as a result, there would be a decrease in informality rates
that could explain my ﬁndings. Ideally, I would deal with this issue by using a multi-year panel of
workers before and after the shock. Unfortunately, no such panel is available.
       I deal with this concern by comparing household characteristics before and after the shock. The
goal is to show that at least in terms of observable covariates, there was no change in households’
composition. For that purpose, I run the same speciﬁcation in equation 3.2 on household character-
istics, such as the high-skilled share of workers on the left-hand side.
       Table 3 reports the results, including the full set of controls. On average, I find that household
characteristics in locations close to line B were not aﬀected by the shock relative to other areas in
Mexico City. For example, the point estimates for the share of high-skilled workers, the number of
kids, or the household size are not signiﬁcant. On the other hand, the coeﬃcients that are signiﬁcant
are very small. For example, the student share’s point estimates imply that the locations aﬀected
by the shock experienced a slight increase of 0.4 percentage points. Overall, this ﬁnding suggests
that at least in terms of observable characteristics, there is no change in households’ composition
due to the transit shock that can bias my estimates.26
       Furthermore, I also estimate lower bounds of the effect using retrospective questions. In particular,
the census asks in which state the person was previously living. According to
  25
     In the model, I allow for changes in terms of unobserved characteristics, since the model allows for migration
within the city. However, the model assumes only one type of worker and, therefore, I also analyze changes in
households' composition in terms of observed characteristics.
  26
     This result corroborates the ﬁndings of other papers in the Mexican context. For example, Gonzalez-Navarro and
Quintana-Domeque (2016) exploit a random allocation of street asphalting in peripheral neighborhoods in Veracruz.
The authors follow individuals for two years and ﬁnd a negligible reallocation of households across locations in the city.
Similarly, Hernández-Cortés et al. (2021) ﬁnd negligible reallocation eﬀects exploiting subways and BRT expansions in
Mexico City. Moreover, it is related to other papers that have shown that there are high migration costs in developing
countries.



this variable, 1.27% of the population living in the treated locations resided in Mexico City before.
Around 971 thousand people were living in this area before the shock; this means that approxi-
mately 12,500 people moved from Mexico City to the treated municipalities. Then, I can estimate a
lower-bound eﬀect under the extreme assumption that all the migrants were formal. In particular,
removing these people from the speciﬁcation yields a lower bound of the impact. Figure 5 plots the
results of this lower bound for different values of the share of the population that moved and decided
to live in the treated areas afterward. For the value of 1.27%, there is a lower bound of 4.6%. Then, even under
this extreme scenario, the ratio of formal to informal employment increases in the locations that
experience the transit shock.
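The arithmetic behind the lower bound can be sketched as follows. The migrant count uses the two figures quoted above (1.27% and roughly 971 thousand people), whose product is on the order of 12 thousand people. The formal and informal counts are hypothetical, and the adjustment is an aggregate illustration of the "all migrants were formal" assumption rather than the regression-based bound reported in Figure 5.

```python
import math

population_before = 971_000      # people living in the treated area before the shock
share_from_mexico_city = 0.0127  # share reported as having lived in Mexico City before
migrants = share_from_mexico_city * population_before


def log_ratio_change(formal_pre, informal_pre, formal_post, informal_post, removed_formal=0.0):
    """Change in ln(L_F / L_I), optionally removing migrants assumed to be formal."""
    adjusted_formal_post = formal_post - removed_formal
    return math.log(adjusted_formal_post / informal_post) - math.log(formal_pre / informal_pre)


# Hypothetical aggregate counts for the treated area (not taken from the paper).
formal_pre, informal_pre = 300_000, 500_000
formal_post, informal_post = 340_000, 500_000

unadjusted = log_ratio_change(formal_pre, informal_pre, formal_post, informal_post)
lower_bound = log_ratio_change(formal_pre, informal_pre, formal_post, informal_post,
                               removed_formal=migrants)
print(round(migrants), round(unadjusted, 4), round(lower_bound, 4))
```

Removing the migrants from the post-shock formal count can only reduce the measured increase, which is why the resulting figure is a lower bound.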


4         Model

In this section, I present a quantitative model to assess ﬁrst-order aggregate welfare eﬀects of transit
infrastructure on allocation. The model is based on recent work by Tsivanidis (2019), Monte et al.
(2018), Heblich et al. (2018), and Ahlfeldt et al. (2015). My model extends this framework by adding
intersectoral wedges and resource misallocation.
         The main theoretical result is a formula from a first-order approximation that decomposes the
total change in welfare after a transit shock into a "direct" effect term and an allocation term. The
allocation term can in turn be decomposed into a resource misallocation term and an agglomeration
externality term, giving three components in total. This formula is similar to the general result of
Baqaee and Farhi (2020) for productivity changes in general equilibrium (GE) models.
         In the model, I assume that there are three groups of agents in the economy: workers denoted
by L, house owners denoted by H , and commercial ﬂoor space owners denoted by Z .27


4.1        Preferences

There is a mass of $N$ locations in the economy that are indexed by $n$ and $i$. There is a mass of $L_L$
workers that operate in 2 sectors indexed by $s \in \{I, F\}$, where $I$ and $F$ represent the informal and
formal sectors, respectively. The utility function takes a standard Cobb-Douglas form. Consumers
obtain utility from a composite consumption good and housing. The utility function of worker $\omega$ is:

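$$U_{nis\omega} = \left(\frac{C_{nis\omega}}{\alpha}\right)^{\alpha} \left(\frac{H_{nis\omega}}{1-\alpha}\right)^{1-\alpha} \cdot d_{ni}^{-1} \cdot \epsilon_{nis\omega},$$

where $C$ is consumption, $H$ is housing, the parameter $\alpha$ is the expenditure share on the consumption
good, $d_{ni}$ is an iceberg commuting cost to move from location $n$ to $i$, and $\epsilon_{nis\omega}$ is an idiosyncratic shock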
    27
    The focus of the paper is eﬃciency. In the Appendix, I generalize the results to consider diﬀerent group of workers
such as high- and low-skilled workers. Intuitively, the results are isomorphic if preferences for the formal and informal
sector come from the scale parameters of Fréchet shocks, or if the commuting and labor supply elasticities diﬀer
between the two groups, and low-skilled workers prefer to work in the informal sector.




to worker ω . After solving the maximization problem, the indirect utility of worker ω living in
location n and working in sector s and location i is

$$V_{nis\omega} = \frac{w_{is}\, d_{ni}^{-1}\, \epsilon_{nis\omega}\, (1+\bar{t})}{P_n^{\alpha}\, r_n^{1-\alpha}}, \tag{4.1}$$

where $w_{is}$ is the wage per efficiency unit in location $i$ and sector $s$, $P_n$ is the price index of the
consumption good, $r_n$ is the rent for housing, and $\bar{t}$ is a proportional tax rebate from the Government.
In the Online Appendix, I show the results when the rebate is only given to formal workers. The
term $\epsilon_{nis\omega}$ is an idiosyncratic utility shock that is drawn from a nested Fréchet or extreme-value
type II distribution $H(\cdot)$,
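$$H(\epsilon) = \exp\left[-\sum_n B_n \left(\sum_s B_{ns} \left(\sum_i \epsilon_{nis}^{-\theta_s}\right)^{\kappa/\theta_s}\right)^{\eta/\kappa}\right], \quad \text{with } \eta < \kappa < \theta_s \;\; \forall s.$$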


Each worker receives a one-time shock and makes three decisions, one for each nest: 1) location to
live, 2) sector (formal or informal), and 3) workplace.28 In the Online Appendix, I derive the model
when the shock is to eﬃciency units instead of utility units. The parameters η, κ, and θs measure
productivity dispersion across locations, sectors, and workplaces respectively and capture the notion
of comparative advantage.29 On the other hand, the parameters Bn capture speciﬁc amenities that
attract residents to each location n. I assume that these parameters are ﬁxed over time.
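As a reading aid for equation 4.1, the step from the Cobb-Douglas utility to the indirect utility can be written out. The budget constraint below, in which income equals the wage scaled by the rebate, is an assumption that is not stated explicitly in the text but is consistent with equation 4.1.

```latex
\begin{align*}
\max_{C,\,H}\; & \left(\frac{C}{\alpha}\right)^{\alpha}
                \left(\frac{H}{1-\alpha}\right)^{1-\alpha}
                d_{ni}^{-1}\,\epsilon_{nis\omega}
  \quad \text{s.t.} \quad P_n C + r_n H = (1+\bar{t})\,w_{is}, \\[4pt]
C^{*} &= \frac{\alpha\,(1+\bar{t})\,w_{is}}{P_n},
  \qquad
H^{*} = \frac{(1-\alpha)\,(1+\bar{t})\,w_{is}}{r_n}, \\[4pt]
V_{nis\omega}
  &= \left(\frac{(1+\bar{t})\,w_{is}}{P_n}\right)^{\alpha}
     \left(\frac{(1+\bar{t})\,w_{is}}{r_n}\right)^{1-\alpha}
     d_{ni}^{-1}\,\epsilon_{nis\omega}
   = \frac{w_{is}\,d_{ni}^{-1}\,\epsilon_{nis\omega}\,(1+\bar{t})}
          {P_n^{\alpha}\,r_n^{1-\alpha}}.
\end{align*}
```

The normalization by $\alpha$ and $1-\alpha$ in the utility function is what removes the usual Cobb-Douglas constant from the indirect utility.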
    I allow the third parameter θs to diﬀer across sectors to capture the fact that productivity
diﬀerences across locations are larger in the formal sector, or in other words, that formal jobs are
more diﬃcult to substitute across locations than informal jobs. This parameter also represents the
labor supply elasticity with respect to commuting costs conditional on working in sector s. The
estimate θF < θI implies that workers in the informal sector are more sensitive to commuting
costs, and thus, prefer to work close to their residence, as documented in Section 3.
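The role of the parameter θs can be illustrated with a small numerical sketch of the workplace shares conditional on residence and sector, which are proportional to wages and commuting costs raised to θs and −θs. The wage, commuting cost, and θ values below are hypothetical and are not the paper's estimates.

```python
def workplace_shares(wages, commute_costs, theta):
    """Conditional workplace shares proportional to w_i^theta * d_ni^(-theta)."""
    weights = [w ** theta * d ** (-theta) for w, d in zip(wages, commute_costs)]
    total = sum(weights)
    return [x / total for x in weights]


# Two candidate workplaces for a given residence: the home location and a distant one.
wages = [1.0, 1.0]          # equal wages, to isolate the effect of commuting costs
commute_costs = [1.0, 1.5]  # iceberg costs: no cost at home, 50% cost for the distant job

formal = workplace_shares(wages, commute_costs, theta=3.0)    # assumed theta_F
informal = workplace_shares(wages, commute_costs, theta=5.0)  # assumed theta_I > theta_F

print([round(x, 3) for x in formal], [round(x, 3) for x in informal])
```

With equal wages, the sector with the larger θ concentrates more of its workers in the home location, which matches the pattern in Finding 1.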
    From the properties of the Fréchet distribution, the probability of living in location n and working
in (i, s) is

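$$\lambda_{nisL} = \underbrace{\frac{B_n P_n^{-\alpha\eta} r_n^{-(1-\alpha)\eta} W_n^{\eta}}{\sum_{n'} B_{n'} P_{n'}^{-\alpha\eta} r_{n'}^{-(1-\alpha)\eta} W_{n'}^{\eta}}}_{\lambda_{nL}} \cdot \underbrace{\frac{B_{ns} W_{ns|n}^{\kappa}}{\sum_{s'} B_{ns'} W_{ns'|n}^{\kappa}}}_{\lambda_{nsL|n}} \cdot \underbrace{\frac{w_{is}^{\theta_s} d_{ni}^{-\theta_s}}{\sum_{i'} w_{i's}^{\theta_s} d_{ni'}^{-\theta_s}}}_{\lambda_{nisL|ns}}, \tag{4.2}$$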

  28
     I am assuming that the idiosyncratic shock is to utility, but another possibility is to assume that the shock is to
earnings. From a welfare point of view this assumption does not have any implications. In the Appendix, I consider
a version of the model with Fréchet shocks to earnings and eﬃciency units.
  29
     Diﬀerent articles have assumed a similar structure to analyze the allocation of workers across sectors. For example,
Lagakos and Waugh (2013) study selection in the agricultural sector in developing countries using this kind of shock;
Hsieh et al. (2019) study the allocation of talent in the past 50 years across diﬀerent occupations in the US, and Galle
et al. (2017) study the distributional implications of trade given that workers have idiosyncratic productivities for
sectors.




where $W_n^{\kappa} = \sum_s W_{ns|n}^{\kappa}$ is a wage index from location n, and $W_{ns|n}^{\theta_s} = \sum_i w_{is}^{\theta_s} d_{ni}^{-\theta_s}$ is a wage index
from location n and sector s. This probability can be decomposed into three terms as in Monte
et al. (2018). First, there is the probability of living in n; second, the probability of working in s
conditional on living in n; and third, the probability of working in i conditional on living in n and
operating in sector s. Note that $\sum_i \lambda_{nis|ns} = 1$, $\sum_s \lambda_{ns|n} = 1$, and $\sum_n \lambda_n = 1$.
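To make the three-nest decomposition concrete, the following Python sketch evaluates the residence, sector, and workplace shares and multiplies them together. It is an illustration rather than the code used in the paper: the function name, the array layout, and all numerical values are hypothetical, and the sector amenities $B_{ns}$ are set to one for simplicity.

```python
import numpy as np

def choice_probabilities(w, d, B, P, r, theta, kappa, eta, alpha):
    """Nested choice probabilities following the structure of equation 4.2.

    w: (I, S) wages, d: (N, I) commuting costs, B, P, r: (N,) residence terms,
    theta: (S,) workplace dispersion by sector. Sector amenities B_ns are set
    to one, which is a simplification made only for this sketch.
    """
    # term[n, i, s] = w_is^theta_s * d_ni^(-theta_s)
    term = (w[None, :, :] ** theta) * (d[:, :, None] ** (-theta))
    W_ns_theta = term.sum(axis=1)                 # W_{ns|n}^{theta_s}, shape (N, S)
    lam_i = term / W_ns_theta[:, None, :]         # P(workplace i | n, s)

    W_ns_kappa = W_ns_theta ** (kappa / theta)    # W_{ns|n}^kappa
    W_n_kappa = W_ns_kappa.sum(axis=1)            # W_n^kappa, shape (N,)
    lam_s = W_ns_kappa / W_n_kappa[:, None]       # P(sector s | n)

    W_n_eta = W_n_kappa ** (eta / kappa)          # W_n^eta
    attract = B * P ** (-alpha * eta) * r ** (-(1 - alpha) * eta) * W_n_eta
    lam_n = attract / attract.sum()               # P(residence n)

    lam = lam_n[:, None, None] * lam_s[:, None, :] * lam_i
    return lam, lam_n, lam_s, lam_i

# Hypothetical inputs: 3 locations, 2 sectors (formal, informal).
rng = np.random.default_rng(0)
N, I, S = 3, 3, 2
w = rng.uniform(1.0, 2.0, size=(I, S))
d = np.exp(0.01 * rng.uniform(10.0, 60.0, size=(N, I)))
lam, lam_n, lam_s, lam_i = choice_probabilities(
    w, d, B=np.ones(N), P=np.ones(N), r=np.ones(N),
    theta=np.array([3.0, 6.0]), kappa=2.0, eta=1.5, alpha=0.7,
)
```

Each conditional share sums to one within its nest, so the joint probabilities over (n, i, s) sum to one as well.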
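       Using again the properties of the Fréchet distribution, I equate the expected ex-ante utility of a
worker to the following constant:

\[
\bar{U}_L \equiv \mathbb{E}\left[\max\, U_{nis}\,\epsilon_{nis}\right] = \left(\sum_n B_n P_n^{-\alpha\eta} r_n^{-(1-\alpha)\eta} W_n^{\eta}\right)^{\frac{1}{\eta}} \gamma_\eta, \tag{4.3}
\]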

where $\gamma_\eta$ is a constant term.30 Then, the total amount of labor $\tilde{L}_{is}$ hired by (i, s) is equal to the
amount supplied by all locations and is given by

\[
\tilde{L}_{is} = \sum_n \lambda_{nis} \cdot \bar{L}_L. \tag{4.4}
\]

Thus, the average income received by workers that reside in n is $\bar{y}_n \equiv \sum_{i,s} \lambda_{nis} w_{is}$.
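As a small numerical illustration of equation 4.4, the sketch below sums choice probabilities over residences and scales by the total mass of workers. The function name and the uniform probabilities are hypothetical and serve only to show the aggregation step.

```python
import numpy as np

def labor_supply(lam, L_bar):
    """Labor hired by each (i, s): sum of lambda_nis over residences n, times total workers."""
    return lam.sum(axis=0) * L_bar   # shape (I, S)

# Hypothetical case: 2 residences, 2 workplaces, 2 sectors, uniform probabilities.
lam_uniform = np.full((2, 2, 2), 1.0 / 8.0)
L_tilde = labor_supply(lam_uniform, L_bar=1000.0)
```

Because the probabilities sum to one, the labor allocated across all (i, s) pairs adds up to the total mass of workers.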


4.2      Production of the Composite Good

Similar to Miyauchi et al. (2020), preferences for the composite good take a standard CES form of
diﬀerent varieties x across sectors and locations.31 It is described by a two-nested CES structure.
In the ﬁrst nest, consumers choose between sectors, and in the second nest, they choose between
varieties j within each sector:32

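\[
C_n = \left(\sum_s C_{ns}^{\frac{\xi-1}{\xi}}\right)^{\frac{\xi}{\xi-1}}, \qquad
C_{ns} = \left(\sum_i \int_j x_{nisj}^{\frac{\sigma_s-1}{\sigma_s}}\, dj\right)^{\frac{\sigma_s}{\sigma_s-1}},
\]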

where the parameter ξ captures the elasticity of substitution across sectors and the parameters
σs capture the elasticity of substitution across varieties within sectors. Note that the lower nest
parameter varies across sectors, hence, agglomeration externalities diﬀer between the two sectors
generating an additional allocation eﬀect. In principle, we should expect σF < σI to capture
that trade ﬂows in the informal sector are more sensitive to trade costs and that agglomeration
externalities are larger in the formal sector. I will estimate these parameters by estimating gravity
equations. The price index Pn in location n, and the price indices for each sector Pns take the usual
  30
     The term γη = Γ(1 − 1/η ) and Γ(·) is the gamma function. This is the usual constant that arises after integrating
the pdf from the Fréchet distribution.
  31
     Recent work in the public finance literature has shown that consumers, especially at lower income levels, have
preferences for varieties in the informal sector (Bachas et al., 2020).
  32
     The CES preferences can be micro-founded using extreme value-type distributions as in the literature that has
studied the demand of heterogeneous consumers for a set of differentiated goods (Anderson and de Palma, 1992). For
example, Miyauchi et al. (2020) use this procedure.


CES functional form:

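\[
P_n = \left(\sum_s P_{ns}^{1-\xi}\right)^{\frac{1}{1-\xi}}, \qquad
P_{ns} = \left(\sum_i \int_j p_{nisj}^{1-\sigma_s}\, dj\right)^{\frac{1}{1-\sigma_s}}, \tag{4.5}
\]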

where pnisj is the price charged by ﬁrm j in (i, s) to consumers in n.
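The nested price index can be evaluated numerically as in the sketch below. It assumes, as an illustration, that all firms in (i, s) charge the same delivered price, so the integral over varieties collapses to the number of firms times a common price term. The function name and the numbers are hypothetical.

```python
import numpy as np

def price_indices(p, M, sigma, xi):
    """Two-nest CES price indices with symmetric firms within each (i, s).

    p: (N, I, S) delivered prices, M: (I, S) number of firms,
    sigma: (S,) within-sector elasticities, xi: cross-sector elasticity.
    """
    inner = (M[None, :, :] * p ** (1.0 - sigma)).sum(axis=1)        # (N, S)
    P_ns = inner ** (1.0 / (1.0 - sigma))                           # sector price index
    P_n = (P_ns ** (1.0 - xi)).sum(axis=1) ** (1.0 / (1.0 - xi))    # overall price index
    return P_n, P_ns

# Hypothetical case: one consumer location, one producer location, two sectors.
p = np.full((1, 1, 2), 2.0)
M = np.array([[4.0, 16.0]])
P_n, P_ns = price_indices(p, M, sigma=np.array([3.0, 5.0]), xi=2.0)
# A larger number of firms lowers the sector price index (love of variety).
_, P_ns_more_firms = price_indices(p, 2.0 * M, sigma=np.array([3.0, 5.0]), xi=2.0)
```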
       I model the production of each good and the market structure as in the new economic geography
literature (Helpman, 1995; Krugman, 1991). Firms compete monopolistically. To produce a variety
a ﬁrm must incur both a constant variable cost and a ﬁxed cost. Both costs use labor and commercial
ﬂoor space with the same factor intensity across ﬁrms, which implies that the production function
is homothetic. The variable cost varies with the productivity of location i and sector s, which is
represented by Ais . The total cost of producing xisj units of variety j in location i and sector s is:

\[
\Gamma_{isj} = \left(F_s + \frac{x_{isj}}{A_{is}}\right)(w_{is}[1+t_{isL}])^{\beta_s}(q_i[1+t_{isZ}])^{1-\beta_s}, \tag{4.6}
\]
where wis is the wage per eﬃciency unit in (i, s), qi is the price of commercial ﬂoor space, and Fs is
a ﬁxed cost that varies by sector to capture that the number of ﬁrms in the informal sector is larger.
In the case of commercial ﬂoor space, both sectors face the same price. Finally, I add exogenous
wedges represented by tisL and tisZ . These parameters represent taxes and subsidies in each sector
and location (e.g., payroll taxes), and they imply that the marginal revenue of labor is not equalized
across firms, which is a deviation from the optimum. Informal firms avoid paying these taxes, first generating
dispersion in TFPR and then lowering TFP. I model informality differently from recent
papers such as Ulyssea (2018) and Dix Carneiro et al. (2018).33 However, the model captures the main
differences between the formal and informal economy: differences in TFP, captured by the
parameter Ais , and differences in input intensity, captured by βs .
       Proﬁt maximization implies that the equilibrium price is the standard constant mark-up in trade
models over marginal cost. Firms also face iceberg trade costs τni to sell goods. In the empirical
analysis, I assume that these trade costs also change after the transit shock. The price charged by
ﬁrms in i to location n is

\[
p_{nisj} = \frac{\sigma_s}{\sigma_s-1}\,\frac{\tau_{ni}(w_{is}[1+t_{isL}])^{\beta_s}(q_i[1+t_{isZ}])^{1-\beta_s}}{A_{is}}. \tag{4.7}
\]
       The zero-proﬁt condition implies that the equilibrium output of each variety is constant across
ﬁrms that operate in the same location and sector and is given by


\[
x_{isj} = \bar{x}_{is} = A_{is} F_s (\sigma_s - 1). \tag{4.8}
\]
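The link between the markup rule and the zero-profit output level can be checked numerically. The sketch below is an illustration with hypothetical function names and parameter values: it evaluates the cost function in equation 4.6 and the factory-gate price implied by equation 4.7 (iceberg cost set to one), and confirms that profits vanish at the output level in equation 4.8.

```python
def unit_cost(w, q, t_L, t_Z, beta):
    """Cost of one input bundle, including the wedges on labor and floor space."""
    return (w * (1 + t_L)) ** beta * (q * (1 + t_Z)) ** (1 - beta)

def price(w, q, t_L, t_Z, beta, A, tau, sigma):
    """Constant markup over marginal cost, scaled by the iceberg trade cost."""
    return sigma / (sigma - 1) * tau * unit_cost(w, q, t_L, t_Z, beta) / A

def profit(x, w, q, t_L, t_Z, beta, A, F, sigma):
    """Revenue at the factory-gate price (tau = 1) minus total cost (F + x / A) * unit cost."""
    c = unit_cost(w, q, t_L, t_Z, beta)
    mill_price = price(w, q, t_L, t_Z, beta, A, 1.0, sigma)
    return mill_price * x - (F + x / A) * c

# Hypothetical parameter values.
params = dict(w=1.5, q=0.8, t_L=0.2, t_Z=0.1, beta=0.6, A=2.0, F=1.0, sigma=4.0)
x_bar = params["A"] * params["F"] * (params["sigma"] - 1)   # zero-profit output
profit_at_x_bar = profit(x_bar, **params)
profit_above = profit(2 * x_bar, **params)
```

Output above the zero-profit level would yield positive profits at the same price, which free entry rules out in equilibrium.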
  33
    In section D.4 of the Online Appendix, I consider a version of the model in which ﬁrms endogenously decide to
operate in the formal vs. informal sectors following the logic from these studies. Moreover, ﬁrms also determine the
location to operate in the city.




Aggregate payments to labor and commercial ﬂoor space, including taxes, are constant shares of the
total revenue in location i and sector s. These shares are captured by βs and 1 − βs respectively:34

\[
w_{is}(1+t_{isL})\tilde{L}_{is} = \beta_s Y_{is}, \qquad q_i(1+t_{isZ})\tilde{Z}_{is} = (1-\beta_s) Y_{is}. \tag{4.9}
\]

From these expressions, I construct the labor demand.


4.2.1    Expenditure Shares

The assumption of CES preferences implies a standard gravity relationship for bilateral trade ﬂows
in goods between locations for each sector. Using the CES demand, the price indices from equation
4.5, and the fact that all ﬁrms from (i, s) charge the same price, the share of location n’s expenditure
on goods produced in (i, s) is:

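\[
\pi_{nis} = \underbrace{\frac{P_{ns}^{1-\xi}}{\sum_{s'} P_{ns'}^{1-\xi}}}_{\pi_{ns}} \cdot \underbrace{\frac{M_{is}\, p_{nis}^{1-\sigma_s}}{\sum_{i'} M_{i's}\, p_{ni's}^{1-\sigma_s}}}_{\pi_{nis|s}}, \quad \text{with} \quad P_{ns} = \left(\sum_i M_{is}\, p_{nis}^{1-\sigma_s}\right)^{\frac{1}{1-\sigma_s}}, \tag{4.10}
\]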

where Mis is the total number of ﬁrms in location i and sector s, πns is the share of expenditure in
goods from sector s, and πnis|s is the expenditure share on goods from i conditional on consuming
goods from sector s. Finally, since all ﬁrms within the same location and sector choose the same
amount of labor and commercial ﬂoor space units, the total number of ﬁrms in each location i and
sector s in equilibrium is a function of the aggregate amount of labor and commercial ﬂoor space:35

\[
M_{is} = \frac{\tilde{\beta}_s \tilde{L}_{is}^{\beta_s} \tilde{Z}_{is}^{1-\beta_s}}{\sigma_s F_s}, \tag{4.11}
\]
where $\tilde{\beta}_s$ is a constant term that varies by sector. The fact that consumers have a love of variety
(LOV) and that there is free entry implies that there are agglomeration externalities for each sector.
As mentioned above, these agglomeration externalities are captured by the elasticity $\frac{1}{\sigma_s-1}$. Since the
elasticity within the second nest varies by sector, agglomeration externalities generate an additional
first-order effect as in Bartelme et al. (2019).


4.3     Housing and Commercial Floor Space
I assume that there are two additional industries, $\tilde{H}$ and $\tilde{Z}$, that produce residential housing and
commercial floor space respectively. Both of these sectors produce non-tradable goods ($\tau_{ni\tilde{H}} = \tau_{ni\tilde{Z}} \to \infty \;\; \forall n \neq i$)
and operate under perfect competition in all locations. The only factors of production
  34
    Total revenue is $Y_{is} = \sum_n \alpha \pi_{ns} \pi_{ni|s} X_n$, where $X_n$ is the expenditure from location n.
  35
    This model is akin to the perfectly competitive case in which there is a single firm in each location and sector
and there are agglomeration externalities for each sector and location described by
$A_{is} = \tilde{A}_{is} \cdot \tilde{L}_{is}^{\beta\gamma_s} \tilde{Z}_{is}^{(1-\beta)\gamma_s}$, where $\gamma_s = \frac{1}{\sigma_s - 1}$.




of these sectors are the groups of agents H and Z. The former supplies units to residential housing,
and the latter to commercial floor space. The production function for both sectors is linear in labor.
   Neither group commutes; therefore, they only supply units where they live, which
means that $d_{niH} = d_{niZ} \to \infty \;\; \forall n \neq i$. The indirect utility of worker ω from group ν, where
ν ∈ {H, Z }, from living in location n is:

\[
\bar{U}_{n\nu\omega} \equiv \frac{B_n w_{n\nu}}{P_n^{\alpha} \cdot r_n^{1-\alpha}} \cdot \epsilon_{n\nu\omega}, \tag{4.12}
\]

where $\epsilon_{n\nu\omega}$ is an idiosyncratic shock drawn from a Fréchet distribution with dispersion parameter $\eta_\nu$
and location parameter $T_{i\nu}$, and $w_{n\nu}$ is the wage per efficiency unit of group ν in location n. I assume
that $\eta_\nu \to 1$ for ν ∈ {H, Z}; this assumption replicates the specific factor case. Hence, the
supply of residential and commercial ﬂoor space is perfectly inelastic and is ﬁxed. Finally, from the
production function of housing and the assumption of perfect competition, the price of housing in
location n is rn = wnH , and the price of ﬂoor space is qi = wiZ .
   Using equation 4.9, which relates payments to labor and commercial ﬂoor space in terms of total
revenue from (i, s), the equilibrium condition to clear the market of commercial ﬂoor space in each
location i is

\[
q_i \tilde{Z}_i = \sum_s \frac{(1-\beta_s)(1+t_{isL})\, w_{is} \tilde{L}_{is}}{\beta_s (1+t_{isZ})}. \tag{4.13}
\]

This equation equates the supply of commercial ﬂoor space described by the left-hand side to the
demand by firms described by the right-hand side. Similarly, the residential floor space market-clearing
condition is

\[
r_n \tilde{H}_n = (1-\alpha) X_n, \tag{4.14}
\]

where Xn is total expenditure from location n, which I will explain later. This expression equates
the total supply of housing to total demand.
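Given wage bills and fixed supplies, the two floor-space conditions pin down prices directly. The sketch below is an illustration with hypothetical function names and numbers: it solves equation 4.13 for the commercial floor space price and equation 4.14 for the housing price.

```python
import numpy as np

def commercial_floor_price(w, L, t_L, t_Z, beta, Z_supply):
    """Solve equation 4.13 for q_i. w, L, t_L, t_Z: (I, S); beta: (S,); Z_supply: (I,)."""
    demand_value = ((1 - beta) * (1 + t_L) * w * L / (beta * (1 + t_Z))).sum(axis=1)
    return demand_value / Z_supply

def housing_price(X, H_supply, alpha):
    """Solve equation 4.14 for r_n given expenditure X_n and housing supply."""
    return (1 - alpha) * X / H_supply

# Hypothetical case: one location, formal sector taxed at 20%, informal untaxed.
q = commercial_floor_price(
    w=np.array([[1.0, 1.0]]), L=np.array([[10.0, 10.0]]),
    t_L=np.array([[0.2, 0.0]]), t_Z=np.zeros((1, 2)),
    beta=np.array([0.5, 0.5]), Z_supply=np.array([2.0]),
)
r = housing_price(X=np.array([100.0]), H_supply=np.array([10.0]), alpha=0.7)
```

In this example the taxed sector spends more on floor space per unit of pre-tax wage bill, because its factor payments including taxes are a fixed share of revenue.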


4.4     Government Budget Constraint

As mentioned above, the Government collects taxes and gives a rebate to households captured by
$\bar{t}$. I assume that the rebate is proportional to household income instead of a lump sum so that the
Government does not distort migration decisions. This rebate is given by the following expression:

\[
\sum_{i,s} \left( t_{isL} w_{is} \tilde{L}_{is} + t_{isZ} q_i \tilde{Z}_{is} \right) = \bar{t} \cdot \sum_n X_n. \tag{4.15}
\]

This equation equates government tax revenue on the left-hand side to the total rebate paid out on
the right-hand side. I proceed to close the model by finding an expression for total expenditure in


each location.
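The budget constraint in equation 4.15 can be solved for the rebate rate once tax revenue and expenditure are known. The following sketch uses a hypothetical function name and made-up numbers to show the calculation.

```python
import numpy as np

def rebate_rate(t_L, t_Z, w, L, q, Z, X):
    """Rebate rate t_bar that balances the budget in equation 4.15.

    t_L, t_Z, w, L, Z: (I, S) arrays; q: (I,) floor space prices; X: (N,) expenditure.
    """
    tax_revenue = (t_L * w * L).sum() + (t_Z * q[:, None] * Z).sum()
    return tax_revenue / X.sum()

# Hypothetical case: only formal labor is taxed (20%), floor space is untaxed.
t_bar = rebate_rate(
    t_L=np.array([[0.2, 0.0]]), t_Z=np.zeros((1, 2)),
    w=np.ones((1, 2)), L=np.array([[10.0, 10.0]]),
    q=np.array([1.0]), Z=np.ones((1, 2)), X=np.array([100.0]),
)
```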


4.5   Goods and Labor Market Clearing

I now derive the goods market-clearing conditions. I first analyze the
expression for total expenditure from location n, and then the one for total revenue from (i, s).
   From equation 4.4, the total labor income received by agents of type g ∈ {L, H, Z } in location
n is $\sum_{i,s} w_{is} \tilde{L}_{nisg}$. Then, taking into account the proportional rebate from the government to
households, total expenditure from location n is:


\[
X_n = (\bar{y}_n L_n + q_n Z_n + r_n H_n)(1 + \bar{t}). \tag{4.16}
\]

On the other hand, the labor demand comes from consumer preferences and the production function.
By the properties of the CES preferences, total revenue of location i and sector s, Yis , is given by:


\[
Y_{is} = \alpha \sum_n \pi_{nis} X_n. \tag{4.17}
\]

Finally, equating labor demand and labor supply, the goods market-clearing condition to close the
model is:

\[
w_{is}(1+t_{isL})\tilde{L}_{is} = \alpha\beta_s \sum_n \pi_{nis} X_n. \tag{4.18}
\]

   This equilibrium condition implies that total payments to workers, including taxes, are equal to a
fraction $\beta_s$ of total revenue, where total revenue is a function of expenditures from all locations.
    Note that taxes $t_{isL}$, $t_{isZ}$, and the proportional rebate $\bar{t}$ create trade imbalances since aggregate
expenditure is no longer equal to aggregate income in each location n.
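The mapping from expenditure to revenue and the wage bill can be sketched as follows. This is an illustration only: the function names and numbers are hypothetical, and the expenditure shares are taken as given rather than solved from prices.

```python
import numpy as np

def revenue(pi, X, alpha):
    """Y_is = alpha * sum_n pi_nis * X_n; pi: (N, I, S) expenditure shares, X: (N,)."""
    return alpha * np.einsum("nis,n->is", pi, X)

def clearing_wage(pi, X, alpha, beta, t_L, L):
    """Wage consistent with the market-clearing condition, given employment L (I, S)."""
    return beta * revenue(pi, X, alpha) / ((1 + t_L) * L)

# Hypothetical case: 2 locations, 2 sectors, uniform expenditure shares, no taxes.
pi = np.full((2, 2, 2), 0.25)
X = np.array([100.0, 100.0])
Y = revenue(pi, X, alpha=0.5)
w_clear = clearing_wage(pi, X, alpha=0.5, beta=np.array([0.5, 0.5]),
                        t_L=np.zeros((2, 2)), L=np.full((2, 2), 12.5))
```

Total revenue equals the goods share of aggregate expenditure because the expenditure shares of each location sum to one.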


4.6   Equilibrium

The general equilibrium of the model is described by the following vector of endogenous variables:


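\[
x = \{ w_{is}, q_i, r_n, \bar{y}_n, W_{ns}, P_{is}, \tilde{L}_{is}, \tilde{Z}_{is}, L_n \},
\]

and a constant $\bar{U}$ given a set of exogenous parameters:

\[
A = \{ d_{ni}, \tau_{ni}, A_{is}, B_n, \bar{L}_L, \bar{L}_H, \bar{L}_Z, \tilde{Z}_i, \tilde{H}_i, t_{isL}, t_{isZ}, F_s, \theta_s, \kappa, \eta, \sigma_s, \xi, \alpha, \beta_s \},
\]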

that solve the following system of equations: workplace and sector choice probabilities from equation
4.2; residence choice probabilities from equation 4.2; price indices from equations 4.5 and 4.7; total
expenditure from equation 4.16; goods market clearing described by equation 4.18; commercial ﬂoor


space market clearing described by equation 4.13; housing market clearing described by equation
4.14; labor market clearing; and the Government budget constraint from equation 4.15.
          To ensure that the equilibrium is unique, I assume the standard conditions for uniqueness in this
class of GE models (Allen et al., 2015). Agglomeration externalities should be weaker than congestion
forces. The parametric condition is $(1 - \beta_s) > \frac{1}{\sigma_s - 1} \;\; \forall s$. I proceed to analyze the effect of transit
shocks on welfare using a first-order approximation.
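The parametric condition is easy to verify for a candidate calibration. The helper below is a hypothetical illustration, and the parameter values are not the paper's estimates.

```python
def satisfies_uniqueness(beta, sigma):
    """Check (1 - beta_s) > 1 / (sigma_s - 1) for every sector s."""
    return all((1 - b) > 1 / (s - 1) for b, s in zip(beta, sigma))

# Hypothetical values for two sectors.
ok = satisfies_uniqueness(beta=[0.6, 0.6], sigma=[4.0, 6.0])
violated = satisfies_uniqueness(beta=[0.8], sigma=[4.0])
```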


4.7         Welfare Decomposition

To aggregate welfare at the city level, I assume a social planner that takes a utilitarian perspective.
Then, the aggregate welfare function is:

\[
\bar{U} = \omega_L \bar{U}_L + \omega_H \bar{U}_H + \omega_Z \bar{U}_Z, \tag{4.19}
\]

where ωg represents the weights that replicate the eﬃcient allocation of the economy.36 This equation
suggests that aggregate welfare is a weighted average of the ex-ante utility of the three diﬀerent types
of agents in the economy.
   Let L denote an allocation of factors of production given a set of exogenous parameters A.
Define U (A, L) as the welfare function $\bar{U}$ achieved by the allocation L. I am interested in the effect
of shocks on aggregate welfare. By a first-order approximation, the total change in welfare from any
trade/commuting shock is:

\[
d\ln \bar{U} = \underbrace{\frac{\partial \ln \bar{U}}{\partial \ln A}\, d\ln A}_{\text{``Direct'' effect}} + \underbrace{\frac{\partial \ln \bar{U}}{\partial L}\, dL}_{\text{Allocation/Agglomeration}}. \tag{4.20}
\]

Equation 4.20 suggests that the effect of any shock can be decomposed into two different terms: a
direct effect term that considers just changes in exogenous parameters such as iceberg commuting costs
dni or trade costs τni , and a first-order allocation term. This second term captures allocation from
two different forces: wedges and differences in agglomeration externalities between the two sectors.37,38

  36
     For the parametric case of my model, these weights solve the following expressions: $\omega_L \bar{U}_L / \bar{U} = \alpha\beta$, $\omega_Z \bar{U}_Z / \bar{U} = \alpha(1 - \beta)$,
and $\omega_H \bar{U}_H / \bar{U} = (1 - \alpha)$.
  37
     The agglomeration externality component captures distortions from diﬀerences in markups or diﬀerences in pref-
erences for love of variety across the two sectors.
  38
     This formula applies in the general class of urban models for any wedge, such as, variable market power across
ﬁrms in product or labor markets. In the Appendix, I show this result.




    Under the assumptions of the model described above, the explicit solution for this expression is:


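\[
\text{``Direct'' effect} = -\alpha\beta \sum_{n,i,s} \lambda_{nisL} \cdot d\ln d_{ni} - \alpha \sum_{n,i,s} \left(\beta_s \lambda_{nL} + (1-\beta_s)\lambda_{nZ}\right) \pi_{nis} \cdot d\ln \tau_{ni} \tag{4.21a}
\]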

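\[
\text{Allocation} = \alpha \left[ \sum_{n,i,s} \beta_s \frac{t_{isL} - \bar{t}}{1 + \bar{t}}\, \lambda_{nisL} \cdot d\ln \tilde{L}_{nis} + \sum_{n,s} (1-\beta_s) \frac{t_{nsZ} - \bar{t}}{1 + \bar{t}}\, \lambda_{nsZ} \cdot d\ln \tilde{Z}_{ns} \right] \tag{4.21b}
\]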

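\[
\text{Agglomeration} = \sum_{i,s} \frac{\beta_s}{\sigma_s - 1}\, \frac{1 + t_{isL}}{1 + \bar{t}}\, d\tilde{L}_{is} + \sum_{i,s} \frac{1-\beta_s}{\sigma_s - 1}\, \frac{1 + t_{isZ}}{1 + \bar{t}}\, d\tilde{Z}_{is}. \tag{4.21c}
\]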



    The ﬁrst term corresponds to a Hulten (1978) or “direct” eﬀect term that comes from an envelope
argument. It suggests that under the case of perfectly eﬃcient economies, the cost time-saving
approach captures the welfare eﬀect of any trade/commuting shock. For instance, to measure the
welfare gains from a transit improvement, it is suﬃcient to know the value of jobs in each link
between n and i.39 This is the cost time-saving formula used by Train and McFadden (1978) to
evaluate reductions in commuting costs. This implies that if the goal is to understand the aggregate
gains, in the case in which the shock to commuting costs is very small, all the nominal eﬀects cancel
out.
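The direct-effect term only requires baseline shares and the changes in commuting and trade costs. The sketch below is an illustration of that calculation: the function name and all numbers are hypothetical, and `lam_nL` and `lam_nZ` are interpreted here, as an assumption, as the residence shares of workers and of floor-space suppliers.

```python
import numpy as np

def direct_effect(lam_L, lam_nL, lam_nZ, pi, dln_d, dln_tau, alpha, beta_agg, beta):
    """First-order direct effect of small changes in commuting and trade costs.

    lam_L, pi: (N, I, S); lam_nL, lam_nZ: (N,); dln_d, dln_tau: (N, I); beta: (S,).
    """
    commuting = -alpha * beta_agg * (lam_L * dln_d[:, :, None]).sum()
    weights = beta * lam_nL[:, None, None] + (1 - beta) * lam_nZ[:, None, None]
    trade = -alpha * (weights * pi * dln_tau[:, :, None]).sum()
    return commuting + trade

# Hypothetical case: commuting costs fall by 5% on every link, trade costs unchanged.
gain = direct_effect(
    lam_L=np.full((2, 2, 2), 1.0 / 8.0),
    lam_nL=np.array([0.5, 0.5]), lam_nZ=np.array([0.5, 0.5]),
    pi=np.full((2, 2, 2), 0.25),
    dln_d=np.full((2, 2), -0.05), dln_tau=np.zeros((2, 2)),
    alpha=0.7, beta_agg=0.5, beta=np.array([0.5, 0.5]),
)
```

With a uniform 5 percent reduction in commuting costs, the gain equals the labor share of expenditure times the size of the cost reduction.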
    The second term captures changes in allocative eﬃciency. It suggests that if workers reallocate
to sectors and locations with higher wedges, there is an increase in welfare. Hence, a transit shock
may have an additional ﬁrst-order impact in the presence of distortions. Intuitively, the sign depends
on whether workers reallocate to ﬁrms with larger wedges. Firms that pay higher taxes have higher
values of TFPR, while ﬁrms that do not pay taxes have very low values. Thus, if workers move to
the ﬁrms with higher TFPR, the dispersion of TFPR decreases and the new equilibrium gets closer
to the ﬁrst-best allocation.
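The sign logic of the allocation term can be illustrated with the labor component alone. The sketch below uses a hypothetical function name and numbers: it weights log changes in employment by the wedge of each sector relative to the rebate rate, so that a shift of workers toward the more heavily taxed sector yields a positive contribution.

```python
import numpy as np

def labor_allocation_term(lam_L, dln_L, t_L, t_bar, alpha, beta):
    """Labor component of the allocation term.

    lam_L, dln_L: (N, I, S) baseline shares and log changes in employment;
    t_L: (I, S) labor wedges; t_bar: rebate rate; beta: (S,) labor shares.
    """
    wedge = (t_L - t_bar) / (1 + t_bar)                 # relative wedge, (I, S)
    return alpha * (beta * wedge[None, :, :] * lam_L * dln_L).sum()

# Hypothetical case: one location, formal sector taxed at 30%, informal untaxed.
lam_L = np.full((1, 1, 2), 0.5)
t_L = np.array([[0.3, 0.0]])
to_formal = labor_allocation_term(lam_L, np.array([[[0.1, -0.1]]]), t_L,
                                  t_bar=0.1, alpha=0.7, beta=np.array([0.5, 0.5]))
to_informal = labor_allocation_term(lam_L, np.array([[[-0.1, 0.1]]]), t_L,
                                    t_bar=0.1, alpha=0.7, beta=np.array([0.5, 0.5]))
```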
    Finally, the last term represents agglomeration externalities. This component arises only in the
presence of externalities that diﬀer between the two sectors as in BCDR or trade imbalances as
in FG. This term captures the eﬀect of these externalities on aggregate TFP and welfare. In my
case, agglomeration externalities diﬀer between the two sectors, and wedges and transfers create
trade imbalances, so the third term also shows up in the formula. This component depends on two
margins: diﬀerences in agglomeration externalities, and the wedge. Intuitively, if workers reallocate
to the sector with bigger externalities, there are larger increases in welfare. For the wedge, the
argument is similar to the second term. Firms that are paying higher taxes are small relative to the
ﬁrst-best due to trade imbalances; hence, reallocating workers to these ﬁrms increases welfare.
    I show the derivation of this formula in Section D.1 of the Appendix. I also generalize this
result for diﬀerent groups of workers and a general utility and production function by solving the
social planner problem in Section D.2. The only assumptions for this derivation are that the utility
function, production function, the consumption good aggregator, and the eﬃciency unit aggregator
   39
      In his seminal work, Hulten (1978) considers productivity shocks and shows that to measure their eﬀect on GDP,
it is suﬃcient to know the share of sector s on value added, or the so-called Domar weights.


are homogeneous of degree one.40
         Most of the literature whose primary goal is to measure the welfare gains from transit infrastructure
within cities has focused on the first term and direct effects, by assuming that there are no
wedges in the economy and that it operates under perfect competition. I contribute to this literature
by analyzing the effect of transit improvements on the second and third margins.41


5         Empirical Strategy and Estimation

In this section, I describe the main empirical strategy and estimation of the main parameters. This
section is divided into four parts: parametrization of commuting and trade costs; estimation of trade
and commuting elasticities; estimation of the labor supply elasticity across sectors, κ; and model
inversion to recover the fundamentals of the economy such as technological and amenity parameters.


5.1        Trade and Commuting Costs

For the counterfactual analysis, I parametrize commuting costs as in the urban economics literature
(Ahlfeldt et al., 2015; Heblich et al., 2018; Tsivanidis, 2019). I assume that both iceberg commuting
and trade costs are parametrized using the following expressions:

                                               dni = exp(δd timeni ),                                          (5.1a)
                                               τni = exp(δτ timeni ),                                          (5.1b)

where timeni is the average travel time in minutes across diﬀerent transportation modes of moving
from location n to location i.42 The main objects of interest are the parameters δd , and δτ that
transform travel times to iceberg costs. I estimate these parameters from a nested logit speciﬁcation
using the 2017 Origin-Destination Survey. I use trips from home to work and vice versa to
estimate δd , and trips to restaurants, outlets, and retail shops to obtain the parameter δτ .
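The parametrization in equations 5.1a and 5.1b maps travel times into iceberg costs through an exponential function. The sketch below is an illustration: the function names are hypothetical and the value of `delta` is made up for the example rather than taken from the paper's estimates.

```python
import numpy as np

def iceberg_cost(time_minutes, delta):
    """Iceberg cost exp(delta * time) for a travel time in minutes."""
    return np.exp(delta * np.asarray(time_minutes, dtype=float))

def relative_change(time_before, time_after, delta):
    """Proportional change in the iceberg cost when travel time changes."""
    return iceberg_cost(time_after, delta) / iceberg_cost(time_before, delta) - 1.0

delta_example = 0.01                        # hypothetical semi-elasticity per minute
cost_30 = iceberg_cost(30.0, delta_example)
change = relative_change(30.0, 25.0, delta_example)   # a 5-minute time saving
```

Under this functional form the proportional change in the cost depends only on the change in minutes, not on the initial travel time.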
         The estimation is based on the following choice model. A worker ω chooses between different
transportation modes to travel from n to i. These transportation modes are grouped into different
nests g ∈ G , for example public or private. Denote the set of transportation modes in
nest g by Υg . The indirect utility of choosing transportation mode m ∈ Υg is:
    40
     Holmes et al. (2014), Święcki (2017), and Asturias et al. (2016) use a similar formula using hat algebra.
    41
     Since this formula applies to the case in which the change in commuting/trade costs is inﬁnitesimal, for the
counterfactual analysis, I estimate and decompose the change in welfare using percentage changes and exact hat
algebra.
  42
     I calculated a weighted average of travel times across the diﬀerent transportation modes using each transportation
mode’s aggregate share for commuting and consumption from the travel survey data. Hence, in terms of workers’
utility, the assumption is that transportation modes’ preferences take a Cobb-Douglas form. This is a conservative
assumption. For example, in the case of CES or random idiosyncratic shocks, workers would substitute more
strongly from other modes of transportation toward the subway after the transit shock.




\[
V_{nim\omega} = \delta\, \text{time}_{nim} + \gamma_m + \psi_{nig\omega} + (1 - \lambda_g)\, \epsilon_{nim\omega},
\]


where Vnimω is the indirect utility of worker ω if he/she chooses transportation mode m to travel
from n to i. This is the classic framework that Berry (1994) studies. The parameter δ measures
the sensitivity of the decision of the worker/consumer to the average time she spends on moving
across locations.43 The parameter γm captures preferences for transportation mode m relative to a
baseline mode; in my case, I normalize γbus to zero. For example, γcar captures preferences for car
relative to buses, which can include the price of a car, or the stress of driving in a complicated city
such as Mexico City. The variable ψ is common to all transportation modes for worker/consumer
ω within group g and has a distribution function that depends on λ ∈ (0, 1). This latter parameter
measures the correlation of errors within each nest. If this parameter is zero, we are in the standard
multinomial logit case. Finally, εnimω is an idiosyncratic shock to worker ω of choosing m. The
error term of this equation, ψnigω + (1 − λg )εnimω , is drawn from a type I extreme value
distribution.
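
As an illustration of the choice model above, the following is a minimal sketch of nested logit choice probabilities in the form popularized by Berry (1994), with λg playing the role of the within-nest correlation. Only δ = −0.009 comes from the text; the travel times, the γm values, the grouping of modes into nests, and the value of λg are hypothetical, and the function name is an assumption made for this example.

```python
import math

def nested_logit_probs(mean_utility, nests, lam):
    """Choice probabilities for each mode under a nested logit.

    mean_utility: dict mode -> delta * time + gamma_m (systematic utility)
    nests:        dict nest -> list of modes in that nest
    lam:          dict nest -> within-nest correlation lambda_g in [0, 1)
    """
    # Inclusive sum of each nest: D_g = sum_m exp(v_m / (1 - lambda_g))
    inclusive = {
        g: sum(math.exp(mean_utility[m] / (1.0 - lam[g])) for m in modes)
        for g, modes in nests.items()
    }
    denom = sum(inclusive[g] ** (1.0 - lam[g]) for g in nests)
    probs = {}
    for g, modes in nests.items():
        nest_share = inclusive[g] ** (1.0 - lam[g]) / denom  # P(nest g)
        for m in modes:
            within = math.exp(mean_utility[m] / (1.0 - lam[g])) / inclusive[g]
            probs[m] = nest_share * within  # P(g) * P(m | g)
    return probs

DELTA = -0.009  # commuting estimate of delta reported in the text
times = {"car": 40.0, "bus": 70.0, "metro": 55.0, "walk": 120.0}  # hypothetical minutes
gamma = {"car": 0.5, "bus": 0.0, "metro": 0.2, "walk": -0.3}      # hypothetical; gamma_bus = 0
v = {m: DELTA * times[m] + gamma[m] for m in times}
nests = {"private": ["car", "walk"], "public": ["bus", "metro"]}  # hypothetical grouping

probs = nested_logit_probs(v, nests, {"private": 0.3, "public": 0.3})
mnl = nested_logit_probs(v, nests, {"private": 0.0, "public": 0.0})
```

Setting every λg to zero collapses the formula to the multinomial logit, which is a quick check on the implementation.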
    Table B6 shows the main result after estimating the nested logit speciﬁcation. The ﬁrst column
reports the results for commuting, and the second column reports the results for trade trips. I obtain
a value for δd of -0.009, which is consistent with previous ﬁndings from the literature (Ahlfeldt et al.,
2015). The point estimate for δτ is -0.013, which is also consistent with the literature. On the other
hand, in terms of preferences, when people go to work, the most preferred transportation mode
is car, whereas, when they travel to restaurants or retail shops, the most preferred transportation
mode is walking. The last two rows report the average iceberg commuting and trade costs across
locations in Mexico City before and after the transit shock. On average, after Line B of the subway
opens, commuting costs drop by 4.7%, and trade costs by 4.5%.44


5.2    Commuting and Trade Elasticities

Commuting Elasticities: To estimate the commuting elasticities, I use the 2015 Intercensal Sur-
vey. In this survey, workers report the municipality of their residence and workplace, and I am also
able to deﬁne formal and informal workers by using employment and social security information.
From the model, it is easy to derive the following gravity equation relating commuting ﬂows across
municipalities and iceberg costs:


$$\ln \lambda_{nism|nsm} = \underbrace{\beta_s}_{\delta_d \cdot \theta_s} \cdot \mathrm{time}_{nim} + \gamma_{ism} + \gamma_{nsm} + \epsilon_{nism}, \tag{5.2}$$

   43
      I also estimate a heterogeneous δ between the formal and informal sector for the consumption trips. However, I
do not ﬁnd signiﬁcant diﬀerences between the two coeﬃcients, δI = −0.0128, and δF = −0.0114. These results are
available upon request. However, I cannot distinguish between the formal and informal sectors for the working trips
since I construct the commuting ﬂows using the 2015 Intercensal survey.
   44
      I restrict the sample to trips that only use one transportation mode or two transportation modes + walking.



where the subindex m corresponds to one of four different transportation modes: car, metro or
metrobus, bus, and walking; λnism|nsm is the share of workers that commute to location i from
location n working in sector s using the transportation mode m; timenim is the average commuting
time across municipalities n, i using m; γnsm are origin-transportation-sector fixed effects; γism are
destination-transportation-sector fixed effects; and εnism captures the measurement error observed
in the data of this gravity equation.
   The goal is to recover the parameters θs after knowing βs and δd described in the previous
section; since βs = δd · θs in equation 5.2, θs = βs /δd . The parameter θs captures how sensitive workers
in the formal or informal sector are to commuting costs. From the evidence in Section 3, the expected
result is that θI > θF , suggesting that informal jobs are easier to substitute across locations. I estimate
this equation by Poisson pseudo-maximum likelihood (PPML) to include the zero commuting flows between municipalities.
Given the set of ﬁxed eﬀects, the identiﬁcation comes from comparing the workplace decision of
workers that use the same transportation mode and live (work) in the same municipality and sector,
but work (live) in diﬀerent places. Panel A in Table 4 reports the results. As expected, there is
a negative relationship between commuting ﬂows and the average commuting times. I ﬁnd that
the commuting elasticity in the formal sector is 3.11, and in the informal sector it is approximately
4.66. These values are consistent with the theoretical assumptions, and they conﬁrm that informal
workers are more sensitive to commuting costs than formal workers.
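
The mapping from the gravity coefficient to the commuting elasticity can be sketched as follows. This is a minimal illustration, assuming the relationship βs = δd · θs from equation 5.2. The value of δd is the one reported above; the two βs values are hypothetical, chosen only so that the implied elasticities are close to the reported ones, and are not the coefficients from Table 4.

```python
DELTA_D = -0.009  # time coefficient for commuting trips reported in the text

def commuting_elasticity(beta_s, delta_d=DELTA_D):
    """Recover theta_s from the gravity coefficient: theta_s = beta_s / delta_d."""
    return beta_s / delta_d

beta_formal = -0.028    # hypothetical gravity coefficient, formal sector
beta_informal = -0.042  # hypothetical gravity coefficient, informal sector

theta_formal = commuting_elasticity(beta_formal)      # about 3.11
theta_informal = commuting_elasticity(beta_informal)  # about 4.67
```

A more negative gravity coefficient maps into a larger θs, which is the sense in which informal commuting flows are more sensitive to travel time.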
Trade Elasticities: To estimate the trade elasticities, I use the 2017 OD Survey focusing on data
on trips to diﬀerent establishments. I restrict the sample to trips to restaurants, retail shops, and
factory-outlets. I assume that people move across the city and spend their income on diﬀerent
consumption goods. To estimate a diﬀerent trade elasticity for the informal and formal sectors, I
use the fact that most informal establishments in Mexico correspond to restaurants and retail shops,
while most formal establishments are manufacturers, as Figure A6 shows (Levy, 2018). I estimate
the following gravity equation relating trade ﬂows π s across municipalities (trips) with iceberg trade
costs:


$$\ln \pi_{nism|sm} = \underbrace{\beta_s}_{\delta_\tau \cdot (\sigma_s - 1)} \cdot \mathrm{time}_{nim} + \gamma_{ism} + \gamma_{nsm} + \epsilon_{nism}, \tag{5.3}$$

where the diﬀerent parameters represent the same variables as in equation 5.2. The identiﬁcation
comes from comparing trips to locations that use the same transportation mode and whose origin
(destination) is the same, but in which individuals are moving to (from) a diﬀerent municipality. I
estimate this equation via PPML to include zero trips across locations. The goal is to recover the
parameters σs . These parameters represent the elasticity of substitution across varieties for each
sector. They measure how sensitive trade flows are to trade costs when people move across the
city to buy different goods. In addition, according to the monopolistic competition model, they also
determine the agglomeration externalities, which are given by 1/(σs − 1). I allow these externalities to differ by sector, generating
additional welfare effects from workers' reallocation. One expected result is that σI > σF , indicating


that agglomeration forces are larger in the formal sector. The intuition for this result is that informal
varieties are more substitutable than formal ones, and as a result, agglomeration externalities in the
informal sector are lower.
       Panel B in Table 4 describes the main results for this estimation. As in all gravity equations,
trade flows decrease with commuting times. The estimates of σs are consistent with the results from
the previous literature. In particular, the elasticity of substitution in the informal sector is 6.94, and
in the formal sector it is 5.39, suggesting that agglomeration externalities are 0.16 in the informal
sector and 0.22 in the formal sector.45
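
The arithmetic behind these externalities can be checked with a short sketch, assuming βs = δτ · (σs − 1) from equation 5.3 and the externality formula 1/(σs − 1). The σs values are the ones reported above; the function names are assumptions made for this example.

```python
def elasticity_of_substitution(beta_s, delta_tau):
    """Recover sigma_s from the trade gravity coefficient: sigma_s = 1 + beta_s / delta_tau."""
    return 1.0 + beta_s / delta_tau

def agglomeration_externality(sigma_s):
    """Agglomeration externality implied by monopolistic competition: 1 / (sigma_s - 1)."""
    return 1.0 / (sigma_s - 1.0)

SIGMA_INFORMAL = 6.94  # reported in the text
SIGMA_FORMAL = 5.39    # reported in the text

ext_informal = agglomeration_externality(SIGMA_INFORMAL)  # about 0.168
ext_formal = agglomeration_externality(SIGMA_FORMAL)      # about 0.228
```

The unrounded values are roughly 0.168 and 0.228, which the text reports truncated to two decimals as 0.16 and 0.22.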



5.3      Labor Supply Elasticity across Sectors

In this section, I estimate the main equation from the model to recover the labor supply elasticity
across sectors, κ. This parameter governs the reallocation of workers from the informal to the formal
economy. I build market access measures following Tsivanidis (2019). According to the model, these
measures represent the wage index for each sector. Hence, they capture whether workers obtained
better access to formal jobs relative to informal jobs after the transit shock.
       For this estimation, I calculate travel times across the diﬀerent census tracts in Mexico City
with and without Line B of the subway using the network analysis toolkit from ArcMap. I compute
travel times for three diﬀerent transportation modes: car, walking, and the public transit system. I
calibrate speeds for diﬀerent types of roads and the public system using random trips from Google
Maps. Table C1 describes the values obtained for each category and each mode of the transportation
system.46
       With the commuting times at hand, I define the commuter market access (CMA) for location
n and sector s as $\mathrm{CMA}_{ns} = W_{ns}^{\theta_s}$. This is an index of how accessible employment in sector s
is to residents of location n. Following Tsivanidis (2019) and Donaldson and Hornbeck (2016), I can
solve the following system of equations to compute MA measures for both firms and workers specific
to each sector and location:

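$$\mathrm{CMA}_{ns} = \sum_i \frac{\tilde{L}_{is}\, d_{ni}^{-\theta_s}}{\mathrm{FMA}_{is}}, \qquad \mathrm{FMA}_{is} = \sum_n \frac{L_{ns}\, d_{ni}^{-\theta_s}}{\mathrm{CMA}_{ns}}, \tag{5.4}$$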

where L̃is represents the total amount of labor hired by location i and sector s; Lns corresponds
to the total number of workers that reside in location n and work in sector s; and FMAis is a firm
market access measure that captures whether firms in i have good access to workers from sector s.47
  45
     These externalities are relatively large compared to the previous ﬁndings in the literature, where they are around
0.1. However, both numbers are still reasonable, especially in developing contexts. For example, Tsivanidis (2019) ﬁnd
that agglomeration externalities in Bogota are around 0.21, which is a larger value than those of previous ﬁndings.
  46
     Section C1 in the appendix explains the procedure.
  47
     Tsivanidis (2019) estimates these measures for Bogotá and shows that with data of commuting costs, and the
number of residents and workers in each sector and location, the system of equation 5.4 has a unique solution. Another
way to prove the existence and uniqueness of this system of equations is to apply the theorem from Allen et al. (2015).



After solving this system of equations, we can also recover the wage distribution from the market
access approach. Figure A8 plots the wage distribution.
    The intuition of this system of equations follows the same logic as the case with only one sector.
These measures capture whether residents from location n have good access to jobs from sector s,
and similarly whether ﬁrms from location i have good access to labor.
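
One way to solve a system like equation 5.4 for a single sector is to alternate between the two equations until the values stop changing. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: the function name, the two-location data, and the iceberg costs are hypothetical, and θ is set to the formal-sector estimate of 3.11. The solution is only pinned down up to scale, and the iteration settles only when total residents equal total jobs.

```python
def solve_market_access(residents, jobs, d, theta, tol=1e-12, max_iter=10_000):
    """Alternate between the FMA and CMA equations until CMA stops changing.

    residents[n]: workers living in n (L_ns);  jobs[i]: labor hired in i (L-tilde_is)
    d[n][i]:      iceberg commuting cost from n to i;  theta: commuting elasticity
    """
    n_orig, n_dest = len(residents), len(jobs)
    k = [[d[n][i] ** (-theta) for i in range(n_dest)] for n in range(n_orig)]

    def fma_given(cma):
        return [sum(residents[n] * k[n][i] / cma[n] for n in range(n_orig))
                for i in range(n_dest)]

    def cma_given(fma):
        return [sum(jobs[i] * k[n][i] / fma[i] for i in range(n_dest))
                for n in range(n_orig)]

    cma = [1.0] * n_orig  # the starting point fixes the scale of the solution
    for _ in range(max_iter):
        new_cma = cma_given(fma_given(cma))
        gap = max(abs(a - b) / b for a, b in zip(new_cma, cma))
        cma = new_cma
        if gap < tol:
            break
    return cma, fma_given(cma)

residents = [60.0, 40.0]        # hypothetical residents by location
jobs = [70.0, 30.0]             # hypothetical jobs by location (same total)
d = [[1.0, 1.3], [1.3, 1.0]]    # hypothetical iceberg commuting costs
theta = 3.11                    # formal-sector commuting elasticity from the text

cma, fma = solve_market_access(residents, jobs, d, theta)
```

Rescaling CMA by a constant and FMA by its inverse leaves both equations satisfied, so only relative market access across locations is informative.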
    Figure A9 plots ventiles of the change in CMA for both sectors after the transit shock, holding
constant the number of workers and residents. It is clear that locations close to the new subway
line improved their market access to both formal and informal employment relative to other census
tracts in Mexico City. Additionally, Figure 6 plots natural breaks of the change in CMA, taking the
difference between the formal and informal sectors. The figure shows that census tracts near Line B
experienced a larger increase in market access in the formal sector. As a consequence, workers in
these census tracts obtained better access to formal jobs relative to informal jobs and reallocated
to firms with higher TFPR.
    I exploit this variation to estimate the labor supply elasticity parameter across sectors. From
the structure of the model, I derive a log-linear relationship between the commuter market access
measures and the wage indices for each sector. In particular, $W_{ns}^{\theta_s} = \mathrm{CMA}_{ns}$. Then, from equation
4.2, and similar to the reduced-form results from Section 3, I estimate the following labor supply
equation that correlates the change in the ratio between formal and informal residents with the
change in CMA measures over time and across sectors:


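$$\Delta \ln L_{nF,t} - \Delta \ln L_{nI,t} = \kappa \left( \frac{1}{\theta_F}\,\Delta \ln \mathrm{CMA}_{nF,t} - \frac{1}{\theta_I}\,\Delta \ln \mathrm{CMA}_{nI,t} \right) + \beta X_n + \gamma_{s(n)} + \epsilon_{nt}, \tag{5.5}$$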

where ∆ corresponds to the difference between 2000 and 2010; LnF,t and LnI,t are the total numbers of
residents that live in location n and work in the formal and informal sectors, respectively; and γs(n) is a
municipality or state fixed effect. I include a vector of controls Xn to capture specific trends that vary
with initial characteristics. As a way to recover κ, equation 5.5 is akin to a triple-difference estimator. The
first difference corresponds to time variation before and after the transit improvements, the second
difference exploits heterogeneity of the treatment across locations, and the third difference uses
variation in the market access measures across sectors. Equation 5.5 is a labor supply relationship
and implies that people reallocate to the formal sector as they obtain better access to formal jobs
relative to informal employment. As Figure 6 shows, Line B improved access to formal jobs for
residents close to the new stations. It is important to mention that the reallocation of workers across
locations does not bias the estimate of κ: since the model allows for migration within
the city, I can estimate κ by comparing census tracts.48
The largest eigenvalue of this system of equations is 1. Thus, the system has at most one strictly positive
solution, up to scale.
  48
     In particular, even if people reallocate, the comparison to estimate κ needs to be across census tracts after
taking the ratio. The variable that determines who reallocates to the treated locations is $W_n = (W_{nI}^{\kappa} + W_{nF}^{\kappa})^{1/\kappa}$, and




    One caveat with the estimation of equation 5.5 is that the change in CMA may capture other
shocks in the economy that shift the allocation of labor across sectors and locations. These shocks
can change the decision of workers to operate in the formal or informal sector, thus generating a
correlation between the change in CMA and the error term, which biases the estimation
of κ. To deal with this problem, I estimate equation 5.5 by two-stage least squares using two
instruments. The first instrument is the change in the CMA measures when the number of residents
and workers is held fixed, and the second instrument is the treatment dummy variable. The idea is
to capture changes in commuting costs and clean the estimation from other economic shocks. For
example, the treatment dummy variable captures changes in residents, employment, and market
access in the treated census tracts only because of Line B and not other economic shocks in the city.
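
The logic of the instrumental-variable correction can be illustrated with a stylized sketch. It assumes a single regressor, a single instrument, and no controls or fixed effects, in which case two-stage least squares reduces to a ratio of covariances. All numbers are hypothetical and constructed so that the true slope is 2; here y stands in for the change in the log formal-to-informal resident ratio, x for the θ-weighted difference in the change of log CMA, and z for the same CMA change computed with residents and workers held fixed.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n; the scaling cancels in the ratios below)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def ols_slope(y, x):
    return cov(x, y) / cov(x, x)

def iv_slope(y, x, z):
    """Just-identified IV estimator with an intercept: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)

TRUE_KAPPA = 2.0             # hypothetical value used to build the data
z = [-1.0, 1.0, -1.0, 1.0]   # instrument: transit-driven change only
u = [1.0, 1.0, -1.0, -1.0]   # other shocks, orthogonal to z by construction
x = [zi + ui for zi, ui in zip(z, u)]                      # observed CMA change
y = [TRUE_KAPPA * xi - 0.5 * ui for xi, ui in zip(x, u)]   # error is -0.5 * u

kappa_ols = ols_slope(y, x)    # 1.75, biased downward
kappa_iv = iv_slope(y, x, z)   # 2.0, recovers the constructed slope
```

Because the shock u raises x while lowering the error term, OLS understates the slope, and the instrument removes that component.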
    Table 5 reports the results for the labor supply elasticity across sectors. I obtain estimates of κ
between 1.1 and 2.4. These estimates are consistent with the model and the commuting elasticities.
The first two columns show the OLS results, and the other four columns show the IV results using
each instrument separately. In my preferred specifications, which are the ones in columns 4 and
6, I obtain point estimates between 1.5 and 2.4. For the counterfactuals, I take the average
of these two numbers. Comparing the 2SLS and OLS estimates suggests that
there were other shocks in the economy that created a downward bias for κ. For instance, these
shocks reallocated workers from the informal to the formal sector, generating a negative correlation
between the change in the CMA measures and the error term.49,50



5.4    Labor and Capital Wedges

Labor and capital wedges are crucial parameters for the quantitative analysis. I follow the popular
approach from Hsieh and Klenow (2009) and use the inverse of the wage bill and capital share to
calibrate the distortions.51 From the profit-maximization condition, the inverses of the labor and
commercial floorspace shares paid by each firm are

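$$\left(\frac{w_{is} l_{is}}{p_{is} y_{is}}\right)^{-1} = \frac{\sigma_s}{(\sigma_s - 1)\beta_s}\,(1 + t_{isL}), \qquad \left(\frac{q_{is} z_{is}}{p_{is} y_{is}}\right)^{-1} = \frac{\sigma_s}{(\sigma_s - 1)(1 - \beta_s)}\,(1 + t_{isZ}),$$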
where wis lis is the wage bill, qis zis is the commercial ﬂoorspace payments, and pis yis are total sales
or value-added. I can observe the left-hand side of this equation for each ﬁrm in the Economic
Census, and use the average labor share βs and markups in each industry to calibrate the wedges.
To aggregate from the ﬁrm level to the census-tract-sector cell, I take the mean of the inverse of the
I am controlling for this variable after taking the ratio between the formal and the informal sectors since this wage
index cancels out. Then, the reallocation of workers does not bias the estimation of κ since what matters is the share
of formal to informal residents across census tracts even if there is migration.
  49
     I did not instrument the change in CMA with the two instruments jointly since both of them capture changes in commuting
costs because of the transit shock.
  50
     Relative to previous studies on estimating labor supply elasticities across sectors, such as Galle et al. (2017),
Lagakos and Waugh (2013), and Berger et al. (2019), my estimates are similar.
  51
     Other papers such as Busso et al. (2012) and Levy (2018) that have explored the role of resource misallocation
in Mexico also use the same method.


wage bill and capital share across ﬁrms in each cell.
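
The calibration of the wedges can be sketched as follows, assuming the two profit-maximization conditions above are inverted for 1 + t and that the cell-level wedge uses the mean of the firm-level inverse shares. The function names and the firm-level numbers are hypothetical; σF = 5.39 and βF = 0.6 are the formal-sector values reported in the paper.

```python
def wedge_from_inverse_share(inverse_share, sigma, factor_share):
    """Return 1 + t implied by an inverse factor share.

    inverse_share: sales divided by payments to the factor
    factor_share:  beta_s for labor, 1 - beta_s for commercial floorspace
    """
    return inverse_share * (sigma - 1.0) * factor_share / sigma

def cell_wedge(payments, sales, sigma, factor_share):
    """Average the inverse shares across firms in a cell, then convert to 1 + t."""
    inverse_shares = [s / p for p, s in zip(payments, sales)]
    mean_inverse = sum(inverse_shares) / len(inverse_shares)
    return wedge_from_inverse_share(mean_inverse, sigma, factor_share)

SIGMA_F = 5.39  # formal-sector elasticity of substitution from the text
BETA_F = 0.6    # formal-sector labor share from the text

# Wage-bill share that an undistorted firm (t = 0) would pay
undistorted_share = (SIGMA_F - 1.0) * BETA_F / SIGMA_F

sales = [100.0, 100.0]                                               # hypothetical
wage_bills = [undistorted_share * 50.0, undistorted_share * 100.0]   # hypothetical
labor_wedge = cell_wedge(wage_bills, sales, SIGMA_F, BETA_F)         # 1.5
```

A firm paying half the undistorted wage-bill share has 1 + t = 2, so averaging it with an undistorted firm gives a cell-level value of 1.5.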
       Figure A7 in the Online Appendix plots the labor-wedge distribution across locations for each
sector in the baseline year. The wedges between the formal and informal sectors are very similar to
the ones found by Busso et al. (2012). Formal ﬁrms face larger distortions. On average, the wedge
in the formal sector is approximately 1.67 times the wedge in the informal sector. Furthermore,
panel B of ﬁgure A10 in the Online Appendix shows the spatial distribution of labor wedges after I
construct ventiles across locations. In places in the center of the city, where there is more economic
activity and formal ﬁrms locate, wedges are larger.
       Moreover, for the counterfactual analysis, I also use a constant wedge for formal ﬁrms based on
the work from Levy (2018) (Table 7.9). For the labor wedge, I use a conservative value of 0.95, and for
the commercial ﬂoor space, a value of 0.75. These wedges include several distortions such as implicit
taxes on salaried workers, regulations on dismissals and reinstatements, non-contributory social
insurance, standard labor taxation like state payroll taxes, and ﬁrm taxation including REPECO
and value-added taxes.


5.5      Other Parameters

I calibrate other parameters of the model using simple moments of the data, or take them directly
from the previous literature. I calibrate the expenditure share on housing using the ENOE and
ﬁnd, on average, a value of α = 0.75. Similarly, for the labor share, I use data from the Economic
Census in 1999 and find values of βI = 0.70 and βF = 0.6. To calculate the total amount of
housing H̃ and commercial floor space Z̃ in each location, I use the area in square kilometers of
buildings in each census tract from the Global Human Settlement Layer (GHSL) in 2000 weighted
by the total number of employees and residents. To calibrate the fixed costs, I use the log-linear
relationship between the total number of firms and the workforce in each sector from the model, and
find FI = 0.15 and FF = 1.2. Section C.2 in the appendix specifies the details for this estimation.
This result is consistent with the fact that for a ﬁrm, it is more diﬃcult to produce in the formal
sector. In addition, I use the estimate of the elasticity of substitution across sectors ξ = 2 from
Edmond et al. (2015), which is similar to the estimates of other papers (Asturias et al., 2016). Also,
I compute the counterfactuals using a value of η = 1.50 which is the lowest value of the migration
elasticity that Tsivanidis (2019) ﬁnds for Bogotá, a similar context to Mexico City. This value is
consistent with the assumption that η ≤ κ from the theoretical framework.52


5.6      Model Inversion

In this section, I recover the fundamental parameters Bn , Bns , which capture diﬀerences in amenities
that attract residents to each location and sector; and the parameters Ais , which represent diﬀerences
  52
    In section 6, I show that the results are robust to diﬀerent values of the migration elasticity and the elasticity of
substitution across sectors.



in productivity across locations. The argument is that knowing the key elasticities, and the number
of workers and residents in each location and sector, I can identify the entire model from Section
4. Knowing these parameters, I can then compute trade ﬂows and commuting ﬂows and solve the
counterfactuals using initial equilibrium conditions.
    I proceed in three steps. In the ﬁrst step, I recover relative diﬀerences in amenities, and the wage
distribution equating the labor supply to actual data. In the second step, I recover the productivity
levels Ais equating the labor demand to the number of workers in the data. In the third step, I
recover the amenity parameters Bn , equating the residents’ share in each location in the model to
the data.

Step 1: In a simultaneous step, I recover the entire wage distribution and the parameters Bns by
equalizing the labor supply from equation 4.4 to the total number of workers in each sector and
location from the data. I assume without loss of generality that BnI = 1. I identify BnF from the
following relationship using the share of formal workers from the data in each location and the
wages:

$$\lambda_{nF|n} = \frac{B_{nF} W_{nF}^{\kappa}}{B_{nF} W_{nF}^{\kappa} + W_{nI}^{\kappa}}.$$

I then identify the wage distribution by equalizing equation 4.4 to the number of workers from the
data in the pre-period.

Step 2: Using the vector of wages, I recover the productivity parameters Ais by solving the labor
demand from equation 4.18. I solve for the vector of productivities, equating the labor demand
implied by the model to the number of workers in each sector and location from the data.

Step 3: With data on wages, and knowing the key elasticities, I can obtain the amenity parameters
in each location Bn by equating the implied number of residents from the model with the number
of residents from the data in the pre-period. In particular, I use λn from equation 4.2 in the model
and equate it to the number of residents in the data.
    As a result, I then can compute trade ﬂows for each sector across the city and solve for the
counterfactuals using exact hat algebra as in Dekle et al. (2008). In Section D.3 of the Appendix, I
provide the equilibrium conditions of the model with exact hat algebra.
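
The amenity inversion in Step 1 can be sketched in isolation. The example assumes that the wage indices are already known, whereas the paper solves for them jointly with the amenities, and it uses the normalization BnI = 1. The wage indices and the formal share are hypothetical, and κ = 1.95 is the simple average of the two preferred point estimates (1.5 and 2.4) mentioned in Section 5.3.

```python
def formal_share(b_nf, w_nf, w_ni, kappa):
    """Model-implied share of formal workers among residents of n (B_nI = 1)."""
    numerator = b_nf * w_nf ** kappa
    return numerator / (numerator + w_ni ** kappa)

def invert_formal_amenity(share_formal, w_nf, w_ni, kappa):
    """Solve the share equation for B_nF given the observed formal share."""
    odds = share_formal / (1.0 - share_formal)
    return odds * (w_ni / w_nf) ** kappa

KAPPA = 1.95              # average of the two preferred estimates in the text
w_formal, w_informal = 1.2, 1.0   # hypothetical wage indices
observed_share = 0.4              # hypothetical formal share in location n

b_nf = invert_formal_amenity(observed_share, w_formal, w_informal, KAPPA)
implied_share = formal_share(b_nf, w_formal, w_informal, KAPPA)  # matches observed_share
```

In this example the formal wage index is higher but only 40 percent of residents are formal, so the inversion assigns the formal sector an amenity value below one.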


6    Counterfactual Analysis

This section describes the counterfactual analysis. To compute the welfare effects of Line B, I use
the estimates of the key elasticities, and the commuting times with and without Line B. Then, I
solve for the general equilibrium before and after the shock. Regarding the distortions, I find a similar
TFP effect as Busso et al. (2012): removing the wedges leads to gains of around 200%.
    I compute two diﬀerent counterfactuals. The ﬁrst one assumes that there is no migration within


the city and only solves the goods market-clearing condition. The second one takes into account
the migration channel. I assume that the city is closed, so that the total number of workers L̄ is
constant. I calculate changes in welfare and total output using percentage changes. To decompose
the welfare eﬀects into the three terms, I compute the equilibrium with and without the labor wedge,
and for the agglomeration channel, I assume a diﬀerent value of σs in the two sectors.
       Figure 7 plots the results for the different counterfactuals and Table 6 in the Online Appendix
reports the numbers. Panels A and C hold the number of residents constant, while panels B and D
add the migration margin. In panels C and D, I run the counterfactuals with a constant
wedge for the formal sector. On average, Line B of the subway increased welfare by between 1.7% and 1.9%.
Changes in commuting costs and in trade costs each account for around 50% of the total gains. In terms
of the welfare decomposition, I find that in the case in which the distortions are calibrated using
the data, the “direct” effect term represents approximately 79% of the total gains, the reallocation
of workers to the formal sector explains 18%, and the agglomeration externality component drives
the remaining 2%. As a result, the allocation mechanism generated 26% additional gains relative
to the standard case under the perfectly efficient economy. On the other hand, in the case in which
I assume a constant wedge for the formal sector in the model, the direct effect explains a larger
fraction of the total gains, 83%; the change in factor allocation explains 14%, and differences in
external economies of scale between the two sectors explain 3%. The results are robust to different
values of η , κ, and ξ .53
       Relative to previous findings, and considering the size of my shock, these estimates are a bit
higher. Nevertheless, these studies only considered changes in commuting costs and the direct
effect. My counterfactual also accounts for changes in consumption costs and the allocative efficiency
margin, which explains why the welfare effects are bigger.
       The project’s cost-beneﬁt analysis implies that there was an increase of around 26% of real
income net of the total cost at the aggregate level in the city. According to oﬃcial documents
from the Government, the total cost of Line B in 2000 was approximately USD 2,900 million in
2014 dollars, considering the net present value of maintenance, operational costs, and other overheads.
This number represented approximately 0.72% of the total GDP of Mexico City in 2000. Then,
in the benchmark case, Line B generated a benefit of around USD 2.59 per dollar spent on the
infrastructure. This benefit would have been only 2.00 without considering the allocation mechanism.
The new margin increased the eﬀect on total welfare per dollar spent on the infrastructure by
approximately 26%.54 For instance, if the city constructs a line or a road with a similar demand,
but in places in which most of the workers are formal, the changes in welfare are smaller.
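
The benefit-per-dollar calculation can be reproduced with a few lines of arithmetic. This sketch uses the rounded inputs from footnote 54 (gains of 1.47% and 1.86% of GDP and a cost of 0.72% of GDP, taking the 1.47 that appears in the footnote's ratio), so the outputs can differ from the reported 2.59 and 26.3% in the last digit. The function name is an assumption made for this example.

```python
def benefit_per_dollar(gain_pct_gdp, cost_pct_gdp):
    """Welfare gain per dollar of cost, with both expressed as a percent of GDP."""
    return gain_pct_gdp / cost_pct_gdp

COST_PCT_GDP = 0.72  # cost of Line B as a share of Mexico City GDP in 2000

efficient = benefit_per_dollar(1.47, COST_PCT_GDP)    # perfectly efficient economy, about 2.04
inefficient = benefit_per_dollar(1.86, COST_PCT_GDP)  # economy with wedges, about 2.58
extra_gain = inefficient / efficient - 1.0            # about 0.265
```

Since the cost is the same in both scenarios, the extra gain from the allocation mechanism depends only on the ratio of the two welfare gains.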
       The main takeaway from this analysis is that when policymakers assess the economic impact
  53
    These results are available upon request.
  54
    This number is obtained in the following way: in the perfectly efficient economy, the total gains are 1.48% of
GDP, so the benefit per dollar spent on the project is 2.04 (1.47/0.72). By contrast, under the inefficient economy,
the benefit is 1.86%, and the value per dollar spent on transit infrastructure is 2.59 (1.86/0.72). Thus, there was an
increase of 26.3% relative to the perfectly efficient economy.



of transit infrastructure, it is critical that they consider other mechanisms that may aﬀect welfare
beyond common factors such as transportation demand. For example, when governments decide
where to allocate future infrastructure, they should focus on connecting poor areas with
efficient locations not only for distributional reasons, but also for efficiency reasons. As this study shows,
connecting informal workers with formal employment may generate additional welfare gains by
reducing factor misallocation.


Other policies

In this section, I consider the effectiveness of other policies that the government can implement to
reduce informality. I study two different types of policies. The first type consists of reductions in
the entry fixed costs, and the second type consists of place-based policies.


Entry ﬁxed costs

First, I consider a policy in which the Government reduces the entry ﬁxed cost of formal ﬁrms or
increases it for informal firms. These policies are akin to making it easier for entrepreneurs to start
a formal business in Mexico City (i.e., reducing red tape or bureaucracy) or to increasing government
regulations that make it more difficult for informal firms to enter the market.
   According to the reduced form estimates, Line B of the subway led to a decrease in informality
rates at the aggregate level by 0.4%. Figure 8 plots the eﬀectiveness of diﬀerent policies that change
the entry ﬁxed cost for both formal and informal ﬁrms. Panel A plots the results for diﬀerent rates
decreasing the entry ﬁxed cost for formal ﬁrms, and panel B simulates an increase in the informal
entry ﬁxed cost for diﬀerent values.
   There are three main takeaways from this analysis. First, according to the model, it is more
eﬀective to reduce the entry ﬁxed cost of formal ﬁrms relative to increasing the entry ﬁxed cost
of informal ﬁrms. For example, to decrease informality rates by 0.4% at the aggregate level, the
government can lower the formal ﬁxed cost by 5-8%, but it needs to increase the informal ﬁxed cost
by more than 8%. This suggests that it is more eﬀective to focus on policies that beneﬁt formal ﬁrms
than to harm informal ﬁrms. Second, as the target of the Government increases, it becomes more
eﬀective to reduce the formal ﬁxed cost relative to increasing the informal ﬁxed cost. Third, the
results suggest that transit infrastructure that connects informal workers with formal employment
can be a useful tool to reduce informality rates. For example, if the government wants to generate
similar results at the aggregate level, it needs to change the ﬁxed cost by a substantial proportion.
   Overall, the ﬁndings imply that transit lines can be an excellent tool to reduce informality rates
by giving better access to formal jobs to workers that live in remote areas compared to other types
of policies that the government can implement.




Place-based policies

For the second set of policies, I study whether place-based policies that reallocate formal ﬁrms in
the city can eﬀectively increase welfare and reduce informality rates. The intervention consists of
increasing the commercial ﬂoor space employed by formal ﬁrms in diﬀerent parts of the city. I
consider two sets of policies; the ﬁrst one consists of increasing the commercial ﬂoor space in the
center of the city, and the second one in the outskirts. Figure 9 plots the locations in which the
Government implements the policy; in total, there are around 250 treated census tracts in both parts
of the city. The goal is to compare policies that reallocate formal ﬁrms to the outskirts vs. transit
shocks that connect informal workers with formal jobs.
    Figure 10 plots the results of the intervention. In panel A, the Government increases commercial
ﬂoor space in the central locations, and in panel B in the remote areas. The results suggest that
it is more eﬀective to intervene in the central locations than in the outskirts. For instance, if the
Government increases the commercial ﬂoor space by around 40% in the central areas, the policy
generates welfare gains similar to those of the transit shock that I studied. On the other hand, as
shown in panel B, reallocating firms to the outskirts is very ineffective. Even if the Government
increases the commercial floor space by a substantial proportion, 60%, welfare increases by only 0.18%,
which is significantly lower than the gain obtained from the new subway line that connected informal
workers with formal jobs. Moreover, in the latter case, the allocative efficiency margin and the
externality component explain a very small fraction of the total gains.
    There are two main explanations for this result. The ﬁrst one is that since most of the formal
ﬁrms locate in the CBD, the agglomeration forces are minimal in the outskirts. The second one is
that these locations are very unproductive in terms of the productivity scale parameters, especially
for formal firms. Hence, reallocating firms to the outskirts generates negligible welfare gains. In
general, the results imply that it is more eﬀective to connect informal workers with formal jobs by
transit lines than to move formal ﬁrms to the city’s remote areas through de-agglomeration policies.


7    Conclusion

This paper has examined the welfare gains from transit improvements in developing countries, con-
sidering the allocative eﬃciency margin. The mechanism that it studies is whether workers reallocate
from the informal to the formal sector. I ﬁnd that transit infrastructure that facilitates commuting
may generate additional welfare gains by improving the market access of the informal labor force to
formal employment.
    From an empirical perspective, the paper exploits a transit shock in Mexico City that connected
poor and remote areas with the center of the city. The main ﬁnding is that informality rates decrease
in the locations that experienced the shock relative to other areas in the city. This result implies
that workers reallocated to firms with higher TFPR, thereby increasing welfare to a larger extent
than the predictions under perfectly efficient economies.
   On the quantitative side, the paper departs from the standard eﬃciency case in urban models
that have studied the economic impact of transit infrastructure. The model extends the classic
framework by adding wedges and resource misallocation. The paper quantiﬁes the gains from transit
infrastructure and ﬁnds that allocative eﬃciency drives approximately 17%-25% of the total gains.
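   To relate a share of total gains to an amplification of the direct effect, the following is a minimal
sketch, assuming that total gains G split additively into a direct component D and an allocative
efficiency component A. The symbols G, D, A, and a are introduced here for illustration only and are
not the model's notation.

```latex
\[
G = D + A, \qquad a \equiv \frac{A}{G}
\quad\Longrightarrow\quad
\frac{A}{D} = \frac{a}{1-a}.
\]
\[
a = 0.17 \;\Rightarrow\; \frac{A}{D} \approx 0.205,
\qquad
a = 0.25 \;\Rightarrow\; \frac{A}{D} \approx 0.333.
\]
```

Under this additive reading, an allocative share of 17%-25% of total gains corresponds to gains that
are roughly 20%-33% larger than the direct effect alone.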
   The results from this study are informative to policymakers in several respects. First, when
analyzing the costs, benefits, and opportunity cost of a project, it is critical to consider
first-order effects beyond the direct effects captured by the classic transportation-demand
approach. These projects can have an additional economic impact through an allocative efficiency
margin. For example, policymakers should consider whether the population that resides in the areas
to be connected works in the informal or the formal economy. The results suggest that, through this
new margin, even if a government is not concerned about distributional aspects, connecting poor
areas with high-efficiency locations can generate larger gains than transit developments that link
locations with a similar composition of workers.
   Moreover, the results are informative on other public policy issues in urban areas. Programs that
segregate informal workers and poor individuals in cities in developing countries, combined with
high commuting costs, can increase the extent of resource misallocation, lowering both aggregate
eﬃciency and TFP. Hence, governments must make decisions based on an analysis that considers
all the ﬁrst-order components that may aﬀect welfare.




References
Ahlfeldt, G. M., Redding, S. J., Sturm, D. M., and Wolf, N. (2015). The Economics of Density:
  Evidence From the Berlin Wall. Econometrica, 83:2127–2189.

Allen, T., Arkolakis, C., and Li, X. (2015). On the Existence and Uniqueness of Trade Equilibria.
  Working Paper, Yale University.

Alvarez, F. and Lucas, R. J. (2007). General equilibrium analysis of the Eaton-Kortum model of
  international trade. Journal of Monetary Economics, 54(6):1726–1768.

Anderson, S. P. and de Palma, A. (1992). The logit as a model of product diﬀerentiation. Oxford
  Economic Papers, 44(1):51–67.

Arkolakis, C., Costinot, A., Donaldson, D., and Rodríguez-Clare, A. (2019). The Elusive Pro-
  Competitive Eﬀects of Trade. Review of Economic Studies, 86(1):46–80.

Asturias, J., García-Santana, M., and Ramos, R. (2016). Competition and the Welfare Gains from
  Transportation Infrastructure: Evidence from the Golden Quadrilateral of India. Working Papers
  907, Barcelona Graduate School of Economics.

Atkin, D. and Khandelwal, A. (2019). How distortions alter the impacts of international trade in
  developing countries. Working Paper 26230, National Bureau of Economic Research.

Bachas, P. J., Gadenne, L., and Jensen, A. (2020). Informality, Consumption Taxes and Redistri-
  bution. Policy Research Working Paper Series 9267, The World Bank.

Balboni, C. (2019). In harm’s way? Infrastructure investments and the persistence of coastal cities.
  Unpublished paper, MIT.

Banerjee, A. V. and Duﬂo, E. (2005). Growth Theory through the Lens of Development Economics.
  In Aghion, P. and Durlauf, S., editors, Handbook of Economic Growth, volume 1 of Handbook of
  Economic Growth, chapter 7, pages 473–552. Elsevier.

Baqaee, D. R. and Farhi, E. (2020). Productivity and Misallocation in General Equilibrium. The
  Quarterly Journal of Economics, 135(1):105–163.

Bartelme, D., Costinot, A., Donaldson, D., and Rodríguez-Clare, A. (2019). External Economies of
  Scale and Industrial Policy: A View from Trade. Working paper, UC Berkeley.

Baum-Snow, N. (2007). Did highways cause suburbanization? The Quarterly Journal of Economics,
  122(2):775–805.

Berger, D. W., Herkenhoﬀ, K. F., and Mongey, S. (2019). Labor market power. Working Paper
  25719, National Bureau of Economic Research.


Berry, S. T. (1994). Estimating discrete-choice models of product diﬀerentiation. The RAND Journal
  of Economics, 25(2):242–262.

Busso, M., Fazio, M. V., and Algazi, S. L. (2012). (In)Formal and (Un)Productive: The Productivity
  Costs of Excessive Informality in Mexico. Research Department Publications 4789, Inter-American
  Development Bank, Research Department.

Dekle, R., Eaton, J., and Kortum, S. (2008). Global Rebalancing with Gravity: Measuring the
  Burden of Adjustment. IMF Staﬀ Papers, 55(3):511–540.

Dix-Carneiro, R., Goldberg, P., Meghir, C., and Ulyssea, G. (2018). Trade and Informality in the
  Presence of Labor Market Frictions and Regulations. Working Paper, Yale University.

Donaldson, D. and Hornbeck, R. (2016). Railroads and American Economic Growth: A “Market
  Access” Approach. The Quarterly Journal of Economics, 131(2):799–858.

Edmond, C., Midrigan, V., and Xu, D. Y. (2015). Competition, Markups, and the Gains from
  International Trade. American Economic Review, 105(10):3183–3221.

Fajgelbaum, P. and Schaal, E. (2017). Optimal Transport Networks in Spatial Equilibrium. Working
  Paper, UC Los Angeles.

Fajgelbaum, P. D. and Gaubert, C. (2020). Optimal Spatial Policies, Geography, and Sorting. The
  Quarterly Journal of Economics, 135(2):959–1036.

Fajgelbaum, P. D., Morales, E., Serrato, J. C. S., and Zidar, O. (2019). State Taxes and Spatial
  Misallocation. Review of Economic Studies, 86(1):333–376.

Galle, S., Rodríguez-Clare, A., and Yi, M. (2017). Slicing the Pie: Quantifying the Aggregate and
  Distributional Eﬀects of Trade. NBER Working Papers 23737, National Bureau of Economic
  Research, Inc.

Gollin, D. (2002). Getting Income Shares Right. Journal of Political Economy, 110(2):458–474.

Gollin, D. (2008). Nobody’s business but my own: Self-employment and small enterprise in economic
  development. Journal of Monetary Economics, 55(2):219–233.

Gonzalez-Navarro, M. and Quintana-Domeque, C. (2016). Paving Streets for the Poor: Experimental
  Analysis of Infrastructure Eﬀects. The Review of Economics and Statistics, 98(2):254–267.

Gonzalez-Navarro, M. and Turner, M. A. (2018). Subways and urban growth: Evidence from earth.
  Journal of Urban Economics, 108(C):85–106.

Heblich, S., Redding, S. J., and Sturm, D. M. (2018). The Making of the Modern Metropolis:
  Evidence from London. NBER Working Papers 25047, National Bureau of Economic Research,
  Inc.

Helpman, E. (1995). The Size of Regions. Papers 14-95, Tel Aviv.

Hernández-Cortés, D., Olica, P., and Severen, C. (2021). Modal Choice, Income, and Congestion in
  Mexico City. Working paper, Federal Reserve Bank of Philadelphia.

Holmes, T. J., Hsu, W.-T., and Lee, S. (2014). Allocative eﬃciency, mark-ups, and the welfare gains
  from trade. Journal of International Economics, 94(2):195–206.

Hornbeck, R. and Rotemberg, M. (2019). Railroads, Reallocation and the Rise of American Manu-
  facturing. Working paper, University of Chicago.

Hsieh, C.-T., Hurst, E., Jones, C. I., and Klenow, P. J. (2019). The allocation of talent and U.S.
  economic growth. Econometrica, 87(5):1439–1474.

Hsieh, C.-T. and Klenow, P. J. (2009). Misallocation and Manufacturing TFP in China and India.
  The Quarterly Journal of Economics, 124(4):1403–1448.

Hsieh, C.-T. and Moretti, E. (2019). Housing Constraints and Spatial Misallocation. American
  Economic Journal: Macroeconomics, 11(2):1–39.

Hulten, C. R. (1978). Growth Accounting with Intermediate Inputs. The Review of Economic
  Studies, 45(3):511–518.

Kanbur, R. (2009). Conceptualizing Informality: Regulation and Enforcement. Working Papers
  48926, Cornell University, Department of Applied Economics and Management.

Krugman, P. (1991). Increasing Returns and Economic Geography. Journal of Political Economy,
  99(3):483–499.

La Porta, R. and Shleifer, A. (2008). The Unoﬃcial Economy and Economic Development. Brookings
  Papers on Economic Activity, 39(2 (Fall)):275–363.

La Porta, R. and Shleifer, A. (2014). Informality and Development. Journal of Economic Perspec-
  tives, 28(3):109–126.

Lagakos, D. and Waugh, M. E. (2013). Selection, agriculture, and cross-country productivity diﬀer-
  ences. American Economic Review, 103(2):948–80.

Levy, S. (2018). Under-Rewarded Eﬀorts: The Elusive Quest for Prosperity in Mexico. Interamerican
  Development Bank.

McCaig, B. and Pavcnik, N. (2018). Export Markets and Labor Allocation in a Low-Income Country.
  American Economic Review, 108(7):1899–1941.

McMillan, M. S. and McCaig, B. (2019). Trade liberalization and labor market adjustment in
  Botswana. Working Paper 26326, National Bureau of Economic Research.

Miyauchi, Y., Nakajima, K., and Redding, S. (2020). Consumption Access and Agglomeration:
  Evidence from Smartphone Data. Working paper, Princeton University.

Monte, F., Redding, S. J., and Rossi-Hansberg, E. (2018). Commuting, Migration, and Local
  Employment Elasticities. American Economic Review, 108(12):3855–3890.

Moreno-Monroy, A. I. and Posada, H. M. (2018). The eﬀect of commuting costs and transport
  subsidies on informality rates. Journal of Development Economics, 130(C):99–112.

Pérez Pérez, J. (2018). City minimum wages. Unpublished paper, Brown University,
  http://jorgeperezperez.com/files/Jorge_Perez_JMP.pdf (viewed July 19, 2018).

Perry, G. E., Maloney, W. F., Arias, O. S., Fajnzylber, P., Mason, A. D., and Saavedra-Chanduvi, J.
  (2007). Informality: Exit and Exclusion. Number 6730 in World Bank Publications. The World
  Bank.

Ramírez, S. B., Rojo, M. H., and Gault, D. A. (2017). Decisiones e Implementación en la Con-
  strucción de las Primeras Once Líneas de la Red del Metro en La Ciudad de México Hacia la
  Desorganización del Metro (1967-2000). Working Paper, Centro de Investigación y Docencia
  Económicas CIDE.

Redding, S. J. and Turner, M. A. (2015). Transportation Costs and the Spatial Organization of
  Economic Activity. In Duranton, G., Henderson, J. V., and Strange, W. C., editors, Handbook
  of Regional and Urban Economics, volume 5 of Handbook of Regional and Urban Economics,
  chapter 0, pages 1339–1398. Elsevier.

Restuccia, D. and Rogerson, R. (2008). Policy Distortions and Aggregate Productivity with Het-
  erogeneous Plants. Review of Economic Dynamics, 11(4):707–720.

Santamaría, M. (2020). The Gains from Reshaping Infrastructure: Evidence from the division of
  Germany. Working paper, University of Warwick.

Suárez, M., Murata, M., and Campos, J. D. (2016). Why do the poor travel less? Urban structure,
  commuting and economic informality in Mexico City. Urban Studies, 53(12):2548–2566.

Święcki, T. (2017). Intersectoral distortions and the welfare gains from trade. Journal of
  International Economics, 104(C):138–156.

Train, K. and McFadden, D. (1978). The goods/leisure tradeoﬀ and disaggregate work trip mode
  choice models. Transportation Research, 12(5):349–353.

Tsivanidis, N. (2019). Evaluating the Impact of Urban Transit Infrastructure: Evidence from Bo-
  gotá’s Transmilenio. Working Paper, UC Berkeley.

Ulyssea, G. (2018). Firms, Informality, and Development: Theory and Evidence from Brazil. Amer-
  ican Economic Review, 108(8):2015–47.

Figures

                                            Figure 1: Transit System




                             (a) Line B                                  (b) Other lines

Notes: This figure plots a map of Mexico City with the transportation system. Panel (a) highlights the transit line
(Line B) that I exploit in my main specification. Panel (b) highlights the two lines that I use as a control group
for the robustness checks. According to the transit expansion plan from 1980, Line C (green line) was planned as a
feeder line in the early 2000s, similar to Line B; however, the Government of the city never constructed it.
Line 12 (red line) is the latest subway line in Mexico City and was opened in 2012. The remaining lines are the
other subway lines of the existing system.




                             Figure 2: Commuting Time- Informal vs. Formal




Notes: This figure plots the point estimate and 95 percent confidence interval of a regression that relates the
probability of commuting within a given time window to an informal dummy variable. The first bar reports the
results for the category of non-commuting, the second bar for workers who spend on average between 1 and 15
minutes, the third bar between 16 and 30 minutes, the fourth bar between 30 and 60 minutes, the fifth bar between
60 and 120 minutes, and the sixth bar more than 120 minutes. The dark-blue bars do not include controls, while the
light-blue bars include individual controls and municipality fixed effects. Standard errors are clustered at the
municipality level.




                                Figure 3: Spatial distribution of informality




                 (a) Informal workers                                         (b) Informal Residents

Notes: This ﬁgure plots a map of Mexico City with the spatial distribution of informality rates. Panel (a) plots a
heat map of workers’ informality rates by deciles in 1999. Panel (b) plots a heat map of residents’ informality rates
by deciles in 2000. The main takeaway of this map is that in the middle-west and center of the city informality rates
are lower than on the boundaries and east of Mexico City. As a result, informal workers that live in the outskirts
have poor access to most of the formal employment, which is located in the center of the city.



                 Figure 4: Diﬀerence in Diﬀerence Results-Workers’ Informality Share




                 (a) Informal workers                                (b) Informal and non-salaried workers

Notes: This figure depicts the point estimates and 90 percent confidence intervals from the difference-in-difference
specification relating workers’ informality rates to the transit shock. The treatment group consists of census tracts
with centroids within a walking range of 25 minutes to stations of Line B. The control group consists of census tracts
in Mexico City. Panel (a) reports the results for the share of informal workers, and panel (b) for the share of
informal and non-salaried workers. Standard errors are clustered at the census tract level.


                         Figure 5: Lower-bound of Transit Infrastructure Impact




Notes: This figure plots the results of the lower-bound effect under the extreme assumption that all workers that
moved from the city to the outskirts were formal. The red line represents the point estimate of the main
specification, the orange line the share of workers in the treated areas that live in the locations that experience
the shock, and the blue line the difference-in-difference point estimate when the specification removes the people
that moved.




                                   Figure 6: Change in CMA across sectors




Notes: This figure plots a heat map of Mexico City with the spatial distribution of the change in CMA across sectors
after the transit shock. I construct natural breaks across locations by taking the difference between the formal-sector
and informal-sector changes in CMA before and after the shock. Each color represents one of the natural-break
categories. Blue represents a very small change, and red a very large change. The figure shows that census tracts
close to the new line gained better access to formal employment relative to the informal sector. Thus, workers
reallocate to the formal sector.




                                       Figure 7: Counterfactual results




                     (a) No migration                                         (b) Migration




           (c) No migration-constant wedge                           (d) Migration-constant wedge


Notes: This figure plots the counterfactual results. Panels (a) and (c) show the results for the counterfactual with
no migration, and panels (b) and (d) for the counterfactual in which there is migration. In panels (a) and (b), I
calibrate the distortions using value added, and in panels (c) and (d) I use a constant wedge for the formal sector
based on Levy (2018).




                                Figure 8: Counterfactual results-Fixed costs




             (a) Decrease formal costs                                   (b) Increase informal costs


Notes: This ﬁgure plots the counterfactual results for changes in the entry ﬁxed cost for both formal and informal
ﬁrms. Panel (a) shows the results for a counterfactual reducing formal ﬁxed costs, and panel (b) for a counterfactual
increasing informal ﬁxed costs. The objective of the government is to reduce informality rates by 0.5%, which is the
aggregate eﬀect that I ﬁnd from the transit shock.



                                        Figure 9: Place-based policies




Notes: This ﬁgure plots a map of Mexico City with the locations in which the Government increases the commercial
ﬂoorspace for formal ﬁrms. The central locations are in red, and the remote locations in blue.




                          Figure 10: Counterfactual results-Place based policies




    (a) Place-based policies central locations                (b) Place-based policies remote locations


Notes: This ﬁgure plots the counterfactual results for changes in the supply of commercial ﬂoor space for formal
ﬁrms. Panel (a) shows the results for a counterfactual increasing commercial ﬂoor space in central locations, and
panel (b) in remote areas. The objective of the government is to increase welfare by 1.84%, which is the eﬀect from
the transit line.




Tables

                             Table 1: Diﬀerence-in-Diﬀerence - Share of Informal Residents


                              (1)                 (2)                  (3)                 (4)                 (5)                 (6)                 (7)                 (8)
Outcome:               ∆(ln LF − ln LI )   ∆(ln LF − ln LI )    ∆(ln LF − ln LI )   ∆(ln LF − ln LI )   ∆(ln LF − ln LI )   ∆(ln LF − ln LI )   ∆(ln LF − ln LI )   ∆(ln LF − ln LI )
                                                           Panel A: Continuous treatment measure-Pool of residents
- ln distance              0.040***            0.054***             0.045***            0.058***             0.014*             0.030***            0.018**             0.035***
                           (0.007)             (0.008)               (0.008)            (0.008)             (0.008)             (0.008)             (0.009)             (0.009)
Observations                3,192               3,192                 3,192              3,192               3,192               3,192               3,192               3,192
R-squared                   0.162               0.248                 0.162              0.248               0.230               0.300               0.230               0.301


                                                               Panel B: Treatment dummy variable-Pool of residents
Ti                         0.038**             0.069***              0.033**            0.067***             0.024              0.068***             0.016              0.064***
                           (0.016)             (0.016)               (0.016)            (0.016)             (0.018)             (0.016)             (0.018)             (0.017)
Observations                3,192               3,192                 3,192              3,192               3,192               3,192               3,192               3,192
R-squared                   0.156               0.241                 0.156              0.240               0.230               0.300               0.230               0.300


                                                          Panel C: Continuous treatment measure-Low skilled residents
-ln distance               0.049***            0.056***             0.053***            0.060***             0.017*             0.032***            0.021**             0.036***
                           (0.008)             (0.008)               (0.008)            (0.008)             (0.009)             (0.008)             (0.010)             (0.009)
Observations                3,192               3,192                 3,192              3,192               3,192               3,192               3,192               3,192
R-squared                   0.137               0.230                 0.138              0.230               0.203               0.281               0.203               0.282


                                                           Panel D: Treatment dummy variable-Low skilled residents
Ti                         0.051***            0.071***             0.046***            0.069***             0.027              0.068***             0.019              0.065***
                           (0.017)             (0.016)               (0.017)            (0.016)             (0.019)             (0.017)             (0.019)             (0.017)
Observations                3,192               3,192                 3,192              3,192               3,192               3,192               3,192               3,192
R-squared                   0.130               0.222                 0.130              0.221               0.202               0.281               0.202               0.281


                                                               Panel E: Continuous treatment measure-Outskirt area
-ln distance               0.072***            0.088***             0.079***            0.095***            0.037***            0.050***            0.041***            0.055***
                           (0.009)             (0.009)               (0.010)            (0.010)             (0.010)             (0.009)             (0.011)             (0.010)
Observations                2,171               2,171                 2,171              2,171               2,171               2,171               2,171               2,171
R-squared                   0.199               0.279                 0.200              0.279               0.279               0.338               0.280               0.338


                                                                Panel F: Treatment dummy variable-Outskirt area
Ti                         0.076***            0.138***             0.066***            0.131***            0.062***            0.110***            0.048**             0.099***
                           (0.021)             (0.019)               (0.022)            (0.019)             (0.023)             (0.021)             (0.024)             (0.022)
Observations                2,171               2,171                 2,171              2,171               2,171               2,171               2,171               2,171
R-squared                   0.185               0.264                 0.184              0.262               0.277               0.336               0.276               0.335


Distance                    Meters             Minutes               Meters             Minutes              Meters             Minutes              Meters             Minutes
Dist.+Prod. Controls          X                   X                     X                  X                   X                   X                   X                   X
Population Controls                               X                                        X                                       X                                       X
State FE                      X                   X                     X                  X
Municipality FE                                                                                                X                   X                   X                   X



Notes: This table reports the results of a regression relating changes in the share of informal residents in each
location to Line B of the subway. Panel A reports the results for the continuous treatment measure and the pool of
residents, panel B for the treatment dummy variable and the pool of residents, panel C for the continuous treatment
measure and low-skilled workers, panel D for the treatment dummy variable and low-skilled workers, panel E for
the continuous treatment measure on the locations that are not in the CBD, and panel F for the treatment dummy
variable on the locations that are not in the CBD. In the first four columns, I include state-time fixed effects, and
in the fifth to eighth columns municipality-time fixed effects. The regressions are weighted by the population
in 2000. Standard errors are clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05,
***p < 0.01.
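
As an interpretation aid, the outcome ∆(ln LF − ln LI) is the change in the log ratio of formal to informal
residents, so a coefficient can be translated into a change in the informality share once a baseline share is
fixed. The sketch below performs this conversion. The baseline share of 0.5 is a purely hypothetical value chosen
for illustration, not a figure from the paper, and the function name is likewise introduced here.

```python
import math


def informality_share_after(baseline_share: float, delta_log_ratio: float) -> float:
    """Informality share implied by a change in ln(L_F / L_I).

    baseline_share: initial informal share s0 = L_I / (L_F + L_I), in (0, 1).
    delta_log_ratio: change in ln(L_F) - ln(L_I), e.g. a regression coefficient.
    """
    if not 0.0 < baseline_share < 1.0:
        raise ValueError("baseline_share must lie strictly between 0 and 1")
    # ln(L_F / L_I) = ln((1 - s) / s), i.e. the log odds of being formal.
    log_odds_formal = math.log((1.0 - baseline_share) / baseline_share)
    new_log_odds = log_odds_formal + delta_log_ratio
    # Invert the log odds: s = 1 / (1 + exp(log odds of being formal)).
    return 1.0 / (1.0 + math.exp(new_log_odds))


# Hypothetical baseline informal share (assumption, not taken from the paper).
s0 = 0.5
# Treatment-dummy coefficient reported in Table 1, panel B, column (2).
beta = 0.069

s1 = informality_share_after(s0, beta)
exact_change = s1 - s0
# First-order approximation: ds is roughly -beta * s0 * (1 - s0).
approx_change = -beta * s0 * (1.0 - s0)

print(f"new share: {s1:.4f}, exact change: {exact_change:.4f}, "
      f"approximation: {approx_change:.4f}")
```

For small coefficients the first-order approximation and the exact conversion are nearly identical, so the
approximation is usually enough for a back-of-the-envelope reading of the table.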




                             Table 2: Diﬀerence-in-Diﬀerence - Log individuals


                                     (1)           (2)            (3)          (4)          (5)           (6)
    Outcome:                       ∆ ln Li      ∆ ln LiF        ∆ ln LiI     ∆ ln Li     ∆ ln LiF       ∆ ln LiI
                                               Panel A: Pool of workers
    Ti                             0.017*       0.057***        -0.010       -0.006      0.030***      -0.034***
                                   (0.009)       (0.010)        (0.013)      (0.009)      (0.011)       (0.013)
    Observations                    3,192         3,192          3,192       3,192         3,192         3,192
    R-squared                       0.310         0.417          0.177       0.365         0.458         0.251


                                             Panel B: Low-skilled workers
    Ti                            0.022***      0.067***        -0.002       -0.001      0.039***      -0.026**
                                   (0.008)       (0.011)        (0.013)      (0.009)      (0.011)       (0.013)
    Observations                    3,192         3,192          3,192       3,192         3,192         3,192
    R-squared                       0.440         0.489          0.220       0.465         0.513         0.280


                                             Panel C: High-skilled workers
    Ti                              0.008         0.022         -0.014       -0.011        0.004       -0.038**
                                   (0.013)       (0.014)        (0.017)      (0.012)      (0.013)       (0.017)
    Observations                    3,192         3,192          3,192       3,192         3,192         3,192
    R-squared                       0.446         0.442          0.375       0.497         0.492         0.427


    Controls                         X             X               X           X            X              X
    State fe                         X             X               X
    Municipality fe                                                            X            X              X

Notes: This table reports the results of a regression relating changes in the log of the number of individuals in each
location and sector to Line B of the subway. Panel A reports the results for the pool of workers, panel B for
low-skilled workers, and panel C for high-skilled workers. In the first three columns, I include state-time fixed
effects, and in the fourth to sixth columns municipality-time fixed effects. The first and fourth columns report the
results for the overall number of individuals, the second and fifth columns for individuals in the formal sector, and
the third and sixth columns for workers in the informal sector. The regressions are weighted by the population in 2000.
Standard errors are clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.
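
Because the Table 1 outcome is the difference of the two sector-specific outcomes in this table, ∆(ln LF − ln LI) =
∆ ln LF − ∆ ln LI, and least squares is linear in the dependent variable, the formal and informal coefficients
should differ by the corresponding Table 1 coefficient whenever the regressors, weights, and sample coincide. The
short check below applies this identity to the reported point estimates. Matching these columns to columns (4) and
(8) of Table 1 is an assumption based on the control and fixed-effect rows, not something the notes state, and the
tolerance only allows for rounding to three decimals.

```python
# Treatment-dummy point estimates as reported in Table 2: (formal, informal).
table2 = {
    ("pool", "state FE"): (0.057, -0.010),
    ("pool", "municipality FE"): (0.030, -0.034),
    ("low-skilled", "state FE"): (0.067, -0.002),
    ("low-skilled", "municipality FE"): (0.039, -0.026),
}

# Table 1 treatment-dummy estimates, panels B and D, columns (4) and (8).
# Assumption: these are the specifications comparable to Table 2.
table1 = {
    ("pool", "state FE"): 0.067,
    ("pool", "municipality FE"): 0.064,
    ("low-skilled", "state FE"): 0.069,
    ("low-skilled", "municipality FE"): 0.065,
}

# Each reported coefficient is rounded to three decimals, so the implied
# difference can deviate from the reported one by up to about 0.0015.
ROUNDING_TOLERANCE = 0.0015


def implied_log_ratio_effect(formal: float, informal: float) -> float:
    """Effect on ln(L_F) - ln(L_I) implied by the two sector-level effects."""
    return formal - informal


gaps = {}
for key, (formal, informal) in table2.items():
    implied = implied_log_ratio_effect(formal, informal)
    gaps[key] = abs(implied - table1[key])
    print(key, f"implied={implied:.3f}", f"reported={table1[key]:.3f}")

consistent = all(gap <= ROUNDING_TOLERANCE for gap in gaps.values())
print("consistent within rounding:", consistent)
```

A check of this kind is only a consistency test on reported numbers; it does not re-estimate anything and says
nothing about the standard errors.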




                           Table 3: Change in covariates after the transit shock


                                       (1)       (2)       (3)         (4)        (5)         (6)            (7)       (8)
       Outcome:                                 Number of kids                               Household size
       - ln distance                  0.008              -0.004                  0.015                     0.011
                                     (0.042)             (0.052)                (0.013)                    (0.014)
       Ti                                      -0.026                -0.049                  0.024                    0.011
                                               (0.091)               (0.104)                (0.030)                  (0.029)
       Observations                   3,192    3,192     3,192        3,192      3,192       3,192         3,192      3,192
       R-squared                      0.038    0.038     0.076        0.076      0.060       0.060         0.076      0.076


       Outcome:                                  Male dummy                                          Age
       - ln distance                  0.000              -0.000                  -0.008                    0.031
                                     (0.000)             (0.000)                (0.021)                    (0.026)
       Ti                                      -0.001               -0.002***                0.008                   0.107*
                                               (0.001)               (0.001)                (0.050)                  (0.057)
       Observations                   3,192    3,192     3,192        3,192      3,192       3,192         3,192      3,192
       R-squared                      0.255    0.255     0.273        0.273      0.137       0.137         0.179      0.180


       Outcome:                                High-skilled share                            Student share
       - ln distance                  -0.000             -0.001                 -0.002**                   -0.001
                                     (0.001)             (0.001)                (0.001)                    (0.001)
       Ti                                      -0.002                -0.001                -0.006***                 -0.004**
                                               (0.002)               (0.002)                (0.002)                  (0.002)
       Observations                   3,192    3,192     3,192        3,192      3,192       3,192         3,192      3,192
       R-squared                      0.244    0.244     0.310        0.310      0.142       0.143         0.174      0.175
       Controls                         X        X         X           X           X          X              X          X
       State FE                         X        X                                 X          X
       Municipality FE                                     X           X                                     X          X


Notes: This table reports the results of a difference-in-difference specification relating changes in household composition
and covariates with the transit shock. The odd columns report the results for the continuous treatment variable,
and the even columns for the treatment dummy variable. The regressions are weighted by the population in 2000.
Standard errors are clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.




                     Table 4: Gravity Equations-Commuting and Trade Elasticities


                                                                       (1)                         (2)
                                                                 Formal sector              Informal sector
                                              Panel A: commuting
 Outcome                                                             ln λniF                     ln λniI


 Minutes                                                            -0.028***                  -0.042***
                                                                     (0.003)                     (0.005)
 Observations                                                         2,257                       2,280
 R-squared                                                            0.535                       0.518
 Implied θ                                                            3.11                        4.66
                                                Panel B: Trade
 Outcome                                                             ln πniF                     ln πniI


 Minutes                                                            -0.059***                  -0.078***
                                                                     (0.004)                     (0.005)
 Observations                                                         2,128                       2,108
 R-squared                                                            0.406                       0.497
 Implied σ                                                            5.39                        6.94


 Origin -Transportation mode FE                                         X                           X
 Destination -Transportation mode FE                                    X                           X

Notes: This table reports the results of a gravity equation relating commuting and trade flows at the municipality
level to the average travel time for four different transportation modes: car, bus, metro or metrobus (BRT), and walking.
I estimate this regression via the PPML method to include the zeros. The first column presents the results for the
formal sector and the second column for the informal sector. Standard errors are clustered at the municipality of origin
level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.
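
To make the estimation step concrete, the following is a minimal sketch of a PPML fit, not the paper's code. It assumes a single regressor (travel minutes) and an intercept, omits the origin-mode and destination-mode fixed effects used in the table, and runs on made-up flows. The function name `ppml` and all data are hypothetical.

```python
import math

def ppml(minutes, flows, iters=50):
    """Poisson pseudo-maximum-likelihood fit of flows = exp(a + b * minutes).

    Newton-Raphson on the Poisson log-likelihood. Zeros in `flows` are kept.
    Fixed effects are omitted to keep the sketch short (an assumption).
    """
    a = math.log(sum(flows) / len(flows))  # start at the log of the mean flow
    b = 0.0
    for _ in range(iters):
        mu = [math.exp(a + b * x) for x in minutes]
        # Score (gradient) of the Poisson log-likelihood.
        g0 = sum(y - m for y, m in zip(flows, mu))
        g1 = sum((y - m) * x for y, m, x in zip(flows, mu, minutes))
        # Information matrix (negative Hessian), 2 x 2.
        h00 = sum(mu)
        h01 = sum(m * x for m, x in zip(mu, minutes))
        h11 = sum(m * x * x for m, x in zip(mu, minutes))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

# Hypothetical municipality pairs: travel time in minutes and commuting flows.
minutes = [10, 20, 30, 45, 60, 90]
flows = [120, 95, 60, 40, 0, 12]  # one pair has a zero flow

a_hat, b_hat = ppml(minutes, flows)
print(f"semi-elasticity with respect to minutes: {b_hat:.4f}")
```

Because the Poisson objective uses the level of the flow rather than its log, pairs with zero flows stay in the sample, which is the reason the notes give for using PPML.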




                            Table 5: Estimation of Labor Supply across sectors


                                      (1)          (2)            (3)        (4)          (5)          (6)
                                     OLS          OLS             IV1        IV1          IV2          IV2
             Outcome:              ∆t,s ln Ls   ∆t,s ln Ls   ∆t,s ln Ls   ∆t,s ln Ls   ∆t,s ln Ls   ∆t,s ln Ls


             κ                      0.179**     -0.383***    1.142***     1.516***     1.490***     2.387***
                                    (0.078)      (0.102)      (0.142)      (0.222)      (0.351)      (0.643)


             Observations            3,192        3,192        3,192        3,192        3,192        3,192
             Adjusted R-squared      0.237        0.296        0.247        0.302        0.238        0.295


                                                                  FS1       FS1          FS2          FS2
             Outcome:                                        ∆t,s CMA     ∆t,s CMA     ∆t,s CMA     ∆t,s CMA


             ∆t,s CMA                                        1.981***     1.407***
                                                              (0.093)      (0.070)
             Ti                                                                        0.047***     0.028***
                                                                                        (0.004)      (0.003)


             Observations                                      3,192        3,192        3,192        3,192
             Adjusted R-squared                                0.588        0.852        0.523        0.822


             F-stat                                           588.91       708.68        77.06        59.95
             Controls                  X            X             X           X            X            X
             State FE                  X                          X                        X
             Municipality FE                        X                         X                         X

Notes: This table reports the results of the estimation of the labor supply elasticity to recover the parameter κ, which
governs the reallocation from the informal to the formal sector. The dependent variable is the change in the log ratio
between formal and informal workers. The independent variable is the change in CMA across sectors. The first two
columns show the OLS results. The third and fourth columns display the results of a two-stage least squares
estimation using as an instrument the change in CMA across sectors, holding constant the number of workers and
residents. The fifth and sixth columns display the results of a two-stage least squares estimation using as an instrument
a dummy variable indicating whether the centroid of the census tract is within a 25-minute walking range. The
odd columns include state fixed effects, and the even columns include municipality fixed effects. Standard errors are
clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.
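
The mechanics of the instrumental-variable columns can be illustrated with a small sketch. It assumes one endogenous regressor and one instrument, and leaves out the controls, fixed effects, weights, and clustered standard errors used in the table. The data and function names are hypothetical, and the direction of the OLS bias in this toy example is arbitrary.

```python
def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance (the 1/n scaling cancels in the ratios below)."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / len(u)

def ols_slope(x, y):
    return cov(x, y) / cov(x, x)

def iv_slope(z, x, y):
    """Just-identified 2SLS with one instrument: reduced form / first stage."""
    first_stage = cov(z, x) / cov(z, z)   # effect of the instrument on the regressor
    reduced_form = cov(z, y) / cov(z, z)  # effect of the instrument on the outcome
    return reduced_form / first_stage

# Hypothetical data: z is a treatment dummy, u an unobserved shock that
# moves both the regressor and the outcome but is uncorrelated with z.
z = [0, 0, 1, 1]
u = [1, -1, 1, -1]
x = [2 * zi + ui for zi, ui in zip(z, u)]    # first stage
y = [1.5 * xi + ui for xi, ui in zip(x, u)]  # true slope is 1.5

print(ols_slope(x, y))    # 2.0, contaminated by the common shock
print(iv_slope(z, x, y))  # 1.5
```

With a single instrument the 2SLS estimate is the ratio of the reduced-form coefficient to the first-stage coefficient, so a weak first stage would make the ratio unstable. The table's F-statistics speak to that concern.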




                                   Table 6: Counterfactual Results $\hat{X} = X'/X$

                             (1)            (2)             (3)            (4)            (5)             (6)
                            Panel A: Percentage change in welfare-Distortions
                                       No Migration                                    Migration
    %∆ Welfare              ∆dni           ∆τni        ∆dni , ∆τni        ∆dni           ∆τni        ∆dni , ∆τni


    Total change           0.95%          0.77%           1.86%          0.92%           0.73%          1.75%
    Decomposition
    Pure Eﬀect             91.94%         65.36%         79.16%          91.70%         65.51%         79.27%
    Allocation             7.17%          31.17%         19.20%          7.21%          30.54%         18.64%
    Agglomeration          0.89%          3.46%           1.64%          1.08%           3.95%          2.09%


                          Panel B: Percentage change in welfare-Constant wedge
    %∆ Welfare              ∆dni           ∆τni        ∆dni , ∆τni        ∆dni           ∆τni        ∆dni , ∆τni


    Total change           0.93%          0.71%           1.76%          0.91%           0.68%          1.68%
    Decomposition
    Pure Eﬀect             93.60%         71.26%         83.71%          92.72%         70.70%         82.90%
    Allocation             5.62%          23.55%         14.01%          6.29%          23.61%         14.39%
    Agglomeration          0.78%          5.19%           2.28%          0.99%           5.69%          2.71%

Notes: This table reports the counterfactual results for line B of the subway. The first and fourth columns consider
only changes in commuting costs, the second and fifth columns changes in trade costs, and the third and sixth columns
changes in both types of iceberg costs. The first three columns present the results for the counterfactual with
no migration, and the last three columns for the counterfactual in which I allow for migration in the model. Panel
A reports the results for welfare with the calibrated distortions à la Hsieh & Klenow (2009), and panel B for welfare
with a constant wedge in the formal sector based on Levy (2018). The first row reports the total change, and the
remaining rows decompose it into its components. The second row shows
the percentage explained by the direct effect, the third row by the allocative efficiency margin, and the fourth row by
the agglomeration externality component.
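
As a reading aid, the decomposition shares can be converted into percentage-point contributions to the welfare change. The sketch below does this for Panel A, column (3), using the numbers in the table. The function name is hypothetical, and comparing the allocation share to the pure-effect share is one possible way to read the amplification, not a calculation stated in the notes.

```python
def contributions(total_pct, shares_pct):
    """Convert decomposition shares (percent of the total change) into
    percentage-point contributions to the welfare change."""
    return {name: total_pct * share / 100.0 for name, share in shares_pct.items()}

# Panel A, column (3): no migration, both commuting and trade costs change.
total = 1.86
shares = {"pure": 79.16, "allocation": 19.20, "agglomeration": 1.64}

parts = contributions(total, shares)
ratio = shares["allocation"] / shares["pure"]

for name, value in parts.items():
    print(f"{name}: {value:.3f} pp")
print(f"allocation relative to pure effect: {ratio:.3f}")
```

For this column the allocation term accounts for about 0.36 percentage points of the 1.86 percent gain, roughly 24 percent of the pure effect.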




                                           Online Appendix:
         Spatial Misallocation, Informality, and Transit Improvements

A Additional Figures

B Additional Tables

C Data and Quantification Appendix
  C.1 Calibration of Speeds
  C.2 Calibration of Fixed Costs
  C.3 Algorithm
  C.4 Additional Infrastructure Counterfactuals
  C.5 Formal workers rebate
  C.6 Line 1 and 2

D Theoretical Appendix
  D.1 Welfare Decomposition
  D.2 The problem of the social planner
  D.3 Equilibrium Conditions - Exact Hat Algebra
  D.4 Model with ex-ante firm decision




A       Additional Figures

                                           Figure A1: Informality Rates-Latin America and the Caribbean




[Figure A1: bar chart of informality rates by country. Vertical axis: Informality Rate (0 to 80). Countries, left to right: Chile, Uruguay, Costa Rica, Brazil, Venezuela, Argentina, El Salvador, Colombia, Dom. Rep., Mexico, Peru, Guatemala, Nicaragua, Bolivia, Paraguay.]

Notes: This figure plots informality rates across countries from Latin America and the Caribbean. The data source is the online
appendix from Ulyssea (2018) that uses data from SEDLAC, an initiative from the World Bank and Universidad Nacional de La
Plata. Informal workers are defined as those without social security. The orange line represents the average informality rate of
countries from the OECD. The figure shows that informality rates in LAC are very high, and even within the region, Mexico is
one of the countries with the highest informality rates.




                                   Figure A2: Firm size and Productivity Distribution-Economic Census 1999
[Figure A2: two density plots, each with vertical axis "Density function". Panel (a) Firm size: horizontal axis "Firm Size", ticks from 1 to 65536. Panel (b) Productivity: horizontal axis "Sales per worker", ticks from -5 to 10. Legend in both panels: Legal and informal; Illegal and informal; Mixed; Legal and formal.]

Notes: This figure plots the firm size and productivity distribution for the four different categories of firms: 1) Legal and informal,
2) Illegal and informal, 3) Mixed, and 4) Legal and formal. I use the 2004 economic census. Panel (a) plots the firm size distribution
and panel (b) the productivity distribution. Firm size is measured as the number of workers, and productivity as the logarithm of
sales per worker.




                       Figure A3: Diﬀerence in Diﬀerence Results-Residents’ Informality Share




Notes: This figure depicts the point estimate and 90 percent confidence interval of a regression that relates the change over
time in the log of the ratio between formal and informal residents to the transit shock. The treatment variable takes a value of 1
for census tracts with a centroid within a 25-minute walking range. The first three bars show the results of a regression including
distance and population controls with state fixed effects, and the second three bars report the results with municipality fixed effects.
Standard errors are clustered at the census tract level. The dark-blue bar reports the results for the pool of workers, the middle-blue
bar for low-skilled workers, and the light-blue bar for high-skilled workers. Line B increased the ratio of formal to informal residents
by approximately 7% in treated areas relative to the rest of Mexico City.



                               Figure A4: Robustness Checks-Residents’ Informality Share




Notes: This figure depicts the point estimate and 95 percent confidence interval of a regression that relates the change over
time in the log of the ratio between formal and informal residents to the transit shock. The treatment variable takes a value of
1 for census tracts with a centroid within a buffer zone of the new subway line. The control group consists of locations within a buffer
zone of line C or line 12. These were subway lines that the Government planned to build in the 1980s but did not build during my
period of analysis. I use different buffer zones: 1500, 2000, 2500, and 3000 meters. The first bar shows the results for 1500 meters,
the second bar for 2000 meters, the third bar for 2500 meters, and the fourth bar for 3000 meters. Standard errors are clustered at
the census tract level.
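
The buffer-based sample construction described in these notes can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: distances are taken as already computed in meters, tracts that fall inside both buffers are treated as treated, and the tract identifiers and distances are made up.

```python
def assign_group(dist_new_line_m, dist_planned_line_m, buffer_m):
    """Classify a census tract by the distance (meters) from its centroid to
    the new line and to the nearest planned-but-unbuilt line.

    Returns "treated", "control", or None (dropped from the sample).
    Assumption: tracts inside both buffers count as treated.
    """
    if dist_new_line_m <= buffer_m:
        return "treated"
    if dist_planned_line_m <= buffer_m:
        return "control"
    return None

# Hypothetical tracts: (distance to line B, distance to line C or line 12).
tracts = {"t1": (900, 5200), "t2": (4100, 1800), "t3": (2600, 2900), "t4": (7000, 8000)}

for buffer_m in (1500, 2000, 2500, 3000):
    groups = {
        name: assign_group(d_new, d_plan, buffer_m)
        for name, (d_new, d_plan) in tracts.items()
    }
    print(buffer_m, groups)
```

Widening the buffer adds tracts to both groups, which is why the estimates are reported separately for each buffer size.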




                                Figure A5: Robustness checks-Workers’ Informality Share




                       (a) Buﬀer: 1500 meters                                     (b) Buﬀer: 2000 meters




                       (c) Buﬀer: 2500 meters                                     (d) Buﬀer: 3000 meters

Notes: This figure depicts the point estimates and 95 percent confidence intervals from the difference-in-difference specification
using different buffers and different control groups. The treatment group is defined as census tracts that are within a buffer of
the new stations. The control group consists of census tracts within a buffer of lines that the Government planned to build in 1980 but
that were not constructed in my period of study. The outcome variable is the share of informal and non-salaried workers, and the
specification includes state-time fixed effects. Panel (a) reports the results for a buffer of 1500 meters, panel (b) for 2000 meters,
panel (c) for 2500 meters, and panel (d) for 3000 meters. The blue line pools together as a control group the locations close to
line C and line 12, the orange line uses locations close to line C, and the green line uses locations close to line 12. Standard errors are
clustered at the census tract level.




                                                                       Figure A6: Informal/formal sector by industry




[Figure A6: stacked bar chart. Vertical axis: Informal and formal employment (0 to 1). Bar segment labels, lower / upper: Manufacturing 0.85 / 0.15; Retail 0.47 / 0.53; Service 0.54 / 0.46. Legend: Formal, Informal. Source: Levy (2018).]




Notes: This figure plots the share of employment by industry between the formal and informal sectors. The information comes
from the book by Levy (2018), who uses the 2014 Mexican Economic Censuses. As in this study, the author defines the
informal and formal sectors using the contractual relationship between the firm and the worker. An establishment is informal if it
only hires non-salaried workers or if it does not provide social security to its workforce.



                                                               Figure A7: Distribution of Labor Wedges by Sector




Notes: This figure plots the distribution of the labor wedge by sector across the different census tracts. I follow Hsieh and Klenow
(2009) to calculate the labor wedge for each sector-location cell using the inverse of the labor share. In particular, the distortion is
computed using the following relation: $\frac{w_{is} L_{is}}{p_{is} y_{is}}$. The blue line depicts the labor wedge distribution for the formal sector, and the red
line for the informal sector. The figure suggests that conditional on productivity, formal firms are too small relative to a perfectly
efficient allocation since these firms have higher levels of total factor revenue productivity (TFPR). The marginal revenue product
of labor does not equalize across firms.
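
A minimal sketch of the wedge calculation described in the notes, taking the wedge as the inverse of the labor share in each sector-location cell. Any additional scaling (for example by an output elasticity of labor) is omitted here as a simplifying assumption, and the function name and numbers are hypothetical.

```python
def labor_wedge(wage_bill, revenue):
    """Inverse labor share for one sector-location cell.

    wage_bill = w_is * L_is, revenue = p_is * y_is.
    """
    if wage_bill <= 0 or revenue <= 0:
        raise ValueError("wage bill and revenue must be positive")
    return revenue / wage_bill

# Hypothetical cells: (sector, wage bill, revenue).
cells = [("formal", 30.0, 120.0), ("informal", 40.0, 100.0)]
for sector, wage_bill, revenue in cells:
    print(sector, labor_wedge(wage_bill, revenue))
```

A cell with a low labor share gets a high wedge, which is the sense in which such cells employ too few workers relative to an allocation that equalizes the marginal revenue product of labor.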




                                       Figure A8: Distribution of Wages by Sector




Notes: This figure plots the wage distribution obtained from the market access measures and the number of workers in each census
tract. According to the definition of firm market access, $w_{is}^{\theta_s} = L_{is}\,\mathrm{FMA}_{is}^{-1}$. The blue line depicts the wage distribution for the
formal sector, and the red line for the informal sector. The model replicates the fact that the formal sector pays a wage premium. This
premium is approximately 55% when comparing the median wage in the formal and informal sectors.
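
The wage inversion in the notes can be sketched as follows, reading the relation as wages raised to the power θ equal to workers divided by firm market access. The θ values are the implied estimates reported in Table 4. The worker counts and FMA values are made up, so the printed premium has no empirical meaning.

```python
from statistics import median

def wage(workers, fma, theta):
    """Solve w**theta = workers / fma for w."""
    return (workers / fma) ** (1.0 / theta)

THETA_FORMAL, THETA_INFORMAL = 3.11, 4.66  # implied theta from Table 4

# Hypothetical (workers, FMA) pairs, one per census tract.
formal = [(500.0, 40.0), (800.0, 50.0), (300.0, 35.0)]
informal = [(400.0, 90.0), (600.0, 100.0), (200.0, 70.0)]

w_formal = [wage(n, f, THETA_FORMAL) for n, f in formal]
w_informal = [wage(n, f, THETA_INFORMAL) for n, f in informal]

# Wage premium measured as the ratio of sector medians minus one.
premium = median(w_formal) / median(w_informal) - 1.0
print(f"median formal wage premium: {premium:.1%}")
```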



                                       Figure A9: Change in CMA for each sector




                    (a) ∆ CMA Formal Sector                                  (b) ∆ CMA Informal Sector

Notes: This ﬁgure plots a map of Mexico City at the census tract level with the spatial distribution of the change in CMA after
the transit shock for each sector. I construct ventiles for the change in CMA across locations before and after the transit shock.
Each color represents one quantile category. Blue colors represent a very small change, while red colors represent a very large change. Panel
(a) plots a heat map for the formal sector, and panel (b) for the informal sector. From the ﬁgure, it is clear that locations that
experienced the shock and are close to the new stations got better access to both formal and informal employment.




                       Figure A10: Spatial Distribution of Productivity and the Labor Wedge




                        (a) Productivity                                                (b) Labor wedge

Notes: This ﬁgure plots a map of Mexico City with the spatial distribution of productivity measured as value added per worker. I
construct ventiles across locations after aggregating value added measures and the total number of workers. Each color represents
one of the quantile categories. Census tracts in central areas have higher productivity measures.



                Figure A11: Diﬀerence in Diﬀerence Results-Workers’ Informality Share-20 minutes




                        (a) Informal workers                            (b) Informal and non-salaried workers

Notes: This figure depicts the point estimates and 90 percent confidence intervals from the difference-in-difference specification
relating workers’ informality rates to the transit shock. The treatment group consists of census tracts with centroids within a
20-minute walking range of the stations of line B. The control group consists of census tracts in Mexico City. Panel (a) reports the results for the
share of informal workers, and panel (b) for the share of informal and non-salaried workers. Standard errors are clustered at the
census tract level.




               Figure A12: Diﬀerence in Diﬀerence Results-Residents’ Informality Share-20 minutes




Notes: This figure depicts the point estimates and 90 percent confidence intervals of a regression that relates the change over
time in the log of the ratio between formal and informal residents to the transit shock. The treatment variable takes a value of 1
for census tracts with a centroid within a 20-minute walking range. The first three bars show the results of a regression including
distance and population controls with state fixed effects, and the second three bars report the results with municipality fixed effects.
Standard errors are clustered at the census tract level.




    B     Additional Tables

                                                                 Table B1: Informality and Commuting Patterns

                                                    (1)                         (2)                       (3)                         (4)                  (5)
                                                           Panel A: Probability of working in the same municipality of residence
              Outcome:                     Workplace municipality    Workplace municipality    Workplace municipality    Workplace municipality   Workplace municipality


              Informal                           -0.265***                  -0.231***                  -0.231***                   -0.132***            -0.079***
                                                  (0.009)                     (0.008)                   (0.008)                     (0.006)              (0.008)


              Observations                        577,041                    577,039                   577,039                     517,354               516,931
              R-squared                            0.069                       0.098                     0.123                       0.215                0.465


                                                                Panel B: Probability of working in the CBD of Mexico City
              Outcome:                        Workplace-CBD              Workplace-CBD             Workplace-CBD             Workplace-CBD           Workplace-CBD

              Informal                           -0.086***                   -0.056**                  -0.059***                   -0.037***                -
                                                  (0.026)                     (0.024)                   (0.018)                     (0.011)                 -


              Observations                        577,041                    577,039                   577,039                     517,354                  -
              R-squared                            0.007                       0.042                     0.468                       0.444                  -


              Individual Characteristics                                        X                         X                           X                     X
              Origin FE                                                                                   X                           X                     X
              Transportation Mode FE                                                                                                  X                     X
              Destination FE                                                                                                                                X

    Notes: This table reports the results of a linear probability model relating the probability of working in the same municipality as the one in which the worker resides, and the
    probability of working in the CBD with a dummy variable that takes the value of 1 if the worker is informal. Panel A reports the results for working in the same municipality,
    and panel B whether the individual works in the CBD. Standard errors are clustered at the residence municipality level and reported in parentheses. *p < 0.1, **p < 0.05,
    ***p < 0.01.
                                      Table B2: Descriptive Statistics 1999 and 2000


                                                       Panel A: Outcomes
               Variable                                                  Mean             Sd      Min         Max
               Share informal workers                                   60.25%       33.37%      0.00%      100.00%
               Share informal and non-salaried workers                  43.47%       29.60%      0.00%      100.00%
               Share informal ﬁrms                                      84.15%       18.26%      0.01%      100.00%
               Share informal residents                                 46.68%       10.58%      1.00%       91.97%
               Share informal high-skilled residents                    35.65%        7.47%      1.42%       76.99%
               Share informal low-skilled residents                     50.31%       10.47%      1.07%       93.01%


                                                 Panel B: Treatment Variables
               Variable                                                  Mean             Sd      Min         Max
               Euclidean Distance to new stations (meters)             11223.33      6625.81     411.89     32838.87
               Walking Distance to new stations (minutes)                124.70         73.62     4.58       364.88
               Dummy variable (dist<2463)                               10.74%       30.97%      0.00%      100.00%
               Dummy variable (minutes≤25)                              10.00%       30.04%      0.00%      100.00%

Notes: This table reports summary statistics of the main variables. Panel A presents the statistics for the outcomes of interest:
workers’ informality rates from the Economic Census in 1999 and residents’ informality rates from the Population Census in 2000.
Panel B presents the different definitions of the treatment variable: the Euclidean distance, the network walking distance, a
dummy variable for whether the centroid of the AGEB is within a buffer zone of 2000 meters of the new stations, and a dummy variable
for whether the centroid of the AGEB is within a 25-minute walking range.




                    Table B3: Results: Census tract characteristics 1999 and 2000 vs. Treatment

                                         (1)                (2)                     (3)                     (4)
             Outcome:                ln Income      High Skill Share       Occupation share        Informality Rates


             Ti                      -0.026***          -0.032***              -0.012***                 0.026***
                                       (0.009)           (0.008)                  (0.002)                 (0.007)


             Observations               3,193              3,193                  3,193                   3,193
             R-squared                  0.299              0.205                  0.332                   0.125


             Distance controls            X                  X                      X                       X
             State FE                     X                  X                      X                       X

Notes: This table reports the results of a regression relating census tract characteristics to a dummy variable for whether the centroid
of the census tract is within a 25-minute walking range. The first column reports the results for the log of income, the second
column for the share of high-skilled workers, the third column for the occupation share, and the fourth column for the informality
rate. Standard errors are clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.




                                                            Table B4: Diﬀerence-in-Diﬀerence- Share of Informal Workers

                                                      (1)             (2)              (3)               (4)            (5)         (6)            (7)              (8)
                 Outcome:                          Informal        Informal       Inf.-non salary   Inf.-non salary   Informal   Informal    Inf.-non salary   Inf.-non salary
                                                                              Panel A: Continuous Treatment Measure


                 -ln distancei x 1999                0.000          0.002             0.001             0.002          -0.001      0.001         -0.001            -0.000
                                                    (0.004)         (0.005)          (0.004)           (0.004)        (0.006)     (0.006)        (0.005)          (0.006)
                 -ln distancei x 2004               -0.008          -0.007           -0.012**          -0.011**       -0.015**   -0.016**       -0.018***        -0.019***
                                                    (0.006)         (0.006)          (0.005)           (0.005)        (0.007)     (0.007)        (0.006)          (0.007)
                 -ln distancei x 2009              -0.016***       -0.015**         -0.016***          -0.014**       -0.019**   -0.020**       -0.017**          -0.017**
                                                    (0.006)         (0.006)          (0.006)           (0.006)        (0.008)     (0.008)        (0.007)          (0.008)


                 Observations                       11,504          11,504            11,504            11,504         11,504     11,504         11,504            11,504
                 R-squared                           0.866          0.866             0.844             0.843          0.869       0.869          0.847            0.847


                                                                     Panel B: Treatment Measure using the dummy variable


                 Ti x 1999                          -0.006          -0.010            -0.001            -0.006         -0.005     -0.010         -0.006            -0.013
                                                    (0.010)         (0.010)          (0.009)           (0.009)        (0.011)     (0.011)        (0.010)          (0.010)
                 Ti x 2004                          -0.023*        -0.026**          -0.023**          -0.029**       -0.033**   -0.036***      -0.028**         -0.035***

                                                    (0.013)         (0.013)          (0.011)           (0.011)        (0.013)     (0.013)        (0.012)          (0.012)
                 Ti x 2009                         -0.036***       -0.035**         -0.034***         -0.036***       -0.037**   -0.036**        -0.026*          -0.029**
                                                    (0.013)         (0.014)          (0.012)           (0.013)        (0.014)     (0.014)        (0.014)          (0.014)


                 Observations                       11,504          11,504            11,504            11,504         11,504     11,504         11,504            11,504
                 R-squared                           0.866          0.866             0.843             0.843          0.869       0.869          0.847            0.847
                 Mean outcome before the shock       0.582          0.582             0.415             0.415          0.582       0.582          0.415            0.415


                 Distance Measure                   Meters         Minutes           Meters            Minutes        Meters     Minutes         Meters           Minutes
                 Distance Controls                    X               X                 X                 X              X          X              X                 X
                 State-Time FE                        X               X                 X                 X
                 Municipality-Time FE                                                                                    X          X              X                 X


     Notes: This table reports the results of a regression relating changes in the share of informal workers in each location to line B of the subway. Panel A reports the results
     for the continuous treatment measures, and panel B for the dummy variables. In the first four columns, I include state-time fixed effects, and in the fifth through eighth
     columns, municipality-time fixed effects. Standard errors are clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.
                                                              Table B5: Diﬀerence-in-Diﬀerence- Share of Informal Firms

                                                     (1)               (2)               (3)               (4)             (5)         (6)           (7)               (8)
                Outcome:                          Informal          Informal        Inf.-non salary   Inf.-non salary   Informal    Informal    Inf.-non salary   Inf.-non salary
                                                                                Panel A: Continuous Treatment Measure


                -ln distancei x 1999               -0.005**          -0.005*            -0.004            -0.003        -0.005**     -0.004*        -0.004            -0.003
                                                   (0.002)           (0.002)           (0.003)           (0.003)         (0.002)     (0.003)       (0.003)           (0.003)
                -ln distancei x 2004              -0.009***         -0.007**           -0.006*            -0.004        -0.009***   -0.009***      -0.006**           -0.005
                                                   (0.003)           (0.003)           (0.003)           (0.003)         (0.003)     (0.003)       (0.003)           (0.003)
                -ln distancei x 2009              -0.018***         -0.017***         -0.015***         -0.014***       -0.017***   -0.018***     -0.013***         -0.014***
                                                   (0.003)           (0.003)           (0.003)           (0.003)         (0.003)     (0.004)       (0.004)           (0.004)


                Observations                       11,504            11,504             11,504            11,504         11,504      11,504         11,504            11,504
                R-squared                           0.884             0.883             0.906             0.906           0.892       0.892         0.910             0.910


                                                                      Panel B: Treatment Measure using the dummy variable


                Ti x 1999                         -0.014***         -0.013**            -0.007            -0.006        -0.010**     -0.009*        -0.007            -0.005
                                                   (0.005)           (0.005)           (0.006)           (0.006)         (0.005)     (0.005)       (0.006)           (0.006)

                Ti x 2004                         -0.022***         -0.020***          -0.015**          -0.013**       -0.017***   -0.015**       -0.014**          -0.012*
                                                   (0.006)           (0.006)           (0.007)           (0.007)         (0.006)     (0.006)       (0.007)           (0.007)
                Ti x 2009                         -0.032***         -0.029***         -0.022***         -0.019***       -0.026***   -0.023***     -0.019***          -0.016**
                                                   (0.006)           (0.006)           (0.007)           (0.007)         (0.006)     (0.006)       (0.007)           (0.007)


                Observations                       11,504            11,504             11,504            11,504         11,504      11,504         11,504            11,504
                R-squared                           0.883             0.883             0.906             0.906           0.892       0.892         0.910             0.910
                Mean outcome before the shock       0.833             0.833             0.796             0.796           0.833       0.833         0.796             0.796


                Distance Measure                   Meters           Minutes            Meters            Minutes         Meters     Minutes        Meters            Minutes
                Distance Controls                     X                X                  X                 X              X           X              X                 X
                State-Time FE                         X                X                  X                 X
                Municipality-Time FE                                                                                       X           X              X                 X


     Notes: This table reports the results of a regression relating changes in the share of informal firms in each location to line B of the subway. Panel A reports the results
     for the continuous treatment measures, and panel B for the dummy variables. In the first four columns, I include state-time fixed effects, and in the fifth through eighth
     columns, municipality-time fixed effects. Standard errors are clustered at the census tract level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.
                                          Table B6: Nested Logit - Iceberg Costs


                                                                       (1)                              (2)
           Costs:                                              Commuting                              Trade
                                                           Trips to Workplace                    Trips to Shops


           Minutes                                               -0.009***                          -0.013***
                                                                     (0.000)                          (0.000)
           Metro                                                     -0.144*                        -0.596***
                                                                     (0.079)                          (0.007)
           Metrobus                                              -0.556***                          -0.855***
                                                                     (0.117)                          (0.100)
           Car                                                    -0.220**                          -0.603***
                                                                     (0.095)                          (0.073)
           Walking                                               -0.352***                           0.338***
                                                                     (0.106)                          (0.049)
           λ public                                              0.484***                            0.484***
                                                                     (0.033)                          (0.009)
           Observations                                              56,330                          312,015
           Trips                                                     11,266                           62,403


           Iceberg cost before (mean)                                4.095                             7.217
           Iceberg cost after (mean)                                 3.901                             6.893

Notes: This table reports the results of a nested logit using the 2017 origin-destination (OD) survey, considering only trips that use one
transportation mode. The first column reports the estimates for commuting costs, considering only trips from work to home or vice versa
between 6 am and 10 am and between 5 pm and 9 pm. The second column reports the estimates for trade costs using trips to retail
shops, outlets, and restaurants. I restrict the sample to trips after 1 pm.




C        Data and Quantiﬁcation Appendix

C.1        Calibration of Speeds

This section describes the calibration of speeds across different transportation modes. I use several sources
of information. For the transportation network in Mexico City, I use data from the city government.⁵⁵
For the road network, I use information from the New York University digital archive, which reports
the different types of roads for each census tract in the commuting zone of Mexico City. The road types include
autopistas, calles, viaductos, etc. I calibrate an average speed for each road type. With this information, I
compute commuting times across census tracts in Mexico City using the network analysis toolkit from ArcMap
(Tsivanidis, 2019). I compute these times for four different modes of transportation: walking, car, traditional
buses, and the subway. I add five minutes at each station when I compute times for the public transit network,
and three minutes when I compute travel times for “car” to capture the time spent in the parking lot. To
compute commuting and shopping iceberg costs, I take an average of these times across the different modes. I
calculate a matrix across census tracts of approximately 13 million observations.

  ⁵⁵ The data can be found here.

                           Table C7: Calibration of speeds using trip data from Google Maps

                     Type                                                             Speed
                                               Panel A: Public transit system
                     Subway Lines                                                601.24 m/min
                     Metrobus                                                    308.13 m/min
                     Bus                                                         216.67 m/min
                     Walking                                                      90.00 m/min


                                             Panel B: Types of roads for cars
                     Autopista                                                   752.03 m/min
                     Avenida                                                     266.84 m/min
                     Boulevard                                                   608.12 m/min
                     Calle                                                       198.56 m/min
                     Callejón                                                    69.643 m/min
                     Calzada                                                     169.98 m/min
                     Carretera                                                   623.38 m/min
                     Cerrada                                                     123.39 m/min
                     Circuito                                                    304.69 m/min
                     Corredor                                                    160.75 m/min
                     Eje vial                                                    273.98 m/min
                     Pasaje                                                      240.71 m/min
                     Periférico                                                  673.43 m/min
                     Viaducto                                                    399.99 m/min

           Notes: This table reports the calibration of speeds using trips from Google Maps. The calibration uses 4,000
           random trips. The information was downloaded with the command gmapsdistance in R, which uses the Distance
           Matrix API from Google. I computed these times between 8 am - 11 am and 5 pm - 8 pm under different traffic
           scenarios.



       To calibrate speeds for each mode and each type of road, I use random trips from Google Maps.⁵⁶ I
downloaded 4,000 random trips between 8 am-11 am and 5 pm-8 pm using the command gmapsdistance in R,
which uses the Google Maps Distance Matrix API. As origin and destination, I use the closest vertex of each
type of road or metro line. This tool can calculate times for different modes (walking, car, or the public transit
network) under several traffic scenarios (pessimistic, optimistic, or none). Using this information, I calibrate
speeds for each road and each line using the average time spent to move from one vertex to the other. Table C7
reports the average speed for each one of the roads and the public transit system.

  ⁵⁶ I did not calculate times across census tracts using Google Maps because the network analysis toolkit is much faster, and the
command gmapsdistance takes a lot of time.
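
The calibration step can be sketched as follows. This is only an illustration under stated assumptions: the trips are hypothetical (they are not the Google Maps sample), the function name is invented, and the aggregation rule, total distance divided by total time within each road or line type, is an assumption, since the text only says that speeds are based on the average time between vertices.

```python
from collections import defaultdict


def calibrate_speeds(trips):
    """Return the average speed in meters per minute for each road or line type.

    `trips` is an iterable of (kind, distance_m, duration_min) tuples.
    Speed is total distance over total time (an assumed aggregation rule).
    """
    distance = defaultdict(float)
    duration = defaultdict(float)
    for kind, distance_m, duration_min in trips:
        if duration_min <= 0:
            raise ValueError("duration must be positive")
        distance[kind] += distance_m
        duration[kind] += duration_min
    return {kind: distance[kind] / duration[kind] for kind in distance}


# Hypothetical vertex-to-vertex trips: (type, meters, minutes).
sample_trips = [
    ("Calle", 1000.0, 5.0),
    ("Calle", 1000.0, 5.0),
    ("Autopista", 3000.0, 4.0),
]
speeds = calibrate_speeds(sample_trips)
print(speeds)
```

Dividing the length of a network segment by the calibrated speed of its type then gives the travel time assigned to that segment.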


C.2    Calibration of Fixed Costs

                                          Table C8: Estimation of ﬁxed costs

                                                 (1)                 (2)                (3)             (4)
            Outcome:                           ln Mis              ln Mis           ln Mis            ln Mis


            ln Lis                            0.715***          0.879***          0.642***           0.568***
                                               (0.014)             (0.077)         (0.017)           (0.054)
            γI                                1.799***          1.735***          1.825***           1.853***
                                               (0.034)             (0.036)         (0.051)           (0.046)
            ln wis Lis                                          -0.154**                              0.070
                                                                   (0.069)                           (0.051)
            γ                                -1.005***          -0.559***         -0.598***         -0.810***
                                               (0.055)             (0.194)         (0.091)           (0.183)


            Observations                        5,387              5,387               4,374          4,374
            Adjusted R-squared                  0.851              0.853               0.901          0.902
            Implied FI                          0.182              0.123               0.117          0.141
            Implied FF                          1.366              0.875               0.911          1.124


            State FE                              X                  X
            Municipality FE                                                             X               X

  Notes: This table reports the results of a regression relating the number of firms to the number of workers to recover the
  parameter β and the fixed costs FI and FF for the informal and formal sector, respectively. The unit of observation is a
  sector-census-tract cell. The dependent variable is the log number of firms in each cell. Columns 1 and 2 include state fixed
  effects, while columns 3 and 4 include census-tract fixed effects to control for the price of commercial floor space qi. Even-numbered
  columns add the wage bill for each sector as a control for the price per unit of commercial floor space. Standard
  errors are clustered at the municipality level and reported in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.



In the model from section 4, in equilibrium, the optimal number of firms is

$$M_{is} = \tilde{\beta}\, F_s^{-1} \sigma_s^{-1} \tilde{L}_{is}^{\beta} \tilde{Z}_{is}^{1-\beta},$$

where $\tilde{L}_{is}$ and $\tilde{Z}_{is}$ are the amounts of labor and commercial floor-space units employed in location $i$ and sector $s$,
$\sigma_s$ is the elasticity of substitution, and $F_s$ is the entry fixed cost. Taking logs, I estimate the following equation
relating the number of firms to the number of workers for both sectors in the baseline year. This estimation
allows me to recover the parameters $F_s$:

$$\ln M_{is} = \beta \ln L_{is} + (1-\beta) \ln \underbrace{Z_{is}}_{w_{is} L_{is} / q_i} \underbrace{{} - \ln \sigma_s - \ln F_s}_{\gamma_s}. \tag{C.1}$$

In some of my specifications, to control for $Z_{is}$, I include the wage bill for each sector and location together with a
census-tract fixed effect to capture $q_i$. I then estimate the following equation using the 1999 Economic Census, with the
formal sector as the omitted category:

$$\ln M_{is} = \underbrace{\gamma_1}_{\beta} \ln L_{is} + \underbrace{\gamma_2}_{1-\beta} \ln w_{is} L_{is} + \gamma_i + \gamma_I + \gamma + \epsilon_{is}, \tag{C.2}$$

where $\gamma_i$ is the census-tract fixed effect, $\gamma_I$ is an informal sector dummy variable, and $\gamma$ is a constant term.
From the optimal number of firms, we have that $\gamma_I = \ln(\sigma_I F_I)$ and $\gamma = \ln \tilde{\beta} + \ln(\sigma_F F_F)$.
   Table C8 reports the results of this estimation for different specifications. I estimate the previous equation
including state and municipality fixed effects, and in the even columns I control for the wage bill. On average,
I obtain β ≈ 0.7. I also find that the entry fixed cost for a firm is approximately 0.15 in the informal sector
and 1.1 in the formal sector. The fixed cost of entering the formal sector is therefore roughly seven times that
of entering the informal sector (1.1/0.15 ≈ 7.3). This result is consistent with the fact that informal firms are
smaller on average in terms of workers, but that there are more informal firms in the economy.
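
The logic of this estimation can be illustrated with a minimal simulation. The sketch below is not the paper's code or data: it generates synthetic observations from the log-linear relationship in (C.1), with a common elasticity of substitution and the two fixed costs reported above as assumed inputs, and checks that ordinary least squares recovers β and the ratio of fixed costs. The function name, sample size, noise level, and the value of sigma are all hypothetical.

```python
import numpy as np

def simulate_and_estimate(n=5000, beta=0.7, sigma=4.0,
                          F_formal=1.1, F_informal=0.15, seed=0):
    """Simulate ln M from the log-linear firm-entry equation and fit it by OLS.

    All inputs are illustrative assumptions, not estimates from the paper's data.
    """
    rng = np.random.default_rng(seed)
    informal = rng.integers(0, 2, size=n)      # 1 = informal sector, 0 = formal (omitted)
    ln_L = rng.normal(3.0, 1.0, size=n)        # log labor
    ln_Z = rng.normal(2.0, 1.0, size=n)        # log commercial floor space
    F = np.where(informal == 1, F_informal, F_formal)
    noise = rng.normal(0.0, 0.1, size=n)
    # Data-generating process: ln M = beta ln L + (1 - beta) ln Z - ln sigma - ln F + noise
    ln_M = beta * ln_L + (1.0 - beta) * ln_Z - np.log(sigma) - np.log(F) + noise
    # Regressors: ln L, ln Z, informal dummy, constant
    X = np.column_stack([ln_L, ln_Z, informal, np.ones(n)])
    coef, *_ = np.linalg.lstsq(X, ln_M, rcond=None)
    return coef

coef = simulate_and_estimate()
beta_hat = coef[0]
# Under this data-generating process the dummy coefficient equals ln(F_formal / F_informal).
fixed_cost_ratio = np.exp(coef[2])
print(round(beta_hat, 3), round(fixed_cost_ratio, 2))
```

In this simulated setting the coefficient on log labor identifies β, and the informal-sector dummy identifies the relative entry cost only because sigma is assumed equal across sectors.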


C.3    Algorithm

In this section, I describe the algorithm used to solve the general equilibrium model. The system of equations
is described in Section 4. The index t denotes the iteration. The algorithm is based on Alvarez and Lucas
(2007) and is a contraction mapping. It proceeds as follows:

  1. Guess an initial vector of wages $w^0$ and an initial number of residents in each location $L^0$.

  2. Given the vectors $w^t$ and $L^t$, compute the following equations:

        • Labor supply equations:
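          $$\lambda_{nisL|ns} = \frac{w_{is}^{\theta_s} d_{ni}^{-\theta_s}}{\sum_{i'} w_{i's}^{\theta_s} d_{ni'}^{-\theta_s}} \tag{C.3}$$

          $$\lambda_{nsL|n} = \frac{W_{ns|n}^{\kappa}}{\sum_{s'} W_{ns'|n}^{\kappa}}, \qquad W_{ns|n}^{\theta_s} = \sum_{i} w_{is}^{\theta_s} d_{ni}^{-\theta_s} \tag{C.4}$$

          $$\tilde L_{is} = \sum_{n} \lambda_{nis} \cdot \bar L. \tag{C.5}$$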

        • Average income


          $$\bar y_n \equiv \sum_{i,s} \lambda_{nis} w_{is} \tag{C.6}$$

        • Commercial ﬂoor space prices

          $$q_i \tilde Z_i = \sum_{s} \frac{(1-\beta)(1+t_{isL})\, w_{is} \tilde L_{is}}{\beta (1+t_{isZ})}, \tag{C.7}$$



       • Number of ﬁrms

          $$M_{is} = \frac{\tilde\beta\, \tilde L_{is}^{\beta} \tilde Z_{is}^{1-\beta}}{\sigma_s F_s}, \tag{C.8}$$
       • Expenditure shares

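          $$\pi_{nis} = \underbrace{\frac{P_{ns}^{1-\xi}}{\sum_{s'} P_{ns'}^{1-\xi}}}_{\pi_{ns}} \cdot \underbrace{\frac{M_{is}\, p_{nis}^{1-\sigma_s}}{\sum_{i'} M_{i's}\, p_{ni's}^{1-\sigma_s}}}_{\pi_{nis|s}}, \quad \text{with} \quad P_{ns} = \Big(\sum_{i} M_{is}\, p_{nis}^{1-\sigma_s}\Big)^{\frac{1}{1-\sigma_s}}, \tag{C.9}$$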

       • Government budget constraint

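          $$\delta \equiv \frac{\sum_{i,s} \left( t_{isL} w_{is} \tilde L_{is} + t_{isZ} q_i \tilde Z_{is} \right)}{\sum_{n} \left( \bar y_n L_n + q_n Z_n \right)}$$

          $$\bar t = \frac{\alpha \cdot \delta}{1 + (1-\alpha) \cdot \delta} \tag{C.10}$$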
       • Aggregate Expenditure

          $$X_n = \frac{(1+\bar t)}{\alpha - \bar t (1-\alpha)} \left( \bar y_n L_n + q_n Z_n \right). \tag{C.11}$$
       • Labor demand
          $$Y_{is} = \alpha \sum_{n} \pi_{nis} X_n. \tag{C.12}$$

          $$L^{D}_{is} = \frac{\alpha\beta\, Y_{is}}{w_{is}^{t}} \tag{C.13}$$

       • Calculate the diﬀerence between labor demand and labor supply and the number of residents

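          $$z_w = \frac{\alpha\beta\, Y_{is} - w_{is}^{t} (1+t_{isL}) \tilde L_{is}}{w_{is}^{t} (1+t_{isL}) \tilde L_{is}} \tag{C.14}$$

          $$\tilde L_n^{t} = \frac{B_n P_n^{-\alpha\eta} r_n^{-(1-\alpha)\eta} W_n^{\eta}}{\sum_{n'} B_{n'} P_{n'}^{-\alpha\eta} r_{n'}^{-(1-\alpha)\eta} W_{n'}^{\eta}}\, \bar L \tag{C.15}$$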

  3. If $\|(z_w, \tilde L_i^{t}) - (0, L_i^{t})\| < \epsilon_{tol}$, the algorithm stops. Otherwise, update

          $$w_{is}^{t+1} = w_{is}^{t} (1 + \nu_w z_w) \tag{C.16a}$$
          $$L_n^{t+1} = \nu_L \tilde L_n^{t} + (1-\nu_L) L_n^{t}, \tag{C.16b}$$

     where $\nu_L$ and $\nu_w$ are convergence parameters and $\epsilon_{tol}$ is a tolerance value.
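
The structure of steps 1 to 3 can be sketched on a deliberately small example. The code below is not the paper's solver: it is a toy one-sector economy with no floor space, prices, taxes, or firm entry, in which each workplace is assumed to demand a fixed share of the aggregate wage bill. It only illustrates the damped updates in (C.16a) and (C.16b) driven by an excess-demand term like (C.14) and a residential-choice rule like (C.15). The function name, the matrices `d`, `a`, `B`, and all parameter values are hypothetical.

```python
import numpy as np

def solve_toy_equilibrium(d, a, B, L_bar=1.0, theta=3.0, eta=1.5,
                          nu_w=0.2, nu_L=0.3, tol=1e-10, max_iter=10_000):
    """Damped iteration on wages and residents for a toy one-sector economy.

    d[n, i]: commuting cost from residence n to workplace i (hypothetical values)
    a[i]   : share of the aggregate wage bill demanded by workplace i (sums to 1)
    B[n]   : residential amenity
    """
    n_loc = d.shape[0]
    w = np.ones(n_loc)                    # step 1: initial wage guess w^0
    L = np.full(n_loc, L_bar / n_loc)     # step 1: initial residents guess L^0
    for it in range(max_iter):
        # Step 2: commuting shares, as in (C.3) with a single sector.
        access = (w[None, :] / d) ** theta        # w_i^theta * d_ni^(-theta)
        W_theta = access.sum(axis=1)              # W_n^theta
        lam = access / W_theta[:, None]
        labor_supply = lam.T @ L                  # workers employed in each workplace
        wage_bill = w * labor_supply
        # Toy labor demand (assumption): workplace i receives share a_i of the wage bill.
        z_w = (a * wage_bill.sum() - wage_bill) / wage_bill   # analogue of (C.14)
        # Residential choice: stripped-down analogue of (C.15).
        attract = B * W_theta ** (eta / theta)
        L_tilde = L_bar * attract / attract.sum()
        # Step 3: stop when excess demand and the residents gap are both small.
        if max(np.abs(z_w).max(), np.abs(L_tilde - L).max()) < tol:
            return w / w.mean(), L, it
        w = w * (1.0 + nu_w * z_w)                # wage update, as in (C.16a)
        L = nu_L * L_tilde + (1.0 - nu_L) * L     # residents update, as in (C.16b)
    raise RuntimeError("no convergence within max_iter iterations")

d = np.array([[1.0, 1.5, 2.0],
              [1.5, 1.0, 1.5],
              [2.0, 1.5, 1.0]])
a = np.array([0.5, 0.3, 0.2])
B = np.array([1.0, 1.0, 1.2])
w_star, L_star, n_iter = solve_toy_equilibrium(d, a, B)
print(n_iter, w_star.round(4), L_star.round(4))
```

Smaller values of the damping parameters slow the iteration but make it less likely to overshoot; wages are normalized by their mean on return because the toy system is homogeneous of degree zero in wages.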




C.4    Additional Infrastructure Counterfactuals

In this section, I report the results of two additional counterfactuals. In the first counterfactual, I assume that
the Government gives the rebate only to formal workers instead of to the entire population of the city. In the
second counterfactual, I compute travel times without Lines 1 and 2 of the subway and show that the resource
misallocation component explains only about half as large a share of the total gains as it does for Line B.


C.5    Formal workers rebate

I now compute the welfare gains from Line B under the assumption that the Government gives the rebate only
to formal workers. Two main equations of the general equilibrium framework change. First, the labor supply
equation for the formal sector is now given by:

$$\lambda_{nFL|n} = \frac{B_{nF} W_{nF}^{\kappa} (1+\bar t)^{\kappa}}{B_{nF} W_{nF}^{\kappa} (1+\bar t)^{\kappa} + B_{nI} W_{nI}^{\kappa}}, \tag{C.17}$$

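and the new Government budget constraint that pins down the value of $\bar t$ is given by:

$$\sum_{i,s} \left( t_{isL} w_{is} \tilde L_{is} + t_{isZ} q_i \tilde Z_{is} \right) = \bar t\, \bar L \cdot \sum_{i,n} \lambda_n \lambda_{nFL|n} \lambda_{niFL|nF}\, w_{iF}. \tag{C.18}$$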

   We can solve the system of equations with these two new equations. Table C9 reports the results for the
Line B counterfactual. Overall, the results are very similar to the baseline simulations. The welfare gains are
between 1.7% and 1.8%, and the resource misallocation component explains between 15% and 20% of them.




                 Table C9: Counterfactual Results $\hat X = X'/X$ - Line B

                               No Migration                                 Migration
                       (1)           (2)           (3)             (4)           (5)           (6)
%Δ Welfare             Δd_ni         Δτ_ni         Δd_ni, Δτ_ni    Δd_ni         Δτ_ni         Δd_ni, Δτ_ni

Panel A: Percentage change in welfare - Distortions (calibrated wedges)
Total change           1.00%         0.67%         1.79%           1.01%         0.67%         1.77%
Decomposition
  Pure Effect          87.37%        75.70%        81.98%          83.60%        71.73%        78.32%
  Allocation           11.28%        14.51%        13.82%          14.94%        18.35%        17.20%
  Agglomeration        1.34%         9.79%         4.21%           1.45%         9.93%         4.48%

Panel B: Percentage change in welfare - Constant wedge
Total change           0.95%         0.64%         1.71%           0.94%         0.63%         1.65%
Decomposition
  Pure Effect          92.13%        78.17%        86.19%          89.69%        76.38%        84.07%
  Allocation           7.04%         14.87%        10.94%          9.34%         16.22%        12.70%
  Agglomeration        0.83%         6.97%         2.88%           0.97%         7.39%         3.23%

Notes: This table reports the counterfactual results for Line B of the subway with a rebate only to formal workers. The first and
fourth columns consider only changes in commuting costs, the second and fifth columns changes in trade costs, and the third and
sixth columns changes in both types of iceberg costs. The first three columns present the results for the counterfactual with no
migration, and the last three columns for the counterfactual in which I allow for migration in the model. Panel A reports the
results for welfare with the calibrated distortions à la Hsieh & Klenow (2009), and Panel B for welfare with a constant wedge in
the formal sector based on Levy (2018). The first row reports the total change, while the other rows decompose the total change
into its components: the second row shows the percentage explained by the direct effect, the third row by the allocative
efficiency margin, and the fourth row by the agglomeration externality component.


C.6     Lines 1 and 2

This section reports the results of a counterfactual in which I remove Lines 1 and 2 of the subway. These lines
connect the central areas of the city. Figure C13 plots a map of the city highlighting Lines 1 and 2. The
counterfactual should be interpreted as the allocation that would arise without these lines, starting from an
economy that has them.




                                       Figure C13: Transit System-Lines 1 and 2




                                                     (a) Subway Lines

Notes: This figure plots a map of Mexico City's transportation system, highlighting the first two lines of the subway: Lines 1
and 2. In this counterfactual, I remove these lines to measure the effect on informality and welfare.


    Table C10 reports the main findings. Overall, Lines 1 and 2 lead to a real income increase of around 2.7%.
However, the allocation component explains a smaller fraction of the total gains than for Line B: at most about
10% of the welfare gains. In contrast, in the case of Line B, the reallocation of workers explains more than 20%.
Hence, Line B generated a larger reallocation of workers from the informal to the formal sector relative to the
size of the shock.




                 Table C10: Counterfactual Results $\hat X = X'/X$ - Lines 1 and 2

                               No Migration                                 Migration
                       (1)           (2)           (3)             (4)           (5)           (6)
%Δ Welfare             Δd_ni         Δτ_ni         Δd_ni, Δτ_ni    Δd_ni         Δτ_ni         Δd_ni, Δτ_ni

Panel A: Percentage change in welfare - Distortions
Total change           1.57%         1.16%         2.70%           1.56%         1.16%         2.69%
Decomposition
  Pure Effect          96.71%        78.88%        89.27%          96.77%        78.87%        89.30%
  Allocation           2.93%         10.02%        5.87%           2.86%         9.80%         5.74%
  Agglomeration        0.36%         11.10%        4.86%           0.37%         11.33%        4.96%

Panel B: Percentage change in welfare - Constant wedge
Total change           1.56%         1.14%         2.67%           1.56%         1.14%         2.67%
Decomposition
  Pure Effect          96.93%        80.50%        90.11%          96.88%        80.38%        90.05%
  Allocation           2.69%         7.25%         4.57%           2.71%         7.15%         4.54%
  Agglomeration        0.38%         12.26%        5.32%           0.41%         12.47%        5.41%

Notes: This table reports the counterfactual results for Lines 1 and 2 of the subway. The first and fourth columns consider only
changes in commuting costs, the second and fifth columns changes in trade costs, and the third and sixth columns changes in both
types of iceberg costs. The first three columns present the results for the counterfactual with no migration, and the last three
columns for the counterfactual in which I allow for migration in the model. Panel A reports the results for welfare with the
calibrated distortions à la Hsieh & Klenow (2009), and Panel B for welfare with a constant wedge in the formal sector based on
Levy (2018). The first row reports the total change, while the other rows decompose the total change into its components: the
second row shows the percentage explained by the direct effect, the third row by the allocative efficiency margin, and the
fourth row by the agglomeration externality component.
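
Because the decomposition rows are shares of the total change, it can help to convert them into percentage points of welfare. The short sketch below does this arithmetic for column (6) of Panel A in Tables C9 and C10 (both costs change, with migration). The numbers are copied from the tables; the dictionary layout and function name are only illustrative.

```python
# Column (6), Panel A: total welfare change (%) and decomposition shares (% of total).
tables = {
    "Line B, formal-only rebate (Table C9)": {
        "total": 1.77, "Pure effect": 78.32, "Allocation": 17.20, "Agglomeration": 4.48},
    "Lines 1 and 2 (Table C10)": {
        "total": 2.69, "Pure effect": 89.30, "Allocation": 5.74, "Agglomeration": 4.96},
}

def contributions(row):
    """Convert decomposition shares into percentage points of the welfare change."""
    total = row["total"]
    return {name: total * share / 100.0
            for name, share in row.items() if name != "total"}

for label, row in tables.items():
    pp = contributions(row)
    print(label, {name: round(value, 3) for name, value in pp.items()})
```

In percentage points, the allocation margin contributes about 0.30 for Line B with the formal-only rebate and about 0.15 for Lines 1 and 2, even though the total gain from Lines 1 and 2 is larger.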



D     Theoretical Appendix

D.1     Welfare Decomposition

In this section, I derive the formula for the welfare decomposition. I start with the perfectly efficient economy
and then introduce labor wedges. As in the text, there are three groups of agents: workers, denoted by L;
commercial floor space owners, denoted by Z; and house owners, denoted by H. The two latter groups do not
commute.
    The indirect utility of agent ω is:

$$V_{ni\omega} = B_n \frac{d_{ni}\, w_{is}\, \epsilon_{ni\omega}}{r_n^{\beta} P_n^{1-\beta}} \tag{D.1}$$

where $w_{is}$ is the wage per efficiency unit in location $i$ and sector $s$, and $\epsilon_{ni\omega}$ is an idiosyncratic shock drawn
from a nested Fréchet distribution with dispersion parameters $\theta_s$ and $\kappa$. By the properties of the Fréchet, the
total amount of efficiency units $d_{ni}^{-1}\tilde L_{nis}$, net of commuting costs, provided by location $n$ to location $i$-sector $s$ is:



$$w_{is}\, d_{ni}^{-1} \tilde L_{nis} = \lambda_n \lambda_{ns|n} \lambda_{nis|ns}\, \bar y_n \bar L, \tag{D.2}$$


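where $\bar y_n \equiv \left( \sum_s B_{ns} W_{ns}^{\kappa} \right)^{1/\kappa}$ and $W_{ns} \equiv \left( \sum_i w_{is}^{\theta_s} d_{ni}^{-\theta_s} \right)^{1/\theta_s}$. From these expressions, the goods market
clearing condition is the following system of equations: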


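$$\lambda_n \lambda_{ns|n} \lambda_{nis|ns}\, \bar y_n \bar L = \alpha\beta \sum_{n} \pi_{ns} \pi_{nis|s} \left( \bar y_n \lambda_n \bar L + q_n \lambda_{nZ} Z + r_n \lambda_{nH} H \right) \tag{D.3a}$$

$$q_n \lambda_{nZ} = \alpha (1-\beta) \sum_{n} \pi_{ns} \pi_{nis|s} \left( \bar y_n \lambda_n \bar L + q_n \lambda_{nZ} Z + r_n \lambda_{nH} H \right). \tag{D.3b}$$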

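The housing market clearing condition is:

$$r_n \lambda_{nH} H = (1-\alpha) \left( \bar y_n \lambda_n \bar L + q_n \lambda_{nZ} Z + r_n \lambda_{nH} H \right) \tag{D.3c}$$

The average utility in each location is:

$$\bar U_n = \frac{B_n\, \bar y_n}{P_n^{\alpha}\, r_n^{1-\alpha}} \tag{D.4}$$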


D.1.1       Social Planner

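There is a social planner maximizing welfare such that the market allocation replicates the perfectly efficient
allocation. The planner's problem is to maximize:

$$\bar U = \omega_L \underbrace{\sum_{n} \delta_n \bar U_n}_{\bar U_L} + \omega_Z \underbrace{\sum_{n} \delta_{nZ} \bar U_{nZ}}_{\bar U_Z} + \omega_H \underbrace{\sum_{n} \delta_{nH} \bar U_{nH}}_{\bar U_H}, \tag{D.5}$$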

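where $\omega$ and $\delta$ are weights that replicate the market allocation.57 As shown later, $\omega_L \bar U_L / \bar U = \alpha\beta$. I am interested
in a shock to commuting costs or trade costs. Then, by a first-order approximation, the effect of any shock is:

$$d\ln \bar U = \alpha\beta \sum_{n} \tilde\lambda_{nL} \left( d\ln \bar y_n - \alpha\, d\ln P_n - (1-\alpha)\, d\ln r_n \right) \tag{D.6a}$$
$$\qquad + \alpha(1-\beta) \sum_{n} \tilde\lambda_{nZ} \left( d\ln q_n - \alpha\, d\ln P_n - (1-\alpha)\, d\ln r_n \right) \tag{D.6b}$$
$$\qquad + (1-\alpha) \sum_{n} \tilde\lambda_{nH} \left( d\ln r_n - \alpha\, d\ln P_n - (1-\alpha)\, d\ln r_n \right), \tag{D.6c}$$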


where $\tilde\lambda_{nL} \equiv \frac{\bar y_n \lambda_n}{\sum_{n'} \bar y_{n'} \lambda_{n'}}$ is the share of total labor income in location $n$ and, similarly, $\tilde\lambda_{nZ}$ ($\tilde\lambda_{nH}$) is the
share of total income of commercial floor space (housing) in location $n$.

 57 To replicate the market allocation, $\bar U = \sum_{n} \left( \bar y_n L_n + q_n Z_n + r_n H_n \right)$.

Then, the changes in the average income and the price index are:



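$$d\ln W_n = \sum_{i,s} \lambda_{ns|n} \lambda_{nis|ns}\, d\ln w_{is} - \sum_{i,s} \lambda_{ns|n} \lambda_{nis|ns}\, d\ln d_{ni} \tag{D.7a}$$

$$d\ln P_n = \sum_{i,s} \pi_{ns} \pi_{nis|s} \left( \beta\, d\ln w_{is} + (1-\beta)\, d\ln q_i \right) + \sum_{i,s} \pi_{ns} \pi_{nis|s}\, d\ln \tau_{ni}. \tag{D.7b}$$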


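Using the goods market clearing condition and some algebra:

$$d\ln \bar U = -\alpha\beta \sum_{n,s,i} \tilde\lambda_{nL} \lambda_{ns|n} \lambda_{nis|ns}\, d\ln d_{ni} \tag{D.8a}$$
$$\qquad - \alpha \sum_{n,s,i} \pi_{ns} \pi_{nis|s} \left( \alpha\beta \tilde\lambda_{nL} + \alpha(1-\beta) \tilde\lambda_{nZ} + (1-\alpha) \tilde\lambda_{nH} \right) d\ln \tau_{ni}. \tag{D.8b}$$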


This equation is the Hulten result: when the economy is perfectly efficient, the change in welfare is a weighted
average of the changes in the fundamentals, in this case trade and commuting costs.
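
As a numerical illustration of the commuting-cost term (D.8a), the sketch below evaluates it for a small hypothetical economy with two locations and two sectors. The share arrays and the value of α are made up for the example (β = 0.7 follows the estimate in Appendix C), and the function name is not from the paper.

```python
import numpy as np

def pure_commuting_effect(alpha, beta, lam_tilde_L, lam_s, lam_i, dln_d):
    """First-order welfare effect of commuting-cost changes, term (D.8a).

    lam_tilde_L[n] : share of total labor income earned by residents of n
    lam_s[n, s]    : share of residents of n working in sector s
    lam_i[n, i, s] : share of (n, s) workers commuting to workplace i
    dln_d[n, i]    : log change in the commuting cost from n to i
    """
    # Weight on each (n, i) pair: sum over sectors of the product of shares.
    weights = np.einsum("n,ns,nis->ni", lam_tilde_L, lam_s, lam_i)
    return -alpha * beta * np.sum(weights * dln_d)

# Hypothetical shares; each set sums to one over the relevant index.
lam_tilde_L = np.array([0.6, 0.4])
lam_s = np.array([[0.5, 0.5],
                  [0.3, 0.7]])
lam_i = np.array([[[0.8, 0.6], [0.2, 0.4]],
                  [[0.1, 0.3], [0.9, 0.7]]])
dln_d = np.full((2, 2), -0.01)   # a uniform 1% reduction in commuting costs

effect = pure_commuting_effect(0.75, 0.7, lam_tilde_L, lam_s, lam_i, dln_d)
print(effect)
```

Because the weights sum to one, a uniform reduction in commuting costs of 1% raises welfare by αβ times 1% in this efficient benchmark; non-uniform shocks are weighted by how much labor income flows through each commuting pair.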


D.1.2    Labor wedge

I now assume that firms face distortions. These wedges can be variable markups or taxes. As in HK, I model
these wedges as taxes. In particular, the goods market clearing condition now is:

$$\lambda_n \lambda_{ns|n} \lambda_{nis|ns}\, \bar y_n \bar L = \frac{1+\bar t}{1+t_{isL}}\, \alpha\beta \sum_{n} \pi_{ns} \pi_{nis|s} \left( \bar y_n \lambda_n \bar L + q_n \lambda_{nZ} Z + r_n \lambda_{nH} H \right), \tag{D.9}$$

where $1+\bar t$ is a Government rebate that can vary by location or, in the case of markups, a portfolio that is
rebated to households. The previous equation creates trade imbalances and wedges across firms. Thus, there is
an additional effect in the first-order approximation. This effect captures changes in wages, and the change in
welfare is:



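$$d\ln \bar U = -\alpha\beta \sum_{n,s,i} \tilde\lambda_{nL} \lambda_{ns|n} \lambda_{nis|ns}\, d\ln d_{ni} \tag{D.10a}$$
$$\qquad - \alpha \sum_{n,s,i} \pi_{ns} \pi_{nis|s} \left( \alpha\beta \tilde\lambda_{nL} + \alpha(1-\beta) \tilde\lambda_{nZ} + (1-\alpha) \tilde\lambda_{nH} \right) d\ln \tau_{ni} \tag{D.10b}$$
$$\qquad + \alpha\beta \sum_{n,i,s} \tilde\lambda_{nL} \lambda_{ns|n} \lambda_{nis|ns}\, \frac{t_{isL} - \bar t}{1+\bar t}\, d\ln w_{is} \tag{D.10c}$$
$$\qquad + d\ln(1+\bar t). \tag{D.10d}$$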

It is easy to show that the change in the rebate is:

$$d\ln(1+\bar t) = \sum_{n,i,s} \tilde\lambda_{nL} \lambda_{ns|n} \lambda_{nis|ns}\, \frac{t_{isL} - \bar t}{1+\bar t} \left( d\ln w_{is} + d\ln \tilde L_{nis} \right).$$




Then, the total change in welfare is:



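$$d\ln \bar U = -\alpha\beta \sum_{n,s,i} \tilde\lambda_{nL} \lambda_{ns|n} \lambda_{nis|ns}\, d\ln d_{ni} \tag{D.11a}$$
$$\qquad - \alpha \sum_{n,s,i} \pi_{ns} \pi_{nis|s} \left( \alpha\beta \tilde\lambda_{nL} + \alpha(1-\beta) \tilde\lambda_{nZ} + (1-\alpha) \tilde\lambda_{nH} \right) d\ln \tau_{ni} \tag{D.11b}$$
$$\qquad + \alpha\beta \sum_{n,i,s} \tilde\lambda_{nL} \lambda_{ns|n} \lambda_{nis|ns}\, \frac{t_{isL} - \bar t}{1+\bar t}\, d\ln \tilde L_{nis}. \tag{D.11c}$$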


The third term captures allocative efficiency: when workers reallocate to sector-locations with larger wedges,
there is an additional increase in welfare due to an improvement in allocative efficiency.
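
The sign of the third term (D.11c) can be checked with a minimal numerical sketch. The example below assumes one residence, one workplace, and two sectors, with a positive labor wedge in the formal sector and none in the informal sector; all numbers and the function name are hypothetical and chosen only to show the mechanics.

```python
import numpy as np

def allocative_term(alpha, beta, weights, t_L, t_bar, dln_L):
    """Term (D.11c): alpha * beta * sum of weights * (t_isL - t_bar) / (1 + t_bar) * dlnL.

    weights[n, i, s] : product lam_tilde_nL * lam_{ns|n} * lam_{nis|ns}
    t_L[i, s]        : labor wedge in workplace i and sector s
    t_bar            : rebate rate
    dln_L[n, i, s]   : log change in labor supplied from n to (i, s)
    """
    wedge_gap = (t_L - t_bar) / (1.0 + t_bar)          # shape (i, s)
    return alpha * beta * np.sum(weights * wedge_gap[None, :, :] * dln_L)

# Sector index 0 = formal (wedge 0.3), 1 = informal (wedge 0); values are illustrative.
weights = np.array([[[0.5, 0.5]]])
t_L = np.array([[0.3, 0.0]])
t_bar = 0.15
to_formal = np.array([[[0.02, -0.02]]])   # workers move from informal to formal

gain = allocative_term(0.75, 0.7, weights, t_L, t_bar, to_formal)
no_gap = allocative_term(0.75, 0.7, weights, np.full((1, 2), t_bar), t_bar, to_formal)
print(gain, no_gap)
```

The term is positive when labor moves toward the sector whose wedge exceeds the rebate rate, and it vanishes when every wedge equals the rebate rate, which is the efficient benchmark of the previous subsection.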


D.1.3    Agglomeration forces

Finally, there is an additional effect due to agglomeration forces. In my model, this force comes from LOV. This
additional effect also captures changes in allocative efficiency, and it arises for two reasons: first, if
agglomeration externalities differ between the two sectors, as in BCDR; and second, if there are trade imbalances,
as in FG. In the presence of LOV or agglomeration forces, consumers benefit from lower prices as the
sector-location becomes bigger. In particular, there is an additional change in welfare captured by:

                     ¯ = ... +     β                                              ˜ nZ + (1 − α)λ
                                                                  ˜ nL + α(1 − β )λ                       ˜ is
                                                                                                ˜ nH d ln λ
                d ln U                              πns πnis|s αβ λ
                                 1 + σs
                                            n,i,s


                                               ¯ = ... +       β              1 + tisL    ˜ is ,
                                          d ln U                                         dλ                       (D.12)
                                                             1 + σs            1+t  ¯
                                                                      n,i,s

where $\tilde{\lambda}_{is}$ is the labor share in total income from sector $s$ and location $i$. This additional term captures two things. First, if workers move to sector-locations in which agglomeration externalities are larger, then there is an increase in total welfare. Second, if workers reallocate to sector-locations with larger wedges, the effect of any shock on welfare is larger.




   Combining the previous expressions, then,


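\begin{align}
d\ln\bar{U} ={}& \underbrace{-\,\alpha\beta \sum_{n,i,s} \tilde{\lambda}_{nL}\,\lambda_{ns|n}\,\lambda_{nis|ns}\cdot d\ln d_{ni}}_{\text{``Pure'' effect commuting costs}} \tag{D.13a}\\
& \underbrace{-\,\alpha \sum_{n,i,s} \pi_{ns}\,\pi_{nis|s}\left(\alpha\beta\,\tilde{\lambda}_{nL} + \alpha(1-\beta)\,\tilde{\lambda}_{nZ} + (1-\alpha)\,\tilde{\lambda}_{nH}\right) d\ln\tau_{ni}}_{\text{``Pure'' effect trade costs}} \tag{D.13b}\\
& \underbrace{+\,\alpha\beta \sum_{n,i,s} \tilde{\lambda}_{nL}\,\lambda_{ns|n}\,\lambda_{nis|ns}\left(\frac{t_{isL}-\bar{t}}{1+\bar{t}}\right) d\ln\tilde{L}_{nis}}_{\text{Allocative efficiency}} \tag{D.13c}\\
& \underbrace{+\sum_{n,i,s} \frac{1}{\sigma_s-1} \sum_{g} \beta_g\,\frac{1+t_{isg}}{1+\bar{t}}\; d\tilde{\lambda}_{isg}}_{\text{Agglomeration forces}}, \tag{D.13d}
\end{align}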

which is the same expression as in the text.
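
To illustrate the sign of the allocative efficiency term (D.13c), the following minimal Python sketch evaluates it for a toy economy with a single residence and workplace and two sectors. The function name `allocative_efficiency_term`, all numerical values, and the treatment of the average wedge `t_bar` as a given input are assumptions made for illustration only; they are not taken from the paper.

```python
def allocative_efficiency_term(alpha, beta, weights, wedges, t_bar, dlog_labor):
    """Evaluate alpha*beta * sum_k weight_k * (t_k - t_bar)/(1 + t_bar) * dlnL_k.

    Each index k stands for one (n, i, s) cell; weight_k plays the role of
    lambda~_nL * lambda_ns|n * lambda_nis|ns in (D.13c).
    """
    total = 0.0
    for weight, wedge, dlog in zip(weights, wedges, dlog_labor):
        total += weight * (wedge - t_bar) / (1.0 + t_bar) * dlog
    return alpha * beta * total


# Hypothetical toy values: the formal sector carries a 30% labor wedge,
# the informal sector carries none.
weights = [0.5, 0.5]   # formal, informal
wedges = [0.30, 0.00]
t_bar = 0.15           # average wedge, taken as given in this sketch

# Labor moves toward the formal sector, then toward the informal sector.
to_formal = allocative_efficiency_term(0.7, 0.6, weights, wedges, t_bar, [0.02, -0.02])
to_informal = allocative_efficiency_term(0.7, 0.6, weights, wedges, t_bar, [-0.02, 0.02])
print(to_formal, to_informal)
```

Under these assumed numbers, reallocating labor toward the sector with the above-average wedge makes the term positive, and the reverse reallocation flips its sign.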


D.2    The problem of the social planner

In this section, I find the equilibrium conditions for the problem of the social planner. I show two results. First, in the case in which the economy operates under perfect competition, the market allocation coincides with the efficient allocation. Second, the variable $\bar{U}$ is equal to the aggregate total expenditure or income of the economy, which is the main assumption from the previous section.
   There are diﬀerent groups of workers indexed by g , sectors indexed by s and a mass of locations N indexed
by n and i. Each group has a utility function Ug (cng , hng ), where cng represents the average consumption of
a composite good in location n and hng is the average amount of housing in location n. This utility function
is homogeneous of degree one. In the optimal allocation, workers are indiﬀerent across locations and there are
iceberg trade and commuting costs. The problem of the planner is to maximize the following welfare function:

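$$\bar{U} = \sum_{g} \lambda_g \cdot U_g$$

subject to i) spatial mobility constraints

$$U_{ng} L_{ng} \le \bar{U}_g \quad \forall n, g$$

   ii) composite and housing feasibility constraints

$$\sum_{n} \tau_{ni}\, Q_{nis} \le Y_{is}(\tilde{E}_{isg}) \quad \forall i, s$$

$$L_{ng}\cdot c_{ng} \le C(Q_{n11g}, \dots, Q_{nSNg}) \quad \forall n$$

$$L_{ng}\cdot h_{ng} \le H_n(\tilde{E}_{nhg}) \quad \forall n$$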

iii) labor supply constraints

$$\tilde{E}_{isg} \le \sum_{n} d_{ni}^{-1} E_{nisg} \quad \forall i, s, g \ \text{(including the sector that produces housing)}$$

$$E_g(E_{n11g}, \dots, E_{nSNg}) \le L_{n,g} \quad \forall n, g$$

iv) non-negativity constraints on commuting flows, trade flows, and labor.
v) labor market clearing

$$\sum_{n} L_{n,g} \le \bar{L}_g \quad \forall g$$

where $Y$ is the production function; $C(\cdot)$ is a composite good aggregator across locations and sectors, in my case the nested CES; $E(\cdot)$ is an efficiency units aggregator, in my case the nested Fréchet; and $E_{nisg}$ are efficiency units provided from location $n$ to $i, s$ by group $g$.$^{58}$ The other parameters represent the same variables as in section 4.
       The Lagrangian of the planning problem, omitting the non-negativity constraints, is:



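\begin{align*}
\mathcal{L} ={}& \sum_{g} L_g U_g - \sum_{n,g} \omega_{ng} L_{ng}\left(U_g - U_g(c_{ng}, h_{ng})\right) \\
& - \sum_{i,s} p^{*}_{is}\left(\sum_{n} \tau_{ni} Q_{nis} - Y_{is}(\tilde{E}_{isg})\right) \\
& - \sum_{n} P^{*}_{n}\left(\sum_{g} L_{ng} c_{ng} - C(Q_{n11g}, \dots, Q_{nSNg})\right) \\
& - \sum_{i,s,g} w^{*}_{isg}\left(\tilde{E}_{isg} - \sum_{n} d_{ni}^{-1} E_{nisg}\right) \\
& - \sum_{n,g} \bar{y}^{*}_{n,g}\left(E_g(E_{n11g}, \dots, E_{nSNg}) - L_{n,g}\right) \\
& - \sum_{n} r^{*}_{n}\left(\sum_{g} L_{ng} h_{ng} - H_n(\tilde{E}_{nhg})\right) \\
& - \sum_{g} \Psi_g\left(\sum_{n} L_{ng} - \bar{L}_g\right) + \dots
\end{align*}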



   The planner chooses $c_{ng}$, $h_{ng}$, $Q_{nis}$, $E_{nisg}$, $\tilde{E}_{isg}$, $\tilde{E}_{ihg}$, $L_{ng}$, and $U_g$ to maximize welfare. I proceed in two parts. First, I show the relationship between $\bar{U}$ and aggregate expenditure; then, I show that the market allocation coincides with the efficient allocation. Finally, I generalize the formula from the previous section using the goods market clearing condition.
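  58 Recall that the CES aggregator from section 4 is $C_n^{\frac{\xi-1}{\xi}} \equiv \sum_{s} Q_{ns}^{\frac{\xi-1}{\xi}}$, where $Q_{ns}^{\frac{\sigma_s-1}{\sigma_s}} \equiv \sum_{i} Q_{nis}^{\frac{\sigma_s-1}{\sigma_s}}$, and the efficiency units aggregator is $E_n^{\frac{\kappa}{\kappa-1}} \equiv \sum_{s} E_{ns}^{\frac{\kappa}{\kappa-1}}$, where $E_{ns}^{\frac{\theta_s}{\theta_s-1}} = \sum_{i} E_{nis}^{\frac{\theta_s}{\theta_s-1}}$.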




Utility and Total Expenditure

The F.O.C with respect to cng and hng is:

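$$\omega_{ng}\, c_{ng}\, \frac{\partial U_g}{\partial c} \le P^{*}_{n}\, c_{ng} \quad \forall g$$

$$\omega_{ng}\, h_{ng}\, \frac{\partial U_g}{\partial h} \le r^{*}_{n}\, h_{ng} \quad \forall g$$

Since $U_g(\cdot)$ is homogeneous of degree one, Euler's theorem implies $c_{ng}\,\partial U_g/\partial c + h_{ng}\,\partial U_g/\partial h = U_{ng}$. Adding the two conditions, which hold with equality at an interior optimum, and multiplying by $L_{ng}$ then gives

$$L_{ng}\left(P^{*}_{n} c_{ng} + r^{*}_{n} h_{ng}\right) = L_{ng}\,\omega_{ng}\,U_{ng} \tag{D.14}$$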

The LHS of equation D.14 is the aggregate expenditure $X_{ng}$ of group $g$ living in location $n$. The F.O.C. with respect to $U_g$ is:

$$\sum_{n} \omega_{ng} L_{ng} = L_g$$

Combining this equation with equation D.14 and the fact that, in equilibrium, $U_{ng} = U_g$ for all the locations in which $L_{ng} > 0$ yields:

$$L_g U_g = \sum_{n} X_{ng}$$

Recall that $\bar{U} = \sum_{g} L_g U_g$; thus,

$$\bar{U} = X$$

where $X \equiv \sum_{n,g} X_{ng}$ is aggregate expenditure. At the aggregate level, total expenditure is equal to total income, so in the previous section $\bar{U} = \sum_{n} \left(\bar{y}_n L_n + q_n \tilde{Z}_n + r_n H_n\right)$, which was the assumption for the theoretical result of the first-order approximation.
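
The step from the first-order conditions to equation D.14 can be checked numerically. The sketch below assumes a Cobb-Douglas utility function, which is homogeneous of degree one; the functional form, the parameter `ALPHA`, the function names, and the numbers are hypothetical choices for illustration rather than part of the model's calibration.

```python
# Hypothetical Cobb-Douglas utility, homogeneous of degree one (an assumption).
ALPHA = 0.7


def utility(c, h):
    return c ** ALPHA * h ** (1 - ALPHA)


def marginal_utilities(c, h):
    """Analytical derivatives of the Cobb-Douglas utility with respect to c and h."""
    u = utility(c, h)
    return ALPHA * u / c, (1 - ALPHA) * u / h


def expenditure_and_value(c, h, omega):
    """Back out shadow prices from the interior FOCs; return (P*c + r*h, omega*U)."""
    u_c, u_h = marginal_utilities(c, h)
    price = omega * u_c   # P*_n = omega * dU/dc
    rent = omega * u_h    # r*_n = omega * dU/dh
    return price * c + rent * h, omega * utility(c, h)


spend, value = expenditure_and_value(c=2.0, h=3.0, omega=1.5)
print(spend, value)
```

Per-capita expenditure at the shadow prices coincides with $\omega_{ng} U_{ng}$ up to floating-point error, which is the per-capita version of D.14.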


Eﬃcient Allocation

Now, I show that the market allocation coincides with the eﬃcient allocation. The F.O.C with respect to other
variables is:


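\begin{align}
[Q_{nis}]:&\quad P^{*}_{n}\,\frac{\partial C}{\partial Q_{nis}} \le p^{*}_{is}\,\tau_{ni} \tag{D.15a}\\
[\tilde{E}_{isg}]:&\quad p^{*}_{is}\,\frac{\partial Y}{\partial \tilde{E}_{isg}} \le w^{*}_{isg} \tag{D.15b}\\
[E_{nisg}]:&\quad w^{*}_{isg}\, d_{ni}^{-1} \le \bar{y}_{ng}\,\frac{\partial E_g}{\partial E_{nisg}} \tag{D.15c}\\
[\tilde{E}_{nhg}]:&\quad r^{*}_{n}\,\frac{\partial H}{\partial \tilde{E}_{nhg}} \le w_{nhg} \tag{D.15d}\\
[L_{ng}]:&\quad \bar{y}_{n} \le \Psi_g \tag{D.15e}
\end{align}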



Equations D.15a to D.15d are the same as the utility and profit maximization conditions of the consumer's and firm's problems. In the particular case in which the function $C(\cdot)$ is the nested CES utility function from section 4, $E(\cdot)$ is the nested Fréchet, and $Y(\cdot)$ is homogeneous of degree one, I can rewrite these conditions as:


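\begin{align*}
\lambda_{nsg|n}\,\lambda_{nisg|ns}\,\bar{y}^{*}_{ng} L_{ng} &= w^{*}_{isg}\, d_{ni}^{-1} E_{nisg} \\
w^{*}_{isg}\,\tilde{E}_{isg} &= \beta_{isg}\, p^{*}_{is} Y_{is} \\
p^{*}_{is} Y_{is} &= \sum_{n,g} \alpha_{ng}\,\pi_{nis}\,\bar{y}^{*}_{ng} L_{ng} \\
r^{*}_{n} H_n &= \sum_{n,g} (1-\alpha_{ng})\,\bar{y}^{*}_{ng} L_{ng},
\end{align*}

where

$$\beta_{isg} \equiv \frac{E_{isg}\,\dfrac{\partial Y}{\partial E_{isg}}}{Y_{is}}, \qquad \alpha_{ng} \equiv \frac{c_{ng}\,\dfrac{\partial U_g}{\partial c_{ng}}}{U_{ng}}.$$
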
These are the same conditions as the market allocation from the previous section. Then, the market allocation is efficient in the case in which there are no wedges.
   We can generalize the welfare decomposition from the previous section to different groups of labor under the assumption that the utility and production functions are homogeneous of degree one. In particular, we can rewrite the change in $\bar{U}$ as:



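\begin{align}
d\ln\bar{U} ={}& \underbrace{-\sum_{n,i,s,g} \alpha_{ng}\,\beta_{isg}\,\tilde{\lambda}_{ng}\,\lambda_{nsg|n}\,\lambda_{nisg|ns}\cdot d\ln d_{ni}}_{\text{``Pure'' effect commuting costs}} \tag{D.16a}\\
& \underbrace{-\sum_{n,i,s,g} \pi_{ns}\,\pi_{nis|s}\,\alpha_{ng}\,\beta_{isg}\,\tilde{\lambda}_{ng}\; d\ln\tau_{ni}}_{\text{``Pure'' effect trade costs}} \tag{D.16b}\\
& \underbrace{+\sum_{n,i,s,g} \alpha_{ng}\,\beta_{isg}\,\tilde{\lambda}_{ng}\,\lambda_{nsg|n}\,\lambda_{nisg|ns}\left(\frac{t_{isg}-\bar{t}}{1+\bar{t}}\right) d\ln\tilde{L}_{nisg}}_{\text{Allocative efficiency}} \tag{D.16c}\\
& \underbrace{+\sum_{n,i,s,g} \frac{\beta_g}{\sigma_s-1}\,\frac{1+t_{isg}}{1+\bar{t}}\; d\tilde{\lambda}_{isg}}_{\text{Agglomeration forces}}. \tag{D.16d}
\end{align}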


This result is similar to the one obtained by Baqaee and Farhi (2020) in GE models. However, this expression is derived in the context of an urban model in which firms face iceberg trade costs and workers i) face commuting costs and ii) are indifferent across residential locations within the city.


D.3    Equilibrium Conditions - Exact Hat Algebra

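In this section, I solve for the equilibrium conditions and change in total welfare using exact hat algebra as in Dekle et al. (2008). I define the relative change of a variable as the ratio of its counterfactual value $x'$ to its initial value $x$:

$$\hat{x} = \frac{x'}{x}$$

then, the change in the average utility is

$$\hat{\bar{U}} = \left(\sum_{n} \lambda_n\, \hat{P}_n^{-\alpha\eta}\, \hat{r}_n^{-(1-\alpha)\eta}\, \hat{W}_n^{\eta}\right)^{\frac{1}{\eta}}, \tag{D.17}$$

where $\lambda_n \equiv \dfrac{W_n^{\eta} P_n^{-\alpha\eta} r_n^{-(1-\alpha)\eta}}{\sum_{n} P_n^{-\eta} W_n^{\eta}}$ is the share of residents in location $n$ in the pre-period. The change in the price and wage indices is given by the following expressions: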


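\begin{align}
\hat{P}_{ns} &= \left(\sum_{i} \pi_{ni|s}\,\hat{p}_{is}^{\,1-\sigma_s}\right)^{\frac{1}{1-\sigma_s}} \tag{D.18a}\\
\hat{P}_{n} &= \left(\sum_{s} \pi_{ns}\cdot\hat{P}_{ns}^{\,1-\xi}\right)^{\frac{1}{1-\xi}} \tag{D.18b}\\
\hat{W}_{ns} &= \left(\sum_{i} \lambda_{nis|ns}\cdot\hat{w}_{is}^{\,\theta_s}\,\hat{d}_{ni}^{\,-\theta_s}\right)^{\frac{1}{\theta_s}} \tag{D.18c}\\
\hat{W}_{n} &= \left(\sum_{s} \lambda_{ns|n}\cdot\hat{W}_{ns}^{\,\kappa}\right)^{\frac{1}{\kappa}}. \tag{D.18d}
\end{align}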


The change in the residence, sector, and workplace choice probability is:


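\begin{align}
\hat{\lambda}_{n} &= \frac{\hat{P}_n^{-\eta}\,\hat{W}_n^{\eta}}{\sum_{n} \lambda_n\cdot\hat{P}_n^{-\eta}\,\hat{W}_n^{\eta}} = \frac{\hat{P}_n^{-\eta}\,\hat{W}_n^{\eta}}{\hat{\bar{U}}^{\eta}} \tag{D.19a}\\
\hat{\lambda}_{ns|n} &= \frac{\hat{W}_{ns}^{\kappa}}{\sum_{k} \lambda_{nk|n}\cdot\hat{W}_{nk}^{\kappa}} = \frac{\hat{W}_{ns}^{\kappa}}{\hat{W}_{n}^{\kappa}} \tag{D.19b}\\
\hat{\lambda}_{nis|ns} &= \frac{\hat{w}_{is}^{\,\theta_s}\,\hat{d}_{ni}^{\,-\theta_s}}{\sum_{l} \lambda_{nls|ns}\cdot\hat{w}_{ls}^{\,\theta_s}\,\hat{d}_{nl}^{\,-\theta_s}} = \frac{\hat{w}_{is}^{\,\theta_s}\,\hat{d}_{ni}^{\,-\theta_s}}{\hat{W}_{ns}^{\,\theta_s}}. \tag{D.19c}
\end{align}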
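
As a small illustration of how the hat-algebra updates (D.18c) and (D.19c) fit together, the following Python sketch computes the change in the sector wage index and in the workplace shares after a fall in one commuting cost. The function names, baseline shares, and the value of `theta` are hypothetical; the sketch holds wages fixed, so it shows a partial response and not the full equilibrium of the model.

```python
def wage_index_hat(shares, w_hat, d_hat, theta):
    """Relative change in the sector wage index, as in (D.18c)."""
    total = sum(s * (w ** theta) * (d ** -theta)
                for s, w, d in zip(shares, w_hat, d_hat))
    return total ** (1.0 / theta)


def workplace_share_hat(w_hat, d_hat, theta, index_hat):
    """Relative change in each workplace share, as in (D.19c)."""
    return [(w ** theta) * (d ** -theta) / index_hat ** theta
            for w, d in zip(w_hat, d_hat)]


# Hypothetical baseline: one residence n, one sector s, three workplaces i.
shares = [0.5, 0.3, 0.2]   # lambda_{nis|ns}, sums to one
w_hat = [1.0, 1.0, 1.0]    # wages held fixed in this sketch
d_hat = [0.9, 1.0, 1.0]    # commuting cost to workplace 0 falls by 10%
theta = 3.0                # assumed Frechet shape parameter

W_hat = wage_index_hat(shares, w_hat, d_hat, theta)
share_hat = workplace_share_hat(w_hat, d_hat, theta, W_hat)
new_shares = [s * h for s, h in zip(shares, share_hat)]
print(W_hat, new_shares, sum(new_shares))
```

Because the denominator in (D.19c) is the share-weighted sum of the numerators, the updated shares still sum to one, and the workplace whose commuting cost fell gains share.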


And the change in the expenditure shares is:


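\begin{align}
\hat{\pi}_{ns} &= \frac{\hat{P}_{ns}^{\,1-\xi}}{\sum_{k} \pi_{nk}\,\hat{P}_{nk}^{\,1-\xi}} = \frac{\hat{P}_{ns}^{\,1-\xi}}{\hat{P}_{n}^{\,1-\xi}} \tag{D.20a}\\
\hat{\pi}_{ni|s} &= \frac{\hat{p}_{nis}^{\,1-\sigma_s}}{\sum_{l} \pi_{nls|s}\,\hat{p}_{nls}^{\,1-\sigma_s}} = \frac{\hat{p}_{nis}^{\,1-\sigma_s}}{\hat{P}_{ns}^{\,1-\sigma_s}}. \tag{D.20b}
\end{align}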




The change in the average labor income and aggregate expenditure is:


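$$\hat{\bar{y}}_n = \sum_{i,s} \lambda^{Y}_{nis}\,\hat{\lambda}_{nis|ns}\,\hat{\lambda}_{ns|n}\,\hat{w}_{is}, \tag{D.21}$$

where $\lambda^{Y}_{nis} \equiv \dfrac{\lambda_{nis|ns}\,\lambda_{ns|n}\,w_{is}}{\bar{y}_n}$. Then, the change in $X_n$ is:

$$\hat{X}_n = (1+\hat{\bar{t}})\,\omega_{nL}\,\hat{\bar{y}}_n\,\hat{\lambda}_n + \omega_{nZ}\,\hat{q}_n + \omega_{nH}\,\hat{r}_n, \tag{D.22}$$

where $\omega_{nL} \equiv \dfrac{\bar{y}_n L_n}{\bar{y}_n L_n + q_n Z_n + r_n H_n}$, $\omega_{nZ} \equiv \dfrac{q_n Z_n}{\bar{y}_n L_n + q_n Z_n + r_n H_n}$, and $\omega_{nH} \equiv \dfrac{r_n H_n}{\bar{y}_n L_n + q_n Z_n + r_n H_n}$. Then, the goods market clearing condition using exact hat algebra for each location $i$ and sector $s$ is:

$$\hat{w}_{is} \sum_{n} \tilde{\lambda}_{ni}\,\hat{\lambda}_{nis|ns}\,\hat{\lambda}_{ns|n}\,\hat{\lambda}_{n} = \sum_{n} \pi^{X}_{nis}\,\hat{\pi}_{ns}\,\hat{\pi}_{nis|s}\,\hat{X}_{n}, \tag{D.23}$$

where $\tilde{\lambda}_{ni} = \dfrac{\lambda_{nis}}{\sum_{n'} \lambda_{n'is}}$ and $\pi^{X}_{nis} = \dfrac{\pi_{ns}\,\pi_{nis|s}\,X_n}{\sum_{n'} \pi_{n's}\,\pi_{n'is|s}\,X_{n'}}$. I compute the counterfactuals by solving the previous system of equations D.23.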


D.4    Model with ex-ante ﬁrm decision

In this section, I present a version of the model in which firms decide whether to operate in the formal or in the informal sector. The model is based on Ulyssea (2018) and Dix Carneiro et al. (2018), in which firms that operate in the informal sector face a distortion that increases with size. There is an infinite mass of potential entrants that exit at an exogenous rate $\delta_s$. The labor supply function takes the same form as in the main text. Firms, on the other hand, make two decisions. First, they decide whether to enter the labor market and, conditional on entry, whether to operate in the formal or informal sector, based on a pre-entry signal and an entry fixed cost. Second, firms decide the location in the city in which they are going to operate, based on an extreme value type II shock. There is no production fixed cost.
   The total operating profit of firm $\omega$, which operates in location $i$ and sector $s$ and sells to $n$, is given by:

$$\pi^{op}_{nis}(\omega) = \frac{1}{\sigma_s-1}\left(\frac{\tau_{ni}\,(w_{is}[1+t_{isL}])^{\beta}\,(q_i[1+t_{isZ}])^{1-\beta}}{z(\omega)\,\epsilon_{is}(\omega)}\right)^{1-\sigma_s} P_n^{\sigma_s-1}\,\tilde{X}_{ns},$$

$$\pi^{op}_{is}(\omega) = \left(1-\upsilon_{is}(r_{is}(\omega))\right)\sum_{n} \pi^{op}_{nis}(\omega),$$

where $z(\omega)$ is the pre-entry signal, $\epsilon_{is}(\omega)$ is an idiosyncratic productivity shock of firm $\omega$ that varies across locations, and $\upsilon_{is}(r_{is}(\omega))$ is a distortion that captures the probability of getting caught if firm $\omega$ operates in the informal sector. This probability increases with the size and revenue $r(\omega)$ of firm $\omega$. I assume that the idiosyncratic shocks are drawn from a Fréchet distribution with shape parameter $\psi$ and scale parameters $A_{is}$. Then, the share of firms with pre-entry signal $z$ from sector $s$ that operate in location $i$ is:

$$\mu_{is}(z) = \frac{A_{is}\,\pi_{is}(\omega)^{\psi}}{\sum_{l} A_{ls}\,\pi_{ls}(\omega)^{\psi}}. \tag{D.24}$$
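
The location shares in (D.24) have the usual Fréchet (multinomial-logit-like) form, which the following Python sketch illustrates. The function name `location_shares` and all numerical values are hypothetical and serve only to show how the shape parameter $\psi$ governs the sensitivity of location choices to profit differences.

```python
def location_shares(scale, profits, psi):
    """Share of firms choosing each location: A_i * pi_i^psi / sum_l A_l * pi_l^psi."""
    weights = [a * (p ** psi) for a, p in zip(scale, profits)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values for three locations within one sector.
scale = [1.0, 1.0, 2.0]     # scale parameters A_is
profits = [1.0, 1.2, 1.0]   # profits pi_is for a given pre-entry signal

low_psi = location_shares(scale, profits, psi=2.0)
high_psi = location_shares(scale, profits, psi=8.0)
print(low_psi, high_psi)
```

With a larger `psi`, idiosyncratic shocks are less dispersed, so a larger share of firms concentrates in the location with the highest profit.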

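   With these assumptions, the expected value of entry for a firm with pre-entry signal $z$ that operates in sector $s$ is:

$$V^{e}_{s}(z, w_{is}) = \frac{z^{\sigma_s-1}}{\delta_s}\left[\sum_{i} \left(1-\upsilon_{is}(r_{is})\right) A_{is}\left(\sum_{n} \left(\frac{\tau_{ni}\,(w_{is}[1+t_{isL}])^{\beta}\,(q_i[1+t_{isZ}])^{1-\beta}}{z(\omega)}\right)^{1-\sigma_s} P_n^{\sigma_s-1}\,\tilde{X}_{ns}\right)^{\psi}\right]^{\frac{1}{\psi}}. \tag{D.25}$$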

A ﬁrm decides to enter and operate in sector k if the following condition holds:

$$
V_k^e(z, w) - E_k \geq \max\{V_{-k}^e(z, w) - E_{-k},\, 0\}. \tag{D.26}
$$
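The entry condition in (D.26) can be read as a simple decision rule: a firm picks the sector with the larger net value of entry, provided that value is non-negative. The sketch below illustrates this rule. The function `choose_sector`, its arguments, and the tie-breaking choice are assumptions made for illustration and do not come from the paper.

```python
def choose_sector(value_informal, value_formal, cost_informal, cost_formal):
    """Apply the rule in (D.26): enter sector k if V_k - E_k >= max(V_-k - E_-k, 0).

    Returns "informal", "formal", or None when the firm does not enter.
    Ties between sectors are broken in favor of the informal sector
    (an arbitrary assumption; the condition itself is silent on ties).
    """
    net = {
        "informal": value_informal - cost_informal,
        "formal": value_formal - cost_formal,
    }
    # max() returns the first key with the largest net value
    best = max(net, key=net.get)
    # The firm stays out when even the best sector has negative net value
    return best if net[best] >= 0 else None
```

For example, `choose_sector(1.0, 5.0, 0.5, 2.0)` compares a net value of 0.5 in the informal sector with 3.0 in the formal sector, so the firm enters the formal sector.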

Because average expected profits increase with size, and the distortion also increases with size, two cutoffs of the
pre-entry signal $z$ determine market entry and whether a firm operates in the informal or the formal sector.
Define the entry cutoff as $z_E$ and the informality cutoff as $z_I$. Then labor demand in location $i$ in the
informal and formal sectors is given by:


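$$
L_{iI} = \frac{\beta\, w_{iI}^{-1}}{F(z_I) - F(z_E)} \int_{z_E}^{z_I} \mu_{iI}(z)\, r_{iI}(z)\, dF(z) \tag{D.27}
$$
$$
L_{iF} = \frac{\beta\, w_{iF}^{-1}}{1 - F(z_I)} \int_{z_I}^{\infty} \mu_{iF}(z)\, r_{iF}(z)\, dF(z), \tag{D.28}
$$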
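To make the structure of (D.27) and (D.28) concrete, the following minimal sketch evaluates an integral of this form numerically. It assumes, purely for illustration, that the pre-entry signal follows a Pareto distribution on $[1, \infty)$; the paper's choice of $F$ is not stated in this section. The integral is computed in quantile space with a midpoint rule, using $dF(z) = du$ for $u = F(z)$. All function names and parameter values are hypothetical.

```python
import math


def pareto_cdf(z, shape=3.0):
    """Assumed Pareto CDF on [1, inf); an illustrative choice, not the paper's."""
    return 1.0 - z ** (-shape)


def pareto_quantile(u, shape=3.0):
    """Inverse of pareto_cdf for u in [0, 1)."""
    return (1.0 - u) ** (-1.0 / shape)


def labor_demand(beta, wage, z_lo, z_hi, mu, revenue, n=20_000):
    """Evaluate (beta / wage) / (F(z_hi) - F(z_lo)) * integral of mu*revenue dF.

    Use z_hi = math.inf for the formal sector, where F(z_hi) = 1.
    """
    u_lo = pareto_cdf(z_lo)
    u_hi = 1.0 if math.isinf(z_hi) else pareto_cdf(z_hi)
    step = (u_hi - u_lo) / n
    total = 0.0
    for j in range(n):
        # Midpoint of the j-th quantile cell, mapped back to a signal z
        z = pareto_quantile(u_lo + (j + 0.5) * step)
        total += mu(z) * revenue(z) * step
    return beta / wage * total / (u_hi - u_lo)


# Hypothetical inputs: constant location share, revenue linear in z
demand_informal = labor_demand(0.6, 2.0, 1.0, 2.0,
                               mu=lambda z: 1.0, revenue=lambda z: z)
# With mu = revenue = 1 the integral equals 1 - F(z_I), so demand is beta / wage
demand_formal = labor_demand(0.6, 2.0, 2.0, math.inf,
                             mu=lambda z: 1.0, revenue=lambda z: 1.0)
```

Working in quantile space keeps the upper limit finite for the formal sector. Revenue functions that grow quickly in $z$ would need a more careful quadrature near $u = 1$ than this midpoint rule provides.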


where $r_{is}(z)$ represents the average revenue of a firm with pre-entry signal $z$. The labor supply function
takes the same form as in the main text, and the equilibrium is determined by equating labor demand and
labor supply. Similarly, to solve for the commercial floor space equilibrium, the demand function is given by:


$$
\tilde{Z}_i^{D} = q_i^{-1}(1-\beta)\left[\frac{1}{F(z_I) - F(z_E)} \int_{z_E}^{z_I} \mu_{iI}(z)\, r_{iI}(z)\, dF(z) + \frac{1}{1 - F(z_I)} \int_{z_I}^{\infty} \mu_{iF}(z)\, r_{iF}(z)\, dF(z)\right]. \tag{D.29}
$$


The equilibrium equations for residential floor space, by contrast, are the same as in the main text.
Following the logic of Ulyssea (2018) and Dix-Carneiro et al. (2018), this equilibrium exists.



