Transit Migration: All Roads Lead to America

The paths of many migrants include multiple destinations and transit routes, yet this pattern is almost never reflected in empirical analyses. For example, 9% of recent immigrants to the US arrived from a transit country as opposed to the one they were born in. Among those arriving from many OECD countries, the transit migration ratio exceeds 30%. To explain these patterns, we construct a dynamic model of global migration that allows transit migration opportunities to affect the attractiveness of locations. After estimating the structural parameters of the model, we simulate various counterfactual scenarios to highlight the spillovers of transit migration paths.

The migration decisions and paths of many international migrants include multiple destinations and transit routes. Many people leave their birth countries and live in different locations before settling permanently in a foreign country or returning home. For example, around 9% of the people who migrated to the US during 2001-12 were living in a country other than their birthplace prior to their arrival. This pattern is even more common for migrants with tertiary education: nearly 14% of such migrants did not come directly from their birth countries. 1 High-income OECD countries are particularly important transit stops for immigrants. Among the people who were living in Australia, Canada, or the UK just before their arrival in the US, over 30% were born in a different country and would already be classified as international migrants.
Despite their prevalence and potential economic significance, transitory migration patterns and the dynamic decision processes behind them are not explored in depth in the international migration literature. The determinants of migration flows or stocks are generally estimated as functions of the bilateral mobility barriers between geographic locations and the differences between the utility (or income) levels at alternative destinations and origins. 2 The standard model that validates this approach assumes that migrants make a permanent decision at a single point in time to move to a foreign country or to stay at home (see Hanson 2010, for a discussion).
Actual migration decisions are, however, more complex and dynamic in nature. Current and potential migrants continuously update their information sets and review matter how circuitous that route might be. In other words, the availability of transit routes might significantly reduce the effectiveness of restrictive migration policies.
Using the estimated parameters of our dynamic migration model, we perform several simulation exercises where we 'close' certain bilateral corridors to highlight the spillovers between alternative locations as discussed above. We chose the migration corridors based on their importance for immigration to the US, where detailed transit migration data are available. The migration paths we consider are direct migration: (i) from Canada to the US; (ii) from developing countries to Canada; and (iii) from developing countries to the US.
Our fourth and final simulation blocks all transit migration to the US and only allows direct migration from the migrants' birth countries. In each of these scenarios, we analyse how these policy experiments would affect migration from different countries to Canada, to the US and to other transit countries, such as the UK. As expected, there are significant spillovers. For example, when people cannot move to the US directly (Simulation 3), both permanent and transit migration to Canada increases. On the other hand, when transit migration to the US is banned (Simulation 4), then direct migration to Canada also drops since Canada's option value diminishes significantly.
While issues related to international transit migration receive relatively little attention in economics, they have a more prominent place in other fields, such as demography and sociology, with varying definitions and approaches. More specifically, the phrase 'transit migration' is used to describe the migration patterns emerging in Europe following restrictions on legal migration and border controls that were introduced in the 1990s. As a result, many migrants from Eastern Mediterranean and African countries moved to countries at the periphery of the European Union, such as those in North Africa and Eastern Europe, while waiting for a chance to move to the European Union countries (Collyer and de Haas, 2012). The paths of most recent refugees from Syria, Iraq and other source countries follow similar transit routes, such as through Turkey, as they try to reach Western Europe. The phrase is also used extensively to refer to low-skilled and mostly undocumented migrants from Central America who travel through and temporarily live in Mexico before reaching the US. In other words, transit migration has a strong affiliation with irregular and undocumented migrants and refugees as well as a strong sense of temporariness (see Düvell, 2012). Even among the few papers in the economics literature (such as Djajic, 2014), this is the general perception.
We use the expression 'transit migration' to describe any migrant who lives in a foreign country before moving to a second foreign country. Our data from the American Community Survey indicate that transit migrants, defined as those living in a country other than their birth country 'one year' prior to their move to the US, are somewhat different from those the demography and sociology literatures focus on. First, transit migrants to the US are more likely to have tertiary education as opposed to primary or secondary education. Second, most transit migrants are born in high-income European (and some African and East Asian) countries as opposed to most Latin American migrants who come directly from their countries of birth.
In terms of international migration, our article is related to the large literature on the determinants of migration patterns such as Mayda (2010), Grogger and Hanson (2011), Beine et al. (2011) and Belot and Ederveen (2012). Several recent papers were influenced by new developments in the international trade literature on gravity models (discussed in Anderson, 2011). They aim to relax certain assumptions used in the earlier models. In this regard, the paper closest to ours in terms of insights and approach is Bertoli et al. (2014), which analyses the impact of the financial crisis on intra-European migration patterns. Using a model that incorporates expectations about future economic conditions and the attractiveness of alternative destinations, they show how the determinants of migration patterns change when dynamic considerations are incorporated. While our empirical approach and questions are different, our analytical models share many insights.
Our article is related to several other strands in the labour economics literature. First, there is an extensive literature in labour economics on dynamic choice models, which explore various dimensions of mobility such as occupational mobility (Keane and Wolpin, 1997), internal mobility (Kennan and Walker, 2011), and job search (McCall and McCall, 1987). These papers usually rely on rich individual panel data and focus on endogeneity of the human capital accumulation process. A more closely related paper is Artuç et al. (2010) on the sectoral mobility of workers in domestic labour markets in the presence of international trade shocks. They estimate moving cost frictions using a dynamic discrete choice model, highlighting the importance of the option values of available choices.
The use of option values of locations as a concept in the migration literature is more sparse. The first paper, as far as we are aware, is Burda (1995) who models the timing of a single migration decision. There is an option value to wait due to underlying volatility in the economic environment. Locher (2001) explores the same concept in a two-period framework, using data on ethnic German migration from CIS countries. The next important contribution is Bayer and Juessen (2012) who model internal migration decisions in the US and estimate the structural parameters of a dynamic model. The option value again arises from underlying economic volatility.
Dynamic considerations are extensively analysed in a series of recent papers on temporary and return migration (see Dustmann and Görlach, 2016 for a recent survey). As several papers illustrate, return migration levels have always been quite high. Bandiera et al. (2013) show out-migration rates from the US were over 60% during the age of mass migration at the turn of the 20th Century. Bijwaard et al. (2014) use administrative data from the Netherlands and Bratsberg et al. (2007) use register data from Norway and Sweden to explore more recent patterns. Several papers on return and circular migration have used more formal and explicit dynamic models. Among the first examples, Kirdar (2012) develops a dynamic stochastic model to jointly explore return migration and savings decisions of migrants and estimates it using panel data from Germany. Among more recent and prominent examples are Thom (2010) and Lessem (2013) who use data from the Mexican Migration Project on detailed migration histories of individuals. Among their findings is the importance of border enforcement (similar to our simulations of blocked corridors) to circular and return migration decisions. Using multiple data sources from Mexico, Görlach (2016) develops a comprehensive dynamic life-cycle model to analyse the role of financial constraints on emigration, return migration and re-emigration decisions. In another life-cycle model, Adda et al. (2015) explore human capital accumulation and migration duration decisions using data on Turkish migrants in Germany. All of these papers rely on detailed panel data from a single origin or destination country (or a single corridor) as the data requirements of such dynamic models are quite demanding. One of our main contributions is to analyse a model of global migration patterns and find empirical solutions to address some of the data constraints.
The next Section presents the dynamic structural model. Then, we discuss the data used in the article, followed by the estimation algorithm. We next present the estimation results and the simulation exercises. We conclude with a discussion of future research paths and policy analysis.

Model
We present a dynamic model that allows repeated migration between alternative destinations in each period. Thus, the attractiveness of a destination from a given origin is not solely based on the income gap between them but it is also a function of future migration opportunities that become available at the new location. Unlike static models where migration decisions are made at a single point in time without further migration possibilities, our model captures economically important and ubiquitous sequential and transit migration patterns.
The agents in the model differ according to their current location, birth country and skill category. An agent's country of birth is indexed by i, country of residence is indexed by j and skill level is indexed by s. There are n countries in the model and $i, j \in \{1, 2, \ldots, n\}$. An agent chooses a destination country based on his expectations. If $i = j$, the person continues to live in his birth country and is not considered to be a migrant. If $i \neq j$, the agent is considered to be a migrant. In every period, an agent living in j can choose to migrate to any other country $k \neq j$, regardless of whether he is already a migrant or not. Alternatively, he can choose to stay in j.
An agent with skill level s living in j receives instantaneous (flow) utility $\phi^{s,j}_t$ at time t. This utility depends on the current location and skill level but it is independent of the agent's birth country. The next components of the utility function are the moving costs, which influence mobility decisions in almost every migration model. Moving costs are incurred if an agent born in country i decides to move from country j to another country $k \neq j$. We assume that the moving cost is equivalent to a one-time reduction in utility and is given by $C^{s,i,j,k} + \epsilon^{a,k}_t$. The first component of this moving cost, $C^{s,i,j,k}$, is fixed (non-random) and is a function of the agent's birth country, current country, destination country and skill level. This component does not vary among agents with the same characteristics. The second component, $\epsilon^{a,k}_t$, is random and varies by agent (indexed by a), by time and by destination country k. More specifically, $\epsilon^{a,k}_t$ is assumed to be i.i.d. and drawn from a mean-zero Gumbel distribution with scale parameter 1. Furthermore, we assume that the fixed cost component is zero for stayers, i.e. $C^{s,i,j,j} = 0$, $\forall j \in \{1, 2, \ldots, n\}$. However, the random cost $\epsilon^{a,k}_t$ applies to every choice, including staying. The utility of agent a can then be written as:

$$U^{s,i,j}_t(\epsilon^a_t) = \phi^{s,j}_t + \max_k \left\{ \beta E_t V^{s,i,k}_{t+1} - C^{s,i,j,k} - \epsilon^{a,k}_t \right\}, \qquad (1)$$

where $\epsilon^a_t$ is a vector of random shocks, k represents the available choices of migration destinations and $\beta$ refers to the discount factor between t and t + 1. Here $U^{s,i,j}_t(\epsilon^a_t)$ is the expected value of location j for agent a from country i at time t, conditional on the idiosyncratic shock vector $\epsilon^a_t$. We take expectations of the above expression with respect to agent specific shocks, writing $V^{s,i,j}_t \equiv E_\epsilon U^{s,i,j}_t(\epsilon^a_t)$, to get the Bellman equation which is expressed as:

$$V^{s,i,j}_t = \phi^{s,j}_t + E_\epsilon \max_k \left\{ \beta E_t V^{s,i,k}_{t+1} - C^{s,i,j,k} - \epsilon^{a,k}_t \right\}. \qquad (2)$$
In this setting, as agents decide whether to stay in their current residence country j or to migrate to another country k, they have to take the following into consideration as seen in expression (2) above: (i) the instantaneous utility $\phi^{s,j}_t$ of staying in j; (ii) expected future continuation payoffs of every potential destination k, represented by $V^{s,i,k}_{t+1}$; and (iii) the moving cost from j to k expressed as $C^{s,i,j,k} + \epsilon^{a,k}_t$. Ex ante identical agents may still make different mobility decisions because of the random component of their moving costs. These are equivalent to agent-specific utility shocks that affect the attractiveness of different destinations. Although agents are more likely to migrate from countries with small $\phi^{s,j}_t$ to those with large $\phi^{s,j}_t$, the reverse is still possible if the differences between random shocks are large enough. This is the mechanism through which the model would generate simultaneous migration flows from j to k and k to j. However, the future utility at a given destination k is not fully known, thus agents form expectations on these values and optimise accordingly. The instantaneous utility at a location, $\phi^{s,j}_t$, is not a function of the agent's birth country but the expected value, $V^{s,i,j}_t$, is a function of both birth country i and current residence country j. This is due to the assumption that the fixed component of the moving cost, $C^{s,i,j,k}$, depends on the birth, current and destination countries. This assumption is supported by the data, which show that the immigrants in a given country j are more likely to move to another country k when compared to the natives of j.
The distributional assumption on the random shock, $\epsilon$, allows us to solve the moving probabilities analytically. An agent born in i, currently living in j has the following probability of moving to k:

$$p^{s,i,j,k}_t = \frac{\exp\left(\beta E_t V^{s,i,k}_{t+1} - C^{s,i,j,k}\right)}{\sum_l \exp\left(\beta E_t V^{s,i,l}_{t+1} - C^{s,i,j,l}\right)}. \qquad (3)$$

The probability of moving from j to k is a function of location specific values and moving costs. This means we need to solve the Bellman equations to pin down the moving cost parameters $C^{s,i,j,k}$ and location specific instantaneous utility parameters, $\phi^{s,j}_t$. This process will allow us to calculate the direct and transit migration probabilities implied by the model as expressed in (3). The Bellman equations need to be solved recursively for any given parameter set and we will discuss the solution algorithm in detail in the next Section. The next step is to simplify the Bellman equation. The expected incremental utility of being able to move from a given country j at time t can be written as:

$$\Omega^{s,i,j}_t = E_\epsilon \max_k \left\{ \beta E_t V^{s,i,k}_{t+1} - C^{s,i,j,k} - \epsilon^{a,k}_t \right\} - \beta E_t V^{s,i,j}_{t+1}, \qquad (4)$$

which is the expected maximum future value minus the discounted value of the current choice. This is the option value of location j and it is a key feature of our model and empirical analysis. Similar to the moving probabilities, the 'E max' expression above can be solved analytically, thanks to the distributional assumptions on the random shock. 4 This leads to the following expression:

$$\Omega^{s,i,j}_t = \log \sum_k \exp\left(\beta E_t V^{s,i,k}_{t+1} - C^{s,i,j,k}\right) - \beta E_t V^{s,i,j}_{t+1}. \qquad (5)$$

The Bellman equation can now be simply expressed using the option value term: 5

$$V^{s,i,j}_t = \phi^{s,j}_t + \beta E_t V^{s,i,j}_{t+1} + \Omega^{s,i,j}_t. \qquad (6)$$

This expression shows that the value of location j has three components: (i) the current instantaneous utility $\phi^{s,j}_t$; (ii) the discounted expected next period utility of staying in j, given by $\beta E_t V^{s,i,j}_{t+1}$; and (iii) the option value of j (due to its access to other destination countries) as denoted by $\Omega^{s,i,j}_t$. The option value term in (5), as mentioned earlier, is the feature that separates our model from the standard approaches where migration is a single static decision. In standard models, migrants choose among the set of potential destinations and move from their birth country. Migrants live in that destination for the remainder of time, essentially a single period, and receive their payoff. In our setting, the agents face the migration decision repeatedly, regardless of whether they are living in their birth countries or have already migrated. They make their decisions based on the instantaneous payoff, location specific moving costs and expected continuation payoffs in those destination choices, as expressed in (6).
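The closed-form results that follow from the Gumbel assumption can be checked numerically. The sketch below is only an illustration under stated assumptions: the three net payoffs in `v` are hypothetical numbers standing in for the discounted continuation value net of the fixed moving cost, and the shock is drawn in the form in which it enters the payoff (an additive mean-zero Gumbel term with scale 1). It compares simulated choice frequencies and the simulated expected maximum with the logit formula and the log-sum expression.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329  # mean of a standard Gumbel(0, 1) draw


def logit_probs(v):
    """Closed-form choice probabilities exp(v_k) / sum_l exp(v_l)."""
    z = np.exp(v - v.max())
    return z / z.sum()


def simulate_choices(v, n_agents, seed=0):
    """Draw mean-zero Gumbel shocks and let each agent pick the best option."""
    rng = np.random.default_rng(seed)
    # Shifting the location by -gamma makes the shock mean zero.
    shocks = rng.gumbel(loc=-EULER_GAMMA, scale=1.0, size=(n_agents, v.size))
    payoffs = v + shocks
    choices = payoffs.argmax(axis=1)
    freq = np.bincount(choices, minlength=v.size) / n_agents
    return freq, payoffs.max(axis=1).mean()


# Hypothetical net payoffs for three options (index 0 = stay in j).
v = np.array([1.0, 0.2, -0.5])
freq, emax = simulate_choices(v, 200_000)
closed_form_emax = np.log(np.exp(v).sum())
```

Every option is chosen by some agents even though option 0 has the highest deterministic payoff, which is the mechanism behind simultaneous flows in both directions of a corridor.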
The option value associated with each potential destination in our model is similar to the 'option value' concept in the finance literature. Agents value their ability to move and this valuation is location specific. The option value of country j is going to be larger if it provides easier access to other and possibly higher income countries. We argue and show that the option value of such locations (such as Canada, the UK and Australia) is one of the main motivations behind the observed transit and sequential migration patterns. Finally, we should emphasise that the option value of a location, $\Omega^{s,i,j}_t = -\log(p^{s,i,j,j}_t)$, decreases as the probability of moving decreases. If the agents were completely immobile and were stuck in j forever, say due to complete border closures between j and other destinations, then the option value of j would decline to zero.
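A small numerical sketch can make the link between the option value and the probability of staying concrete. All numbers below are hypothetical: `beta_ev` stands for the discounted continuation values of three locations and the cost lists stand for the fixed moving costs faced by an agent living in location 0. Closing all borders is represented here by infinite moving costs, which is an illustrative device rather than part of the estimation.

```python
import math


def move_probs(beta_ev, cost_row):
    """Logit probabilities over destinations given net payoffs beta_ev - cost."""
    net = [b - c for b, c in zip(beta_ev, cost_row)]
    peak = max(net)
    weights = [math.exp(x - peak) for x in net]
    total = sum(weights)
    return [w / total for w in weights]


def option_value(beta_ev, cost_row, j):
    """Option value of location j as minus the log probability of staying."""
    return -math.log(move_probs(beta_ev, cost_row)[j])


beta_ev = [5.0, 7.0, 6.0]                 # hypothetical continuation values
open_costs = [0.0, 3.0, 4.0]              # agent lives in location 0
cheap_costs = [0.0, 1.0, 4.0]             # easier access to location 1
closed_costs = [0.0, math.inf, math.inf]  # complete border closure

omega_open = option_value(beta_ev, open_costs, 0)
omega_cheap = option_value(beta_ev, cheap_costs, 0)
omega_closed = option_value(beta_ev, closed_costs, 0)
```

In this toy case the option value rises when access to the high-value location becomes cheaper and falls to zero when no move is possible.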

Functional Form of the Moving Costs
The moving costs are one of the key determinants of the mobility patterns generated in the model. We assume that the fixed moving cost from country j to country k has two distinct components. The first component, denoted by $C^{s,j,k}_1$, is the cost of moving from j to k. This is assumed to be common for all agents whether they were born in j (locals) or were born in another country i and had migrated in an earlier period to j (immigrants). The second component, denoted by $C^{s,i,k}_2$, is the additional moving cost for immigrants who were born in i and are currently living in j. We refer to this component as the cost of transit migration and do not have an a priori expectation on its value. The data, however, indicate that immigrants are more likely to move on to another country k from j when compared to locals, which implies this second component is likely to be negative. We assume that the transit cost is a function of birth country i and destination country k. 6 The common cost component is given by (7) and the additional transit component by (8), where dist(j, k) is the distance between countries j and k, lang(j, k) is an indicator variable that is equal to 1 if countries j and k share a common language and gdp(j) is the GDP per capita in country j. Total fixed moving cost of moving from j to k for an agent born in i is, thus, given by:

$$C^{s,i,j,k} = C^{s,j,k}_1 + 1_{i \neq j} \, C^{s,i,k}_2, \qquad (9)$$

where $C^{s,j,j}_1 = 0$, $C^{s,i,i}_2 = 0$, and the indicator function $1_{i \neq j} = 1$ if $i \neq j$ and 0 otherwise. In other words, the fixed moving cost is equal to zero for stayers independent of their birth country; it is equal to $C^{s,j,k}_1$ for people moving from their birth country to country k and it is equal to $C^{s,j,k}_1 + C^{s,i,k}_2$ for transit migrants, that is, people born in country i, living in j and moving to k. 7
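The way the two components combine can be sketched in a few lines. The function below takes the two components as given arrays (the names `c1` and `c2` and the random numbers used to fill them are hypothetical placeholders, not the estimated cost functions) and assembles the total fixed cost for every birth country i, residence j and destination k, with zero cost for stayers.

```python
import numpy as np


def total_fixed_cost(c1, c2):
    """Build cost[i, j, k] = c1[j, k] + 1{i != j} * c2[i, k], zero for stayers.

    c1[j, k] : common cost of moving from j to k
    c2[i, k] : additional transit cost for agents born in i moving to k
    """
    n = c1.shape[0]
    c1 = c1.copy()
    c2 = c2.copy()
    np.fill_diagonal(c1, 0.0)  # staying is free
    np.fill_diagonal(c2, 0.0)  # no transit component when returning home
    is_immigrant = 1.0 - np.eye(n)  # indexed [i, j], equals 1 when i != j
    cost = c1[None, :, :] + is_immigrant[:, :, None] * c2[:, None, :]
    idx = np.arange(n)
    cost[:, idx, idx] = 0.0  # stayers pay nothing regardless of birth country
    return cost


rng = np.random.default_rng(1)
c1 = rng.uniform(1.0, 5.0, size=(3, 3))   # hypothetical common costs
c2 = rng.uniform(-2.0, 0.0, size=(3, 3))  # hypothetical (negative) transit costs
C = total_fixed_cost(c1, c2)
```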

Data
Among the main constraints faced in estimating dynamic migration models is the availability of detailed and comprehensive data. Most global bilateral migration data sets are based on censuses and population registers of the destination countries where the migrants currently reside. These data sources record only the country of birth or citizenship of migrants. Other important variables, such as the year of arrival or migration status, are not included in most surveys. Detailed migration histories tend to be available only in small and specialised surveys that are not nationally representative. 8 Without comprehensive global data that cover all possible destinations, it is difficult for empirical and analytical papers to explore beyond static models.
6 Unfortunately we cannot use destination specific parameters in these expressions, since we use only the transit migration data to the US. However, we can use bilateral and origin specific parameters.
7 In reality, both direct and transit cost components might be functions of the time a migrant spends at the current location. Furthermore, the random component might show persistence over time. We abstract away from these issues in our model and estimation since it is not feasible to identify duration of stay of migrants at a given location.
The data in this article come from two different sources. The global bilateral skilled migration database, as described in great detail in Artuc et al. (2015), provides migrant stocks by gender and education level (tertiary and non-tertiary) for 1990 and 2000 for each pair of countries in the world. Part of the data comes from original statistical sources and comprises around 80% of the migrant stock in the world. The rest are imputed using gravity based models. Artuc et al. (2015) describe the empirical methods behind the imputation based on a gravity model. Since the imputed data might bias our results, we use only the raw data for 2000. The parallel data for 2010 for OECD destinations come from the DIOC database which is described in Arslan et al. (2014). The DIOC 2010 database also has information on the time of arrival of migrants by skill level and country of origin for all of the OECD destinations. This new disaggregation, not available for earlier decades, allows us to construct the number of new arrivals by skill and origin country for 2000-10. We discuss in the next Section how we use this new data in our empirical analysis. Finally, we have collected additional bilateral migrant stock data by skill level for around 50 non-OECD destinations in 2010. We use this data to complement the DIOC data.
The second data source is the annual American Community Survey (ACS) of the US for 2001-11. ACS provides detailed information on a large sample of the population, including migrants who are defined as people born outside the US, regardless of their citizenship or residency status. The survey asks the country of birth, the year of migration to the US and where people were living a year earlier.
Using these questions, we take the people who declared that they had moved to the US within the previous year, record their country of birth and identify their country of residence prior to their move. This procedure allows us to identify those who came directly from their birth countries and those who came from another country. If a recent migrant was residing in a country other than their birth country, we consider them to be transit migrants. No other data source, to our knowledge, has this level of detailed and extensive information about immigrants and their migration paths. Despite this unique information, we do not actually know when transit migrants moved to their country of last residence. This information is not needed for our modelling purposes but would be quite useful for future analysis.
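The classification rule described above is simple enough to sketch directly. The record layout below (birth country, country of residence before arrival, survey weight) and the sample values are hypothetical and do not reproduce actual ACS variable names or figures.

```python
from collections import Counter


def transit_share(records):
    """Weighted share of recent arrivals who came from a non-birth country.

    records: iterable of (birth_country, prior_residence, weight) tuples.
    """
    totals = Counter()
    for birth, prior, weight in records:
        kind = "direct" if birth == prior else "transit"
        totals[kind] += weight
    return totals["transit"] / (totals["direct"] + totals["transit"])


# Hypothetical recent arrivals with survey weights.
sample = [
    ("India", "India", 60),    # direct migrant
    ("India", "UK", 20),       # transit migrant: born in India, arrived from UK
    ("Mexico", "Mexico", 20),  # direct migrant
]
share = transit_share(sample)
```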
We construct our database in the following manner. We only include people between the ages of 18 and 65 at the time of the survey. Dropping the elderly (65+) reduces the sample by around 4% while removing the children and young adults shrinks it by around 24%. The main reason for removing young migrants is that they are likely to be moving with their parents and are unlikely to be making the individual mobility decisions that form the basis of our model. We include everybody in our sample who states that they arrived during the year of the survey or during the previous year and were abroad in the year before they arrived. 9 This leads to slightly over 9 million people in the sample.
The ACS surveys have, on average, 175 different countries and territories as possible answers to the place of birth question. On the other hand, the 'country of residence one year ago' question has only 67 categories listed. Many smaller countries in a given region are aggregated into regional categories, such as 'Other South America' or 'Other West Asia'. To complicate the issue even further, many countries, such as Chile, Peru and Turkey, are individually identified in some of the survey years but aggregated into the regional categories in others, making the analysis close to impossible. To address these complications, we construct the smallest set of countries and regional aggregates that are consistent over time, ending up with 40 geographic units, including the US. Among these, 25 are separate countries and 15 are regional aggregates of the 150 other countries (or regions) that are listed in the birthplace category. The list of our 40 groups and the countries they include are presented in Table 1. This type of aggregation of the world into 40 countries or groups potentially lowers our estimates and the extent of transit migration in the data. For example, if a migrant who was born in Peru goes to Argentina before moving to the US, we will not be able to identify him as a transit migrant since Peru and Argentina are part of the same category (Other South America) in our data and estimation. For compatibility, we group the countries in the global bilateral migration database and DIOC/DIOC-E along the same lines. Finally, in addition to their country of birth and last residence, we split all migrants as tertiary or non-tertiary educated.
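The downward bias from aggregation can be illustrated with the Peru and Argentina example from the text. The mapping below is a deliberately partial, hypothetical stand-in for Table 1: only the two countries named in the example are assigned to a regional aggregate, and any country missing from the mapping is treated as its own unit.

```python
# Partial, illustrative mapping; the full 40-unit list is in Table 1.
GROUP = {
    "Peru": "Other South America",
    "Argentina": "Other South America",
}


def to_group(country):
    """Map a country to its geographic unit (itself if not aggregated)."""
    return GROUP.get(country, country)


def is_transit(birth, prior_residence, aggregate=True):
    """Flag a migrant as transit if birth and prior residence units differ."""
    if aggregate:
        return to_group(birth) != to_group(prior_residence)
    return birth != prior_residence


# Born in Peru, lived in Argentina before moving to the US.
seen_at_country_level = is_transit("Peru", "Argentina", aggregate=False)
seen_after_aggregation = is_transit("Peru", "Argentina", aggregate=True)
```

At the country level this migrant counts as a transit migrant, but after aggregation the two locations collapse into one unit and the move is recorded as direct.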
Figures 1 and 2 as well as Tables 2 and 3 provide information on the extent of transit migration to the US. Table 2 shows what percentage of migrants born in a given country was residing in a different country before migrating to the US. For example, 7% of Canadian born migrants came to the US from a different country. The same ratio is only 1% for Mexican migrants but over 20% for migrants born in many European countries such as the UK, Italy and Russia. The same information is presented visually in Figure 1 to highlight regional differences and similarities. We see that transit migration is higher among migrants born in Africa and the Middle East and lower in Latin America, as migrants from the latter region have more direct access to the US due to geographic proximity and diaspora linkages. Table 3 presents the percentage of migrants coming from a given location who were born somewhere else and Figure 2 displays the same information visually. The data indicate that transit migration is quite high among the migrants who were living in higher income OECD countries. For example, 30% of migrants from Canada to the US were born in another country; for the UK, that ratio is 37%. Despite the public perception, the data do not indicate that migrants coming from Central American or Caribbean countries are generally transit migrants. Of course, many Central American migrants might simply be travelling through Mexico without living there for an extended period of time.
9 There are two questions we use for this purpose. The first question asks when the migrant moved to the US and the second asks the following: 'Where did this person live one year ago?' We should note these two questions seem to create a certain level of confusion which possibly arises from when the survey is conducted within that year. Suppose the survey is conducted in September 2006 and the migrant arrived in the US in March 2005. The answer to the first question will be that the migrant moved to the US during 2005. In other words, this question is likely to capture more than 12 months of arrivals. That is why the total number of recent arrivals (current or previous year) obtained by summing up each individual survey year is actually more than the number of people who arrived during 2001-11 based on the 2012 ACS data. For the second question, some people will interpret 'one year ago' as September 2005 and state where they were living in the US. Other respondents will interpret one year as the previous calendar year and report the foreign country they were in before coming to the US. As a result, around 40% of the people who stated they migrated during that year or the previous year also reported a location in the US to this second question. We drop these observations from our sample. This adjustment brings our total number of observations to 9.03 million, which is now 10% less than the number of people who arrived during 2001-11 according to the 2012 ACS data.
Our final observation is that transit migration is more common among high-skilled migrants as seen when we compare the second and third columns of Tables 2 and 3. In the aggregate, close to 14% of high-skilled but only 6% of low-skilled migrants are transit migrants. These differences indicate that higher skilled migrants are more mobile, not only in terms of leaving their home countries for higher income countries (Artuc et al., 2015) but also in terms of moving between multiple destinations.

Solution and Estimation Algorithm
We solve the model at steady state to calculate implied migration probabilities to be used in the estimation algorithm (i.e. we assume that $V^{s,i,j}_t = V^{s,i,j}_{t+1} = V^{s,i,j}_0$). The steady-state Bellman equation is:

$$V^{s,i,j}_0 = \phi^{s,j} + \beta V^{s,i,j}_0 - \log p^{s,i,j,j}_0, \qquad (10)$$

where the moving probability from j to k of migrants from i is given by:

$$p^{s,i,j,k}_0 = \frac{\exp\left(\beta V^{s,i,k}_0 - C^{s,i,j,k}\right)}{\sum_l \exp\left(\beta V^{s,i,l}_0 - C^{s,i,j,l}\right)}. \qquad (11)$$

The model has the following parameters that need to be estimated (or calibrated) based on (7), (8), (10) and (11): (i) the moving cost parameters, $\alpha$; (ii) the location specific instantaneous utility parameters, $\phi$; and (iii) the discount factor, $\beta$.
We define the parameter vector $\theta$, which consists of all parameters of the model for skill group s, i.e. $\alpha^s$, $\phi^s$ and $\beta$. Since there are eight moving cost parameters, n location utility parameters and $\beta$, our $\theta$ vector has n + 8 + 1 elements.
Solving the equilibrium values involves finding a fixed point. We first consider the matrix $V^s_0$ consisting of all $V^{s,i,j}_0$s and define the function $V^s_0 = F(V^s_0; \theta)$, using (7), (8), (10) and (11). Finding the fixed point of function F, given the parameter vector $\theta$, leads us to the equilibrium values of $V^{s,i,j}_0$. We should note that $V^s_0$ is an $n \times n$ dimension matrix which means we have a large system of $n^2$ equations with $n^2$ unknowns. 10 After solving for the $V^{s,i,j}_0$s, we can insert them in (11) to calculate the moving probabilities, $p^{s,i,j,k}_0$. This procedure allows us to write these moving probabilities as a function of the parameter vector, denoted as $p^{s,i,j,k}(\theta)$. Then it becomes possible to calculate the log-likelihood contribution of an agent from origin i, living in j, and moving to k as $\log p^{s,i,j,k}(\theta)$ for any given $\theta$.
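The fixed-point step can be sketched as a simple successive-approximation loop. This is a minimal illustration, not the estimation code: it takes the flow utilities and the fixed cost array as given inputs (filled here with hypothetical numbers for a four-location example), uses the log-sum form of the expected maximum under mean-zero Gumbel shocks, and stops when the sum of squared changes in the value matrix falls below a tolerance.

```python
import numpy as np


def solve_steady_state(phi, cost, beta=0.95, tol=1e-8, max_iter=10_000):
    """Iterate V <- F(V) until the sum of squared changes falls below tol.

    phi[j]        : flow utility of location j (one skill group)
    cost[i, j, k] : fixed moving cost from j to k for agents born in i
    Returns V[i, j] and the implied moving probabilities p[i, j, k].
    """
    n = phi.size
    v = np.zeros((n, n))
    for _ in range(max_iter):
        net = beta * v[:, None, :] - cost  # net[i, j, k]
        peak = net.max(axis=2)
        emax = peak + np.log(np.exp(net - peak[:, :, None]).sum(axis=2))
        v_new = phi[None, :] + emax
        converged = ((v_new - v) ** 2).sum() < tol
        v = v_new
        if converged:
            break
    else:
        raise RuntimeError("fixed point iteration did not converge")
    net = beta * v[:, None, :] - cost
    weights = np.exp(net - net.max(axis=2, keepdims=True))
    return v, weights / weights.sum(axis=2, keepdims=True)


# Hypothetical four-location example.
rng = np.random.default_rng(0)
n = 4
phi = np.array([8.0, 7.5, 6.0, 5.0])
cost = rng.uniform(2.0, 6.0, size=(n, n, n))
idx = np.arange(n)
cost[:, idx, idx] = 0.0  # staying is free
V, p = solve_steady_state(phi, cost)
```

Because the update shrinks differences between candidate value matrices by at most the factor $\beta < 1$, plain iteration converges from any starting point, although faster solvers could be substituted.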

Likelihood Function
In the American Community Survey (ACS) data, as discussed in the data Section above, we have detailed mobility and birthplace information for people who migrated to the US. Transit migrants are defined as the people born in i and living in j before moving to the US. We denote their number as $M^{s,i,j,U}$. 11 Next, using the stocks of migrants born in i and living in j in 2000 from the global bilateral migration database and DIOC (see Arslan et al., 2014 and Artuc et al., 2015, for details) and the $M^{s,i,j,U}$ matrix, we calculate the approximate number of people who are staying in their current country or moving to destinations other than the US. We denote this matrix as $M^{s,i,j,S}$. In other words, we divide the migrants born in i living in j ($i \neq j$) into two groups: those who move to the US, given by $M^{s,i,j,U}$, and those who moved elsewhere or stayed in j, given by $M^{s,i,j,S}$.
Using these data matrices, vectors and implied migration probabilities, the log-likelihood contribution of transit migrants, $\log L_1(\theta)$, can now be expressed as in (12), where $k \neq U$ means that the destination $k$ is any country other than the USA.

Our next dataset, the 2010 DIOC, provides data on the number of people who migrated within the last 10 years to OECD destination countries. This allows us to calculate the number of natives who moved from $j$ to $k$ within the last 10 years if $k$ is an OECD country. We denote this matrix as $M^{s,j,j,k}$. Next, we define $M^{s,j,j,N}$ as the number of people who move to non-OECD destinations (labelled collectively by $N$) and $M^{s,j,j,j}$ as the number of people who stayed in their birth country $j$ and did not migrate to any country. Both of these are $40 \times 1$ vectors for each skill group $s$. In summary, natives (who were born in $j$ and are living in $j$) are divided into the following groups: those staying at home, $M^{s,j,j,j}$; those moving to a specific OECD destination $k$, $M^{s,j,j,k}$; and those moving to any non-OECD destination, $M^{s,j,j,N}$.
The log-likelihood contribution of direct migrants, $\log L_2(\theta)$, is given in (13), where $k \notin \mathrm{OECD}$, $k \neq j$ means that $k$ is different from the birthplace and is not an OECD country, and $k \in \mathrm{OECD}$, $k \neq j$ means that destination $k$ is different from birthplace $j$ and is an OECD country. We merge all non-OECD destinations when we calculate the likelihood contribution since we do not have individual flow data for those. Finally, the log-likelihood contribution of stayers is equal to:

$$\log L_3(\theta) = \sum_j M^{s,j,j,j} \log p^{s,j,j,j}(\theta). \tag{14}$$

Based on the log-likelihood contributions from (12), (13) and (14), the parameter vector $\theta$ is given by:

$$\hat{\theta} = \arg\max_{\theta} \left[ \log L_1(\theta) + \log L_2(\theta) + \log L_3(\theta) \right]. \tag{15}$$

We assume that $\beta = 0.95$, normalise the fixed instantaneous utility of the USA to $\phi^{s,U} = 8.0$, and estimate the remaining $n - 1 + 8$ parameters. In other words, we fix the values of 2 elements of $\theta$ and estimate the remaining 47 parameters.
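For orientation, one way to write the first two contributions that is consistent with the verbal definitions of the migrant groups and with the form of (14) is sketched below. These expressions are an assumption built from the grouping of the data (movers to the US versus everyone else for transit migrants; individual OECD destinations versus merged non-OECD destinations for natives) and may differ in detail from (12) and (13).

```latex
% Transit migrants born in i, living in j (i \neq j): move to the US, or not.
\log L_1(\theta) = \sum_{i} \sum_{j \neq i}
  \Big[ M^{s,i,j,U} \log p^{s,i,j,U}(\theta)
      + M^{s,i,j,S} \log \sum_{k \neq U} p^{s,i,j,k}(\theta) \Big]

% Natives of j who leave: a specific OECD destination, or any non-OECD one.
\log L_2(\theta) = \sum_{j}
  \Big[ \sum_{k \in \mathrm{OECD},\, k \neq j} M^{s,j,j,k} \log p^{s,j,j,k}(\theta)
      + M^{s,j,j,N} \log \sum_{k \notin \mathrm{OECD},\, k \neq j} p^{s,j,j,k}(\theta) \Big]
```

In both cases the groups for which only an aggregate count is observed enter through the log of a summed probability, which is what merging destinations amounts to in a multinomial likelihood.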
The algorithm is computationally demanding, since it involves the calculation of $40 \times 40 \times 40$ probabilities. It is necessary to solve for all of them simultaneously since they are functions of each other. Another computational challenge is to search over 47 parameters to maximise the log-likelihood function. The steps of the estimation algorithm are the following:

[...] with its value from the previous iteration: if the sum of squared differences is less than $\varepsilon = 10^{-8}$, the algorithm stops and moves on to step (iii). Otherwise it updates $V^{s,i,j}_0$, goes back to step (b) and continues to iterate.
(iii) The algorithm evaluates the log-likelihood function, $\log L_1 + \log L_2 + \log L_3$, using $p^{s,i,j,k}_0$ and the data matrices via (12), (13), (14) and (15).
(iv) Then, the optimisation algorithm checks whether there is room to improve the log-likelihood function. If this is the case, it goes back to step (ii). If the algorithm has converged, it stops.$^{12}$
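Step (iii) can be illustrated for the native population (direct migrants and stayers). The sketch below is an assumption-laden illustration rather than the published code: the function name `native_loglik`, the argument layout and the toy data are hypothetical, and non-OECD destinations are merged by summing their probabilities.

```python
import numpy as np

def native_loglik(p, M_oecd, M_non_oecd, M_stay, is_oecd):
    """Log-likelihood of natives' moves for one skill group.

    p[j, k]       : probability that a native of j moves to k (rows sum to 1)
    M_oecd[j, k]  : observed movers from j to OECD destination k (other cells ignored)
    M_non_oecd[j] : observed movers from j to all non-OECD destinations combined
    M_stay[j]     : natives of j who stay in j
    is_oecd[k]    : True if k is an OECD country
    """
    n = p.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    oecd_dest = off_diag & is_oecd[None, :]
    non_oecd_dest = off_diag & ~is_oecd[None, :]

    # Movers to individually observed OECD destinations
    ll_oecd = np.sum(M_oecd[oecd_dest] * np.log(p[oecd_dest]))
    # Movers to non-OECD destinations, merged into a single category
    p_non = np.where(non_oecd_dest, p, 0.0).sum(axis=1)
    pos = M_non_oecd > 0                       # skip empty cells to avoid 0 * log(0)
    ll_non = np.sum(M_non_oecd[pos] * np.log(p_non[pos]))
    # Stayers
    ll_stay = np.sum(M_stay * np.log(np.diag(p)))
    return ll_oecd + ll_non + ll_stay

# Toy data: four locations, the first two labelled OECD, uniform probabilities
p = np.full((4, 4), 0.25)
is_oecd = np.array([True, True, False, False])
ll = native_loglik(p, np.ones((4, 4)), np.full(4, 2.0), np.full(4, 10.0), is_oecd)
print(ll)
```

An outer optimiser would call a function of this kind, together with the transit-migrant term, once per candidate parameter vector after re-solving the fixed point.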

Empirical Results
We estimate the parameters of the model separately for high-skilled and low-skilled individuals. As discussed above, the vector $\theta$ for each skill group $s$ has 49 elements: 40 value parameters (one for each country/region), eight moving cost parameters and $\beta$. We set $\phi^{s,U} = 8.0$ for the US and $\beta = 0.95$, and estimate the remaining 47 parameters. Table 4 presents the eight moving cost parameters for high and low-skilled workers, with the standard errors reported in parentheses. We find that the intercepts of the basic moving cost for direct migration, denoted as $\alpha^1_{s,0}$ in (7), are positive and significant at the 1% level for both low and high-skilled migrants. The additional cost intercepts for transit migrants, denoted as $\alpha^2_{s,0}$ in (8), are negative for both skill levels. This is consistent with the fact that immigrants from country $i$ living in country $j$ are much more likely to migrate to another country $k$ than the natives of $j$. The magnitudes of these transit migration coefficients are smaller than those of the basic cost coefficients, indicating that the combined intercept for transit migration is still positive.

Note. Standard errors are in parentheses. * Significant at 99% level.

12 The estimation and simulation codes are available as Supporting Information, see Data S1.
The most important migration cost variable is likely to be distance. The additional distance coefficient for transit migrants is positive and significant for low-skilled migrants and insignificant for high-skilled migrants. As expected, distance increases migration frictions for direct migrants and the cost imposed by distance is significantly higher for lower-skilled migrants. This pattern has been identified repeatedly in the literature, as the higher skilled can more easily overcome the physical and financial costs that are proxied by distance (Hanson, 2010; Beine et al., 2011). This difference is also one of the reasons why migrants tend to be positively selected in terms of education levels. On the other hand, the distance coefficient for the transit cost component is only significant for low-skilled migrants. This result indicates that, once migrants are already in the transit country $j$, the distance between birth country $i$ and final destination $k$ seems to matter only for the low-skilled. High-skilled migrants, in contrast, are not significantly affected by this distance.
GDP per capita is another determinant of moving costs and the effect operates through several channels, such as financial barriers. The coefficient is negative and significant for basic migration for both types of migrants, with a higher (more negative) value for the low-skilled. The implication is that direct migration from a poorer country is more costly, especially for low-skilled migrants. The GDP per capita coefficients in the transit cost component are both positive, but significant only for low-skilled migrants. This is similar to the pattern we observed for the distance variable. Low-skilled people from poorer countries are more likely to be transit migrants when compared to those from high-income countries. Another possible explanation is that a transit migration experience abroad might increase labour market returns in final destination countries. This effect will be stronger for those coming from lower-income countries and will be captured by a positive coefficient of the GDP per capita component of transit migration cost.$^{13}$

Linguistic overlap decreases migration frictions and increases mobility, as demonstrated in numerous studies using gravity models (Beine et al., 2016). Common language decreases basic migration costs for both skill levels, as demonstrated by the statistically significant negative coefficients, with a larger effect on the high-skilled. The interesting result is that linguistic similarity continues to be important for transit migration costs, and still more so for high-skilled migrants. For example, an Australian migrant in France faces a lower cost of migrating to the US than a German migrant in France, and the cost is even lower if he is high-skilled.
Our final set of results presents the instantaneous utility parameters, $\phi^{s,j}_0$, for each country/region $j$ and skill level $s$ in our model. Note that we set $\phi^{s,U} = 8.0$ for the US and estimate the parameters for the other 39 locations. Figure 3 presents the results for both skill levels, with results for the high-skilled on the left panel and the parameter values ranked in declining order in each panel.$^{14}$ We emphasise that these parameters are estimated separately for each skill level and the right comparisons are within, not across, skill levels. First, we observe, as expected, a high degree of correlation with the income levels in these locations. High-income OECD countries, such as Canada, [...] This pattern indicates that high-skilled migrants enjoy much larger instantaneous utility gains when they move, especially from low-income to high-income countries. Third, several low-income countries, such as India and China, have large utility values for the low-skilled. This pattern is consistent with the low emigration rates of these countries and might result from the non-pecuniary benefits these locations provide to natives or from country-specific mobility costs that the utility parameters capture.

13 We thank an anonymous referee for this interpretation.

14 All of these instantaneous utility parameter estimates are significant at the 99% level with t-statistics between 14 and 70.

Simulations
This Section presents the results of a range of policy simulations that use the estimated coefficients of moving costs and location specific values. Our main goal is to identify the main spillovers between direct and transit migration options and how various simulated policies, such as complete blockage of certain migration corridors, impact bilateral and global migration patterns. In all of these simulations, we consider all origin countries/regions and two representative transit countries: Canada and the UK.
Each simulation involves solving the equilibrium outcomes for all 40 countries/regions, but we selected only a sample of representative countries for presentation purposes. The simulation algorithm is similar to the one used in the estimation. This time, however, $V^{s,i,j}_t \neq V^{s,i,j}_{t+1}$, so we use (3), (5) and (6), rather than (10) and (11). The simulations use the original population and migrant distributions from the data as the initial state variable. The population distribution includes agents' birthplace, current location and skill level, making it a $40 \times 40$ matrix for each skill level.
The number of type $s$ individuals who were born in country $i$ and are living in country $j$ at time $t$ is denoted as $L^{s,i,j}_t$. For the simulation, we change the parameter vector of the model to $\theta'$. Then, we solve the optimisation problem using the new parameters, $\theta'$, and calculate the new migration probabilities, $p^{s,i,k,j}_t(\theta')$, for each agent type. Next, we find the new migrant stocks, $L^{s,i,j}_{t+1}$, using the new probabilities and the flow equation. We repeat this exercise for each simulation scenario as described in the following subsections.$^{15}$

Figure 4 helps to visualise the simulation scenarios where the US is the final destination. For simplicity, we present three countries in the Figure (Canada, the UK and other countries) even though the simulations are solved for all countries separately. People can migrate directly to the US or via other countries. As we mentioned earlier, Canada and the UK are presented as the two possible transit countries in the Figure and the discussion, but the model is solved for all potential transit routes.
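The stock update can be sketched as below, assuming the flow equation takes the usual accounting form $L^{s,i,j}_{t+1} = \sum_k p^{s,i,k,j}_t(\theta')\, L^{s,i,k}_t$ for a given skill group. This form, the function name `update_stocks` and the numbers are illustrative assumptions.

```python
import numpy as np

def update_stocks(L_t, p_t):
    """One period of the (assumed) flow equation for a single skill group.

    L_t[i, j]    : people born in i and living in j at time t
    p_t[i, k, j] : probability that someone born in i and living in k moves to j
    Returns L_next[i, j] = sum_k L_t[i, k] * p_t[i, k, j].
    """
    return np.einsum("ik,ikj->ij", L_t, p_t)

# Toy example: two countries, same transition matrix for both birthplaces
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p_t = np.tile(P, (2, 1, 1))          # shape (birthplace, current, next)
L_t = np.array([[90.0, 10.0],
                [5.0, 95.0]])
L_next = update_stocks(L_t, p_t)
print(L_next)
```

Because each row of the transition probabilities sums to one, the total population by birthplace is preserved from one period to the next in this sketch.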

Simulation 1: Moving Cost from Canada to the US Increases
In our first set of simulations, we artificially increase the moving cost from Canada to the US to a level such that migration levels decrease by more than 99%. Technically, 100% prohibitive costs do not exist since the i.i.d. shocks have an infinite support. This is presented as the removal of the arrow between Canada and the US in Figure 4 and implies that both current migrants and native citizens in Canada are no longer able to migrate to the US. This change would force them to reconsider all of their migration decisions.
The results of the simulation are presented in Table 5. The main finding of this simulation is that migration from other countries to Canada decreases significantly, as seen in columns 1 and 3, even though there are no changes to migration costs or benefits regarding Canada. Recall that migration to Canada is attractive for two reasons. First, the instantaneous utility parameter for Canada is high compared to many other countries (the $\phi$ term in the value function expression (6)). Second, moving from Canada to the US is relatively easy, which generates a high option value associated with being in Canada (the $\Omega$ term in the value function). Therefore, when it becomes almost impossible to move from Canada to the US, the option value of migration to Canada decreases for almost all migrants. The final effect is a significant decline in the overall migration levels from other countries to Canada. Note that migration does not drop to zero, because it is still possible to move from Canada to other countries and Canada still has a relatively high per-period payoff.
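The option-value mechanism can be reproduced in a three-location toy example, where location 0 plays the role of the US, 1 of Canada and 2 of an origin country. The sketch assumes a steady-state log-sum value function as a stand-in for (6); the helper `solve` and all utilities and costs are made up for illustration and are unrelated to the estimates.

```python
import numpy as np

def solve(phi, C, beta=0.95, tol=1e-12, max_iter=100_000):
    """Steady-state values and logit moving probabilities (assumed log-sum form)."""
    V = np.zeros(len(phi))
    for _ in range(max_iter):
        x = beta * V[None, :] - C
        m = x.max(axis=1)
        V_new = phi + m + np.log(np.exp(x - m[:, None]).sum(axis=1))
        done = np.sum((V_new - V) ** 2) < tol
        V = V_new
        if done:
            break
    x = beta * V[None, :] - C
    p = np.exp(x - x.max(axis=1, keepdims=True))
    return V, p / p.sum(axis=1, keepdims=True)

phi = np.array([8.0, 7.0, 5.0])            # 0 = "US", 1 = "Canada", 2 = "origin"
C = np.array([[0.0, 5.0, 5.0],
              [3.0, 0.0, 5.0],             # cheap route from "Canada" to the "US"
              [6.0, 4.0, 0.0]])
V_base, p_base = solve(phi, C)

C_blocked = C.copy()
C_blocked[1, 0] += 50.0                     # near-prohibitive "Canada" -> "US" cost
V_new, p_new = solve(phi, C_blocked)

print("origin -> Canada, baseline:", p_base[2, 1])
print("origin -> Canada, blocked: ", p_new[2, 1])
```

The costs of moving to location 1 and its per-period payoff are untouched, yet the probability of moving there from location 2 falls because the value of being in location 1 no longer includes cheap onward access to location 0.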
In terms of specific outcomes, the number of immigrants moving to Canada decreases by between 3% and 18% for people from other countries. The impact is especially high for Latin American and Caribbean migrants. Comparing low and high-skilled migrants, we see that the impact is marginally stronger for the low-skilled regardless of the country of birth. The gap is wider for Latin American migrants, where the impact on the low-skilled is almost twice as large.
Another potential effect is the impact on other critical transit countries such as the UK. We see in columns 2 and 4 that there is almost no impact on migration from most countries to the UK when the Canada-USA border is closed. However, there is some increase in high-skilled migration from several countries, such as Jamaica (1.8%) and Other Caribbean countries (1.6%) as the UK becomes more attractive. Finally, we observe increased migration from Canada to the UK (1.9% for the low-skilled and 2.8% for the high-skilled) since the Canadians themselves are also unable to move to the US.

Simulation 2: Moving Cost to Canada Increases
Our second simulation involves increasing the moving costs to Canada from all developing (only non-OECD) countries to prohibitively high levels. This is presented as the removal of the arrow labelled simulation 2 in Figure 4. In other words, people are no longer able to move from India or China to Canada, whether they would like to settle there permanently or use it as a transit stop on their way to other countries such as the US. This policy change is similar to the spillover effects of bilateral visa policies implemented by destination countries on other potential destinations, as explored by Bertoli and Moraga (2015).

Table 6 presents results only for high-skilled migrants since the impact on low-skilled migrants is negligible, with the exception of the obvious decline of 100% in direct migration to Canada. This is probably due to the fact that low-skilled transit migration via Canada is already relatively small and has minimal impact on other transit paths. For high-skilled migrants, the number of transit migrants moving from Canada to the US decreases significantly, as seen in column 2, but not by the full 100%. This is because some migrants, say from Mexico, were already in Canada before the border closed and continue to move to the US over time. We also observe a decline in transit migration from OECD countries (such as France) to the US via Canada, even though migration from France to Canada is not blocked. This arises from the fact that transit migration of French-born migrants from developing countries to Canada is also blocked, resulting in a small decline in the overall migration of the French-born.
There is a small, and in some cases considerable, impact on all other paths of migration, especially direct migration to the US and transit migration via the UK. For example, direct migration to the US (first column) increases for Asian and Caribbean countries, by up to 3% for Jamaica, 6% for other Caribbean countries, and slightly over 1% for Korea, the Philippines, Western and Eastern Africa. Similarly, there is some increase in transit migration from the UK to the US (third column), indicating some degree of replacement. For example, there is an increase of around 2% in transit migration from Jamaica and 4% from other Caribbean countries, as well as around 1% from African regions. These are again due to global spillovers from changes in bilateral corridors.

Simulation 3: Moving Cost to the US Increases
Our next simulation increases the direct moving cost to the US to a prohibitive level for direct migrants from developing (non-OECD) origin countries. This is equivalent to the removal of the arrow labelled simulation 3 in Figure 4 and the results are presented in Table 7. We again only present the results for high-skilled migrants since the patterns are qualitatively the same but quantitatively smaller in the case of low-skilled migrants. Since transit migration forces are stronger for high-skilled individuals, they are more responsive to changes in moving costs and the US is more attractive for the high-skilled migrants from most developing countries.
Interestingly enough, closing the borders to migrants from developing countries was among the main proposals put forward by some of the leading candidates in the 2016 US election. Our results indicate that the impact of this policy simulation is actually quite large. First, we highlight that the US is already the largest destination for high-skilled migrants from most origin countries. For example, more than 70% of high-skilled workers who left the Philippines already live in the US.
When direct migration to the US becomes impossible, Canada and the UK become very attractive destinations. Both of these countries are high-income economies with similar income opportunities for migrants so there is a direct spillover in this scenario. These effects are presented in the second and fourth columns for Canada and the UK, respectively, as many high-skilled migrants move to these close substitutes. For example, as reported in columns 2 and 4, migration to Canada and the UK increases by between 25% and 45% for many Central American and Caribbean countries and by a staggering 165% for the Other Caribbean region that includes Haiti and other smaller island countries. The increase is between 5% and 15% for the African regions and between 3% and 15% for many non-OECD Asian countries. Although we do not report it, we observe positive but smaller changes in migration flows to other high-income OECD countries.
Canada and the UK also provide a pathway to the US. As a result, transit (or the option value of) migration to these countries increases when direct migration to the US is no longer possible and migrants try to find alternative routes. This motivation is clearly seen in column 1, where we observe a significant increase in transit migration to the US via Canada for a range of birth countries. For example, there is a 15-30% increase in transit migration from Latin American and Caribbean countries (and again [...]

The implications of eliminating transit migration for total migration to these countries are quite striking, as presented in Table 8. We observe that the level of migration to transit countries declines significantly even though these are still as attractive for permanent settlement. For example, as reported in column 1, the migration of high-skilled workers to key transit countries such as Canada, the UK and Australia from high-income countries (such as Germany, Italy and France) declines by 4.6%, 1.8% and 3.2% respectively. Similarly, migration from low-income countries to these three destinations declines by larger amounts: 7.0%, 2.4% and 4.5% respectively. There are similar significant declines in migration to other OECD transit countries. This result is one of the clearest and most direct pieces of evidence of the significance of transit migration and of the importance of many OECD destinations as transit countries, especially those with lower moving costs to the US.
The decline in low-skilled migration is slightly smaller but still relatively high. For example, low-skilled migration from high-income countries (column 3) to Canada is down by 3.1%. The decline in low-skilled migration from low-income countries (column 4) is also larger, 11.6% for Canada and 2.0% for the UK.
Finally, we observe bigger declines in percentage terms in migration levels to many of the smaller developing countries, especially for the high-skilled migrants from other lower-income countries. This is simply an artifact of the data. Since the original migration levels to these countries are already so small, any decline translates to a relatively large percentage change.

Conclusion
Most empirical analyses of migration decisions and patterns are based on static models. A potential migrant is assumed to make a single decision between alternative destinations and his current place of residence based on location-specific utility differences and bilateral mobility costs. Once migration takes place, the payoff and costs are realised and, most importantly, there are no additional movements. The data indicate that this simplistic view is quite off the mark. Many migrants' paths involve multiple countries and are likely to be the outcomes of complicated dynamic decision processes. Historical evidence and current data provide ample evidence that many migrants live in a series of countries for different lengths of time, experience repeated migration episodes, or have circular migration paths.
There is a relatively new and exciting strand of the literature that is based on innovative dynamic models. They rely on detailed survey data on individual migration histories from single origin or destination countries to estimate the determinants of return or circular migration behaviour. Unfortunately, the absence of detailed individual data at the global level prevents us from shedding light on some of the other processes. Our contribution is to construct a novel dynamic model of global migration with the goal of incorporating and explaining transit migration patterns, using the available data. In our model, agents decide to stay in their current location or move to another one every period, taking instantaneous utility payoffs and bilateral mobility barriers into account. This dynamic structure leads to an option value associated with each location, making it attractive as it provides easier access to other locations in future periods. This attractiveness is exactly the basis of the transit migration behaviour we observe. Our empirical analysis relies on transit migration data from the American Community Survey, which asks migrants their place of birth and where they were living the year before they came to the US. These data indicate that transit migration (people coming to the US from places other than their birth countries) is actually high, especially among high-skilled migrants coming from other high-income OECD countries. We combine the American data with global bilateral migration data and adopt estimation methods to address the data challenges. We finally simulate certain scenarios that highlight the dynamic interactions and spillovers between different migration paths. For example, blocking certain paths might decrease or increase migration flows in other corridors as the option values of the impacted locations change.
In other words, governments need to incorporate the externalities created by their policy actions, especially on their neighbours, via direct and transit migration channels.