Policy Research Working Paper 9518

A Reappraisal of the Migration-Development Nexus
Testing the Robustness of the Migration Transition Hypothesis

Nicolas Berthiaume
Naomi Leefmans
Nienke Oomes
Hugo Rojas-Romagosa
Tobias Vervliet

Macroeconomics, Trade and Investment Global Practice
January 2021

Abstract

This paper tests the migration transition hypothesis that emigration flows first increase and later decrease with a country’s economic development. Using a migration version of the gravity model, this hypothesis is tested on a global panel data set comprising 180 origin and destination countries and a 50-year timeframe (1970–2020). This is the most extensive panel data set used so far to test the migration transition hypothesis. The results confirm the existence of an inverted U-shaped relationship between development and emigration within a cross-country panel setting. Nevertheless, the migration hump cannot be interpreted as a causal relationship: for a given low-income country, an increase in economic development is not found to lead to higher emigration. For a subsample of 44 countries that have transitioned from low-income to middle-income status, emigration has rather declined with economic development. The migration transition hypothesis is therefore unfounded. Instead, the migration hump appears to be driven by an underlying cross-sectional pattern that cannot be fully controlled: middle-income countries tend to exhibit higher emigration rates than low- or high-income countries. The findings of this paper have important policy implications: development programs can simultaneously promote economic development and reduce emigration.

This paper is a product of the Macroeconomics, Trade and Investment Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world.
Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at hrojasromagosa@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

A Reappraisal of the Migration-Development Nexus: Testing the Robustness of the Migration Transition Hypothesis

Nicolas Berthiaumeᵃ, Naomi Leefmansᵇ, Nienke Oomesᵃ, Hugo Rojas-Romagosaᶜ and Tobias Vervlietᵃ

ᵃ SEO Amsterdam Economics
ᵇ University of Amsterdam
ᶜ World Bank

JEL-Classification: F22, O15
Keywords: International migration, Economic development

1 Introduction

Globalization has facilitated physical mobility and as a result enabled international migration to increase from 92 million in 1960 to 244 million in 2017.¹ The traditional view that the root cause of these rising migration flows has been a lack of economic development in origin countries has resurfaced in the past few years within the policy debates of both sending and receiving countries.² Would-be migrants, the argument goes, decide to move primarily in search of higher wages and income abroad. In this framework, exogenous non-economic factors such as natural disasters and conflicts at origin are secondary. The direct relation between income differentials and emigration originates from the neoclassical theory of migration.
3 This theory posits that a higher domestic reservation wage reduces the relative expected returns on emigration, as opposed to staying at home. This implies that, the larger the income and wage differentials between countries, the higher the migration pull factors are. Consequently, emigration is predicted to decrease as income gaps between origin and destination countries close. An important policy implication of this theory is that high-income countries can decrease immigration through policies that help low-income countries raise their average incomes and development levels. When income differentials decline as a result, so will the migration flows from low-income to higher- income countries. This will also relieve strained borders and stem the brain drain that negatively affects developing countries (Caselli, 2019). Accordingly, since the 1990s, policy makers, academics and development NGOs have advocated a triad of policies aimed at fostering development in emigration countries through aid, trade liberalization, and temporary and return migration (De Haas, 2007). However, although these models are intuitively appealing, they do not adequately explain observed patterns of migration. Empirical evidence shows that migration determinants do not depend only on economic factors such as income and wages, but also on migrant networks abroad, foreign immigration policies, and demographic transitions (Clemens, 2014). The migration transition hypothesis developed in Zelinsky’s (1971) seminal paper, on the other hand, accounts for both these economic and non-economic migration determinants. This creates a richer interrelation between migration and economic income levels. In particular, this hypothesis predicts a nonlinear inverted-U relationship between development and migration. Emigration first rises as development increases in a given origin country, until a so-called migration transition turning point is reached, after which emigration starts declining. 
As Clemens (2014) explains, this phenomenon can be driven by factors that may accompany the economic development process, such as rising inequality, gradually easing credit constraints, and structural labor market changes leading to worker dislocation.

¹ These values were computed using the World Bank’s Global Bilateral Migration (Özden et al., 2011) and the United Nations’ Trends in International Migrant Stocks (UN Department of Economic and Social Affairs, 2017) databases. This corresponds to an increase in migration flows from 3% of the world population in 1960 to 3.2% in 2017.
² This notion was first put forth in Todaro (1969) and Lucas (1975), and is exemplified in the European Commission’s (EC) European Agenda for Migration (EC, 2015), for instance.
³ The neoclassical model of migration was first elaborated in Ravenstein (1985). See De Haas (2011) for a survey on the different theories on the determinants of migration.

There is an extensive literature on the determinants of migration that has tested Zelinsky’s hypothesis. Using cross-section data, many studies find an inverted U-shaped relationship between levels of GDP per capita and the share of emigrants, even after controlling for other determinants of migration (Djajic et al., 2016; Dao et al., 2018; Idu, 2019). However, testing for a migration hump using cross-section data leaves important considerations unaccounted for, such as reverse causality and the migration transition’s longitudinal dimension, as the transition takes place over an extended time period in a given origin country. Other studies have tested for a hump shape using panel data (Mayda, 2010; Bertoli and Huertas-Moraga, 2013). However, these papers use a limited number of country-time points, which restricts the empirical strength of their results. Other papers test the inverted-U relationship using solely migration flows to OECD destinations (Llull, 2016; Benček and Schneiderheinze, 2019).
These studies, however, exclude the possibility that migrants from low-income countries can also migrate to other low- or middle-income countries. Since the average share of migration from all origins to non-OECD destinations is 50% over the 1960-2017 period,⁴ we include such migration flows in order to incorporate all migration corridors in the analysis.

The aim of this paper is to test for the inverted U-shape between emigration and development using a large panel database. We employ a comprehensive global panel data set with 180 origin and destination countries over a 50-year timeframe (1970-2020).⁵ This allows us to empirically test for bilateral migration dynamics not only across countries but also across time with a relatively large number of observations. Because of its large longitudinal dimension, the data set is well suited for testing the migration transition hypothesis’ central prediction, which is a long-run phenomenon per origin country (De Haas, 2010).

Our empirical specification is based on the random utility-maximization (RUM) model, which provides the micro-foundations for a migration version of the gravity model.⁶ We employ a gravity-migration specification with a large number of fixed effects, which control for several observed and unobserved origin-, destination-, time- and country-pair-specific characteristics deemed to influence migration. We introduce both a linear and a squared term in GDP per capita at origin (our proxy for development levels) to test for the non-linear inverted-U shape. GDP per capita is instrumented using its period-to-period lag in order to tackle reverse causality. The data set presented in this paper further extends that of Llull (2016), who employs a similar panel data set including bilateral migration flows for the 1960-2000 time period but does not test for the inverted U-shaped relationship between development and emigration.
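To make the role of the linear and squared income terms concrete, the sketch below computes the turning point implied by a quadratic specification. It assumes, purely for illustration, that GDP per capita enters the specification in logs; the function names and the coefficient values are hypothetical and are not estimates from this paper.

```python
import math


def turning_point(beta_linear: float, beta_squared: float) -> float:
    """GDP per capita at which b1*ln(y) + b2*ln(y)**2 reaches its maximum.

    Assumption (not taken from the paper): income enters in logs.
    """
    if beta_squared >= 0:
        # A non-negative squared-term coefficient gives no interior maximum.
        raise ValueError("inverted U requires a negative squared-term coefficient")
    # Setting the derivative b1 + 2*b2*ln(y) to zero gives ln(y*) = -b1 / (2*b2).
    return math.exp(-beta_linear / (2.0 * beta_squared))


def peak_inside_sample(peak: float, gdp_min: float, gdp_max: float) -> bool:
    """Check that the implied peak lies strictly within the observed income range."""
    return gdp_min < peak < gdp_max


# Hypothetical coefficients chosen only to illustrate the arithmetic.
peak = turning_point(beta_linear=3.4, beta_squared=-0.2)
```

A negative squared-term coefficient does not by itself establish a hump: if the implied peak falls outside the observed income range, the fitted relationship is monotonic over the sample. Checking the location of the peak, as `peak_inside_sample` does, is one simple safeguard against reading an extremum into a quadratic fit.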
To our knowledge, this is the first paper on the migration transition hypothesis that tests the RUM model on a global panel data set that extends over a period of 50 years and includes bidirectional flows for 180 origin and destination countries. This comprehensive database accounts for all potential migration flows, and not merely flows to OECD destinations. As stated above, about half of all international migration, on average, was to non-OECD destinations. Merely including OECD destinations would therefore leave out a large portion of all migration flows. Furthermore, we reduce the bias due to the presence of zeros in the dependent variable using a Poisson Pseudo-Maximum-Likelihood estimator with High-Dimensional Fixed Effects (PPML-HDFE), rather than by omitting the zeros or resorting to data aggregations. Lastly, we conduct additional alternative tests of an inverted U-shaped relationship, whereas previous studies have generally relied on quadratic model estimations alone and hence risked incorrectly finding an extremum.

⁴ Computed using the World Bank’s Global Bilateral Migration (Özden et al., 2011) and the United Nations’ Trends in International Migrant Stocks (UN Department of Economic and Social Affairs, 2017) databases.
⁵ Data on international migrant stocks in 2019 is used as a proxy for 2020.
⁶ Gravity models are more commonly employed in the trade literature, but several migration studies also use them. See Beine et al. (2016) for an extensive review of the migration literature employing RUM micro-founded gravity models.

Our results confirm the existence of an inverted U-shaped relationship between development and emigration within a cross-country (panel) setting. This result is robust to the inclusion of additional control variables and the estimation of the empirical model on alternative time subsamples.
It is also robust to the inclusion of an interaction term between geographical distance and income at origin, and several additional tests for the existence of the inverted-U relationship. However, we cannot conclude that our findings yield evidence of a causal link between development at origin and emigration flows. The reason is that multilateral resistance to migration (i.e., that the attractiveness of a given country depends on the latent attractiveness of other potential destinations) is not fully accounted for. The only viable way to adequately correct for this is to also include origin-time fixed effects next to other (origin, destination-time, time, and country-pair) fixed effects that we do include. However, like all other papers in the existing literature on this topic, our econometric model does not allow for the inclusion of origin-time fixed effects as these would be perfectly collinear with our origin-time- varying variable of interest: GDP per capita at origin. With this endogeneity issue remaining unsolved we cannot claim that our results establish a causal relationship. We perform several robustness analyses to test whether an initial increase in economic development indeed leads to higher emigration. To this end, we test, for a subsample of countries that have actually transitioned from the low-income to the middle-income category, whether their emigration has increased with development, by applying both a linear and a quadratic version of our regression model. From this and several other robustness tests, we do not find that the inverted-U relation between development and emigration based on panel data also implies such a relation for an individual low-income country over time. Accordingly, drawing the conclusion that the inverted-U relationship between economic development and emigration is causal seems unfounded. Several authors (e.g. 
De Haas, 2019; Clemens and Postel, 2018) have concluded from the migration transition hypothesis that as low-income countries develop, their emigration will tend to increase before declining after the turning point, and that development aid is therefore not a proper instrument to reduce emigration from low-income countries. Our findings do not support this conclusion. On the contrary, for a subsample of countries that transitioned from low- to middle-income status (excluding China and India), we find that, as these countries developed economically, their emigration actually declined. This has important policy implications for development cooperation: it suggests that development programs can actually reduce emigration from low-income countries if they are successful at promoting local economic development.

The remainder of this paper is structured as follows. In section 2, we review the theories that might give grounds to the existence of a migration-development inverted U-shaped ‘life cycle’ in any given country, as well as the current empirical evidence for them. Section 3 describes the data we use and provides a descriptive analysis. Section 4 outlines our empirical methodology and section 5 presents our results, including several robustness analyses. Section 6 concludes.

2 The migration-development ‘life cycle’

This section presents a literature review on the migration transition theories as well as the existing empirical evidence of the inverted U-shaped relationship between development and migration.

2.1 Theory

The migration transition hypothesis (Zelinsky, 1971; Gould, 1979) holds that economic, demographic, and socio-political forces, which co-occur with development, might also influence migration decisions. Under certain assumptions, such factors can jointly explain an inverted U-shaped relation between migration and development levels.
Following De Haas (2010), these factors affecting migration decisions can be grouped into migration capabilities and migration aspirations. On the one hand, migration capabilities (MC) can be expected to monotonically increase with development indicators such as income and education, as well as with the creation of migrant networks abroad. First, income growth implies that potential migrants are better able to finance migration (Vanderkamp, 1971; Faini and Venturini, 2010). This effect can be compounded by the impact of remittances from migrant communities abroad. Second, improvements in education and human capital raise the number of feasible migration destinations by increasing the number of visa classes (which are usually skilled-employment work visas) that migrants can obtain (Flahaux and De Haas, 2016; Ortega and Peri, 2013). Third, would-be migrants’ relationships with previous migrants already abroad may improve their ability to integrate in a given destination country, thereby further increasing migration capabilities (Massey, 1988). Yet, when the migrant population abroad grows, the positive network externalities generated by it may eventually disappear, due to the formation of a localized culture, gradually eroding the link between the established foreign network and potential domestic migrants (Epstein, 2008). Overall, with development, the rise in disposable income, human capital levels and migrant communities abroad leads to an increase in capabilities to emigrate. These MC can be expected to start growing more and more rapidly at first, because of the compounding impact of migrant networks and remittances, and later decelerating due to the formation of a localized culture with decreasing links with potential migrants in origin countries. This initial acceleration and later deceleration of migration capabilities with development is shown as the S-shaped MC curve in Figure 1. 
On the other hand, migration aspirations (MA) are more likely to have an inverted U-shape. Migration aspirations are a function of several factors, all of which are likely to first rise and later decrease with a country’s economic development (Clemens, 2014). These factors include:

(i) Population growth initially increases with development due to declining mortality rates, and at some point starts decreasing with further development due to declining fertility rates. The initial increase in population growth generates labor market pressures at home and thus increases demand for emigration, while at some point reduced population growth reduces emigration aspirations (Zelinsky, 1971).

(ii) Opportunity costs of migration for capital owners initially decrease with development and stop falling once the relative prices of production factors have adjusted to the economy’s opening to international trade (Samuelson, 1948;⁷ Martin and Taylor, 1996).

(iii) Domestic inequality rises with development, which can, for some subset of the population, increase the gap between expected and actual income, leading to an initial rise in migration aspirations (Stark, 2006).⁸ Once the subset of the population with the highest gap between expected and actual income has migrated, this gap is on average reduced in the total population, causing a fall in aggregate migration aspirations.⁹

⁷ According to the Stolper-Samuelson theorem, in a relatively poor country with an abundance of labor, trade liberalisation will increase the exports and relative price of the labor-intensive good and decrease the price of the capital-intensive goods. This translates into a more than proportional increase in labor wages and a simultaneous reduction in capital returns. For capital owners, opportunity costs of migration (which include these foregone capital gains at home) are thereby reduced, increasing migration incentives.
⁸ As domestic inequality rises, so does the income gap between the lower and higher ends of the income distribution. This lowers relative income for the poorest, and thus raises income expectations. Since migrating abroad may be a way to achieve this new level of expected income due to inter-country income differences, this can foster migration aspirations at the lower end of the income distribution.
⁹ In reality, this phenomenon generally does not generate a clear inverted U-shaped relationship between development and migration aspirations. Inequality in a country can rise and fall more than once as development increases. Nevertheless, inequality has a clear impact on the gains from migration attained by workers at different points in the income distribution and in time: as inequality rises, migration aspirations are thought to increase in tandem, and vice-versa (Borjas, 1987).

Figure 1 illustrates the hump-shaped line for migration aspirations and the S-shaped curve for migration capabilities. At development levels $D_{low}$ and $D_{medium}$, we assume that one’s aspiration to migrate is the same, at $MA_1 = MA_2$. Yet migration capabilities at $D_{low}$ are much lower than at the higher development stage $D_{medium}$. For an equal aspiration to migrate, this difference in capabilities is expected to be the reason why poorer individuals tend to migrate less. Conversely, possessing both a strong willingness to migrate and sufficient capabilities to act upon it, medium earners are most likely to emigrate. On the other hand, since high-income individuals possess the required ability but lack the willingness to migrate, their propensity to do so will be lower.

Figure 1 The migration transition hypothesis at the individual level
[Figure: curves MA and MC plotted against development level; labels shown include M, $MA_1 = MA_2$ and $MC_1$, with $D_{low}$ and $D_{medium}$ marked on the development axis.]
Source: De Haas (2010).

At the country level and over time, we therefore expect emigration to first rise as domestic development rises, until a certain ‘turning point’ at which migration aspirations and capabilities are both relatively high. From this point onwards, capabilities grow just marginally with development, while migration aspirations fall, gradually pulling aggregate emigration rates downwards. Migration transition theories therefore collectively predict that emigration has an inverted U-shaped ‘life cycle’ that is a function of the stage of development in the source country (Hatton and Williamson, 2011).

2.2 Empirical evidence

The inverted U-shaped relationship between migration and development has recently been observed in cross-sectional nonparametric regressions (Clemens, 2014; Dao, Docquier, Parsons and Peri, 2018). The turning point is graphically found to lie at a gross domestic product (GDP) per capita level varying from $4,000 to around $10,000 (in 2019 US dollars, adjusted for purchasing power parity (PPP)). Countries with medium levels of development are associated with the highest emigration rates, while both underdeveloped and highly developed countries exhibit comparatively low rates of emigration. Clemens (2014) and Dao et al. (2018) report that both the (initially) positive and the (later) negative relationships between emigration and GDP per capita levels were statistically significant. Clemens (2014) found that this cross-sectional, hump-shaped association holds for every decade since 1960 and becomes more pronounced with time. The turning point in GDP per capita remains at the same level, whereas the corresponding emigration rate increases over time. De Haas (2010) showed that the same cross-sectional inverted U-shaped relationship holds when using the human development index (HDI) instead of GDP per capita values.

It is not sufficient to merely observe that migration traces an inverted U-shaped pattern with development for a given year across countries.
There are a number of studies that test for the existence of the migration hump using parametric regressions in such a cross-sectional setup, such as Djajic et al. (2016), Dao et al. (2018) and Idu (2019). However, this leaves at least three important considerations unaccounted for.

First, the migration transition hypothesis’ central prediction is that this relationship ought to hold on average over time in any given country, and not merely in a given year across countries. That is, it is expected to hold in the longitudinal rather than in the cross-sectional dimension (Hatton and Williamson, 2011).

Second, development can be expected to affect migration flows, but rising migration also affects development levels, for instance through the remittances it generates. This can lead to reverse causality problems, which cannot be adequately tackled in a cross-sectional set-up.

Third, it can be expected that migration decisions strongly depend on observed or unobserved idiosyncratic characteristics of origin and destination countries. Examples of these factors are migration policies or individual preferences for migration, or drivers affecting pairs of countries, such as geographical distance or linguistic proximity. It is important to consider and correct for all costs and benefits related to every possible migration channel available to a would-be migrant.

Other studies have investigated the relationship between development at origin and emigration using panel data. Although not specifically testing the migration transition hypothesis, these authors have included a squared term in their specifications to test for nonlinearities in the migration-development nexus. These studies nonetheless use an insufficient number of country-time points to adequately test for this relationship. Mayda (2010) focuses on flows from 79 origin countries to 14 OECD destinations, and therefore does not incorporate other types of flows (e.g. South-South) in the analysis.
The data only contain migration observations for 15 years (1980-1995). Similarly, Bertoli and Huertas-Moraga (2013) test their migration model on a 12-year timeframe (1997-2009) for 61 origins to a single destination.

One paper that employs a methodology similar to ours is Llull (2016). His paper exploits a relatively new database of bilateral migrant stocks and finds heterogeneous effects of income gains on migration prospects depending on distance. Like our paper, it uses a gravity-migration specification which is tested using panel data. Moreover, Llull (2016) employs a similar bilateral data set, although the data we present in this paper cover a longer time span. Despite the similarities, this paper differs from Llull (2016) in three important ways. First, Llull (2016) does not test for the existence of a hump-shaped relationship between emigration and development. Second, he uses migrant stocks instead of migration flows as the dependent variable, which is not in line with the specification’s micro-foundation (Beine et al., 2016). Third, Llull (2016) does not use the PPML-HDFE technique and instead employs the Ordinary Least Squares (OLS) technique. OLS is known not to perform well when the proportion of zeros in the dependent variable is high, which is the case here. It also yields relatively high biases in the presence of heteroscedasticity (Silva and Tenreyro, 2006, 2011).

A second paper similar to ours, and as far as we know the only other one, is Benček and Schneiderheinze (2019), who more recently tested systematically for the existence of the migration hump. They find a negative relationship between income and emigration that is independent of the origin country’s initial income level. Like this paper, they investigate the existence of the hump shape not only in cross-section but also over time. Our methodology and data differ from Benček and Schneiderheinze (2019) in three ways.
First, we explore all bilateral migration flows, whereas Benček and Schneiderheinze (2019) only focus on unilateral emigration flows to OECD countries. Second, we employ an estimation method that limits the estimation bias due to the large number of zeros in our migration flow variable without having to exclude these observations. We do not make such sample selections, as they might generate bias due to the exclusion of many potential destination countries. Third, we include a complete set of origin- and destination-time fixed effects, which reduces, although does not fully eliminate, the potential endogeneity issues.

3 Data and descriptive analysis

3.1 Data

For the empirical analysis, we compiled an extensive panel data set comprising bilateral migration flows of 180 origin and destination countries for each decade from 1970 to 2020 (using 2019 as a proxy for 2020). The dependent variable is the bilateral migration flow in each of the five decades from 1970 to 2020.¹⁰ Each of the explanatory variables in our model varies in the origin-country and time dimensions only. We also employ fixed effects that vary in the destination, time and country-pair dimensions. All explanatory variables are averaged over decades, from $t-10$ to $t-1$.

Following Beine and Parsons (2015), migration flows are computed as the decade-to-decade difference in stocks: if $M_{ijt}$ represents the stock of migrants from country $i$ living in destination $j$ at time $t$, the migration flow in period $t$ is defined as $m_{ijt} = M_{ijt} - M_{ij,t-1}$.

¹⁰ We thus use the following reference years: 1970, 1980, 1990, 2000, 2010, and 2020.

To measure this variable, we merge two migrant stock databases produced by the World Bank and the UN. For the years 1960-2000, we use the World Bank’s Global Bilateral Migration database compiled by Özden et al. (2011).
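The flow definition above can be sketched as follows. This is a minimal illustration rather than the authors’ code: the function name `decadal_flows`, the dictionary layout and the stock figures are hypothetical. Negative differences are set to zero, following the treatment of negative flows that the paper adopts from Beine and Parsons (2015).

```python
from typing import Dict, Tuple

Corridor = Tuple[str, str]  # (origin, destination)


def decadal_flows(
    stocks: Dict[Corridor, Dict[int, float]]
) -> Dict[Tuple[str, str, int], float]:
    """Compute flows as the difference between consecutive migrant stocks.

    Assumes each corridor's stocks are given for consecutive reference years
    (e.g. 1960, 1970, 1980). Negative differences are set to zero.
    """
    flows = {}
    for (origin, destination), by_year in stocks.items():
        years = sorted(by_year)
        for previous, current in zip(years, years[1:]):
            difference = by_year[current] - by_year[previous]
            flows[(origin, destination, current)] = max(difference, 0)
    return flows


# Hypothetical stocks for a single corridor, for illustration only.
example_stocks = {("A", "B"): {1960: 100, 1970: 150, 1980: 120}}
example_flows = decadal_flows(example_stocks)
```

In this example the 1970 flow is the 50-person increase in the stock, while the 1980 flow is recorded as zero because the stock fell. A stock-difference measure of this kind captures net changes only, so return migration, deaths and onward moves are not separately identified.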
The World Bank database is based on raw data from the Global Migration Database of the Population Division of the United Nations Department of Economic and Social Affairs (UN DESA, 2008). It contains migrant stock data by country of origin compiled from a collection of 3,500 censuses spanning 230 migrant destinations, for every decade from 1960 to 2000.¹¹ For the years 2010 and 2020 we combine this World Bank database with the Trends in International Migrant Stocks data from UN DESA (2019), which contains data for the following reference years: 1990, 1995, 2000, 2005, 2010, 2015 and 2019. This methodology is akin to Özden et al. (2011). The year 2019 is used as a proxy for the year 2020.

In these databases, migrants are defined as foreign-born individuals who have moved to a different country.¹² As explained in Özden et al. (2011) and UN DESA (2019), this has advantages over defining them by their citizenship. The latter definition does not provide a consistent measure of international migrant stocks because of differing citizenship laws across nations, and because people in some countries can acquire citizenship after having been a migrant for a number of years. The foreign-born definition better captures the concept of migration as a “movement of a person or a group of persons, either across an international border, or within a State” (International Organization for Migration, 2011).

Both databases are based on the same underlying migration data and share many of the same processing methods. In both cases, the UN’s Population Division census data is used to compile the database. The same country list is employed for both databases, although the UN DESA data contains six more countries than the World Bank’s. In our merged data set, we only count those countries included in both databases.

The original data suffers from a substantial number of missing observations because many countries do not release national census data every 10 years.
These may be prohibitively expensive in terms of labor intensity, can be abandoned because of exogenous factors, such as civil unrest or conflict, or are never released for political reasons. The authors chose to minimize the number of gaps in the data through interpolations. For the ‘in-between’ years (1970, 1980, 1990 for the 1960-2000 World Bank data and 2000 and 2010 for the UN DESA data), they do so by assuming a linear trend before and after missing data points. Where data are lacking for the beginning or end decades, they use growth rates in migration, taken from the UN Total Migrant Stock database (2006), to estimate bilateral migrant stocks. It is important to note, however, that since both databases use interpolations and predictions to fill in for missing values, our compiled bilateral database will also include a number of predicted values. 13 As a result, our estimation results are partially based on using predicted values as independent variables, which leads to increased uncertainty on the results. 11 During the timeframe covered by these censuses, many regions reshaped their political boundaries, such as the USSR and Germany. For this reason, authors define their “master” country list as the most current set of countries. 12 This definition is used where possible. Whenever birthplace information is missing, the authors identify international migrants using the citizenship criterion in order to minimize the amount of missing data points. 13For the World Bank database, for example, Özden et al. (2011) report that around 30% of countries have no missing data, 60% have one to three missing census rounds and the remaining 10% have four to five missing rounds. However, the countries with no missing data represent 68% of total world migration, while countries with just one or two missing rounds represent an additional 22%. 
Hence, 90% of world migration in this database is either based on raw data or obtained by interpolating one or two data points out of a total of five.

The UN DESA (2019) database differs from the World Bank database (Özden et al., 2011) in two ways. First, UN DESA (2019) also adds data on refugees where available. Second, UN DESA (2019) uses nationally representative surveys to complement the international migrant stock estimates based on the population censuses and registers used in both databases. We follow Rojas-Romagosa and Bollen (2018) by appending the data sets using the most recent UN international migrant stock data for the year 2019. Employing decadal data enables us to closely map our data to the population census rounds, which are done every decade. As in Beine and Parsons (2015), we set negative flow values to zero.

To our knowledge, this is the most extensive panel data set used so far in the literature to test for the existence of the migration hump. First, the large time dimension (50 years) has not been used to test the migration transition hypothesis before and it is well-suited to capture migration’s long-run dynamics. Second, the large set of origins and destinations (180 countries, see Appendix Table A.1) enables us to test the model on every possible migration direction, and not just South-North flows. Appendix Table A.2 contains the definitions and sources for all variables used in this paper.

3.2 Descriptive statistics

Table A.4 in the Appendix shows the summary statistics for our dependent variable of interest (migration flows), migration rates (migrant stocks over population), our explanatory variable of interest (GDP per capita) and all other explanatory variables used in this study. Notably, the migration flow and rate distributions are strongly positively skewed: most of their mass lies close to zero, with a long right tail. This reflects the large number of migration directions with small or zero flows of migrants.¹⁴ The share of migration to OECD countries is equal to around 50% on average.¹⁵

Figure 2 Bilateral emigration rates over time for countries in each quartile of the income distribution to countries in the other quartiles, in the 1960-2020 timeframe

Note: Income groups were made by partitioning our PPP-adjusted GDP per capita country-time points into four equally sized (n = 285) quartiles. Emigration rates are computed as the ratio of the total number (stock) of migrants from a given income quartile country group residing in the destination income quartile country group to the total population in the origin income quartile country group. The low, lower-middle, upper-middle- and high-income quartiles respectively correspond to countries in the $392-$2207, $2207-$5708, $5708-$14943 and $14943-$279498 GDP per capita ranges (in PPP-adjusted constant 2011 US dollars).

The evolution of bilateral migration over time from each income quartile of the distribution of GDP per capita country-time points in our data set to all other quartiles is depicted in Figure 2. A similar graph, using the World Bank’s classification of countries into low-, lower-middle-, upper-middle- and high-income groups and presenting bilateral emigration rates over time for these income groups to all other income groups, can be found in Figure A.1 in the Appendix. As shown by both figures, migration rates are generally higher for lower- and upper-middle-income countries than for low- and high-income countries in each time period shown, in line with the cross-sectional migration hump.

4 Empirical analysis: Methodology

4.1 The canonical RUM model

The Random Utility-Maximization (RUM) model has recently been used in the migration literature, see Beine et al. (2016). This approach allows us to rigorously micro-found a migration version of the gravity model that is more commonly employed in the trade literature since Tinbergen’s (1962) seminal contribution.
The RUM expression of the location-decision problem faced by a would-be migrant (which translates into a simple utility-maximization problem) includes country-pair-specific utility components, which call for the inclusion of bilateral (gravity) variables in the empirical model.

14 To be precise, 153,700 migrant flow observations are equal to zero, or about 40.95% of the total.

15 This was computed using our country-time data set as the flow of international migrants from all possible origins having moved to an OECD destination in a given decade, averaged over the entire time sample.

Let us consider the location-decision problem faced by an individual h who considers migrating from a given country i to country j at time t. RUM models describe the utility derived from this move as:

(1) U_{hijt} \equiv w_{ijt} - c_{ijt} + \theta_{hijt},

where wijt denotes a deterministic component of utility and cijt represents the cost of migrating from i to j at time t. These can both be modelled as a function of observable variables, which should capture anything increasing or reducing the attractiveness of a particular destination and should include location- or country-pair-specific elements (Bertoli and Fernández-Huertas Moraga, 2013). Conversely, since θhijt is an individual-specific stochastic term, it cannot be observed. As has been repeatedly done in the migration literature, we assume that θhijt follows an independent and identically distributed extreme value type 1 distribution à la McFadden and Zarembka (1974). Applied to equation (1), the expected share of individuals residing in i who move to j at time t, E(pijt), can then be written as:

(2) E(p_{ijt}) = \frac{e^{w_{ijt} - c_{ijt}}}{\sum_{l \in D} e^{w_{ilt} - c_{ilt}}},

where D is the set of all countries the individual can choose from, l represents any country in this choice set, and pijt ∈ [0,1] is the actual share of individuals residing in i who move to j at time t.
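As an illustration, the expected shares in equation (2) can be computed numerically. The snippet below is a minimal sketch under stated assumptions: the function name `expected_shares`, the location labels and the net utility values are hypothetical and chosen only for illustration, and the choice set is taken to include the home country.

```python
import math

def expected_shares(net_utility):
    """Expected choice shares implied by equation (2).

    net_utility maps each location l in the choice set D to w_ilt - c_ilt.
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # the common factor cancels in the ratio, so the shares are unchanged.
    top = max(net_utility.values())
    weights = {loc: math.exp(v - top) for loc, v in net_utility.items()}
    total = sum(weights.values())
    return {loc: w / total for loc, w in weights.items()}

# Hypothetical net utilities for residents of one origin i at time t:
# staying home plus two foreign destinations.
net_utility = {"home": 1.0, "dest_A": 0.2, "dest_B": -0.5}
shares = expected_shares(net_utility)

for loc, share in shares.items():
    print(f"{loc}: {share:.3f}")
```

The shares sum to one by construction, and the ratio of any two shares depends only on the difference between the two net utilities, which is the property that makes this functional form convenient for gravity-type estimation.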
By definition, the expected scale of the migration flow from country i to country j at time t is E(mijt) = E(pijt) sit, where sit represents the size of the population residing in country i at time t. We can thus re-write expression (2) above as follows:

(3) E(m_{ijt}) = \frac{e^{w_{ijt} - c_{ijt}}}{\sum_{l \in D} e^{w_{ilt} - c_{ilt}}} \, s_{it}.

RUM models usually assume that the deterministic component of utility does not change with the origin country i. This allows us to re-write equation (3) as:

(4) E(m_{ijt}) = \Phi_{ijt} \frac{y_{jt}}{\Omega_{it}} s_{it},

where \Phi_{ijt} = e^{-c_{ijt}}, y_{jt} = e^{w_{jt}}, and \Omega_{it} = \sum_{l \in D} \Phi_{ilt} y_{lt}. In this expression, migration depends on the accessibility Φijt of destination j, its attractiveness yjt, the capacity the origin country i has to send out migrants, proxied by its total population, sit, and is inversely related to the utility derived by migrating to other destinations l ∈ D or staying in the home country, Ωit. Expression (4) is similar to other canonical gravity specifications, such as that used in the context of trade in Baier et al. (2019).

4.2 Main migration-gravity econometric specification

As is commonly done in the literature, we use GDP per capita levels (at PPP) as our measure of development levels at origin. To compute it, we use expenditure-side national GDP, which is most suitable for comparing living standards over time and across countries (Feenstra et al., 2015), divided by total population size. We include both a linear and squared origin country GDP per capita variable in order to test for the hypothesized nonlinearity in the impact of development at origin on subsequent emigration flows. These are our two variables of interest. Some econometric studies, however, claim that using merely a squared term in order to test for an (inverted) U-shaped relationship might lead to false conclusions (Lind and Mehlum, 2010; Haans et al., 2016).
Therefore, before we conclude that there truly is an inverted U-shaped relationship, we apply the three-step procedure of Lind and Mehlum (2010) and test our model fit when adding a cubic term to the empirical specification, as suggested in Haans et al. (2016).

To conform with the theory behind the RUM model (equation 4), we also control for population size. Within the RUM framework, population size measures the capacity that a given origin country has to send out migrants. Naturally, when a country has a larger population, it also has potentially higher migration flows in absolute numbers.

Following Rojas-Romagosa and Bollen (2018), we include country-pair fixed effects (FE) in our estimation. This is needed in order to account for all observable or unobservable bilateral time-invariant migration cost components, such as cultural or geographical distance, or any other time-invariant factor that might affect one’s choice of destination j. Taking logs of the RUM expression (4) above yields the following econometric specification:

(5) \ln(m_{ijt}) = \beta_1 \ln(GDPpc_{i,t-10}) + \beta_2 [\ln(GDPpc_{i,t-10})]^2 + \beta_4 \ln(s_{it}) + I_{ij} + I_{jt} + I_i + \varepsilon_{ijt}

where mijt represents migration flows from country i to country j at time t; GDPpci,t–10 is the 10-year lag of GDP per capita at origin; sit is the population size at origin at time t; Iij, Ijt and Ii are respectively pair, destination-time and origin FE; and εijt is the error term.

Estimated in logs as in (5), however, the empirical specification would run the risk of suffering from biased estimates due to the large number of zeros in our data set. Given the logarithmic form of the dependent variable, all pairwise observations with zero migration in the data would normally get dropped, as in log-linearized models estimated using OLS (e.g. Ortega and Peri, 2013; Llull, 2016). In order to avoid this, we estimate specification (5) using a Poisson pseudo-maximum-likelihood with high-dimensional fixed-effects (PPML-HDFE) estimator.
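To illustrate why a Poisson pseudo-maximum-likelihood approach retains the zero flows that a log-linear regression would drop, the following is a minimal sketch with a single regressor and no fixed effects. It is not the PPML-HDFE estimator used for the results in this paper; the function name `ppml_fit`, the Newton iteration and the toy data are assumptions made purely for illustration.

```python
import math

def ppml_fit(x, y, max_iter=100, tol=1e-12):
    """Poisson pseudo-maximum-likelihood for E[y | x] = exp(a + b * x).

    Solved by Newton's method on the two first-order conditions.
    """
    # Start from the log of the mean outcome with a zero slope.
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score of the Poisson pseudo-log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Negative Hessian, a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical flows, including two zeros that ln(y) cannot represent.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 3, 4, 12, 30]

a_hat, b_hat = ppml_fit(x, y)
fitted = [math.exp(a_hat + b_hat * xi) for xi in x]

# A log-linear OLS regression could only use the strictly positive flows.
n_used_by_log_ols = sum(1 for yi in y if yi > 0)
print(f"PPML uses {len(y)} observations, log-OLS would use {n_used_by_log_ols}")
```

Because the dependent variable enters in levels, all six observations contribute to the estimate, whereas taking logs would discard the two zero flows.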
As shown in Silva and Tenreyro (2006, 2011), PPML estimations perform well even when the proportion of zeros in the dependent variable is high, which justifies this approach given our data set. When compared to log-linearized gravity models, PPML estimations also yield relatively small biases in the presence of heteroscedasticity. To estimate the above model, we employ the estimator by Correia et al. (2019), which allows for a large set of different high-dimensional fixed effects structures. Exponentiating expression (5), our PPML migration specification can be expressed as follows:

(6) m_{ijt} = \exp\{\beta_1 \ln(GDPpc_{i,t-10}) + \beta_2 [\ln(GDPpc_{i,t-10})]^2 + \beta_4 \ln(s_{it})\} + I_{ij} + I_{jt} + I_i + \varepsilon_{ijt}

We use robust heteroscedasticity and autocorrelation consistent (HAC) standard errors, clustered by country of origin, because the errors may be heteroscedastic and are probably correlated over time within origin countries’ observations.

4.3 Dealing with endogeneity

A serious issue in the literature concerns potential endogeneity, in particular the possible reverse causality between development at origin and migration flows. The RUM expression (3) above does not make any specific assumptions about the direction of causality of the relationship between the prospective net utility of moving, wijt – cijt, and expected migration flows E(mijt). The former can impact the latter, but the reverse may also plausibly hold. Development at origin might affect one’s migration aspirations and capabilities, and thus overall migration flows, through the channels mentioned in section 2. However, migration outflows can also affect development levels at origin.
This could happen either directly (through remittances, modifications in consumption patterns, changes in asset accumulation at home, and brain drain) or indirectly (for instance, through changes in the prices of local production factors and goods, or thanks to migrants encouraging investments into their areas of origin).¹⁶ One way in which the literature (imperfectly) accounts for endogeneity is by assuming that current migration outflows may only affect present and future development levels, while past levels of income per capita can affect future levels of emigration (Mayda, 2010; Ortega and Peri, 2013; Idu, 2019). That is, migration flows in year t, mijt, can only impact GDP per capita at t, t + 1, t + 2, …, while income in previous periods t – 1, t – 2, … may impact contemporaneous and future migration flows. Following the literature, we therefore relate current migration flows to lagged values of GDP per capita in our estimations. This reverse causality problem is likely to be less present in our case, as we use 10-year lags in GDP per capita.¹⁷

Another potential concern is the so-called multilateral resistance to migration (MRM). This is defined in Bertoli and Fernández-Huertas Moraga (2013) as the confounding influence that all potential alternative destinations l ∈ D might have on one’s choice to migrate to country j. This is encapsulated in the term Ωit in equation (4). Ignoring this ‘third country effect’ has been shown to lead to omitted variable bias (Bertoli and Fernández-Huertas Moraga, 2013). Existing strategies used in the literature to control for MRM do not work in this case. For example, Ortega and Peri (2013) control for heterogeneous preferences for migration across countries, which induce MRM, by employing origin-time fixed effects. These are nonetheless perfectly collinear with any vector of time-varying origin variables wit and therefore do not allow for the inclusion of development at origin, our variable of interest, in the model.
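The collinearity argument can be made concrete with a small numerical sketch. Absorbing a set of fixed effects amounts to subtracting group means from each regressor; a variable that only varies at the origin-time level has no variation left once origin-time means are removed, whereas origin fixed effects alone leave its variation over time intact. The panel below is hypothetical, and the helper `demean_within` is introduced here only for illustration.

```python
from collections import defaultdict

def demean_within(values, groups):
    """Subtract group means, mimicking what absorbing fixed effects does to a regressor."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical dyadic panel rows: (origin, destination, decade, ln GDP per capita at origin).
rows = [
    ("A", "X", 1990, 7.0), ("A", "Y", 1990, 7.0),
    ("A", "X", 2000, 7.5), ("A", "Y", 2000, 7.5),
    ("B", "X", 1990, 9.0), ("B", "Y", 1990, 9.0),
    ("B", "X", 2000, 9.2), ("B", "Y", 2000, 9.2),
]
ln_gdp = [r[3] for r in rows]

# Origin-time fixed effects: one group per (origin, decade) pair.
after_origin_time_fe = demean_within(ln_gdp, [(r[0], r[2]) for r in rows])

# Origin fixed effects only: one group per origin.
after_origin_fe = demean_within(ln_gdp, [r[0] for r in rows])

print("residual variation with origin-time FE:", after_origin_time_fe)
print("residual variation with origin FE only:", after_origin_fe)
```

With origin-time groups every residual is zero, so the coefficient on GDP per capita at origin could not be identified; with origin groups only, the within-origin changes over time remain.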
A more general and less restrictive approach is the common correlated effects (CCE) estimator, which allows for consistent estimation in the case of spatially and serially correlated error structures. This estimator was proposed by Pesaran (2006) and employed in Bertoli and Fernández-Huertas Moraga (2013). However, with only six time periods, our data set does not have a sufficiently large longitudinal dimension for the CCE estimator to be used here.

Following Mayda’s (2010) approach and the arguments put forth in Beine et al. (2016), we (partially) control for MRM by introducing origin and destination-time fixed effects. These absorb time-invariant and time-varying unobserved country-specific effects, respectively. They also serve as a proxy for MRM induced by time-invariant aspects of heterogeneous preferences for migration at origin or by the temporally fluctuating attractiveness of alternative destinations (Beine and Parsons, 2015). Origin FE are not collinear with GDP per capita at origin, which varies temporally, and can thus be included in the estimation model. This is analogous to the standard Anderson-Van Wincoop trade-gravity specification (2003), which incorporates importer and exporter fixed effects to account for multilateral resistance to trade.

Adequately accounting for MRM would require including origin-time fixed effects in our model along with destination-time and country-pair-varying fixed effects, as is done in state-of-the-art trade-gravity specifications, such as Baier et al. (2019). However, this would cause collinearity issues with respect to our variable of interest, which varies in both country and time dimensions. For this reason, we cannot fully account for MRM and thereby eliminate the endogeneity bias from our estimation. Accordingly, our findings regarding the migration-development nexus cannot be argued to represent a causal relationship.

16 See Mendola (2012) for a review of this literature.

17 GDP per capita is averaged over decades, as explained in section 2. This means that we use information on GDP per capita from t – 20 to t – 11 to compute GDP per capita at t – 10. This longer time lag reduces the probability that reverse causality might be an issue in this case.

5 Results

5.1 Main results

Table 1 shows the results from our main specification. The coefficients on the linear and squared GDP per capita terms are significant and have a positive and a negative sign, respectively (see column (2)). These results provide empirical evidence of an inverted U-shaped relationship between GDP per capita at origin and emigration flows. Moreover, they confirm the existence of the hump not only in the case of South-North flows, which had largely been the focus of past research on the topic, but for all combinations of origin and destination countries. By focusing on South-North flows, usually by leaving out non-OECD destinations from their analysis, previous studies have excluded about half of total international migration over the 1970-2020 period. By including such flows, we can therefore provide a more accurate test of the migration transition hypothesis, which is expected to hold for every origin globally.

The results from our model estimation on alternative time subsamples suggest that the migration hump holds both before and after 2000. Table 1 (columns (3) and (4)) shows the results for both the 1970-2000 period and the 2000-2020 period. As can be seen in the table, the coefficients on the linear and squared GDP per capita terms are again significant and have a positive and negative sign, respectively. While the size of the two coefficients is lower for the latter timeframe, this decline is not significant.

Table 1 Results from base model, full sample and time subsamples

Migration flow                (1)        (2)        (3)        (4)
Ln GDPpc orig. (t – 1)        -0.0661    4.033***   4.003**    3.366***
                              (0.140)    (0.862)    (1.778)    (1.217)
Ln GDPpc orig. sq. (t – 1)               -0.257***  -0.253**   -0.195***
                                         (0.0489)   (0.106)    (0.0669)
Ln pop. orig.                 0.555*     -0.0588    0.827      -0.238
                              (0.299)    (0.272)    (0.551)    (0.402)
Year sample                   1960-2020  1960-2020  1960-2000  2000-2020
Observations                  89,490     89,490     65,703     30,812
Pseudo R-squared              0.899      0.902      0.927      0.907

Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

The finding of a migration hump remains robust when estimating the model separately for each decade within the 1970-2020 timeframe. This is illustrated in Figure 3, which shows the results of our non-parametric cross-country regressions of emigrant stocks on GDP per capita at origin (PPP-adjusted), for each of the five decades within the 1970-2020 period. Emigration rates are computed as the ratio of the total number (stock) of migrants from a given country residing in a foreign country to the total population in the origin country. These regressions depict an inverted U-shaped relationship between development at origin and emigration for each of these decades. Our results confirm those found in Clemens (2014) and Dao et al. (2018).

Figure 3 Non-parametric regression of the migration-development nexus in cross-section for each year in the 1970-2020 time period

Note: The dark red lines depict Second-Order Gaussian continuous kernel non-parametric regressions. Countries with emigration rates that are higher than 1 per year are omitted. The Cayman Islands and Kuwait are omitted from the regressions as well.

5.2 Robustness analyses

In order to test the robustness of our results, we perform two sets of robustness analyses. First, in Section 5.2.1, we test the robustness of the hump shape as a whole by conducting the analysis with several alternative specifications, which all support the finding of the hump shape.
Second, in Section 5.2.2, we specifically test the robustness of the finding that emigration initially rises when a low-income country begins to develop, corresponding to the upward-sloping ‘left-hand side’ of the migration hump at the lower end of the income distribution. There, we do not find support for the initial increase of emigration with development.

5.2.1 Robustness analyses of the hump shape

To test the robustness of the hump shape we use several alternative specifications. These include: (i) the addition of several origin-time control variables (their definitions and sources can be found in Appendix Table A.2, along with GDP per capita and population at origin), in order to prevent omitted variable bias in the origin country-time dimension; (ii) the inclusion of an interaction term between geographical distance and income at origin; and (iii) additional tests of the existence of an inverted U-shape between GDP per capita at origin and emigration flows.

(a) Controlling for demographic and other origin-time variables

In the first set of robustness checks, we augment our base model with several socio-demographic control variables. These serve to enrich our model by capturing more of the variation in the origin-time dimension and thereby reduce potential omitted variable bias. First, demographic factors at origin can significantly influence migration patterns through their impact on the domestic labor market structure. On a global scale, inter-country differentials in demographic structures might affect the directionality of migration flows, whereby countries with a large inactive population demand more labor from abroad in order to support the economy, while residents of countries with a relatively large labor force are more willing to emigrate. Also, higher population densities can make people more willing to emigrate, as they limit the amount of available resources per person.
In this light, we introduce the age dependency ratio and population density at origin (both defined in Appendix Table A.2) as controls. We expect a positive sign on the coefficient on population density: an increase entails higher pressures on a country’s resources, potentially leading to higher rates of emigration. The coefficient on age dependency could be either positive (e.g. a higher elderly dependency could lead to more emigration among pensioners, while a higher youth dependency could lead to more pressure for parents to look for better income opportunities abroad) or negative (e.g. higher elderly dependency may require more immigrants in elderly care), since several mechanisms are at play here.

Moreover, political instability or poor governance may catalyze emigration, sometimes by forcing it. The landscape of politically driven emigration can range from people fleeing a war or a genocide to those seeking better living conditions, in the form of secured property rights or the freedom of expression. In order to capture the influence of these factors on emigration, we introduce the Polity IV index at origin, along with the number of months the origin country has been in any sort of conflict (genocides, politicides, and ethnic and revolutionary wars). We expect a negative sign for the former, as the willingness to emigrate from a relatively democratic country is expected to be low. The coefficient on our conflict variable is expected to be positive.

Populations can be displaced by natural disasters as well, which might destroy means of living in the origin country and thereby force people to flee appalling conditions at home to seek higher material wealth abroad. We account for these in an alternative specification through the number of natural disasters that occurred in a country during the time period considered.
The coefficient on this variable is expected to be positive, as a rise in natural disaster occurrences in a given time period should lead to more outward migration.

In order to prevent potential collinearity issues, only control variables that have an absolute correlation of 0.4 or lower with GDP per capita are included in the estimation.¹⁸ Further, since natural disaster occurrences are highly correlated with the natural logarithm of population at origin (correlation > 0.4), we do not simultaneously include them in the estimation. The same goes for the Polity IV index at origin and the age dependency ratio. Therefore, we first incorporate each control variable into the main model separately in order to test their significance with no influence from other potential factors. We then include all controls at the same time, excluding some variables to avoid collinearity.¹⁹

18 A correlation matrix for all explanatory variables included can be found in Appendix Table A.3.

(b) Estimating alternative time subsamples

The positive average global growth rates in GDP per capita between 1970 and 2020 led to a rightward shift of the world per capita income distribution. An increasing number of countries now lie in the middle- to high-income per capita group. This can have an impact on the existence of the migration transition. If the migration turning point lies at relatively low GDP per capita levels, then the hump will be more pronounced for earlier periods, assuming that the turning point remains constant over time. Otherwise, if the turning point does move to the right over time, this effect does not occur. In order to test whether the hump shape became less pronounced, we subdivide our country-time sample into two distinct timeframes, taking advantage of the panel structure of our data set. The two timeframes chosen were 1970-2000 and 2000-2020 (the year 2000 cutoff was chosen arbitrarily). We then estimate model (6) on these two subsamples.
This will also enable us to have a better idea of where the actual migration transition point lies, and thus which income levels actually drive the migration transition.

(c) Controlling for interactions between geographical distance and income at origin

Furthermore, the impact of a change in income at home on emigration might differ depending on the distance to the potential migration destinations considered by a would-be migrant. For instance, the effect of a positive income shock on one’s decision to move might be more pronounced if the destination considered is closer to home. This can be due to the fact that migrants considering a faraway destination might focus more on long-run income prospects than on fluctuating income shocks in their migration decision. Moving farther away implies less flexibility to move back and forth to one’s home country to benefit from wage fluctuations. Following Llull (2016), the interaction between geographic distance and income at origin, \widetilde{GDPpc}_{it} \cdot \widetilde{D}_{ij}, is included in model (6), where, for any variable x with sample mean \bar{x}, we define \tilde{x} \equiv x - \bar{x}. This yields the following estimation model:

(7) m_{ijt} = \exp\{\beta_1 \ln(GDPpc_{i,t-10}) + \beta_2 [\ln(GDPpc_{i,t-10})]^2 + \beta_4 \ln(s_{it}) + \beta_5 \widetilde{GDPpc}_{it} \cdot \widetilde{D}_{ij}\} + I_{ij} + I_{jt} + I_i + \varepsilon_{ijt}

(d) Additional tests for the existence of an inverted U-shape

Lastly, we conduct further statistical tests of the inverted U-shaped relationship between development at origin and emigration. As argued in Lind and Mehlum (2010), merely adding a quadratic term to an otherwise linear specification can be too weak a criterion to test for such a nonlinear relationship if the latter is either convex or monotone. In this case, one might be led to a type I error where the null hypothesis of linearity is wrongly rejected because an extreme point, and thus an inverted U-shape, is found.
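The quantities involved in such a check can be illustrated with simple arithmetic. For a quadratic in ln GDP per capita, the slope at a value x is β1 + 2β2x and the turning point is –β1/(2β2). The sketch below plugs in the rounded point estimates from Table 1, column (2), and the range of ln GDP per capita reported in the summary statistics (6.126 to 12.541). It only inspects point estimates and is not a substitute for a formal test, which also requires the estimated covariance matrix of the coefficients; the variable names are chosen here for illustration.

```python
import math

# Rounded point estimates from Table 1, column (2).
beta_linear = 4.033     # coefficient on ln GDP per capita at origin
beta_squared = -0.257   # coefficient on its square

# Range of ln GDP per capita in the sample (summary statistics).
ln_gdp_min, ln_gdp_max = 6.126, 12.541

def slope(ln_gdp):
    """Derivative of beta_linear * x + beta_squared * x**2 with respect to x."""
    return beta_linear + 2.0 * beta_squared * ln_gdp

turning_point = -beta_linear / (2.0 * beta_squared)
slope_at_min = slope(ln_gdp_min)
slope_at_max = slope(ln_gdp_max)

# Necessary conditions for an inverted U, checked on point estimates only.
inverted_u_in_point_estimates = (
    beta_squared < 0
    and slope_at_min > 0
    and slope_at_max < 0
    and ln_gdp_min < turning_point < ln_gdp_max
)

print(f"turning point (ln GDPpc): {turning_point:.3f}")
print(f"implied GDP per capita:   {math.exp(turning_point):,.0f}")
print(f"slope at lower bound:     {slope_at_min:.3f}")
print(f"slope at upper bound:     {slope_at_max:.3f}")
print(f"inverted U in point estimates: {inverted_u_in_point_estimates}")
```

With these rounded coefficients the turning point comes out at roughly 7.85 in logs, with a positive slope at the lower bound and a negative slope at the upper bound of the data range.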
19 The Polity IV index or age dependency (correlation coefficient of -0.44) and the natural disaster variable or natural logarithm of total population (correlation coefficient of 0.62) are excluded in turn because of their relatively high correlation with each other.

To account for this potential issue, we follow Lind and Mehlum’s (2010) three-step procedure.²⁰ First, we verify that β2 in specification (6) is significantly negative. Second, we check whether the slopes at both ends of the data range, to the left and to the right of the optimum, are significantly different from zero, and positive and negative, respectively. Third, the turning point should lie well within the data range. With regard to the first step, we use the results from the estimation of the empirical specification (6). The second and third steps are done using the Sasabuchi test (Sasabuchi, 1980; Lind and Mehlum, 2010). This test checks the robustness of an inverted U-shaped relationship by testing whether the slopes to the left and the right of the turning point are significantly positive and negative, respectively. We also test the fit of our model when adding a cubic term, thus allowing for the curve to take an S-shape rather than a U-shape.

Results of robustness checks

The first set of robustness checks shows that the main result remains unchanged when augmenting the main model with a set of socio-demographic controls. As shown in Table 2, in terms of significance and sign, our main result regarding the two GDP per capita coefficients remains unchanged when the age dependency ratio, the Polity IV index, the number of natural disaster occurrences, conflict duration and population density are individually added to the main model. Networks and population density, which are the only variables with significant coefficients, both have the expected positive sign.
This suggests that these variables, either through an increase of a country’s diaspora population or through a negative impact on resource availability, might foster emigration.

20 See also Haans et al. (2016).

Table 2 Results from the estimation of the base model, augmented with selected origin-time control variables (specification is sometimes changed to avoid multicollinearity)

Migration flow             (1)        (2)        (3)        (4)        (5)        (6)
Ln GDPpc (t – 1)           4.090***   4.387***   3.855***   3.308***   3.964***   4.409***
                           (0.864)    (0.911)    (0.827)    (0.925)    (0.822)    (0.861)
Ln GDPpc sq. (t – 1)       -0.260***  -0.278***  -0.246***  -0.209***  -0.252***  -0.281***
                           (0.0487)   (0.0521)   (0.0464)   (0.0524)   (0.0467)   (0.0490)
Ln population              -0.0357    -0.172                -0.349     -0.0652    -0.170
                           (0.273)    (0.263)               (0.313)    (0.277)    (0.275)
Age dependency ratio       0.0018
                           (0.0052)
Polity IV index                       0.00230
                                      (0.0115)
Nat. disaster occurrence                         -0.0010
                                                 (0.0015)
Networks (t – 2)                                            0.409**
                                                            (0.165?)
Conflict duration                                                      0.0026
                                                                       (0.0018)
Population density                                                                0.0004***
                                                                                  (0.0001)
Observations               88,229     80,692     90,396     75,123     89,490     88,171
Pseudo R-squared           0.902      0.903      0.902      0.907      0.902      0.902

Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

Including all of these control variables in the main model, while changing the specification to avoid multicollinearity issues,²¹ shows that the main results do not change. As Table 3 shows, both the linear and squared GDP per capita terms remain highly significantly positive and negative, respectively. Moreover, population density and our network variable are both significant across most specifications. This evidences the role of demographic pressure and the impact of the size of the diaspora in affecting one’s propensity to migrate. Population density keeps its expected positive sign, while the coefficient of the network variable unexpectedly turns negative in some specifications.
Table 3 Results from the estimation of the base model with different combinations of control variables

Migration flow (1) (2) (3) (4) (5)
Ln GDPpc (t – 1) 3.653*** 3.423*** 3.304*** 2.986*** 3.221***
(0.879) (0.900) (0.839) (0.902) (0.844)
Ln GDPpc sqr. (t – 1) -0.230*** -0.219*** -0.206*** -0.188*** -0.207***
(0.050) (0.053) (0.047) (0.052) (0.050)
Ln population -0.403 -0.555* -0.613**
(0.328) (0.302) (0.301)
Age dependency ratio 0.00130 0.00284 -0.00418
(0.00611) (0.00628) (0.00616)
Polity IV index 0.0134 0.0143 0.0136
(0.0109) (0.0114) (0.0108)
Conflict duration 0.00256 0.00267 0.00250 0.00270 0.00264
(0.00195) (0.00178) (0.00204) (0.00183) (0.00176)
Network (t – 2) 0.422*** -16.950*** 0.434*** -16.810*** -17.330***
(0.155) (5.071) (0.149) (5.196) (5.333)
Population density 0.000321*** 0.000415** 0.000304** 0.000291 0.000394**
(0.000117) (0.000201) (0.000125) (0.000207) (0.000183)
Natural disaster occurrences 0.00167 0.000296 6.04e-05
(0.00226) (0.00214) (0.00202)
Observations 72,775 66,973 72,775 66,973 66,973
Pseudo R-squared 0.907 0.911 0.907 0.911 0.911

Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

21 Variables with more than 0.4 (absolute) correlation are not included together in the same specification. A correlation matrix can be found in Table A.3 in the Appendix.

The results from the estimation of model (7) are shown in Table 4. While both the linear and the squared GDP per capita terms are highly significant and have the expected positive and negative signs, respectively, the added interaction term is only weakly significant. In accordance with Llull (2016), we therefore find some evidence suggesting that income shocks might have a heterogeneous impact on emigration depending on distance to destination.

The results of the Sasabuchi test for the (inverted) U-shape can be found in Appendix Table A.5.
The slopes at both ends of the data range are significant and of the expected signs: positive at the lower bound and negative at the upper bound. The overall test for the presence of an inverted U-shape between GDP per capita and migration flows also allows us to reject the null hypothesis that emigration evolves linearly with GDP per capita, and thus further confirms the existence of the inverted-U shape. Moreover, the extremum point, at ln(GDPpc_it) = 7.85788, lies well within the range of Ln GDP per capita, which runs from 6.126 to 12.541 (see Appendix Table A.4 for summary statistics). Finally, adding a cubic term to the empirical model does not improve model fit, as Table A.7 in the Appendix shows: the linear, squared and cubed GDP per capita terms are all insignificant. Given these results and the ones above, the migration-development nexus is thus more likely to follow an inverted-U shape than an S-shape.

Table 4 Results from the estimation of model (7) with the interaction term distance and GDP per capita

Migration flow
Ln GDPpc orig. (t – 1)        3.913***
                              (0.910)
Ln GDPpc orig. sq. (t – 1)    -0.248***
                              (0.052)
Ln pop. orig.                 -0.103
                              (0.291)
Ln Dist.*Ln GDPpc orig.       0.136*
                              (0.077)
Observations                  84,336
Pseudo R-squared              0.904
Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

5.2.2 Robustness analyses of the initial increase of migration with development

Taken together, the robustness tests in sections 5.1 and 5.2.1 seem to provide strong empirical support for the migration hump predicted by the migration transition hypothesis: an inverted-U relationship between development levels and emigration. This finding is consistent with our Figures 2 and 3, which show that middle-income countries tend to have higher emigration rates than either low-income or high-income countries.

Recently, several authors (e.g. De Haas (2019), Clemens and Postel (2018)) have concluded from such an inverted-U relation that, as low-income countries develop, their emigration will tend to increase first and decline only after some threshold level of income is reached. If this conclusion holds for individual countries, it could have serious implications for development programs. In particular, it would imply that development cooperation, to the extent that it contributes to economic development, contributes to increased emigration from low-income to high-income countries. As the authors mentioned above have pointed out, development cooperation would in that case not be a proper instrument to reduce emigration from low-income countries.

However, even if the migration hump finding is as robust as it seems to be, can we actually conclude that all individual countries will follow this inverted-U pattern as they grow richer? In other words, will emigration from an individual low-income country indeed rise as it starts developing economically, and fall after some threshold middle-income level? The answer is that this does not necessarily follow from an inverted-U relation between development and emigration found in cross-country or panel data. Benček and Schneiderheinze (2019) are therefore critical of any causal interpretation of the migration hump.

One reason why the cross-sectional evidence for the hump shape does not necessarily demonstrate an individual country’s transition path is that, while middle-income countries experience higher emigration than low-income (and high-income) countries, this is not necessarily due to their income differences. It may also be due to fundamental heterogeneity between the different country income groups that simultaneously affects both economic development and migration (Lucas, 2019).
If such omitted variables are driving the inverted-U relationship, then the migration hump is misinterpreted as being a result of economic development. This point is relevant not only for evidence of an inverted-U relationship based on cross-section data, but also for evidence of a migration hump based on panel data, such as we use in this paper. The reason is as follows. By using panel data, we exploit both the variation over time and the variation across countries. The variation over time for each country is, however, limited: even though we use a long 50-year timeframe from 1970 to 2020, no country has covered the whole income distribution over this period by developing from a low-income to a high-income country. Despite substantial economic growth in many countries within this period, countries have still moved within a limited range of the income distribution. This implies that, even though we use panel data and exploit some income variation over time for each country, our finding of an inverted-U relation between emigration and economic development still relies to a large extent on the cross-section variation in the data. The finding is therefore still driven to an important extent by the fact that middle-income countries experience higher emigration than low- or high-income countries. So again, the conclusion that income levels are driving the inverted-U relation between development and migration will not necessarily hold if there is systematic heterogeneity across countries in these income groups. This is particularly the case if there is heterogeneity with respect to factors that affect both development and migration, and if these factors are not properly controlled for.
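The general point that a pattern across countries need not hold within any single country can be illustrated with a deliberately simplified sketch. It uses a made-up three-country panel and plain least squares; it does not reproduce the paper's PPML gravity specification or its fixed-effects structure, and all names and numbers are hypothetical.

```python
# Hypothetical synthetic panel (made-up numbers): three countries, five periods each.
country_mean_income = [6.5, 7.0, 7.5]       # mean ln GDP per capita by country
country_mean_emig = [1.0, 2.0, 3.0]         # mean emigration level by country
income_dev = [-0.2, -0.1, 0.0, 0.1, 0.2]    # each country's income path around its own mean
WITHIN_SLOPE = -0.5                         # inside each country, emigration falls as income rises

panel = []  # rows of (country index, ln income, emigration)
for c, (mx, my) in enumerate(zip(country_mean_income, country_mean_emig)):
    for d in income_dev:
        panel.append((c, mx + d, my + WITHIN_SLOPE * d))


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def demean_by_country(rows):
    """Subtract each country's own means, leaving only variation over time."""
    out_x, out_y = [], []
    for c in sorted({r[0] for r in rows}):
        xs = [r[1] for r in rows if r[0] == c]
        ys = [r[2] for r in rows if r[0] == c]
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        out_x += [x - mx for x in xs]
        out_y += [y - my for y in ys]
    return out_x, out_y


# Pooled slope mixes cross-country and over-time variation.
pooled_slope = ols_slope([r[1] for r in panel], [r[2] for r in panel])
# Within slope uses only each country's movement around its own mean.
within_slope = ols_slope(*demean_by_country(panel))

print(pooled_slope, within_slope)
```

In this constructed example the pooled slope is positive because richer countries have higher average emigration, while the slope within each country is negative. When countries move only over a narrow income range, as described above, the cross-country component dominates the pooled estimate.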
Fully controlling for all relevant factors affecting both income and emigration is difficult in all panel data studies on emigration and development, for the following reason. Even though our panel data analysis above applies a very extensive set of control variables, including several origin-time control variables and destination-time, country-pair, and origin fixed effects, there might still be some origin-time factors that affect both emigration and development and hence require additional controls. While such factors could in principle be controlled for by using origin-time fixed effects, such fixed effects are perfectly collinear with any origin-time varying variable and hence cannot be included in the same specification as development at origin, which is our variable of interest. As indicated, this issue is relevant for all panel data studies on emigration and development. Therefore, additional robustness checks are required to test whether low-income countries, as they develop, indeed initially experience an increase in emigration due to economic growth. We turn to these checks next.

The concern is that the higher emigration levels of middle-income countries compared to low-income countries are used inappropriately as evidence, while fundamental differences between the two groups of countries that may drive the result of an initial increase in emigration with development cannot be fully controlled for. To address this, we perform several robustness tests in this subsection that all relate to the upward-sloping part of the migration hump, in order to test whether emigration from low-income countries tends to increase initially as they grow.

The first test we perform is for the subsample of 46 countries that actually transitioned from low-income to middle-income status. We test whether emigration from these countries increased with development by applying our base regression model to this subgroup only.
In this case, the included middle-income countries are the same as the included low-income countries (only at a later point in time), and hence there is no heterogeneity between the two income groups when using this subsample. The subsample consists of 46 countries that all developed from the low-income to the middle-income category in the period 1970-2020 according to the World Bank income classification.22 If emigration initially increases with economic development until low-income countries reach some middle-income threshold level, then we would expect a positive and significant coefficient on our linear GDP per capita variable. We perform the regression both with and without a squared term for our GDP per capita variable. The results for this subsample of countries that have transitioned from low-income to middle-income status are shown in columns (1) and (2) of Table 5. In neither case do we obtain a significantly positive coefficient on our GDP per capita variable, and hence we cannot conclude that economic development has resulted in an increase in emigration for this group of countries. We have also performed the regression for several subsets of this group of 46 countries, and the results are all similar in the sense that they show no evidence of an increase in emigration with development for these countries.

22 The countries included in this subsample are Angola, Albania, Armenia, Azerbaijan, Bangladesh, Bhutan, Cambodia, Cameroon, China, the Comoros, Republic of Congo, Côte d’Ivoire, the Arab Republic of Egypt, Equatorial Guinea, Georgia, Ghana, Guyana, Honduras, Indonesia, India, Kenya, Kyrgyzstan, Lao PDR, Lesotho, Maldives, Mauritania, Republic of Moldova, Mongolia, Myanmar, Nicaragua, Nigeria, Pakistan, Papua New Guinea, São Tomé and Príncipe, Senegal, Solomon Islands, Sri Lanka, Sudan, Tajikistan, Timor-Leste, Turkmenistan, Ukraine, Uzbekistan, Vietnam, the Republic of Yemen, Zambia.
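The selection of the transition subsample can be sketched as follows. The exact selection rule is not spelled out beyond the use of the World Bank income classification over 1970-2020, so the rule below (low-income in some year, middle-income in a later year) is an assumption, and the countries and classifications are hypothetical.

```python
# Hypothetical income classifications by year ("L" low, "M" middle, "H" high).
# Made-up data for illustration; not the World Bank classification itself.
income_class = {
    "Country A": {1970: "L", 1990: "L", 2010: "M", 2020: "M"},
    "Country B": {1970: "L", 1990: "L", 2010: "L", 2020: "L"},
    "Country C": {1970: "M", 1990: "M", 2010: "H", 2020: "H"},
    "Country D": {1970: "L", 1990: "M", 2010: "M", 2020: "M"},
}


def transitioned_lic_to_mic(history):
    """True if the country is low-income in some year and middle-income in a later year."""
    seen_low = False
    for year in sorted(history):
        if history[year] == "L":
            seen_low = True
        elif history[year] == "M" and seen_low:
            return True
    return False


# Countries kept in the (assumed) transition subsample.
subsample = sorted(c for c, h in income_class.items() if transitioned_lic_to_mic(h))
print(subsample)
```

Restricting the estimation to such a subsample means that the low-income and middle-income observations come from the same set of countries at different points in time.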
Next, we perform the same test for the same sample of countries that have transitioned from the low-income to the middle-income category, but now excluding China and India. These two countries are outliers in terms of population and country size, which may have important implications for emigration, and they have also experienced relatively high economic growth. The results for this subsample are presented in columns (3) and (4) of Table 5. The results in column (4), for the regression including the quadratic term for our GDP per capita variable, show no significance for the coefficients on either the linear or the squared GDP per capita variable. However, the results in column (3), for the regression including only the linear GDP per capita variable, show a highly significant and negative coefficient on our GDP per capita variable.

Limiting our analysis to this sample of 44 countries that have actually developed from low-income to middle-income status, the finding is thus that emigration has not increased but rather declined with economic development. By focusing solely on the countries that actually made this transition, we avoid inappropriately relying on the higher emigration levels of middle-income countries compared to low-income countries without being able to fully control for fundamental differences between the two groups of countries. For this relevant subsample, the linear specification indicates that emigration declines when low-income countries develop economically. This has important policy implications, as it refutes the recent belief that development programs contribute to rising emigration when promoting economic development.

In addition to this subsample of countries that each developed from low-income to middle-income status, we also test whether there is an increase in emigration with development for the subsample of all African countries.
This is also an interesting subsample because, over the 50-year period covered, these countries have grown from being mostly low-income to being mostly lower-middle-income, with fewer than half of the countries still being low-income countries in 2020 and a few countries transitioning to the upper-middle-income category. Mean GDP per capita for African countries increased substantially from US$ 1,738 to US$ 4,798 during this period.23 The results for our subsample of African origin countries, presented in columns (5) and (6) of Table 5, confirm that GDP per capita growth does not give rise to emigration from African countries. Despite the substantial increase in GDP per capita among African countries, there is no sign of the significant positive relation between GDP per capita and emigration that some authors in the migration literature have suggested. Instead, the relationship is negative though not significant. As shown in the table, population at origin does have a significant and positive coefficient. This indicates that population growth may have been driving higher emigration levels for African countries.

23 In constant 2011 U.S. dollars.

Table 5 Results from base model, countries that transitioned from LIC to MIC and African countries

Migration flow                (1)         (2)         (3)         (4)         (5)         (6)
Ln GDPpc orig. (t – 1)        -0.253      0.245       -0.470***   0.230       -0.086      -0.339
                              (0.215)     (2.029)     (0.162)     (2.263)     (0.112)     (0.774)
Ln GDPpc orig. sq. (t – 1)                -0.033                  -0.0461                 0.017
                                          (0.129)                 (0.146)                 (0.052)
Ln pop. orig.                 -0.505      -0.522      -0.239      -0.232      1.290***    1.309***
                              (0.370)     (0.391)     (0.444)     (0.451)     (0.417)     (0.419)
Observations                  18,524      18,524      16,423      16,423      22,012      22,012
Pseudo R-squared              0.935       0.935       0.934       0.934       0.915       0.915
Subsample: columns (1) and (2), LIC to MIC; columns (3) and (4), LIC to MIC excl. China, India; columns (5) and (6), Africa.
Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

The next robustness check tests whether, for the lower part of the income distribution, there is an upward-sloping ‘left hand side’ of the migration hump; in other words, whether there is a significantly positive relation between development and emigration up to a certain point. We first check this for our base model applied to the full sample of countries, for which the result was presented in column (2) of Table 1. The extreme point corresponding to this base model result lies at a per capita GDP of US$ 2,586. We therefore now estimate our migration base model (both with only a linear term and with an additional quadratic term for GDP per capita) on all observations in our data set up to this extreme point for GDP per capita. The results are presented in columns (1) and (2) of Table 6. They do not show the significantly positive coefficient on our linear GDP per capita term that we would expect if an increase in income led to more emigration in this lower part of the income distribution, up to the extreme point of the hump. The coefficient on squared GDP per capita is also insignificant.

We perform a similar test for the highest extreme point of GDP per capita that we found across all other specifications used in sections 4.2 and 4.3.1, namely the one obtained for the time subsample 2000-2020, for which the results were shown in column (4) of Table 1. The extreme point corresponding to this result lies at a GDP per capita of US$ 5,693. We again estimate our base model of emigration, both with only a linear term and with an additional quadratic term for GDP per capita, on all observations below this turning point of the hump of US$ 5,693.
The results are presented in columns (3) and (4) of Table 6 below and show that, also when using this extreme point, the coefficients on our GDP per capita variable are insignificant, indicating that there is no significant relation between income per capita and emigration in this part of the income distribution. We also tested the extreme points for all other specifications used in section 4.3.1. Since these all lie to the left of the above extreme point of US$ 5,693, the results from estimating our base emigration model on the observations to the left of these respective extreme points are all similar, in the sense that they do not show a positive and significant relation between our GDP per capita variable and emigration.

Table 6 Results from the estimation of the base model, up to various extreme points found

Migration flow                (1)         (2)         (3)         (4)
Ln GDPpc (t – 1)              -0.064      0.389       -0.137      1.612
                              (0.158)     (4.028)     (0.142)     (2.206)
Ln GDPpc sq. (t – 1)                      -0.032                  -0.117
                                          (0.282)                 (0.146)
Ln population                 -0.122      -0.113      -1.194***   -1.213***
                              (0.483)     (0.468)     (0.448)     (0.446)
Observations                  25,864      25,864      45,909      45,909
Pseudo R-squared              0.952       0.952       0.937       0.937
Subsample: columns (1) and (2), GDP pc extreme point of US$2586; columns (3) and (4), GDP pc extreme point of US$5693.
Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

Next, we test our base model on all observations in the first and second quartile of the income distribution. Again, if emigration initially increases with development, we would expect to find a significant and positive coefficient on our GDP per capita variable. The results can be found in columns (1) and (2) of Table 7 and show no significance for either the linear or the squared GDP per capita variable.
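The extreme points used as sample cut-offs above follow from the quadratic specification: with a linear coefficient b1 and a squared coefficient b2 on ln GDP per capita, the extremum lies at ln GDPpc* = -b1 / (2 b2). The sketch below converts the reported log extremum into dollars and, purely as an illustration, applies the formula to the column (1) coefficients of Table 2. The function name and the three sample rows are hypothetical.

```python
import math


def turning_point(beta_linear, beta_squared):
    """Extremum of b1*x + b2*x**2 in x = ln(GDP per capita): x* = -b1 / (2*b2)."""
    if beta_squared == 0:
        raise ValueError("no turning point without a squared term")
    return -beta_linear / (2.0 * beta_squared)


# Extremum reported in the text, converted from logs to levels (about US$ 2,586).
usd_at_extremum = math.exp(7.85788)

# Coefficients from column (1) of Table 2, used only to illustrate the formula.
x_star = turning_point(4.090, -0.260)
usd_star = math.exp(x_star)

# Keep only observations at or below a chosen extreme point (hypothetical rows).
rows = [("X", 1200.0), ("Y", 2500.0), ("Z", 4100.0)]
below = [name for name, gdp_pc in rows if gdp_pc <= 2586]

print(round(usd_at_extremum), round(x_star, 3), below)
```

A positive linear and negative squared coefficient give a maximum, so observations below the extremum lie on the part of the curve where the hump hypothesis predicts a positive slope. That is the prediction tested in Table 6.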
Finally, we apply the PPML base model to all observations with a maximum GDP per capita of US$ 9,999, which happens to be the mean GDP per capita across all upper-middle-income countries. Columns (3) and (4) of Table 7 show the results and indicate no significance for either the GDP per capita variable or the hump shape of the relation between development and emigration.

Table 7 Results from the estimation of the base model, up to various income thresholds

Migration flow                (1)         (2)         (3)         (4)
Ln GDPpc (t – 1)              -0.020      -2.356      -0.087      2.214
                              (0.143)     (2.221)     (0.144)     (1.512)
Ln GDPpc sq. (t – 1)                      0.157                   -0.151
                                          (0.149)                 (0.098)
Ln population                 -1.210**    -1.308**    -0.702      -0.681
                              (0.536)     (0.555)     (0.451)     (0.447)
Observations                  37,508      37,508      53,780      53,780
Pseudo R-squared              0.946       0.946       0.930       0.930
Subsample: columns (1) and (2), first and second quartile of income distribution; columns (3) and (4), until GDP pc of US$9999, mean income UM countries.
Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Destination-time, country pair and origin fixed effects are included in all estimations.

The conclusion from these robustness checks is twofold. On the one hand, the finding of a hump-shaped relationship between emigration and development levels is highly robust in panel data settings, using data for 180 countries and a 50-year timeframe. On the other hand, it is not correct to conclude that, in any given country, emigration initially increases with economic development before it starts to fall. In particular, the ‘left hand side’ of the migration hump does not withstand any of the robustness tests that we performed. On the contrary, when we focus on low-income countries that actually transitioned to middle-income status, we find evidence that emigration actually declined with economic development.
This suggests that the inverted U-shaped relationship between economic development and migration cannot be interpreted as a causal relationship.

6 Concluding remarks

This paper has rigorously tested the migration transition hypothesis, according to which emigration follows an inverted U-shaped relationship with economic development. The hypothesis suggests that emigration first increases as countries move from low to middle-income levels of development, and subsequently decreases again as countries grow richer. As predicted by several migration transition theories, such a non-linear pattern could emerge from various factors at play, including financial constraints that diminish over time, migrant networks abroad that grow with migration, or a demographic transition.

In order to test this hypothesis, we applied a migration version of the gravity model, micro-founded by the Random Utility-Maximization (RUM) model, to a global panel data set comprising 180 origin and destination countries and a 50-year timeframe (1970-2020). This is the most extensive panel data set used so far in the literature to test for the existence of the migration hump. We used GDP per capita at origin as a proxy for development levels and included a linear and a squared term to account for the nonlinearities predicted by migration transition theories. We used the most recent PPML estimator and, following the literature, controlled for the influence of alternative destinations on one’s decision to migrate (so-called multilateral resistance to migration). We did so by incorporating several origin-time control variables and various fixed effects structures controlling for unobserved origin, destination-time, time and country-pair characteristics potentially affecting migration flows.

Based on this panel data analysis, we find strong empirical support for an inverted-U relationship between emigration and development levels.
Our results are robust to (a) the addition of several origin-time control variables, (b) the use of different time and country subsamples (with and without non-OECD countries), (c) the inclusion of an interaction term between geographical distance and income at origin and (d) several additional tests of the existence of an inverted-U shaped relation between GDP per capita at origin and emigration flows.

However, the finding of an inverted U-shaped relation between economic development and emigration is mainly driven by cross-country heterogeneity in factors other than income, and therefore the migration hump cannot be interpreted as a causal relation. In several additional robustness analyses we found that, for a given low-income country, an increase in economic development does not lead to higher emigration. On the contrary, for a subsample of 44 countries that actually transitioned from low-income to middle-income status (excluding China and India), we even found evidence that emigration declined with economic development. Drawing the conclusion that the inverted-U relationship is causal therefore seems unfounded.

This new finding, supported by various robustness checks, has important policy implications. In contrast with what other authors (e.g. De Haas, 2019 and Clemens and Postel, 2018) have concluded based on cross-sectional findings, we can no longer conclude that, as low-income countries develop, their emigration will tend to increase before declining after a certain middle-income turning point. While we do find empirical evidence of an inverted-U relation between economic development and emigration using the full sample of 180 countries over 50 years, it seems that this finding is driven by the underlying cross-sectional pattern of middle-income countries having higher emigration rates than either low- or high-income countries.
These differences in emigration rates are likely caused by fundamental differences between countries in different income categories, which make a causal inference from the inverted-U relation invalid.

Moreover, like other papers in the existing literature on this topic, we are not able to fully control for the potential endogeneity arising from reverse causality between migration and GDP, nor for multilateral resistance to migration, i.e. the unobserved impact of the attractiveness of alternative destinations on one’s willingness to emigrate. Because of these issues, any causal interpretation of the migration hump is unfounded. Although our analysis does not eliminate the bias due to endogeneity, we are able to reduce it by including a decade-to-decade lag of GDP per capita at origin as an instrument in order to tackle reverse causality, and by using country, destination-time and country-pair fixed effects in order to partially account for multilateral resistance to migration.

We circumvent the remaining endogeneity problem, which is due to fundamental differences between countries in different income categories that we cannot fully control for, by estimating the model solely for those countries that actually transitioned from low-income to middle-income status. In this case, the included middle-income countries are the same as the included low-income countries (only at a later point in time), and hence there is no heterogeneity between the two income groups when using this subsample. Interestingly, the results for this subsample (which excludes China and India) show that emigration actually declines as low-income countries develop economically. This has important policy implications: it suggests that development programs can in fact promote economic development in low-income countries without encouraging emigration.

References

Anderson, J. E., & Van Wincoop, E. (2003). Gravity with gravitas: a solution to the border puzzle.
American Economic Review, 93(1), 170–192.
Bade, J., & De Kemp, A. (2019). Migration and Development. Dutch Ministry of Foreign Affairs.
Baier, S. L., Yotov, Y. V., & Zylkin, T. (2019). On the widely differing effects of free trade agreements: Lessons from twenty years of trade integration. Journal of International Economics, 116, 206–226.
Beine, M., Bertoli, S., & Fernández-Huertas Moraga, J. (2016). A Practitioners’ Guide to Gravity Models of International Migration. The World Economy, 39(4), 496–512.
Beine, M., Boucher, A., Burgoon, B., Crock, M., Gest, J., Hiscox, M., … Thielemann, E. (2016). Comparing immigration policies: An overview from the IMPALA database. International Migration Review, 50(4), 827–863.
Beine, M., Bourgeon, P., & Bricongne, J. (2019). Aggregate fluctuations and international migration. The Scandinavian Journal of Economics, 121(1), 117–152.
Beine, M., & Parsons, C. (2015). Climatic factors as determinants of international migration. The Scandinavian Journal of Economics, 117(2), 723–767.
Belot, M. V. K., & Hatton, T. J. (2012). Immigrant Selection in the OECD. The Scandinavian Journal of Economics, 114(4), 1105–1128.
Benček, D., & Schneiderheinze, C. (2019). More development, less emigration to OECD countries: Identifying inconsistencies between cross-sectional and time-series estimates of the migration hump (No. 2145). Kiel Working Paper.
Bertoli, S., & Moraga, J. F.-H. (2013). Multilateral resistance to migration. Journal of Development Economics, 102, 79–100.
Borjas, G. (1987). Self-Selection and the Earnings of Immigrants. The American Economic Review, 77(4), 531–553. Retrieved from www.jstor.org/stable/1814529
Caselli, M. (2019). “Let Us Help Them at Home”: Policies and Misunderstandings on Migrant Flows Across the Mediterranean Border. Journal of International Migration and Integration, 1–11.
Clemens, M. A. (2014). Does Development Reduce Migration? IZA Discussion Papers.
Clemens, M. A., & Postel, H. M. (2018). Deterring Emigration with Foreign Aid: An Overview of Evidence from Low-Income Countries. Population and Development Review, 44(4), 667–693.
Correia, S., Guimarães, P., & Zylkin, T. (2019). PPMLHDFE: Fast Poisson estimation with high-dimensional fixed effects. ArXiv Preprint ArXiv:1903.01690.
Dao, T. H., Docquier, F., Parsons, C., & Peri, G. (2018). Migration and development: Dissecting the anatomy of the mobility transition. Journal of Development Economics, 132, 88–101.
De Haas, H. (2007). Turning the tide? Why development will not stop migration. Development and Change, 38(5), 819–841.
De Haas, H. (2010). Migration Transitions. IMI Working Papers. Oxford.
De Haas, H. (2011). The Determinants of International Migration (IMI Working Papers). Oxford.
De Haas, H. (2019). Why Development Will Not Stop Migration. https://www.macmillanihe.com/blog/post/why-development-will-not-stop-migration-hein-de-haas/.
De Haas, H., Natter, K., & Vezzoli, S. (2014). Compiling and coding migration policies: Insights from the DEMIG POLICY database. International Migration Institute, DEMIG Project Paper, 16, 43.
Di Giovanni, J., Levchenko, A. A., & Ortega, F. (2015). A global view of cross-border migration. Journal of the European Economic Association, 13(1), 168–202.
Djajic, S., Kirdar, M. G., & Vinogradova, A. (2016). Source-country earnings and emigration. Journal of International Economics, 99, 46–67.
Egger, P., & Nigai, S. (2015). Structural Gravity with Dummies Only. CEPR Discussion Papers.
Epstein, G. S. (2008). Herd and network effects in migration decision-making. Journal of Ethnic and Migration Studies, 34(4), 567–583.
European Commission. (2015). A European Agenda on Migration. Brussels.
Faini, R., & Venturini, A. (2010). Development and migration: Lessons from southern Europe. In Frontiers of Economics and Globalization (Vol. 8, pp. 105–136). Emerald Group Publishing Limited. https://doi.org/10.1108/S1574-8715(2010)0000008011
Feenstra, R. C., Inklaar, R., & Timmer, M. P.
(2015). The next generation of the Penn World Table. American Economic Review, 105(10), 3150–3182.
Flahaux, M.-L., & De Haas, H. (2016). African migration: trends, patterns, drivers. Comparative Migration Studies, 4(1), 1.
Gonzalez-Garcia, M. J. R., Hitaj, M. E., Mlachila, M. M., Viseth, A., & Yenice, M. (2016). Sub-Saharan African migration: patterns and spillovers. International Monetary Fund.
Gould, J. D. (1979). European Inter-Continental Emigration 1815-1914: Patterns and Causes. Journal of European Economic History, 8(3), 593–679.
Grogger, J., & Hanson, G. H. (2011). Income maximization and the selection and sorting of international migrants. Journal of Development Economics, 95(1), 42–57.
Guha-Sapir, D. (2019). EM-DAT: The Emergency Events Database - Université Catholique de Louvain (UCL) - CRED.
Haans, R. F., Pieters, C., & He, Z. L. (2016). Thinking about U: Theorizing and testing U- and inverted U-shaped relationships in strategy research. Strategic Management Journal, 37(7), 1177–1195.
Harris, N. (2002). Thinking the Unthinkable: The Immigration Myth Exposed. London: IB Tauris.
Hatton, T. J., & Williamson, J. G. (2011). Are third world emigration forces abating? World Development, 39(1), 20–32.
Head, K., Mayer, T., & Ries, J. (2010). The erosion of colonial trade linkages after independence. Journal of International Economics, 81(1), 1–14.
Héran, F. (2018). Europe and the spectre of sub-Saharan migration. Population & Sociétés, (558).
Idu, R. (2019). Source Country Economic Development and Dynamics of the Skill Composition of Emigration. Economies, 7(1), 18.
International Organization for Migration. (2011). Glossary on Migration. (R. Perruchoud & J. Redpath-Cross, Eds.) (2nd ed.).
Khoudour-Castéras, D. (2009). Neither migration nor development: The contradictions of French co-development policy.
Larch, M., Wanner, J., Yotov, Y. V., & Zylkin, T. (2019). Currency Unions and Trade: A PPML Re-assessment with High-dimensional Fixed Effects.
Oxford Bulletin of Economics and Statistics, 81(3), 487–510.
Letouzé, E., Purser, M., Rodríguez, F., & Cummins, M. (2009). Revisiting the migration-development nexus: a gravity model approach. Human Development Research Paper (HDRP) Series, 44.
Lind, J. T., & Mehlum, H. (2010). With or without U? The appropriate test for a U-shaped relationship. Oxford Bulletin of Economics and Statistics, 72(1), 109–118.
Llull, J. (2016). Understanding international migration: evidence from a new dataset of bilateral stocks (1960–2000). SERIEs, 7(2), 221–255.
Lucas, R. E. B. (2019). Migration and Development: The Role for Development Aid. https://eba.se/en/rapporter/migration-and-development-the-role-for-development-aid-research-overview/11211/
Marshall, M. G., Gurr, T. R., & Jaggers, K. (2018). Political Regime Characteristics and Transitions, 1800-2017. Center for Systemic Peace. Retrieved from www.systemicpeace.org
Marshall, M., Gurr, T. R., & Harff, B. (2018). PITF - State Failure Problem Set: Internal Wars and Failures of Governance, 1955-2017.
Martin, P., & Taylor, E. (1996). The anatomy of a migration hump. In Development Strategy, Employment, and Migration: Insights from Models (pp. 43–62). Paris: Organisation for Economic Co-operation and Development; OECD Publications and Information Center [distributor].
Massey, D. S. (1988). Economic Development and International Migration in Comparative Perspective. Population and Development Review, 14(3), 383–413. https://doi.org/10.2307/1972195
Mayda, A. M. (2010). International migration: A panel data analysis of the determinants of bilateral flows. Journal of Population Economics, 23(4), 1249–1274.
Mayer, T., & Zignago, S. (2005). Market access in global and regional trade.
McFadden, D., & Zarembka, P. (1974). Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics, 105–142.
McKenzie, D., & Rapoport, H. (2010).
Self-selection patterns in Mexico-US migration: the role of migration networks. The Review of Economics and Statistics, 92(4), 811–821. Mendola, M. (2012). Rural out‐migration and economic development at origin: A review of the evidence. Journal of International Development, 24(1), 102–122. Ortega, F., & Peri, G. (2013). The effect of income and immigration policies on international migration. Migration Studies, 1(1), 47–74. Özden, Ç., Parsons, C. R., Schiff, M., & Walmsley, T. L. (2011). Where on earth is everybody? The evolution of global bilateral migration 1960–2000. The World Bank Economic Review, 25(1), 12–56. 31 Pesaran, M. H. (2006). Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica, 74(4), 967–1012. Rojas-Romagosa, H., & Bollen, J. (2018). Estimating migration changes from the EU’s free movement of people principle. CPB Netherlands Bureau for Economic Policy Analysis, 385. Samuelson, P. A. (1948). International trade and the equalisation of factor prices. The Economic Journal, 58(230), 163–184. Santo Tomas, P., Summers, L., & Clemens, M. (2009). Migrants Count: Five Setps Toward Better Migration Data. Washington, DC: Center for Global Development. Sasabuchi, S. (1980). A test of a multivariate normal mean with composite hypotheses determined by linear inequalities. Biometrika, 67(2), 429-439. Sen, A. (2001). What is development about. Frontiers of Development Economics, 506–513. Silva, J. M. C. S., & Tenreyro, S. (2006). The log of gravity. The Review of Economics and Statistics, 88(4), 641– 658. Silva, J. M. C. S., & Tenreyro, S. (2011). Further simulation evidence on the performance of the Poisson pseudo-maximum likelihood estimator. Economics Letters, 112(2), 220–222. Stark, O. (2006). Inequality and migration: A behavioral link. Economics Letters, 91(1), 146–152. Tinbergen, J. J. (1962). Shaping the world economy; suggestions for an international economic policy. UN DESA. (2017). 
Trends in International Migrant Stock: The 2017 Revision (United Nations database, POP/DB/MIG/Stock/Rev.2017).
United Nations Department of Economic and Social Affairs Population Division. (2006). Trends in Total Migrant Stock, 1960-2000, 2005 Revision.
United Nations Department of Economic and Social Affairs Population Division. (2008). United Nations Global Migration Database. Retrieved from http://esa.un.org/unmigration
Vanderkamp, J. (1971). Migration flows, their determinants and the effects of return migration. Journal of Political Economy, 79(5), 1012–1031.
Zelinsky, W. (1971). The hypothesis of the mobility transition. Geographical Review, 219–249.

Appendix

Table A.1 Overview of the 180 origin and destination countries in the panel data set

| ISO code | Country name | ISO code | Country name |
|---|---|---|---|
| ABW | Aruba | AGO | Angola |
| ALB | Albania | ARE | United Arab Emirates |
| ARG | Argentina | ARM | Armenia |
| ATG | Antigua and Barbuda | AUS | Australia |
| AUT | Austria | AZE | Azerbaijan |
| BDI | Burundi | BEL | Belgium |
| BEN | Benin | BFA | Burkina Faso |
| BGD | Bangladesh | BGR | Bulgaria |
| BHR | Bahrain | BHS | Bahamas |
| BIH | Bosnia Herzegovina | BLR | Belarus |
| BLZ | Belize | BMU | Bermuda |
| BOL | Bolivia (Plurinational State of) | BRA | Brazil |
| BRB | Barbados | BRN | Brunei Darussalam |
| BTN | Bhutan | BWA | Botswana |
| CAF | Central African Republic | CAN | Canada |
| CHE | Switzerland | CHL | Chile |
| CHN | China | CIV | Côte d'Ivoire |
| CMR | Cameroon | COD | Democratic Republic of Congo |
| COG | Republic of Congo | COL | Colombia |
| COM | Comoros | CPV | Cabo Verde |
| CRI | Costa Rica | CUW | Curacao |
| CYM | Cayman Islands | CYP | Cyprus |
| CZE | Czech Republic | DEU | Germany |
| DJI | Djibouti | DMA | Dominica |
| DNK | Denmark | DOM | Dominican Republic |
| DZA | Algeria | ECU | Ecuador |
| EGY | Egypt, Arab Rep. | ESP | Spain |
| EST | Estonia | ETH | Ethiopia |
| FIN | Finland | FJI | Fiji |
| FRA | France | GAB | Gabon |
| GBR | United Kingdom | GEO | Georgia |
| GHA | Ghana | GIN | Guinea |
| GMB | Gambia, The | GNB | Guinea-Bissau |
| GNQ | Equatorial Guinea | GRC | Greece |
| GRD | Grenada | GTM | Guatemala |
| HKG | Hong Kong SAR, China | HND | Honduras |
| HRV | Croatia | HTI | Haiti |
| HUN | Hungary | IDN | Indonesia |
| IND | India | IRL | Ireland |
| IRN | Iran (Islamic Republic of) | IRQ | Iraq |
| ISL | Iceland | ISR | Israel |
| ITA | Italy | JAM | Jamaica |
| JOR | Jordan | JPN | Japan |
| KAZ | Kazakhstan | KEN | Kenya |
| KGZ | Kyrgyzstan | KHM | Cambodia |
| KNA | Saint Kitts and Nevis | KOR | Republic of Korea |
| KWT | Kuwait | LAO | Lao PDR |
| LBN | Lebanon | LBR | Liberia |
| LCA | St. Lucia | LKA | Sri Lanka |
| LSO | Lesotho | LTU | Lithuania |
| LUX | Luxembourg | LVA | Latvia |
| MAC | Macao SAR, China | MAR | Morocco |
| MDA | Republic of Moldova | MDG | Madagascar |
| MDV | Maldives | MEX | Mexico |
| MKD | North Macedonia | MLI | Mali |
| MLT | Malta | MMR | Myanmar |
| MNE | Montenegro | MNG | Mongolia |
| MOZ | Mozambique | MRT | Mauritania |
| MUS | Mauritius | MWI | Malawi |
| MYS | Malaysia | NAM | Namibia |
| NER | Niger | NGA | Nigeria |
| NIC | Nicaragua | NLD | Netherlands |
| NOR | Norway | NPL | Nepal |
| NZL | New Zealand | OMN | Oman |
| PAK | Pakistan | PAN | Panama |
| PER | Peru | PHL | Philippines |
| POL | Poland | PRT | Portugal |
| PRY | Paraguay | PSE | West Bank and Gaza |
| QAT | Qatar | ROU | Romania |
| RUS | Russian Federation | RWA | Rwanda |
| SAU | Saudi Arabia | SDN | Sudan |
| SEN | Senegal | SGP | Singapore |
| SLE | Sierra Leone | SLV | El Salvador |
| SRB | Serbia | STP | São Tomé and Príncipe |
| SUR | Suriname | SVK | Slovak Republic |
| SVN | Slovenia | SWE | Sweden |
| SWZ | Eswatini | SXM | Sint Maarten, Dutch part |
| SYC | Seychelles | SYR | Syrian Arab Republic |
| TCA | Turks and Caicos Islands | TCD | Chad |
| TGO | Togo | THA | Thailand |
| TJK | Tajikistan | TKM | Turkmenistan |
| TTO | Trinidad and Tobago | TUN | Tunisia |
| TUR | Turkey | TZA | United Republic of Tanzania |
| UGA | Uganda | UKR | Ukraine |
| URY | Uruguay | USA | United States of America |
| UZB | Uzbekistan | VCT | Saint Vincent and the Grenadines |
| VEN | Venezuela (Bolivarian Republic of) | VGB | British Virgin Islands |
| VNM | Vietnam | YEM | Yemen, Rep. |
| ZAF | South Africa | ZMB | Zambia |
| ZWE | Zimbabwe | | |

Table A.2 Overview of the main variables used in the analyses, their definitions and sources

| Variable | Definition | Source |
|---|---|---|
| GDP per capita | The ratio of Purchasing Power Parity (PPP)-adjusted total Gross Domestic Product (GDP) in constant 2011 US dollars, to the total population count. | Penn World Tables, version 9.1. |
| Age dependency ratio | The ratio of the number of people younger than 15 or older than 64 (dependents) to the working-age population (ages 15-64). | World Development Indicators, World Bank. |
| Population density | Midyear population divided by land area in square kilometers. | World Development Indicators, World Bank. |
| Population (total) | The mid-year estimate of all residents, regardless of legal status or citizenship. | World Development Indicators, World Bank. |
| Polity IV index | This index (Marshall, Gurr, & Jaggers, 2018) considers a nation as strongly democratic if citizens have the ability to express their preferences about policies and leaders through institutions and procedures, executive power is institutionally constrained, and civil liberties are guaranteed. ‘Strong’ autocracies, on the other hand, are characterized by the presence of sharp restrictions on, or suppression of, competitive political participation. This index ranges from -10 (strongly autocratic) to +10 (strongly democratic). | Center for Systemic Peace (CSP) |
| Conflict duration | The number of months any origin country has been in any sort of conflict. The types of conflict considered include wars between governments and minorities (ethnic wars) or political challengers (revolutionary wars), and events involving the implementation of policies resulting in the deaths of a significant portion of communal or politicized groups in the total population (genocides and politicides; cf. M. Marshall, Gurr, and Harff, 2018). | CSP’s Political Instability Task Force (PITF) data set; authors’ estimates. |
| Natural disaster occurrences | The number of biological, climatological, geophysical, hydrological and meteorological disasters having occurred in a given decade. | EM-DAT database, Université Catholique de Louvain’s Centre for Research on the Epidemiology of Disasters (cf. Guha-Sapir, 2019). |

Table A.3 Correlation matrix of selected origin-time variables, including log GDP per capita

| | GDPpc at origin | Population at origin | Age dependency ratio at origin | Polity IV index | Number of natural disasters | Networks | Conflict duration | Population density |
|---|---|---|---|---|---|---|---|---|
| GDP per capita | 1.000 | | | | | | | |
| Population | -0.023 | 1.000 | | | | | | |
| Age dependency ratio | -0.780 | -0.115 | 1.000 | | | | | |
| Polity IV index | 0.409 | 0.113 | -0.435 | 1.000 | | | | |
| Number of natural disasters | 0.024 | 0.618 | -0.147 | 0.166 | 1.000 | | | |
| Networks | 0.015 | -0.029 | -0.016 | 0.011 | -0.018 | 1.000 | | |
| Conflict duration | -0.220 | 0.310 | 0.146 | -0.109 | 0.253 | -0.013 | 1.000 | |
| Population density | 0.150 | -0.011 | -0.203 | -0.017 | 0.001 | 0.000 | -0.004 | 1.000 |

Table A.4 Summary statistics of explanatory variables

| | N | Mean | St. Dev | Min | Max | Skewness |
|---|---|---|---|---|---|---|
| Migration rate | 169271 | .001 | .021 | 0 | 7.801 | 311.084 |
| Migration flow | 169271 | 1492.9 | 28612.7 | 0 | 4705677 | 84.469 |
| Ln GDP pc | 169271 | 8.574 | 1.198 | 6.126 | 12.541 | .219 |
| GDP pc | 169271 | 11046.39 | 19331.3 | 457.506 | 279000 | 7.253 |
| Polity IV index | 138578 | .529 | 7.149 | -10 | 10 | .065 |
| Pop density | 156898 | 266.144 | 1356.936 | .823 | 21389.1 | 11.062 |
| Age dep ratio | 163524 | 74.797 | 19.889 | 16.856 | 120.41 | -.091 |
| War duration | 169271 | 10.506 | 29.657 | 0 | 120 | 2.823 |
| Nat. disast. occ. | 169271 | 12.092 | 26.373 | 0 | 284 | 5.795 |

Table A.5 Output Sasabuchi test

| | Lower bound | Upper bound |
|---|---|---|
| Interval | 5.973 | 12.541 |
| Slope | 0.967 | -2.404 |
| t-value | 3.252 | -6.065 |
| P >\|t\| | 0.001 | 0.000 |

The overall test of the presence of an inverted U-shape yields a t-value of 3.25, where P >|t| = .00574. The extreme point lies at ln(GDPpc_it) = 7.85788.

Table A.6 Base model estimated on a migration data set excluding small island states

| | Migration flow |
|---|---|
| Ln GDPpc orig. (t – 1) | 4.173*** (0.872) |
| Ln GDPpc orig. sq. (t – 1) | -0.267*** (0.0497) |
| Ln pop. orig. | -0.133 (0.279) |
| Destination-time FE | Yes |
| Country-Pair FE | Yes |
| Origin FE | Yes |
| Year sample | 1960-2020 |
| Observations | 79,577 |
| Pseudo R-squared | 0.901 |

Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Note: Small island states are defined as islands with a population of less than 3 million.

Table A.7 Results from the estimation of the base model with a cubic term

| | Migration flow |
|---|---|
| Ln GDPpc orig. (t – 1) | 0.118 (5.694) |
| Ln GDPpc orig. sq. (t – 1) | 0.209 (0.659) |
| Ln GDPpc orig. cubed (t – 1) | -0.0182 (0.0252) |
| Total population | -0.0813 (0.273) |
| Destination-time FE | Yes |
| Country-Pair FE | Yes |
| Origin FE | Yes |
| Observations | 89,490 |
| Pseudo R-squared | 0.902 |

Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Figure A.1 Mean bilateral emigration rates over time for countries in each income group (as defined by the World Bank) to countries in all other income groups, in the 1960-2020 timeframe
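
As an illustrative check on Tables A.5 and A.6, the sketch below computes the income level at which the estimated emigration curve peaks. It assumes that the base specification is quadratic in lagged ln GDP per capita, so that the marginal effect at x is b1 + 2*b2*x and the peak lies at x* = -b1 / (2*b2). The function and variable names are hypothetical and are not taken from the authors' code. The inputs are the rounded figures reported in the tables, so the results are approximate.

```python
import math

# Values copied from Table A.5 (Sasabuchi test output).
LOWER_BOUND, UPPER_BOUND = 5.973, 12.541   # interval of ln GDP per capita
SLOPE_LOWER, SLOPE_UPPER = 0.967, -2.404   # estimated slopes at the two bounds

# Point estimates copied from Table A.6 (sample excluding small island states).
B_LINEAR, B_SQUARED = 4.173, -0.267


def turning_point(b_linear: float, b_squared: float) -> float:
    """Return x* = -b1 / (2 * b2), where b1*x + b2*x**2 reaches its maximum."""
    if b_squared >= 0:
        raise ValueError("an inverted U requires a negative squared-term coefficient")
    return -b_linear / (2.0 * b_squared)


def coefficients_from_slopes(x_lo: float, s_lo: float, x_hi: float, s_hi: float):
    """Back out (b1, b2) from slopes s = b1 + 2*b2*x observed at two points.

    Assumption: the underlying curve is quadratic, so the slope is linear in x.
    """
    b_squared = (s_hi - s_lo) / (2.0 * (x_hi - x_lo))
    b_linear = s_lo - 2.0 * b_squared * x_lo
    return b_linear, b_squared


# Peak implied by the slopes at the interval bounds in Table A.5.
b1_a5, b2_a5 = coefficients_from_slopes(LOWER_BOUND, SLOPE_LOWER, UPPER_BOUND, SLOPE_UPPER)
peak_a5 = turning_point(b1_a5, b2_a5)

# Peak implied by the point estimates in Table A.6.
peak_a6 = turning_point(B_LINEAR, B_SQUARED)

for label, peak in [("Table A.5 (implied by slopes)", peak_a5), ("Table A.6", peak_a6)]:
    # exp() converts ln GDP per capita back to a level in 2011 PPP US dollars.
    print(f"{label}: ln GDPpc = {peak:.3f}, GDPpc = {math.exp(peak):,.0f} (2011 PPP US$)")
```

Under these assumptions, the slopes in Table A.5 imply a peak close to the reported extreme point of ln(GDPpc) = 7.85788, and the Table A.6 coefficients imply a peak at roughly ln(GDPpc) = 7.81. Both correspond to a GDP per capita of between roughly 2,400 and 2,600 in 2011 PPP US dollars.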