Human Capital Outflows: Selection into Migration from the Northern Triangle

This study quantifies the outflow of human capital associated with migration from Guatemala, El Salvador, and Honduras since 1990. To measure the outflow of skills and human capital and how this has changed over time, the study uses information on Northern Triangle migrants residing in the United States, a group that accounts for over 90 percent of all migrants from the three countries. The results suggest that these migrants are, in general, positively selected into migration. That is, based on their observable characteristics, the individuals would have a higher earnings distribution relative to individuals who do not migrate. The results show a decrease in selectivity between the 10-year cohort of migrants who arrived by 2000 and those who arrived by 2014. This finding may reflect increased access to migration networks by lower-income households and individuals. The data suggest that the loss in human capital associated with a 10-year outflow of adults, as measured by foregone local wages, represented 1.9 percent of gross domestic product in El Salvador, 1.5 percent in Honduras, and 1.0 percent in Guatemala.

Immigration from El Salvador, Guatemala, and Honduras to the United States has grown at an annualized rate of 8.1 percent since 1980, far outpacing the growth of the overall immigrant population in the United States during this period. In 1980 fewer than 200,000 immigrants from these three countries, often referred to collectively as the Northern Triangle, lived in the United States; by 2015 this number had grown to 2.8 million. Long-lasting and violent civil wars in El Salvador and Guatemala, combined with disastrous earthquakes and hurricanes, were important initial triggers of this outflow. Continuing high rates of poverty, a lack of economic opportunities, and some of the highest rates of violent crime in the world have continued to feed the supply of immigrants from the Northern Triangle. As the flow of migration has grown, its composition has also changed, reflecting changes both in push factors (civil wars, natural disasters, poverty, crime) and pull factors (changes in labor demand and migration policies in destination countries, family reunification, development of migrant networks). The gradual development of migrant networks, as subsequent arrival cohorts have expanded into new areas, has facilitated and channeled new migration flows.
As a result of this large diaspora, 10 percent of the Northern Triangle (NT) population, that is, people born in El Salvador, Guatemala, and Honduras, lives outside their birth countries. Migration rates at this level can have implications for development and labor markets, especially if those with higher skill or productivity levels are the ones exiting. In line with the low levels of formal schooling in these three countries, immigrants from these countries are largely unskilled workers who have not completed secondary school.
While much has been made of the economic costs of brain drain, wherein highly skilled workers such as doctors and engineers leave the country at significant rates, migration patterns can lead to a broader loss of skills and productivity even among lower skilled groups. The analysis in this paper shows that the majority of immigrants from the NT are young, working age adults who, though they are not high skilled by international standards, represent a disproportionate loss of human capital for the source countries.
In this paper, we use household survey data from the United States and from the countries of the Northern Triangle to compare the distributions of human capital of immigrants and non-migrants. This is done through an analysis of a counterfactual wage distribution based on observable characteristics: in particular, age, education, marital status and, for women, presence of children in the household. The results show a process of positive selection into migration, wherein individuals with characteristics associated with higher earnings are more likely to migrate. The analysis also shows that, while migration remains positively selected, the degree of this selection has fallen between the cohort ending in 2000 and the more recent cohort ending in 2014.
The results suggest that migration channels between the NT and the primary destination countries, in particular the United States, have expanded, resulting in "access" to migration becoming more widespread across economic groups. That is, while initial immigrants were often from higher income households, more recent immigrants have increasingly come from lower income households. This has significant implications for poverty in the three source countries as it suggests, on one hand, a loss of potential labor among low-income households and, on the other hand, increased access to remittance income. Yet, despite decreasing positive selection, migrant outflows from the Northern Triangle continue to be positively selected.
Borjas (1987, 1995) argues that the decision to migrate depends on the wage distributions of the source and destination countries. Depending on the dispersion of wages and relative returns to skills in the source and destination countries, there may be positive or negative selection into migration across the skill distribution. For source countries with high relative returns to skill and high wage inequality, the economic models would predict more migration of lower skilled workers. In contrast, source countries with low returns to skill and low wage dispersion provide those with above-average skill levels the greatest incentive to migrate. In support of this, Borjas (1987) argues that negative selection from Latin America in the 1970s led to a decrease in relative wages at the time of arrival compared to earlier waves of migrants from Western Europe.

Literature
Despite the relevance of this topic, empirical research on how the skills of migrants compare to the skills of non-migrants in countries of origin is limited and has produced mixed findings. Earlier work by Ramos (1992) finds support for negative selection from Puerto Rico, while Funkhouser (1992) finds positive selection from El Salvador. Chiquiar and Hanson (2005) and Cuecuecha (2003), relying on a combination of data from the United States and Mexico, find evidence against Borjas' negative selection hypothesis in Mexico and instead argue that there is intermediate and positive selection of migrants, respectively. However, Ibarraran and Lubotsky (2007) analyze potential over-reporting of education by migrants and undercounting of younger migrants in the U.S. data and conclude that Mexican migrants tend to be less educated than non-migrants. Yashiv (2008) uses a single data source (the 1987 quarterly Territories Labor Force Survey conducted in both Israel and Palestine) to evaluate self-selection of Palestinian workers in Israel and finds evidence in favor of negative selection. Similarly, Kaestner and Malamud (2014) use pre-migration earnings from the Mexican Family Life Survey to test for negative selection on earnings. They find that Mexican men in the top 20 percent of the income distribution were less likely to migrate and, importantly, that the negative selection into migration is driven by higher relative returns to migration for workers in the lower end of the income distribution. They also find that Mexican migrants are largely drawn from the middle of the educational distribution, without evidence of selection on cognitive ability or health. Orrenius and Zavodny (2005) use data from the Mexican Migration Project to assess how changes in economic conditions, migrant networks and border enforcement influence the selection of undocumented migrants. They find that migrants come from the middle of the skill distribution and that stricter border enforcement is associated with a more positively selected migrant population with higher skill levels. They also conclude that selection becomes more negative as economic conditions in both source and destination countries improve.
To evaluate migrant selection from the Northern Triangle to the United States, this paper follows the methodology of Chiquiar and Hanson (2005). Given differences in returns to skill in host and source countries, a comparison of skill distributions of migrants and non-migrants is not sufficient. Thus, we compare what migrants and non-migrants would each earn in the same labor market under common skill prices. Using data from the source countries (El Salvador, Guatemala and Honduras) and the host country (United States), this section evaluates the selection of NT migrants in terms of observable skills and tests Borjas' (1987) negative-selection hypothesis for migration in 2000 and 2014. In this way, we measure the extent to which selection into migration has changed across these two cohorts. The changes in cohort composition may be attributable to the presence of expanding migration networks between the NT and the United States.

Data
In order to compare the skills of immigrants and non-migrants from El Salvador, Guatemala and Honduras, we use survey and census data from the source and destination countries. For clarity throughout this discussion, we will refer to non-migrants as "residents" and rely on data from the 2000 and 2014 household surveys for El Salvador, Guatemala and Honduras. We refer to those who migrated to the United States as migrants and rely on the 5 percent Public-Use Microdata Sample (PUMS) from the 2000 U.S. Census of Population and Housing and the 2014 U.S. American Community Survey (ACS) for information on their human capital and earnings. For each country and year, we created a stacked data set combining a limited set of variables from the source and destination data sets. We restrict the U.S. sample to recent migrants (those who arrived in the last 10 years). Because of differences in returns to U.S. and foreign education in the U.S. labor market, we limit the sample to individuals who arrived in the United States at the age of 18 or older. Since the analysis is of labor market returns, we limit it to individuals between the ages of 21 and 65.
One potential measurement issue is the likely undercount of undocumented migrants in the U.S. data. As of 2014, an estimated 700,000 Salvadorans, 525,000 Guatemalans, and 350,000 Hondurans were undocumented migrants (Passel and Cohn, 2016). While no official counts of undocumented migrants in the United States exist, researchers use residual estimation methodologies that compare the total number of migrants measured in household surveys and censuses with official statistics of migrants residing legally in the United States (for example, Passel and Cohn 2016; Warren and Warren 2013). The latter is estimated using counts of lawful admission since 1980 obtained from the Department of Homeland Security's Office of Immigration Statistics. The difference between the survey total foreign-born population and the estimated lawful migrant population is presumed to be the number of undocumented migrants in the survey, a number that is later adjusted for omissions from the survey.
Even though undocumented migrants are generally more likely to be undercounted than lawful migrants, the undercount of unauthorized migrants in recent ACS data is estimated to be lower than in earlier data. Passel and Cohn (2016), Van Hook et al. (2014) and earlier work find evidence of serious coverage problems in surveys collected before the 2000 census but fewer issues in the 2000 census and subsequent surveys. For 2000 to 2009, coverage adjustments increase the estimate for the unauthorized migrant population by 8 to 13 percent; this adjustment falls to 5 to 7 percent for data between 2010 and 2014. That is, the majority of undocumented migrants are covered by the survey data, but they are underrepresented in the migrant estimates.
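The residual approach and coverage adjustment described above can be illustrated with a short sketch. The function name and all input figures below are hypothetical and chosen only for illustration; the 8 and 13 percent adjustment rates are the bounds reported for 2000 to 2009, and applying them as a simple multiplicative factor is an assumption.

```python
def residual_undocumented(survey_foreign_born, lawful_residents, coverage_adjustment):
    """Residual estimate of the undocumented population (illustrative only).

    survey_foreign_born: foreign-born total measured in a survey or census
    lawful_residents:    estimated lawfully resident migrant population
    coverage_adjustment: assumed proportional increase applied for survey omissions
    """
    if survey_foreign_born < lawful_residents:
        raise ValueError("lawful estimate exceeds the survey total")
    # Step 1: the residual is presumed to be the undocumented population in the survey.
    residual = survey_foreign_born - lawful_residents
    # Step 2: scale the residual up to account for undocumented migrants missed by the survey.
    return residual * (1.0 + coverage_adjustment)


# Hypothetical inputs: 1,000,000 foreign-born in the survey, 600,000 lawful residents.
low = residual_undocumented(1_000_000, 600_000, 0.08)
high = residual_undocumented(1_000_000, 600_000, 0.13)
print(round(low), round(high))
```

The two calls bracket the estimate using the lower and upper coverage adjustments, which shows how sensitive the residual is to the assumed undercount.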
To the extent that undocumented migrants are more likely to be poor and to have lower levels of human capital, this undercount may bias the results by shifting the measured distribution of educational attainment upward. On the other hand, push factors that are not strictly economic, in particular violence and civil war, may lead people across a broader range of the skill distribution, including those who might otherwise try to migrate through formal channels, to migrate informally. Given the significant levels of violence in the Northern Triangle, this factor may mitigate the extent to which undocumented migrants are negatively selected relative to documented migrants. Finally, rates of undocumented migration from the NT were lower for the 2000 cohort than for 2014, hence reducing the magnitude of the potential bias for the earlier cohort. To the extent that undocumented migrants are undercounted in the U.S. data and have lower skill levels than documented migrants, results based on these data can be biased against finding negative selection.
Another measurement issue is the comparability of data for the migrant and non-migrant populations, especially in measuring education. Ibarraran and Lubotsky (2007) suggest that evidence of positive selection using the U.S. data may be driven by a high prevalence of imputed values among migrants in the U.S. Census. Indeed, analysis of the 2014 ACS reveals that imputed values alter the distribution of educational attainment towards higher rates of secondary school completion for the three countries (Figure 1). To address this concern, we exclude observations with imputed education, accounting for 12-15 percent of the U.S. sample. In effect, the implicit assumption is that education is missing at random within the sample of NT migrants. Another potential source of bias in the measurement of education is differences in the way the questions are asked in the U.S. data and the source countries. However, as shown in Table 1, these questions are largely consistent between the U.S. data and the source country surveys.

[Figure 1: Difference in the distribution of educational attainment (total minus non-imputed observations); panels: SLV, GTM, HND]
As noted in other research on migration outcomes, another source of potential bias is return migration. If return migration is not random among migrants, then migrants who remain in the destination country differ systematically from those who returned to the source country. Borjas and Bratsberg (1996) argue that 'return migration accentuates the type of selection.' This suggests that if migrants from the NT are negatively selected, for example, those who are most successful (and hence more likely to reach their migration "earnings goal") would be more likely to return, thus skewing the remaining sample to look more negatively selected. On the other hand, if less successful migrants are more likely to leave, for example if wage expectations are not met, then this would bias the remaining sample towards positive selection. One approach to address this concern is to analyze recently arrived migrant cohorts, as attrition accumulates over time and is hence less likely in shorter time frames. However, to maintain sufficiently large samples, the arrival cohorts used in this analysis span 10 years.

Summary Statistics
Table 2 reports average age, distribution of schooling, labor force participation, and average hourly wages for residents of the Northern Triangle and migrants to the United States for the cohorts of 2000 and 2014 by country of origin. Migrants from all three countries are younger, more educated, and have higher hourly wages relative to residents (even after adjusting for purchasing power parity). Though educational attainment among the NT-born population has increased in the last decade, migrants continue to be more educated than residents in 2014. Migrants were less likely to have only primary school or less (nine or fewer years of education) and more likely to have secondary education (10-15 years of education) in the three countries. College graduation rates (16+ years of schooling) were similar between the two groups across the three countries and two cohorts.
As of 2014, relative to residents, recent migrants from the region remain less likely to have nine or fewer years of education, more likely to have 10-15 years and less likely to have 16+ in Honduras and El Salvador. Table 2 presents suggestive evidence against the negative selection hypothesis: the least educated are not the most likely to migrate from the Northern Triangle to the United States.

Methodology
Building on Chiquiar and Hanson (2005), we denote $f^{j}(w \mid x)$ as the wage density in country $j$ conditional on characteristics $x$. Let $D_i$ be an indicator equal to 1 if individual $i$ is employed and let $h(x \mid j, k, D_i = 1)$ be the distribution of observed characteristics among individuals in country $j$ born in country $k$. Using this notation, we can write the observed wage density of Guatemalan residents and migrants, respectively, as

Residents: $$g(w \mid j = GTM, k = GTM, D_i = 1) = \int f^{GTM}(w \mid x)\, h(x \mid j = GTM, k = GTM, D_i = 1)\, dx$$

Migrants: $$g(w \mid j = US, k = GTM, D_i = 1) = \int f^{US}(w \mid x)\, h(x \mid j = US, k = GTM, D_i = 1)\, dx$$

These two wage densities are shown in Figure 2, which presents kernel density estimates of wages for NT migrants and residents. Wages are higher for NT migrants in the United States, even after adjusting for cost of living differences. Differences in wage distributions between migrants and residents are attributable not only to differences in returns to skills in the two labor markets, but also potentially to differences in the distribution of skills in the migrant and resident populations. The latter would be the result of positive selection into migration. To test this hypothesis, we estimate the forgone local wages of migrants. This is done by building a counterfactual wage distribution that, in effect, estimates the distribution of wages for migrants if they were employed and paid at the rates found in their source countries. This is calculated by reweighting the resident distributions in Figure 2 to reflect the characteristics of the migrants in the migrant wage distribution.
Forgone wages of migrants are estimated by constructing counterfactual weights that adjust the wage distribution of residents (individuals who did not migrate) to account for differences in observable characteristics between migrant workers and resident workers. The counterfactual weight is designated as $\theta$ and adjusts for differences in migrant and resident characteristics ($x$) in the source country and destination country. The second step is to estimate the probability that a local adult is in the United States, using the full sample of migrants and residents. For each country $c$ and year $t$ we estimate the following logit regression separately for men and women (indexed by $i$):

$$\Pr(\text{Migrant}_{cti} = 1 \mid x_{cti}) = \frac{\exp(x_{cti}'\beta_{ct})}{1 + \exp(x_{cti}'\beta_{ct})} \qquad (1)$$

where $\Pr(\text{Migrant}_{cti})$ is the probability of being a migrant from country $c$ in year $t$, conditional on a set of observed characteristics $x_{cti}$: dummies for years of schooling, age, age squared, a dummy for marital status and the interaction of years of schooling and age, as well as marital status and years of schooling. For women, regressions also include number of own children in the household and its interaction with schooling.
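As a sketch of why a model of the migration probability yields a reweighting factor, Bayes' rule links the ratio of characteristic densities for migrants and residents to the conditional odds of migrating. The symbols $\theta(x)$, $M$ (a migrant indicator), and $h(\cdot)$ are notation assumed here for illustration, and the expression abstracts from any adjustment for differences in employment rates that the full procedure may include.

```latex
\theta(x) \;=\; \frac{h(x \mid M = 1)}{h(x \mid M = 0)}
\;=\; \frac{\Pr(M = 1 \mid x)}{1 - \Pr(M = 1 \mid x)}
\cdot \frac{1 - \Pr(M = 1)}{\Pr(M = 1)}
```

The first factor is the conditional odds implied by a fitted logit; the second is a constant that rescales the weights without changing the shape of the reweighted density.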
The product of the conditional probabilities and the fitted coefficients from regression (1) are applied to the sample of wage-earning residents to estimate counterfactual wage kernel densities for migrants in the United States. The difference between the observed and counterfactual wage densities non-parametrically summarizes migrant selection in terms of local earnings. A larger positive gap at the upper end of the wage distribution is evidence of positive selection: migrants are overrepresented among those with above-average skills and underrepresented among those with below-average skills. On the other hand, a larger positive gap at the lower end of the wage distribution shows evidence of negative selection.
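The reweighting logic can be sketched on toy data. This is a minimal illustration, not the estimation used in the paper: schooling cells stand in for the full set of characteristics, observed cell shares stand in for the fitted logit probabilities, and all wages, counts, the bandwidth, and the function names are hypothetical.

```python
import math
from collections import Counter


def odds_weights(resident_cells, migrant_cells):
    """Weight per cell: share of migrants in the cell over share of residents in the cell."""
    n_res, n_mig = len(resident_cells), len(migrant_cells)
    res_counts, mig_counts = Counter(resident_cells), Counter(migrant_cells)
    return {cell: (mig_counts[cell] / n_mig) / (count / n_res)
            for cell, count in res_counts.items()}


def weighted_kde(points, weights, grid, bandwidth):
    """Gaussian kernel density of `points`, evaluated on `grid`, with observation weights."""
    total = sum(weights)
    norm = total * bandwidth * math.sqrt(2.0 * math.pi)
    return [sum(w * math.exp(-0.5 * ((g - p) / bandwidth) ** 2)
                for p, w in zip(points, weights)) / norm
            for g in grid]


# Hypothetical residents: (schooling cell, hourly wage). Migrants: schooling cell only.
residents = [("primary", 1.0)] * 6 + [("secondary", 2.0)] * 3 + [("tertiary", 4.0)]
migrant_cells = ["primary"] * 3 + ["secondary"] * 6 + ["tertiary"]

theta = odds_weights([cell for cell, _ in residents], migrant_cells)
wages = [wage for _, wage in residents]
weights = [theta[cell] for cell, _ in residents]

# Observed resident mean versus the mean residents would have with migrants' characteristics.
resident_mean = sum(wages) / len(wages)
counterfactual_mean = sum(w * t for w, t in zip(wages, weights)) / sum(weights)

# Counterfactual minus observed density at a low, middle, and high wage.
grid = [1.0, 2.0, 4.0]
observed = weighted_kde(wages, [1.0] * len(wages), grid, bandwidth=0.5)
counterfactual = weighted_kde(wages, weights, grid, bandwidth=0.5)
gap = [c - o for c, o in zip(counterfactual, observed)]
print(resident_mean, counterfactual_mean, gap)
```

In this toy example migrants are concentrated in the secondary-schooling cell, so the density gap is negative at the lowest wage and positive at the middle wage, the pattern the text associates with positive selection.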

Results
The results for the three countries suggest positive selection into migration in 2000: the counterfactual wage density for migrants lies to the right of the observed wage density for residents (Figures 3 and 4). The results for men in El Salvador show the least obvious difference between the densities for migrants and residents. For each gender in each of the countries, the two wage densities move markedly closer together for the 2014 cohort of migrants. This suggests a decrease in the degree to which migration from these three countries is positively selected. The result is consistent with an increase in migration flows based on larger networks during this period. As migration networks are expected to reduce the costs of migration, they also imply a reduction in selectivity into migration.
Though the densities became more similar between 2000 and 2014, evidence of positive selection remains. The differences between the densities of migrants and residents remain negative for low wage values and positive for upper-middle wage values (Figure 5). The shift in selection of female migrants is more noticeable as they have become less positively selected, suggesting a change in the costs of migration for women and/or family integration. Similarly, for Guatemalan and Honduran men the density difference is positive for some low wage values and strongly positive for middle wage values, indicating a shift to intermediate selection with some evidence of negative selection that was not evident a decade earlier.
To assess whether these changes in selection into migration, as measured through counterfactual wage distributions, are statistically significant, wage distributions are summarized in terms of deciles. The share of the population in each decile is estimated for each country, year and gender. For each decile, we test whether the share of migrants is statistically different from 0.10. Positive selection of migrants would result in population shares below 0.10 for low-income deciles and above 0.10 for high-income deciles, and vice versa for negative selection. Table 3 corroborates the results arising from the visual inspection of Figure 4. Migrants are overrepresented among upper-middle wage workers for the three countries. For most countries and years, both female and male migrants are underrepresented in the bottom deciles and overrepresented in the top deciles. Though visually the differences between actual and counterfactual wage densities appear small, most of these differences are statistically significant across wage deciles. As suggested by the densities, positive selection is particularly significant among Guatemalan and Honduran male migrants and Honduran female migrants in 2000. The only group for which none of the deciles differs between migrants and non-migrants is Honduran women in the 2014 cohort. This represents a significant shift in selection from the 2000 cohort, in which Honduran women were significantly overrepresented in the upper deciles.
Across country and gender groups, the more recent migrant cohort has higher shares of migrants in the lower wage deciles relative to the 2000 cohort. This is consistent with the results from Figure 4, which suggest that in 2014 Guatemalan and Honduran migrants, in particular, were less positively selected than they had been in 2000. The shift is particularly noticeable among female migrants.
Note: Based on regression (1) as reported in Appendix Table A.1. Statistical difference is estimated relative to 0.10, with *** indicating p<0.05.
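The decile comparison described above can be sketched as follows. This is an illustration under stated assumptions: the text does not specify the test statistic, so a one-sample z-test for a proportion is assumed here, and the wages, weights, migrant sample size, and function names are all hypothetical.

```python
import math


def decile_shares(wages, weights):
    """Share of total counterfactual weight falling in each decile of resident wages."""
    n = len(wages)
    order = sorted(range(n), key=lambda i: wages[i])
    totals = [0.0] * 10
    for rank, i in enumerate(order):
        decile = min(rank * 10 // n, 9)  # deciles are defined on the unweighted resident sample
        totals[decile] += weights[i]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def z_against_tenth(share, n_obs, p0=0.10):
    """Assumed test: z-statistic for the hypothesis that a decile share equals 0.10."""
    standard_error = math.sqrt(p0 * (1.0 - p0) / n_obs)
    return (share - p0) / standard_error


# Hypothetical residents with wages 1..100; counterfactual weights are doubled
# in the top half, mimicking migrants who resemble higher-wage residents.
wages = [float(w) for w in range(1, 101)]
weights = [1.0] * 50 + [2.0] * 50

shares = decile_shares(wages, weights)
z_bottom = z_against_tenth(shares[0], n_obs=400)  # 400 is a hypothetical migrant sample size
z_top = z_against_tenth(shares[9], n_obs=400)
print([round(s, 3) for s in shares], round(z_bottom, 2), round(z_top, 2))
```

Shares below 0.10 in the bottom deciles together with shares above 0.10 in the top deciles correspond to the positive-selection pattern described in the text; the reverse pattern would indicate negative selection.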

Economic importance of forgone wages
What is the economic impact of this out-migration in terms of forgone wages? The counterfactual wage distribution can be used to estimate the economic cost of migration in terms of forgone wages under the assumption of no labor market effects of out-migration or remittances. Specifically, forgone labor earnings for each country and cohort (designated by j) can be estimated based on the labor earnings of residents and the counterfactual weights estimated above. Implicit in this exercise is a strong assumption that labor market wages would be the same in the absence of migration, an unlikely situation given the size of the migrant flow and the size of the remittance inflows. Even so, it provides a general sense of the cost of migration in terms of lost human capital. Table 4 reports the results from this exercise. The recent cohort of migrants, defined as those who migrated as adults in the 10 years before 2000 and 2014 (and further limited to those without imputed education information in the U.S. data), represents between 2.4 percent (the 2014 cohort from Guatemala) and 6.6 percent (the 2000 cohort from El Salvador) of the local labor force of the source countries. Yet, as suggested by the results above, their forgone earnings are a larger share of total labor earnings, ranging from 3.2 to 7.0 percent. The 2000 cohorts from Guatemala and Honduras in particular stand out as representing outsized shares of labor earnings, results aligned with greater positive selection. As a share of GDP, forgone earnings from each cohort range from 1.0 to 3.2 percent. This share has fallen between 2000 and 2014, reflecting both higher GDP in the Northern Triangle and lower selectivity into migration.
Source: ACS, 2000 US Census data, and SEDLAC (CEDLAS and World Bank).
Note: The ten-year cohort is defined as all working age (18-65) migrants from each country living in the US who migrated as adults (age 18 or older) and arrived in the 10 years prior to the survey (either prior to 2000 or 2014). The first column of results reports the migrant population as a share of the potential workforce (ages 18-65). The other columns report the total estimated monthly labor earnings of these migrants as a share of total local earnings, GDP, and remittances (each adjusted to 2011 US$). Due to data limitations, the remittances share for the 2000 cohort is based on 2002 data for Guatemala.
Of course, in exchange for losing migrant human capital, these countries have received significant remittances inflows. The results suggest that the loss in earnings from these recent cohorts represents between 8.7 and 11.8 percent of all remittances received in 2014. Since the migration stream to the United States from the Northern Triangle is relatively recent, the forgone earnings of recent migrants as a share of remittances received in 2000 was significantly higher. This was particularly the case in Honduras, where Hurricane Mitch in 1998 was an important trigger for migration to the United States. In the case of that initial wave of migrants, the forgone wages of those who left represented about half of the amount received in remittances by the country in the year 2000.
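The arithmetic behind shares of this kind can be sketched as follows. All inputs are hypothetical and are not taken from Table 4; the function name is illustrative, and converting monthly earnings to an annual figure by multiplying by 12 is an assumption made here so that earnings can be compared with annual GDP and remittances.

```python
def forgone_earnings_shares(n_migrants, counterfactual_monthly_wage,
                            resident_monthly_earnings_total,
                            gdp_annual, remittances_annual):
    """Forgone earnings of a migrant cohort as shares of local earnings, GDP, and remittances."""
    # Earnings the cohort would have received at source-country wages, per month.
    forgone_monthly = n_migrants * counterfactual_monthly_wage
    # Assumption: 12 months of earnings per year.
    forgone_annual = 12 * forgone_monthly
    return {
        "share_of_labor_earnings": forgone_monthly / resident_monthly_earnings_total,
        "share_of_gdp": forgone_annual / gdp_annual,
        "share_of_remittances": forgone_annual / remittances_annual,
    }


# Hypothetical cohort: 50,000 migrants with a counterfactual wage of 300 per month.
shares = forgone_earnings_shares(
    n_migrants=50_000,
    counterfactual_monthly_wage=300.0,
    resident_monthly_earnings_total=300_000_000.0,
    gdp_annual=20_000_000_000.0,
    remittances_annual=2_000_000_000.0,
)
print(shares)
```

The counterfactual wage would in practice come from the reweighted resident wage distribution, so a more positively selected cohort raises all three shares for a given number of migrants.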

Conclusion
The estimates presented in this study suggest that the cost of migration in terms of lost human capital is high in the countries of the Northern Triangle. Assuming simply that migrants would have earned what comparable non-migrants earn, the forgone earnings of a subset of migrants, accounting for about 12 to 16 percent of all migration from these countries to the United States, amounted to 1.9 percent of GDP in El Salvador, 1.5 percent in Honduras, and 1.0 percent in Guatemala in 2014. These forgone earnings are recovered through remittances, which represent larger shares of the GDP of each country. Of course, the loss of human capital, particularly in a situation of positive selection into migration, can suggest lower productivity and growth potential. The extent to which remittances have fully compensated for these losses remains an open question.
Two measurement caveats should be considered when analyzing these results. First, due to an undercount of undocumented migrants, it is possible that the data are biased towards higher educational attainment and groups least likely to be undocumented. The rates of undocumented migration were higher in the 2014 data, which suggests the problem would be larger in 2014. However, improvements in survey coverage and an increase in violence as a push factor are expected to have mitigated the bias. The second caveat is the lack of information on unobservable characteristics, such as motivation. To the extent that the decision to migrate is positively correlated with greater levels of motivation, this may suggest an underestimate of the counterfactual wage distribution.

Note: In the U.S. data the sample is restricted to recent migrants (those who arrived in the last 10 years) and to individuals ages 18 or above at time of entry to the U.S. Regressors are dummy variables for years of schooling, age, age squared, a dummy for marital status and interactions among these variables. For women, regressions include number of own children and its interaction with schooling. Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1