Policy Research Working Paper 9391 Making Gravity Great Again Will Martin Development Economics Development Research Group September 2020 Policy Research Working Paper 9391 Abstract The gravity model is now widely used for policy analysis that the traditional Ordinary Least Squares estimator in and hypothesis testing, but different estimators give sharply logarithms is strongly biased downwards. The popular Pois- different parameter estimates and popular estimators are son Pseudo Maximum Likelihood model also suffers from likely biased because dependent variables are limited- downward bias. An Eaton-Kortum maximum-likelihood dependent, error variances are nonconstant and missing approach dealing with the identified sources of bias pro- data frequently reported as zeros. Monte Carlo analysis vides unbiased parameter estimates. based on real-world parameters for aggregate trade shows This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The author may be contacted at W.Martin@cgiar.org. The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent. Produced by the Research Support Team Making Gravity Great Again Will Martin International Food Policy Research Institute Washington DC EconLit Descriptors: C13, C24, C54, F14, F20 Making Gravity Great Again I. Introduction The gravity model has become enormously important both for economic theory and for policy analysis. Since its first use in the 1960s (Tinbergen 1962), it has been popular because of its ability to represent patterns in many bilateral flows, including aid, investment, migration and trade. The progress made in providing strong theoretical foundations (see Head and Mayer 2014) has greatly increased its popularity for estimating critically important parameters, such as the effects of distance on trade flows (Eaton, Kortum and Sotelo 2013) and the welfare costs of economic distortions (Arkolakis, Costinot and Rodríguez-Clare 2014). It is now widely seen as a basis for new quantitative models to supplement (Ottaviano 2016) or replace computable general equilibrium models for policy analysis. Unfortunately, popular estimators yield different parameter estimates, with some seriously biased because of frequent zero flows, missing data presented as zeros and a combination of nonlinearity and heteroscedasticity. Bilateral trade data sets often contain many zeros, with around half common for total trade and larger shares for sectoral trade. Up to the early 2000s, gravity equations were typically estimated in logarithms, excluding the zero observations, despite the evidence (eg Olsen 1980) that such sample truncation results in bias. Trade datasets frequently also include many missing data not reported to UN COMTRADE and conflated with true zeros. An enormously influential paper by Santos Silva and Tenreyro (SST) (2006) highlighted the potential for bias from the combination of heteroscedasticity and nonlinearity in gravity equations. They provided a very thoughtful explanation of why their preferred estimator, the Poisson Pseudo-Maximum Likelihood (PPML), deals with this problem. 2 Many have argued that PPML eliminates the bias associated with zero trade flows by allowing the zero trade values to be included (eg Yotov et al 2016, p20). Certainly, this ability to incorporate zeros is in sharp contrast with the conventional approach, where the zeros must be omitted because the log of zero is undefined. However, it is not enough to eliminate concerns about bias. Greene (1981) showed that where Ordinary Least Squares (OLS) can be and is used on such censored samples the coefficients are scaled down by the share of non-limit observations in the sample. SST (2006) presented Monte-Carlo analyses which led them—and many others—to conclude that the PPML estimator was the best available estimator for the gravity model. However, these experiments included no zeros so they illuminated only the heteroscedasticity/nonlinearity problem. SST (2011) presented an alternative Data Generating Process (DGP) with many zeros for which the PPML estimator was unbiased but, as noted by Head and Mayer (2014, p180), this DGP does not feature the fixed costs or threshold values that play key roles in generating zero trade outcomes in many models based on modern theory. Estimators such as the Tobit (Tobin 1958) or Heckman (1979) models that allow for zero-valued outcomes are consistent with modern gravity but also suffer from potentially large biases associated with heteroscedasticity (Arabmazar and Schmidt 1981; Donald 1995). Monte Carlo simulation allows application of alternative estimators to datasets with known parameters so any bias can be identified. However, as argued by Leung and Yu (1996), the DGPs used for Monte Carlo analysis must be close to the parameters of the process generating the actual datasets, or their findings may be very misleading. They showed that many Monte-Carlo based studies for limited-dependent variable models gave totally misleading answers because their experimental designs ignored key features of the underlying data 3 generating processes. Martin and Pham (2020) found Monte Carlo conclusions about gravity model estimators to be very sensitive to key parameters such as the distribution of predicted trade. Unfortunately, no Monte Carlo simulations on data with properties matching those of the relevant underlying data generating processes appear to be available. The important SST (2006) Monte Carlo experiments involved: (i) trade predictions assumed log standard normal before adding a dummy variable and (ii) model disturbances distributed log standard normal before adding arbitrary patterns of heteroscedasticity (SST 2006, Case 3). Head and Mayer’s Handbook chapter (2014, p181) used a Monte Carlo design that yields the right share of zero trade observations, but considered only two arbitrary types of heteroscedasticity—a constant variance to mean ratio consistent with the PPML model and a constant variance in logs model consistent with estimation in logarithms. The goals of this paper are to reduce the uncertainty about the best estimator for the gravity model with zeros resulting from sample selection and missing data, and to identify good, practical estimators that allow for potential heteroscedasticity. To foreshadow the key results, the standard truncated OLS estimator was strongly biased downwards—by between 38 and 48 percent. The PPML estimator was biased downwards by an average of 18 percent and threshold Tobit models by 8 to 15 percent. A maximum likelihood estimator following Eaton and Kortum (2001) in dealing with the limited-dependent variable problem but also allowing for heteroscedasticity (the EK-H estimator) provided unbiased parameter estimates. Including missing trade data mapped to zeros had small impacts on most parameter estimates but increased some by around 20 percent. 4 The second section of the paper considers the potential for biases from limited-dependent variables, missing data and heteroscedasticity, and their implications for the design of Monte Carlo experiments. The third section presents estimates of the gravity model from a range of potential estimators, partly to see whether there are important differences in results, and partly to provide estimates of the parameters needed for Monte Carlo analyses to identify the vulnerability of different estimators to bias. The Monte Carlo framework is developed in the fourth section and its results are discussed in the fifth section. A short sixth section provides some suggestions for applied research, prior to the final conclusions. II. Sources and Solutions for Estimation Bias Before designing Monte Carlo simulations that allow testing the suitability of alternative estimators, it is important to understand the sources of potential bias. SST (2006) identified two: (i) limited-dependent variable bias associated with zero trade values, and (ii) the combination of nonlinearity and heteroscedasticity in the model. An additional source of potential bias considered here arises from missing trade data treated as zeros. Limited-dependent variable bias results from winsorization of the distribution of a latent variable at a threshold value, with all values beyond the threshold mapped to this value (Tobin 1958). This results in errors from Ordinary Least Squares (OLS) estimators having non-zero means (Heckman 1979), violating the fundamental assumptions of OLS and leading to biased parameter estimates. The combination of nonlinearity and heteroscedasticity may cause bias in nonlinear models because of correlations between explanatory variables and error terms. In limited-dependent models heteroscedasticity causes bias, even in linear models, as discussed below. 5 Eaton and Kortum (2001) suggest an estimator for the gravity model later formalized for limited-dependent variable models by Carson and Sun (2007). While limited-dependent variable estimators for decisions such as purchases of new automobiles frequently use a zero threshold, Carson and Sun (2007) point out that the minimum possible purchase is not zero but the price of the cheapest new car, implying a threshold above zero. Similarly, modern theory points to threshold values below which market entry is not worthwhile given the distribution of productivity across firms and countries (Chaney 2018). Consistent with this, the Eaton-Kortum (EK) (2001) estimator replaces the limit observations with thresholds proxied by the smallest observed import into each country. Importantly, Carson and Sun (2007) show that the asymptotic distributions of other parameters are not affected by uncertainty about this threshold. The resulting model in levels, generalized to incorporate a nonlinear functional relationship between the dependent and independent variables and different error variances for different observations, can be represented as: (1) yi* = f(xi,β) + ui where yi* is a latent variable, xi is a vector of exogenous regressors and β a vector of parameters. Observations of the dependent variable are: (2) yi = yi* if yi* > γ or (3) yi = 0 if yi* ≤ γ where γ is the threshold that the latent variable must exceed before its value is observable. The likelihood function for this problem is the product: 0 −( ,) 1 −( ,) (4) (, ) = ∏=1 Φ� � ∏ =0 +1 ( ) 6 Where n0 is the number of zero observations; Φ is the Cumulative Distribution Function (CDF) and the Density Function of the error terms; n is the total number of observations; is the standard deviation of the distribution of errors at observation i which, extending Carson and Sun (2007), varies by observation. The first product in equation (4) refers to observations at the threshold. The second refers to nonzero trade observations. The positive EK threshold allows use of log-linear models. Figure 1 provides useful insights into key problems with standard approaches to estimation and the Eaton-Kortum (2001) version of the gravity model. The x-axis shows the predicted value of the log of trade, which is mapped to actual trade in logarithms on the y-axis with a unit coefficient. The EK threshold is represented by the horizontal line at γ. This linear-in- logarithms version makes the implications of limited-dependent variables and missing data much more obvious than a nonlinear version. The small sample of observations was generated in STATA’s pseudo-random number generator using parameters consistent with estimated distributions of residuals and predicted trade. All the dots, squares and triangles in Figure 1 are realizations of the latent variable in equation (1). Although it is not obvious with this sample, observations with lower predicted values have higher error variances. If the full set of points were available, OLS would provide unbiased estimates of the slope. However, only the round, and potentially the square, points are observed. The triangles are realizations of the latent variable below the threshold, which are observed as zeros and mapped to the x observations on the threshold in evaluating the likelihood function. The displacement of these observations means that the residuals from applying OLS to a censored sample consisting of the dots, the squares and the x’s have a non-zero mean, making OLS a biased estimator (Heckman 1979). OLS estimation using only positive trade values also 7 results in bias because this truncation tends to remove observations with smaller errors, resulting in errors with positive expected value. Heteroscedasticity can have substantial implications for Tobit estimators since the value affects the realization of the CDF in equation (4) for each of the limit values, x. The resulting bias increases with the degree of heteroscedasticity and the share of limit observations (see Arabmazar and Schmidt 1981) and is quite different from bias due to the combination of heteroscedasticity and nonlinearity identified by SST (2006). Gravity models are frequently estimated on flows of aid, investment (Chitu, Eichengreen and Mehl 2014), migration (Artuc et al 2014) or trade data with missing values of the dependent variable. When these are mapped to zeros, the resulting sample merges data from two distinct regimes—the stochastic relationship between x and y and the deterministic relationship between x and the zero observations. This can be expected to reduce the coefficient on x estimated with non-limited-dependent estimators like Nonlinear Least Squares (NLS) or PPML. With limited- dependent variable estimators, the impact of the missing data on coefficient estimates depends firstly on whether these missing values would have exceeded the threshold (bias) or not (no bias). Figure 1 shows that the direction of the bias from mapping nonzero trade values to zero will depend on their values—mapping the square at (x = -1.1, y=4.4) to the threshold biases the estimated slope upwards, while mapping the square at (8, 13) to the threshold biases it down. This suggests that missing trade for relatively small flows may bias gravity model coefficients up while missing trade for large flows may bias them down in absolute value. 8 Figure 1. Log-Log Gravity Model with Zeroes, Heteroscedasticity & Missing Values 15 10 y 5 γ 0 -5 x -5 0 5 10 15 x 9 The key parameters needed to ensure that Monte Carlo simulations replicate the system under examination are: 1. The threshold, γ, distinguishing values at the limit and those above the limit, 2. The distribution of predicted outcomes, including (i) Their distribution and especially whether it is normal or log-normal, (ii) The mean and variance of this distribution 3. The distribution of the residuals around the expected trade outcomes, including: (i) Their distribution, (ii) Their standard deviation at any point, σi, and (iii)The pattern of any heteroscedasticity 4. Missing values and their treatment, and 5. The fraction of observations at the threshold Many estimators of limited-dependent variable models assume a threshold value of zero, since only zero or positive values of gross flows of aid, migration, investment or trade are recorded. But a zero threshold is questionable with multiplicative gravity models of the type considered because of: (i) the need for enough volume to cover fixed costs in modern models of trade (eg Chaney 2008) or other flows, (ii) the fact that it makes zeros impossible if residuals are distributed log-normally, and (iii) the fact that the zero share must be below 50 percent if model residuals are symmetrically distributed in levels, while larger shares are frequently observed. The distribution of predicted outcomes may influence the results substantially. Predicted values of trade, ( , ), below or near the threshold are more likely to result in zeros. When predicted trade is far above the threshold, zeros are much less likely. 10 The distribution of the residuals also influences outcomes. Whether they are in levels or logs has profound implications while higher variances usually increase the share of limit observations. Heteroscedasticity can result in bias through correlations with nonlinear functions, misspecification of Equation (4), or both. The frequency of missing values and their treatment are potentially important. Many zero observations reflect the failure of economies to report their trade or other flows. If these missing values are treated as zeros, there may be substantial bias. If they have been replaced by analysts’ estimates, estimation of the gravity model becomes a different—and much less interesting— exercise in inferring the model used to fill out the missing parts of the trade matrix. The fraction of observations at the threshold is an important, observable outcome of the data generating process, depending on all the parameters identified above. III. Estimating Key Parameters This section examines a series of potential estimators of the gravity model both to assess the importance of estimator choice for estimates of key parameters for trade flows, and to identify the functional forms and parameter values needed for informative Monte Carlo analysis. The models considered are: (i) traditional OLS in logs, (ii) PPML (iii) Feasible Generalized Least Squares (GLS); (iv) the Tobit model, (v) Weighted Tobit, and (vi) the Eaton and Kortum (2001) Maximum Likelihood with correction for heteroscedasticity (EK-H). The traditional gravity model includes national variables such as income and population as well as bilateral variables such as the transport costs, tij, from country i to j. While intuitively appealing, this does not capture multilateral resistances and hence generates misleading policy 11 parameters (Anderson and van Wincoop 2003). Feenstra (2004) and Redding and Venables (2004) showed that policy relevant parameter estimates can be obtained by replacing national variables with dummies in cross-sectional analysis, leaving only the bilateral variables, although Anderson (2011) expressed some concerns about this approach. Because this fixed-effects (FE) approach is the simplest solution to the problem it has become enormously popular and is examined to ensure that the parameters for Monte Carlo analysis represent such structural gravity models. Where heteroscedasticity is allowed for, the variance of the residuals is modelled as 2 = (+) where x is the predicted log of trade and the exponent term ensures the variance is always positive (see Cameron and Trivedi 2009, p155). Consistent with standard practice, estimates based on the OLS residuals were used for Feasible GLS and Weighted Tobit estimators to see how well these simple approaches perform and to allow comparison with the EK-H estimator. PPML was considered because of its widely perceived ability to deal with the bias problems associated with both zeros and nonlinearity/heteroscedasticity, and the tentative support for its performance provided by Martin and Pham (2020, p11) when predicted trade is concentrated near zero. The Tobit model is designed to deal with limited-dependent variable bias, and Weighted Tobit attempts to deal with both limited-dependent variable bias and impacts of heteroscedasticity on this model. The EK-H model was included because it specifically deals with both heteroscedasticity and limited-dependent variable bias. Several models used in earlier work were not examined. The Gamma Pseudo-Maximum Likelihood model was excluded because it did not perform well in Head and Mayer’s simulations (2014, p181). NLS was excluded because of concerns about bias due to limited- dependent variables, heteroscedasticity and nonlinearity; and evidence against normality of 12 residuals in levels. The Heckman model was not examined because it is challenging to correct for heteroscedasticity (Donald 1995), and its need for an additional variable in the selection equation makes model comparisons difficult. The Zero-Inflated-Poisson (ZIP) model was excluded because it was found to give the same results as truncated PPML in real-world regressions (Martin and Pham 2020) and because the inflation term requires an additional variable. The analysis uses the 136*136 cross-sectional trade matrix for 1992 curated by Feenstra, Lipsey and Bowen (1997) and studied by SST (2006) and by Jochmans (2017). A cross check with UN COMTRADE revealed that 46 of these economies did not report their trade for 1992, including Iran, Nigeria, Russia and South Africa. Of the potential 2070 trade flows between these non-reporters 2059 were zero, with nine minuscule flows into St Kitts and Nevis, one to the Bahamas and one to Guyana. This made clear that the developers of the dataset had made no systematic attempted to predict flows between non-reporters. Given the possibility of bias from treating these missing data as zeros, trade between these non-reporters was omitted in the main analysis, with results from the full dataset presented in the Online Appendix. With these data excluded, the full sample was 16290 with 41 percent zeros. The results for the Fixed Effects (FE) models in Table 1 show substantially different results between estimators. The critical distance elasticity, for example, ranges from -0.75 with the PPML estimator through -1.47 with the EK-H estimator to -1.88 with the Tobit model. Of the categorical variables, common language and colonial ties have consistently positive and significant effects, while the signs on FTA and Contiguity dummies are puzzlingly negative for the three models based on limited-dependent variable models. 13 A key feature of the traditional model results in Table 2 is consistent patterns in the estimates of key parameters, particularly the income and distance elasticities. The elasticity on exporters’ GDP, for instance, rises by more than a factor of two from 0.73 with the PPML model to 1.5 with the Weighted Tobit model. For the distance elasticity, the range is even larger, from -0.78 with the PPML to -1.74 with the Tobit model. Several other features of Tables 1 and 2 are of interest. The first is that the heteroscedasticity parameters are strikingly similar across estimators. A second is that the differences between the estimators that account for heteroscedasticity and those that do not are generally relatively small. This is clear for OLS and Feasible GLS, and for Tobit and Weighted Tobit. A third is that the estimators accounting for limited dependent variable status tend to have larger coefficients than those that do not. A fourth feature is that the PPML estimator has lower coefficients on key distance and income terms than other estimators. 14 Table 1. Results for the Fixed Effects Gravity Model, Excluding Non-Reporters. OLS PPML Feasible GLS Tobit Wtd Tobit EK-H Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Log distance -1.35 -43.1 -0.75 -18.5 -1.28 -44.6 -1.88 -44.9 -1.66 -42.3 -1.47 -26.1 Contiguity 0.17 1.3 0.37 4.1 -0.01 -0.1 -0.57 -3.2 -0.68 -4.7 -0.60 -3.1 Common language 0.41 6.0 0.38 4.1 0.39 6.2 0.79 9.2 0.70 8.4 0.65 6.7 Colonial tie 0.67 9.5 0.08 0.6 0.68 10.4 0.98 10.9 1.09 12.3 0.99 10.5 FTA 0.31 3.2 0.38 4.9 -0.07 -0.8 -0.36 -2.5 -1.03 -8.4 -1.38 -9.5 Fixed Effects Yes Yes Yes Yes Yes Observations 9602 16290 9602 16290 16290 16290 R2 0.75 0.93 0.80 0.30 0.31 Hetero, ln a 1.97 43.0 2.7 105.3 2.6 50.7 b -0.12 -19.4 -0.09 -15.6 -0.12 -14.5 Mean SD Mean SD Ln Predicted Trade 8.0 2.9 4.3 4.9 Source: Author’s estimates using Stata 16.0. The log of trade is the dependent variable in all but the PPML model. WLS and Weighted Tobit used heteroscedasticity parameters shown under the OLS heading obtained by applying the Cameron and Trivedi (2009, p155) technique to residuals from the OLS regression. Heteroscedasticity parameters for the PPML were estimated using an EK-H model to avoid truncation of the residuals in estimating the heteroscedasticity function, with the log of the predicted value from the PPML as the explanatory variable, the log of actual trade in the density function in equation (4) and the log of the threshold in the CDF. 15 Table 2. Results for the Traditional Gravity Model, Excluding Non-Reporters OLS PPML Feasible GLS Tobit Weighted Tobit EK-H Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Log exporter's GDP 0.94 80.6 0.73 27.3 0.96 86.9 1.49 94.6 1.50 99.1 1.49 91.3 Log importer's GDP 0.80 71.7 0.74 26.9 0.83 78.4 1.15 75.3 1.17 80.0 1.15 73.6 Log exporter's GDPC 0.21 12.5 0.16 2.9 0.21 13.1 0.27 12.1 0.25 11.6 0.24 9.6 Log importer's GDPC 0.11 6.4 0.13 3.0 0.11 7.0 0.21 9.3 0.19 8.5 0.18 7.6 Log distance -1.17 -34.4 -0.78 -14.3 -1.15 -37.0 -1.74 -36.3 -1.62 -36.4 -1.41 -24.0 Contiguity 0.32 2.2 0.20 1.9 0.10 0.9 -0.50 -2.4 -0.69 -4.1 -0.69 -3.0 Common language 0.68 10.6 0.74 5.5 0.70 11.4 1.02 11.8 0.95 11.3 0.78 7.1 Colonial tie 0.40 5.8 0.03 0.2 0.38 5.9 0.76 8.3 0.80 9.0 0.82 7.4 Landlocked exporter -0.06 -0.9 -0.86 -5.5 -0.09 -1.5 -0.11 -1.2 -0.14 -1.6 0.00 -0.1 Landlocked importer -0.66 -10.5 -0.69 -4.9 -0.65 -10.4 -0.74 -8.8 -0.78 -9.1 -0.62 -7.6 Exporter’s remoteness 0.47 6.0 0.65 4.9 0.46 6.3 0.65 5.9 0.61 5.8 0.39 3.5 Importer’s remoteness -0.21 -2.6 0.56 4.7 -0.13 -1.8 -0.31 -2.7 -0.26 -2.4 -0.30 -2.6 FTA 0.48 4.5 0.18 2.1 0.17 1.9 0.32 2.0 -0.30 -2.3 -0.35 -2.5 Openness -0.17 -3.5 -0.12 -0.9 -0.16 -3.3 0.20 3.1 0.19 2.9 0.34 4.3 Constant -28.51 -26.2 -32.13 -15.4 -30.50 -30.2 -49.36 -32.2 -50.86 -35.3 -49.6 -31.9 Observations 9602 16290 9602 16290 16290 16290 R2 0.66 0.86 0.72 0.24 0.26 Hetero, ln a 2.21 48.0 2.8 115.1 2.8 94.3 b -0.11 -18.4 -0.098 -19.8 -0.11 -19.8 Mean SD Mean SD Ln Predicted Trade 8.4 2.7 4.2 4.8 Source: As for Table 1. 16 Comparison of the results for samples including Non-Reporters in Online Appendix Tables A.1 and A.2 provides some guidance on the extent of bias resulting from the missing trade values considered in this paper. For OLS and Feasible GLS, no difference was expected or seen because the estimation sample is almost unchanged. With PPML, the downward bias resulting from adding these observations was almost invisible. The summary of key impacts with Tobit, Weighted Tobit and EK-H in Figure 2, however, points to some larger differences. For GDP and for distance using Fixed Effects, the limited-dependent estimators increased by less than 5 percent when the non-reporters were added, while the coefficients on per capita income rose between 15 and 22 percent. The increases are consistent with the missing data sample including some cases that would, like the left-hand square in Figure 1, have been small positive trade values. However, there are some unexpected exceptions, with the distance coefficients lower in absolute value in several cases. The unpredictable impacts of missing data evident both from Figure 1 and in these results provide a strong case for identifying missing values masquerading as zeros and removing them from the sample. 17 Figure 2. Effects of Adding Missing Data on Coefficients, % 25 20 15 10 5 0 -5 -10 Tobit Wtd Tobit EK-H Note: Exp GDP is the natural log of Exporter GDP; Imp GDP is the natural log of importer GDP; Exp GDPC is the natural log of Exporter GDP per capita; Imp GDPC is the natural log of Importer GDP per capita; Dist (Trad) is the log of bilateral distance in the Traditional Gravity model; Dist (FE) is the log of distance in the Fixed Effects model. 18 IV. Designing the Monte Carlo Experiments Two questions guided the design of the Monte Carlo experiments: (i) are predicted trade and model residuals best represented in logs or levels? and (ii) what are the parameters of the distributions for traditional and FE gravity with PPML and EK-H estimators? Deviations from normality may result in large asymptotic biases with truncated estimators like OLS in logs and much smaller bias with Tobit-type estimators (Arabmazar and Schmidt 1982). To guard against such bias, Quantile plots were used to compare predictions and residuals for each model-estimator combination against a benchmark of normality. These plots for predictions and non-limit residuals from the Traditional Model using EK-H are presented in the four panels of Figure A.1 in the Online Appendix, with heteroscedasticity removed from the log residuals by dividing by their standard deviation at each observation, and similar plots for FE gravity using PPML in Figure A.2. The same result was obtained in every case—acceptance of normality in logs and sharp rejection in levels, confirming SST’s decision to perform their Monte Carlo simulations in logs. The next preparatory step for the Monte Carlo analysis was to select the parameters for the distributions of the logs of predicted trade and the log residuals, whose parameters are given in the lower part of Tables 1 and 2. Because the PPML model is fitted to censored data—which it must convert into a continuous distribution because it uses a continuous version of the Poisson distribution—the mean prediction used is larger (8.2 vs 4.2) and its standard deviation smaller (2.8 vs 4.85) than for the uncensored latent variable in the EK-H model. While these two distributions are quite different, and required two separate Monte Carlo analyses, both distributions are very far from the log standard normal assumed by SST (2006). For the residuals, a striking feature is the broad similarity across model types and estimators of the 19 heteroscedasticity parameters. The intercept was between 2.6 and 2.8 in all cases, while the coefficients on predicted trade were between -0.09 and -0.12 in all cases. V. Monte Carlo Analyses The Monte Carlo analyses were designed to be as simple as possible to focus on the question of interest and to avoid the plethora of cases that made inference challenging in studies, such as Martin and Pham (2020), using a wide range of models and degrees of heteroscedasticity. Because the key challenge for estimation appears to be limited-dependent variable bias, all coefficients should be affected equally (Greene 1981), allowing testing to focus on a single explanatory variable. The experiments were performed by first generating observations using the model presented graphically in Figure 1: (5) y = β.x + e where all variables are in natural logarithms and β=1. The pseudo-random samples used x ~ N(4.2, 4.85) for the EK-H-based data generating process (DGP) and N(8.2, 2.8) for the PPML- based DGP. The error terms for the EK-H model were distributed with variance 2 = (+ ) where a= 2.7 and b= –0.115, while simulations based on PPML parameters used a= 2.75 and b= –0.095. Each simulation included 10,000 observations for comparability with typical samples of bilateral trade flows, repeated 1000 times to obtain an indication of the standard error likely associated with real-world analyses. The required random numbers were generated using the pseudo-random number generator in STATA with seed value 10101. For the OLS estimator, the lowest 41 percent of outcomes were mapped to missing, and eliminated from the sample. For the PPML, the values of the dependent variable were first 20 exponentiated into levels and then the lowest 41 percent of outcomes mapped to zero. For the limited-dependent variable estimators, the smallest 41 percent of outcomes were replaced by the minimum value above this range, the threshold value in the EK-H likelihood function. The estimates for β obtained from each of the four estimators considered, and the associated standard errors, are presented in Table 3. 21 Table 3. Coefficient Estimates and Standard Errors from Monte Carlo Results EK-H Parameters PPML Parameters � β Std Error � β Std Error OLS 0.62 0.01 0.52 0.01 PPML 0.87 0.13 0.78 0.11 Tobit 0.89 0.01 0.90 0.01 Weighted Tobit 0.92 0.03 0.85 0.02 MLE 0.999 0.01 1.000 0.001 Note: β=1 22 These results suggest that Ordinary Least Squares is very seriously biased, with its coefficient estimate with the EK-H parameters of 0.62 very close to Greene’s (1983) prediction of 0.59 with 59 percent non-limit values. The PPML estimator appears to have much lower bias than OLS, with downward bias of 13 percent using the EK-H based sample and 22 percent using parameters derived with the PPML model itself. The Tobit model is less biased than the two models that do not account for the zeros, with a downward bias of around 10 percent for both DGPs. Correction for heteroscedasticity in the Tobit model reduces the bias to 8 percent using EK-H parameters, although this bias is 15 percent with PPML-based parameters. The EK-H estimator allowing for heteroscedasticity and limited-dependent variable bias corrects for these problems and yields results almost exactly in line with the known true value of 1. This result extends the Head and Mayer (2014, Table 3.7, line 3) finding of low bias for an EK Tobit estimator in the absence of heteroscedasticity to the real-world case with heteroscedasticity. It is also consistent with Martin and Pham (2008) who obtained very low bias from a heteroscedasticity-corrected Tobit model applied to data generated using an Eaton and Tamura (1994) DGP. VI. Implications for Applied Research A key consideration for applied research with the gravity model is whether there are many zero values of the dependent variable. If there are, then models that do not explicitly allow for potential sample selection bias—including OLS and PPML—should be used with extreme caution, considering the serious risk of biased parameter estimates. 23 If the theoretical framework and analysis of data suggests the possible presence of bias due to sample selection, heteroscedasticity/nonlinearity or missing values, then the extent of the resulting bias will likely depend on parameters of the type considered in the design of the Monte Carlo analysis in this paper. If the dataset involves very different shares of zeros and/or distributions of predicted trade and model residuals from those in the dataset examined here, then Monte Carlo analysis based on the parameters of the DGP is desirable before final choice of estimator. Several papers have argued for the PPML estimator of the gravity model on the ancillary grounds that it results in predicted trade flows that sum to actual imports and exports (Arvis and Shepherd 2014; Fally 2015). This creates an advantage when gravity models estimated using fixed effects are used to infer the multilateral resistance terms central to structural gravity. But if this advantage of PPML comes with substantially biased coefficient estimates, the price for analytical convenience seems too high. In this situation, the alternative of using fitted export and import flows when inferring the multilateral resistance terms (Fally 2015, p78) seems a better option. It is often attractive in applied work to use curated trade datasets such as the CEPII BACI database (Gaulier and Zignago 2010), the IMF Direction of Trade database (Marini, Dippelsman and Stanger 2018) used by Head and Mayer (2014) or the GTAP trade database (Gehlhar, Zhi and Yao 2010) used by Fally (2015) rather than raw data from UN COMTRADE. The curators of these databases add value, sometimes by choosing between the two estimates available for many trade flows—one from the reporter and one from its partner—on the basis of data quality, and sometimes by replacing missing values for intermittent reporters with trade patterns from nearby years or with real data from other than their primary sources. Alternatively, researchers 24 can work with primary data but use similar data preparation steps to reduce the number of missing values and improve data quality. Whoever processes the data, it is important to ascertain carefully what has been done in the “terra incognita” where neither origin nor destination has reported data. Have these values been set to zero as in the widely analyzed trade database used by SST (2006)? Or have model- based predictions been introduced to fill out the dataset for their needs? Given the risk of bias— of unknown direction—from use of missing values identified in this paper, and the futility of reverse-engineering modelled data—excluding these data from the analysis seems likely to be preferable to their inclusion. When missing data have been set to zero, this does not upset the adding up conditions emphasized by Fally (2015). The problem of unreported data remains important. Even though the number of economies reporting trade data to COMTRADE more than doubled between 1992 and 2014, over 60 economies identified as partners in the 2014 COMTRADE dataset did not report to COMTRADE. This problem is probably even more serious with dependent variables other than trade, such as foreign investment, foreign aid or migration (Artuc et al 2014). And the problem is not unique to curated data sets. Missing values masquerading as zeros arise when bilateral trade data arranged by importer-exporter are arranged into a bilateral trade matrix using tools such as an EXCEL pivot table. VII. Summary and Conclusions Having good estimates of the key parameters for the gravity model has become critically important because of its increasingly wide use in economic theory, economic policy and 25 hypothesis testing (Costinot and Rodríguez-Clare 2014). Unfortunately, widely used estimators generate quite different results, making it particularly important to recognize which are suffering from bias due to econometric challenges in estimating the gravity model. This paper considers three major econometric challenges for estimation of the gravity model—limited-dependent variable bias; a combination of heteroscedasticity and nonlinearity; and missing data mapped to zeros. Many attempts have been made to assess, using Monte Carlo simulations, which estimators are unbiased. Unfortunately, many of these have ignored key features of the underlying data generating processes—whether by omitting all zeros from the analysis; choosing arbitrary distributions for key variables; or examining only arbitrary patterns of heteroscedasticity. This paper follows Leung and Yu (1996) in developing Monte Carlo simulations that reflect the key parameters of the system. The analysis begins by identifying and removing nearly 2100 observations from the widely used SST (2006) dataset that are identifiable as missing values mapped to zero. Simple quantile plots supported use of log-linear rather than linear specifications for the Monte Carlo distributions of predicted trade flows and model residuals, so this element of past practice was continued. The estimated degree of heteroscedasticity was quite similar across three approaches to estimation—OLS, PPML, and Eaton-Kortum corrected for heteroscedasticity (EK-H)—with larger proportional deviations from small trade flows than from large. There were large differences in the distribution of predicted trade from the PPML model, which predicts censored trade, and the EK-H model, which uses an unbounded latent variable. To ensure robustness, Monte Carlo simulations were performed with both sets of parameters. The Monte Carlo results point to very large downward bias in coefficients when the traditional estimator (OLS in logs) is used. The EK-H estimator accounting for heteroscedasticity 26 exhibits almost no bias despite the large share of zeros in the sample. The PPML estimator exhibits much less bias than the traditional OLS estimator but, at 13 to 18 percent downward bias, is much less satisfactory than the EK-H estimator. For those who would prefer to use a “canned” package, a weighted Tobit estimator suffered from downward bias of 8 to 15 percent. While these recommendations on choice of estimator are specific to the type of dataset used—a cross-sectional matrix of total trade flows—the framework for analysis provides a basis for identifying the least biased estimators in other contexts. Like all research, this paper leaves many important and interesting questions to be addressed. Although the Monte Carlo simulations were based on key features of the data generating process, the results don’t appear to capture all features of the data. In particular, the real-world analysis—on only one sample—found lower estimates of key parameters with PPML than with OLS, while the Monte Carlo analysis suggested greater downward bias using OLS. With the insights on heteroscedasticity and selection bias in this paper, future researchers may be better equipped to add additional features, such as the preferences for variety and extensive margin growth emphasized in Helpman, Melitz and Rubinstein (2008), to help explain these differences. 27 References Anderson, James E. and Eric van Wincoop, “Gravity with Gravitas: A Solution to the Border Puzzle,” American Economic Review 93:(2003), 170-92. Anderson, James E., “The Gravity Model,” Annual Review of Economics 3(2011),133-60. Arabmazar, Abbas. and Peter Schmidt, “Further Evidence on the Robustness of the Tobit Estimator to Heteroscedasticity,” Journal of Econometrics 17 (1981), 253-8. Arabmazar, Abbas and Peter Schmidt, “An Investigation of the Robustness of the Tobit Estimator to Non-Normality,” Econometrica 50:4 (1982), 1055-63. Arkolakis, Costas, Arnaud Costinot and Andrés Rodríguez-Clare, “New Trade Models, Same Old Gains?” American Economic Review 102:1 (2012), 94-130. Artuc, Erhan, Frédéric Docquier, Caglar Özden and Chrisopher Parsons, “A Global Assessment of Human Capital Mobility: The Role of Non-OECD Destinations,” World Development 65 (2014), 6-26. Arvis, Jean-François and Ben Shepherd, “The Poisson Quasi-Maximum Likelihood Estimator: a Solution to the ‘Adding Up’ Problem in Gravity Models,” Applied Economics Letters 20:6 (2014), 515-9. Cameron, A. Colin and Pravin Trivedi, Microeconomics Using Stata, Stata Press, College Station, TX (2009). Carson, Richard and Yixiao Sun “The Tobit Model with a Non-Zero Threshold,” Econometrics Journal 10 (2007), 488–502. Chaney, Thomas, “Distorted Gravity: The Intensive and Extensive Margins of International Trade,” American Economic Review 98:4 (2008),1707–21. 28 Chaney, Thomas, “The Gravity Equation in International Trade: an Explanation,” Journal of Political Economy 126:1(2018),150-77. Chitu, Livia, Barry Eichengreen and Arnaud Mehl, “History, Gravity and International Finance,” Journal of International Money and Finance 46 (2014),104-29. Costinot, Arnaud and Andrés Rodríguez-Clare, “Trade Theory with Numbers: Quantifying the Consequences of Globalization,” ch 3 in Gopinath, G., Helpman, E. and Rogoff, K. eds Handbook of International Economics, Vol 4 Elsevier, Amsterdam (2014), http://dx.doi.org/10.1016/B978-0-444-54314-1.00004-5 Donald, Stephen, “Two Step Estimation of Heteroskedastic Sample Selection Models,” in Journal of Econometrics 65:2 (1995), 347-80. Eaton, Jonathan and Samuel Kortum, S. “Trade in Capital Goods,” European Economic Review 45 (2001),1195-235. Eaton, Jonathan, Samuel Kortum and Sebastian Sotelo, “International Trade: Linking Micro and Macro,” in Acemoglu, D., Arellano, M. and Dekel, E. eds. Advances in Economics and Econometrics, Tenth World Congress, Vol II: Applied Economics Cambridge University Press for the Econometric Society (2013). Eaton, Jonathan. and Akiko Tamura, “Bilateralism and Regionalism in Japanese and US Trade and Direct Foreign Investment,” Journal of the Japanese and International Economies 8 (1994), 478-510. Fally, Thibault, “Structural Gravity and Fixed Effects,” Journal of International Economics 97:1 (2015), 75-86. Feenstra, Robert, Advanced International Trade: Theory and Evidence. Princeton University Press, Princeton, New Jersey (2004). 29 Feenstra, Robert, Robert Lipsey and Harry Bowen, “World Trade Flows, 1970-1992, With Production and Tariff Data,” NBER Working Paper 5910, National Bureau of Economic Research, Cambrige MA (1997). https://www.nber.org/papers/w5910.pdf Gaulier, G. and Zignago, S. (2010), “BACI: International Trade Database at the Product Level: the 1994-2007 Version,” CEPII Working Paper 2010-23, CEPII, Paris. Gehlhar, Mark, Zhi Wang and Shunli Yao, “GTAP 7 Data Base Documentation- Chapter 9.A: Reconciling Merchandise Trade Data,” (2010) https://tinyurl.com/yyh7a5ua Greene, William. “On the Asymptotic Bias of the Ordinary Least Squares Estimator of the Tobit Model,” Econometrica 49:2 (1981), 505-13. Head, Keith. and Thierry Mayer, “Gravity Equations: Workhorse, Toolkit and Cookbook,” in Gopinath, Gita, Ehanan Helpman, and Ken Rogoff, eds Handbook of International Economics, Vol 4: (2014), 131-201. Heckman, James, “Sample Selection Bias as a Specification Error,” Econometrica 47: (1979), 153-61. Helpman, Elhanan, Marc Melitz and Yona Rubinstein, “Estimating Trade Flows: Trading Partners and Trading Volumes,” Quarterly Journal of Economics CXXIII:2 (2008), 441- 87. Jochmans, Koen. “Two Way Models for Gravity,” Review of Economics and Statistics 99:3: (2017), 478–85. Leung, Siu. Fai. and Shihti Yu, “On the choice between sample selection and two-part models,” Journal of Econometrics 72 (1996), 197-229. Marini, Marco, Robert Dippelsman and Michael Stanger “New Estimates for Direction of Trade Statistics,” WP/18/16, (2018) International Monetary Fund, Washington DC. 30 Martin, Will and Cong Pham, “Estimating the Gravity Model when Zero Trade Flows are Frequent,” Working Paper 2008_03, (2008), Deakin University. https://tinyurl.com/ybwj2hrk Martin, Will and Cong Pham, “Estimating the Gravity Model when Zero Trade Flows are Frequent and Economically Determined,” Applied Economics 52:26 (2020), 2766-79. Olsen, Randall, “Approximating a Truncated Normal Regression with the Method of Moments,” Econometrica 48:5 (1980), 1099-1105. Ottaviano, Gianmarco, “European Integration and the Gains from Trade,” ch 12 in Badinger, Harald and Volker Nitsch, V. eds Routledge Handbook of the Economics of European Integration, Routledge, London and New York (2016). Redding, Stephen and Anthony Venables, “Economic Geography and International Inequality,” Journal of International Economics 62:1 (2004), 53–82. Santos Silva, João and Silvana Tenreyro, “The Log of Gravity,” Review of Economics and Statistics 88 (2006), 641-58. Santos Silva, João and Silvana Tenreyro, (2011), “Further Simulation Evidence on the Performance of the Poisson Pseudo-maximum Likelihood Estimator,” Economics Letters 112:220–2. Tinbergen, Jan, Shaping the World Economy: Suggestions for an International Economic Policy. The Twentieth Century Fund, New York (1962). Tobin, James, “Estimation of Relationships for Limited Dependent Variables,” Econometrica 26 (1958), 24-36. Yotov, Yoto, Roberta Piermartini, José-Antonio Monteiro and Mario Larch, An Advanced Guide to Trade Policy, UNCTAD and the World Trade Organization, Geneva (2016). 31 Online Appendix for “Making Gravity Great Again” Tables A.1 to A.3 of this Online Appendix present econometric results in parallel to those in Tables 1 to 3 of the paper. These results, for the full sample used by Santos Silva and Tenreyro (2006), are included both to show the effects on key coefficient estimates of excluding missing Non-Reporter to Non-Reporter data from the sample, and to confirm that the key Monte Carlo results reported in the paper are robust to the exclusion of these missing data. Quantile Plots for the normality of predicted trade and non-limit residuals from the Traditional Gravity model estimated using the EK-H estimator on the sample excluding Non- Reporters are presented in Figure A.1. Estimates for the Fixed Effects model using the PPML estimator on the same sample are presented in Figure A.2. These plots are presented in levels and in logs to provide guidance on which provides a better representation. Table A.1. Results for the Gravity Model with Fixed Effects, Including Non-Reporters. OLS PPML Feasible GLS Tobit Wtd Tobit EK-H Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Log distance -1.35 -43.1 -0.75 -18.5 -1.28 -44.6 -1.79 -42.5 -1.59 -39.9 -1.49 -28.4 Contiguity 0.17 1.3 0.37 4.1 -0.01 -0.1 -0.79 -4.5 -0.76 -5.2 -0.75 -3.6 Common language 0.41 6.0 0.38 4.1 0.39 6.2 0.79 9.2 0.73 8.6 0.72 7.7 Colonial tie 0.67 9.5 0.08 0.6 0.68 10.4 0.90 10.0 1.02 11.4 0.93 9.7 FTA 0.31 3.2 0.38 4.9 -0.07 -0.8 -0.49 -3.3 -1.21 -9.63 -1.38 -8.7 Fixed Effects Yes Yes Yes Yes Yes Yes Obsvns 9613 18360 9613 18360 18360 18360 R2 0.75 0.80 Hetero a 2.0 43.0 2.7 137.7 2.41 72.6 b -0.1 -19.4 -0.1 -16.5 -0.09 -15.7 Mean SD Mean SD Ln Predicted Trade 7.6 3.1 3.5 5.2 Sources: As for Table 1. Table A.2 Results for Alternative Estimators, Including Non-Reporters OLS PPML Feasible GLS Tobit Weighted Tobit EK-H Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Coeff t-stat Log exporter's GDP 0.94 80.6 0.73 27.3 0.96 86.9 1.54 93.9 1.55 99.0 1.54 92.4 Log importer's GDP 0.80 71.9 0.74 27.0 0.83 78.5 1.19 75.7 1.22 80.8 1.20 74.6 Log exporter's GDPC 0.21 12.5 0.16 2.9 0.21 13.0 0.31 13.2 0.29 12.8 0.29 11.3 Log importer's GDPC 0.11 6.4 0.14 3.0 0.11 7.0 0.25 10.6 0.22 9.8 0.22 9.1 Log distance -1.17 -34.4 -0.78 -14.4 -1.15 -37.0 -1.64 -33.6 -1.57 -34.2 -1.39 -24.0 Contiguity 0.31 2.2 0.19 1.9 0.10 0.9 -0.77 -3.8 -0.81 -4.7 -0.82 -3.5 Common language 0.68 10.6 0.75 5.5 0.69 11.4 1.11 12.7 1.06 12.4 0.95 9.0 Colonial tie 0.40 5.8 0.03 0.2 0.38 5.9 0.59 6.4 0.64 7.0 0.60 5.5 Landlocked exporter -0.06 -1.0 -0.86 -5.5 -0.09 -1.5 -0.05 -0.6 -0.11 -1.3 0.02 0.3 Landlocked importer -0.66 -10.5 -0.70 -5.0 -0.65 -10.4 -0.70 -8.1 -0.76 -8.8 -0.61 -7.2 Exporter’s remoteness 0.47 6.0 0.66 4.9 0.47 6.4 0.77 6.7 0.74 6.8 0.55 4.8 Importer’s remoteness -0.20 -2.5 0.56 4.7 -0.13 -1.7 -0.19 -1.6 -0.14 -1.2 -0.20 -1.7 FTA 0.49 4.7 0.18 2.1 0.18 2.0 0.46 2.7 -0.25 -1.9 -0.29 -2.0 Openness -0.17 -3.5 -0.11 -0.8 -0.16 -3.3 0.56 8.5 0.55 8.3 0.74 9.1 - Constant -28.49 -26.2 -32.33 -15.7 -30.50 -30.2 -55.53 -34.9 -57.29 -38.3 55.76 -34.8 Obsvns 9613 18360 9613 18360 18360 18360 R2 0.66 0.860 0.720 0.25 0.27 Hetero a 2.21 48.1 2.73 145.7 b -0.11 -18.4 -0.09 -20.3 Mean SD Mean SD Ln Predicted Trade 8.0 2.8 3.3 5.2 Table A.3. Coefficient Estimates and Standard Errors from Monte Carlo Results EK-H Parameters PPML Parameters � β Std Error � β Std Error OLS 0.58 0.09 0.48 0.01 PPML 0.84 0.17 0.79 0.11 Tobit 0.89 0.001 0.91 0.01 EK-H 0.9999 0.008 0.9999 0.014 Note: Based parameters estimated including Non-Reporters, with 50 percent zeros. Figure A.1. Quantile Plots: EK-H Model, Traditional Gravity, Excl Non-Reporters 1. Predicted Values, logs 2. Predicted Values, Levels 1.000e+09 20 10 Linear prediction 5.000e+08 zblevel 0 -10 0 -20 -20 -10 0 10 20 -1.000e+08 -50000000 0 50000000 1.000e+0 Inverse Normal Inverse Normal 3. Residuals, logs 4. Residuals, levels Figure A.2. Quantile Plots: PPML Model, Fixed Effects Gravity, Excl Non-Reporters 1. Predicted Values, logs 2. Predicted Values, Levels 3. Residuals, logs 4. Residuals, levels Reference Santos Silva, João and Silvana Tenreyro, “The Log of Gravity,” Review of Economics and Statistics 88 (2006), 641-58.