How Much Does Physical Infrastructure Contribute to Economic Growth? An Empirical Analysis

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


Introduction
Policy makers, particularly in developing countries, often face a critical question: how much should they invest in physical infrastructure from their scarce financial resources, both those generated domestically and those provided by foreign sources such as bilateral and multilateral donors and foreign direct investment? This question is also important for international development institutions such as the World Bank Group and regional development banks that wish to maximize their impact on economic development and social welfare in recipient economies. Timilsina et al. (2020) review a large body of literature and conclude that no consensus exists on the impacts of infrastructure investment on economic growth. Some existing studies show a strong positive relationship between infrastructure development and economic growth, whereas others find a mildly positive relationship or no relationship. Many factors are responsible for these varying results, such as differences in methods, differing approaches to measuring infrastructure development, the varying development stages of countries included in the sample, varying time periods, and geographical factors such as high or low population density (Timilsina et al., 2020; Elburz et al., 2017).
To further illuminate the role of infrastructure in economic growth and development, this study uses a dynamic panel data model to evaluate the contributions of three main categories of infrastructure: transport, electricity, and telecommunications, to growth in a panel of 87 countries over the period 1992 to 2017. Our main estimate uses the pooled mean group estimator (Pesaran et al., 1999) to estimate the effects of these and other inputs on growth.
The best previous research on this topic (see Burke et al., 2018), such as Calderón et al. (2015), is based on data that end in 2000 and does not include new types of infrastructure such as mobile phones and internet connections. Our study makes four main contributions. First, we expand the types of infrastructure considered to include these new categories, specifically mobile phones. Second, we update the analysis to include the most recent available data (2017). Third, we provide separate estimates for developed and developing countries. Finally, we estimate short- and long-run elasticities of GDP with respect to infrastructure.
Identifying the effect of infrastructure on economic growth and development is challenging because infrastructure is likely to be built in the expectation that there will be demand for it, creating a reverse causality challenge (Cook, 2011). There are also likely to be lags of varying lengths before the full extent of economic benefits is received from infrastructure. Identifying the effects of specific types of infrastructure will also be challenging because the provision of electricity, transport, and telecommunications infrastructure is likely to be positively correlated. For example, World Bank (2018) finds that the impact of improved access to electricity in rural areas is positively related to road access. One way to address this issue, which has been used in some previous research (e.g., Calderón et al., 2015), is to use a dynamic model, which allows us to test the effect of past changes in infrastructure on growth, while testing for the weak exogeneity of infrastructure inputs.
Although several studies show a strong positive relationship between infrastructure and economic growth in less developed countries deprived of adequate infrastructure (Calderón and Servén, 2010; Kodongo and Ojah, 2016; Chakamera and Alagidede, 2018), whether this finding holds for industrialized economies remains an open question. Is there a threshold level of economic development (measured in terms of per capita GDP or a human development indicator) below which the relationship between infrastructure and economic growth is stronger, and above which the relationship is weak or absent? Separating the countries into groups by income, using the World Bank classification of countries into high-income, middle-income, and low-income countries, can help answer this question. Therefore, we also provide separate estimates for a group of low- and middle-income countries and a panel of high-income economies.
Our study finds larger effects of infrastructure on economic output than found by the previous best studies. We also find that the effects of infrastructure are higher after 1991 than before that date. Infrastructure has larger effects in developing economies than in industrialized economies.
The paper is organized as follows. Section 2 presents a review of previous research on the relationship between infrastructure and economic growth. The next two sections introduce the data and econometric methods used. Section 5 presents and interprets the results. Section 6 draws key conclusions and policy insights.

Previous Research
Several early studies (e.g., Aschauer, 1989; Munnell, 1990; Duffy-Deno and Randall, 1991; Garcia-Milla et al., 1992; Rives and Heaney, 1995; Wylie, 1996; Morrison and Schwartz, 1996) investigate the role of public infrastructure in economic growth. For example, using state-level data in the United States, Munnell (1990) finds that public capital has a significant, positive impact on output. Similarly, using data for 48 U.S. states from 1969 to 1983, Garcia-Milla et al. (1992) find a positive relationship between public infrastructure in education and highways and gross state product. Studies such as Rives and Heaney (1995) show that the impacts of infrastructure are higher in the local economy than in the national economy. Wylie (1996) reports higher output elasticities of infrastructure investment in Canada than in the United States. These studies, however, suffer from methodological limitations, such as endogeneity between the public capital stock and economic performance, common trends inducing spurious correlation, reverse causality where the causation also runs from the economic measures to public infrastructure investment, measurement errors, and limited data availability. Some of these problems have been addressed in later research. However, later studies report that the relationship between infrastructure investment and economic growth is either very weak or absent, at least in the United States. Using a meta-analysis, Elburz et al. (2017) conclude that there exists no relationship between infrastructure investment and economic output. Timilsina et al. (2020) find that existing studies do not agree on the relationship between infrastructure and economic growth.
The early studies mentioned above focused on industrialized economies, particularly those in North America. The subsequent literature, however, extends the research frontier to cover both industrialized and developing countries. Many of these studies use the stocks of physical infrastructure (electricity generation capacity, roads, landline telephone systems) instead of public expenditure on infrastructure. Calderón and Servén (2010) use infrastructure quantity and quality indices to aggregate heterogeneous infrastructure assets. The dependent variable is the non-overlapping five-year average GDP growth rate over their sample period. They applied the system GMM developed by Arellano and Bond (1991) and Arellano and Bover (1995) to a dynamic panel to address the reverse causality problem, employing both internal instrumental variables and external instruments (demographic variables). They find positive and significant effects on economic growth for both indices and argue that infrastructure development raised the growth rate globally by 1.6 percentage points per annum in 2001-2005 relative to 1991-1995. Following Calderón and Servén's approach, Kodongo and Ojah (2016) and Chakamera and Alagidede (2018) both also used system GMM to estimate the relationship between changes in an infrastructure index and economic growth in Sub-Saharan Africa. Chakamera and Alagidede (2018) also find Granger causality from infrastructure to growth but not the reverse. It should be noted that difference and system generalized method of moments (GMM) techniques may suffer from problems caused by weak internal instruments (Bun and Windmeijer, 2010; Bazzi and Clemens, 2013).
Using a synthetic infrastructure index composed using principal component analysis from electricity generation capacity, total road length, and the number of fixed telephone lines, Calderón et al. (2015) conducted a dynamic panel analysis for a sample of 88 countries over the period 1960-2000. They use the pooled mean group estimator (Pesaran et al., 1999) to estimate the model focusing on a long-run production function relationship between infrastructure variables, other manufactured capital, human capital, and GDP. They found a positive and significant long-run effect of infrastructure on GDP with an elasticity of 0.07 to 0.1 depending on specification. The authors could not reject the null that infrastructure is weakly exogenous, which helps to assuage reverse causality concerns. Thus, their study provides macro-level evidence that infrastructure capital can deliver economic dividends. They find no evidence that the effects of infrastructure on GDP vary across countries at different development levels.
Some studies focus their analysis at the regional level, such as Sub-Saharan Africa (Estache et al., 2005; Nketiah-Amponsah, 2009; Kodongo and Ojah, 2016; Chakamera and Alagidede, 2018). Reviewing the literature focused on Sub-Saharan Africa, Ajakaiye and Ncube (2010) report that most studies conducted for this region show a strong relationship between infrastructure investment and economic growth. Several studies have been conducted at the country level. These include Lewis (1998) for Ghana and Mostert and Heerden (2015). These studies find that the regional disparity in economic development within a country can be explained through the level of infrastructure investment in its regions.
There are also many micro-level studies of the effects of increasing access to electricity or other forms of infrastructure. While these micro studies typically suggest positive impacts of electrification on income and other development outcomes, more recent quasi-experimental approaches such as randomized controlled trials typically find a smaller impact for electrification than earlier studies did (Lee et al., 2020). This could be a result of a limited time window for assessing the outcomes of interventions as it takes time to invest in the complementary inputs that are needed to increase income.

Overview
We estimate the impact of infrastructure stocks on economic growth using a large panel data set. Following Calderón et al. (2015), we estimate an infrastructure-augmented aggregate output function using the PMG estimator. Aggregate output (i.e., real GDP) is a function of physical capital, human capital, and infrastructure variables. Initially, we assembled a panel data set for 189 countries over the period 1992 to 2017. Due to missing data and in order to ensure a balanced panel for the PMG estimation, the sample was reduced to 87 countries. Sample countries are listed in the Appendix. We also carry out the analysis for developing and developed countries separately. There are 48 low- and middle-income countries, as classified by the World Bank (see footnote 4), in the sample. Data sources and processing are described in the Appendix.

Principal Components
To reduce the dimension of the data for the PMG estimation, we use principal component analysis. We use the five variables for which we have time series for 1992 to 2017 for the 87 countries. They are the natural logarithms of the following variables per 1,000 workers: the total length of road networks in kilometers, the total length of railways in kilometers, electric power generation capacity in megawatts, the number of mobile phone subscribers, and the number of land-line telephone subscribers. We removed country means from each of these variables so that we only use the within variation. By only using variation within countries, we avoid the issue of different definitions of roads in different countries. Then we standardized the variables by dividing them by their standard deviations. We computed the eigenvalues and eigenvectors of the correlation matrix, which are presented in the following tables.

Table 1a gives the eigenvalues of the five orthogonal principal components and the percentage of the variance of the original infrastructure variables each explains. We select the first two principal components as they have eigenvalues greater than one. Together, they explain 61% of the variation. This share is less than in some existing studies, such as Calderón et al. (2015), because we took out country fixed effects.

The eigenvectors in Table 1b give the coefficients, which will be used to derive the respective principal components as functions of the original variables, as explained below. The first principal component loads strongly on electricity, mobile phones, and land-line telephones, hence we interpret it as a telecommunications and electrification factor. The second principal component loads strongly on roads and railways and so we interpret it as a transport factor. To aid the interpretability of our econometric model we normalize the principal components so that the coefficients of the non-standardized infrastructure variables in each principal component sum to one. The two principal components are given by:

$$\tilde{z}_{m,it} = \frac{\sum_{j=1}^{5} (a_{jm}/\sigma_j)\, x_{j,it}}{\sum_{j=1}^{5} a_{jm}/\sigma_j}, \qquad m = 1, 2,$$

where the $a_{jm}$ are the coefficients of the eigenvectors in Table 1b and the $\sigma_j$ are the standard deviations of the original infrastructure variables, $x_{j,it}$, expressed in logs of quantities per worker with country means subtracted. The coefficients of each of these variables in the principal components are, therefore:

$$\omega_{jm} = \frac{a_{jm}/\sigma_j}{\sum_{k=1}^{5} a_{km}/\sigma_k}.$$

These coefficients are reported in Table 2. Table 3 presents summary statistics for the variables in their original units at the beginning and end of our sample period, which also illustrates the relative growth in the variables over the period. All variables enter our models in logarithmic form and the panel regression models also include country fixed effects or individual country intercepts, which together reduce the variation in magnitude and variance seen in Table 3.

4 https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
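The construction described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the function name `infrastructure_components`, its arguments, and the use of NumPy are assumptions, and the eigenvalue-greater-than-one rule is applied only when `n_keep` is not supplied.

```python
import numpy as np

def infrastructure_components(x, country, n_keep=None):
    """Within-country PCA with components normalized so that the
    coefficients on the non-standardized variables sum to one.

    x       : (n_obs, n_vars) array of log infrastructure per worker
    country : (n_obs,) array of country labels
    n_keep  : number of components to keep (hypothetical argument);
              defaults to the number of eigenvalues greater than one
    """
    x = np.asarray(x, dtype=float)
    country = np.asarray(country)

    # Remove country means so that only within-country variation is used.
    xw = x.copy()
    for c in np.unique(country):
        mask = country == c
        xw[mask] -= xw[mask].mean(axis=0)

    # Standardize by the standard deviation of each demeaned variable.
    sd = xw.std(axis=0, ddof=1)
    z = xw / sd

    # Eigen-decomposition of the correlation matrix, largest eigenvalue first.
    corr = np.corrcoef(z, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]

    if n_keep is None:
        n_keep = int((eigval > 1.0).sum())

    # Coefficients on the non-standardized variables, rescaled to sum to one.
    # (If a column of raw coefficients sums to nearly zero, this rescaling
    # is numerically unstable.)
    raw = eigvec[:, :n_keep] / sd[:, None]
    weights = raw / raw.sum(axis=0)

    scores = xw @ weights  # normalized principal components
    return eigval, weights, scores
```

Because each column of `weights` sums to one, each normalized component can be read as a weighted average of the demeaned log infrastructure variables, which is what makes the later conversion to elasticities for individual infrastructure types straightforward.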

Background
Economic growth is an economy-wide, dynamic process with effects that cannot be fully captured by micro studies, and, therefore, macroeconomic analysis is important. It is likely to be easier to find evidence for causal effects using disaggregated micro-level data rather than using macro-level data (Lee et al. 2020;Asher and Novosad, 2018). This is because some variables may more easily be considered exogenous at the micro-level, and randomized trials and other field experiments are possible. But it does not seem very plausible that we can find exogenous factors that are correlated with infrastructure development across many countries but do not have direct effects on the economy to use as instrumental variables, though natural experiments might be available for specific countries. So, identifying exogenous shocks using time series or panel data methods is probably the only viable approach.
Ideally, we would identify exogenous shocks to infrastructure using a structural panel vector autoregression (Pedroni, 2013). There are three main challenges to using such an approach. First, both common and country-specific shocks can be formulated. With a large cross-sectional dimension, a very large number of parameters and impulse response functions will be estimated. If we are interested in the typical effects of infrastructure on development, this information will need to be summarized in some way. The best way to summarize the uncertainty in these summary measures is not clear. Second, the VAR approach requires us to model each of the dependent variables when we are only really interested in the GDP equation.
Third, we need to formulate identifying restrictions. Methods to empirically identify a panel SVAR using independent component analysis (see Maxand, 2020) have not yet been developed. Therefore, potentially questionable theoretical restrictions must be used. The current study can be seen as a step towards the long-term goal of developing such a model. In this paper, we use two composite infrastructure variables, the principal components described above, but we also present the results for the disaggregated variables.

Econometric Techniques
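Our main analysis uses an aggregate production function as the long-run relationship embedded in an ARDL model. We impose constant returns to scale:

$$Y_{it} = A_{it} K_{it}^{\alpha} \left(H_{it} L_{it}\right)^{1-\alpha-\sum_n \gamma_n} \prod_n Z_{n,it}^{\gamma_n} \qquad (1)$$

where $Y_{it}$ is real GDP in the $i$th country at time $t$, $A_{it}$ is total factor productivity, $K_{it}$ is physical capital, $H_{it}$ is human capital per worker, and $L_{it}$ is labor. The $Z_{n,it}$ are different types of infrastructure capital. As infrastructure is part of the overall capital stock, $K$, the coefficient vector $\gamma$ will reflect the effect of allocating some of the capital stock to infrastructure rather than other forms of capital such as private structures and equipment. This model implies that though GDP is defined as the income of capital and labor, if capital and labor both increase by the same percentage, but infrastructure remains constant, then, if the $\gamma_n$ are greater than zero, GDP increases by less than that percentage so that there are decreasing returns. Taking logs, subtracting the log of aggregate human capital from both sides, and introducing an error term, $\varepsilon_{it}$, which is likely to be serially correlated, we have:

$$y_{it} - h_{it} = a_{it} + \alpha \left(k_{it} - h_{it}\right) + \sum_n \gamma_n \left(z_{n,it} - h_{it}\right) + \varepsilon_{it} \qquad (2)$$

where $a_{it}$ is $\ln A_{it}$, $y_{it}$ is $\ln(Y_{it}/L_{it})$, $k_{it}$ is $\ln(K_{it}/L_{it})$, $h_{it}$ is $\ln H_{it}$, and the $z_{n,it}$ are $\ln(Z_{n,it}/L_{it})$. As described above, we use the first two principal components of the infrastructure variables in our estimation. The resulting model is:

$$y_{it} - h_{it} = a_{it} + \alpha \left(k_{it} - h_{it}\right) + \tilde{\gamma}_1 \left(\tilde{z}_{1,it} - h_{it}\right) + \tilde{\gamma}_2 \left(\tilde{z}_{2,it} - h_{it}\right) + \varepsilon_{it} \qquad (3)$$

where $\tilde{z}_{1,it}$ and $\tilde{z}_{2,it}$ are the first two principal components. Then the elasticity of GDP with respect to form $j$ of infrastructure is:

$$\gamma_j = \tilde{\gamma}_1 \omega_{j1} + \tilde{\gamma}_2 \omega_{j2} \qquad (4)$$

where $\tilde{\gamma}_1$ and $\tilde{\gamma}_2$ are the two regression parameters of the principal components in (3) and $\omega_{j1}$ and $\omega_{j2}$ are the coefficients of the respective infrastructure types in the first and second columns of Table 2.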
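As a small illustration of the last step, the elasticity for each infrastructure type is the regression coefficient of each principal component multiplied by that type's coefficient in the component, summed over the two components. The sketch below assumes NumPy; the function name and all numbers are hypothetical placeholders and are not taken from Table 2 or from the paper's estimates.

```python
import numpy as np

def infrastructure_elasticities(pc_coefficients, type_weights):
    """Map principal-component regression coefficients to elasticities
    for the individual infrastructure types.

    pc_coefficients : (n_components,) regression coefficients of the components
    type_weights    : (n_types, n_components) coefficients of each type in
                      each normalized component (columns sum to one)
    """
    pc_coefficients = np.asarray(pc_coefficients, dtype=float)
    type_weights = np.asarray(type_weights, dtype=float)
    return type_weights @ pc_coefficients

# Hypothetical values, for illustration only.
example_weights = np.array([
    [0.05, 0.50],   # roads
    [0.05, 0.40],   # railways
    [0.40, 0.05],   # electricity
    [0.10, 0.00],   # mobile phones
    [0.40, 0.05],   # fixed-line phones
])
example_pc_coefficients = np.array([0.20, 0.10])
example_elasticities = infrastructure_elasticities(
    example_pc_coefficients, example_weights
)
```

Because each column of the weights sums to one, the elasticities of the individual types add up to the sum of the two principal-component coefficients.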
Our main objective is to examine the long-run relationship between infrastructure and economic growth. Pesaran (2006) pointed out that three specification issues need to be addressed when estimating the long-run parameters of (3). First, the variables in the long-run relationship (3) should cointegrate, and all the variables should have a unit root in levels and be stationary in first differences. Hence, we perform both the IPS (Im et al., 2003) and cross-sectionally augmented panel unit-root (CIPS) tests (Pesaran, 2007) to find the order of integration of the variables considered in the model. We use the Kao (1999), Pedroni (2004), and Westerlund (2005) cointegration tests. The first two tests are based on modified Dickey-Fuller and modified Phillips-Perron test statistics, and the Westerlund test is based on a variance ratio for the residuals. The alternative hypothesis for each test is that the time series in each and every country are cointegrated. In order to treat the estimated parameters as the causal effect of infrastructure on growth, we need to test for the weak exogeneity of the input variables. We do this using the test proposed by Johansen (1992) as laid out in Calderón et al. (2015). This estimates a VAR in first differences of the input variables, adding first differences of output and the error correction terms from the PMG model. If the error correction terms are jointly insignificant, then the input variables are weakly exogenous.
Second, there is the issue of cross-sectional dependence. If cross-sectional dependence in the residuals is not modeled, this can seriously affect inference and even result in bias in the estimates of the regression coefficients (Söderbom et al., 2014). To deal with cross-sectional dependence, we removed the cross-sectional mean from the data before performing the unit-root tests, but we also employ the Pesaran (2021) CD test to check the cross-sectional independence of the residuals. Third, as countries vary regarding their income level, resources, geographic locations, etc., we should take cross-country parameter heterogeneity into consideration when estimating Equation (3) (Pesaran et al., 1999).
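For readers unfamiliar with the CD test, the following sketch computes the statistic in its usual balanced-panel form: the sum of all pairwise correlations of the residuals across countries, scaled by sqrt(2T / (N(N-1))), which is approximately standard normal under the null of cross-sectional independence. The function name is hypothetical and NumPy is assumed; this is not the code used for the paper.

```python
import math
import numpy as np

def pesaran_cd(resid):
    """CD statistic and two-sided p-value for a balanced panel.

    resid : (N, T) array, one row of residuals per country
    """
    resid = np.asarray(resid, dtype=float)
    n, t = resid.shape

    # N x N matrix of pairwise correlations computed over time.
    rho = np.corrcoef(resid)

    # Sum the correlations over all pairs i < j.
    upper = np.triu_indices(n, k=1)
    cd = math.sqrt(2.0 * t / (n * (n - 1))) * rho[upper].sum()

    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|cd|)).
    p_value = math.erfc(abs(cd) / math.sqrt(2.0))
    return cd, p_value
```

A large absolute value of the statistic (small p-value) indicates that the residuals are correlated across countries, in which case inference that assumes independence would be unreliable.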
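We use the Pesaran et al. (1999) Pooled Mean Group (PMG) estimator, which is embedded in an autoregressive distributed lag (ARDL) framework and models heterogeneous short-run dynamics:

$$\Delta w_{it} = \phi_i \left( w_{i,t-1} - \boldsymbol{\theta}' \mathbf{x}_{i,t-1} \right) + \sum_{j=1}^{p-1} \lambda_{ij} \Delta w_{i,t-j} + \sum_{j=0}^{q-1} \boldsymbol{\delta}_{ij}' \Delta \mathbf{x}_{i,t-j} + \mu_i + \tau_t + u_{it} \qquad (5)$$

where $w_{it}$ is the dependent variable and $\mathbf{x}_{it}$ is the vector of explanatory variables in (3), $\phi_i$ is the country-specific adjustment coefficient, $\boldsymbol{\theta}$ is the vector of long-run coefficients common to all countries, $\mu_i$ is a country fixed effect, $\tau_t$ is a time effect common across countries, and $u_{it}$ is an independent and identically distributed random error term. The first term on the RHS models adjustment towards the long-run equilibrium, while the second and third terms model short-run dynamics. We used the replication file provided by RATS for Pesaran et al. (1999) to develop our code, using the static fixed effects estimates as the initial coefficient vector.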
We also provide static fixed effects, dynamic fixed effects (DFE), and mean-group (MG) estimates. These estimation techniques consider different homogeneity restrictions on the coefficients. The mean-group estimation permits the coefficients to differ in both the short-run dynamics and the long-run equilibrium relationship, whereas the dynamic fixed-effects model imposes homogeneity of all parameters. PMG allows the short-run dynamics, but not the long-run relationship, to vary across countries. All these estimates control to a certain degree for confounding variables that vary across countries.
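The difference between the homogeneity restrictions can be illustrated with a deliberately simplified static sketch: the mean-group estimate averages country-by-country regressions, whereas the fixed-effects estimate imposes common slopes and lets only the intercepts differ. This is an assumption-laden illustration using NumPy with hypothetical function names; it omits the lagged terms of the dynamic models estimated in the paper.

```python
import numpy as np

def mean_group(y, x):
    """Average of country-specific OLS slopes and its standard error.

    y : (N, T) dependent variable
    x : (N, T, K) regressors
    """
    n, t = y.shape
    slopes = []
    for i in range(n):
        design = np.column_stack([np.ones(t), x[i]])  # country intercept
        coef, *_ = np.linalg.lstsq(design, y[i], rcond=None)
        slopes.append(coef[1:])
    slopes = np.array(slopes)
    estimate = slopes.mean(axis=0)
    std_error = slopes.std(axis=0, ddof=1) / np.sqrt(n)
    return estimate, std_error

def fixed_effects(y, x):
    """Within estimator: common slopes, country-specific intercepts."""
    y_dm = y - y.mean(axis=1, keepdims=True)
    x_dm = x - x.mean(axis=1, keepdims=True)
    k = x.shape[2]
    coef, *_ = np.linalg.lstsq(x_dm.reshape(-1, k), y_dm.reshape(-1), rcond=None)
    return coef
```

When the true slopes are the same in every country, both estimators recover them and the pooled estimate is more precise; when slopes differ, the mean-group average remains interpretable while the pooled estimate can be misleading. PMG sits between the two by pooling only the long-run coefficients.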
We also estimate the model separately for developing and developed countries. To test whether the effect of infrastructure has changed over time we also estimated the model for a 1970-1991 sample. Finally, to examine the robustness of our results, we estimate the model using the World Bank PPP GDP series.

Coefficient Estimates
We consider maximum values for p and q in (5) of three (two lags of the first differences). We use the Akaike Information Criterion based on the likelihood function in Pesaran et al. (1999) to select the optimal lag length. We do not choose different lag lengths for each country. Effectively, we are choosing a maximum lag length for all countries; excess lag coefficients can be estimated as zero. Table 6 shows that the optimal lag length is two lags of both the dependent variable and the independent variables (one lagged first difference of each). The parameter estimates are quite similar for all models with a lag length of at least two. The infrastructure elasticities are smaller for models with a lag length of one. [...] to previous estimates such as Calderón et al. (2015). The relatively low value of the adjustment coefficient for PMG and DFE shows that GDP responds slowly to changes in infrastructure, as we would expect.
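The lag-selection step amounts to comparing an information criterion across candidate (p, q) pairs. The sketch below assumes the common convention AIC = 2k - 2 ln L, to be minimized; the paper's software may use a different but equivalent sign convention. The function name and all numbers are hypothetical.

```python
def select_lags(log_likelihoods, n_params):
    """Return the (p, q) pair with the lowest AIC, and all AIC values.

    log_likelihoods : dict mapping (p, q) to the maximized log-likelihood
    n_params        : dict mapping (p, q) to the number of estimated parameters
    """
    aic = {
        key: 2 * n_params[key] - 2 * loglik
        for key, loglik in log_likelihoods.items()
    }
    best = min(aic, key=aic.get)
    return best, aic

# Hypothetical values, for illustration only.
example_loglik = {(1, 1): 100.0, (2, 2): 110.0, (3, 3): 111.0}
example_params = {(1, 1): 5, (2, 2): 8, (3, 3): 11}
best_lags, aic_values = select_lags(example_loglik, example_params)
```

In this made-up example the third lag raises the log-likelihood by less than the penalty for the extra parameters, so the criterion selects p = q = 2.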
Focusing on the PMG estimates, the first principal component, which we interpret as mainly an electricity and telecommunications factor, has double the effect on GDP of the second principal component, which we interpret as mainly a transport factor. Together they have about the same effect on GDP as physical capital in general. As described in (4), we can also compute long-run elasticities of GDP with respect to the individual types of infrastructure. Table 8 reports these for the FE and PMG estimates. Roads, electricity, and fixed-line telephones have similarly large elasticities, while mobile phones and railways have small elasticities. Given the restrictions imposed by the principal components analysis, the relative sizes of the effects of each type of infrastructure are similar across the estimators, but all are smaller for the FE estimate. Among the telecommunications infrastructure types, we estimate a larger elasticity for fixed-line phones than for mobile phones. However, mobile phones have grown in number much more rapidly and so the small elasticity does not necessarily reflect a small contribution to growth. While the mean number of telephone lines per worker was unchanged from 1992 to 2017, the mean number of mobile phones increased by a factor of 144 or, in natural logarithms, almost 5 (Table 3). Microeconomic research has identified positive impacts on income from mobile and smart phones (Hübler and Hartje, 2016). Though we have assumed that elasticities are constant, in reality they likely depend on the level of the inputs. Table 7 includes the Pesaran (2021) CD test. This shows that the residuals are cross-sectionally independent for the MG and PMG estimates if we use a 5% significance level. We also carry out the test for weak exogeneity. The chi-square test statistic is 9.84 with three degrees of freedom and, therefore, has a p-value of 0.02. Hence, we cannot reject the null hypothesis of weak exogeneity at the 1% level, and we can interpret the PMG elasticities as causal effects.
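The reported p-value can be checked directly. For three degrees of freedom the chi-square survival function has a closed form, so the check needs only the Python standard library; the function name is hypothetical.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Uses the closed form erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    z = x / 2.0
    return (math.erfc(math.sqrt(z))
            + 2.0 / math.sqrt(math.pi) * math.sqrt(z) * math.exp(-z))

# Test statistic reported in the text; the result is roughly 0.02.
p_value = chi2_sf_3df(9.84)
```

The resulting p-value lies between 0.01 and 0.05, so the null of weak exogeneity is rejected at the 5% level but not at the 1% level, which is the threshold used in the text.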
As a robustness test, we also estimate the PMG model using GDP in 2017 PPP dollars from the World Bank World Development Indicators (Table 10). In the latter data set some countries have missing data in the first few years. We used the growth rates of RGDPO from PWT 10 to fill in these missing observations. We impose the same lag length as in our main estimate. For the World Bank GDP data, the coefficient estimates are very similar to our main estimate. However, we can reject the weak exogeneity hypothesis for this estimate at the 5% level.

Development Status
Table 10 also reports results for the PMG estimator separately for developing and developed countries. Tables 9a and 9b report the AIC for alternative lag lengths for these subsamples. Based on these, we chose p=1, q=3 for the developing country sample and p=3, q=1 for the developed country sample. The infrastructure coefficients are smaller in the developed country sample than in the developing country sample. Additionally, we can reject weak exogeneity of the inputs in the developing country sample. This seems to be the result of choosing the shortest lag length for the dependent variable for this sample, as we see the same phenomenon for the 1970-91 sample.

Earlier Decades
Table 10 reports results using a sample for the years 1970 to 1991. Data are available for 79 countries for this period (Table A2). Many countries were added as well as dropped. In total, 21 countries appear in this sample that do not appear in the later sample, 58 countries appear in both samples, and 29 do not appear in this earlier sample but are in the later sample.
We recomputed the principal components, again selecting the first two principal components. Table 9c shows that the optimal lag lengths are p=q=1. The estimated parameters are somewhat similar to those for the developing country sample, with a large elasticity of physical capital compared to human capital in this sample.
Comparing the coefficients of individual infrastructure variables in Table 11 to those for the post-1991 period in Table 8, we find that they are of similar magnitude but are, on the whole, a little smaller in the earlier decades, especially for electricity and phones. As we mentioned above, our main estimates find a much larger effect for infrastructure on growth than the best previous studies. This could partly be explained by an increase in the size of the effect in the three recent decades compared to the previous two.

Long-Run versus Short-Run Effects
As we have estimated a dynamic panel model, we can compute both short-run and long-run elasticities. We recover the average short-run parameters across the sample by estimating a mean-group model imposing the cointegrating vector estimated by PMG. Figure 2 reports the impulse response curves of GDP to changes in the two principal components for our main estimates.

Figure 2. Response of GDP to increase in infrastructure

Figure 2 shows that in the full sample, not only does transport infrastructure, which is mainly associated with the second principal component, have a smaller long-run effect than electricity and telecommunications, but in the short run its effect is negative. However, even for electricity and telecommunications, which are mainly associated with the first principal component, the short-run effect is only about a quarter of the final long-run effect. This suggests that studies that find small effects of infrastructure provision (e.g., Lee et al., 2020) need to adopt a much longer-run time frame for program evaluation. Similarly, we see that the static fixed effects estimates, which tend to converge to short-run effects (Stern, 2010), are much smaller than our long-run estimates.
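To see how a small short-run effect can coexist with a much larger long-run effect, consider a minimal error-correction sketch with one regressor and no lagged differences. This is a simplification of the dynamic model used in the paper, and the function name and all parameter values below are hypothetical, chosen only so that the impact effect is a quarter of the long-run effect.

```python
def step_response(adjustment, long_run, impact, horizon):
    """Path of y after a permanent unit increase in x at period 0, for
    dy_t = adjustment * (y_{t-1} - long_run * x_{t-1}) + impact * dx_t.

    adjustment : error-correction coefficient (negative for stability)
    long_run   : long-run elasticity of y with respect to x
    impact     : short-run (same-period) elasticity
    """
    path = []
    y_prev, x_prev = 0.0, 0.0
    for _ in range(horizon + 1):
        x = 1.0  # x stays one unit higher from period 0 onwards
        dy = adjustment * (y_prev - long_run * x_prev) + impact * (x - x_prev)
        y = y_prev + dy
        path.append(y)
        y_prev, x_prev = y, x
    return path

# Hypothetical parameters: slow adjustment, impact effect a quarter of the long run.
example_path = step_response(adjustment=-0.1, long_run=0.2, impact=0.05, horizon=100)
```

With an adjustment coefficient of -0.1, only a tenth of the remaining gap to the long-run level closes each year, so an evaluation window of a few years would capture only a fraction of the eventual effect.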

Rate of Economic Growth
We also estimated cross-section growth models that regress the rate of economic growth on $\mathbf{x}_{t-5}$, a vector of the initial values of a set of variables including infrastructure variables and initial GDP per capita. The latter variable controls for many unobserved determinants of growth. These regressions allow us to use a wider range of infrastructure variables that are not available as extensive time series, such as airports. On the whole, the coefficients of the infrastructure variables were statistically insignificant in these regressions, and we do not report them in this paper. This implies that higher levels of infrastructure are not associated with more rapid economic growth.

Conclusions and Policy Implications
We extended previous research on the role of infrastructure in economic growth to recent decades and to include new types of infrastructure, such as mobile telephones. Our study yields two major insights. First, we find larger effects of infrastructure on growth than the previous best studies found. Second, we find that infrastructure has a greater impact in developing economies than in developed economies. Both of these findings can be explained through the use of recent data sets covering developing countries where lack of infrastructure is a major development bottleneck. These findings suggest that access to infrastructure plays an important role in removing the barriers to economic growth created by a lack of infrastructure. The higher impact in the more recent period can also be attributed to the rapid expansion of telephone services (through fixed-line or mobile phone services). Inputs that are limiting factors or "binding constraints" (McCulloch and Zileviciute, 2017; Burke et al., 2018) are likely to make more difference to output if their quantity increases. Infrastructure stocks do not seem to increase the rate of economic growth.
The findings of the study imply important policy considerations. The higher impacts of telecommunications indicate that economic growth can be stimulated through better information that enhances market access for products, and through potential productivity gains from the increased substitution possibilities facilitated by information services or by better telecommunication access. Providing telecommunication services is cheaper than providing transportation and electricity services, although there would be very little substitution between them. Another policy implication is that infrastructure whose unavailability has already created barriers to economic development has to be addressed first. In other words, an optimal investment in infrastructure is critical. Instead of providing 'too much' road infrastructure, as in the United States, countries should aim for an optimal road infrastructure that maximizes the use of roads (the services obtained from road infrastructure) rather than the availability of roads (i.e., just increasing the stock of road infrastructure).
We collected data on a wide range of other infrastructure variables, such as the number of airports or the percentage of roads that are paved, but these were very limited in the time dimension or in the number of countries for which data were available. Future research on the effects of infrastructure on growth would benefit from extending collection or estimation of

APPENDIX
We extracted real GDP, capital stock, human capital, and employment data from Penn World Table 10 (Feenstra et al. 2015). We use the "rgdpna" series, which measures real GDP in constant 2017 US PPP dollars, using the growth rates from countries' national accounts. We use the "rnna" capital stock series, which is also in constant 2017 US PPP dollars, using the growth rates from countries' national accounts. We refer to this in the following as "physical capital" to distinguish it from human capital. To test the robustness of our analysis to data sources, we also used World Bank data on PPP-adjusted GDP (World Development Indicators). The human capital stock is measured by the "hc" series, which is an index based on years of schooling and returns to education (Inklaar and Timmer, 2013). Lastly, we use the "emp" series, which captures the number of persons engaged in the work force, to measure labor input. We divide GDP and capital stock by labor and take natural logarithms of these variables.
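The per-worker log transformation described above can be illustrated with a minimal sketch. The series names ("rgdpna", "rnna", "emp") come from the text; the function name, the numerical values, and the units noted in the comments are assumptions made purely for illustration and are not actual Penn World Table figures.

```python
import math


def log_per_worker(value, employment):
    """Return ln(value / employment); both inputs must be strictly positive."""
    if value <= 0 or employment <= 0:
        raise ValueError("value and employment must be positive")
    return math.log(value / employment)


# Hypothetical country-year observation (illustrative numbers only).
# Assumed units: rgdpna and rnna in millions of 2017 US PPP dollars,
# emp in millions of persons engaged.
row = {"rgdpna": 500_000.0, "rnna": 1_500_000.0, "emp": 10.0}

ln_y = log_per_worker(row["rgdpna"], row["emp"])  # log GDP per worker
ln_k = log_per_worker(row["rnna"], row["emp"])    # log physical capital per worker

print(f"ln(GDP per worker)     = {ln_y:.4f}")
print(f"ln(capital per worker) = {ln_k:.4f}")
```

Because both series are divided by the same labor input, the difference ln_k - ln_y equals the log of the capital-output ratio, which offers a quick sanity check on the constructed variables.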

Infrastructure Data
For infrastructure, we assembled data for both quantity and quality of infrastructure variables. We prepared data in three broad categories: transport, telecommunications, and energy as follows.

Transport
Road quantity: We use two editions of World Road Statistics (WRS) from the International Road Federation, covering the periods 1990-2007 and 2000-2017, to prepare the length of the road network (in kilometers). We also received data from César Calderón that are derived from similar sources. We merged the data sets as follows:
i. Where the values for 2000 in the two WRS datasets matched, we simply merged the two series.
ii. Where they did not, but the Calderón data did match the 2000-2017 WRS data, we used the Calderón data for years before 2000.
iii. Where neither matched exactly, we used the implied growth rates in either earlier data set to project back from 2000 to 1990.
iv. We linearly interpolated missing values.
v. Additionally, we deleted the data point for UAE in 2000 for being anomalously low. We removed the first 1 from the numbers for Botswana for 1992-4. We deleted the data point for Bulgaria in 2005.
vi. Finally, 11 countries had missing data for 2017. Where there was little or no growth in the road network prior to 2016, we extrapolated the missing 2017 value by carrying forward the 2016 value. There were six countries where that was possible: Belgium, Lao PDR, New Caledonia, Papua New Guinea, Thailand and Vietnam.
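The back-projection and interpolation steps of the merging procedure can be sketched as follows. This is an illustrative reading of those steps, not the authors' actual code: the function names, the dictionary representation of a series, and the kilometer values are all hypothetical.

```python
def back_project(early, late, link_year=2000):
    """Extend `late` backwards using the growth rates implied by `early`.

    Each pre-link value of `early` is rescaled so that the series meets
    `late` at `link_year`. If the two series already agree at the link
    year, the scale factor is 1 and this reduces to a plain merge.
    """
    scale = late[link_year] / early[link_year]
    merged = {year: value * scale for year, value in early.items() if year < link_year}
    merged.update(late)
    return merged


def interpolate_linear(series):
    """Fill gaps between observed years by straight-line interpolation."""
    years = sorted(series)
    filled = dict(series)
    for start, end in zip(years, years[1:]):
        step = (series[end] - series[start]) / (end - start)
        for year in range(start + 1, end):
            filled[year] = series[start] + step * (year - start)
    return filled


# Hypothetical road lengths in km (illustrative numbers only).
early = {1998: 90.0, 1999: 95.0, 2000: 100.0}   # earlier data set
late = {2000: 110.0, 2001: 112.0, 2003: 116.0}  # later data set, 2002 missing

merged = interpolate_linear(back_project(early, late))
for year in sorted(merged):
    print(year, round(merged[year], 2))
```

Rescaling by the ratio at the link year preserves the year-on-year growth rates of the earlier series while anchoring the level to the later one, which is one common way to splice two series that disagree in levels.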

Railways:
The length of railways (in km) is obtained from the World Bank's World Development Indicators database and the Knoema (2020) database, complemented by data from national sources referenced in Wikipedia. For some countries, we dropped apparently bad data and replaced them with data from official government statistics. For instance, for Australia we obtained data from the Bureau of Infrastructure and Transport Research (2020). We also linearly interpolated some missing or obviously incorrect data points.

Telecommunications
The numbers of mobile phone subscribers and telephone lines are taken from the International Telecommunication Union's (2020) World Telecommunication/ICT Indicators database.

Energy
Electricity generation capacity data (in megawatts) are extracted from the UN Energy Statistics Database (2020).
All variables are converted to per worker values, with the exception of human capital, which is already in per worker form. All variables apart from mobile phones and railways are converted to natural logarithms. As there are many zero values for the number of mobile phones and the length of railways, we use the inverse hyperbolic sine (IHS) transformation for these variables, given by $\ln\left(x + \sqrt{x^2 + 1}\right)$, instead of logarithms.
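A brief sketch may help show why the inverse hyperbolic sine, asinh(x) = ln(x + sqrt(x^2 + 1)), is a convenient substitute for the logarithm when a variable contains zeros. The function name and the test values below are illustrative assumptions; only the transformation itself is taken from the text.

```python
import math


def ihs(x):
    """Inverse hyperbolic sine: ln(x + sqrt(x**2 + 1))."""
    return math.asinh(x)


# Defined at zero, where the natural logarithm is not.
zero_value = ihs(0.0)

# The library function agrees with the explicit formula.
explicit = math.log(3.0 + math.sqrt(3.0 ** 2 + 1.0))
library = ihs(3.0)

# For large x the transform behaves like ln(2x) = ln(x) + ln(2),
# so differences across large values can be read much like log differences.
large_gap = ihs(1000.0) - math.log(2.0 * 1000.0)

print(zero_value, library - explicit, large_gap)
```

Since the transform differs from ln(x) by roughly the constant ln(2) for large values, that offset is absorbed by a regression intercept, although the approximation becomes less accurate for values close to zero.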