Policy Research Working Paper 9493

Data Transparency and Long-Run Growth

Asif Mohammed Islam
Daniel Lederman

Middle East and North Africa Region
Office of the Chief Economist
December 2020

Abstract

For centuries states have engaged in collecting data to serve various interests. In modern times, a data gap has emerged between developing and developed economies, with the latter having more advanced data systems. The authors explore the effects of data transparency on long-run growth for a sample of mostly developing economies. Data transparency is defined as the timely production of credible statistics as measured by the Statistical Capacity Index. The paper finds that data transparency has a positive effect on real gross domestic product per capita, implying a statistically significant impact on transitional growth to a higher potential level of gross domestic product per capita. The estimates indicate an elasticity of the magnitude of 0.03 percent per year, which is much larger than the elasticity of trade openness and schooling in the estimation sample. The empirics employ a variety of econometric estimators, including dynamic panel and cross-sectional instrumental variables estimators, with the latter approach yielding a higher estimated elasticity. The findings are robust to the inclusion of several factors in addition to political institutions and exogenous commodity-price and external debt-financing shocks.

This paper is a product of the Office of the Chief Economist, Middle East and North Africa Region. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at aislam@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Data Transparency and Long-Run Growth

Asif Mohammed Islam
Daniel Lederman 1

World Bank Group
Middle East and North Africa Region
Office of the Chief Economist

Originally published in the Policy Research Working Paper Series in December 2020. This version was updated in June 2021. To obtain the originally published version, please email prwp@worldbank.org.

JEL Codes: D83, E01, E02, O47

Keywords: Data transparency, statistics, economic growth

1 Lederman is the Acting Chief Economist for MENA and Islam is a Senior Economist with MNACE. The opinions expressed in this paper do not represent the views of the World Bank Group, its Board of Directors or the Governments they represent. All errors and omissions are the authors’ responsibility. We are grateful for feedback from the Economic Research Forum presentation panelists Hadi Esfahani, Khalid Abu-Ismail, Mustapha Nabli, and Noha El-Mikawy. We are also grateful to Todd Eisenstadt, Tamar Gutner and Shadi Mokhtari for feedback during the presentations at the American University School of International Services and Center for Environmental Policy. We would also like to thank Bledi Celiku, Harun Onder, and Kevin Carey for comments. We are also thankful to Aart Kraay for additional comments.

I.
Introduction The perennial question in the development economics profession is why many developing economies do not develop, and whether there are avenues that could allow them to catch up to the levels of income per capita observed in advanced economies. Recent history has witnessed the explosion of data, laying bare stark disparities in data systems between developing and advanced economies. Advanced economies are characterized by modern and coordinated data collection systems that are widely accessible to the research community. Such transparent data systems have presumably created virtuous information feedback loops whereby ideas are generated and debated based on evidence, thus helping improve the quality of public policies over time. The wealth of data and the underpinning architecture of data ecosystems in advanced economies enable them to have a monopoly in attracting research that expands the frontiers of knowledge (Das et al., 2013). In contrast, many developing economies have either lagged in their capacity to generate data or have prevented the research community, the independent media, and civil society from accessing economic data generated by the public sector. Yet, little attention has been paid to the costs of opaque data systems on economic growth in developing economies. This paper seeks to make a first contribution to the literature in this direction. At the time of writing, there are no other existing empirical analyses of the link between data transparency and subsequent economic growth. Historical data collection can be traced to the first recorded census undertaken by Babylonians around 3800 BC (Grajalez et al., 2013). The purpose and nature of data collection have evolved over time. Early data collection served rulers with the aim of accounting for wealth and power. 
Information was gathered for taxation purposes, counting of men for military recruitment and workforce, and ascertaining conquered populations and territories (World Bank 2021, forthcoming). The data were kept secret from the public and not meant to improve their lives. This raised general distrust of data collection activities among the populace. In contrast, the enlightenment ideals in eighteenth century Europe emphasized objective scientific inquiry. The role of data evolved to a means of examining society and became more public. In the late eighteenth century, statistical agencies were set up in Europe and North America to publish official statistics and inform the public. The modern responsibility of developing countries to produce statistics is well established. The United Nations’ Sustainable Development Goals (SDGs) emphasize the need for countries to generate socioeconomic indicators within the limits of each country’s capacity. The costs of such efforts are not trivial, and the trade-offs between investing in data capacity and other pressing needs are presumably difficult for developing country governments to ascertain. 2 The long-run benefits of transparency, however, could be considerable. This study explores the partial correlation between data transparency and economic growth proxied by gross domestic product (GDP) per capita. Data transparency, as measured by the Statistical Capacity Index (SCI), entails the production and availability of credible statistics. Figure 1 illustrates the unconditional correlation between SCI and GDP-per-capita growth. Country-year observations are ranked in terms of their SCI scores and grouped by terciles along the horizontal axis. Annual growth rates are measured along the vertical axis. 
In general, higher terciles of the SCI are correlated with faster economic growth for both the sample of economies that are included in the econometric estimations sample and the larger data set that includes economies excluded from the econometric analyses due to missing data on covariates. It is noteworthy that Figure 1 suggests that the positive correlation between SCI and growth is stronger in the larger sample of country-years than in the estimation sample, thus possibly producing a downward bias in the econometric estimates of the partial correlation between SCI and growth.

2 It is worth noting here that it is not obvious that setting up minimum standards of data generation and transparency is relatively costly when compared to the public-sector wage bill in developing countries.

Figure 1: Data Transparency and Long Run Growth
[Top panel: Data Transparency (SCI) and Log Difference of GDP per Capita (Panel), shown separately for All Economies and Sample Economies. Horizontal axis: Statistical Capacity Score (SCI) terciles. Vertical axis: log difference of GDP per capita.]
[Bottom panel: Data Transparency (SCI) 2005 and Log Difference of GDP per Capita 2005-2018 (Cross-section), shown separately for All Economies and Sample Economies. Horizontal axis: Statistical Capacity (SCI) score (2005) terciles. Vertical axis: log difference of GDP per capita (2005-2018).]
Note: The top panel of figure 1 presents the average per capita growth rate for each tercile of SCI based on a panel data set of economies over time. The bottom panel provides the average log difference in GDP per capita between 2005 and 2018 for each tercile of the SCI score in 2005.
The econometric estimations examine the relationship between data transparency and economic growth using both dynamic panel System GMM estimators as well as an Instrumental Variable approach on a cross-section of economies to account for endogeneity. The results indicate that a 1 percent increase in SCI score is associated with a 0.03 percent increase in the real GDP per capita per annum across panel and OLS cross-section estimations. The magnitude of the effect increases to 0.04 percent using the Instrumental Variables approach. The magnitude of the elasticities uncovered for SCI is larger than the elasticity of GDP per capita with respect to trade openness and even with respect to school enrollment in our sample of economies. Analyses of the economic magnitude of the estimated elasticities further support the contention that data transparency could be a more powerful driver of growth than either trade or schooling, even without taking into account the costs that would be entailed in raising trade or schooling over time relative to improving data systems.

Conceptually, there are several channels through which data transparency can affect development. First, credible and timely data serve as the basis for policy and reforms. Data are about records. Take the example of a business. A manager has the primary goal of raising profits. To achieve this, performance is benchmarked historically and compared with that of competitors. Collateral must be evaluated and leveraged to obtain financing to pursue new ventures. Risk and reward must be balanced. Investors need to be enticed. Without record-keeping, many of these goals could not be achieved. Indeed, experimental evidence indicates that simple interventions that help manufacturing plants improve inventory record keeping in a developing economy are associated with improved performance (Bloom et al., 2020). A similar set of challenges faces governments.
Countries need to grow, and, to expand options, data must be reliably transparent to provide guidance. Countries with high quality and broadly accessible information can make better decisions. Through data and evaluation, existing policies may be reformed and refined, while new policies may be experimentally evaluated (Rodrik, 2010). Improving statistical capacity can resolve disagreements and lead to better policy reforms (Binswanger and Oechslin, 2015). Second, data that are accessible to the broader civil society can generate better policies and reforms. Substantial expansions in the frontier of knowledge occur when data are available to a large base of analysts. Researchers test hypotheses, debate and dispute findings, establish robust facts and relationships to facilitate the emergence of the best ideas for addressing challenges. Improved data access increases scientific research (Nagaraj et al., 2020). Third, when data are of questionable quality or unavailable, the gap between perceptions and reality may grow. Important reforms may lead to real welfare improvements yet have little impact on public perceptions because there is limited data tracking such improvements. These perceptions may foster a narrative that results in frustrations that manifest themselves in social protests and unrest (Abi- Nassif et al., 2020). Similarly, if data are of dubious quality, the public may lose confidence in such information and may not alter their perceptions despite positive findings from the data. More important, once a government walks down the path of unreliable or limited data accessibility, it may be difficult to regain credibility. The public may be less willing to trust information from the government, which makes it difficult for a changed government to change public perceptions. The result is economies that are more prone to social upheavals. 
3 Lack of transparency hurts even more when systems are under stress by potent threats, such as the ongoing Covid-19 pandemic. An optimal societal response requires open and direct communication across several actors in society: the government, health care systems, civil society, and various institutions. Information needs to be collected in real time to enable governments and public health officials to take timely, decisive actions. Citizens need to report cases and respond to behavioral changes requested by the government. The flow of data is the oil of the engine of this system of interactions and responses. When data are not made public or are misused, the engine can fail. The ramifications of the lack of trust, forged by limited transparency, come into stark relief when citizens are confused about what to believe.

Transparency, in general, can be a means to hold governments accountable and improve institutional quality (Khemani et al., 2016; Lederman et al., 2005; Djankov et al., 2004). A large literature has established the consequences of institutions on output per capita and growth (Hall and Jones, 1999; Acemoglu et al., 2005). Corruption has been found to reduce growth while democratic institutions are found to improve development (Mauro 1995; Acemoglu et al., 2001). Transparency is the antithesis of corruption while being a vital characteristic of democratic institutions. This study contributes to the literature by focusing on a specific type of transparency – data transparency - and its effect on economic growth. A few studies have empirically explored the effects of the quantity of data as well as accessibility on governance, financial development and investment (Islam, 2006; Williams, 2009; Hollyer et al., 2011; Binswanger and Oechslin, 2015). At the time of writing, we are unaware of any other studies that have empirically explored the relationship between data transparency and economic growth.
This fills a gap in the long literature of determinants of economic growth and policies to improve development at large. The rest of the paper is structured as follows. Section II provides an overview of the related literature. Sections III and IV present the estimation strategy and data, respectively. Results are provided in section V, while section VI concludes. II. Related Literature The role of data transparency in facilitating economic growth is closely tied to institutions in the literature. A salient point of Coase (1960) was that transaction costs matter and may explain the nature of organizations such as the firm. Positive transaction costs imply institutions matter for economic development. Institutions can determine the payoffs to political and economic activity in developing economies, which in turn influence economic development (Douglas, 1989; Douglas 1990; Williamson, 2000). “Bad” institutions may lead private agents to allocate resources towards protection, forgoing productive investments. Hall and Jones (1999) show that institutions and government policies (called “social infrastructure”) explain variation in capital accumulation, productivity, and thus output per worker across economies. Countries that inherited weak or “extractive” institutions experience higher macro- economic volatility and lower output per capita today (Acemoglu et al., 2001; Acemoglu et al., 2003; Acemoglu et al., 2005). In contrast, democratic institutions tend to lead to long-run growth (Acemoglu et al., 2019; Rodrik, 2000). Institutions have also been found to have a larger role in explaining income levels than geography and trade (Rodrik et al., 2004). The quality of institutions is largely dependent on their degree of accountability and corruption. Institutions operating in secrecy – lack of transparency - tend to be poor in quality as public officials can seek personal gain with little accountability. 
A large body of literature has accounted for the detrimental effects of corruption on economic growth (Mauro 1995; Ehrlich and Lui, 1999). The secretive nature of corruption makes it distortionary thereby discouraging investment and slowing growth (Shleifer and Vishny, 1993). Corruption can discourage foreign direct investment, entry of new firms, and misallocate public expenditures (De Soto, 1989; Mauro, 1998; Wei, 2000; Brada et al., 2019). Furthermore, institutions or even corruption that is “unpredictable” discourages investment and lowers growth (Campos et al., 1999). Corruption may also reduce parental investments in human capital (Varvarigos and Arsenis, 2015; Brunetti et al., 1998). The role of transparency in facilitating good institutions by limiting corruption has been well debated. Press freedom and political institutions that increase accountability have been found to limit corruption across economies (Brunetti and Weder, 2003; Lederman et al., 2005; McMillan and Zoido, 2004). As noted by Djankov et al. (2003), two challenges in institutional design are the control of disorder and dictatorship, and transparent governments are better able to control disorder while incurring few social losses. Cordis and Warren (2014) find that strengthening the Freedom of Information Act (FOIA) in the United States reduces corruption and increases the likelihood that corrupt acts will be uncovered. In Mozambique, Armand et al. (2020) implement a large-scale field experiment following the dissemination of information about a substantial natural gas discovery to test whether information can limit the political resource curse. The study finds that when information reaches citizens it increases mobilization and reduces violence, but when information does not spread beyond local leaders it has negative effects such as elite capture and increase in rent-seeking activities. 
However, studies have also shown that transparency is not always desirable, or it is insufficient to limit corruption. Bac (2001) argues that transparency can have adverse effects by providing information to outsiders on who to bribe. Prat (2005) contends that complete transparency is not always beneficial and the type of information matters. This is part of the rationale of executive privilege - an agent may care more about appearances than engage in frank discussions in decision-making if it is understood that the internal discussions will be made public. Furthermore, information alone may not be sufficient to produce positive outcomes if there is little political engagement by citizens (Khemani et al., 2016). More importantly information has to have a high signal-to-noise ratio to be impactful (Kosec and Wantchekon, 2020).

Transparency in general covers a range of areas including general institutional transparency regarding laws and regulations to specific sectors such as financial transparency. In this study we add to the literature by specifically investigating the effects of data transparency on economic growth. Data transparency is defined as the regular publication of credible statistics by the state. This includes the frequency and availability of micro-data, socio-economic indicators, and adherence of data to international standards. The literature has acknowledged the importance of statistics and the benefits of evidence-based policy in terms of generating reforms through experimentation (Rodrik, 2010). However, theoretical arguments have been made that better statistics may not necessarily lead to reforms. Binswanger and Oechslin (2020) theoretically show that under certain political conditions better statistics can inhibit reforms.
For instance, with better statistics, voters are less inclined to give politicians the benefit of the doubt when confronted with disappointing economic data, thereby weakening politicians’ incentives to experiment and carry out reforms in the first place. Thus, the benefits of data transparency warrant empirical validation. However, recent research indicates that democracies are more likely to transform talk of reform during economic downturns into self- correcting reforms than less democratic societies (Arezki et al. 2020). Studies have empirically explored elements of data transparency on governance, while not explicitly labeling it as such. For instance, Islam (2006) constructs a transparency index based on the frequency and availability of 11 indicators from four sectors (real, fiscal, financial, and external) and examines its effect on measures of governance. Hollyer et al, (2011) create a transparency index based on the availability of country-level data – 172 variables in total categorized under Economic Policy and Debt in the World Bank’s World Development Indicators – to show that democratic countries are more transparent. Similarly, Williams (2009) constructs a transparency index based on the quantity of data released by governments through the availability of socio-economic data contained in the World Development Indicators and the International Finance Statistics databases. Choi and Hashimoto (2018) proxy for data transparency policy reforms using subscriptions to the IMF’s Data Standards Initiatives and find that such reforms lead to falls in sovereign bond spreads. The study finds that the larger quantity of data produced, as measured by the index, is positively correlated to the quality of bureaucracy, investment and financial sector development. We build on these studies by using a comprehensive and rigorous measure of data transparency, known as the Statistical Capacity Index (SCI). 
The SCI measures both the production of micro and macro data as well as the frequency and quality of such data through adherence to international standards. The SCI has also been used by Kubota and Zeufack (2020) to explore how data transparency can reduce the external cost of borrowing, as proxied by sovereign bond spreads, in Sub-Saharan Africa. Devarajan (2013) used the SCI to show the poor statistical performance in Africa. As far as we are aware at the time of writing, our study is the first to empirically explore the effect of data transparency on overall economic growth.

III. Estimation Strategy

The regression model to estimate the effect of data transparency (SCI) on economic growth ($\Delta y_{it}$) for country $i$ in year $t$ can be written as follows:

$$\Delta y_{it} = \alpha_0 + \beta_1 y_{i,t-1} + \beta_2 SCI_{i,t-1} + \gamma' X_{i,t-1} + \mu_i + \lambda_t + \varepsilon_{it} \qquad (1)$$

$X_{i,t-1}$ is the vector of control variables; $\mu_i$ is the country fixed or random effects; $\lambda_t$ is the year fixed effects, and $\varepsilon_{it}$ is the error term. Equation (1) can be rewritten as (2) and then estimated in levels as shown in equation (3) where the income level ($y_{it}$) is regressed on the lagged income per capita variable ($y_{i,t-1}$), the data transparency measure, while accounting for several other covariates.

$$y_{it} - y_{i,t-1} = \alpha_0 + \beta_1 y_{i,t-1} + \beta_2 SCI_{i,t-1} + \gamma' X_{i,t-1} + \mu_i + \lambda_t + \varepsilon_{it} \qquad (2)$$

$$y_{it} = \alpha_0 + (1 + \beta_1) y_{i,t-1} + \beta_2 SCI_{i,t-1} + \gamma' X_{i,t-1} + \mu_i + \lambda_t + \varepsilon_{it} \qquad (3)$$

There are certain advantages to estimating equation (3) using this approach as used by other studies (Barro, 1991; Brueckner and Lederman, 2018). Equation (3) elucidates the auto regressive process by providing the coefficient on the lag of income per capita ($y_{i,t-1}$). This allows for the exploration of conditional convergence of economies. Furthermore, equation (3) highlights our conservative expectation that improvements in data transparency raise growth during a transitional period to a higher level of growth per capita, implying a permanent effect on the level of income per capita but not a permanent effect on economic growth à la endogenous growth models as presented by Romer (1990, 1994).
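The algebra linking equations (2) and (3) can be checked numerically. The following is a minimal sketch on simulated data, not the authors' estimation: it pools hypothetical observations, omits fixed effects and control variables, and uses ordinary least squares rather than System GMM. All variable names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pooled sample (assumption: no fixed effects or controls, for brevity).
n = 500
y_lag = rng.normal(8.0, 1.0, n)       # lagged log GDP per capita
sci_lag = rng.uniform(0.2, 1.0, n)    # lagged SCI score on a 0-1 scale
beta1, beta2 = -0.05, 0.03            # illustrative values, not the paper's estimates

# Growth form, as in equation (2): y_t - y_{t-1} on the lagged level and lagged SCI.
growth = 0.5 + beta1 * y_lag + beta2 * sci_lag + rng.normal(0.0, 0.01, n)
y = y_lag + growth                    # current log GDP per capita

X = np.column_stack([np.ones(n), y_lag, sci_lag])
coef_growth, *_ = np.linalg.lstsq(X, growth, rcond=None)  # growth form
coef_level, *_ = np.linalg.lstsq(X, y, rcond=None)        # levels form, equation (3)

print("lagged income, growth form:", coef_growth[1])
print("lagged income, levels form:", coef_level[1])
print("SCI, growth form:", coef_growth[2], "levels form:", coef_level[2])
```

Because current income equals lagged income plus growth, the coefficient on lagged income in the levels regression equals one plus the growth-form coefficient, while the SCI coefficient is the same in both forms.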
However, the lagged level of income per capita is endogenous as presented in equation (3). To address this, we employ System Generalized Method of Moments (GMM) dynamic panel estimators as developed by Arellano and Bover (1995) and Blundell and Bond (1998) where the lagged income per capita is instrumented by further lags in both levels and differences. The literature has paid considerable attention to the relevance of instruments in dynamic panel estimations (Kraay, 2015). Alternatively, we employ a cross-sectional estimation that regresses the level of income per capita in 2018 on the 2005 levels of the SCI score as well as other covariates as shown in equation (4).

$$y_{i,2018} = \alpha_0 + (1 + \beta_1) y_{i,2005} + \beta_2 SCI_{i,2005} + \gamma' X_{i,2005} + \varepsilon_i \qquad (4)$$

We also employ an Instrumental Variables (IV) approach where we instrument the SCI score with legal origins and perceptions of government effectiveness. Details are provided in the Instrumental Variables subsection of the results.

The sample of the study largely consists of developing economies between 2004 and 2018. The panel data set consists of 124 economies while the cross-sectional data set consists of 91 economies. 4 Tables A2 and A3 provide a list of economies in the panel and cross-sectional sample, respectively. The outcome variable - income per capita - is measured using the log of Real GDP per Capita (constant 2010 USD) obtained from the World Bank’s World Development Indicators. Tables 3 and 4 provide summary statistics for Real GDP per capita for both the panel and cross-section data sets, respectively. Tables 5 and 6 present the correlation between the log of real GDP per capita with SCI and other covariates for both cross-sectional and panel data, respectively. Both tables 5 and 6 show that the log of GDP per capita is positively correlated with the SCI score, statistically significant at the 1 percent level.

The proxy variable used for data transparency is the World Bank’s Statistical Capacity Index (SCI).
The overall SCI indicator is based on a diagnostic framework to assess the capacity of national statistical systems over time. This indicator is ideal as it covers both the quantity (frequency) and to some extent the quality aspects of data, building on studies that have mostly focused on the former (Williams, 2009; Hollyer et al., 2011). The framework has three dimensions: (i) Source data, (ii) Methodology, and (iii) Periodicity and timeliness of socioeconomic indicators. Each dimension is evaluated on criteria based on metadata information from the World Bank, IMF, UN, UNESCO, and WHO. The overall SCI score is the average of the three sub-indicators calculated for each dimension. 5

4 Several economies have gaps in the SCI data, with some of them missing data for 2005 or 2018. Thus, the sample of economies covered for the cross-sectional data is smaller than that for the panel data.

The source data dimension reflects whether a country conducts data collection activity in line with internationally recommended periodicity, and whether data from administrative systems are available and reliable for statistical estimation purposes. This dimension covers the micro-data aspect of data transparency that is essential given microdata are foundational for a country’s data system. Specifically, the criteria used are the periodicity of population and agricultural censuses, the periodicity of poverty and health-related surveys, and completeness of vital registration system coverage. A country can achieve a perfect score if it has conducted 1 or more population census in the last 10 years, 1 or more agricultural census in the last 10 years, 3 or more poverty surveys in the last 10 years, 3 or more health surveys in the last 10 years, and has a complete vital registration system.

The statistical methodology dimension measures a country’s ability to adhere to internationally recommended standards and methods.
This aspect is captured by assessing guidelines and procedures used to compile macroeconomic statistics and social data reporting and estimation practices. This dimension measures the quality aspect of the data system. Under the assumption that international guidelines provide the benchmark of ideal data systems, adherence to such standards may imply that the quality of data systems meets well-established standards. Countries are evaluated against a set of criteria such as use of an updated national accounts base year, use of the latest balance of payments manual, the external debt reporting status, updated consumer price index, updated industrial production index, updated import/export prices, accounting basis for reporting government financial data, vaccine reporting to WHO (discrepancy between WHO and government estimates), subscription to IMF’s Special Data Dissemination Standard, and enrollment data reporting to the United Nations Educational, Scientific, and Cultural Organization (UNESCO). 6 Each criterion has equal weight. The periodicity and timeliness dimension measures the availability and periodicity of key socioeconomic indicators, of which nine are Millennium Development Goals (MDG) indicators. This dimension attempts to measure the extent to which data are made accessible to users through transformation of source data into timely statistical outputs. The periodicity of the main indicators considered, each receiving equal weight, includes: (i) Income poverty indicator, (ii) Child malnutrition indicator, (iii) Child mortality indicator, (iv) Immunization indicator, (v) HIV/AIDS indicator, (vi) Maternal health indicator, (vii) Gender equality in education indicator, (viii) Primary completion indicator, (ix) Access to water indicator, (x) GDP growth indicator. Several studies have used the SCI to evaluate the quality of data systems and investigate its relationship with other macroeconomic variables (Devarajan, 2013; Kubota and Zeufack, 2020). 
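The construction of the overall score can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the World Bank's scoring code: the function name and the example dimension scores are hypothetical, and each dimension score is assumed to lie between 0 and 100. The final division by 100 mirrors the rescaling to the 0-1 range that the paper applies for its analysis.

```python
def overall_sci(source_data, methodology, periodicity):
    """Average three dimension scores (assumed 0-100) and rescale to 0-1."""
    scores = (source_data, methodology, periodicity)
    for score in scores:
        if not 0.0 <= score <= 100.0:
            raise ValueError("dimension scores must lie between 0 and 100")
    # Simple average of the three sub-indicators, then rescaled to 0-1.
    return sum(scores) / len(scores) / 100.0


# Hypothetical dimension scores for one country-year.
example = overall_sci(60.0, 70.0, 80.0)
print(example)
```

With the hypothetical scores above, the three dimensions average to 70 on the original scale.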
Tables 3 and 4 provide summary statistics for the SCI indicator for both the panel and cross-section data sets, respectively. Tables 5 and 6 present the correlation between the SCI and other variables for both cross-sectional and panel data, respectively. The SCI indicator is positively correlated with income per capita as well as most of the political institution variables.

5 The composite score ranges from 0 to 100. For our analysis, we rescale this score to be between 0 and 1.

6 The discrepancy between vaccination rates from government administrative records and household surveys has been documented by Sandefur and Glassman (2015) as an indicator of bad government data.

IV. Data – Other Variables

We control for several factors that are typically accounted for in growth regressions. These broadly include human capital, sectoral composition, trade, financial development, commodity-price shocks, external debt-financing shocks, and quality of political institutions. Given that the sample of analysis is largely based on developing economies, we employ variables that tend to have broader coverage across developing economies. We use exports plus imports over GDP for the measure of trade openness. Sectoral composition is captured through the share of manufacturing value added and the share of agriculture, forestry and fishing value added over GDP. We also account for human capital using gross primary school enrollment rates. Financial development is measured as the domestic credit to the private sector as a percentage of GDP, following Levine et al. (2000). These variables are compiled by the World Bank’s World Development Indicators.

We measure commodity shocks using a commodity terms of trade index following Gruss and Kebhaj (2019). Commodity-price shocks are calculated by taking the first differences of the log of the commodity terms of trade index, using historical fixed weights.
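The first-difference calculation just described can be sketched as follows. The function name and the index values are hypothetical; the sketch takes an already constructed commodity terms of trade index as given and does not reproduce the fixed-weight aggregation of individual commodity prices.

```python
import numpy as np


def commodity_price_shocks(ctot_index):
    """First differences of the log of a commodity terms-of-trade index."""
    index = np.asarray(ctot_index, dtype=float)
    return np.diff(np.log(index))


# Hypothetical index values for one economy over four consecutive years.
shocks = commodity_price_shocks([100.0, 105.0, 105.0, 94.5])
print(shocks)
```

A positive value indicates an improvement in the commodity terms of trade relative to the previous year, a negative value a deterioration, and the series is one observation shorter than the index itself.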
The commodity terms of trade index is based on international prices of up to 45 individual commodities, constituting broad categories of energy, metals, food and beverages, and agricultural raw materials. External debt-financing shocks are calculated as the log difference of the US T-bill (10-year, long term) multiplied by the lagged external debt. The debt shock variable is included in a few specifications as it entails a drop in the number of observations due to availability of debt data. We account for political institutions using data from the Worldwide Governance Indicators - WGI (Kauffman et al., 2010) following other studies in the literature (Antras and Chor, 2013; Grembi et al., 2016). In our estimations we include Voice and Accountability, Political Instability, Rule of Law, and Control of Corruption. Voice and Accountability is a measure of perceptions of the extent to which a country's citizens are able to participate in selecting their government, as well as freedom of expression, freedom of association, and a free media. Political Instability measures perceptions of the likelihood of political instability and/or politically motivated violence. Rule of Law captures perceptions of the extent to which agents have confidence in and abide by the rules of society, and in particular the quality of contract enforcement, property rights, the police, and the courts, as well as the likelihood of crime and violence. Control of Corruption captures perceptions of the extent to which public power is exercised for private gain, including both petty and grand forms of corruption, as well as “capture” of the state by elites and private interests. We control for internal armed conflict using the UCDP/PRIO Armed Conflict Database updated by Pettersson and Öberg (2020). 
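The construction of the two shock variables described above can be sketched as follows. The input series are made-up numbers, the function names are hypothetical, and the reading of the debt shock as (change in the log of the interest rate) times (external debt in the previous year) is an assumption made here for illustration rather than the authors' code.

```python
import math


def log_first_difference(series):
    """Return log(x_t) - log(x_{t-1}) for each consecutive pair in the series."""
    return [math.log(curr) - math.log(prev) for prev, curr in zip(series, series[1:])]


def debt_financing_shock(interest_rate, external_debt):
    """Log difference of the interest rate multiplied by lagged external debt.

    Assumption: 'lagged' means the external debt value of the previous year.
    """
    rate_changes = log_first_difference(interest_rate)
    # rate_changes[i] is the change from year i to year i + 1,
    # so the matching lagged debt value is external_debt[i].
    return [d_rate * debt for d_rate, debt in zip(rate_changes, external_debt)]


# Hypothetical inputs for three consecutive years (not from the paper's data).
ctot_index = [100.0, 104.0, 98.0]      # commodity terms of trade index
tbill_rate = [2.0, 2.5, 2.0]           # long-term US rate, in percent
external_debt = [0.40, 0.45, 0.50]     # external debt, as a share of GDP

commodity_shocks = log_first_difference(ctot_index)
debt_shocks = debt_financing_shock(tbill_rate, external_debt)
```

Each shock series is one observation shorter than its inputs, since the first year has no previous value to difference against.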
We follow a definition similar to that of Abu Bader and Ianchovichina (2019), where armed internal conflict is a contested incompatibility in which the use of armed force between two parties, of which at least one is the government of a state, results in at least 25 battle-related deaths per year. For the panel dataset the variable takes a value of 1 if the economy is in internal armed conflict, and zero otherwise. The exception is economies such as Iraq and Afghanistan that were in conflict for all years in the sample and thus attain values of zero. For the cross-section estimations the variable is the proportion of years that an economy experienced internal conflict between 2005 and 2018. 7

7 All the results presented in this study are largely unchanged if the internal armed conflict variable is excluded from the estimations.

In addition, we include data for additional variables used as instruments in our Instrumental Variables approach. These include data on legal origins that capture whether the historical origin of a country’s laws is German, French or from the United Kingdom. These data are obtained from La Porta et al. (2008), who provide an overview of studies using legal origins. In an additional specification, we also use the 1996 to 2003 sample average of the Government Effectiveness measure from the WGI. Government effectiveness captures perceptions of the quality of public services, the quality of the civil service and the degree of its independence from political pressures, the quality of policy formulation and implementation, and the credibility of the government's commitment to such policies. Tables 3 and 4 provide summary statistics for all the variables for the cross-section and panel data sets, respectively. Tables 5 and 6 present the correlation between these variables. Table A1 provides the definitions and sources of all the variables.

V. Results

Table 1 presents the main results.
Columns 1, 2 and 3 present the cross-sectional OLS estimation results corresponding to equation (4) with no controls, standard controls, and standard controls plus external debt-financing shocks, respectively. The specification with the debt shocks is presented separately as it entails a drop in observations. The SCI variable is positive and statistically significant at the 1 percent level for all three specifications presented in the first three columns. Using the specification with the standard set of controls, the magnitude of the SCI coefficient indicates an elasticity of 0.4 percent. However, to be comparable to the panel estimations, it has to be annualized, which translates to an elasticity of 0.03 percent (0.417/14).

Columns 4 through 10 of Table 1 present the panel data results corresponding to equation (3). Column 4 presents the findings for year fixed effects with the standard set of controls and column 5 presents the results for Random Country Effects and Year Fixed Effects. The coefficient for SCI is statistically significant at the 1 percent level for the Year Fixed Effects estimations (column 4) and at the 5 percent level for the Random Country Effects with Year Fixed Effects estimations (column 5). The estimated elasticity is 0.03 percent (similar to the cross-sectional OLS estimates) for the Year Fixed Effects. However, this elasticity drops to 0.02 percent for the Random Country Effects with Year Fixed Effects estimations. To account for the endogeneity of the lagged level of GDP per capita, we employ Two Step System GMM estimators. The findings are presented in columns 6 through 10 in Table 1. In columns 6 through 9, only the lagged level of GDP per capita is instrumented by further lags in both levels and differences. In column 10 we instrument both the lagged level of GDP per capita and the lagged level of SCI.
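The annualization of the cross-sectional estimate mentioned above is a simple division, sketched below. The divisor of 14 is taken from the calculation quoted in the text (0.417/14); the function and variable names are hypothetical.

```python
YEARS = 14  # divisor used in the text's own calculation (0.417/14)


def annualize(coefficient, years=YEARS):
    """Spread a coefficient estimated over the whole period evenly across years."""
    return coefficient / years


ols_coefficient = 0.417  # Table 1, column 2
annual_elasticity = annualize(ols_coefficient)

print(round(annual_elasticity, 3))  # 0.03
```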
The system GMM results using all the lags possible for instruments are presented without any controls (column 6) and with the standard controls (column 7). The reported Hansen J statistic does not reject the null that the instruments are valid, lending credibility to the instruments. However, first-order autocorrelation, AR(1), and second-order autocorrelation, AR(2), are detected, and thus in column 8 we replicate the results of column 7 using only lags of 3 or more as instruments, given that AR(3) is not detected. In column 9, we replicate the specification in column 8 with the inclusion of the external debt-financing shock variable. The magnitude of the effect of SCI on the log of GDP per capita is largely the same for all the system GMM estimations, showing an elasticity of 0.028 to 0.030 percent, mostly consistent with the cross-sectional OLS estimates. Thus, a 1 percent increase in the SCI score translates to a 0.03 percent increase in the level of GDP per capita. The positive effect of SCI on growth appears to be driven primarily by the cross-country variation captured in the “levels” equation. In column 10 the coefficient for SCI more than doubles when we also instrument SCI with further lags in both levels and differences. The Hansen J-statistic shows a p-value of 1, and thus does not reject the null that the instruments are valid; however, the high p-value raises concerns of instrument proliferation. We also cannot rule out the possibility that the instruments are weak. 8

8 The issue of instrument proliferation may also imply that the instruments may be weak. We investigate this issue further. We collapsed the data into 2-year averages, leading to 7 time periods, in order to lower the time dimension (T) with respect to the number of countries. We then follow the literature (Bazzi and Clemens, 2013; Kraay, 2015) and produce confidence sets by “inverting” test statistics that are valid regardless of whether the instruments are strong or weak. These include the Anderson-Rubin (AR) test statistic, the Conditional likelihood ratio (CLR) test statistic, and the K and J overidentification test statistic (K-J). For tractability, we use only two lags for the system GMM estimations. Specifications with and without controls are employed. We also unbundle the system GMM into “difference” and “levels” equations. We find the following. The positive effect of SCI on growth appears to be driven primarily by the cross-country variation captured in the “levels” equation. The confidence sets (95% level) show that the range of possible values for the coefficient of lagged SCI is largely positive. However, the confidence sets are unbounded. Thus, we cannot disregard the possibility of weak instruments. These findings are available upon request.

The coefficient of lagged log of GDP per capita is statistically significant at the 1 percent level. Its magnitude appears to be close to unity, implying that shocks to the level of (log) GDP per capita are highly persistent. The magnitude of the coefficient can also reveal information on the degree of conditional convergence. Using the cross-sectional OLS estimates including the standard controls (Table 1, column 2) and the System GMM estimates (Table 1, column 8), we obtain convergence coefficients (the coefficient of lagged log GDP per capita minus one) of -0.002 (annualized) and -0.01, respectively. The negative coefficient indicates conditional convergence within the set of largely developing economies in the sample. We also interacted SCI with the lagged log of GDP per capita to explore whether the SCI may have a permanent effect on economic growth. 9 The coefficient of the interaction term is statistically insignificant across all estimations.

9 Not reported, but available from the authors upon request.

The coefficients of the other control variables (not presented) are not statistically significant across most specifications apart from trade, whose coefficient is significant at levels ranging from 10 to 1 percent, depending on the specification. However, the magnitude of the effect of SCI on real GDP per capita is much larger than that of trade openness. Using the cross-section OLS model (column 2) and the System GMM model (column 8) with the standard controls, the elasticity of trade (at the sample mean) with respect to GDP per capita is 0.01, regardless of the model used. In contrast, the elasticity of SCI with respect to GDP per capita is 0.03, about three times larger. This is a striking finding given that trade has been afforded far more attention than data transparency in the literature. Furthermore, the fact that other governance indicators are not significant predictors of the level of GDP per capita, conditional on the lagged level of GDP per capita, suggests that data transparency might be a key conduit through which institutions affect growth.

Instrumental Variables (IV)

To try and establish the robustness of our findings, we additionally use the Instrumental Variables approach with the cross-sectional data. The ideal instrumental variable would be one that is correlated with data transparency but has no direct effect on GDP per capita. We use the legal origins of an economy as an instrument for data transparency. For our sample of analysis, all the economies have adopted one of three types of legal systems: French, German, and English. Broadly, French and German legal origins fall under Civil law while English legal origins pertain to Common law. We find that French and German legal systems are positively correlated with the SCI while English legal origins are negatively correlated with the SCI. We posit a number of plausible explanations with the acknowledgment that this is speculative rather than based on a specific literature on the evolution of data ecosystems. Civil law systems are generally based on extensive codification. They tend to be comprehensive, with frequently updated legal codes that try to account for every situation, including detailing procedures and punishment. In contrast, Common law is uncodified and is based on precedent. Thus, one possible explanation is that the extensive nature of Civil law (German and French) requires a greater undertaking of data collection, and thus a higher requirement of statistical capacity, especially in developing economies that may be far from the minimum standard of data requirements. In contrast, given that Common law (English) is based on precedent, the urgency to develop data systems in developing economies may be less. Another possible explanation is that economies with Civil law legal origins tend to have a more interventionist state with heavier regulations than economies with Common law legal origins (La Porta et al., 2008). Heavier regulations may accompany extensive data collection efforts to monitor compliance. From the correlation observed in the data, better data systems may be side-effects of the features of economies with Civil law legal origins.

The instruments are likely to satisfy the exclusion restriction criterion given their similar use by Levine et al. (2000) to instrument for the effect of financial development on economic growth. However, the implication is that our estimations must account for the other channels through which legal origins may affect growth. La Porta et al. (2008) document the economic consequences of legal origins. The two main channels through which legal origins may affect growth are financial development and governance. To account for the financial development channel, domestic credit to the private sector as a percentage of GDP is a control variable in all the estimations. Furthermore, several governance variables from the Worldwide Governance Indicators are included in our estimations. We also entertain an additional instrumental variable in the form of government effectiveness obtained from the Worldwide Governance Indicators.
This variable specifically captures perceptions of the quality of public services, the quality of the civil service and the degree of its independence from political pressures, the quality of policy formulation and implementation, and the credibility of the government's commitment to such policies. Better public and civil service as well as the quality of policy formulation are likely to be positively correlated with data transparency. We take the average score for the years available predating the sample - 1996-2003 – to aim for some level of exogeneity. While government effectiveness alone may be a harder sell in terms of satisfying the exclusion restriction, the fact that we account for other governance measures such as Voice and Accountability, Political Stability, Rule of Law and Control of Corruption, may limit the direct effects of government effectiveness as defined on economic growth. The results for the Instrumental Variables estimations are presented in Table 2. Columns 1 and 2 provide the second stage and first stage results respectively for the estimations with legal origins as instrumental variables. The first stage results show that French and German legal origins are positively correlated with the SCI, statistically significant at the 1 percent level. The second stage shows that SCI has a positive coefficient, statistically significant at the 5 percent level. The magnitude is somewhat larger than all the other panel and OLS cross-section estimates, indicating an elasticity of 0.04 percent (annualized). The Hansen test of overidentifying restrictions does not reject the joint null that the instruments are valid. The underidentification test rejects the null that the instruments are underidentified, indicating that they are relevant. The weak identification test shows that the F-stat is greater than most of the Stock and Yogo thresholds (with the exception of the 10% maximal IV size) implying the instruments are not weak. 
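To make the first-stage and second-stage logic described above concrete, the sketch below runs two-stage least squares on simulated data. It is a simplified illustration: it uses a single made-up instrument and no control variables, whereas the estimations in Table 2 use several instruments and a full set of controls. The data-generating process, the true coefficient of 0.5, and all names are assumptions made here and have no connection to the paper's data.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(y, x):
    """Slope from a regression of y on x with an intercept."""
    return covariance(x, y) / covariance(x, x)


def two_stage_slope(y, x, z):
    """2SLS slope of y on endogenous x, using z as the only instrument."""
    # First stage: regress the endogenous regressor on the instrument.
    first_stage = ols_slope(x, z)
    x_mean, z_mean = mean(x), mean(z)
    x_fitted = [x_mean + first_stage * (zi - z_mean) for zi in z]
    # Second stage: regress the outcome on the fitted values.
    return ols_slope(y, x_fitted)


random.seed(0)
n = 5000
TRUE_EFFECT = 0.5  # assumed for the simulation only

z = [random.gauss(0, 1) for _ in range(n)]   # instrument
u = [random.gauss(0, 1) for _ in range(n)]   # unobserved confounder
x = [0.8 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [TRUE_EFFECT * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

beta_ols = ols_slope(y, x)          # biased upward by the confounder
beta_iv = two_stage_slope(y, x, z)  # close to TRUE_EFFECT
```

In this simulation the confounder pushes the OLS slope well above the assumed effect, while the instrumented estimate recovers it approximately. This holds only because the simulated instrument is strong and excluded from the outcome equation by construction.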
10 Do note, however, that the weak identification tests assume homoskedasticity and thus may be unreliable (Baum et al., 2007). The Shea Partial R square indicates that 17.8 percent of the variation in the endogenous variable is explained by the instruments. In columns 3 and 4 we present the findings with the inclusion of government effectiveness (1996-2003 average) as an additional instrument. The findings are largely the same, with the statistical significance of the coefficient of SCI strengthening to the 1 percent level. The magnitude of the elasticity is unchanged. The validity and relevance of the instruments are retained as indicated by the overidentification and underidentification tests. In the first stage, the coefficient of the Government Effectiveness variable is positive and statistically significant at the 1 percent level. This result further supports the argument that data transparency might be a key channel through which governance institutions affect long-term development. The Shea Partial R square increases to 26.5 percent. In columns 5 and 6 of Table 2, we include the external debt-financing shock variable as a covariate, reducing the sample to 81 economies. The results are largely unchanged.

10 With regard to the Stock and Yogo thresholds, the maximal relative bias thresholds are based on the ratio of the bias of the estimator to the bias of OLS. The maximal size thresholds are based on the performance of the Wald test statistic, given that the Wald test rejects too often in the case of weak identification.

Magnitude of the Effects

The elasticity of the SCI with respect to log of real GDP per capita per year is 0.03 for the OLS cross-section estimation. This estimate is unchanged whether the base OLS estimation with year effects in the panel data set is used or the dynamic panel estimators that account for the endogeneity of the lagged level of real GDP per capita.
However, the IV estimations yield a higher elasticity of 0.04 (annualized). To put this in context, the estimates of elasticity of schooling as well as trade openness with respect to the log of real GDP per capita are much lower. The system GMM estimations accounting for the endogeneity of lagged real GDP per capita yield an elasticity of 0.01 for trade (at the mean) and 0.009 for primary school enrollment (at the mean). In both cases the magnitude of the elasticities are much lower than the elasticity of SCI. The elasticity for trade is largely unchanged at 0.01 with the IV estimations, while the elasticity for primary school enrollment falls considerably to 0.001. Do note that the coefficient for primary school enrollment is statistically insignificant and not stable across the estimations. The IV estimates of the elasticity of SCI is about four times larger than the elasticity of trade openness (0.04 vs 0.01). Using these same estimates, if Burundi were to achieve Chile’s SCI score in 2018 (from 0.57 to 0.9) – an increase of 58.8 percent - then Burundi would increase its real GDP per capita by 2.3 percent. We explore the economic magnitude of effects of increasing SCI, trade openness, and primary school enrollment by one standard deviation on the log of real GDP per capita. For panel estimations, we use the within-country standard deviation of the variables. For all the estimations, the magnitude of the effect of increasing SCI by 1 standard deviation (ranging from 0.002 to 0.009) is larger than the magnitude for primary school enrollment (0.0001 to 0.001). Comparing SCI with trade openness, the magnitude of the effect of increasing SCI by 1 standard deviation is slightly larger than the magnitude of trade openness in the system GMM estimations (0.0023 vs. 0.0016) but more than twice as large using the IV estimations (0.009 vs. 0.004). 
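The Burundi illustration above, and the regional comparison for MENA made in this section, rest on the same approximation: the implied percentage change in GDP per capita is the elasticity multiplied by the percentage change in the regressor. The sketch below reproduces that arithmetic with the rounded figures quoted in the text. The function names are hypothetical, and because the inputs are rounded the outputs match the reported figures only approximately.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


def implied_gdp_change(elasticity, pct_change_in_regressor):
    """Approximate percent change in GDP per capita under a log-log specification."""
    return elasticity * pct_change_in_regressor


IV_ELASTICITY = 0.04       # annualized IV estimate quoted in the text
OLS_COEFFICIENT = 0.417    # Table 1, column 2 (whole-period, not annualized)

# Burundi moving to Chile's 2018 SCI score, using the reported 58.8 percent increase.
burundi_gain = implied_gdp_change(IV_ELASTICITY, 58.8)

# The same increase recomputed from the rounded scores (0.57 and 0.9).
increase_from_rounded_scores = percent_change(0.57, 0.90)

# MENA: a 4.4 percent fall in the SCI evaluated with the cross-section OLS coefficient.
mena_change_ols = implied_gdp_change(OLS_COEFFICIENT, -4.4)
```

With these inputs the Burundi gain is about 2.35 percent and the MENA change is about -1.83 percent, in line with the rounded figures reported in the text.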
The developing Middle East and North Africa (MENA) region is the only region in the sample that experienced a decrease in the SCI score between 2005 and 2018, thus offering another assessment of the economic magnitude of the impact of SCI on the level of GDP per capita. Within this period the average SCI score for the region fell by approximately 4.4 percent, while trade decreased by 2.7 percent (using 2018 as the base). Between 2005 and 2017, primary school enrollment rates increased by 1.5 percent. 11 The base OLS estimates with the cross-section data show a loss of 1.8 percent in GDP per capita between 2005 and 2018 due to the fall in SCI. This is larger than the 0.4 percent decrease in GDP per capita due to the decline in trade openness (between 2005 and 2018), and the 0.01 percent increase due to the increase in primary school enrollment (between 2005 and 2017). Using the OLS with year fixed effects estimates (panel data), the loss in GDP per capita due to the fall in SCI is slightly higher at 1.9 percent. This is still larger than the loss of 0.4 percent and the gain of 0.4 percent due to the decrease in trade openness and the increase in schooling, respectively. The system GMM estimations indicate a loss of 1.8 percent in GDP per capita between 2005 and 2018 due to the fall in SCI. The corresponding loss for the decrease in trade is 0.4 percent and the gain due to the increase in schooling is 0.2 percent. Finally, using the IV estimates, the loss in GDP per capita due to the fall in SCI rises to 2.4 percent. The increase in GDP per capita due to schooling is 0.02 percent while the loss in GDP per capita due to the decrease in trade is 0.4 percent. Using these estimates, the loss in GDP per capita in the MENA region due to the fall in SCI is much larger than that of trade and negates the gains from schooling within the same time period. Again, these comparisons do not take into account the costs of raising school enrollment or trade, but since the SCI score declined in absolute terms no such costs would come into an accounting of the gains versus the costs. In other words, if MENA had maintained the level of SCI it had in 2005, at no extra cost, the benefits in terms of GDP per capita would far outstrip the gains from school enrollment once the costs of achieving gains in schooling are taken into account.

11 We use the 2017 value for primary school enrollment rate due to the sparse data in MENA for 2018. All trends are for developing economies in the MENA region (i.e. high-income economies in the region are excluded).

The losses faced by the MENA region thus far only account for the drop in SCI between 2005 and 2018. A simple correlation between log GDP per capita and the SCI in 2005 suggests that the MENA region was underperforming in terms of the SCI relative to its level of development. If the MENA region had achieved the average level of SCI in 2005 given its GDP per capita, as opposed to starting below and declining further through 2018, the estimated total losses due to the MENA region’s SCI deficit (using the IV estimations) would be approximately 5.9-7.5% of GDP per capita. 12

VI. Conclusion

In this study, we establish a positive relationship between data transparency, as measured by the Statistical Capacity Index, and economic growth for a sample of largely developing economies. The findings indicate that there is scope for the timely and credible production of statistics to improve economic growth for developing economies. The results stand after accounting for a host of factors including political institutions, economic structure, trade openness, school enrollment rates, internal armed conflict, and commodity and debt shocks. The magnitude of the effect of data transparency, although modest, is larger than that of trade openness, which has received ample attention from the profession.
The findings are robust to a myriad of specifications and estimating models. The findings established are of importance given the role of data transparency in promoting growth has received little empirical attention in the literature. While the role of transparency in improving governance and limiting corruption is well understood, the specific role of data transparency has been much debated with theoretical arguments made both for and against better statistics, indicating the topic is ripe for empirical verification. Our study serves to fill a gap in the long literature on the determinants of economic growth. The policy implications are considerable especially for more autocratic governments balancing acts of political suppression and enlarging the economic pie. Data opacity can be used as a tool to keep citizens in the dark, but it might come at the cost of foregoing opportunities to increase the economic pie. We also acknowledge some limitations in our work, highlighting vast opportunities for future research in this area. Our sample is a selection of developing economies, and a few economies that may have transitioned into high-income status. This is a limitation of the SCI. The estimated effects may be larger when a global sample is included. Furthermore, empirically teasing out the specific mechanisms of the effects is beyond the scope of this study. Finally, this study is timely given the recent explosion of data all across the globe. Our study is mostly limited to traditional sources of data, and thus the effect of new types of data such as big data is a promising area for future research. This would also bring in the issue of disinformation that has been a rising concern with the spread of digital technologies. 12 The 5.9-7.5 % range is due to the choice of sample. 
The relationship between GDP per capita and the SCI is stronger for the cross-section 2005 sample of analysis than for the full sample that includes any country with GDP per capita and SCI data available in 2005. The total loss using the full sample is 5.9 (3.5 + 2.4) percent. The total loss based on the estimation sample is 7.5 (5.1 + 2.4) percent.

References

Abi-Nassif, Christophe, Asif Mohammed Islam, and Daniel Lederman (2020). “Perceptions, Contagion, and Civil Unrest.” Policy Research Working Paper No. 9416. World Bank, Washington, DC.
Abu Bader, Suleiman and Elena Ianchovichina (2019). “Polarization, Foreign Military Intervention, and Civil Conflict.” Journal of Development Economics 141: 102248. https://doi.org/10.1016/j.jdeveco.2018.06.006
Acemoglu, Daron, Simon Johnson, and James A. Robinson (2001). “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91(5): 1369-1401.
Acemoglu, Daron, Simon Johnson, James A. Robinson, and Yunyong Thaicharoen (2003). “Institutional Causes, Macroeconomic Symptoms: Volatility, Crises and Growth.” Journal of Monetary Economics 50: 49-123.
Acemoglu, Daron, Simon Johnson, and James A. Robinson (2005). “Institutions as a Fundamental Cause of Long-run Growth.” In Philippe Aghion and Steven N. Durlauf (Eds) Handbook of Economic Growth Vol 1A (pp. 385-472). Amsterdam and San Diego, Elsevier, North-Holland.
Acemoglu, Daron, Suresh Naidu, Pascual Restrepo, and James A. Robinson (2019). “Democracy Does Cause Growth.” Journal of Political Economy 127(1): 47-100.
Antras, Pol and Davin Chor (2013). “Organizing the Global Value Chain.” Econometrica 81(6): 2127-2204.
Arellano, Manuel, and Olympia Bover (1995). “Another Look at the Instrumental Variable Estimation of Error-components Models.” Journal of Econometrics 68: 29–51.
Arezki, Rabah, Simeon Djankov, Ha Nguyen, and Ivan Yotzov (2020). “Reform Chatter and Democracy.” Policy Research Working Paper No. 9319. World Bank, Washington, DC.
Armand, Alex, Alexander Coutts, Pedro C. Vicente, and Ines Vilela (2020). “Does Information Break the Political Curse? Experimental Evidence from Mozambique.” American Economic Review 110 (1): 3432-53.
Bac, Mehmet (2001). “Corruption, Connections and Transparency: Does a Better Screen Imply a Better Scene?” Public Choice 107: 87-96.
Barro, Robert J. (1991). “Economic Growth in a Cross Section of Countries.” Quarterly Journal of Economics 106(2): 407-443.
Bazzi, Samuel and Michael A. Clemens (2013). “Blunt Instruments: Avoiding Common Pitfalls in Identifying the Causes of Economic Growth.” American Economic Journal: Macroeconomics 5(2): 152–186.
Baum, Christopher F., Mark E. Schaffer, and Steven Stillman (2007). “Enhanced Routines for Instrumental Variables/GMM Estimation and Testing.” The Stata Journal 7(4): 465–506.
Binswanger, Johannes and Manuel Oechslin (2015). “Disagreement and Learning about Reforms.” The Economic Journal 125: 853-886.
Binswanger, Johannes and Manuel Oechslin (2020). “Better Statistics, Better Economic Policies?” European Economic Review 130: 103588.
Blundell, Richard, and Stephen Bond (1998). “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models.” Journal of Econometrics 87: 115–143.
Brada, Josef C., Zdenek Drabek, Jose A. Mendez, and M. Fabricio Perez (2019). “National Levels of Corruption and Foreign Direct Investment.” Journal of Comparative Economics 47(1): 31-49.
Brueckner, Markus, and Daniel Lederman (2018). “Inequality and Economic Growth: The Role of Initial Income.” Journal of Economic Growth 23: 341-366.
Brunetti, Aymo, Gregory Kisunko, and Beatrice Weder (1998). “Credibility Rules and Economic Growth: Evidence from a Worldwide Survey of the Private Sector.” World Bank Economic Review 12(3): 353-384.
Brunetti, Aymo and Beatrice Weder (2003). “A Free Press is Bad News for Corruption.” Journal of Public Economics 87(7-8): 1801-1824.
Campos, J. Edgardo, Donald Lien, and Sanjay Pradhan (1999).
“The Impact of Corruption on Investment: Predictability Matters.” World Development 27(6): 1059-1067.
Choi, Sangyup and Yuko Hashimoto (2018). “Does Transparency Pay? Evidence from IMF Data Transparency Policy Reforms and Emerging Market Sovereign Bond Spreads.” Journal of International Money and Finance 88: 171-190.
Coase, Ronald (1960). “The Problem of Social Cost.” The Journal of Law and Economics 3: 1-44.
Cordis, Adriana S. and Patrick L. Warren (2014). “Sunshine as Disinfectant: The Effect of State Freedom of Information Act laws on Public Corruption.” Journal of Public Economics 115: 18-36.
Das, Jishnu, Quy-Toan Do, Karen Shaines and Sowmya Srikant (2013). “U.S. and Them: The Geography of Academic Research.” Journal of Development Economics 105: 112-130.
De Soto, Hernando (1989). The Other Path: The Invisible Revolution in the Third World, New York: Harper.
Devarajan, Shantayanan (2013). “Africa's Statistical Tragedy.” Review of Income and Wealth 59 (Special Issue): 9-15.
Djankov, Simeon, Edward Glaeser, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer (2003). “The New Comparative Economics.” Journal of Comparative Economics 31: 595-619.
Ehrlich, Isaac and Francis T. Lui (1999). “Bureaucratic Corruption and Endogenous Economic Growth.” Journal of Political Economy 107(6): S270-S293.
Grajalez, Carlos Gómez, Eileen Magnello, Robert Woods, and Julian Champkin (2013). “Great Moments in Statistics.” Significance 10 (6): 21–28.
Grembi, Veronica, Tommaso Nannicini, and Ugo Troiano (2016). “Do Fiscal Rules Matter?” American Economic Journal: Applied Economics 8(3): 1-30.
Gruss, Bertrand and Suhaib Kebhaj (2019). “Commodity Terms of Trade: A New Database.” IMF Working Paper WP/19/21.
Hall, Robert E. and Charles I. Jones (1999). “Why Do Some Countries Produce So Much More Output Per Worker Than Others?” Quarterly Journal of Economics 114(1): 83-116.
Hollyer, James, B. Peter Rosendorff, and James Raymond Vreeland (2011).
“Democracy and Transparency.” Journal of Politics 73 (4): 1191–205.
Islam, Roumeen (2006). “Does More Transparency Go along with Better Governance?” Economics and Politics 18 (2): 121–67.
Kaufmann, Daniel, Aart Kraay, and Massimo Mastruzzi (2010). “The Worldwide Governance Indicators: Methodology and Analytical Issues.” Policy Research Working Paper 5430. World Bank, Washington, DC.
Khemani, Stuti, Ernesto Dal Bo, Claudio Ferraz, Frederico S. Finan, Corinne Louise Stephenson Johnson, Adesinaola M. Odugbemi, Dikshya Thapa and Scott D. Abrahams (2016). “Making Politics Work for Development: Harnessing Transparency and Citizen Engagement.” Policy Research Report 106337. World Bank, Washington, DC.
Kosec, Katrina and Leonard Wantchekon (2020). “Can Information Improve Rural Governance and Service Delivery?” World Development 104376. https://doi.org/10.1016/j.worlddev.2018.07.017
Kraay, Aart (2015). “Weak Instruments in Growth Regressions: Implications for Recent Cross-Country Evidence on Inequality and Growth.” Policy Research Working Paper 7494. World Bank, Washington, DC.
Kubota, Megumi, and Albert Zeufack (2020). “Assessing the Returns on Investment in Data Openness and Transparency.” Policy Research Working Paper 9139. World Bank, Washington, DC.
La Porta, Rafael, Florencio Lopez-de-Silanes, and Andrei Shleifer (2008). “The Economic Consequences of Legal Origins.” Journal of Economic Literature 46 (2): 285-332.
Lederman, Daniel, Norman V. Loayza, and Rodrigo Soares (2005). “Accountability and Corruption: Political Institutions Matter.” Economics and Politics 17(1): 1-35.
Levine, Ross, Norman Loayza, and Thorsten Beck (2000). “Financial Intermediation and Growth: Causality and Causes.” Journal of Monetary Economics 46: 31-77.
Mauro, Paolo (1995). “Corruption and Growth.” Quarterly Journal of Economics 110(3): 681-712.
Mauro, Paolo (1998). “Corruption and the Composition of Government Expenditure.” Journal of Public Economics 69: 263-279.
McMillan, John and Pablo Zoido (2004). “How to Subvert Democracy: Montesinos in Peru.” Journal of Economic Perspectives 18(4): 69-92.
Nagaraj, Abhishek, Esther Shears, and Mathijs de Vaan (2020). “Improving Data Access Democratizes and Diversifies Science.” Proceedings of the National Academy of Sciences 117(38): 23490-23498. DOI: 10.1073/pnas.2001682117
North, Douglass C. (1989). “Institutions and Economic Growth: An Historical Introduction.” World Development 17(9): 1319-1332.
North, Douglass C. (1990). Institutions, Institutional Change and Economic Performance. Cambridge: Cambridge University Press.
Pettersson, Therese and Magnus Öberg (2020). “Organized Violence, 1989-2019.” Journal of Peace Research 57(4).
Prat, Andrea (2005). “The Wrong Kind of Transparency.” American Economic Review 95(3): 862-877.
Rodrik, Dani (2000). “Institutions for High-Quality Growth: What They Are and How to Acquire Them.” Studies in Comparative International Development 35(3): 3-31.
Rodrik, Dani, Arvind Subramanian, and Francesco Trebbi (2004). “Institutions Rule: The Primacy of Institutions Over Geography and Integration in Economic Development.” Journal of Economic Growth 9: 131-156.
Rodrik, Dani (2010). “Diagnostics before Prescription.” Journal of Economic Perspectives 24(3): 33-44.
Romer, Paul (1990). “Endogenous Technological Change.” Journal of Political Economy 98: S71-102.
Romer, Paul (1994). “The Origins of Endogenous Growth.” Journal of Economic Perspectives 8(1): 3-22.
Sandefur, Justin and Amanda Glassman (2015). “The Political Economy of Bad Data: Evidence from African Survey & Administrative Statistics.” The Journal of Development Studies 51(2): 116-132.
Shleifer, Andrei and Robert W. Vishny (1993). “Corruption.” Quarterly Journal of Economics 108(3): 599-617.
Varvarigos, Dimitrios and Panagiotis Arsenis (2015). “Corruption, Fertility, and Human Capital.” Journal of Economic Behavior and Organization 109: 145-162.
Wei, Shang-Jin (2000).
“How Taxing is Corruption on International Investors?” Review of Economics and Statistics 82(1): 1-11.
Williams, Andrew (2009). “On the Release of Information by Governments: Causes and Consequences.” Journal of Development Economics 89: 124-38.
Williamson, Oliver E. (2000). “The New Institutional Economics: Taking Stock, Looking Ahead.” Journal of Economic Literature 38(3): 595-613.
World Bank (2021). World Development Report 2021: Data for Better Lives, forthcoming. Washington, DC: World Bank.

Table 1: Statistical Capacity and Growth – Main Estimations

Dependent variable: Log of GDP per capita, 2018, in columns (1)-(3); Log of GDP per capita in columns (4)-(10).
Model: OLS (Cross-section) in columns (1)-(3); Random Effects (panel), Country Fixed Effects & Year Effects, and System GMM (two step) in columns (4)-(10).
Standard errors are in parentheses after each coefficient.

Log of Statistical Capacity score (Overall average), 2005, columns (1)-(3): 0.408*** (0.111); 0.417*** (0.132); 0.348*** (0.129)
Log of Statistical Capacity score lagged (Overall average), columns (4)-(10): 0.031*** (0.009); 0.022** (0.010); 0.023** (0.012); 0.030*** (0.011); 0.029*** (0.009); 0.028*** (0.011); 0.064*** (0.023)
Log of GDP per capita, 2005, columns (1)-(3): 0.946*** (0.022); 0.967*** (0.049); 0.981*** (0.051)
Log GDP per Capita lagged, columns (4)-(10): 0.997*** (0.003); 0.994*** (0.003); 0.999*** (0.006); 0.996*** (0.019); 0.990*** (0.011); 0.988*** (0.010); 0.993*** (0.009)
Main Control Variables, columns (1)-(10): NO; YES; YES; YES; YES; NO; YES; YES; YES; YES
Debt Shock (log difference t-bill x External debt lagged), columns (1)-(10): NO; NO; YES; NO; NO; NO; NO; NO; YES; NO
Year Fixed Effects, columns (1)-(10): NO; NO; NO; YES; YES; YES; YES; YES; YES; YES
Instrumented variables (System GMM): Lagged GDP per Capita in four columns; Lagged GDP per Capita and Lagged Statistical Capacity Score in one column
Hansen Test (P-Value), System GMM columns: 0.223; 0.452; 0.114; 0.220; 1.000
Arellano-Bond test for AR(1), System GMM columns: 0.000; 0.000; 0.000; 0.001; 0.000
Arellano-Bond test for AR(2), System GMM columns: 0.045; 0.049; 0.052; 0.058; 0.046
Arellano-Bond test for AR(3), System GMM columns: 0.127; 0.211; 0.211; 0.201; 0.202
Lags, System GMM columns: ALL in two columns; 3 or more in three columns
Number of observations, columns (1)-(10): 91; 91; 81; 1,301; 1,301; 1,301; 1,301; 1,301; 1,149; 1,301
Adjusted R2 (reported for four columns): 0.969; 0.968; 0.965; 0.999

Note: *** p<0.01, ** p<0.05, * p<0.1. Robust standard errors clustered at the country level. For GMM estimates, Windmeijer-corrected standard errors are in parentheses. Control variables are lagged (2005 values for cross-sectional estimates) and include governance variables from the Worldwide Governance Indicators (such as Voice and Accountability, Political Stability, Rule of Law, and Control of Corruption), trade as a share of GDP, commodity price shocks, domestic private credit (% of GDP), shares of manufacturing and agriculture in GDP, primary school enrollment, and internal conflict (proportion of years with internal conflict for cross-sectional estimates). Trade is the only consistently statistically significant variable (at least at the 5% level). Other controls are largely not statistically significant across most specifications.

Table 2: Statistical Capacity and Growth – Instrumental Variables

Stage: Second Stage in columns (1), (3), (5); First Stage in columns (2), (4), (6).
Outcome variable: Log of GDP per capita, 2018, in the second-stage columns; Log of Statistical Capacity score (Overall average), 2005, in the first-stage columns.
Standard errors are in parentheses after each coefficient.

Log of Statistical Capacity score (Overall average), 2005, columns (1), (3), (5): 0.508** (0.254); 0.542*** (0.188); 0.541** (0.211)
French Legal Origin, columns (2), (4), (6): 0.132*** (0.042); 0.131*** (0.040); 0.148*** (0.044)
German Legal Origin, columns (2), (4), (6): 0.335*** (0.053); 0.315*** (0.059); 0.324*** (0.053)
WGI Government Effectiveness 96-03, columns (4), (6): 0.212*** (0.068); 0.231*** (0.071)
Log of GDP per capita, 2005, columns (1)-(6): 0.967*** (0.047); -0.023 (0.043); 0.967*** (0.048); -0.047 (0.040); 0.985*** (0.050); -0.053 (0.039)
Main Control Variables, columns (1)-(6): YES; YES; YES; YES; YES; YES
Debt Shock (log difference T-bill (2018-2005) x External debt lagged), columns (1)-(6): NO; NO; NO; NO; YES; YES

Diagnostics by specification, in the order of column pairs (1)-(2), (3)-(4), (5)-(6):
Shea Partial R2: 0.178; 0.265; 0.275
Underidentification test (p-value): 0.001; 0.000; 0.002
Weak identification test (F stat): 19.717; 13.425; 15.994
Stock-Yogo weak ID test critical values:
    5% maximal IV relative bias (second and third specifications): 13.910; 13.910
    10% maximal IV relative bias (second and third specifications): 9.080; 9.080
    20% maximal IV relative bias (second and third specifications): 6.460; 6.460
    30% maximal IV relative bias (second and third specifications): 5.390; 5.390
    10% maximal IV size: 19.930; 22.300; 22.300
    15% maximal IV size: 11.590; 12.830; 12.830
    20% maximal IV size: 8.750; 9.540; 9.540
    25% maximal IV size: 7.250; 7.800; 7.800
Hansen J statistic (overidentification test, p-value): 0.275; 0.543; 0.222

Number of observations, columns (1)-(6): 91; 91; 91; 91; 81; 81
Adjusted R2, columns (1)-(6): 0.968; 0.349; 0.967; 0.410; 0.964; 0.418

Note: *** p<0.01, ** p<0.05, * p<0.1. Robust standard errors reported. UK legal origin omitted. Control variables (values for year 2005) include governance variables from the Worldwide Governance Indicators (such as Voice and Accountability, Political Stability, Rule of Law, and Control of Corruption), trade as a share of GDP, commodity price shocks (between 2005 and 2018), domestic private credit (% of GDP), shares of manufacturing and agriculture in GDP, and primary school enrollment.

Table 3: Summary Statistics – Panel Data
Variable Mean Std. Dev.
Min Max Observations
Log of GDP per capita
    overall      7.967   1.087   5.367   9.898   N = 1301
    between              1.107   5.438   9.866   n = 124
    within               0.117   7.383   8.393   T bar = 10.4919
Log of Statistical Capacity score lagged (Overall average)
    overall     -0.367   0.222  -1.455  -0.011   N = 1301
    between              0.236  -1.099  -0.066   n = 124
    within               0.082  -0.848   0.063   T bar = 10.4919
Log GDP per Capita lagged
    overall      7.940   1.088   5.393   9.929   N = 1301
    between              1.106   5.442   9.819   n = 124
    within               0.124   7.184   8.351   T bar = 10.4919
Log difference of commodity Net Export Price Index (historic fixed weights)
    overall      0.002   0.027  -0.219   0.158   N = 1301
    between              0.014  -0.033   0.120   n = 124
    within               0.026  -0.264   0.114   T bar = 10.4919
Domestic credit to private sector (% of GDP), lagged
    overall      0.386   0.298   0.013   1.601   N = 1301
    between              0.276   0.019   1.458   n = 124
    within               0.089  -0.005   0.861   T bar = 10.4919
Internal Conflict - 25 or more battle deaths a year, lagged
    overall      0.165   0.372   0.000   1.000   N = 1301
    between              0.291   0.000   1.000   n = 124
    within               0.219  -0.763   1.094   T bar = 10.4919
WGI Voice and Accountability - lagged
    overall     -0.262   0.761  -2.203   1.293   N = 1301
    between              0.791  -2.109   1.208   n = 124
    within               0.153  -1.024   0.696   T bar = 10.4919
WGI Political Stability - lagged
    overall     -0.337   0.863  -3.181   1.385   N = 1301
    between              0.852  -2.974   1.259   n = 124
    within               0.262  -2.010   0.757   T bar = 10.4919
WGI Rule of Law - lagged
    overall     -0.412   0.640  -1.897   1.433   N = 1301
    between              0.680  -1.833   1.298   n = 124
    within               0.140  -1.352   0.177   T bar = 10.4919
WGI Control of Corruption - lagged
    overall     -0.406   0.638  -1.766   1.582   N = 1301
    between              0.665  -1.526   1.397   n = 124
    within               0.143  -1.273   0.174   T bar = 10.4919
Manufacturing, value added (share of GDP) - lagged
    overall      0.127   0.071   0.001   0.506   N = 1301
    between              0.068   0.004   0.422   n = 124
    within               0.015   0.033   0.211   T bar = 10.4919
Agriculture, forestry, and fishing, value added (share of GDP) - lagged
    overall      0.142   0.108   0.009   0.652   N = 1301
    between              0.113   0.012   0.551   n = 124
    within               0.024  -0.124   0.309   T bar = 10.4919
School enrollment, primary (gross) - lagged
    overall      1.044   0.136   0.466   1.481   N = 1301
    between              0.134   0.563   1.410   n = 124
    within               0.057   0.585   1.246   T bar = 10.4919
Trade (share of GDP) - lagged
    overall      0.835   0.366   0.207   2.771   N = 1301
    between              0.349   0.257   1.933   n = 124
    within               0.117   0.350   2.081   T bar = 10.4919
Debt Shock (log difference T-bill x External debt lagged)
    overall     -0.014   0.074  -0.421   0.394   N = 1149
    between              0.017  -0.075   0.062   n = 106
    within               0.073  -0.415   0.386   T bar = 10.8396

Table 4: Summary Statistics – Cross-section Data

Variable  Obs  Mean  Std. Dev.  Min  Max
Log of GDP per capita, 2018  91  8.114  1.049  5.351  9.720
Log of Statistical Capacity score (Overall average), 2005  91  -0.381  0.227  -1.168  -0.045
French Legal Origin  91  0.604  0.492  0.000  1.000
German Legal Origin  91  0.077  0.268  0.000  1.000
UK Legal Origin  91  0.319  0.469  0.000  1.000
WGI Government Effectiveness 96-03  91  -0.300  0.547  -1.485  1.206
Log of GDP per capita, 2005  91  7.785  1.054  5.399  9.510
Log difference of commodity Net Export Price Index, 2018-2005  91  0.012  0.027  -0.026  0.142
Domestic credit to private sector (% of GDP), 2005  91  0.304  0.247  0.016  1.382
Internal Conflict (Proportion of years 2005-2018)  91  0.173  0.307  0.000  1.000
WGI Voice and Accountability, 2005  91  -0.260  0.736  -1.858  1.293
WGI Political Stability, 2005  91  -0.292  0.865  -2.100  1.385
WGI Rule of Law, 2005  91  -0.387  0.653  -1.632  1.305
WGI Control of Corruption, 2005  91  -0.406  0.654  -1.507  1.471
Manufacturing, value added (% of GDP), 2005  91  0.132  0.075  0.002  0.496
Agriculture, forestry, and fishing, value added (% of GDP), 2005  91  0.156  0.112  0.018  0.538
School enrollment, primary (% gross), 2005  91  1.041  0.145  0.483  1.389
Trade (% of GDP), 2005  91  0.827  0.361  0.271  2.039
Debt Shock (log difference T-bill (2018-2005) x External debt lagged)  81  -22.840  13.866  -71.144  -2.027

Table 5: Correlations – Panel Data
Variable  (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9)  (10)  (11)  (12)  (13)  (14)
(1) Log of GDP per capita  1
(2) Log of Statistical Capacity score lagged (Overall average)  0.3333***  1
(3) Log GDP per Capita lagged  0.9993***  0.3286***  1
(4) Log difference of commodity Net Export Price Index (historic fixed weights)  0.0151  -0.0536*  0.0146  1
(5) Domestic credit to private sector (% of GDP), lagged  0.4663***  0.2770***  0.4664***  -0.0348  1
(6) Internal Conflict - 25 or more battle deaths a year, lagged  -0.1098***  0.0117  -0.1090***  -0.0125  -0.0597**  1
(7) WGI Voice and Accountability - lagged  0.4364***  0.2418***  0.4370***  -0.0347  0.2792***  -0.2514***  1
(8) WGI Political Stability - lagged  0.4238***  0.1096***  0.4235***  -0.0252  0.2702***  -0.5210***  0.5647***  1
(9) WGI Rule of Law - lagged  0.5029***  0.2616***  0.5013***  -0.0533*  0.5015***  -0.1734***  0.7229***  0.6933***  1
(10) WGI Control of Corruption - lagged  0.4921***  0.2011***  0.4911***  -0.0308  0.4075***  -0.2103***  0.6948***  0.6644***  0.8841***  1
(11) Manufacturing, value added (share of GDP) - lagged  0.2096***  0.2944***  0.2068***  -0.0429  0.1813***  0.1757***  -0.1281***  -0.1259***  -0.0104  -0.0316  1
(12) Agriculture, forestry, and fishing, value added (share of GDP) - lagged  -0.8326***  -0.3319***  -0.8326***  -0.0109  -0.4057***  0.1247***  -0.3591***  -0.3515***  -0.4024***  -0.4085***  -0.2722***  1
(13) School enrollment, primary (gross) - lagged  0.0056  0.1080***  0.0037  -0.0086  0.0351  -0.0617**  0.1268***  0.1284***  0.0808***  0.0978***  0.0616**  -0.0428  1
(14) Trade (share of GDP) - lagged  0.2398***  -0.0738***  0.2376***  0.0375  0.2239***  -0.2774***  0.1179***  0.3907***  0.2120***  0.1871***  -0.0017  -0.2086***  -0.0362  1

Table 6: Correlations –
Cross-section Data

Variable  (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9)  (10)  (11)  (12)  (13)  (14)  (15)  (16)  (17)  (18)
(1) Log of GDP per capita, 2018  1
(2) Log of Statistical Capacity score (Overall average), 2005  0.4219***  1
(3) French Legal Origin  -0.0497  0.1246  1
(4) German Legal Origin  0.2739***  0.3020***  -0.3568***  1
(5) UK Legal Origin  -0.1045  -0.3034***  -0.8453***  -0.1974*  1
(6) WGI Government Effectiveness 96-03  0.6093***  0.3954***  -0.2613**  0.1811*  0.1706  1
(7) Log of GDP per capita, 2005  0.9811***  0.3511***  -0.0449  0.2389**  -0.0895  0.5959***  1
(8) Log difference of commodity Net Export Price Index, 2018-2005  0.0907  -0.1014  0.0488  0.0101  -0.057  -0.0929  0.1131  1
(9) Domestic credit to private sector (% of GDP), 2005  0.4757***  0.2304**  -0.2279**  0.0115  0.2326**  0.6271***  0.4837***  -0.0575  1
(10) Internal Conflict (Proportion of years 2005-2018)  -0.1003  -0.0426  0.037  -0.1536  0.0491  -0.2442**  -0.0925  -0.1103  -0.1301  1
(11) WGI Voice and Accountability, 2005  0.5125***  0.2799***  -0.1849*  0.1803*  0.0909  0.6884***  0.5260***  -0.0711  0.4579***  -0.3063***  1
(12) WGI Political Stability, 2005  0.3923***  0.0191  -0.2114**  0.2142**  0.0993  0.5724***  0.4114***  0.1594  0.3038***  -0.5590***  0.5266***  1
(13) WGI Rule of Law, 2005  0.5137***  0.2332**  -0.3177***  0.1229  0.2632**  0.8182***  0.5156***  -0.091  0.5741***  -0.2886***  0.7691***  0.7315***  1
(14) WGI Control of Corruption, 2005  0.5434***  0.2276**  -0.2514**  0.1529  0.1764*  0.8049***  0.5531***  -0.0053  0.4868***  -0.2772***  0.7450***  0.6685***  0.8512***  1
(15) Manufacturing, value added (% of GDP), 2005  0.2234**  0.3686***  0.1338  0.0674  -0.1790*  0.0949  0.2043*  -0.2154**  0.1219  0.2179**  -0.0197  -0.095  -0.0034  -0.0206  1
(16) Agriculture, forestry, and fishing, value added (% of GDP), 2005  -0.8377***  -0.4353***  -0.0169  -0.1796*  0.1204  -0.5081***  -0.8475***  -0.0589  -0.4538***  0.0922  -0.4699***  -0.3421***  -0.4641***  -0.4997***  -0.2956***  1
(17) School enrollment, primary (% gross), 2005  0.1816*  0.1496  0.1147  -0.1248  -0.049  0.1204  0.1800*  0.0391  0.079  -0.1938*  0.1931*  0.1162  0.1472  0.142  0.1433  -0.2267**  1
(18) Trade (% of GDP), 2005  0.2956***  0.0074  -0.1084  0.1501  0.0279  0.2608**  0.2686**  0.1518  0.3652***  -0.2631**  0.0537  0.3738***  0.2235**  0.1740*  0.0313  -0.2032*  -0.0228  1

Table A1: Variable Definitions

Variable: GDP per capita
Definition: GDP per capita in constant 2010 US$
Source: World Development Indicators

Variable: Statistical Capacity score (Overall average)
Definition: Average of three sub-indicators: Source data, Methodology, and Periodicity and timeliness of socioeconomic indicators. Source data reflects whether a country conducts data collection activity in line with internationally recommended periodicity, and whether data from administrative systems are available and reliable for statistical estimation purposes. Specifically, the criteria used are the periodicity of population and agricultural censuses, the periodicity of poverty and health related surveys, and completeness of vital registration system coverage. Statistical methodology measures a country’s ability to adhere to internationally recommended standards and methods. This aspect is captured by assessing guidelines and procedures used to compile macroeconomic statistics and social data reporting and estimation practices. Countries are evaluated against a set of criteria such as use of an updated national accounts base year, use of the latest balance of payments manual, external debt reporting status, subscription to International Monetary Fund’s Special Data Dissemination Standard, and enrollment data reporting to the United Nations Educational, Scientific, and Cultural Organization. Periodicity and timeliness measure the availability and periodicity of key socioeconomic indicators, of which nine are MDG indicators. This dimension attempts to measure the extent to which data are made accessible to users through transformation of source data into timely statistical outputs. Criteria used include indicators on income poverty, child and maternal health, HIV/AIDS, primary completion, gender equality, access to water and GDP growth.
Source: World Bank, http://datatopics.worldbank.org/statisticalcapacity/ Methodology note: https://datatopics.worldbank.org/statisticalcapacity/files/Note.pdf

Variable: Commodity Net Export Price Index (historic fixed weights)
Definition: The commodity terms of trade index is based on international prices of up to 45 individual commodities, constituting broad categories of energy, metals, food and beverages, and agricultural raw materials. We calculate commodity price shocks by taking the first differences of the log of the price index. Historical fixed weights are employed.
Source: Gruss and Kebhaj (2019)

Variable: Domestic credit to private sector (% of GDP)
Definition: Self-explanatory
Source: World Development Indicators

Variable: WGI Voice and Accountability
Definition: Voice and accountability captures perceptions of the extent to which a country's citizens are able to participate in selecting their government, as well as freedom of expression, freedom of association, and a free media.
Source: Worldwide Governance Indicators

Variable: WGI Political Stability
Definition: Political Stability and Absence of Violence/Terrorism measures perceptions of the likelihood of political instability and/or politically motivated violence, including terrorism.
Source: Worldwide Governance Indicators

Variable: WGI Rule of Law
Definition: Rule of law captures perceptions of the extent to which agents have confidence in and abide by the rules of society, and in particular the quality of contract enforcement, property rights, the police, and the courts, as well as the likelihood of crime and violence.
Source: Worldwide Governance Indicators

Variable: WGI Control of Corruption
Definition: Control of corruption captures perceptions of the extent to which public power is exercised for private gain, including both petty and grand forms of corruption, as well as "capture" of the state by elites and private interests.
Source: Worldwide Governance Indicators

Variable: WGI Government Effectiveness
Definition: Government effectiveness captures perceptions of the quality of public services, the quality of the civil service and the degree of its independence from political pressures, the quality of policy formulation and implementation, and the credibility of the government's commitment to such policies.
Source: Worldwide Governance Indicators

Variable: Manufacturing, value added (share of GDP)
Definition: Self-explanatory
Source: World Development Indicators

Variable: Agriculture, forestry, and fishing, value added (share of GDP)
Definition: Self-explanatory
Source: World Development Indicators

Variable: School enrollment, primary (gross)
Definition: Self-explanatory
Source: World Development Indicators

Variable: Trade (share of GDP)
Definition: Self-explanatory
Source: World Development Indicators

Variable: Debt Shock
Definition: Estimated as the log difference of the US T-bill (10-year, long term) multiplied by the lagged external debt.
Source: World Development Indicators, US Treasury

Variable: French Legal Origin
Definition: Binary variable that takes a value of 1 for countries with French Legal Origin, 0 otherwise.
Source: La Porta et al. (2008)

Variable: German Legal Origin
Definition: Binary variable that takes a value of 1 for countries with German Legal Origin, 0 otherwise.
Source: La Porta et al. (2008)

Variable: UK Legal Origin
Definition: Binary variable that takes a value of 1 for countries with UK Legal Origin, 0 otherwise.
Source: La Porta et al. (2008)

Variable: Internal Conflict (25 or more battle deaths a year)
Definition: Internal armed conflict is defined (similar to Abu Bader and Ianchovichina, 2019) as a contested incompatibility where the use of armed force between two parties, of which at least one is the government of a state, results in at least 25 battle-related deaths per year. For the panel dataset the variable takes a value of 1 if there is internal armed conflict, zero otherwise. The exception is economies such as Iraq and Afghanistan that were in conflict for all years in the sample and thus attain values of zero. For cross-section estimations the variable is the proportion of years that an economy experienced internal conflict between 2005 and 2018.
Source: Pettersson and Öberg (2020), UCDP/PRIO Armed Conflict Database

Table A2: Country List – Panel Data

Afghanistan; Albania; Algeria; Angola; Antigua and Barbuda; Argentina; Armenia; Azerbaijan; Bangladesh; Belarus; Belize; Benin; Bhutan; Bolivia; Botswana; Brazil; Bulgaria; Burkina Faso; Burundi; Cabo Verde; Cambodia; Cameroon; Central African Republic; Chad; Chile;
China; Colombia; Congo, Dem. Rep.; Congo, Rep.; Costa Rica; Côte d'Ivoire; Croatia; Dominica; Dominican Republic; Ecuador; Egypt, Arab Rep.; El Salvador; Equatorial Guinea; Eritrea; Estonia; Eswatini; Gabon; Gambia, The; Georgia; Ghana; Grenada; Guatemala; Guinea; Guinea-Bissau; Guyana;
Honduras; Hungary; India; Indonesia; Iran, Islamic Rep.; Iraq; Jamaica; Jordan; Kazakhstan; Kenya; Kyrgyz Republic; Lao PDR; Latvia; Lebanon; Lesotho; Liberia; Libya; Lithuania; Madagascar; Malawi; Malaysia; Maldives; Mauritania; Mauritius; Mexico;
Moldova; Mongolia; Montenegro; Morocco; Mozambique; Namibia; Nepal; Nicaragua; Niger; Nigeria; North Macedonia; Pakistan; Panama; Papua New Guinea; Paraguay; Peru; Philippines; Poland; Romania; Russian Federation; Rwanda; Samoa; Senegal; Serbia; Seychelles;
Sierra Leone; Slovak Republic; South Africa; Sri Lanka; St. Lucia; St. Vincent and the Grenadines; Sudan; Suriname; Tajikistan; Tanzania; Thailand; Timor-Leste; Togo; Tonga; Tunisia; Turkey; Uganda; Ukraine; Uruguay; Vanuatu; Venezuela, RB; Vietnam; Zambia; Zimbabwe

Table A3: Country List – Cross-section Data

Albania; Algeria; Argentina; Azerbaijan; Bangladesh; Belarus; Belize; Benin; Bhutan; Bolivia; Botswana; Brazil; Bulgaria; Burkina Faso; Burundi; Cabo Verde;
Cambodia; Cameroon; Chad; Chile; Colombia; Congo, Rep.; Costa Rica; Croatia; Dominica; Dominican Republic; Ecuador; Egypt, Arab Rep.; El Salvador; Eswatini; Gambia, The; Georgia;
Ghana; Grenada; Guatemala; Guinea; Guinea-Bissau; Guyana; Honduras; Hungary; Indonesia; Kazakhstan; Kenya; Kyrgyz Republic; Lao PDR; Lebanon; Libya; Madagascar;
Malawi; Malaysia; Mauritania; Mauritius; Mexico; Moldova; Mongolia; Morocco; Mozambique; Namibia; Nepal; Nicaragua; Niger; Nigeria; North Macedonia; Pakistan;
Panama; Paraguay; Peru; Philippines; Poland; Romania; Russian Federation; Rwanda; Samoa; Seychelles; South Africa; Sri Lanka; St. Lucia; St. Vincent and the Grenadines; Sudan; Tanzania;
Thailand; Togo; Tonga; Tunisia; Turkey; Uganda; Ukraine; Uruguay; Vanuatu; Vietnam; Zambia
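
As a rough illustration of how the constructed variables defined in Table A1 could be computed, the Python sketch below takes first differences of the log of a price index (the commodity price shock), multiplies the log difference of a T-bill rate by external debt lagged one period (the debt shock), and computes the proportion of years with internal conflict (the cross-section conflict measure). The function names, the one-period lag, the units of external debt, and all numbers are assumptions made for illustration; they are not taken from the authors' code or data.

```python
import math


def log_difference(series):
    """First differences of the natural log of a positive series.

    The result is one element shorter than the input: entry t equals
    log(series[t + 1]) - log(series[t]).
    """
    return [math.log(curr) - math.log(prev)
            for prev, curr in zip(series, series[1:])]


def debt_shock(tbill_rate, external_debt):
    """Log difference of the T-bill rate times external debt lagged one period.

    Both inputs are lists aligned by year. The one-period lag and the units
    of external debt are assumptions of this sketch.
    """
    changes = log_difference(tbill_rate)
    # Pair the change between year t-1 and t with external debt from year t-1.
    return [change * lagged_debt
            for change, lagged_debt in zip(changes, external_debt[:-1])]


def conflict_share(conflict_flags):
    """Proportion of years flagged (0 or 1) as having internal conflict."""
    return sum(conflict_flags) / len(conflict_flags)


# Made-up numbers, for illustration only.
price_index = [100.0, 110.0, 99.0]
tbill = [4.0, 2.0, 2.0]
debt = [0.5, 0.6, 0.7]

price_shocks = log_difference(price_index)   # commodity price shocks
shocks = debt_shock(tbill, debt)             # debt shocks
share = conflict_share([0, 1, 1, 0])         # cross-section conflict measure
```

With these inputs the price index rises and then falls, so the first commodity price shock is positive and the second negative, and the second debt shock is zero because the illustrative T-bill rate does not change in the last year.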