Policy Research Working Paper 9491

Measuring Total Factor Productivity Using the Enterprise Surveys: A Methodological Note

David C. Francis, Nona Karalashvili, Hibret Maemir, Jorge Rodriguez Meza

Development Economics, Global Indicators Group, December 2020

Abstract

Total factor productivity is a key element of economic growth and an important performance metric for policy makers. This note describes the methodology for measuring firm-level total factor productivity using the World Bank’s Enterprise Surveys cross-country data. It also presents some estimates recovered from the production function. Two versions of the production function are estimated: one Cobb-Douglas, the other a more flexible translog specification. Both estimations are at the two-digit industry level pooling all the Enterprise Surveys data across economies. Evidence is found against using a Cobb-Douglas specification, which is more parsimonious, and in favor of using the flexible translog specification. The resulting firm-level estimates are all published in the Enterprise Surveys database with a unique firm identifier to link to the rest of the Enterprise Surveys data; because the estimates are reliant on new data, they are updated periodically as new Enterprise Surveys data become available. The results show that: (i) median firms operate close to constant returns to scale; (ii) gross-output and value-added production functions provide similar ranking of sectors in terms of output elasticities, capital intensity, and returns to scale; (iii) there is large, firm-level heterogeneity in output elasticities; and (iv) gross-output-based total factor productivity measures are less dispersed than the value-added ones.

This paper is a product of the Global Indicators Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world.
Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at hmaemir@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Measuring Total Factor Productivity Using the Enterprise Surveys: A Methodological Note∗

David C. Francis, Nona Karalashvili, Hibret Maemir, Jorge Rodriguez Meza†

Keywords: TFP, Enterprise Surveys.
JEL Classification: 012

∗ This note uses the dataset “Firm Level TFP Estimates and Factor Ratios September 10 2020.dta” which is published at www.enterprisesurvyes.org in the Firm-Level Datasets for Researchers section. The estimation procedure of this note uses the full cross-country ES data, which is continuously being updated as new data becomes available. Every month estimates are updated, and older datasets are archived and available upon request.
† The authors are from the Enterprise Analysis Unit of the World Bank. We would like to thank Arvind Jain and Rita Ramalho for very helpful comments. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
1 Introduction

Total factor productivity (TFP), the ability to generate greater value or output with fewer inputs, is one of the key elements of economic growth. There is now a broad consensus that TFP differences account for the bulk of observed cross-country income differences (Klenow & Rodriguez-Clare 1997, Hall & Jones 1999). As Krugman (1994) succinctly put it, “productivity isn’t everything, but, in the long run, it is almost everything”.

Considerable scholarly analysis has been devoted to measuring productivity. The recent, increased availability of detailed firm-level datasets has further intensified the interest in the subject, including investigations of how productivity varies by characteristics of firms (see, for instance, Syverson (2011)). To the extent that data can be disaggregated, researchers can delve into within-economy differences; if data are comparable across economies and across time, cross-economy differences can be examined. For example, a large and growing body of work explores the link between within-industry productivity dispersion across firms and cross-country differences in aggregate productivity (Banerjee & Duflo 2005, Restuccia & Rogerson 2008, Hsieh & Klenow 2009, Bartelsman et al. 2013).

In the absence of comparable census data, researchers have attempted to estimate productivity using survey-based data, which are often the only available data in less developed economies. The Enterprise Surveys (ES), detailed firm-level data collected by the World Bank’s (WB) Enterprise Analysis unit, are well suited for such inquiry. To facilitate the study of productivity by researchers and policymakers, the data published with this paper provide estimates of both TFP and factor ratios. The latter, more straightforward estimates are provided because TFP estimation may be troublesome for multiple reasons, e.g. the endogeneity of input choice (Olley & Pakes 1996, Levinsohn & Petrin 2003, Ackerberg et al. 2015).
Unlike TFP estimates, some of these ratios are also available for most non-manufacturing firms.

The rest of this paper is organized as follows. Section 2 briefly summarizes the ES data, including steps taken for comparability and regarding outlier observations. Section 3 discusses the estimation of revenue-based TFP, so-called TFPR. Estimates of output elasticities and their derived characteristics are presented in section 4.

2 Data

The World Bank’s Enterprise Analysis unit has been conducting surveys using a methodology that allows cross-economy analysis since 2006. To date, over 168,000 face-to-face interviews with top managers and business owners in 144 economies have taken place under this “Global Methodology”. This note uses these data to estimate TFP. Surveys not using the Global Methodology are excluded, as are the surveys conducted earlier than 2006. The data from Zimbabwe 2011 are excluded from the analysis because of complications arising from hyperinflation just prior to the data collection. An additional 25 surveys are dropped because at least one of the key variables used in the analysis was not collected in these surveys.1 This leaves 267 surveys in 134 economies and more than 161,000 interviews with top managers and business owners of firms spanning more than 40 different industries (specified by two-digit ISIC Rev. 3.1 code). Of these interviews, more than 90,000 are with top managers and business owners of manufacturing firms, for which we provide TFPR estimates.

1 These are: Bangladesh 2007, Benin 2009, Bhutan 2009, Cambodia 2013, Cabo Verde 2009, Central African Republic 2011, Chad 2009, Congo 2009, Eritrea 2009, Fiji 2009, Gabon 2009, Lesotho 2009, Liberia 2009, Malawi 2009, Micronesia 2009, Niger 2009, Pakistan 2007, Rwanda 2011, Samoa 2009, Sierra Leone 2009, Timor-Leste 2009, Togo 2009, Tonga 2009, Vanuatu 2009, and the República Bolivariana de Venezuela 2006.
Factor ratios using labor costs and revenues are provided in the associated dataset for most firms in the sample, including firms in selected services sectors covered by the ES.

To construct TFPR, we need information on sales (Y), employment (L), capital stock (K), and intermediate inputs (M). These variables are proxied using the questions available in the data. More precisely, Y is proxied by total annual sales of the establishment (variable d2); K is proxied by the replacement value of machinery, vehicles, and equipment (variable n7a); L is proxied by the total annual cost of labor (variable n2a); and M is proxied by the total annual cost of inputs (variable n2e). For value-added (VA) specifications (presented below), VA is proxied by the difference between total annual sales of the establishment (variable d2) and annual cost of inputs (variable n2e); K and L are the same as in the gross output specification (also elaborated below). The appendix presents each variable used along with the exact wording of the questions.

International Comparability

All the variables used for the productivity estimation are collected in local currency units (LCUs), which are specific to the survey and year. Consequently, the data span different fiscal years. For the estimation of cross-economy regressions all data must be transformed to a common currency-year. To do this, all variables are first converted into U.S. dollars (USD) using the official exchange rate (period average) from the World Development Indicators (WDI).2 The data are then deflated to 2009 using the GDP deflator for the United States from the relevant reference fiscal year.3 Note that information on the closing month of the firms’ fiscal year is used to adjust exchange rates and deflators for each firm.4

Treatment of Outliers

In order to minimize sensitivity to extreme values, outlier firms are eliminated from the analysis.
More specifically, outliers in d2 (capturing Y), n7a (capturing K), n2a (capturing L), n2e (capturing M), and VA (d2 minus n2e), as well as outliers in the ratios n7a/VA and n2a/VA, were set to missing before estimating the production function. To find outliers in levels, we first transform variables as ln(x + 1), and group observations by economy and broadly defined sector (more precisely, manufacturing and services). Next, we calculate (unweighted) means and standard deviations of these transformed variables within each group. Observations that are more than three standard deviations away from the mean are then marked as outliers and set to missing. To find outliers in ratios, we first transform variables as ln(x), and group observations by industry. The three-standard-deviation rule is then applied (unweighted) and the corresponding observations are set to missing.

2 WDI indicator code: PA.NUS.FCRF
3 WDI indicator code: NY.GDP.DEFL.ZS
4 The fiscal year and its closing month information are given in variables “d2 l1 year perf indicators” and “d2 n3 last month f y perf ind” respectively.

Item Nonresponse

Another challenge in estimating total factor productivity is dealing with item nonresponse: i.e., sampled firms do not answer specific questions of the survey. For example, respondents may answer the employment question but not sales. To reduce item nonresponse, the Enterprise Surveys team follows a strict quality control process that identifies nonresponses and contacts firms to complete the data. Despite this effort, like many other firm-level datasets, the ES also suffers from item nonresponse in variables needed to calculate TFPR. One way to handle item nonresponse is through imputation. For example, in the U.S. Census Bureau’s 2007 manufacturing data 73% of observations have imputed data for at least one variable used to compute TFPR (White et al. 2018).
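As a rough illustration of the three-standard-deviation rule for levels described under Treatment of Outliers, the following Python sketch flags extreme values of a single variable. The row layout (dictionaries with `economy` and `sector` keys), the function name `flag_level_outliers`, and the use of the sample standard deviation are assumptions made for illustration; the note does not specify these details, and this is not the Enterprise Surveys team's own code.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def flag_level_outliers(rows, var, n_sd=3.0):
    """Set row[var] to None when ln(x + 1) lies more than n_sd standard
    deviations from the unweighted mean of its economy-by-sector group."""
    logs_by_group = defaultdict(list)
    for row in rows:
        if row[var] is not None:
            key = (row["economy"], row["sector"])
            logs_by_group[key].append(math.log(row[var] + 1.0))

    # Unweighted mean and standard deviation per group.
    # Assumption: sample SD; the note does not say which SD is used.
    stats = {
        key: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for key, vals in logs_by_group.items()
    }

    cleaned = []
    for row in rows:
        new_row = dict(row)  # leave the caller's data untouched
        if row[var] is not None:
            mu, sd = stats[(row["economy"], row["sector"])]
            if sd > 0 and abs(math.log(row[var] + 1.0) - mu) > n_sd * sd:
                new_row[var] = None  # outlier set to missing
        cleaned.append(new_row)
    return cleaned


# Hypothetical data: 30 ordinary sales values and one implausibly large one.
rows = [{"economy": "A", "sector": "manufacturing", "d2": 100.0 + i} for i in range(30)]
rows.append({"economy": "A", "sector": "manufacturing", "d2": 1e12})
cleaned = flag_level_outliers(rows, "d2")
```

The same function could be reused for the ratio rule by grouping on industry and taking ln(x) instead of ln(x + 1); that variant is not shown here.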
While item nonresponse may be consequential for most analysis, we do not attempt to address it in the data. We do not employ any of the available imputation or re-weighting methods that assume that data “missingness” is not ignorable and is related to underlying firm characteristics (Little & Rubin 2019). Additionally, the survey (probability) weights included in the data are agnostic to item nonresponse and the missingness of productivity estimates: weights are not re-adjusted or scaled to account for this missingness. We exclude observations missing any one of the main production function variables (i.e., sales, labor, capital, or materials). Additionally, note that in cases of negative value-added, logarithms cannot be defined, and thus these observations are not included. This leaves 50,754 observations in 134 countries.

3 Estimating Firm-Level Total Factor Productivity

We begin with a Cobb-Douglas production function (1) for ease of exposition. Throughout this paper, all lowercase variables denote the natural logarithms of the corresponding uppercase variables, representing raw, level values.

$$va_i = \beta_0 + \beta_k k_i + \beta_l l_i + \varepsilon_i \tag{1}$$

where firm-level value-added ($va_i$) is a function of inputs of capital ($k_i$) and labor ($l_i$). Firms’ (logged) TFP is estimated as the sum of the constant and the residual, i.e., $tfp_i = \hat{\beta}_0 + \hat{\varepsilon}_i$. We refer to the above model as the value-added specification (VAKL). In addition to VAKL, TFP is also estimated using the gross-output specification (YKLM), where $va_i$ in (1) is replaced with $y_i$, output in terms of total revenues, and the right-hand side has an additional input variable, expenditure on material inputs ($m_i$).5 The coefficients $\beta_k$ and $\beta_l$ estimate the elasticities of capital and labor, respectively. Throughout, we will denote estimated elasticities by $\hat{\theta}_{input}$ (e.g., $\hat{\theta}_l$), where $input \in \{k, l, m\}$.

While analytically straightforward, this estimation of TFP bears several caveats.
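To make the definition of log TFP under equation (1) concrete, here is a minimal Python sketch using simulated data. It assumes plain, unweighted ordinary least squares with no fixed effects and inputs that are exogenous by construction; the function name `cobb_douglas_tfp` and all numeric values are hypothetical and are not taken from the ES data or the note's estimation code.

```python
import numpy as np


def cobb_douglas_tfp(va, k, l):
    """OLS of log value-added on log capital and log labor.

    Returns the coefficients (beta_0, beta_k, beta_l) and log TFP,
    computed as the estimated constant plus the residual.
    """
    X = np.column_stack([np.ones_like(k), k, l])
    beta, *_ = np.linalg.lstsq(X, va, rcond=None)
    resid = va - X @ beta
    return beta, beta[0] + resid


# Simulated logs of capital, labor, and value-added (hypothetical values).
rng = np.random.default_rng(0)
n = 500
k = rng.normal(10.0, 1.0, n)
l = rng.normal(8.0, 1.0, n)
va = 1.0 + 0.3 * k + 0.6 * l + rng.normal(0.0, 0.3, n)

beta, tfp = cobb_douglas_tfp(va, k, l)
```

By construction, `tfp` equals log value-added minus the estimated contributions of capital and labor, which is the sense in which TFP is the part of output not explained by measured inputs.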
First, ordinary least squares estimates of equation (1) may suffer from a simultaneity problem: firms’ input choices may be guided by their productivity (Marschak & Andrews 1944). Several methods have been developed in order to address this endogeneity problem (Olley & Pakes 1996, Levinsohn & Petrin 2003, Ackerberg et al. 2015). In these methods, past input decisions (for instance the choice to invest in capital) are used to proxy for the current production and use of inputs. To extend future analysis to include these estimations, the ES has started collecting information on these lagged variables of inputs. This will be used in the future as the data become available. Other, firm-level fixed effects methods have been used to estimate average firm-level productivity over time. The ES has constructed longitudinal/panel data for a large number of countries: TFP estimates based on the panel data will also be provided in the future.

5 Note that other versions of (1) are also possible, e.g. YKL, or YKELM with E for Electricity. We do not analyze these here as adding a fourth input into a translog production function substantially increases the number of parameters to be estimated.

Second, there are issues associated with the fact that often only monetary (as opposed to physical) output and input expenditure are observed in typical firm-level data. Such revenue-based TFP is often referred to as TFPR ($TFPR_i = P_i \times TFPQ_i$), where R stands for revenue and Q for quantity, and $P_i$ denotes the firm’s product price. Market dynamics are inseparable within TFPR estimates, which incorporate clearing prices of inputs and revenue-based outputs and can conflate productivity and market (e.g. negotiating) power. As is the case for most firm-level datasets, the ES collect information on revenues and firm-level line item costs (rather than physical inputs and outputs), and hence TFPR is the only measure that can be estimated using the ES data.
For a recent discussion of these and other issues in estimation see, for instance, Foster et al. (2008) and Hsieh & Klenow (2009).

A third caveat relates to the importance of the functional form of the production function. The Cobb-Douglas specification in equation (1) assumes constant output elasticities, regardless of the firm’s input choices. These elasticities are given by $\beta_k$ and $\beta_l$. In other words, a firm’s output elasticity of capital, for example, does not depend on its use of labor: labor-intensive firms have the same output elasticity of capital as non-labor-intensive ones. This assumption may be unrealistic in two ways. First, imposing one production function on firms in different industries is almost surely too restrictive. Indeed, most of the empirical literature on productivity defines industries as narrowly as data permit (Syverson 2011). The ES estimations address this point with a very practical solution: TFPR is estimated separately for each industry, grouped by two-digit ISIC codes, over pooled economies.6 Second, the assumption of constant output elasticities for all three (or two) inputs may be too stringent even within industries defined at the two-digit level, so we also consider a more flexible functional form, the translog specification, which does not impose this restriction.7 Table A.3 in the Appendix presents the 16 industries with separate estimations.

Hence, the estimations assume that the production function, either Cobb-Douglas or translog, is sector-specific but common across countries. To address this rather restrictive assumption, the specification is enriched as follows. First, in order to control for potential differences in production technology between countries, wherever possible, the production coefficients are allowed to vary by the income-level grouping of the corresponding economy by adding interaction terms between income group and factor inputs.
The income levels are grouped according to the WB classification (low-income and lower-middle-income grouped as low-income, and upper-middle-income and high-income grouped as high-income) as of the year in which each survey was conducted, and are denoted by $I_c$ (equal to 1 for high-income). Empirical investigation of the stability of our estimates revealed that this income grouping is appropriate if the number of observations per industry and income group is at least 500. For industries with fewer than 500 observations per income group, the coefficients are estimated across all economies (removing $I_c$). The number of observations by industry and income group is presented in Figure 1.8 All industries except 19 (leather), 21 (paper), 27 (basic metals), and the combined group 34 and 35 (transport equipment) have more than 500 observations per income group. Hence, production coefficients vary by income group within each two-digit sector, except in those sectors. The regressions also control for average economy-level and time effects by including dummy variables for each economy ($FE_c$) and year ($FE_t$) (Halvorsen et al. 1980). An income-level fixed effect $FE_I$ is also included in the regression.9

6 The production functions could in principle be estimated by country and sector. However, estimating production coefficients separately by country-sector is difficult with the ES data since there are few observations for most countries within each sector.
7 Due to the low number of observations in the current dataset, four groups of industries are defined: group 15 and 16: food, beverages and tobacco; group 23 and 24: refined petroleum, nuclear fuel and chemicals; group 30, 31, 32 and 33: electrical machinery and electronics; and group 34 and 35: transport equipment.

Figure 1: Number of Observations by Sector and Income Group as of September 10, 2020
Note: The figure shows the number of observations by two-digit ISIC Rev. 3.1 sector in the low- and high-income economies.
The codes represent the two-digit ISIC Rev. 3.1 sectors. Due to the low number of observations, some two-digit sectors are combined: group 15 and 16 (15t16): food, beverages and tobacco; group 23 and 24 (23t24): refined petroleum, nuclear fuel and chemicals; group 30, 31, 32 and 33 (30t33): electrical machinery and electronics; and group 34 and 35 (34t35): transport equipment.

The functional form of Cobb-Douglas is examined in comparison with the more flexible translog production function. The latter is a second-order Taylor expansion of the Cobb-Douglas function; it interacts each input term with itself and all other combinations of input terms. For the gross-output specification (YKLM), this gives:

$$
\begin{aligned}
y_{ict} ={}& \beta_k k_{ict} + \beta_l l_{ict} + \beta_m m_{ict} + \underbrace{\beta_{ki} k_{ict} \times I_c + \beta_{li} l_{ict} \times I_c + \beta_{mi} m_{ict} \times I_c}_{\text{coefficients vary by income group}} \\
&+ \beta_{kk} k_{ict}^2 + \beta_{ll} l_{ict}^2 + \beta_{mm} m_{ict}^2 + \beta_{kl} k_{ict} l_{ict} + \beta_{km} k_{ict} m_{ict} + \beta_{lm} l_{ict} m_{ict} + \beta_{klm} k_{ict} l_{ict} m_{ict} \\
&+ \underbrace{c_{YKLM} + FE_I + FE_c + FE_t + \varepsilon_{ict}}_{\text{represents TFP}}
\end{aligned}
\tag{2}
$$

The value-added production function (VAKL), which imposes a fixed proportion assumption on material inputs (Leontief), is estimated as follows:

$$
\begin{aligned}
va_{ict} ={}& \beta_k k_{ict} + \beta_l l_{ict} + \underbrace{\beta_{ki} k_{ict} \times I_c + \beta_{li} l_{ict} \times I_c}_{\text{coefficients vary by income group}} \\
&+ \beta_{kk} k_{ict}^2 + \beta_{ll} l_{ict}^2 + \beta_{kl} k_{ict} l_{ict} + \underbrace{c_{VAKL} + FE_I + FE_c + FE_t + \varepsilon_{ict}}_{\text{represents TFP}}
\end{aligned}
\tag{3}
$$

where subscripts $i$, $c$, $t$ index establishments, countries, and years, respectively; $c_{VAKL}$ and $c_{YKLM}$ are constants which are common across establishments within each industry.

8 Note: As the dataset is periodically updated, the number of observations in the production function estimations is subject to change.
9 A dummy for income group is not subsumed by the country fixed effect because income status can change across survey years. The countries that have changed income group during the survey period are Albania, Colombia, Ecuador, Georgia, Guatemala, Kosovo, Namibia, Paraguay and Peru.
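The input regressors of the gross-output translog in equation (2) can be enumerated explicitly. The Python sketch below builds them for a single firm; the function name `translog_terms` and the dictionary keys are hypothetical labels chosen for readability, and the constant and the fixed effects of equation (2) are deliberately left out.

```python
def translog_terms(k, l, m, high_income):
    """Return the input regressors of the gross-output translog for one firm.

    k, l, m are natural logs of capital, labor cost, and material cost;
    high_income plays the role of the indicator I_c.
    """
    i_c = 1.0 if high_income else 0.0
    return {
        # first-order terms
        "k": k, "l": l, "m": m,
        # interactions with the income-group indicator
        "k_x_I": k * i_c, "l_x_I": l * i_c, "m_x_I": m * i_c,
        # squared terms
        "kk": k ** 2, "ll": l ** 2, "mm": m ** 2,
        # pairwise interactions
        "kl": k * l, "km": k * m, "lm": l * m,
        # triple interaction
        "klm": k * l * m,
    }


# Hypothetical log inputs for a firm in a high-income economy.
terms = translog_terms(k=2.0, l=3.0, m=4.0, high_income=True)
```

Dropping every key that involves `m` leaves the seven input regressors of the value-added specification in equation (3).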
To test whether the Cobb-Douglas or the translog production specification was more appropriate, the joint significance of the translog terms (all interaction and square terms) was tested. Table 1 reports the results of these joint significance tests. Under the gross-output (YKLM) specification, the translog terms are jointly different from zero for all sectors, suggesting that the translog specification fits the data better. Under the value-added (VAKL) specification, the translog terms are jointly significant at the 10% significance level for all except four sectors (17, 21, 29, and 34-35). To ensure comparability across sectors, only translog estimates are reported for all sectors under both specifications in the associated dataset and are used in the rest of the note.

Table 1: Joint Significance Test

                             Gross-output (YKLM)     Value-added (VAKL)
Sector (ISIC Rev 3.1)        F-Stat     p-value      F-Stat     p-value
ISIC 15 and 16               43.01      0.00         4.22       0.01
ISIC 17                      28.96      0.00         1.58       0.19
ISIC 18                      29.48      0.00         7.39       0.00
ISIC 19                      28.51      0.00         3.53       0.01
ISIC 20                      18.22      0.00         7.63       0.00
ISIC 21                      54.72      0.00         0.93       0.43
ISIC 22                      7.82       0.00         3.57       0.01
ISIC 23 and 24               12.13      0.00         3.21       0.02
ISIC 25                      10.98      0.00         8.54       0.00
ISIC 26                      46.91      0.00         4.97       0.00
ISIC 27                      37.77      0.00         10.45      0.00
ISIC 28                      25.81      0.00         2.69       0.04
ISIC 29                      7.32       0.00         0.87       0.45
ISIC 30, 31, 32 and 33       18.77      0.00         2.16       0.09
ISIC 34 and 35               20.29      0.00         0.89       0.45
ISIC 36                      22.46      0.00         5.18       0.00

Having adopted the translog specification, firm-level TFPR is estimated by:

$$TFPR^{f}_{ict} = \hat{\varepsilon}^{f}_{ict} + c^{f}_{s} + FE^{f}_{I} + FE^{f}_{c} + FE^{f}_{t} \tag{4}$$

where $f \in \{VAKL, YKLM\}$. TFPR is estimated as the sum of the establishment-level residual $\hat{\varepsilon}^{f}_{ict}$; the constant term ($c^{f}_{s}$), which is common across establishments within each industry; and the country fixed effects ($FE^{f}_{c}$), income group fixed effects ($FE^{f}_{I}$), and year fixed effects ($FE^{f}_{t}$), which are common across establishments within industry-country, industry-income group, and industry-year, respectively.
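The logic of the joint significance test reported in Table 1 can be sketched by comparing a restricted (Cobb-Douglas) and an unrestricted (translog) regression. The Python example below uses the classical F statistic on simulated value-added data. It is a simplification under stated assumptions: it ignores the survey weights, stratification, income interactions, and fixed effects that the note's estimates incorporate, and the function names and all numbers are hypothetical.

```python
import numpy as np


def ssr(y, X):
    """Sum of squared OLS residuals from regressing y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def joint_f_stat(y, X_restricted, X_full):
    """Classical F statistic for H0: the extra columns of X_full have zero
    coefficients. X_restricted must be nested in X_full."""
    q = X_full.shape[1] - X_restricted.shape[1]  # number of restrictions
    dof = len(y) - X_full.shape[1]               # residual degrees of freedom
    ssr_r, ssr_u = ssr(y, X_restricted), ssr(y, X_full)
    return ((ssr_r - ssr_u) / q) / (ssr_u / dof)


# Simulated data: one outcome that is exactly Cobb-Douglas, and one that
# adds a capital-labor interaction (a translog term).
rng = np.random.default_rng(1)
n = 2000
k = rng.normal(10.0, 1.0, n)
l = rng.normal(8.0, 1.0, n)
va_cd = 1.0 + 0.3 * k + 0.6 * l + rng.normal(0.0, 0.3, n)
va_tl = va_cd + 0.1 * k * l

X_cd = np.column_stack([np.ones(n), k, l])
X_tl = np.column_stack([X_cd, k ** 2, l ** 2, k * l])

f_translog = joint_f_stat(va_tl, X_cd, X_tl)
f_cobb_douglas = joint_f_stat(va_cd, X_cd, X_tl)
```

A large statistic, as for `f_translog` here, indicates that the squared and interaction terms add explanatory power, which is the pattern Table 1 reports for every sector under the gross-output specification.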
The database available on the ES portal contains the firm-level estimates, i.e. $TFPR^{YKLM}_{ict}$ and $TFPR^{VAKL}_{ict}$, alongside the variables used to estimate TFPR. All estimates take into consideration the survey design for the ES by incorporating both stratification and probability (survey) weight information.

4 Estimates

The output elasticity is given by the first derivative of the production function with respect to each input. For instance, under the gross-output production function, the output elasticity of material inputs for high-income economies is estimated as

$$\hat{\theta}_m = \hat{\beta}_m + 2\hat{\beta}_{mm} m_{ict} + \hat{\beta}_{km} k_{ict} + \hat{\beta}_{lm} l_{ict} + \hat{\beta}_{klm} k_{ict} l_{ict} + \hat{\beta}_{mi}.$$

The elasticity for low-income countries is the same, except that the $\hat{\beta}_{mi}$ term is omitted. The elasticities for material inputs depend on the level of use of all inputs used in production, including labor and capital, and not only on the level of use of material inputs. Translog coefficients ($\hat{\beta}_m$, $\hat{\beta}_{mm}$, $\hat{\beta}_{km}$, $\hat{\beta}_{lm}$, $\hat{\beta}_{klm}$, $\hat{\beta}_{mi}$) are the same across establishments within an industry and income group. However, unlike in the Cobb-Douglas production function, the output elasticities can vary across firms within the same industry/country, because they depend on the level of use of the inputs of production. Hence, small firms can have different elasticities than large firms (equally, elasticities can vary among small firms). This flexible form is an advantage of the translog over the Cobb-Douglas production function, which would have the same output elasticity across establishments within an industry/income group. For example, under the Cobb-Douglas specification, the output elasticity of material input is simply given by $\hat{\beta}_m + \hat{\beta}_{mi}$ for high-income countries and $\hat{\beta}_m$ for low-income countries.

Figure 2 plots estimates of the median output elasticities for each input by sector and income group recovered from the translog production functions.
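The firm-specific nature of the translog elasticity of material inputs defined above can be illustrated with a short Python sketch. The coefficient values in `b` are hypothetical numbers chosen only to make the arithmetic visible; they are not estimates from the ES data, and `material_elasticity` is an illustrative name.

```python
def material_elasticity(b, k, l, m, high_income):
    """Output elasticity of materials: derivative of the gross-output translog
    with respect to m, evaluated at the firm's own log inputs."""
    theta = b["m"] + 2.0 * b["mm"] * m + b["km"] * k + b["lm"] * l + b["klm"] * k * l
    if high_income:
        theta += b["mi"]  # income-group shift, high-income economies only
    return theta


# Hypothetical coefficients (not ES estimates).
b = {"m": 0.5, "mm": 0.01, "km": -0.005, "lm": -0.004, "klm": 0.0001, "mi": 0.02}

# Same industry and coefficients, different input levels and income groups.
theta_low = material_elasticity(b, k=10.0, l=8.0, m=9.0, high_income=False)
theta_high = material_elasticity(b, k=10.0, l=8.0, m=9.0, high_income=True)
theta_low_small = material_elasticity(b, k=6.0, l=5.0, m=5.0, high_income=False)
```

With these illustrative numbers the two low-income firms share every coefficient yet have different elasticities, because the elasticity is evaluated at each firm's own input levels. Summing the analogous expressions for capital, labor, and materials would give a firm-level measure of returns to scale.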
The figure shows that, under the gross-output production function, material inputs have the highest elasticity in all sectors, ranging from 0.41 to 0.63. The median labor and capital elasticities range from 0.22 to 0.56 and from 0.04 to 0.15, respectively. The sum of the elasticities, a measure of the returns to scale, ranges from 0.91 to 1.14, indicating that median firms in an industry/income group operate close to constant returns to scale. When using the value-added production function, labor input has the highest median elasticity in all sectors. The median capital and labor elasticities range from 0.10 to 0.29 and from 0.60 to 0.92, respectively. The returns to scale range from 0.89 to 1.17. The input factor elasticities and returns to scale obtained from the estimation are in line with earlier findings in the literature, such as De Loecker et al. (2016) for India and Gandhi et al. (2020) for Colombia and Chile.

Figure 2: Median Output Elasticities by Sector and Income Group
Note: The figure shows the median estimated output elasticity with respect to each factor of production under the gross-output (YKLM) and value-added (VAKL) translog production functions for all manufacturing firms in the low- and high-income economies.

Figure 3 shows the median output elasticities for capital and labor for both the gross-output and value-added specifications. The figure shows that the ranking of sectors in terms of output elasticities of labor and capital is broadly similar across the two specifications for most sectors. For instance, in low-income countries, the apparel (18) sector has the lowest output elasticity of capital under both the gross-output and value-added specifications.

Figure 3: Median Output Elasticities of Labor and Capital
Note: The figure shows the median output elasticities for capital and labor estimated under the gross-output (YKLM) and value-added (VAKL) translog production functions. The dot labels are two-digit ISIC Rev. 3.1 codes.
Figure 4 shows the ratio of the median capital to labor elasticities, which measures the capital intensity in each sector (left panel), and the sum of elasticities, which under constant returns to scale adds up to one (right panel). The Apparel (18) and Food (15 and 16) industries in low-income economies are the least capital intensive; Printing and Publishing (22) and Petroleum and Chemicals (23 and 24) in both high- and low-income economies are the most capital-intensive industries. The sum of the output elasticities is around one for most of the sectors. Hence, median firms in each sector operate close to constant returns to scale. The gross-output and value-added specifications provide a consistent ranking of sectors in terms of capital intensity and returns to scale.

Figure 4: Median Returns to Scale and Capital Intensity
Note: The figure shows the standard deviations for the output elasticities within each industry and income group. The dot labels are two-digit ISIC Rev. 3.1 codes.

Figure 5 shows the standard deviations of output elasticities for labor and capital inputs. The figure shows that there is large, firm-level heterogeneity in the output elasticity of capital and labor in all sectors, although the magnitude varies across industries. For example, the basic metals sector (27) exhibits the highest firm-level dispersion in the output elasticity of capital in both low- and high-income economies. This large heterogeneity provides strong evidence against the Cobb-Douglas specification, which assumes constant output elasticity.

Figure 5: Dispersions of Output Elasticities
Note: The figure shows the standard deviations of output elasticities for capital and labor estimated under the gross-output (YKLM) and value-added (VAKL) translog production functions. The dot labels are two-digit ISIC Rev. 3.1 codes.

The discussion above suggests heterogeneity in output elasticities within industry and income group.
To explore whether the median output elasticities vary across countries within industry, Figure 6 plots the distribution of country median estimates of output elasticities by industry. The median output elasticities are comparable for most countries. However, there are some outliers. The output elasticity estimates turn negative in some countries and sectors. For example, the median elasticities for basic metals (27) can become negative for both capital and labor under the value-added specification.

Figure 6: Output Elasticities by Country

To see the differences in estimates across countries and sectors more clearly, Figure 7 plots the economy-sector level output elasticities of labor and capital. Under the gross-output specification, the median estimates of labor elasticities are positive and sensible for all but a few countries and sectors. The median capital elasticities are positive for most of the country-sector pairs. Tables A.1 and A.2 in the Appendix report the list of country-sector pairs where the median output elasticities of capital are negative (red dots in Figure 7). Using the value-added specification, the median estimates of labor elasticity are positive for all country-sector pairs.

Figure 7: Median Output Elasticities by Country-Sector Pairs
Note: Each point is the median output elasticity by country and two-digit ISIC Rev. 3.1 sector. The red dots show country-sector pairs where the median output elasticities of capital are negative.

Figure 8 shows the distribution of output elasticities within countries, adjusted for country × industry × year fixed effects. The figure shows that there is substantial heterogeneity in production technology across firms within the same country-sector in a given year.

Figure 8: Distribution of Output Elasticities
Note: The figures show the densities of computed residuals of output elasticities. The residuals are obtained after controlling for country-industry-year fixed effects.
To systematically explore the differences in output elasticities across countries, we regress the output elasticities on the set of fixed effects used in the regressions to quantify their respective explanatory power. Figure 9 plots the R-squared values from regressions of output elasticities on country, sector, and year fixed effects with no other controls. Under the gross-output specification, a regression of capital, labor and material elasticities: (i) on country fixed effects yields R-squared values of 0.04, 0.08, and 0.07, respectively; (ii) on sector fixed effects yields 0.31, 0.06, and 0.03, respectively. Year fixed effects have little explanatory power, with an R-squared of 0.03 for capital and 0.04 for labor and material output elasticities. Combined, the country, sector and year fixed effects yield an R-squared of 0.35 for capital, 0.14 for labor and 0.10 for material elasticities. Under the value-added specification, the country effects account for 0.08 and 0.05 of the variation in capital and labor elasticities, respectively. The sector fixed effects explain 0.17 and 0.33 for capital and labor elasticities, respectively.

Figure 9: R-squared for Various Sets of Fixed Effects

Figure 10 plots the distributions of TFPR (log) measured using gross-output (tfprYKLM) and value-added (tfprVAKL) production functions. The figure shows the distribution of residuals from regressing establishment-level TFPR on country-sector-year fixed effects. Results show that there is a sizable dispersion of TFPR across establishments within an industry in a country and that TFPR estimates based on the VAKL model are more dispersed than those based on the YKLM specification.

Figure 10: Distribution of TFPR: YKLM vs VAKL Specifications
Note: The figure plots the distribution of log TFPR residuals. The residuals are obtained after controlling for country-industry-year fixed effects. The dispersions are not driven by differences between industries, countries and years.
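Both the R-squared values of Figure 9 and the residualized distributions of Figures 8 and 10 rest on regressions on sets of dummy variables. For a single, fully saturated set of dummies the fitted value is simply the cell mean, so the calculation can be sketched in plain Python. The example below is unweighted and uses hypothetical numbers; whether the note's regressions apply survey weights at this step is not stated, so that choice is an assumption, as is the function name `demean_by_cell`.

```python
from collections import defaultdict
from statistics import mean


def demean_by_cell(values, cells):
    """Residuals and R-squared from regressing values on a full set of cell
    dummies (e.g. country-industry-year); the fitted value is the cell mean."""
    by_cell = defaultdict(list)
    for v, c in zip(values, cells):
        by_cell[c].append(v)
    cell_mean = {c: mean(vs) for c, vs in by_cell.items()}

    residuals = [v - cell_mean[c] for v, c in zip(values, cells)]
    grand = mean(values)
    sst = sum((v - grand) ** 2 for v in values)  # total sum of squares
    ssw = sum(r ** 2 for r in residuals)         # within-cell sum of squares
    return residuals, 1.0 - ssw / sst


# Hypothetical elasticities for four firms in two country-industry-year cells.
values = [0.1, 0.2, 0.3, 0.4]
cells = [("A", 15, 2019), ("A", 15, 2019), ("B", 15, 2019), ("B", 15, 2019)]
residuals, r2 = demean_by_cell(values, cells)
```

In this toy example the cells explain most of the variation, whereas the R-squared values reported in the text are far lower, which is what indicates that most of the heterogeneity lies across firms within the same cell. The function assumes the values are not all identical, since the total sum of squares would then be zero.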
Figure 11 displays the dispersion of TFPR across establishments measured using the gross-output (tfprYKLM) and value-added (tfprVAKL) production functions for the two largest sectors by number of establishments, food and garments. The value-added specification suggests much larger TFPR differences across establishments within an industry: the standard deviations for most countries in both sectors lie above the 45-degree line.

Figure 11: Dispersion of TFPR: Gross-output vs Value-Added Specification

5 Discussion

This note presents the background methodology for the TFP estimates based on the Enterprise Surveys (ES) data. It accompanies a firm-level dataset that allows users and researchers to explore firm-level heterogeneity and its relationship to other underlying data in the ES. Users can refer to this note for guidance on the use of those estimates. Based on the evidence presented here, the TFP estimates from the translog form of the gross-output production function are considered the baseline for standard analysis. However, users can explore other functional forms (including Cobb-Douglas and value-added), keeping in mind the caveats noted here. Users and researchers should note that each of these estimates considers only capital, labor, and materials as inputs in the production function. Alternative estimates that treat, for example, elements of the business environment as inputs into the production function (rather than only as covariates of TFP) would need to be calculated separately. Finally, users should note that as newer surveys are added to the ES portal, these calculations will be repeated and the TFP estimates updated.

References

Ackerberg, D. A., Caves, K. & Frazer, G. (2015), ‘Identification properties of recent production function estimators’, Econometrica 83(6), 2411–2451.
Banerjee, A. V. & Duflo, E. (2005), ‘Growth theory through the lens of development economics’, Handbook of Economic Growth 1, 473–552.
Bartelsman, E., Haltiwanger, J. & Scarpetta, S. (2013), ‘Cross-country differences in productivity: The role of allocation and selection’, American Economic Review 103(1), 305–34.
De Loecker, J., Goldberg, P. K., Khandelwal, A. K. & Pavcnik, N. (2016), ‘Prices, markups, and trade reform’, Econometrica 84(2), 445–510.
Foster, L., Haltiwanger, J. & Syverson, C. (2008), ‘Reallocation, firm turnover, and efficiency: Selection on productivity or profitability?’, American Economic Review 98(1), 394–425.
Gandhi, A., Navarro, S. & Rivers, D. A. (2020), ‘On the identification of gross output production functions’, Journal of Political Economy 128(8), 2973–3016.
Hall, R. E. & Jones, C. I. (1999), ‘Why do some countries produce so much more output per worker than others?’, Quarterly Journal of Economics 114(1), 83–116.
Halvorsen, R. & Palmquist, R. (1980), ‘The interpretation of dummy variables in semilogarithmic equations’, American Economic Review 70(3), 474–475.
Hsieh, C.-T. & Klenow, P. J. (2009), ‘Misallocation and manufacturing TFP in China and India’, Quarterly Journal of Economics 124(4), 1403–1448.
Klenow, P. J. & Rodriguez-Clare, A. (1997), ‘The neoclassical revival in growth economics: Has it gone too far?’, NBER Macroeconomics Annual 12, 73–103.
Krugman, P. (1994), ‘The age of diminished expectations: US economic policy in the 1990s, revised and updated edition’.
Levinsohn, J. & Petrin, A. (2003), ‘Estimating production functions using inputs to control for unobservables’, Review of Economic Studies 70(2), 317–341.
Little, R. J. & Rubin, D. B. (2019), Statistical analysis with missing data, Vol. 793, John Wiley & Sons.
Marschak, J. & Andrews, W. H. (1944), ‘Random simultaneous equations and the theory of production’, Econometrica, Journal of the Econometric Society, pp. 143–205.
Olley, G. S. & Pakes, A. (1996), ‘The dynamics of productivity in the telecommunications equipment industry’, Econometrica 64(6), 1263–1297.
Restuccia, D. & Rogerson, R. (2008), ‘Policy distortions and aggregate productivity with heterogeneous establishments’, Review of Economic Dynamics 11(4), 707–720.
Syverson, C. (2011), ‘What determines productivity?’, Journal of Economic Literature 49(2), 326–65.
White, T. K., Reiter, J. P. & Petrin, A. (2018), ‘Imputation in US manufacturing data and its implications for productivity dispersion’, Review of Economics and Statistics 100(3), 502–509.

APPENDIX

A Variables for estimation and associated questions in the questionnaire

• Sales. Total annual sales of the establishment is measured by variable d2, which records responses to the following question: “In [last complete] fiscal year, what were this establishment’s total annual sales for all products and services?”
• Cost of labor. Total annual cost of labor is measured by variable n2a, with the corresponding question as follows: “From this establishment’s Income Statement for fiscal year please provide total annual cost of labor including wages, salaries, bonuses, social security payments”
• Materials. Total annual cost of inputs is measured by variable n2e, with the corresponding question asked only of manufacturing firms as follows: “From this establishment’s Income Statement for fiscal year please provide total annual cost of raw materials and intermediate goods used in production”
• Total annual cost of finished goods is measured by variable n2i, with the corresponding question asked only of services firms as follows: “From this establishment’s Income Statement for fiscal year please provide total annual cost of finished goods and materials purchased to resell”
• Labor. Total number of workers is measured by variable l1, with the corresponding question as follows: “At the end of [the last complete] fiscal year, how many permanent, full-time individuals worked in this establishment?
Please include all employees and managers (Permanent, full-time employees are defined as all paid employees that are contracted for a term of one or more fiscal years and/or have a guaranteed renewal of their employment contract and that work a full shift)”
• Capital. Price of machinery, vehicles, and equipment is measured by variable n7a, with the corresponding question as follows: “Hypothetically, if this establishment were to purchase [machinery, vehicles, and equipment] it uses now, in their current condition and regardless of whether the establishment owns them or not, how much would they cost, independently of whether they are owned, rented or leased?”

Table A.1: List of Country-Industry Pairs with Negative Median Capital Elasticity - YKLM Model

| Country | Year | Sector | Median Cap. Elast. | Country | Year | Sector | Median Cap. Elast. |
|---|---|---|---|---|---|---|---|
| Afghanistan | 2008 | 36 | -0.03 | Mongolia | 2013 | 18 | 0.00 |
| Afghanistan | 2014 | 36 | -0.09 | Mongolia | 2013 | 27 | -0.06 |
| Albania | 2013 | 27 | -0.19 | Mauritania | 2006 | 20 | -0.01 |
| Albania | 2013 | 20 | -0.04 | Mauritania | 2006 | 27 | -0.03 |
| Argentina | 2017 | 19 | -0.12 | Mauritius | 2009 | 27 | 0.00 |
| Armenia | 2009 | 27 | -0.11 | Niger | 2017 | 27 | -0.13 |
| Armenia | 2009 | 18 | -0.01 | Nigeria | 2007 | 27 | -0.05 |
| Burkina Faso | 2009 | 27 | -0.09 | Nigeria | 2014 | 30t33 | -0.08 |
| Bangladesh | 2013 | 27 | 0.00 | Nicaragua | 2010 | 20 | 0.00 |
| Bosnia and Herzegovina | 2013 | 19 | -0.02 | Nicaragua | 2016 | 27 | -0.12 |
| Bosnia and Herzegovina | 2019 | 27 | -0.14 | Nepal | 2009 | 18 | -0.01 |
| Bolivia | 2017 | 27 | -0.17 | Panama | 2010 | 18 | -0.02 |
| Côte d’Ivoire | 2009 | 27 | -0.11 | Peru | 2017 | 27 | -0.08 |
| Dominican Republic | 2016 | 27 | -0.16 | Philippines | 2015 | 27 | -0.07 |
| Estonia | 2009 | 19 | -0.03 | Poland | 2009 | 27 | -0.03 |
| Georgia | 2013 | 18 | 0.00 | Paraguay | 2006 | 27 | -0.05 |
| Georgia | 2013 | 30t33 | -0.05 | West Bank and Gaza | 2013 | 36 | 0.00 |
| Guinea | 2006 | 20 | -0.02 | Sudan | 2014 | 15t16 | -0.01 |
| Guinea | 2016 | 36 | -0.02 | Sierra Leone | 2017 | 20 | -0.02 |
| Guinea-Bissau | 2006 | 18 | -0.02 | El Salvador | 2006 | 19 | 0.00 |
| Guinea-Bissau | 2006 | 36 | -0.05 | South Sudan | 2014 | 27 | -0.20 |
| Greece | 2018 | 19 | -0.05 | Slovak Republic | 2019 | 19 | -0.19 |
| Guyana | 2010 | 27 | -0.10 | Eswatini | 2016 | 18 | 0.00 |
| Hungary | 2009 | 19 | -0.06 | Eswatini | 2016 | 36 | -0.04 |
| Hungary | 2019 | 19 | -0.06 | Thailand | 2016 | 27 | -0.01 |
| Indonesia | 2015 | 27 | -0.04 | Tajikistan | 2013 | 36 | -0.04 |
| Kazakhstan | 2013 | 18 | -0.06 | Timor-Leste | 2015 | 36 | -0.02 |
| Kazakhstan | 2013 | 27 | -0.06 | Timor-Leste | 2015 | 30t33 | -0.08 |
| Cambodia | 2016 | 27 | -0.01 | Timor-Leste | 2015 | 18 | -0.03 |
| Lao PDR | 2016 | 27 | -0.14 | Timor-Leste | 2015 | 27 | -0.10 |
| Liberia | 2017 | 19 | 0.00 | Trinidad and Tobago | 2010 | 27 | 0.00 |
| Liberia | 2017 | 27 | -0.04 | Tunisia | 2020 | 36 | -0.04 |
| Lithuania | 2019 | 34t35 | -0.08 | Turkey | 2013 | 27 | -0.32 |
| Morocco | 2019 | 27 | -0.17 | Uruguay | 2017 | 19 | -0.07 |
| Madagascar | 2013 | 27 | -0.19 | Venezuela, RB | 2010 | 27 | -0.20 |
| Mali | 2007 | 27 | -0.08 | Vietnam | 2009 | 27 | -0.03 |
| Mali | 2007 | 30t33 | -0.06 | Vietnam | 2015 | 27 | -0.04 |
| Mali | 2016 | 27 | -0.01 | Yemen, Rep. | 2013 | 18 | -0.01 |
| Myanmar | 2014 | 27 | -0.06 | Zambia | 2007 | 27 | -0.13 |
| Myanmar | 2016 | 27 | 0.00 | Zambia | 2013 | 27 | -0.03 |

Table A.2: List of Country-Industry Pairs with Negative Median Capital Elasticity - VAKL Model

| Country | Year | Sector | Median Cap. Elast. | Country | Year | Sector | Median Cap. Elast. |
|---|---|---|---|---|---|---|---|
| Afghanistan | 2014 | 27 | -0.10 | Moldova | 2013 | 18 | -0.03 |
| Afghanistan | 2014 | 36 | -0.15 | Madagascar | 2013 | 27 | -0.47 |
| Albania | 2007 | 19 | -0.04 | Mali | 2007 | 27 | -0.04 |
| Albania | 2013 | 20 | -0.02 | Mongolia | 2013 | 27 | -0.02 |
| Armenia | 2009 | 27 | -0.17 | Mauritius | 2009 | 27 | -0.07 |
| Burundi | 2006 | 18 | -0.02 | Namibia | 2014 | 27 | -0.04 |
| Burundi | 2006 | 36 | -0.06 | Niger | 2017 | 27 | -0.15 |
| Bosnia and Herzegovina | 2019 | 27 | -0.19 | Nigeria | 2007 | 27 | -0.05 |
| Belarus | 2008 | 20 | 0.00 | Nigeria | 2007 | 36 | -0.02 |
| Bolivia | 2017 | 27 | -0.10 | Nigeria | 2014 | 21 | -0.05 |
| Bhutan | 2015 | 21 | -0.11 | Nicaragua | 2010 | 21 | -0.05 |
| Côte d’Ivoire | 2009 | 27 | -0.03 | Nicaragua | 2016 | 27 | -0.15 |
| Cameroon | 2016 | 36 | -0.02 | Nepal | 2009 | 18 | -0.05 |
| Congo, Dem. Rep. | 2013 | 36 | -0.02 | Panama | 2010 | 36 | -0.05 |
| Congo, Dem. Rep. | 2013 | 21 | -0.15 | Peru | 2017 | 27 | -0.10 |
| Congo, Dem. Rep. | 2013 | 18 | 0.00 | Philippines | 2009 | 21 | -0.12 |
| Dominican Republic | 2016 | 27 | -0.30 | Philippines | 2015 | 27 | -0.12 |
| Ecuador | 2010 | 21 | -0.06 | Philippines | 2015 | 21 | -0.13 |
| Georgia | 2013 | 30t33 | -0.03 | Sierra Leone | 2017 | 20 | -0.03 |
| Guinea | 2006 | 20 | -0.02 | South Sudan | 2014 | 27 | -0.35 |
| Guinea-Bissau | 2006 | 18 | -0.01 | Eswatini | 2016 | 18 | -0.17 |
| Guinea-Bissau | 2006 | 36 | -0.16 | Eswatini | 2016 | 26 | -0.02 |
| Guyana | 2010 | 27 | -0.12 | Eswatini | 2016 | 36 | -0.07 |
| Honduras | 2016 | 18 | -0.02 | Thailand | 2016 | 27 | -0.01 |
| Hungary | 2013 | 18 | 0.00 | Tajikistan | 2013 | 36 | -0.07 |
| Indonesia | 2015 | 27 | -0.11 | Timor-Leste | 2015 | 36 | -0.06 |
| Kazakhstan | 2009 | 27 | -0.40 | Timor-Leste | 2015 | 30t33 | -0.03 |
| Kazakhstan | 2013 | 18 | -0.11 | Timor-Leste | 2015 | 18 | -0.09 |
| Lao PDR | 2016 | 27 | -0.11 | Timor-Leste | 2015 | 27 | -0.03 |
| Liberia | 2017 | 21 | -0.10 | Tunisia | 2020 | 21 | -0.11 |
| Liberia | 2017 | 27 | -0.11 | Tunisia | 2020 | 36 | -0.03 |
| Lithuania | 2009 | 18 | 0.00 | Turkey | 2013 | 27 | -0.61 |
| Lithuania | 2013 | 18 | -0.02 | Uzbekistan | 2008 | 21 | -0.05 |
| Latvia | 2019 | 18 | -0.02 | Venezuela, RB | 2010 | 27 | -0.19 |
| Morocco | 2019 | 21 | 0.00 | Yemen, Rep. | 2013 | 18 | -0.01 |
| Morocco | 2019 | 27 | -0.31 | Zambia | 2007 | 27 | -0.14 |
| Moldova | 2009 | 21 | -0.04 | Zambia | 2013 | 27 | -0.03 |

Table A.3: Industries included in the analysis

ISICs 15 and 16: Manufacturing of food products and beverages, and manufacturing of tobacco products
ISIC 17: Manufacture of textiles
ISIC 18: Manufacture of wearing apparel; dressing and dyeing of fur
ISIC 19: Tanning and dressing of leather; manufacture of luggage, handbags, saddlery, harness and footwear
ISIC 20: Manufacture of wood and of products of wood and cork, except furniture; manufacture of articles of straw and plaiting materials
ISIC 21: Manufacture of paper and paper products
ISIC 22: Publishing, printing and reproduction of recorded media
ISICs 23 and 24: Manufacturing of coke, refined petroleum products and nuclear fuel, and manufacturing of chemicals and chemical products
ISIC 25: Manufacture of rubber and plastics products
ISIC 26: Manufacture of other non-metallic mineral products
ISIC 27: Manufacture of basic metals
ISIC 28: Manufacture of fabricated metal products, except machinery and equipment
ISIC 29: Manufacture of machinery and equipment
ISICs 30, 31, 32, and 33: Manufacturing of office, accounting and computing machinery; manufacturing of electrical machinery and apparatus n.e.c.; manufacturing of radio, television and communication equipment and apparatus; and manufacturing of medical, precision and optical instruments, watches and clocks
ISICs 34 and 35: Manufacturing of motor vehicles, trailers and semi-trailers, and manufacturing of other transport equipment
ISIC 36: Manufacture of furniture; manufacturing n.e.c.
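The pooled industry groups in Table A.3 can be expressed as a simple lookup from two-digit ISIC Rev. 3.1 codes to sector labels. The sketch below is illustrative only. The labels "15t16", "30t33", and "34t35" appear in Tables A.1 and A.2, while "23t24" is assumed here by analogy, and the function name `sector_label` is hypothetical.

```python
# Pooled industry groups from Table A.3. The label "23t24" is an assumption
# made by analogy with the labels shown in Tables A.1 and A.2.
POOLED_GROUPS = {
    "15t16": (15, 16),
    "23t24": (23, 24),
    "30t33": (30, 31, 32, 33),
    "34t35": (34, 35),
}
ISIC_TO_GROUP = {
    code: label for label, codes in POOLED_GROUPS.items() for code in codes
}


def sector_label(isic_code):
    """Return the sector label for a two-digit ISIC Rev. 3.1 manufacturing code."""
    if not 15 <= isic_code <= 36:
        raise ValueError(f"ISIC {isic_code} is outside the industries in Table A.3")
    # Codes that are not pooled keep their own two-digit label.
    return ISIC_TO_GROUP.get(isic_code, str(isic_code))
```

For example, `sector_label(31)` returns the pooled label for ISICs 30 to 33, while `sector_label(27)` returns the code itself because basic metals is estimated as its own industry.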