Do Management Interventions Last? Evidence from India

Beginning in 2008, the authors conducted a randomized controlled trial that changed management practices in a set of Indian weaving firms (Bloom et al. 2013). In 2017 the plants were revisited and the authors found three main results. First, while about half of the management practices adopted in the original experimental plants had been dropped, there was still a large and significant gap in practices between the treatment and control plants. Likewise, there remained a significant performance gap between treatment and control plants, suggesting lasting impacts of effective management interventions. Second, while few management practices had demonstrably spread across the firms in the study, many had spread within firms, from the experimental plants to the non-experimental plants, suggesting limited spillovers between firms but large spillovers within firms. Third, managerial turnover and the lack of director time were two of the most cited reasons for the drop in management practices in experimental plants, highlighting the importance of key employees.


I. INTRODUCTION
After an early recognition of management as a driver of differences in firm performance (e.g. Walker, 1887 and Marshall, 1887), economists are again paying increasing attention to the role of management in firm and economy-wide performance (Roberts, 2018). Whereas the size and profitability of the management consulting industry are often cited as a revealed-preference measure of the importance of management, recent academic work has also established a credible causal link between changes in management practices and performance in medium and large firms (Bloom et al., 2013; Bruhn et al., 2017). The longer-term persistence of management improvements caused by consulting interventions, however, remains an open question. The received wisdom at a leading global management consulting firm when two of the authors were employed there was that such innovations lasted approximately three years.
Competing views of management offer differing predictions about the persistence of consulting-induced improvements in management practices. One view, best exemplified by the "Toyota way" (Liker, 2004), sees management improvements as launching a continuous cycle of improvement, as systems put in place for measuring, monitoring, and improving operations and quality enable constant improvement. A related idea is that management practices are complementary to one another, so that the costs of adding new practices fall as others are put in place. For example, in our context of cotton weaving, scientific management of inventory levels will only be possible once the firm has put in place systems to record all yarn transactions and to regularly monitor stock levels. Some evidence for the lasting impacts of changes in management practices on firm performance comes from Giorcelli (2017), who finds that Italian firms that received Marshall Plan sponsored management training trips to the U.S. in the 1950s experienced significantly better performance over the next fifteen years (relative to firms that applied for, but did not receive, the training).
A countervailing view argues that maintaining good management is difficult, with many of the companies extolled in business books as paragons of good management subsequently failing (The Economist, 2009; Kiechel, 2012). This may be even harder when changes are introduced externally, with the Boston Consulting Group reporting that two-thirds of transformation initiatives ultimately fail (Sirkin et al., 2005). One reason may be that these practices are inappropriate and will be abandoned as firms learn that they are not suitable in their setting. Both Karlan et al. (2015) and Higuchi et al. (2016) find that light consulting engagements in smaller firms than the ones we studied led to firms' gradually discarding practices over the subsequent three years.
This paper examines the persistence of management practices adopted after an extensive consultant-supported intervention that we undertook in a set of multi-plant Indian textile weaving firms from 2008 to 2010 (see Bloom et al., 2013 for a more detailed description). The intervention took the form of a randomized controlled trial. Firms were randomly allocated into treatment and control groups, and the intervention was done at the plant level within each firm.
Both treatment and control plants were given recommendations for improving management practices in several areas, and the treatment plants received additional consulting help in implementing the recommendations. The intervention led to a substantial uptake of the recommended practices in the treatment plants and a modest one in the control plants, with corresponding improvements in various measures of performance.
We stopped observing the firms in 2011, but we wondered, as did many in our audiences when we presented our work, whether these changes would last. As a result, we returned to the study firms in 2017 with the same consulting team and collected data on management practices and basic firm performance. We found that both treatment and control experimental plants had in fact dropped some practices, though fewer than we and the consultants had forecast. Since the control plants also dropped practices, the treatment effect on practices is constant over time, at 20 percentage points. Meanwhile, the plants in the treatment firms that had not been part of the experiment (treatment firms typically had multiple plants) had adopted many of the recommendations, so their package of current practices was very close to that of the treatment plants.
We were also able to collect information on the reasons for the dropping of management practices, as well as some basic performance indicators. We find that practices are more likely to be dropped when the plant manager changes, when the directors (the CEO and CFO) are busier, and when the practice is one that is not commonly used in many other firms. The first two reasons highlight the importance of key employees within the firm for driving management practices, 1 while the third emphasizes the importance of beliefs. Despite their dropping some practices, we find that treated firms show lasting improvements in worker productivity (35% higher than in the control group after 8 years), that treated firms have gone on to use more consulting services of their own accord, and that they have supplemented the operational management practices introduced by the consultants with better marketing practices. This paper is related to several literatures, including those on the drivers of firm and national productivity (see, e.g., Syverson 2011), on management randomized controlled trials (see, for example, Anderson et al. 2017; McKenzie and Woodruff 2014), and on the importance of management for firm performance (e.g. Osterman 1994, Huselid 1995, Ichniowski et al. 1997, Capelli and Neumark 2001, Braguinsky et al. 2015, and Fryer 2017). Section II of the paper discusses the original consulting experiment, section III the follow-up, and section IV offers concluding remarks.

II.A. The Experimental Design
To investigate the impact of management on firm productivity we initiated a randomized controlled intervention on management practices in a set of large textile companies near Mumbai in 2008. This experiment involved 28 plants across 17 firms in the woven cotton fabric industry.
These firms had been in operation for 20 years on average, and all were family-owned and managed. They produced fabric for the domestic market (although a few also exported). Table 1 reports summary statistics for the textile manufacturing parts of these firms (a few of the firms had other businesses in textile processing, retail and real estate). On average the study firms had about 270 employees, assets of $8.5 million and annual sales of $7.5 million. Compared to US manufacturing firms, these firms would be in the top 1% by employment and the top 4% by sales, and compared to Indian manufacturing firms they are in the top 1% by both measures (Hsieh and Klenow, 2010). Hence, these are large manufacturing firms. 2 These firms are complex organizations, with a median of 2 plants per firm (in addition to a head office in Mumbai) and 4 reporting levels from the shop-floor to the managing director.
1 This links to the literature on management and CEOs, for example Bertrand and Schoar (2003), Bennedsen et al. (2007), Lazear et al. (2016) and Bandiera et al. (2017).
The managing director was the largest shareholder in all firms, and all directors belonged to the same family. Two firms were publicly listed on the Mumbai Stock Exchange, although more than 50% of the equity in each of these was held by the managing family.
The field experiment aimed to improve management practices in the treatment plants and we measured the impact of doing so on firm performance. We contracted with a leading international management consultancy firm to work with the plants as the easiest way to change plant-level management practices rapidly. The full-time team of (up to) 6 consultants had been educated at leading Indian business and engineering schools and most of them had prior experience working with U.S. and European multinationals.
The intervention ran from August 2008 until August 2010, with data collection continuing until November 2011. The intervention focused on a set of 38 management practices that are standard in American, European, and Japanese manufacturing firms and which can be grouped into five broad areas: factory operations, quality control, inventory control, human-resources management, and sales and orders management (for details see Appendix Table A1).
Each practice was measured as a binary indicator of the adoption (1) or non-adoption (0) of the practice. A general pattern at baseline was that plants recorded a variety of information (often on paper sheets), but had no systems in place to monitor these records or use them in decisions. For example, 93 percent of the treatment plants recorded quality defects before the intervention, but only 29 percent monitored them daily, or by the particular sort of defect, and none of them had any standardized system to analyze and act upon this data.
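The management scores discussed later in the paper (for example, 0.25 at baseline for the treatment plants) lie on a 0-1 scale. As a minimal sketch of how such a score can be built from the binary indicators, assuming the score is the simple unweighted share of the 38 practices a plant has adopted (the function name and the example plant are hypothetical, not taken from the study data):

```python
from typing import Sequence

N_PRACTICES = 38  # size of the practice grid described in the text


def management_score(adopted: Sequence[int]) -> float:
    """Share of practices adopted, given one 0/1 indicator per practice.

    Assumption: an unweighted mean of the binary indicators.
    """
    if len(adopted) != N_PRACTICES:
        raise ValueError(f"expected {N_PRACTICES} indicators, got {len(adopted)}")
    if any(a not in (0, 1) for a in adopted):
        raise ValueError("each indicator must be 0 (not adopted) or 1 (adopted)")
    return sum(adopted) / N_PRACTICES


# Hypothetical plant that has adopted 10 of the 38 practices.
example_plant = [1] * 10 + [0] * 28
print(f"management score: {management_score(example_plant):.3f}")
```

Under this convention a change of 0.01 in the score corresponds to one percentage point of adoption, which is how the percentage-point effects in the text can be read.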
The consulting intervention had three phases. The first phase, called the diagnostic phase, took one month and was given to all treatment and control experimental plants. It involved evaluating the current management practices of each plant and constructing a performance database. At the end of the diagnostic phase the consulting firm provided each plant with a detailed analysis of its current management practices and performance and, crucially, recommendations for change.
2 Note that most international agencies define large firms as those with more than 250 employees.
The second phase was a four-month implementation phase given only to the treatment experimental plants. In this phase, the consulting firm followed up on the diagnostic report to help introduce as many of the 38 management practices as the plants could be persuaded to adopt. The consultant assigned to each plant worked with the plant management to put the procedures into place, fine-tune them, and stabilize them so that employees could readily carry them out.
The third phase was a measurement phase, which lasted until November 2011. This involved collection of performance and management data from all treatment and control plants.
In return for this continuing data, the consultants provided light consulting advice to the treatment and control plants (primarily to keep them involved).

II.B. The Initial Experimental Results - Management Practices
The intervention led to increases in the adoption of the 38 management practices in the treatment plants by an average of 38 percentage points by August 2010 (approximately one year after the start of the intervention). This adoption rate dropped by 3 percentage points in the second year of tracking, showing persistence in practices after the consultants had exited the firms. Not all practices were adopted equally, with firms adopting the practices that (unsurprisingly) were the easiest to implement and/or had the largest perceived short-run payoffs, e.g. the daily quality, inventory and efficiency review meetings. This adoption also occurred gradually, in large part reflecting the time taken for the consulting firm to gain the confidence of the firms' directors. Initially many directors were skeptical about the suggested management changes, and the intervention often started by piloting the easiest changes around quality and inventory in one part of the factory. Once these started to generate improvements, these changes were rolled out and the firms then began introducing the more complex improvements around operations and human resources.
In contrast, the control plants, which were given only the one-month diagnostic and corresponding recommendations, increased their adoption of the management practices, but by only 12 percentage points on average. This is substantially less than the increase in adoption in the treatment firms, indicating that the four months of the implementation phase were important in changing management practices. Table 2 Column 2 reflects this and shows a statistically significant 25 percentage point treatment effect on management practices in 2011. We note that the change for the control firms is still an increase relative to the rest of the industry around Mumbai (more than 100 non-project plants), which did not change their management practices on average between 2008 and 2011.
Finally, since these are multi-plant firms and the consulting firm worked at the plant level, the treatment and control firms also had plants that were not part of the intervention, which we label "non-experimental plants." For example, if a treatment firm has three plants A, B and C and the diagnostic and implementation intervention was performed on plant A, then A would be a "Treatment Experimental plant" while plants B and C would be "Treatment Non-Experimental plants". Likewise, if a control firm had plants D, E and F and the diagnostic intervention was only performed on plant D, then D would be a "Control Experimental plant" while E and F would be "Control Non-Experimental plants". Appendix Table A2 reports the breakdown of the plant count into these four groups.
Although the consulting firm did not provide consulting services to the non-experimental plants, it was still able to collect bi-monthly management data and some basic plant data for these other plants. The non-experimental plants in the treatment firms saw a substantial increase in the adoption of management practices. In these 5 plants the adoption rates increased by 17.5 percentage points by August 2010, without any drop back in the second year. This increase occurred because the executives of the treatment firms copied the new practices from their experimental plants over to their other (non-experimental) plants. Interestingly, this increase in adoption rates is similar to the control firms' 12 percentage point increase, suggesting that the copying of best practices across plants within firms can be at least as effective at improving management practices as short (1-month) bursts of external consulting.

II.C. The Initial Experimental Results - Firm Performance
Treatment firms experienced a significant increase in output of 9.4% relative to the control firms, which came about both by decreasing quality defects (so that less output was scrapped); and by undertaking routine maintenance of the looms, collecting and monitoring breakdown data, and keeping the factory clean, which reduced machine downtime. Total factor productivity (TFP) increased by 16.6% due to both the increase in output and a reduction in inputs due to reduced inventory and reduced labor inputs for mending defective fabric. These improvements were estimated to have increased profits per plant by about $325,000 per year. We estimate that this represented, on average, a doubling of profitability.

III.A. The Follow-up Process
In January 2017, working with the same consulting firm with which we had worked in 2008-2011, we re-contacted the 17 textile firms from the original study. Fortunately, all 17 firms agreed to work with the research team again on a follow-up study. This 100% uptake was aided by a combination of three factors: (A) the positive impact of the intervention in the first wave on the firms' management and performance; (B) the stability of the firms, which had maintained the same address and contact details, and (C) the engagement of the same three consulting company partners and project manager as the 2008-2011 intervention. 3 One complication is that one single-plant treatment firm was in the midst of closing down after the owner's death. Without any close male relatives to continue the business, the owner's wife had decided to sell the business, which, given its location, meant the business would stop trading and the site would be converted into residential housing. 4 One weakness of this follow-up wave is that our budget allowed us only two months of the consultants' time, which was sufficient to collect management data for all production sites and a basic set of firm performance indicators (e.g. on employment and looms), but not to collect detailed weekly output data that would allow TFP estimation, because that would have required extracting data on a firm-by-firm basis from log-books and accounting software. Consequently, our analysis is confined to management practices and basic performance indicators like employment or looms/employee, along with an imputed measure of labor productivity.
This follow-up data collection corresponds to an average period of 9 years since the implementation phase of the consulting intervention started and 7 years since it ended. It therefore enables us to examine the long-term persistence of these large changes in management practices.
3 These personal contacts are very important in our context. In fact, we delayed the start of this project to ensure we could staff the project with the same senior consulting team as the 2008-2011 wave.
4 The firm was over 30 years old, and due to the expansion of Mumbai was now located in a residential area, so the land was more valuable as housing than for production.

III.B. Results on Management Practices
In Figure 1 we plot the management scores over time after re-visiting the plants in January 2017 evaluated on the same 38-management practice scoring grid as in the prior experiment. We find substantial persistence of the management intervention, which we summarize below with four main results.
Treatment Experimental Plants: First, the management scores in the treatment plants fell from 0.60 at the end of the last wave to 0.46 eight years later. This drop of 0.14 points in the management score reverses 40% of the original 0.35 increase (noting these firms started pre-treatment with an average management score of 0.25) over an eight-year period. This fall in the management practice score is equivalent to an annual depreciation rate of about 6% in the original increase in management practices.
Control Experimental Plants: Second, the control plants also saw a drop in their management scores, falling by 0.08 points from 0.40 at the end of the last wave to 0.32. This is smaller in absolute terms compared to the fall in scores in the treatment plants, but the increase in management practices in the control plants was only 0.12 points (from an original score of 0.28), so that the drop in practice scores is 66% of the intervention gain, implying about a 13% depreciation rate of the original management increase.
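The reversal shares and depreciation rates quoted for the two groups can be reproduced with a short calculation. The sketch below assumes geometric (compound) decay of the original gain over the eight years, which is an assumption on our part about how the rates were computed, although it matches both the 6% and 13% figures; the function name is hypothetical.

```python
from typing import Tuple


def reversal_and_depreciation(gain: float, drop: float, years: int) -> Tuple[float, float]:
    """Return (share of the original gain reversed, implied annual depreciation rate).

    Assumption: the surviving share of the gain decays geometrically, so
    (1 - rate) ** years == 1 - drop / gain.
    """
    share_reversed = drop / gain
    annual_rate = 1 - (1 - share_reversed) ** (1 / years)
    return share_reversed, annual_rate


# Figures quoted in the text: gain in score, later drop, and the eight-year window.
treatment = reversal_and_depreciation(gain=0.35, drop=0.14, years=8)
control = reversal_and_depreciation(gain=0.12, drop=0.08, years=8)

for label, (share, rate) in (("treatment", treatment), ("control", control)):
    print(f"{label}: {share:.1%} of gain reversed, {rate:.1%} per year")
```

A linear rule (share reversed divided by eight) would give roughly 5% and 8%, so the compound form appears to be the one that fits the reported numbers.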
Together this indicates that, even eight years after the initial intervention, the treatment firms still had higher management practice scores. Table 2 reports the results from running the ANCOVA specification for plants (i) at time (t). Indeed, we see that the long-run treatment effect in 2017 of 19.7 percentage points is similar in magnitude to the short-run effect in 2011 (20.6 percentage points), and we cannot reject equality of these treatment effects over time (p=0.802). These effects are individually statistically significant using both conventional (large-sample normality-based) inference and permutation procedures with exact finite sample size (the corresponding p-values are also reported in Table 2). Thus, the intervention generated persistent impacts on the treatment plants.
Moreover, the greater percentage depreciation of the improvements in the control plants (66%) versus the treatment plants (40%) suggests that small improvements in management may be less stable than large improvements. One possible reason, which we discuss further below, is that bundles of management practices are complementary, so that adopting only parts of them may be less stable than adopting all of them. Of course, given the small sample sizes in this experiment, this could also reflect sampling noise, something that should be remembered when evaluating all our results from this experiment.
[...] expected steeper declines in management practices relative to what actually occurred, particularly for the non-experimental plants. While some of the practices were dropped, the majority of the interventions remained in place eight years later and the gap with the control group remained steady. The results therefore lie between these two extreme views of constant improvement and of no long-run impact.
5 Other examples of getting experts to provide ex ante predictions of the results of an experiment can be found in Hirshleifer et al. (2016), Groh et al. (2016) and DellaVigna and Pope (2017).
6 The predictions of the individual consultant and academic team members were made independently: Bloom estimated first and then the other team members individually e-mailed him their predicted scores. The average predicted scores were not particularly different across the two groups (hence we present them averaged together).
To delve further into the management changes, we also analyzed the 38 individual practices, as highlighted in Figure 2, which plots the average score for the experimental plants in the treatment firms on each practice on the X-axis against the average scores for the non-experimental plants [...] These scores subsequently subside as some practices are dropped. The non-experimental plants [...]

III.C. What Drives Changes in Management Practices
We next explore the proximate causes for the adoption or non-adoption of management practices on a practice-by-practice basis in Table 3 using directors' and plant managers' stated reasons for adding or dropping practices. In the "Treatment experimental" column we report the percentage of practices added (top panel) and dropped (bottom panel). In the second, third and fourth panels we report similar figures for the "Treatment non-experimental", "Control Experimental" and "Control Non-experimental", while reporting all plants in the final column. A few results are worth noting.
First, we see that, while a significant fraction of practices remains unchanged from 2011, there is notable churn in management practices across all plants. In particular, 4.1% of practices have been added and 12.4% of practices dropped since the end of the experiment. We are reasonably confident that these are accurately measured, derived as they are from detailed interviews with firm directors and plant managers. Second, in the non-experimental plants in the treatment firms, spillovers from other plants (in the same firm) are the single largest reason for practice adoption and account for 4.2% of improvements (out of a total improvement rate of 6.9%). In the control firms, spillovers from other firms outside the experimental group 7 were the most important driver of management improvements, driving 2.2% on average of the practice upgrades (out of a total of 2.6%). These two figures highlight the importance of within-firm and across-firm spillovers in improving management practices over the long run.
Third, in the experimental plants (in the treatment firms) the major reason for dropping practices was the introduction of a new plant manager (9.9% out of a total of 16.7%, so well over a half). The plant manager was evidently a critical part of the management improvement in the intervention plants, and if he left the firm then many of the practice improvements subsequently collapsed. 8 Another major factor across all the plants was director time: overall 3.6% of practices were dropped when directors had to reduce the time they spent managing the plant, often because of other business commitments (e.g. finance, marketing, or other businesses like retail or real estate). This highlights the importance of CEO time for firm management, consistent with the work of Bandiera et al. (2017). Finally, we see that 4.2% of practices were dropped because of "perceived negative benefits," which means the firms decided the practices were actually not worth adopting and decided to drop them. Table 4 [...]
7 Qualitatively these improvements appear to be from copying other firms in the industry, outside of those in our experimental sample. We did not come across cases of the control firms saying they had learned from the treated firms.
8 See also Fryer (2017), who argues that principal turnover was the primary reason for declines in school performance improvements following an experimental intervention aimed at changing school management practices in the United States.
9 We test whether having a new plant manager is differential across treatment and control, experimental or non-experimental, or correlated with management score in 2011, and find no significant difference. The point estimates (standard errors clustered at the firm level) are 0.050 (0.234), 0.086 (0.222), and 0.654 (0.517) respectively. Of course, we should as always be cautious of inference given the small sample size.
In column (5) we focus instead on the correlation of changes in practices with the frequency of usage across all plants of the practices in 2008, which is valued from 0 to 1, measuring the share of plants in the pre-experimental period that had adopted this practice. This proxies for how widespread their adoption was prior to the intervention, and the positive coefficient indicates that common practices were more likely to be maintained (so uncommon practices were more likely to be dropped). This highlights that the intervention was more successful at getting badly managed plants to adopt relatively standard practices, such as basic measurement systems, than at getting plants to adopt more advanced practices like data review meetings and performance rewards. In column (6) we add these all together and the results look similar, suggesting these are reasonably independent relationships.
Finally, in column (7) we include the management score in 2011 to look for mean reversion, finding a negative but insignificant coefficient. This is confirmed in Figure 4, which shows that both the initial treatment increase in management practices from 2008 to 2011 and the subsequent drop are uncorrelated with initial levels of management practices. So, changes in management practices appear not to be strongly correlated with initial levels, implying that, like TFP, management practices follow a highly persistent auto-regressive (or random-walk) form of stochastic evolution. Figure 4 is also useful in showing the distribution of changes in management practices among treated plants. We see that every single treated experimental plant improved its practices between 2008 and 2011, and every one of these plants subsequently saw a drop in its management practice score between 2011 and 2017. It is therefore not the case that there were some treated experimental plants in which a "Toyota way" virtuous cycle of continuous improvement occurred.
Finally, we examine the practices that were adopted to see which were the least likely to be retained, and which were the stickiest. Table A3 reports the number of firms which ever adopted a practice (i.e. were not using it in 2008, and then used it in at least one of 2011 or 2017), the number who after adopting were no longer using the practice in 2017, and the proportion of adopters who dropped the practice. We see two types of practices that were most likely to be dropped. The first is a set of visual displays and written practices that very few firms were using before the intervention and that were then discarded afterwards. These include displaying written procedures for warping, drawing, weaving and beam gaiting; displaying standard operating procedures for quality supervisors; and displaying visual reports of daily efficiency by loom and weaver. The second set of practices most likely to be dropped were ones that required daily attention from management: monitoring defects on a daily basis; meeting daily to discuss quality defects and gradation; and updating visual aids of efficiency on a daily basis. They were thus costly, and presumably seen as not very valuable.
In contrast, we see that many of these practices are very sticky. Of our 38 practices, once adopted, 14 are not dropped by a single plant, and a further 8 are dropped by at most one-quarter of those adopting. Particularly noticeable among these sticky practices are those which were adopted by 10 or more plants and then never dropped. These relate very closely to the most immediate improvements in quality and inventory levels that we saw from the original consulting intervention: recording quality defects in a systematic manner (defect-wise); having a system for monitoring and disposing old stock; and carrying out preventative maintenance.
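The three quantities tabulated per practice (ever adopted, dropped by 2017, and the share of adopters who dropped) follow directly from the definitions in the text. A minimal sketch for a single practice, using made-up usage indicators rather than the study data (the function name, the dictionary keys, and the example records are hypothetical):

```python
from typing import Dict, List, Tuple

# One (used_2008, used_2011, used_2017) triple of 0/1 indicators per unit.
Usage = Tuple[int, int, int]


def retention_summary(units: List[Usage]) -> Dict[str, float]:
    """Count adopters and droppers of one practice, per the definitions in the text."""
    # "Ever adopted": not using the practice in 2008, using it in 2011 or 2017.
    adopters = [u for u in units if u[0] == 0 and (u[1] == 1 or u[2] == 1)]
    # "Dropped": adopted at some point but no longer using the practice in 2017.
    dropped = [u for u in adopters if u[2] == 0]
    share = len(dropped) / len(adopters) if adopters else float("nan")
    return {
        "ever_adopted": len(adopters),
        "dropped_by_2017": len(dropped),
        "share_dropped": share,
    }


# Made-up example: five units' usage of one practice across the three waves.
example_units = [(0, 1, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0), (0, 0, 1)]
summary = retention_summary(example_units)
print(summary)
```

Units already using the practice in 2008 are excluded from the adopter count, so the drop share is measured only among those whose adoption could plausibly be attributed to the period after the baseline.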
Finally, we note that not all daily activities were susceptible to being dropped, with those most closely tied to keeping machines running quite persistent: firms still maintained daily monitoring of machine downtime and had daily meetings with the production team.

III.D. Results on Long Run Performance
The other question we investigated when returning to the plants was the long-run performance impact of the original management interventions. Because of budget limitations and the reluctance of firms to share financial data, we are not able to undertake a detailed analysis of TFP. 10 We were able, however, to collect basic information on plant size and looms in 2014 and 2017 to supplement our original data for 2008 and 2011. Since there were changes over time in the number of plants per firm, and the management practices have converged across plants within firms, we examine performance at the firm level.
We run Intention to Treat (ITT) panel regressions over four years (2008, 2011, 2014 and 2017) at the firm level with firm and year fixed effects and standard errors clustered at the firm level:
$$\mathrm{OUTCOME}_{i,t} = a\,\mathrm{TREAT}_{i,t} + b_t + c_i + \varepsilon_{i,t}$$
where OUTCOME is one of the key outcome metrics of looms, looms/employee, etc., $b_t$ are the year fixed effects and $c_i$ the firm fixed effects. We report statistical significance using both conventional inferential procedures based on normal approximations and permutation tests that have exact finite sample size, to allay sample size concerns. 11
We start in column (1) of Table 5 in the top panel looking at the number of looms (in logs), which is a basic measure of production capacity. In panel A, we regress this on a dummy for the year being greater or equal to 2011 (a post-intervention dummy), finding a statistically insignificant coefficient of -0.032. In panel B, we break down this impact by year, with the point estimates suggesting a 16.1 percent increase in capacity by 2017, but this is also not statistically significant.
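To make the estimation strategy concrete, the following sketch fits a two-way fixed-effects regression of this form on synthetic data and computes a permutation p-value by re-assigning treatment across firms. Everything numeric here is an assumption for illustration: the data are simulated, the split of 11 treated firms out of 17 is hypothetical, and the permutation step draws random re-assignments rather than enumerating all of them, so it only approximates an exact test. Clustered standard errors are not computed.

```python
import random
from statistics import mean

YEARS = [2008, 2011, 2014, 2017]  # the four waves named in the text
N_FIRMS = 17                      # number of firms in the study


def treat_matrix(treated_firms):
    """TREAT[i][t] = 1 for treated firms in post-intervention years (>= 2011)."""
    return [[1.0 if (i in treated_firms and yr >= 2011) else 0.0 for yr in YEARS]
            for i in range(N_FIRMS)]


def two_way_demean(x):
    """Remove firm means and year means from a balanced panel x[firm][year]."""
    firm_means = [mean(row) for row in x]
    year_means = [mean(row[t] for row in x) for t in range(len(x[0]))]
    grand = mean(firm_means)
    return [[x[i][t] - firm_means[i] - year_means[t] + grand
             for t in range(len(x[0]))] for i in range(len(x))]


def twfe_estimate(y, d):
    """Coefficient a in y = a*d + year FE + firm FE + error (balanced panel)."""
    yd, dd = two_way_demean(y), two_way_demean(d)
    num = sum(a * b for ry, rd in zip(yd, dd) for a, b in zip(ry, rd))
    den = sum(b * b for rd in dd for b in rd)
    return num / den


def permutation_pvalue(y, treated_firms, n_draws, rng):
    """Share of random firm-level re-assignments with |estimate| >= observed."""
    observed = abs(twfe_estimate(y, treat_matrix(treated_firms)))
    hits = 0
    for _ in range(n_draws):
        placebo = set(rng.sample(range(N_FIRMS), len(treated_firms)))
        if abs(twfe_estimate(y, treat_matrix(placebo))) >= observed - 1e-12:
            hits += 1
    return hits / n_draws


def synthetic_outcome(treated_firms, effect, noise_sd, rng):
    """Simulated log outcome: effect plus firm effect, year effect, and noise."""
    d = treat_matrix(treated_firms)
    return [[effect * d[i][t] + 0.1 * i + 0.05 * t + rng.gauss(0.0, noise_sd)
             for t in range(len(YEARS))] for i in range(N_FIRMS)]


rng = random.Random(0)
treated = set(range(11))  # hypothetical: firms 0-10 treated, 11-16 control
y = synthetic_outcome(treated, effect=0.25, noise_sd=0.2, rng=rng)
a_hat = twfe_estimate(y, treat_matrix(treated))
p_val = permutation_pvalue(y, treated, n_draws=2000, rng=rng)
print(f"a_hat = {a_hat:.3f}, permutation p-value = {p_val:.3f}")
```

Permuting at the firm level mirrors the level at which treatment was assigned, which is why the placebo draws reshuffle whole firms rather than individual firm-year observations.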
In column (2) we examine employment. The point estimates suggest a relatively large drop in employment, of 23 to 24 percent on average over the full period, and in 2017. However, this drop is also not statistically significant. There are two reasons why employment may have fallen.
The first is that, at baseline, firms employed many workers fixing quality defects and would need less of this sort of labor as quality improved. Second, production process improvements and fewer breakdowns can enable the same worker to be in charge of more looms.
Column (3) then combines these measures to focus on our main measure of long-term firm productivity, which is log looms per employee. This is a classic productivity measure in the literature (see, for example, Clark 1987 or Braguinsky et al. 2015). One reason is that employees spend much of their time dealing with malfunctioning looms, so that a higher number of looms per employee indicates fewer breakdowns and higher rates of production uptime (the time the loom is producing output rather than being repaired). As such, column (3), panel A, shows that the average treatment effect over the full post-intervention period was to increase looms per employee by a statistically significant 26.7%. Panel B suggests this efficiency improvement was rising over time in that the coefficients are generally larger for 2017 than for 2011, with the long-run impact a statistically significant 51.0 percent increase in this productivity measure. However, despite the trend of rising coefficients, we cannot reject that this productivity impact is constant over time.
We also want to investigate the impact on labor productivity. While we did not collect information on labor productivity in 2017, we can use the survey data from the initial wave to impute a labor productivity impact. In particular, we use data from a survey we ran in 2011 of 113 firms in the broader textile industry around Mumbai (see details in Appendix A2), in which we collected data on physical production, employment, and looms. Using this, we show in Appendix Table A4 and Figure A2 that there is a strong correlation between labor productivity (output per worker) and looms per worker in both the cross-section and the panel. Taking the fitted coefficient of 0.734 from column (4) of Table A4, we impute labor productivity from looms per employee for our experimental firms. The average imputed increase in labor productivity since 2011 is then 19.0% (exp(0.237*0.734) - 1), and the long-run impact is 35.3% (exp(0.412*0.734) - 1). These impact figures are remarkably similar to the 15.3% and 31.2% 1-year and 10-year productivity impacts, respectively, reported for management interventions in post-war Italy in Table 3 of Giorcelli (2017).[12]

[11] We also estimate the regression at the plant level and the results are qualitatively similar.

In column (4) we asked the plants if they had used any consultants since 2011, and if so for how many days. Many of these firms had, and indeed, as column (4) shows, this use of consultants was significantly higher in the treatment plants. These consultants were local firms offering very practical advice on loom-changing practices, fabrics, human resources, or textile marketing, rather than the types of expensive international-firm management consulting provided by our intervention. We interpret this as a revealed preference indicator that treatment firms found the intervention useful and were more willing to pay for commercial consulting in the future. This was more likely to occur once some time had passed since their previous consulting experience in our project (panel B).
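The arithmetic behind the imputed labor productivity figures reported above can be checked directly. The short sketch below assumes that 0.237 and 0.412 are the log-point coefficients on looms per employee (as they appear in the expressions in the text) and that 0.734 is the fitted coefficient from column (4) of Table A4; the names `ELASTICITY` and `pct_change` are introduced here for illustration only. It converts log-point effects into percentage changes, both for looms per employee and for imputed labor productivity.

```python
import math

ELASTICITY = 0.734  # fitted coefficient from column (4) of Table A4


def pct_change(log_points):
    """Convert a log-point effect into a percentage change."""
    return 100.0 * (math.exp(log_points) - 1.0)


# Log-point effects on looms per employee, as used in the text.
looms_per_employee = {"post-2011 average": 0.237, "2017": 0.412}

for label, coef in looms_per_employee.items():
    direct = pct_change(coef)                # effect on looms per employee
    imputed = pct_change(coef * ELASTICITY)  # imputed labor productivity effect
    print(f"{label}: looms/employee {direct:.1f}%, imputed labor productivity {imputed:.1f}%")
```

Run as written, this reproduces the 26.7% and 51.0% effects on looms per employee and the 19.0% and 35.3% imputed labor productivity effects.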
Finally, in column (5) we look at the adoption of marketing practices. Marketing practices were not part of our initial intervention, and so this enables us to examine whether changes in the specific practices on which our intervention focused are accompanied by broader management changes. Our measure is a score given for the adoption of seven practices: (1) does a director regularly attend trade shows; (2)-(4) how frequently the firm systematically analyzes markets, products, and prices to assess policies (and make changes wherever necessary); (5) does the firm have a dedicated brand; (6) does the firm have a sales and marketing professional; and (7) does the firm use any e-commerce (for sales) and social media (for advertising). Panel A shows that treatment firms are significantly more likely to adopt these marketing practices. Discussions with firms highlighted their attempts to be more systematic in management across a range of activities. So, in this sense, there were cross-practice management spillovers. Improving production and human-resource management practices led firms to value a more data-driven, systematic management approach, and hence apply this to other areas like marketing.

IV. CONCLUSIONS
In summary, the intervention in 2008-2010 did have lasting effects, but not the multiplier effect of on-going further improvements that the "Toyota Way" theory would have predicted. Indeed, a significant fraction of the induced improvements were dropped, especially if the plant manager changed, the directors were short of time, or if the practices were not common before the intervention. Still, many of the changes persisted, and spread throughout the treatment firms.
There was also some persistence and some drop in the control plants' set of practices. Thus, the "inappropriate technologies" view does not find much support. Beyond that, the "three-year life" conventional wisdom described in the introduction is also decisively rejected, at least for the sort of practice changes our intervention induced.
The treatment firms were still much better managed in 2017 than the control, and key practices around quality control and inventory management were maintained. Moreover, the treatment firms used more consulting and did more marketing, suggesting that the more systematic approach to management introduced by the intervention was spreading to areas the intervention had not addressed, and we see long-term benefits in terms of a measure of worker productivity.
These lasting impacts highlight the importance of management in explaining persistent productivity differences amongst firms. Understanding why more firms do not invest in improving management, and what types of policies can change this, is therefore an important question for future research.

Table A2 reports the sample of plants by the four types (treatment and control, experimental and nonexperimental). As noted in the text, one treatment firm exited because of the death of the owner without any male heirs, which led to the closure of one plant. Two more treatment plants closed because they were amalgamated into other plants within the same firm; that is, all the looms and equipment were moved onto one site for production economies of scale. We count these as plant closures (since those plants stopped operating), but their output is still included at the firm level. Finally, both treatment and control firms opened some plants over this period due to demand growth.

AII) Management survey in 2011 and Imputing Labor Productivity:
Between November 2011 and January 2012 we ran an in-person survey of textile firms around Mumbai with 100 to 1,000 employees, using the Ministry of Commercial Affairs registry of firms plus a combination of industry lists, internet searches, and referrals as a sample frame (see online Appendix A2 of Bloom et al. 2013 for more sampling details). We identified 172 such firms, and were able to interview 113 of them (17 project firms and 96 non-project firms). The main purpose of this survey was to benchmark the management practices of our experimental sample against the industry as a whole, and we found that our project firms did not differ significantly in management practices from the non-project firms interviewed.
The interview followed a relatively standardized script, asking background questions about the firm (age, ownership, family involvement, markets, etc.), followed by questions about plant size (employees, output, plant numbers, production quantity), management practices, organizational structure, computerization, prior consulting, prior knowledge of the Stanford-World Bank project (we skipped this question for firms involved in the experiment), and any potential interest in future consulting waves. The full survey is available at www.stanford.edu/~nbloom/Template.xlsx.
In this paper, we use the data collected in this survey on the annual physical output of the firm (in meters or production picks), the number of employees (permanent plus contract), and the number of looms in the firm. We attempted to collect this for the four years 2008-2011, and we were able to collect this information for all four years for 87 firms, and for two or three years for a further 7 firms. Using these data, we construct labor productivity as the log of physical production units per worker. This is similar to the sales per worker term often used to measure labor productivity, but has the advantage of not incorporating price effects.
Appendix Figure A2 shows the strong correlation (0.561) between labor productivity and looms per employee. Appendix Table A4 presents the corresponding regression relationship. Column 1 shows the strong cross-sectional relationship, which persists after adding year fixed effects (column 2), firm fixed effects (column 3), and both year and firm fixed effects (column 4). Column 4 then shows that annual changes in looms per employee are associated with changes in labor productivity. This yields the fitted relationship: Log production per worker = 0.734 (s.e. 0.114) * Log looms per worker + year effect + firm fixed effect.
We use this fitted relationship to impute labor productivity impacts from our impact on looms per worker in Table 5.

Notes: [...] Both are clustered at the firm level. *, **, and *** denote significance at the 10, 5, and 1 percent levels, respectively, based on the robust standard errors. Permutation tests report the p-value for testing the null hypothesis that the treatment had no effect by constructing the permutation distribution of the estimator using 4,000 possible permutations of firm-level random assignment. The second column limits the sample from column 1 to plants that were present in both years with no missing management scores.

Notes: Lists the shares of practice-by-plant cells in terms of reasons for change between 2011 and 2017, in terms of practices added, dropped, or left unchanged. Each of the five columns totals 100. Calculated as a share of 1,042 practices, which comprise the 38 practices across the 28 plants (11 treatment experimental, 9 treatment non-experimental, 6 control experimental and 2 control non-experimental) in operation in both 2011 and 2017, except for the inventory practices, which are missing in plants that hold no inventory because they make to order.

Notes: Dependent variable is the change in the -1, 0, 1 indicator for the change in management practice between 2011 and 2017. Each of the seven columns has 1,042 observations. The sample is the 38 practices across the 28 plants (11 treatment experimental, 9 treatment non-experimental, 6 control experimental and 2 control non-experimental) in operation across both periods, except for the inventory practices, which are missing in plants that hold no inventory because they make to order. Regressions clustered at the firm level. *** denotes 1%, ** denotes 5%, * denotes 10%.