WATER GLOBAL PRACTICE
WSS GSG UTILITY TURNAROUND SERIES

Statistical Analysis
Global Study on the Aggregation of Water Supply and Sanitation Utilities

AUGUST 2017
Michael Klien

About the Water Global Practice
Launched in 2014, the World Bank Group's Water Global Practice brings together financing, knowledge, and implementation in one platform. By combining the Bank's global knowledge with country investments, this model generates more firepower for transformational solutions to help countries grow sustainably.

Please visit us at www.worldbank.org/water or follow us on Twitter at https://twitter.com/search?q=%40WorldBankWater&src=tyah.

© 2017 International Bank for Reconstruction and Development / The World Bank
1818 H Street NW, Washington, DC 20433
Telephone: 202-473-1000; Internet: www.worldbank.org

This work is a product of the staff of The World Bank with external contributions. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of The World Bank, its Board of Executive Directors, or the governments they represent.

The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries.

Rights and Permissions
The material in this work is subject to copyright. Because The World Bank encourages dissemination of its knowledge, this work may be reproduced, in whole or in part, for noncommercial purposes as long as full attribution to this work is given.

Please cite the work as follows: Klien, Michael. 2017. Statistical Analysis: Global Study on the Aggregation of Water Supply and Sanitation Utilities. World Bank, Washington, DC.
Any queries on rights and licenses, including subsidiary rights, should be addressed to World Bank Publications, The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA; fax: 202-522-2625; e-mail: pubrights@worldbank.org.

Cover design: Jean Franz, Franz & Company, Inc.

Contents
Executive Summary
Chapter 1  Introduction
    Note
    References
Chapter 2  A Framework for Water Utility Structure
    2.1 Key Dimensions of Utility Structure
    2.2 Clustering Utilities According to Structural Dimensions
    2.3 Relationship to Performance and Input Structure
    Notes
    References
Chapter 3  The Empirics of Aggregation
    3.1 Are Aggregating Utilities Different?
    3.2 How Aggregations Change Utility Structure
    Note
    References
Chapter 4  The Performance Consequences of Aggregation
    4.1 Empirical Strategy
    4.2 Matching Results
    4.3 Difference-in-Differences Results
    4.4 Postaggregation Performance
    4.5 Alternative Aggregation Measures
    4.6 Distinguishing Strong and Weak Utilities
    Notes
    References
Chapter 5  Discussion and Conclusion
    Reference
Appendix A  IBNET Data
Appendix B  Methodological Details of Clustering
    Reference

Figures
2.1. Scatter Plot for Structural Characteristics
2.2. Histograms for Customers, Density, and Number of Towns
2.3. Box Plots for Customers, Density, and Number of Towns, by Cluster
2.4. Scatter Plot for Customers and Number of Towns
2.5. Scatter Plot for Customers and Density
2.6. Scatter Plot for Density and Number of Towns
2.7. Box Plots for Unit Costs and WUPIall over Clusters
2.8. Box Plots for Cost Shares
2.9. Box Plots for Costs per m³
2.10. Box Plots for Labor Cost Components
3.1. Aggregations and Change in Number of Customers
3.2. Aggregations and Change in Density
3.3. Labor Share before and after Aggregations
3.4. Cost Components before and after Aggregations
B.1. Tests Statistics for K-Means Cluster Choice

Tables
2.1. Pairwise Correlations
2.2. Principal Component Analysis Output
2.3. Clusters According to Structural Dimensions
3.1. Comparison of Aggregating and Not Aggregating Utilities in IBNET Sample
4.1. Distribution of Aggregations by Region
4.2. Distribution of Aggregations by Income Level of Countries
4.3. Propensity Score Estimation
4.4. Bias before and after Matching
4.5. Difference-in-Differences
4.6. Difference-in-Differences: Conditional on Initial Number of Systems
4.7. Difference-in-Differences: Conditional on Initial Volume
4.8. Difference-in-Differences: Conditional on Initial Density
4.9. Postaggregation Phase
4.10. Alternative Merger Indicator
4.11. Difference-in-Differences: Conditional on Initial Performance (WUPI)
A.1. Utilities per Country, by Treatment Status
A.2. Summary Statistics

Executive Summary

By definition, aggregation implies a change in the structure of involved utilities by combining several preexisting utilities into an integrated organization. To analyze and quantify the performance consequences of aggregations, one must have an understanding of the relationship between utility structure and performance and of the way aggregations change utility structure.

Structure is more than size. A one-dimensional description of utilities is too narrow to describe key aspects of utility structure. The data-driven framework in this report suggests that apart from pure size-related output indicators (such as volume or number of customers), utilities need to be differentiated according to density and the number of towns served.

Using the three dimensions (customers, density, and towns) to describe utility structure, the universe of utilities in the IBNET database can be grouped into a small number of clusters of similar utilities. Both the clustering and the findings regarding the relationship of structure with performance and utility input structure show that there is a divide between utilities that serve a single town and utilities that serve several towns.

For utilities that serve a single town with water and wastewater, larger volumes and density are positively related to performance in terms of lower unit cost and higher quality of service. These results do not carry over as clearly to utilities that serve multiple towns. Because aggregations will tend to move utilities from serving a single town to serving several towns, the performance consequences are much less clear than in the simple case of growing a utility that serves a single town.

In addition, although utilities serving a single town experience reductions in the share of labor costs as they grow, utilities serving several towns exhibit less clear patterns and seem to incur additional transaction costs. This observation suggests that if aggregations are simply a process to transform utilities from one structural setup to another, the outcome will depend on not only (a) the initial structure before aggregation, but also (b) the way the aggregation changes the structure.

Given that utility structure is a key to understanding how utility performance might change because of aggregations, the longitudinal data in IBNET enable researchers to analyze how aggregations change the three dimensions of utility structure. At least for the sample of aggregations examined in IBNET, the number of towns increased (by definition of aggregation), but the aggregations added few customers and in many cases reduced density. This result corresponds to the findings in a number of previous studies that suggest that density losses prohibit economies of scale. Another contrast to the case of growing a utility that serves a single town is that aggregating utilities has not been found to decrease the share of labor cost over time. These findings raise the following question: Through which channels can the aggregations in principle and in practice improve the performance of utilities?

The causal analysis also confirms these impressions: On average, the analyzed aggregations have had no effect on cost and various other performance indicators when compared with similar utilities that did not aggregate. However, the analysis also indicates that the results often depend on the initial structure of the utility and the design of the aggregation. Regardless of the fact that the conditionalities may be very situation specific, the findings stress that the design and structure of the affected utilities may be more important than the question of whether or not to aggregate. And although it is difficult to directly deduce a recipe for successful aggregations from these results, there is little doubt that a careful analysis of the existing and targeted utility structure is a prerequisite to managing expectations and making the most of an aggregation reform.

Chapter 1  Introduction

The ultimate goal of this report is to empirically assess the performance consequences of aggregations. To answer this question in a meaningful way, it is crucial to understand what aggregations are and how they change the structure of water utilities. Although technically aggregations are loosely defined as “the process by which two or more WSS service providers consolidate some or all their activities under a shared organizational structure,”¹ from an organizational perspective, aggregations can also be seen as a transformation process that changes the structure of the involved utilities along various dimensions. As a consequence, the effect of an aggregation will depend largely on how it changes the structural characteristics of a utility.

The focus on utility structure as the main mechanism through which aggregations could affect performance is warranted by two facts. First, the definition of aggregations as a merger of several organizational structures already implies that the change in structure is a key component of the phenomenon. Second, arguments in favor of aggregation often relate to organizational design. For instance, aggregation reforms are often accompanied by the expectation of achieving economies of scale through increased utility size (see Abbott and Cohen 2009; Carvalho, Marques, and Berg 2012; González-Gómez and García-Rubio 2008; Saal and others 2013; Walter and others 2009). However, as the following sections will show, aggregations affect several dimensions of utility structure simultaneously. Utility size in the sense of volume or customers is too narrow a description of how aggregations actually change utility structure.

For this reason, the first part of this report proposes a framework to describe, classify, and analyze water utilities according to their structural characteristics. Applying a data-driven approach to the universe of IBNET data, we start by identifying a small number of key dimensions of utility structure. These indicators are then used to classify utilities into homogeneous clusters of specific utility types. Finally, the section analyzes the relationship between utility structure and (a) performance measures and (b) the cost structure.

In the second part of the report, the developed framework is used to compare the structure of aggregating utilities (before aggregation) with utilities that do not aggregate. This step is helpful to understanding if certain utility types are more frequently involved in aggregations than others are, a crucial factor in choosing a suitable control group in the ensuing empirical analysis. In addition, the section describes how aggregations change the structure of utilities on the basis of observed aggregations in IBNET. This step should convey a clearer picture of what aggregations mean in terms of utility structure (magnitude and direction of changes) and which aggregation designs appear frequently.

In the final section, we attempt to measure the causal effect of aggregations on utility performance. The approach, following Klien and Michaud (2017), is to measure how utility performance evolved for utilities that grew through an aggregation compared with utilities that were not aggregated. Building on the insights from the previous discussion, the effect of the aggregations will be allowed to vary depending on the type of aggregation as well as on the initial structure of a utility. This step should help to explain whether and in which cases the reform design and the structure of the affected utilities matter.

Note
1. The definitions and conceptual basis for large parts of the analysis can be found in Michaud and others (2017).

References
Abbott, Malcolm, and Bruce Cohen. 2009. “Productivity and Efficiency in the Water Industry.” Utilities Policy 17 (3–4): 233–44.
Carvalho, Pedro, Rui Cunha Marques, and Sanford Berg. 2012. “A Meta-regression Analysis of Benchmarking Studies on Water Utilities Market Structure.” Utilities Policy 21: 40–49.
Klien, Michael, and David Michaud. 2017. “Diseconomies of Consolidation in Water Utility Mergers: When Economies of Scale Are Not Enough.” Unpublished manuscript.
Michaud, David, Maria Salvetti, Michael Klien, Berenice Flores, Gustavo Ferro, and Stjepan Gabric. 2017. Joining Forces for Better Services? When, Why, and How Water and Sanitation Utilities Can Benefit from Working Together. Washington, DC: World Bank.
Saal, David S., Pablo Arocena, Alexandros Maziotis, and Thomas Triebs. 2013. “Scale and Scope Economies and the Efficient Vertical and Horizontal Configuration of the Water Industry: A Survey of the Literature.” Review of Network Economics 12 (1): 93–129.
González-Gómez, Francisco, and Miguel A. García-Rubio. 2008. “Efficiency in the Management of Urban Water Services. What Have We Learned after Four Decades of Research?” Hacienda Pública Española/Revista de Economía Pública 185 (2): 39–67.
Walter, Matthias, Astrid Cullmann, Christian von Hirschhausen, Robert Wand, and Michael Zschille. 2009. “Quo vadis Efficiency Analysis of Water Distribution? A Comparative Literature Review.” Utilities Policy 17 (3–4): 225–32.

Chapter 2  A Framework for Water Utility Structure

2.1 Key Dimensions of Utility Structure

Given the vast number of dimensions of utility size and structure, it is hardly surprising that no existing framework is precise and operational enough to allow a classification of water utilities according to very stylized theoretical considerations. In the absence of such a clear theoretical concept, a data-driven approach is applied to identify a small number of distinct measures for utility structure. The focus is to reduce the large number of indicators by discarding indicators that measure similar underlying structural factors.

For the purpose of the underlying study, the definition of utility structure should capture those aspects of a utility’s setup (in a very broad sense, and in contrast to the more narrow meaning of size) that could potentially change in an aggregation process. Apart from the increase in the number of towns served by the aggregated utility, which is the definition of an aggregation,¹ it should particularly include factors related to output and supply characteristics. Conversely, it does not include indicators for input choices (share of different cost components) or performance in the sense of economic outcomes (like cost or quality). This exclusion is deliberate and should avoid a conflation of (largely) exogenous structural features with highly endogenous process and managerial choice variables. Starting from a long list of possible indicators, correlation measures as well as principal component analysis (PCA) are used to identify the key structural indicators.

To avoid inconsistent comparisons between utilities offering only water or wastewater services and utilities offering both services, we restrict the data to the latter type of utilities. Utilities providing both water and wastewater services account for more than 80 percent of the observations in the underlying IBNET dataset. Moreover, wastewater-only utilities represent a negligible share (<2 percent), and the results for water-only companies appear very similar.

Using the indicators available in IBNET,² the following potential measures are considered:
• Volume of water produced (m³)
• Volume of wastewater collected (m³)
• Population in the service area for water (#)
• Population in the service area for wastewater (#)
• Number of customers connected to water supply (#)
• Number of customers connected to wastewater services (#)
• Length of water network (km)
• Length of sewer network (km)
• Number of towns served with water (#)
• Number of towns served with wastewater (#)
• Density of water system (equals population connected to water supply/length of water network)
• Density of wastewater system (equals population connected to wastewater services/length of sewer network)

To narrow down the number of indicators, redundant indicators that measure the same underlying structural characteristic are removed step by step. First, the correlations in table 2.1 show that water and wastewater characteristics are generally very highly correlated.³ Utilities with a large volume of water tend to exhibit a large volume of wastewater. In addition, a PCA on all these variables suggests that water and wastewater characteristics of a particular measure capture the same underlying characteristics (see table 2.2). For instance, the variables on the number of towns served with both water and wastewater load on the same component.

TABLE 2.1. Pairwise Correlations
          vol_w     vol_ww    cus_w     cus_ww    popsa_w   popsa_ww  len_w     len_ww    dens_w    dens_ww   towns_w   towns_ww
vol_w     1
vol_ww    0.887     1
cus_w     0.956     0.861     1
cus_ww    0.890     0.917     0.915     1
popsa_w   0.941     0.845     0.987     0.897     1
popsa_ww  0.925     0.842     0.968     0.905     0.978     1
len_w     0.891     0.807     0.920     0.837     0.911     0.879     1
len_ww    0.825     0.867     0.832     0.892     0.812     0.808     0.855     1
dens_w    0.428     0.377     0.476     0.447     0.462     0.488     0.0935    0.196     1
dens_ww   0.366     0.345     0.408     0.479     0.409     0.432     0.191     0.0315    0.607     1
towns_w   0.250     0.235     0.275     0.223     0.282     0.225     0.311     0.222     0.00209   0.0622    1
towns_ww  0.255     0.279     0.285     0.275     0.290     0.273     0.310     0.273     0.0282    0.0768    0.848     1
Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables. vol_w = volume of water produced; vol_ww = volume of wastewater collected; cus_w = customers connected to water supply; cus_ww = customers connected to wastewater services; popsa_w = population of service area for water; popsa_ww = population of service area for wastewater; len_w = length of water network; len_ww = length of sewer network; dens_w = density of water system; dens_ww = density of wastewater system; towns_w = number of towns served with water; towns_ww = number of towns served with wastewater.

TABLE 2.2.
Principal Component Analysis Output
            Component 1   Component 2   Component 3
vol_w        0.3446793    −0.0449649    −0.0680859
cus_w        0.3529888    −0.04525      −0.0241244
popsa_w      0.3499888    −0.038884     −0.0177919
len_w        0.3261232     0.1137127    −0.2510409
dens_w       0.1650122    −0.3699714     0.5017993
towns_w      0.11942       0.6087636     0.3438152
vol_ww       0.3301522    −0.0223877    −0.0866724
cus_ww       0.3441992    −0.0712156    −0.017246
popsa_ww     0.3469444    −0.0760358    −0.0084673
len_ww       0.3133416     0.081476     −0.317169
dens_ww      0.1529108    −0.3161935     0.5786726
towns_ww     0.129799      0.5966951     0.3436472
Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables. vol_w = volume of water produced; vol_ww = volume of wastewater collected; cus_w = customers connected to water supply; cus_ww = customers connected to wastewater services; popsa_w = population of service area for water; popsa_ww = population of service area for wastewater; len_w = length of water network; len_ww = length of sewer network; dens_w = density of water system; dens_ww = density of wastewater system; towns_w = number of towns served with water; towns_ww = number of towns served with wastewater.

The components represent the underlying factors. Following a rule of thumb, we choose the number of components depending on eigenvalues being close to or above 1, yielding three components. As a result, the distinction between water and wastewater indicators is dropped, and instead an integrated measure representing the sum of each water and wastewater indicator is used henceforth. This reduces the number of indicators needed to measure utility structure by half.

From the remaining six indicators, four appear to measure a similar structural characteristic that could loosely be interpreted as “size.” As the correlations in the scatter plots in figure 2.1 show, the indicators volume, customers, population in service area, and length of the network are very highly correlated: the correlations of these variables vary between 0.90 and 0.98. Bearing in mind that 0 implies no correlation and 1 represents a perfect correlation, the observed correlations are extremely high. Utilities that serve a large volume have many customers, a large population in the service area, and a large network. Also, the previous PCA indicated that these variables load on the same component; that is, they seem to represent similar structural characteristics. Consequently, to further reduce the number of indicators of utility structure, of the four size indicators, only customers is retained for the further analysis.

FIGURE 2.1. Scatter Plot for Structural Characteristics
[Scatter-plot matrix of volume, customers, population in service area, network length, number of towns, and density.]
Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables.

What remains are three structural indicators of utility structure: the number of customers, density, and the number of towns.⁴ It is important to stress that these indicators measure different aspects of utility structure. For example, for a similar number of customers, we observe utilities with a large variation in density and the number of served towns. It also means that with these 3 indicators we are able to describe utilities of widely varying structure without using all 12 initial indicators.

Before we go on to identify utility clusters with similar structural characteristics, it is useful to look at some descriptive statistics for the three chosen structural indicators.

First, for all three indicators, the distribution is (heavily) right skewed, meaning that the number of observations steadily decreases for increasing values of the indicators. In other words, there are many observations with relatively low and moderate numbers of customers, density, or number of towns and only a few observations with very high values.

As shown in the upper and middle panels in figure 2.2, because of the very long tail of the distribution, the median values of customers and density are considerably lower than the average. In the case of density, the median is 252 compared with the mean of 308; for customers, the median is 69,000 compared with an average of over 383,000. The distribution is even more extreme for the number of towns: although the average is roughly seven towns, more than 80 percent of all observations in the sample serve a single town for water and wastewater (and are thus counted as two towns: one for water and one for wastewater). For the remaining utilities that serve more towns, the distribution is also very heavily right skewed. As seen in the lowest panel of figure 2.2, excluding utilities that serve one town for water and wastewater, there are fewer and fewer observations as the number of towns increases.

FIGURE 2.2. Histograms for Customers, Density, and Number of Towns
[Three histograms with density on the vertical axis. Horizontal axes: Customers (10,000); Density in Customers per km2; Number of Towns.]
Note: In the lowest panel, utilities that serve 2 and more than 100 towns are excluded.

This demonstrates that although IBNET may not be representative of the whole population of utilities (it likely oversamples larger utilities), there is still considerable variation in the structural dimensions. For example, more than 10 percent of all observations are utilities with a combined number of customers for water and wastewater below 10,000. This fact should ensure that the ensuing description and classification of utility types may be applied beyond the data sample in IBNET.

2.2 Clustering Utilities According to Structural Dimensions

In this section the chosen dimensions of utility structure are used to classify utilities into homogeneous clusters of specific utility types. The first goal is to identify frequently appearing configurations of utility structure. Clustering utilities can be seen as a middle ground between a hard-to-interpret multidimensional representation and an overly simplified one-dimensional description, for instance, a small-versus-large dichotomy. The choice of the number of clusters considers the trade-off between few but very heterogeneous clusters and many not easily distinguishable clusters. Although the final number of six clusters⁵ is somewhat arbitrary, using a relatively small number of clusters appears to give an appropriate and meaningful representation of the utilities in IBNET.

The results of the clustering (a combination of hierarchical clustering followed by k-means clustering) are shown in figure 2.3. The box plots are particularly useful because they simultaneously display information about the shape and dispersion of the structural dimensions across the six clusters. Each box itself comprises the middle 50 percent of observations. The line within the box is the median. The lower end of the box signifies the first quartile, whereas the upper end of the box corresponds to the third quartile. In addition, the lowest and the highest lines outside the box indicate the minimum and maximum values.

FIGURE 2.3. Box Plots for Customers, Density, and Number of Towns, by Cluster
[Three box-plot panels by cluster (1 to 6): Customers; Density; Number of towns.]
Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables. Outliers have been omitted from the graph for presentation purposes.

The clustering appears to strongly distinguish between utilities that serve a single town with water and sewerage (clusters 1 to 3), and those that serve several towns (clusters 4 to 6). In the latter group, the clusters are further distinguished into utilities that tend to serve a limited number of towns (clusters 4 and 5) and a group of utilities that serve a large number of towns (cluster 6).

For customers and even more so for density, the distinctions between the clusters are not as clear as for the dimension of towns. As can be seen, the dispersion of customers is nonnegligible for most clusters. For customers, cluster 1 is clearly the cluster with the lowest number, followed by clusters 2 and 4, which fall into an intermediate category. The remaining three clusters (clusters 3, 5, and 6) serve a high or medium-high number of customers. In the case of density, the clusters could also be roughly described as exhibiting low, medium, or high densities. Clusters 2 and 5 show high densities; clusters 3 and 4 show low densities. For clusters 1 and 6, the results are less clear with a wide dispersion of densities, possibly indicating an average density.

Although the overlaps seem considerable for some clusters in terms of customers and density, it should be noted that in combination with the number of towns the final cluster grouping is quite distinctive. To illustrate this observation, figures 2.4 to 2.6 show the scatter plots of each dimension with each other, differentiating clusters by color. As shown in the panels, using only two of the structural dimensions at a time already separates most clusters clearly.

Finally, the combination of all three dimensions gives a quite clear-cut distinction, which can be qualitatively described as in table 2.3. Cluster 1, for example, is what might typically be understood as a “small” utility, in all respects: it serves few customers, with low density and only in a single town. The other two clusters that serve only a single town are clusters 2 and 3, and they distinguish themselves through customers and density. Cluster 2 exhibits both a medium number of customers and medium density. Cluster 3 has a high number of customers and high density. Hence, utilities that serve a single town could broadly be distinguished as small, medium, and large because customers and density seem to be highly correlated inside this subgroup. It should be added that the distinction by number of towns is also critical for the observations of each cluster: clusters with utilities that serve only a single town are much more common than are utilities that serve several towns, by a multiple.

Also, the utilities that serve several towns are separated rather distinctly through the clustering; however, the relation between customers and density is much less clear-cut. Cluster 6 represents utilities that serve a large number of towns and typically many customers, albeit at a lower density. Cluster 5 also serves many customers, but with a higher density and fewer towns. Finally, cluster 4 serves an intermediate number of towns, a medium number of customers, and low density. Compared with clusters that serve only a single town (clusters 1 to 3), utilities in the clusters that serve more towns tend to exhibit lower densities for similar numbers of customers. This finding suggests that serving more towns will often go hand in hand with reduced supply densities.

Overall, the results of the clustering suggest that the number of towns, particularly whether a utility serves a single town or several towns, is a key variable to distinguish utilities. By definition, aggregations will tend to move utilities from clusters 1, 2, and 3 to cluster 4 or 5, or even to cluster 6.

FIGURE 2.4. Scatter Plot for Customers and Number of Towns
[Scatter plot of customers (vertical axis) against towns (horizontal axis), with clusters 1 to 6 distinguished by color.]
Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables.

2.3 Relationship to Performance and Input Structure

Even if the clustering is able to identify homogeneous but distinct clusters of utilities, the question arises whether the observed differences in utility structure are also meaningful for input decisions and output/outcomes. Put differently, do utilities of varying structure exhibit systematic differences with respect to (a) performance indicators and (b) their input structures? Although this is not a causal analysis, sustained differentials in production decisions (input mix), outcomes, or both could indicate how changing structure through aggregations will ultimately affect a utility.

To start with, figure 2.7 exhibits box plots for cost per m³ and the composite water utility performance indicator (WUPI).⁶ What is striking is the quite large dispersion both for costs and for WUPI. Cost and utility performance are affected by a multitude of other factors apart from structural characteristics. Thus not only is this a correlation exercise and not a causal relationship, but also the differences between clusters are not extremely clear-cut in the sense that some clusters always exhibit better performance indicators than other clusters do.

FIGURE 2.5. Scatter Plot for Customers and Density
[Scatter plot of customers (vertical axis) against density (horizontal axis), with clusters 1 to 6 distinguished by color.]
Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables.

FIGURE 2.6.
Scatter Plot for Density and Number of Towns 3 2 1 Density 0 −1 −2 0 2 4 6 8 Towns Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5 Cluster 6 Note: The variables have been transformed by taking the natural log of the original value and then standardizing the variables. 10 Statistical Analysis Moreover, there is a continuum of utilities—ranging cluster 1, whose WUPI scores (25th, median, and 75th from well performing to potentially troubled, unsus- percentile) are below and whose unit costs (25th, tainable providers—in every cluster. Nevertheless, a median, and 75th percentile) are above any other clus- number of regularities seem to arise: in the case of the ter. The results are more mixed for utilities that serve clusters for utilities that serve a single town, supply- several towns. For instance, although the cost distribu- ing more customers with higher density seems to be tion of cluster 5 tends to be systematically lower than positively correlated with performance. Both WUPI for utilities in cluster 4, performance in terms of WUPI and cost per m improve when a utility moves from 3 scores is comparable or even slightly lower. Moreover, cluster 1 to cluster 2 and further to cluster 3. The pic- although the average number of customers in cluster 6 ture is particularly clear for the “small” utilities in is much larger than in cluster 4, the unit cost distribu- tions look very similar. Conversely, the WUPI score dis- tribution of cluster 6 appears to be the highest of all Clusters According to Structural TABLE 2.3. clusters. Without trying to interpret these correlations Dimensions too much, one can glean an important insight that Towns Customers Density Observations more customers and higher density clearly seem to be Cluster 1 Low Low 2,148 positively related to performance in utilities that serve Cluster 2 Low Medium 2,995 a single town. 
The relationships, however, are more Cluster 3 Low High 1,834 complicated when looking at clusters of utilities that Cluster 4 Medium Medium Low 380 serve several towns. Cluster 5 Medium High High 400 In order to understand the performance differences Cluster 6 High High Medium 251 between clusters of different structures, considering the role that differences in the cost structure may play is impor- FIGURE 2.7. Box Plots for Unit Costs and WUPIall over Clusters tant. To help with a ­ nalyzing this figure  2.8 displays relationship, ­ 1.0 the cost shares for (a) labor, (b)  energy, and (c) other costs 0.8 for the six clusters. Despite con- siderable variation in each clus- 0.6 ter—for example, each cluster contains utilities with very high but also very low labor cost 0.4 shares—a few striking patterns emerge. Again, it is helpful to 0.2 first concentrate on clusters 1 to 3, which serve only a single town. 0 The left panel shows that for util- 1 2 3 4 5 6 ities that serve a single town, Cost per m3 WUPIall cost shares (median as well as 25th and 75th percentiles) spent Note: Unit costs are in converted local currency unit (LCU)-dollars per m3; water utility performance on labor decrease from cluster 1 indicator (WUPI) scores range between 0 and 1, with 1 indicating the highest score. Statistical Analysis 11 FIGURE 2.8. Box Plots for Cost Shares 1.0 1.0 1.0 0.8 0.8 0.8 0.6 0.6 0.6 Cost share energy Cost share others Cost share labor 0.4 0.4 0.4 0.2 0.2 0.2 0 0 0 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 Cluster Cluster Cluster to cluster 2 and further to cluster 3. Given the underly- and taking the size of the staff into consideration, ing characteristics of these clusters, this observation figure 2.10 shows that the cost reductions related to ­ can be interpreted as a negative correlation between labor are due to having fewer workers (right panel). size and density and the share of labor cost. 
Larger, Conversely, as the left panel in figure 2.10 shows, cost denser utilities spend a lower proportion on labor. per worker as an indicator of wage level seems higher A similar pattern applies for energy costs, which also in larger utilities. decrease from cluster 1 to 2 and 3. Because the three Switching to the component for “other cost,” as shown cost shares add up to 1, it is little surprising that the in the rightmost panel of figure 2.9, this cost compo- converse holds for other costs (such as consulting costs nent does not appear to decrease with m3 as labor and or various procured goods). energy costs do. If anything, the clusters with larger The idea that larger utilities spend less on labor and and denser utilities exhibit higher cost per m3. Taken energy not only as a share but also in absolute values is together, the results for utilities serving a single town confirmed by figure 2.9, in which as we move from suggest that increasing customers and density cluster 1 to cluster 2 and 3 we observe falling labor and are  related to lower labor cost and lower energy energy cost per m3. Zooming in further on labor cost cost—both in cost shares and in absolute terms. 12 Statistical Analysis FIGURE 2.9. Box Plots for Costs per m3 0.8 0.8 0.8 0.6 0.6 0.6 Energy cost per million m3 Other cost per million m3 Labor cost per million m3 0.4 0.4 0.4 0.2 0.2 0.2 0 0 0 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 Cluster Cluster Cluster For  other  cost  components, FIGURE 2.10. Box Plots for Labor Cost Components rather the opposite relation seems to apply. Economies of 40,000 150 scale, if any, therefore seem to originate from labor and energy, whereas other costs might even 30,000 increase with customers and Workers per million m3 100 density. Switching to utilities that Cost per worker serve several towns, the results 20,000 on the input structure are gener- ally less clear. 
Despite differences 50 in customers, density, and the 10,000 number of towns, clusters 4 to 6 are not clearly distinguishable in terms of cost shares. For energy 0 0 cost, all utilities that serve sev- 1 2 3 4 5 6 1 2 3 4 5 6 eral towns exhibit compara- Cluster Cluster tively  low energy cost  shares. Statistical Analysis 13 The difference is considerable compared with clusters 2. IBNET is a data repository initiated and maintained by the World Bank with the objective of improving the service delivery of water of utilities that serve only a single town. Conversely, all supply and sewerage utilities through the provision of international three clusters have relatively high shares of “other” comparative benchmark performance information. For more infor- costs (right panel in figure 2.8) and also in absolute mation on IBNET, see the appendix and Van den Berg and Danilenko (2011). terms, clusters 4 to 6 exhibit higher “other” costs per 3. Because the dispersion of the indicators is often considerable, due to unit than clusters 1 to 3 (right panel in figure 2.9). few very large values, the indicators are in natural logs and are Because these other costs often account for more than standardized. 50 percent of total cost, the question arises whether 4. Although it would also be possible to use the principal compo- those costs represent higher transaction costs in the nents  directly, the raw indicators sort sufficiently clear into the case of utilities that serve several municipalities. components—indicating that they measure different structural ­ aspects. Moreover, the interpretation of the indicators is much more However, without more detailed knowledge about the straightforward than of the principal components. cost types in this residual category, it is difficult to 5. See appendix A for a description of the methodology for the cluster- speculate about the source of these cost differences. ing procedures. Also for labor cost, the picture is very mixed. Both for 6. 
WUPIall indicates the aggregate/composite indicator from subcom- ponents WUPIcoverage, WUPIquality, and WUPImgmt, which are labor share and the absolute labor cost per unit, the used later on. More information about the index and its construction dispersion in clusters 4 to 6 is very large—suggesting a is given in World Bank/IAWD (2015). large amount of heterogeneity beyond utility struc- ture. The median cost per worker is relatively low in References these clusters, and staffing per m3 tends to be compara- Van den Berg, Caroline, and Alexander Danilenko. 2011. The IBNET Water ble to that of utilities that serve a single town. A tenta- Supply and Sanitation Performance Blue Book: The International tive appraisal is that utilities that serve several towns, Benchmarking Network of Water and Sanitation Utilities Databook. Washington, DC: World Bank. therefore, seem to have larger staffs albeit at lower wages. World Bank. 2005. Models of Aggregation for Water and Sanitation Provision. Water Supply and Sanitation Working Notes, note 1. Washington DC: World Bank. http://documents.worldbank.org/curated​ Notes /­en/2005/ 01/5731013/models-aggregation-water-sanitation-provision. 1. In this study, we follow World Bank (2005) and define aggregations as World Bank/IAWD (International Association of Water Supply Companies a situation in which previously separate utilities are integrated into a in the Danube River Catchment Area). 2015. Water and Wastewater single utility. This definition is general enough to comprise both Services in the Danube Region: A State of the Sector. Regional Report. purely managerial aggregations and cases of asset bundling or even Washington, DC: International Bank for Reconstruction and Development​ physical connection of networks and infrastructure /The World Bank. http://sos.danubis.org/files/File/SoS_Report.pdf. 
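Note 3 and the notes to figures 2.4 to 2.6 describe the transformation applied to the structural indicators: the natural log is taken first, and the result is then standardized. The following is a minimal sketch of that two-step transformation in Python, not the report's own code. The function name `log_standardize`, the example values, and the use of the sample standard deviation are assumptions made for illustration; the report does not state which standard deviation was used.

```python
import math
import statistics


def log_standardize(values):
    """Take the natural log of each value, then convert the logs to z-scores.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    Raises ValueError if any value is zero or negative (log undefined).
    """
    logs = [math.log(v) for v in values]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)
    return [(x - mean) / sd for x in logs]


# Hypothetical customer counts (in 1,000s), purely illustrative, not IBNET data.
customers = [5, 20, 80, 320, 1280]
z = log_standardize(customers)
for raw, score in zip(customers, z):
    print(f"{raw:>6} -> {score:+.3f}")
```

Taking logs first compresses the few very large values that note 3 mentions, so that after standardization a utility's score reflects its relative rather than absolute distance from the typical utility.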
Chapter 3  The Empirics of Aggregation

Although the relationship between utility structure and performance is interesting and helpful to understanding how aggregations could possibly affect a utility, looking at the relationship alone ignores that aggregations are an inherently dynamic process. As a result, the focus here is to describe the process of aggregation in terms of utility structure. The starting point is to look at the structure of aggregating utilities before the aggregation and compare them with the utilities not aggregating. The second subsection deals with the question of how aggregations, as observed in IBNET, change the structure of a utility. Thus, the movement between utility types and clusters is analyzed.

3.1 Are Aggregating Utilities Different?

The question of whether aggregating utilities are different from the "average" utility is important for any evaluation measuring the success of aggregations. Although a pure before-and-after comparison is interesting, a real test of whether a reform was successful is to compare the change in performance with that of other similar utilities. Hence the question of choosing an appropriate counterfactual is a key step in any evaluation process. To understand if more involved statistical tools are necessary to choose meaningful comparison utilities, table 3.1 shows the structural indicators for aggregating and not aggregating utilities at various percentiles and the mean.

TABLE 3.1. Comparison of Aggregating and Not Aggregating Utilities in IBNET Sample

Indicator     p10      p25      p50      mean     p75      p90
Aggregating utilities
Customers     34.1     64.7     167.7    593.5    349.0    1,021.0
Density       144.5    226.5    299.0    380.1    478.0    628.2
Towns         2.0      2.0      4.5      17.2     10.0     24.0
Not aggregating utilities
Customers     8.0      18.9     57.0     281.4    198.4    542.5
Density       130.6    180.9    254.5    306.9    372.0    553.0
Towns         2.0      2.0      2.0      3.5      2.0      2.0

Note: Customers are in 1,000s; p = percentile.

The units are identified as aggregating utilities when the number of towns served increases over time. These utilities could also be interpreted as the acquiring firms. As table 3.1 illustrates, even before aggregating, the utilities that would later take over more towns were different from utilities that do not aggregate in all three measured structural dimensions. Looking at the average, the number of customers served by aggregating utilities is roughly two times larger, their density is more than 10 percent higher, and the number of towns they serve even before the aggregation is already higher in most cases. This comparison suggests that the aggregating utilities that are observed in IBNET (those that increase the number of towns served) are larger, denser, and serve more towns from the start.

This finding has two main implications. First, the causal analysis in the following section will have to incorporate the fact that aggregating utilities are considerably different from nonaggregating utilities in terms of structural features. Choosing appropriately similar comparison utilities will be important. Second, the following results have to be interpreted carefully in the sense that they do not measure how aggregations affected the average utility. Rather, the results measure the effect of aggregations on utilities that were already larger in many dimensions before the aggregation. Although it is quite common in aggregation reforms to have large utilities take over many small providers, we can only speculate about the performance effects for very small utilities.

3.2 How Aggregations Change Utility Structure

Aggregations involve the expectation that utilities will grow in size. By definition, the acquiring utility will grow in the number of served towns. By how many towns depends on the aggregation design, as do the changes in customers and density. The clustering results suggested that utilities with more towns tend to have more customers, but those conditions are often coupled with a lower density. Using the data in IBNET, it is possible to go a step further and look directly at how aggregations change all three structural dimensions.

The way that aggregations affect utilities in the sample of IBNET data is displayed in figures 3.1 and 3.2. The scatterplots show nonaggregating utilities once (averaged over all observation periods) and aggregating utilities twice, once before and once after the aggregation (again averaged over the preaggregation and postaggregation periods). In both figures, the arrows indicate how the aggregation changed the structural characteristics. In figure 3.1, the arrows represent the change in customers and the number of towns served. In figure 3.2, the arrows represent the change in density and the number of towns served. As before, the variables were first logged and then standardized. What the graphs have in common is the fact that the change in number of towns is larger than the change in customers or density. In the case of customers, the arrows appear almost horizontal, in most instances showing little increase in customers.¹

FIGURE 3.1. Aggregations and Change in Number of Customers
[Scatter plot of customers (vertical axis) against towns (horizontal axis), by cluster (clusters 1 to 6), with arrows from preaggregation to postaggregation positions.]

Even more striking is the decrease in density through aggregations, shown in figure 3.2. Except in a small minority of cases, aggregations seem to lower density, sometimes very strongly. A likely explanation for this finding is that many of the aggregations involved a large number of small utilities, hence decreasing density. Also, the movement between clusters is telling in this respect. Because the movement is mostly right and down, the utilities are moving from clusters of higher density to clusters of lower density (such as from cluster 3 to cluster 4).

FIGURE 3.2. Aggregations and Change in Density
[Scatter plot of density (vertical axis) against towns (horizontal axis), by cluster (clusters 1 to 6), with arrows from preaggregation to postaggregation positions.]

The picture that develops, of a small gain in the number of customers and a loss of density, might explain why not all consolidations decrease cost through economies of scale. Importantly, in most empirical studies, economies of scale are defined with respect to a proportional increase in outputs, that is, in customers, volume, and the number of towns (see, for example, Garcia and Thomas 2001). The assumption of proportionality seems clearly violated in the sample of IBNET utilities. In this case, it seems that the design of the aggregations might have been unfavorable to achieving cost savings from the start.

Another issue that calls into question the potential cost savings in the observed aggregations is the evolution of the cost structure. The clustering suggested that cost savings due to having more customers and higher density are often related to lower labor shares, especially a reduced workforce. The data do not suggest any reduction in the labor share for aggregating utilities. Figure 3.3 shows the evolution of the labor share from five years before the aggregation to five years after the aggregation in a local linear smooth plot: the orange bars mark the upper and lower 95 percent confidence intervals; the blue dots show the mean per year, which are smoothed in the purple line by a local linear smoother (lowess). The graphs show that on average, labor cost shares do not seem to decrease; rather the contrary. This finding is also consistent with some of the case studies in Michaud and others (2017), in which an upward wage harmonization occurred or the aggregated utility was forced to take over the staff of the previous utilities.

FIGURE 3.3. Labor Share before and after Aggregations
[Share of labor cost (vertical axis) against years since aggregation, from −5 to 5 (horizontal axis).]

Labor costs appear to play a key role in this setting, not only because they are frequently the largest single cost component, but also because they are the only cost component that appears to exhibit downward rigidity. Similar to the macroeconomic phenomenon that wages rarely decrease in nominal terms, utility labor costs do not seem to decrease even after aggregations (see figure 3.4). Although all cost components increase before aggregations (caused by inflation and possibly also some short-run transaction cost of the aggregation reform), energy and other costs come to a halt and even decrease after the aggregations while labor costs continue to increase. Although this is no causal analysis, it points to the critical role of labor costs in achieving cost savings through an aggregation reform. A more coherent analytical framework to analyze the consequences of aggregations is considered in the next section.

FIGURE 3.4. Cost Components before and after Aggregations
[Three panels: labor cost per million m³, energy cost per million m³, and other cost per million m³, each plotted against years since aggregation, from −5 to 5.]

Note

1. Some utilities even observe decreases, which are likely the result of depopulation. Similarly, a significant number of utilities reduce produced water over time, a change that manifests as a secular trend in the data.

References

Garcia, Serge, and Alban Thomas. 2001. "The Structure of Municipal Water Supply Costs: Application to a Panel of French Local Communities." Journal of Productivity Analysis 16 (1): 5–29.

Michaud, David, Maria Salvetti, Michael Klien, Berenice Flores, Gustavo Ferro, and Stjepan Gabric. 2017. Joining Forces for Better Services? When, Why, and How Water and Sanitation Utilities Can Benefit from Working Together.
Washington, DC: World Bank.

Chapter 4  The Performance Consequences of Aggregation

4.1 Empirical Strategy

Although IBNET covers several thousand utilities all over the globe, the number of aggregations in the database is substantially lower: after cleaning the data and restricting the analysis to utilities suitable for an evaluation, 79 aggregation cases remained. Most of those cases occurred in Europe or Central Asia (table 4.1). Although IBNET is not representative in terms of country coverage, the database suggests that most of the aggregation reforms occurred in Central and Eastern Europe and, to some extent, in South America and Central Asia. Virtually all aggregations are in the time period 2000 to 2010, with a scant few before and after these dates. On a country level, the following countries exhibit the most aggregation cases: Romania (15); Poland (12); Kazakhstan (7); Hungary (6); Serbia (5); the Czech Republic (4); and the former Yugoslav Republic of Macedonia (4). It should be noted, however, that although the bulk of cases are located in this region, overall 25 countries exhibit cases of aggregations that feed into the analysis.

TABLE 4.1. Distribution of Aggregations by Region

Region                          Number of cases   Number of countries
East Asia and Pacific           3                 3
Europe and Central Asia         69                17
Latin America and Caribbean     5                 3
Middle East and North Africa    1                 1
Sub-Saharan Africa              1                 1

Related to the country distribution, most aggregations occur in upper-middle-income and high-income countries (table 4.2). Some cases are located in lower-middle-income countries, but none are from low-income countries.

TABLE 4.2. Distribution of Aggregations by Income Level of Countries

Income level           Number of cases
High income            27
Upper middle income    43
Lower middle income    9

Note: Income levels based on World Bank's World Development Indicators definitions.

The quantitative analysis is limited to the data and information available in IBNET and therefore focuses on the outcome of aggregation processes in terms of economic efficiency and performance improvements. Because of data limitations, the impact on externalities such as equity or environmental factors is excluded. Likewise, the dataset does not allow an in-depth investigation of the influence of utility governance or aggregation process design on overall outcomes. Those issues are investigated in greater detail in qualitative case studies in Michaud and others (2017). In this statistical analysis, the focus is on a general appraisal of whether aggregations generated the expected cost savings or performance improvements. Regarding utility performance, this report uses a set of quantitative indicators to capture the various purposes of aggregations. Most important, these indicators are coverage, quality of service, and management efficiency. In addition, these subindicators are also used as an aggregate in the form of a composite performance indicator (WUPI). It should again be noted, however, that although these indicators capture some important aggregation purposes, goals that go beyond these dimensions are outside the analysis of this report.

The previous sections relied heavily on cross-sectional comparisons of utility structure and its connection to performance. Do systems with many customers, for example, exhibit lower unit cost than systems with few customers? When considering an aggregation reform, the relevant policy question, however, is whether the utility did improve compared with a situation in which it did not aggregate. In particular, the previous section has shown that aggregations often add few customers and tend to decrease density; hence a comparison of utilities with many and few customers could be very misleading.

In this section, utility performance is monitored before and after consolidations for aggregating utilities and is compared with nonaggregating utilities. To this end, we run regressions including utility fixed effects to compare the performance change of consolidating firms with that of nonconsolidating firms.¹ As the previous sections have shown, aggregating utilities are different from the average utility in IBNET, suggesting that the choice of the control group (that is, the group of utilities without aggregation that is used as a comparison) might be important for the obtained results. With the overall goal of constructing a counterfactual scenario, namely what the average cost of a utility would be in the absence of a consolidation, not all utilities are suitable for comparison.

For this reason, different matching techniques are used to select suitable comparison utilities. In each case, a large set of pretreatment characteristics is used to estimate the probability that a utility experiences a consolidation (see Rosenbaum and Rubin 1985), and this estimated probability is used to identify the final sample. Depending on the matching algorithm, one or several utilities with similar treatment probability are then chosen as the control group. While the combined analysis of water and wastewater is continued (volume is the sum of water produced and wastewater collected), for the choice of comparison units the separate indicators are used.² Hence the variables x_{k,it} to estimate the probability of an aggregation include important utility characteristics such as the population in the service area and the number of towns served, separate for water and wastewater. In addition, the pretreatment performance of a utility in terms of managerial and operating efficiency (WUPI) is also added. Finally, country as well as year dummies enter the specification to capture heterogeneity across countries and time. The former is particularly relevant because some countries do not experience any aggregation while others exhibit a considerable number. Apart from the statistical need to balance utility characteristics between treatment and control groups, this approach also ensures that the consolidation effects are evaluated in comparison with utilities of similar initial size, and that utilities exhibit a similar share of water and wastewater services. Because the existing empirical literature has stressed decreasing economies of scale and even diseconomies of scale, matching utilities according to their production structure in size and scope seems imperative. The production characteristics were first added linearly, before adding squared terms where necessary to achieve balancing.

Because the choice of the matching algorithm is somewhat arbitrary, we use three different matching approaches and also the full sample of utilities, which results in four different control groups. We use (a) nearest-neighbor propensity score matching, (b) four-nearest-neighbor propensity score matching, (c) radius matching, and (d) all utilities in the sample. The different algorithms (a) to (c) represent different choices in the trade-off between bias and variance (see Caliendo and Kopeinig 2008). All three algorithms are limited to the utilities on common support. The full sample, (d), is displayed for comparison reasons but should be interpreted with care because the compared utilities differ substantially.

These different subsamples of comparable treatment and control utilities are then used in the generalized difference-in-differences specification:³

$$\mathit{Perf}_{it} = \beta_0 + \beta_1 \cdot \mathit{aggregation}_{it} + \gamma_i + \eta_t + u_{it} \qquad (1)$$

where Perf_{it} refers to a performance indicator for utility i in year t. In addition to variable cost per m³ (in natural logs of dollar-converted local currency), the composite performance indicator WUPI as well as its subcomponents are used. Regarding the subcomponents, WUPIcoverage, WUPIquality, and WUPImgmt are distinguished:⁴

• WUPIcoverage is basically an indicator for the share of population connected to water and wastewater services and the extent of wastewater treatment. Higher values indicate a higher share of population connected and a higher extent of wastewater treatment.

• WUPIquality represents the performance of a utility with respect to the number of hours of service as well as the frequency of sewerage blockages. Higher values indicate more hours of service and fewer blockages.

• WUPImgmt is less an indicator on the customer side than it is related to managerial efficiency. It is based on a number of subindicators such as the extent of staffing, cost recovery, the share of metered connections, revenue collection, and nonrevenue water. Higher values indicate higher cost recovery and revenue collection, more metered connections, lower staffing, and lower nonrevenue water.

As does the aggregate indicator WUPIall, the subindicators range from 0 to 100, with higher values indicating better performance. Looking at various performance indicators is necessary because aggregations can follow various purposes, and achieving scale economies may not be a goal at all. The regressions include utility and time fixed effects; thus the effect of aggregation_{it} is identified by comparing unit costs, or the respective performance indicator, over time and between treated and control utilities.

It should be noted that the use of variable cost gives the estimates a short-term interpretation. The capital stock, in terms of the network infrastructure, is essentially fixed in the short run; modifying it is infeasible or prohibitively costly (see Garcia and Thomas 2001). Water pipes typically last a long time, up to 50 years, depending on the situation and the chosen material. Such durability would indicate that the system configuration is fixed for a very long time horizon. Although a comprehensive analysis of short-run and long-run costs would still be desirable, that is not feasible with the data at hand.

Given the discussions in the previous sections, the effect of aggregations might depend both on the initial structure of the utility and on how the aggregation changes a utility's structure. To allow for the latter possibility, that the effect of the aggregations is not independent of the size of the change, the model in equation (1) is rerun with the indicator variable aggregation_{it} replaced by dummy variables distinguishing small aggregations (aggregation_size, less than 20 percent increase in the number of towns), medium aggregations (aggregation_size, between 20 percent and 100 percent change in the number of towns), and large aggregations (aggregation_size, more than 100 percent change in the number of towns).

$$\mathit{Perf}_{it} = \beta_0 + \sum_{k=1}^{4} \beta_k \cdot k.\mathit{aggregation\_size}_{it} + \gamma_i + \eta_t + u_{it} \qquad (2)$$

Similar specifications are run for small, medium, and large changes in density and volume.⁵ Moreover, to make the aggregation effect conditional on the initial structure of the utility, the simple treatment dummy is replaced by dummy variables that distinguish utilities with few towns (initial_level, 2 towns), utilities with an intermediate number of towns (initial_level, between 4 and 14 towns), and utilities with many towns (initial_level, more than 14 towns).

$$\mathit{Perf}_{it} = \beta_0 + \sum_{k=1}^{4} \beta_k \cdot k.\mathit{initial\_level}_{it} + \gamma_i + \eta_t + u_{it} \qquad (3)$$

Again, the same estimations are repeated with dummies indicating utilities of small, medium, and large density and volume. In all specifications, we cluster standard errors at the utility level and make them robust to heteroscedasticity.

4.2 Matching Results

Before we present the regression results for the effect of aggregations on various performance indicators, this section addresses the results from the matching algorithms that are used to identify useful control utilities. The probit regression to obtain the propensity score is exhibited in table 4.3.

TABLE 4.3. Propensity Score Estimation

               (1) aggregation
WUPIall        0.103            (0.0759)
popsa_w        0.0000123***     (0.00000324)
popsa_ww       −0.0000115***    (0.00000300)
vol_w          −3.45e-09        (9.18e-09)
vol_ww         −3.18e-11        (4.48e-10)
cus_w          0.000000995      (0.00000183)
cus_ww         −0.00000208**    (0.00000105)
towns_w        0.0108           (0.00731)
towns_ww       0.0184           (0.0165)
dens_w         0.0000309        (0.000257)
dens_ww        −0.000432        (0.000317)
WUPIall^2      −0.000712        (0.000542)
popsa_w^2      6.15e-14         (3.98e-13)
vol_w^2        −2.18e-18        (1.57e-17)
_cons          −5.407*          (2.825)
N              3897

Note: WUPIall is a performance indicator. popsa_w = population of service area for water; popsa_ww = population of service area for wastewater; vol_ww = volume of wastewater collected; cus_w = customers connected to water supply; cus_ww = customers connected to wastewater services; towns_w = number of towns served with water; towns_ww = number of towns served with wastewater; dens_w = density of water system; dens_ww = density of wastewater system; ^2 = squared variable; _cons is the constant. Standard errors in parentheses.
* p < 0.10, ** p < 0.05, *** p < 0.01.

It should be noted that for aggregating utilities, the period t−1, with t indicating the aggregation year, is used in the regression. The pseudo-R-squared of the regression is 0.44, indicating that the chosen variables can help determine the probability that a utility consolidates. Apart from the country and year fixed effects, the indicators for population in the service area, for both water and wastewater, seem to enter the regression significantly. Because the goal of the matching is not exactly to explain the determinants of aggregation but rather to evaluate the similarity of utilities using the estimated propensity score, the success of the matching is judged by comparing the treatment and control groups. Moreover, explaining the determinants of aggregation this way would be very difficult given the high collinearity among many of the included regressors.

A more substantive measure in this respect is to evaluate whether the matching procedures decreased the observed differences between treatment and control groups. This is displayed in table 4.4. The first column of the table shows the initial bias between the treated and the full control sample. The standardized bias is calculated as the difference in means between the two groups, divided by the standard deviation of the variable in the treated group:

$$\frac{\bar{X}_{\mathit{treated}} - \bar{X}_{\mathit{control}}}{s_{\mathit{treated}}}$$

As can be seen from the first column in the table, these differences are large for a number of variables in the initial sample.⁶ The treatment group is systematically different from the nontreated group. Columns 2 to 4 show the remaining bias after the matching procedures. As a rule of thumb, the absolute values of the remaining bias should be statistically insignificant and below 25 (see Rubin 2001). Except in the case of radius matching, in which the number of towns is slightly unbalanced, the applied matching techniques seem capable of choosing appropriate control units. None of the remaining biases are statistically significant at a 10 percent level.

TABLE 4.4. Bias before and after Matching

               Initial       NN-PSM        4NN-PSM       Radius
WUPIall        36.383186     15.185448     3.892014      7.6758876
popsa_w        21.7962       0.22726597    1.6039469     0.48275611
popsa_ww       19.818401     0.35545123    1.6005663     0.25640118
vol_w          19.591293     2.8924422     1.8626966     0.77712399
vol_ww         11.313143     1.9680327     4.1508121     1.5142661
cus_w          22.180407     1.0372882     1.2646612     0.73440069
cus_ww         21.990953     0.6844226     0.18139637    0.27736446
towns_w        67.926651     1.222091      1.9422516     35.645447
towns_ww       52.364017     4.9559183     8.1277056     35.711647
dens_w         14.555485     13.110049     8.5666676     6.4670525
dens_ww        14.837564     6.7616673     1.1273751     3.0515025
WUPIall^2      38.948284     14.194642     3.6725125     8.1848831
popsa_w^2      17.278624     0.65376735    0.55176806    0.75902593
vol_w^2        17.133339     2.1188266     0.53734493    0.40599945

Note: WUPIall is a performance indicator. popsa_w = population of service area for water; popsa_ww = population of service area for wastewater; vol_ww = volume of wastewater collected; cus_w = customers connected to water supply; cus_ww = customers connected to wastewater services; towns_w = number of towns served with water; towns_ww = number of towns served with wastewater; dens_w = density of water system; dens_ww = density of wastewater system; ^2 = squared variable; NN-PSM = nearest neighbor propensity score matching; 4NN-PSM = four-nearest neighbors propensity score matching.

Table 4.4 suggests that at least on observables, treated and nontreated utilities do not differ systematically after the matching approaches.

4.3 Difference-in-Differences Results

The results from estimating the model in equation

structure of a utility. For the magnitude of the reform, which is measured as change relative to the initial value, there is little evidence that aggregation matters for the analyzed performance indicators. The only case with a statistically significant effect is the effect of aggregations that add only a small number of towns (less than 10 percent).
Only for this type of aggregations is there a (1) for the different performance indicators are shown clearly negative effect on unit cost.7 in table 4.5. Except in the estimations in which the full sample is used (column 4), aggregations do not seem The second conditionality—that the impact depends to matter for firm performance, for unit cost, nor for on the initial configuration—is at least partially sup- WUPI and its subcomponents. In other words, when ported by the data. From a unit-cost perspective, utili- similar utilities are used as comparison, no evidence ties that are initially large in the number of towns and suggests that aggregation affects a utility’s perfor- in volume appear to be able to profit from economies mance, positively or negatively. of scale (see the upper panels in tables 4.6 and 4.7). Apart from the possibility that aggregations have a very Conversely, utilities with initially low densities and limited impact on performance, the previous discussions volumes seem to experience improving WUPIall have suggested a heterogeneous effect, depending on scores  after aggregation (see table 4.8). Negative the magnitude of the reform as well as on the initial effects in terms of WUPIcoverage appear for utilities of Statistical Analysis 25 TABLE 4.5. 
Difference-in-Differences (1) (2) (3) (4) AVC2 AVC2 AVC2 AVC2 1.after −0.00666 −0.0103 −0.0153 −0.0512** (0.0221) (0.0217) (0.0220) (0.0202) N 865 1,159 5,721 7,621 (1) (2) (3) (4) WUPIall WUPIall WUPIall WUPIall 1.after −0.0506 0.248 0.280 0.426 (0.975) (0.850) (0.813) (0.903) N 936 1,244 5,487 7,014 (1) (2) (3) (4) WUPIcoverage WUPIcoverage WUPIcoverage WUPIcoverage 1.after −1.159 −1.473 −1.245 −3.109* (1.988) (1.864) (1.877) (1.816) N 936 1,244 5,487 7,014 (1) (2) (3) (4) WUPIquality WUPIquality WUPIquality WUPIquality 1.after −0.795 0.556 0.245 0.689 (1.519) (1.154) (1.233) (1.490) N 915 1,223 2,718 4,209 (1) (2) (3) (4) WUPImgmt WUPImgmt WUPImgmt WUPImgmt 1.after 0.605 1.089 1.031 2.315** (1.081) (1.016) (0.961) (1.066) N 936 1,244 5,487 7,014 Note: Standard errors are in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01. TABLE 4.6. Difference-in-Differences: Conditional on Initial Number of Systems (1) (2) (3) (4) AVC2 AVC2 AVC2 AVC2 1.treatedsize 0.00794 0.00287 −0.00344 −0.0569* (0.0337) (0.0340) (0.0348) (0.0321) 2.treatedsize 0.00827 0.00474 −0.000235 −0.0361 (0.0344) (0.0339) (0.0336) (0.0360) 3.treatedsize −0.0458** −0.0474** −0.0511** −0.0712** (0.0215) (0.0213) (0.0216) (0.0281) N 865 1,159 5,721 7,621 table continues next page 26 Statistical Analysis TABLE 4.6. 
Continued (1) (2) (3) (4) WUPIall WUPIall WUPIall WUPIall 1.treatedsize 2.416 2.809 2.854 2.924 (2.275) (2.195) (2.171) (2.245) 2.treatedsize −0.425 −0.109 −0.0696 −0.0603 (0.999) (0.890) (0.868) (0.849) 3.treatedsize −1.433 −1.182 −1.144 −1.001 (1.528) (1.434) (1.413) (1.672) N 936 1,244 5,487 7,014 (1) (2) (3) (4) WUPIcoverage WUPIcoverage WUPIcoverage WUPIcoverage 1.treatedsize 0.203 −0.128 0.188 −2.307 (6.023) (6.006) (5.999) (5.906) 2.treatedsize −1.333 −1.657 −1.441 −3.219* (1.941) (1.797) (1.797) (1.858) 3.treatedsize −1.973 −2.228 −2.034 −3.639** (1.597) (1.434) (1.456) (1.501) N 936 1,244 5,487 7,014 (1) (2) (3) (4) WUPIquality WUPIquality WUPIquality WUPIquality 1.treatedsize −0.978 0.578 0.176 0.695 (3.872) (3.597) (3.825) (3.981) 2.treatedsize −1.744 −0.398 −0.679 −0.387 (1.379) (0.975) (1.065) (1.214) 3.treatedsize 0.851 2.022 1.738 2.377 (1.829) (1.566) (1.679) (1.888) N 915 1,223 2,718 4,209 (1) (2) (3) (4) WUPImgmt WUPImgmt WUPImgmt WUPImgmt 1.treatedsize 2.662 3.300 3.196 4.785** (2.217) (2.158) (2.110) (2.055) 2.treatedsize 0.931 1.451 1.404 2.510** (1.254) (1.231) (1.204) (1.096) 3.treatedsize −1.525 −1.157 −1.173 −0.141 (2.268) (2.176) (2.158) (2.341) N 936 1,244 5,487 7,014 Note: Standard errors are in parentheses; 1. indicates the smallest initial structure, 2. indicates intermediate initial values, and 3. indicates the largest values of initial structure. * p < 0.10, ** p < 0.05, *** p < 0.01. Statistical Analysis 27 TABLE 4.7. 
Difference-in-Differences: Conditional on Initial Volume (1) (2) (3) (4) AVC2 AVC2 AVC2 AVC2 1.treatedsize −0.00165 −0.00603 −0.0119 −0.0366 (0.0306) (0.0304) (0.0307) (0.0321) 2.treatedsize −0.00219 −0.00618 −0.0123 −0.0482 (0.0356) (0.0353) (0.0354) (0.0360) 3.treatedsize −0.0427* −0.0481* −0.0542** −0.0832*** (0.0249) (0.0252) (0.0264) (0.0240) N 813 1107 5669 7569 (1) (2) (3) (4) WUPIall WUPIall WUPIall WUPIall 1.treatedsize 2.898* 3.163* 3.228** 3.278* (1.711) (1.663) (1.632) (1.821) 2.treatedsize −0.633 −0.313 −0.263 −0.575 (0.967) (0.833) (0.810) (0.792) 3.treatedsize −2.329 −1.881 −1.873 −1.217 (2.049) (1.938) (1.907) (2.173) N 883 1,191 5,434 6,961 (1) (2) (3) (4) WUPIcoverage WUPIcoverage WUPIcoverage WUPIcoverage 1.treatedsize 5.654 5.250 5.467 3.960 (4.312) (4.253) (4.228) (4.304) 2.treatedsize −4.942*** −5.286*** −4.985*** −7.327*** (1.848) (1.675) (1.693) (1.661) 3.treatedsize −1.160 −1.392 −1.277 −2.928 (2.568) (2.450) (2.497) (2.343) N 883 1,191 5,434 6,961 (1) (2) (3) (4) WUPIquality WUPIquality WUPIquality WUPIquality 1.treatedsize −0.969 0.364 0.197 −0.301 (2.632) (2.342) (2.531) (2.469) 2.treatedsize 0.277 1.722 1.367 1.935 (1.771) (1.434) (1.558) (1.781) 3.treatedsize −2.351 −0.624 −0.922 −0.689 (2.735) (2.339) (2.448) (2.835) N 867 1,175 2,670 4,161 table continues next page 28 Statistical Analysis TABLE 4.7. Continued (1) (2) (3) (4) WUPImgmt WUPImgmt WUPImgmt WUPImgmt 1.treatedsize 1.728 2.213 2.163 3.136* (1.697) (1.690) (1.646) (1.767) 2.treatedsize 1.727 2.258* 2.207* 3.173*** (1.349) (1.305) (1.278) (1.195) 3.treatedsize −3.376 −2.772 −2.806 −1.004 (2.676) (2.564) (2.535) (2.860) N 883 1,191 5,434 6,961 Note:Standard errors in parentheses, 1. indicates the smallest initial structure, 2. indicates intermediate initial values; and 3. Indicates the largest values; of initial structure. *p < 0.10, **p < 0.05, ***p < 0.01. TABLE 4.8. 
Difference-in-Differences: Conditional on Initial Density (1) (2) (3) (4) AVC2 AVC2 AVC2 AVC2 1.treatedsize −0.0258 −0.0278 −0.0318 −0.0520 (0.0266) (0.0263) (0.0266) (0.0316) 2.treatedsize 0.00724 0.00342 −0.00190 −0.0436 (0.0363) (0.0361) (0.0360) (0.0372) 3.treatedsize −0.00505 −0.0104 −0.0162 −0.0647*** (0.0261) (0.0259) (0.0263) (0.0228) N 864 1,158 5,714 7,614 (1) (2) (3) (4) WUPIall WUPIall WUPIall WUPIall 1.treatedsize 2.299 2.541* 2.564* 2.764* (1.488) (1.433) (1.410) (1.670) 2.treatedsize −2.512** −2.191** −2.168** −1.738 (1.204) (1.071) (1.032) (1.080) 3.treatedsize 0.743 1.036 1.077 0.869 (1.010) (0.911) (0.896) (0.919) N 934 1,242 5,479 7,006 (1) (2) (3) (4) WUPIcoverage WUPIcoverage WUPIcoverage WUPIcoverage 1.treatedsize 4.995 4.665 4.846 3.399 (3.671) (3.583) (3.560) (3.666) 2.treatedsize −4.650** −5.062*** −4.858*** −6.562*** (1.818) (1.614) (1.634) (1.589) 3.treatedsize −4.945** −5.235** −4.950** −7.090*** (2.485) (2.467) (2.477) (2.353) N 934 1,242 5,479 7,006 table continues next page Statistical Analysis 29 TABLE 4.8. Continued (1) (2) (3) (4) WUPIquality WUPIquality WUPIquality WUPIquality 1.treatedsize −1.620 −0.347 −0.604 −0.219 (2.084) (1.813) (1.945) (2.062) 2.treatedsize −2.047 −0.537 −0.818 −0.527 (1.746) (1.333) (1.463) (1.494) 3.treatedsize 3.074 4.304 3.854 4.678 (3.273) (3.072) (3.225) (3.608) N 913 1,221 2,710 4,201 (1) (2) (3) (4) WUPImgmt WUPImgmt WUPImgmt WUPImgmt 1.treatedsize 1.447 1.839 1.787 2.855* (1.577) (1.550) (1.518) (1.677) 2.treatedsize −1.475 −0.924 −1.013 0.559 (1.659) (1.602) (1.547) (1.675) 3.treatedsize 3.266*** 3.778*** 3.763*** 4.964*** (1.472) (1.421) (1.383) (1.372) N 934 1,242 5,479 7,006 Note: Standard errors are in parentheses; 1. indicates the smallest initial structure; 2. Indicates intermediate initial values; and 3. indicates the largest values of initial structure. * p < 0.10, ** p < 0.05, *** p < 0.01. medium and large density upon aggregation. Large- findings very sample specific. 
Nevertheless, the effects density utilities, however, can compensate for this of the aggregations seem to depend to some extent on decrease with improvements in WUPImgmt, yielding the initial structure of the utilities before the aggrega- no decrease in the aggregate WUPIall. tion. Positive effects are not limited to small utilities but can also accrue to large ones, albeit showing in It is important to note that the obtained results are different performance indicators. sample specific and that the conditionalities are not easy to interpret: there may well be a correlation 4.4 Postaggregation Performance between the magnitude of aggregation and the initial structure, which could confound both sets of results. In addition to comparing the performance of utilities For instance, it is unlikely in aggregation reforms that before and after the aggregation, one gains insightful utilities with low volume and density would be tasked by looking at the postmerger evolution of performance to take over multiple other possibly large-volume utili- in particular. An aggregation might entail a change in ties. For this reason, an observation of an improve- performance when utilities are aggregated simply ment in unit cost for large-volume utilities should not because the new integrated utility experiences the be interpreted too narrowly because the types of (weighted) average of the previous performance. For aggregations probably vary with the initial structure. example, adding many rural utilities with low degrees Moreover, the effects are identified on a subsample of of connection might decrease the indicator for the initial 75 aggregations, a matter that makes the WUPIcoverage despite no actual change. Because data 30 Statistical Analysis on the integrated utilities are not available, a WUPIall score. That the aggregations can have a small second-best strategy is to look at the performance evo- but positive effect on WUPImgmt in the postaggrega- lution after the aggregation. 
The idea is that discarding tion years is supported by a number of specifications initial performance and therefore any detrimental that differentiate the magnitude of the reform: first, shocks through the aggregation could enable advan- moderate and large increases in the number of towns tages through the aggregation to then materialize over are associated with higher growth in WUPImgmt. the years after the aggregation. 8 Second, large relative increases in density also seem to lead to more improvement in WUPImgmt. The The results are displayed in table 4.9 Similar to the latter effect is sufficiently large to show up in an difference-in-differences estimates, the postaggrega- improvement in WUPIall. To summarize, although tion estimations show little effect by aggregations on the impact is generally neither positive nor negative, average. The panel at the bottom of table 4.9 indi- in some cases aggregations helped to improve the cates a slightly positive effect on WUPImgmt, which growth in WUPImgmt. is, however, too weak to show up in the overall TABLE 4.9. Postaggregation Phase (1) (2) (3) (4) D.AVC2 D.AVC2 D.AVC2 D.AVC2 1.after −0.0000858 0.00115 0.00123 −0.00124 (0.00572) (0.00568) (0.00581) (0.00523) N 639 759 1,848 5,700 (1) (2) (3) (4) D.WUPIall D.WUPIall D.WUPIall D.WUPIall 1.after 0.246 0.323 0.270 0.427 (0.314) (0.308) (0.312) (0.300) N 685 808 1,749 4,973 (1) (2) (3) (4) D.WUPIcoverage D.WUPIcoverage D.WUPIcoverage D.WUPIcoverage 1.after 0.0288 0.0353 −0.0348 −0.0477 (0.613) (0.610) (0.612) (0.549) N 685 808 1,749 4,973 (1) (2) (3) (4) D.WUPIquality D.WUPIquality D.WUPIquality D.WUPIquality 1.after −0.342 −0.180 −0.149 −0.287 (0.475) (0.421) (0.465) (0.406) N 667 790 885 2,946 (1) (2) (3) (4) D.WUPImgmt D.WUPImgmt D.WUPImgmt D.WUPImgmt 1.after 0.561 0.662* 0.579 0.626* (0.378) (0.374) (0.373) (0.360) N 685 808 1,749 4,973 Note: Standard errors are in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01. 
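The postaggregation strategy behind table 4.9 (dropping preaggregation observations, transforming the dependent variable to first differences, and pooling the observations with a dummy for aggregating utilities) can be illustrated with a minimal Python sketch. It is not the estimation code used for the table: the function names, the toy panel, and the choice to treat the aggregation year itself as the first postaggregation observation are assumptions made for illustration, and the sketch leaves out any further controls and the clustered standard errors. With only a constant and the dummy, the OLS coefficient on the dummy equals the difference in mean first differences between aggregating and control utilities, which is what the sketch computes.

```python
from statistics import mean


def first_differences(rows):
    """Year-on-year changes in 'perf' within each utility (consecutive years only)."""
    ordered = sorted(rows, key=lambda r: (r["utility"], r["year"]))
    diffs = []
    for prev, cur in zip(ordered, ordered[1:]):
        same_utility = prev["utility"] == cur["utility"]
        if same_utility and cur["year"] == prev["year"] + 1:
            diffs.append((cur["utility"], cur["perf"] - prev["perf"]))
    return diffs


def post_aggregation_effect(panel, aggregation_year):
    """Coefficient on the 'after' dummy in a pooled OLS of D.perf on a constant and the dummy.

    aggregation_year maps each aggregating utility to its aggregation year
    (hypothetical input format); utilities absent from the mapping are controls.
    """
    # Assumption: observations from the aggregation year onward count as "post".
    kept = [
        r for r in panel
        if r["utility"] not in aggregation_year
        or r["year"] >= aggregation_year[r["utility"]]
    ]
    diffs = first_differences(kept)
    treated = [d for u, d in diffs if u in aggregation_year]
    control = [d for u, d in diffs if u not in aggregation_year]
    # For a binary regressor, the OLS slope is the difference in group means.
    return mean(treated) - mean(control)


# Toy panel: utility A aggregates in 2005, utility B never aggregates.
panel = [
    {"utility": "A", "year": 2003, "perf": 60},
    {"utility": "A", "year": 2004, "perf": 61},
    {"utility": "A", "year": 2005, "perf": 58},
    {"utility": "A", "year": 2006, "perf": 60},
    {"utility": "A", "year": 2007, "perf": 63},
    {"utility": "B", "year": 2003, "perf": 70},
    {"utility": "B", "year": 2004, "perf": 71},
    {"utility": "B", "year": 2005, "perf": 72},
    {"utility": "B", "year": 2006, "perf": 73},
]
effect = post_aggregation_effect(panel, {"A": 2005})
print(effect)  # 1.5 for this toy panel
```

In the toy panel the drop from 61 to 58 in the aggregation year is discarded together with the other preaggregation information, so the estimate reflects only the growth after the aggregation, which mirrors the motivation given in the text.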
Statistical Analysis 31 With respect to the initial structure, the postaggrega- capture pure mergers well, other reorganizations in tion performance seems to vary little. Among the few which a utility takes over a service area without taking results, utilities with only two towns before the aggre- over a town, at least by definition, may be missed. For gation can improve their WUPImgmt and through that this reason, a change in the population in the service improvement also raise their aggregate WUPIall score. area is considered as an alternative here. Although Utilities with large densities are also able to improve changes in population are quite frequent, in the their WUPImgmt. Overall, it might not be surprising absence of data errors, a large change in the service that many of the postmerger results are related to area population can be explained only by an enlarge- WUPImgmt, simply because, for example, decisions ment or aggregation of the service area. The year-by- related to metering and collection are somewhat more year change considered here is 20 percent—that might easily altered than reducing the labor force or improv- seem excessive, but the results are not sensitive to this ing quality and coverage. The analysis of the post- 9 choice because 10 percent and 15 percent lead to simi- merger phase also shows that some of the previous lar conclusions. aggregation effects seem to be driven by immediate As the treatment indicator changes, also the control changes in the wake of the aggregation. group has to be adapted. Accordingly, the matching procedures are repeated but with a different dependent 4.5 Alternative Aggregation Measures variable: a change in the service area population exceeding 20 percent instead of a change in the number In this section of the empirical analysis, an alternative of towns. As before, utilities were eliminated before- measure for aggregations is used. 
The main motivation hand if the population in the service area decreased in for choosing an alternative lies in the question of how the sample period to ensure a meaningful comparison. closely the indicator “number of towns” in IBNET measures whether a utility has increased its service The results are exhibited in table 4.10. In contrast to area. Although a change in number of towns might before, on average the aggregations here appear to TABLE 4.10. Alternative Merger Indicator (1) (2) (3) (4) AVC2 AVC2 AVC2 AVC2 1.after −0.0536** −0.0371** −0.0303* −0.0300 (0.0237) (0.0178) (0.0170) (0.0205) N 566 846 1,602 2,624 (1) (2) (3) (4) WUPIall WUPIall WUPIall WUPIall 1.after 0.108 −0.351 −1.132 −1.464 (1.066) (1.038) (1.005) (1.180) N 553 833 1,557 2,394 (1) (2) (3) (4) WUPIcoverage WUPIcoverage WUPIcoverage WUPIcoverage 1.after −3.234** −3.126** −4.348*** −4.086** (1.405) (1.495) (1.375) (1.780) N 553 833 1,557 2,394 table continues next page 32 Statistical Analysis TABLE 4.10. Continued (1) (2) (3) (4) WUPIquality WUPIquality WUPIquality WUPIquality 1.after 0.0214 −0.918 −1.065 −4.589 (3.651) (3.353) (3.164) (3.715) N 334 474 884 1,632 (1) (2) (3) (4) WUPImgmt WUPImgmt WUPImgmt WUPImgmt 1.after 2.023* 1.158 0.694 0.407 (1.199) (1.063) (1.091) (1.311) N 553 833 1,557 2,394 Note: Standard errors are in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01. affect utility performance. First, there is a sizeable different results for utilities with different starting posi- negative effect on unit cost. Although the number of tions in terms of service quality. The previous results aggregations to identify the conditional effects is showed that smaller utilities in terms of customers partly very small, the results suggest that aggregations tend to benefit through aggregations by improving with medium to large increases in volume exhibit the WUPI. Because small utilities are also typically those largest unit cost reductions. 
This finding is probably of that have a lower initial WUPI score, this result could little surprise, because volume enters the denominator also indicate that aggregations are particularly benefi- of unit cost. Still it confirms that unit cost reductions cial to utilities with low performance. depend on a sizeable increase in volume. Moreover, This intuition is confirmed by statistical tests that dif- the unit cost decreases also seem to be associated with ferentiate utilities depending on the initial level of larger utilities in terms of towns, volume, and density. quality (see table 4.11).10 Utilities with the lowest initial There are also some indications that WUPIcoverage WUPI compared with utilities in the same country decreases because of an increase in the population exhibit larger improvements in the postaggregation in  the service area. Most commonly the coverage phase than do nonmerging utilities. This observation decreases when density decreases most or with utili- is true for both managerial efficiency and the overall ties that already serve many towns. Finally, WUPImgmt performance indicator (WUPI). In contrast, utilities can rise through aggregations when utilities have a with higher initial WUPI do not exhibit such an high density and high volume. improvement through aggregations. There is even some evidence that utilities with high initial WUPI 4.6 Distinguishing Strong and experience lower improvement in coverage—and Weak Utilities through this also lower overall WUPI—in the postag- gregation phase. This final subsection of the empirical analysis addresses the question of whether apart from structure, the ini- The fact that no results are found with respect to cost tial performance of a utility matters. 
If aggregations are suggests that utilities with initially low quality considered a general reform strategy to improve quality (measured by WUPI) can benefit from aggregations by and to leave a low-level equilibrium, one would expect improving quality, though the aggregation does not Statistical Analysis 33 TABLE 4.11. Difference-in-Differences: Conditional on Initial Performance (WUPI) (1) (2) (3) (4) AVC2 AVC2 AVC2 AVC2 1.treatedweak −0.0115* −0.0101 −0.0103 −0.0116* (0.00664) (0.00655) (0.00667) (0.00655) 2.treatednormal 0.0102 0.0114 0.0112 0.00560 (0.00977) (0.00982) (0.00988) (0.00927) 3.treatedstrong −0.00197 −0.000868 −0.000838 −0.00250 (0.00504) (0.00487) (0.00496) (0.00570) N 608 728 1,817 5,472 (1) (2) (3) (4) WUPIall WUPIall WUPIall WUPIall 1.treatedweak 0.638* 0.726** 0.665** 0.856** (0.346) (0.334) (0.337) (0.336) 2.treatednormal 0.123 0.216 0.165 0.332 (0.483) (0.475) (0.483) (0.485) 3.treatedstrong −0.567** −0.502* −0.562** −0.409 (0.276) (0.269) (0.272) (0.271) N 659 782 1,723 4,947 (1) (2) (3) (4) WUPIcoverage WUPIcoverage WUPIcoverage WUPIcoverage 1.treatedweak 0.801 0.797 0.711 0.736 (0.589) (0.583) (0.578) (0.546) 2.treatednormal −0.207 −0.206 −0.284 −0.342 (0.896) (0.894) (0.898) (0.812) 3.treatedstrong −1.420* −1.407* −1.505* −1.440* (0.827) (0.817) (0.810) (0.788) N 659 782 1,723 4,947 (1) (2) (3) (4) WUPIquality WUPIquality WUPIquality WUPIquality 1.treatedweak −0.834 −0.649 −0.634 −0.674 (0.621) (0.566) (0.627) (0.485) 2.treatednormal −0.239 −0.0317 −0.0284 −0.0794 (0.597) (0.533) (0.590) (0.493) 3.treatedstrong −0.251 −0.136 −0.0950 0.00460 (0.614) (0.567) (0.584) (0.535) N 641 764 859 2,920 table continues next page 34 Statistical Analysis TABLE 4.11. 
Continued (1) (2) (3) (4) WUPImgmt WUPImgmt WUPImgmt WUPImgmt 1.treatedweak 0.878* 1.003** 0.913** 0.952** (0.471) (0.461) (0.459) (0.473) 2.treatednormal 0.557 0.678 0.612 0.666 (0.606) (0.599) (0.604) (0.583) 3.treatedstrong −0.180 −0.0976 −0.190 −0.238 (0.329) (0.329) (0.319) (0.313) N 659 782 1,723 4,947 Notes: Standard errors are in parentheses; 1. indicates the smallest initial structure; 2. indicates intermediate initial values; and 3. Indicates the largest values of initial structure. * p < 0.10, ** p < 0.05, *** p < 0.01. lead to cost savings. This observation is consistent 6. In most cases the biases are also statistically significant (not shown in the table). with the view of aggregations as a reform option to enable utilities to leave a low-level equilibrium. The 7. The results are not shown but are available upon request. finding that utilities with low initial WUPI improve 8. This means to drop all preaggregation observations. Because with utility fixed effects the aggregation effect is no longer estimable, the performance is related to the previous finding that dependent variable is transformed to first differences. The observa- small utilities seem to be able to improve WUPI tions are now pooled and aggregating utilities identified by a dummy through aggregations: utilities with low WUPI are variable. more frequent in the group of small utilities (few cus- 9. Results for the conditional effects are omitted but are available tomers), and therefore it is not surprising that the upon request. empirical results show that small utilities and those 10. Given the previous arguments that the aggregation may initially with low WUPI can improve WUPI scores through lower WUPI, the focus of this analysis is again on the postaggrega- tion period. aggregations. Notes References Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly harmless 1. See Angrist and Pischke (2009) or Wooldridge (2010) for comprehen- econometrics: An empiricist’s companion. 
Princeton, NJ: Princeton sive approaches to the treatment effect literature. University Press. 2. Unlike before, the number of customers is replaced by the volume Caliendo, Marco, and Sabine Kopeinig. 2008. “Some Practical Guidance produced by a utility. Although the overall results are unaffected by for the Implementation of Propensity Score Matching.” Journal of this choice, the goal is to be comparable to the bulk of existing Economic Surveys 22 (1): 31–72. research, which has focused on volume Garcia, Serge, and Alban Thomas. 2001. “The Structure of Municipal 3. In the case of multiple comparison utilities, the weights are adjusted Water Supply Costs: Application to a Panel of French Local Communities.” accordingly. Journal of Productivity Analysis 16 (1): 5–29. 4. For more detailed information on the construction and background, Michaud, David, Maria Salvetti, Michael Klien, Berenice Flores, Gustavo see World Bank/IAWD (2015). Ferro, and Stjepan Gabric. 2017. Joining Forces for Better Services? When, 5. In both cases, the groups are less than −5 percent, between −5 percent Why, and How Water and Sanitation Utilities Can Benefit from Working and 5 percent, and above 5 percent. Together. Washington, DC: World Bank. Statistical Analysis 35 Rosenbaum, Paul R., and Donald B. Rubin. 1985. “Constructing a Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Control Group Using Multivariate Matched Sampling Methods That Panel Data, 2nd ed. Cambridge, MA: MIT Press. Incorporate the Propensity Score.” American Statistician 39(1): World Bank/IAWD (International Association of Water Supply Companies 33–38. in the Danube River Catchment Area). 2015. Water and Wastewater Rubin, Donald B. 2001. “Using Propensity Scores to Help Design Services in the Danube Region: A State of the Sector. Regional Report. 
Observational Studies: Application to the Tobacco Litigation.” Health Washington, DC: International Bank for Reconstruction and Development/ Services and Outcomes Research Methodology 2 (3-4): 169–88. The World Bank. http://sos.danubis.org/files/File/SoS_Report.pdf. 36 Statistical Analysis Chapter 5 Discussion and Conclusion The results of the causal analysis suggest that on were  negative. Therefore, the findings also call into average, the analyzed aggregations have had no ­ ­ question the simple logic of economies of scale, which ­ systematic effect on cost or other performance assumes that small utilities would always benefit from ­ measures. Nevertheless, more nuanced additional aggregations and that large utilities would experience tests, conditioning the aggregation effect on either diseconomies. The fact that small utilities did not con- (a) the magnitude of the reform or (b) the initial struc- sistently show more favorable results from aggrega- ture of the utility, suggest that the effect varies consid- tions than did large utilities is particularly striking and erably across aggregations. suggests that the process and type of reform may also matter greatly. Therefore, although clear-cut predic- Some of the findings correspond with the preceding tions on which type of aggregation or which initial analysis of utility structure. For instance, utilities with structure is most appropriate are beyond the possibili- initially low densities and volume can benefit from ties of this analysis, the results strongly suggest that aggregations by improving performance. Moreover, there are factors related to utility structure that can aid reforms that add only a few towns were found more or hamper the success of aggregations. helpful cost-wise than others. On the other hand, utili- ties that were already large to begin with, both in terms The empirical analysis is, however, subject to a num- of volume and number of towns, still appear to be able ber of limitations. 
to benefit from cost savings due to economies of scale.

Finally, a number of results show that WUPImgmt, which measures financial and managerial performance, also improved in the postaggregation period. Although the original conditions can only be speculated about, the improvements in WUPImgmt were frequently related to utilities that were initially high in density and volume. Considering that this indicator might be changed more easily and quickly than coverage or quality (which relate to the whole network and to changes in the infrastructure), the finding highlights that aggregations can benefit already-large utilities but possibly more in areas other than cost. On top of that, these results indicate the potential of aggregations to achieve improvements in the short term.

The results are quite clear in the sense that the finding of no average effect seems to be driven by a strongly heterogeneous treatment effect. In some cases, the effects of the aggregations were positive in the sense of cost or performance; in other cases, the effects

Most prominently, there are a number of points that cannot be tackled with the underlying data. First, the analysis is rather short term (in the sense of the observation period after the aggregation). It should still be noted that in the long term, the overall cost effects could be different from what is measured here by looking at variable cost, because the structure of the supply system might be adapted to the larger network after an aggregation. Second, there is no information on the “acquired” utilities. Depending on the initial state of these utilities, the aggregation results might differ considerably. For example, some of the results on reduced coverage could possibly be explained by the fact that the merged utilities exhibited a lower coverage than the acquiring utility. Relatedly, the average effects measured after the matching are sometimes called the “average treatment effect on the treated,” that is, the effect on the utilities that actually aggregated, measured against what would have happened had they not aggregated. Because aggregating utilities were often larger in many dimensions, this effect is a poor picture of what would happen if the reforms were targeted to more average or smaller utilities. Although differentiating the treatment effects with respect to initial utility size tries to deal with this issue, the problem may still prevail.

Finally, the design and process of an aggregation reform not only determine how the structure of a utility changes (for example, to which cluster a utility moves), but also affect a large number of other factors such as the allocation of control and decision rights. Many such factors are not measured in IBNET and therefore cannot be analyzed in this section. In particular, the various purposes of the analyzed aggregations are missing. While the focus on structural characteristics is certainly warranted, the sole focus on that dimension is a clear limitation of this statistical analysis.

To conclude, the analysis stresses the importance of the reform design. More important than whether to aggregate or not seems to be the way a utility is transformed through the reform, and which utilities are affected. The case studies featured in Michaud and others (2017) can help to fill this gap by shedding light on the way aggregations are implemented. This work will go beyond the purely structural analysis in this report and can also pinpoint less visible but nevertheless decisive changes. Moreover, whereas this report could only allude to the mechanism through which aggregations affect performance (such as the input mix and cost shares), the case studies should be able to highlight in more detail the channels that were responsible for the success or failure of aggregations.

Reference

Michaud, David, Maria Salvetti, Michael Klien, Berenice Flores, Gustavo Ferro, and Stjepan Gabric. 2017. Joining Forces for Better Services? When, Why, and How Water and Sanitation Utilities Can Benefit from Working Together. Washington, DC: World Bank.

Appendix A IBNET Data

The main data for this analysis stem from the International Benchmarking Network (IBNET) database. IBNET is a data repository initiated and maintained by the World Bank with the objective to improve the service delivery of water supply and sewerage utilities through the provision of international comparative benchmark performance information.

The utility coverage by IBNET varies strongly among countries, both in terms of the number of utilities and in the population living in the service area of the utilities. Because the main objective of this study is to measure the effect of consolidations, some particular utilities were excluded. The main idea was to restrict the comparisons to cases in which utilities consolidate (that is, increase the number of served towns) and utilities that keep the number of served towns stable. For this reason, utilities that experienced a reduction in the number of served towns were excluded, even if the reduction followed or was preceded by an increase. Those utilities, or parts of them, might be integrated into other firms in our sample and could therefore blur the effect we try to identify.

The eventual number of covered utilities by country is exhibited in table A.1. The 1,306 utilities span an unbalanced panel of 8,059 utility-year observations from 1995 to 2015. Summary statistics of the used variables are displayed in table A.2.

TABLE A.1. Utilities per Country, by Treatment Status

Country                   Not aggregated   Aggregated   Total
Albania                               27            0      27
American Samoa                         0            1       1
Argentina                              5            0       5
Armenia                                1            2       3
Belarus                               27            2      29
Bolivia                                2            0       2
Bosnia and Herzegovina                32            2      34
Brazil                               629            2     631
Chile                                  4            1       5
Côte d’Ivoire                          0            1       1
Croatia                               11            1      12
Czech Republic                         3            4       7
Ecuador                                1            0       1
Egypt, Arab Rep.                      18            0      18
Fiji                                   1            0       1
Georgia                               17            2      19
Guam                                   1            0       1
Honduras                               1            0       1
Hungary                               11            6      17
Jordan                                 2            0       2
Kazakhstan                            22            5      27
Kiribati                               1            0       1
Korea, Rep.                          160            0     160
Kosovo                                 6            0       6
Kyrgyz Republic                       11            2      13
Lithuania                             35            2      37
Macedonia, FYR                        10            4      14
Marshall Islands                       1            0       1
Mexico                                 7            2       9
Micronesia, Fed. Sts.                  2            0       2
Moldova                               29            0      29
Mongolia                               1            0       1
Montenegro                             5            0       5
Namibia                                1            0       1
Norway                                 1            0       1
Pakistan                               2            0       2
Panama                                 1            0       1
Papua New Guinea                       1            1       2
Poland                                 3           12      15
Romania                                8           15      23
Russian Federation                    38            3      41
Samoa                                  0            1       1
Serbia                                16            5      21
Singapore                              1            0       1
Slovak Republic                        7            1       8
Solomon Islands                        1            0       1
South Africa                           8            0       8
Swaziland                              1            0       1
Tajikistan                             8            0       8
Turkey                                 5            0       5
Uganda                                 1            0       1
Ukraine                               26            0      26
Uzbekistan                             6            1       7
West Bank and Gaza                     1            1       2
Yemen, Rep.                            6            0       6
Zambia                                 2            0       2
Total 49                           1,227           79   1,306

TABLE A.2. Summary Statistics

Variable               Mean      Std. Dev.    Min.         Max.      N
AVC                    0.28           0.21       0         1.89  7,769
after                  0.05           0.23       0            1  8,059
WUPIall               70.44          12.21   20.16        99.95  7,152
WUPIcoverage          60.82          18.91    3.92          100  7,152
WUPIquality           86.52          20.42       0          100  4,336
WUPImgmt              74.05          14.91    8.52          100  7,152
vol_both      30,201,859.37  76,735,798.81  40,000  996,000,000  8,008
dens_both            307.53         225.62   34.33     2,977.94  8,059
towns_both             7.34          35.11       2        1,187  8,059

Note: AVC = average variable cost; after = utilities after aggregation; WUPIall = performance indicator including all subindicators; WUPIcoverage = performance indicator for coverage; WUPIquality = performance indicator for quality; WUPImgmt = performance indicator for management; vol_both = volume of both water and wastewater; dens_both = density of both water and wastewater; towns_both = number of towns for both water and wastewater.
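The sample restriction described in this appendix can be illustrated with a minimal Python sketch. It is not the code used for the study: the data layout (a mapping from year to number of served towns), the function names, and the example utilities are hypothetical. The rule for the `after` indicator (switching to 1 from the first observed increase in served towns) is likewise an assumption, because the report does not spell out how that variable was built.

```python
def classify_utility(towns_by_year):
    """Classify one utility from its yearly count of served towns.

    `towns_by_year` maps year -> number of served towns (hypothetical layout).
    """
    years = sorted(towns_by_year)
    counts = [towns_by_year[year] for year in years]
    changes = [later - earlier for earlier, later in zip(counts, counts[1:])]
    if any(change < 0 for change in changes):
        # Any reduction excludes the utility, even next to an increase.
        return "excluded"
    if any(change > 0 for change in changes):
        return "aggregated"
    return "not aggregated"


def after_indicator(towns_by_year):
    """Return {year: 0 or 1}, switching to 1 from the first observed increase.

    Assumption: this mimics the `after` variable of table A.2; the exact
    construction is not documented in the report.
    """
    flags, previous, switched = {}, None, False
    for year in sorted(towns_by_year):
        current = towns_by_year[year]
        if previous is not None and current > previous:
            switched = True
        flags[year] = int(switched)
        previous = current
    return flags


# Hypothetical example panel, not IBNET data.
panel = {
    "utility_a": {2005: 3, 2006: 3, 2007: 3},  # stable: comparison group
    "utility_b": {2005: 2, 2006: 2, 2007: 5},  # consolidates in 2007
    "utility_c": {2005: 4, 2006: 6, 2007: 3},  # increase, then reduction
}
status = {name: classify_utility(series) for name, series in panel.items()}
sample = {name: series for name, series in panel.items()
          if status[name] != "excluded"}
```

In this toy panel, only the utility with a reduction in served towns is dropped; the remaining two form the aggregated and not-aggregated groups that table A.1 counts by country.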
Appendix B Methodological Details of Clustering

In this section the approach used for the cluster analysis is briefly described. First, it should be noted that the three input variables (volume, density, and number of towns) are transformed by taking the natural log and then standardizing the variable. The resulting variable has mean 0 and a standard deviation of 1. This last step is taken to ensure that differences in measurement units do not affect the cluster results. Without the transformation, variables with higher variation would tend to have a higher influence on the clustering. Taking the natural log is motivated by the fact that the distribution is highly right skewed. After taking the natural log, the dispersion is much reduced.

Second, because the number of clusters in k-means clustering is somewhat ad hoc (similar to the cut-off in hierarchical clustering), we compute a number of test statistics for the choice of the number of clusters. Following Makles (2012), we compute the within-cluster sum of squares (WSS), its logarithm (ln(WSS)), as well as two measures of fit (η² and PRE, the proportional reduction of error). The results are displayed in figure B.1. What is apparent in the graph is that a minimum of three clusters is necessary to capture a large share of the cluster variation. Moreover, after seven clusters, the gains in fit by adding more clusters are very small. After experimenting with the results and the distinct clusters generated by different values of k, six final clusters were chosen. Fewer than six would mean obtaining considerably more heterogeneous clusters, whereas having more than six does not add to the interpretation and distinctness of the clusters. Adding a seventh cluster would further differentiate the utilities serving a single town.

FIGURE B.1. Test Statistics for K-Means Cluster Choice [four panels plotting WSS, log(WSS), η², and PRE against the number of clusters k, for k from 0 to 20]

Note: The top panels exhibit within sum of squares and the log of within sum of squares for different numbers of k. The lower panels show η² and PRE.

Third, because k-means clustering is sensitive to the initial cluster allocation of individual utilities, a hierarchical clustering is run first and the associated results are used as a starting classification for the k-means clustering. For the hierarchical clustering, Euclidean distance and an average-linkage algorithm are used.

Reference

Makles, Anna. 2012. “Stata Tip 110: How to get the Optimal K-means Cluster Solution.” Stata Journal 12 (2): 347–51.
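The three steps described in appendix B can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure actually run for the report: the utilities are invented, the clustering routines are naive textbook versions, standardization uses the sample standard deviation, and the formulas η²(k) = 1 − WSS(k)/WSS(1) and PRE(k) = (WSS(k−1) − WSS(k))/WSS(k−1) are the usual reading of Makles (2012), which this report does not restate.

```python
import math
from statistics import mean, stdev


def log_standardize(column):
    """Take natural logs, then rescale to mean 0 and standard deviation 1."""
    logged = [math.log(value) for value in column]
    mu, sd = mean(logged), stdev(logged)  # sample sd: an assumption
    return [(value - mu) / sd for value in logged]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(points, k):
    """Naive agglomerative clustering with Euclidean average linkage."""
    groups = [[i] for i in range(len(points))]
    while len(groups) > k:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = mean(math.dist(points[i], points[j])
                         for i in groups[a] for j in groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a].extend(groups.pop(b))  # merge the two closest groups
    return groups


def centroid(points, members):
    return tuple(mean(points[i][d] for i in members)
                 for d in range(len(points[0])))


def kmeans(points, start_groups, max_iter=100):
    """Lloyd's algorithm from a starting partition; returns (labels, WSS)."""
    centers = [centroid(points, group) for group in start_groups]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda c: squared_distance(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = centroid(points, members)
    wss = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss


def fit_statistics(points, max_k):
    """WSS, ln(WSS), eta-squared, and PRE for k = 1..max_k."""
    wss = {k: kmeans(points, average_linkage(points, k))[1]
           for k in range(1, max_k + 1)}
    log_wss = {k: math.log(value) for k, value in wss.items()}
    eta2 = {k: 1 - value / wss[1] for k, value in wss.items()}
    pre = {k: (wss[k - 1] - wss[k]) / wss[k - 1] for k in wss if k > 1}
    return wss, log_wss, eta2, pre


# Hypothetical utilities: (volume, density, number of towns).
utilities = [
    (1.0e5, 100, 1), (1.1e5, 110, 1), (0.9e5, 95, 1), (1.05e5, 105, 1),
    (1.0e7, 800, 2), (1.2e7, 850, 2), (0.9e7, 780, 3), (1.1e7, 820, 2),
    (5.0e7, 150, 40), (6.0e7, 160, 50), (4.5e7, 140, 45), (5.5e7, 155, 42),
]
columns = [log_standardize(column) for column in zip(*utilities)]
points = list(zip(*columns))
wss, log_wss, eta2, pre = fit_statistics(points, max_k=5)
labels, _ = kmeans(points, average_linkage(points, 3))
```

Plotting `wss`, `log_wss`, `eta2`, and `pre` against k would give panels analogous to figure B.1. Seeding k-means with the hierarchical partition makes the result deterministic, which is the motivation given in the third step above.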