Policy Research Working Paper 10327

The Importance of Maintenance: Geospatial Analysis of Cholera Risk and Water and Sanitation Infrastructure in Harare, Zimbabwe

George Joseph, Sveta Milusheva, Hugh Sturrock, Tonderai Mapoko, Sophie Charlotte Amy Ayling, Yi Rong Hoo

Water Global Practice
February 2023

Abstract

Understanding the specific factors associated with cholera outbreaks is an integral part of designing better approaches to mitigate their impact. This paper uses georeferenced case data from the cholera epidemic that occurred in Harare, Zimbabwe, from September 2018 to January 2019. The paper applies spatio-temporal modeling to understand how the outbreak unfolded and the factors associated with higher risk of being a reported case. The study highlighted a number of findings. First, using call detail records to estimate weekly population movement throughout the city, the results suggest that human movement helps to explain the spatiotemporal patterns of the cases observed. In addition, the results highlight a number of sociodemographic risk factors and suggest that there is a relationship between cholera risk and water infrastructure, with populations living in close proximity to the sewer network with high access to piped water being at higher risk. A possible explanation for this surprising observation is that sewer bursts led to the contamination of the piped water network, turning access to piped water, usually assumed to be protective, into a risk factor. Although further studies are required to test this hypothesis, if it is true, it highlights the importance of maintenance for achieving the Sustainable Development Goals of improved water and sanitation infrastructure.

This paper is a product of the Water Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world.
Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at gjoseph@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

The Importance of Maintenance: Geospatial Analysis of Cholera Risk and Water and Sanitation Infrastructure in Harare, Zimbabwe

George Joseph1, Sveta Milusheva, Hugh Sturrock, Tonderai Mapoko, Sophie Charlotte Amy Ayling, Yi Rong Hoo

1 George Joseph (corresponding author), Water Global Practice, World Bank, Washington DC 20433, email: gjoseph@worldbank.org; Sveta Milusheva, Development Impact Evaluation, World Bank, Washington DC 20433, smilusheva@worldbank.org; Hugh Sturrock, Co-Founder and Chief Science Officer at Locational.UK, hugh@locational.io; Tonderai Mapako, Chair, Biomedical Research and Training Institute, Institutional Review Board, Harare, Zimbabwe, mapakot2008@gmail.com; Sophie Charlotte Amy Ayling, Center for Advanced Spatial Analysis, University College London, UK, sayling@worldbank.org; and Yi Rong Hoo, World Bank, Washington DC 20433, yhoo@worldbank.org.

We would like to acknowledge the support of the Global Water Security and Sanitation Partnership (GWSP) of the World Bank and the ieConnect for Impact Program funded with UK aid from the UK government in completing this work. This analysis also formed part of a broader collaboration of the World Bank with the Zimbabwe National Modeling Consortium. In particular, we thank the Biomedical Research and Training Institute - Institutional Review Board (BRTI-IRB) and the COVID-19 Modeling and Spatial analysis sub-group (modeling consortium); Mr. Tendayi Kureya, Executive Secretary of the Medical Research Council of Zimbabwe (MRC-Z); Dr Shungu Munyati, Director General of BRTI; Professor in Demography and Behavioral Science Simon Gregson and Dr. Mike Pickles, Research Fellow of Imperial College London (ICL). From the World Bank we would like to thank Chenjerai Sismayi from the Health, Nutrition and Population Global Practice of the World Bank as well as the Country Management Unit of the World Bank in Zimbabwe.

The findings, interpretations and conclusions expressed in this paper do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank or the governments whom they represent. The World Bank does not guarantee the accuracy of the data included in this work.

Introduction

The World Health Organization’s (WHO’s) Global Task Force on Cholera Control views cholera not just as a global threat to public health, but also as an indicator of poverty and a mark of inequity [1]. Since 1970, cholera has become endemic in many countries in Sub-Saharan Africa [2], with annual case fatality ratios (CFRs) of 1.6%, the highest in the world [3]. While several oral cholera vaccines (OCVs) now exist, they only protect against cholera infection for up to two years, and they address the symptoms and not the cause. Large outbreaks continue to plague several countries, especially in Sub-Saharan Africa. Without addressing the pathways to contamination in these low-income settings with inadequate infrastructure, outbreaks will continue to be a risk.
Pathways for transmission can include contaminated water sources, both for drinking and domestic use [4][5], [6][7], poor sanitation [8][9][10]; contaminated food, poor personal hygiene [11] and poor drainage [11], [12]. These same pathways can also lead to other water-borne diseases such as typhoid, making it vital to understand which pathways are most important in a given setting in order to target resources effectively to reduce the spread of these diseases. Additionally, since cholera is a water-borne disease, the focus in terms of risk factors is usually on infrastructure and water sources, yet another important factor is mobility. If infected individuals travel from an area with a cholera outbreak to an area without any cases and the water/sewer networks are susceptible to contamination, the traveler could contaminate the water in this new location and lead to the spread of the disease [[13], [14]]. Studying the role of mobility could be important for helping to contain spread [15]–[17]. In particular, outbreaks in developing country contexts are usually attributed to where households have low access to WHO Joint Monitoring program classified ‘improved’ water and sanitation infrastructure such as networked water and sewer systems [2], [18]. However, poorly maintained or operated network infrastructure can also fail to bring about the protection from cholera that it may seem to promise. There are examples from developed and middle-income countries where poor operation of sewerage infrastructure can lead to health and environmental hazards [19]–[21]. An understanding of the spatial distribution of cholera risk, and the factors driving risk, can support decision making around longer-term interventions to reduce the risk of outbreaks. 
Especially in cities with mixed service provision - where households may rely on piped water and point sources for drinking water, and on on-site sanitation or sewer networks for sanitation - there are a variety of potential improvements that could be made. These range from investing in repair or construction of network infrastructure (water or sewer pipes), to ensuring better access to protected point sources (boreholes, protected wells) or improving the quality of on-site sanitation (such as pit latrines or septic tanks). While improving access to piped water is generally considered an improvement in infrastructure with public health benefits, piped water networks can become health hazards if not properly maintained. For example, in Zambia, Ashraf et al. (2017) found that service interruptions were associated with an increased incidence of diarrheal disease, upper respiratory infections and typhoid fever [22]. Equally, sewer networks are considered to be the top of the sanitation service ladder according to the Joint Monitoring Program of the WHO (JMP-WHO, 2020), with risks from onsite facilities better documented in the literature (Bancalari & Martinez, 2018) [23]. Nevertheless, when bursts or leakages are not properly addressed, liquid sewage can potentially be even more hazardous than waste from onsite sanitation facilities [24].

This paper uses geo-spatial analysis to explore the role of water and sanitation infrastructure in cholera risk. The study focuses on Harare, Zimbabwe, using the cholera outbreak of September - December 2018 as a case study. First, we use spatial modeling, detailed data on infrastructure and mobility, and georeferenced case data to identify factors associated with cholera risk. Second, we model different scenarios of infrastructural improvements and compare which of them contributes the most to reducing such risk.
Third, we look at how the implementation of infrastructural improvements can be selectively targeted by ward throughout the city in order to potentially reduce cost and maximize health impacts.

Methods

Ethics Statement

This study was approved by the Medical Research Council of Zimbabwe. The approval number is MRCZ/E/302 and it was assessed as exempt. All data obtained and herein reported were anonymized prior to analysis.

The Study Area

Zimbabwe is a landlocked Sub-Saharan African country with a population of just under 15 million people (World Bank, 2021). It is divided into 60 districts and bordered by Zambia, Mozambique, Botswana and South Africa. Its capital, on which this analysis is focused, is Harare, a city with a population of 1.5 million inhabitants (UN, 2012) and 46 wards. Zimbabwe has experienced cholera outbreaks dating back to 1971, with the worst occurring in 2008/9, which resulted in 4,288 reported deaths over 60 of its 62 districts [25]. The focus of this paper is a more localized but severe outbreak which took place between September and December 2018 and was concentrated in the capital, Harare. Of almost 10,000 cases, 93% (9,755) were reported in Harare, resulting in 46 deaths. According to the latest DHS data, 30% of Harare’s population is connected to an existing sewer network, compared to just 12% in neighboring country capital Lusaka, Zambia (DHS, 2015 & 2018). All the data detailed in the paragraphs below was cleared for use by the Medical Research Council of Zimbabwe (MRC-Z). The approval letter can be found in the supporting information.

Case data

Georeferenced cholera case data were available for the Harare area collected by the Ministry of Health and Child Care (MoHCC) following the World Health Organization (WHO) guidelines. The first case was recorded on September 4, 2018, and the last on January 9, 2019.
WHO criteria for case diagnosis state that any patient aged 5 years or more presenting with acute watery diarrhea and severe dehydration where cholera is not known to be occurring, or any patient 2 years or older presenting with acute watery diarrhea where cholera is known to be occurring, can be suspected as a cholera case. Confirmation is achieved by sampling a subset of patients during the course of the epidemic (see footnote 2). Date of diagnosis, gender, age and geo-location of the case’s dwelling are also collected. For the purpose of spatio-temporal analyses, cases were aggregated across 7-day bins from September 1, 2018. Age and gender of the patients are not taken into account in this analysis. A total of 9,890 cases were included in the analysis after cleaning.

Population Characteristics Data

Population density data were available from Facebook HDX [27]. Small Area Estimates (SAEs) for poverty produced by the World Bank were used for ward area estimates of poverty. Household access to water, sanitation and hand washing infrastructure was taken from the DHS 2015. This was available at the cluster level for 60 clusters for Harare and the surrounding area, and 44 inside the city.

Water and Sanitation Infrastructure and Environment Data

Data on water and sanitation infrastructure in the city were collected from a variety of sources. The water and sanitation network shape files for the city were obtained with the support of the City of Harare’s (CoH) Water and GIS departments. Complaints data were digitized for 7,179 water network breakage reports and 24,713 sewer burst location reports from around the city with the support of Harare’s GIS department. Water quality samples for 23 sampling points, administrative boundaries for water supply (DMAs) (165 suburbs), and the location of 362 markets and public facilities were also collected and digitized by City of Harare’s GIS department.
Data on 1,848 private borehole locations was provided by Manyame, Upper Mazowe and Nyagui Sub-catchment area operators, of which 1,721 were inside Harare. Finally, a groundwater vulnerability index was developed by Ziva, based on a combination of geological factors (fault index, soil permeability, depth to groundwater, topographic wetness index and fissure flow).

Population Mobility Data

We use anonymous and aggregated call detail record (CDR) data provided by two mobile network operators in Zimbabwe covering the period of the outbreak (September 2018-February 2019). Anonymous CDR data contain information on the location of the nearest cellphone tower through which any calls or texts made or received are routed. By comparing the locations of where these calls or texts are placed for the same anonymized SIM over time, it is possible to estimate movement patterns for any SIM down to the spatial resolution of the cell phone tower catchment area [28]–[30].

Footnote 2: It is standard practice in epidemic response that ‘Once an outbreak is declared, there is no need to confirm all suspected cases. The clinical case definition is sufficient to monitor epidemiological trends’ [26].

Spatial Modeling

To generate continuous maps of cholera risk across Harare and to examine relationships with potential risk factors, we fitted spatiotemporal Poisson point process models to the cholera case data. Poisson point process models are suitable for modeling the distribution of points in space and time and have been used in a number of settings including modeling disease case data [31], species distributions [32] and locations of storm peaks [33]. Similar to the approach taken by Youngman and Economou [33], point process models were fit in a generalized additive framework [34] using a grid of 4,000 quadrature points per time period. Quadrature weights were set to be equal to the population in each quadrature grid cell using gridded population data available via Facebook [27].
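To make the quadrature idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a log-linear intensity and approximates the point process log-likelihood as the sum of log-intensities at the case locations minus a weighted sum of intensities over the quadrature points. The function name `ppm_loglik` and all toy values are hypothetical, and the generalized additive machinery (smooths, penalties) used in the paper is not reproduced here.

```python
import math

def ppm_loglik(beta, x_cases, x_quad, weights):
    """Approximate log-likelihood of an inhomogeneous Poisson point process.

    The intensity is log-linear: lambda(u) = exp(sum_k beta[k] * x_k(u)).
    The integral of the intensity is replaced by a weighted sum over
    quadrature points (weights stand in for area or, as in the paper,
    for the population of each quadrature grid cell).
    """
    def log_intensity(x):
        return sum(b * xi for b, xi in zip(beta, x))

    at_cases = sum(log_intensity(x) for x in x_cases)
    integral = sum(w * math.exp(log_intensity(x))
                   for w, x in zip(weights, x_quad))
    return at_cases - integral

# Toy example (hypothetical numbers): intercept-only model with 3 cases
# and 4 quadrature cells holding 10 people in total.
x_cases = [[1.0]] * 3
x_quad = [[1.0]] * 4
weights = [1.0, 2.0, 3.0, 4.0]

# For a constant rate, the maximizer is cases / total weight = 0.3.
beta_hat = [math.log(3 / 10)]
ll = ppm_loglik(beta_hat, x_cases, x_quad, weights)
```

With population as the weight, the fitted intensity reads as cases per person, which is why the intercept-only maximizer in the toy example is simply the number of cases divided by the total population.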
Table S1 lists details of the covariates examined in the modeling process. All covariates were produced at, or resampled to, approximately 100m resolution (Figure 1).

Figure 1 - Covariates included in the point process modeling.

To integrate human movement into the model, for every individual/SIM we calculated the probability that an individual traveling to a destination became infected in the last ten days based on the locations the individual spent time in during those seven days, the duration of time spent in each location and the local level of risk at each location (estimated by calculating observed weekly incidence per tower catchment). This was aggregated across individuals entering a destination ward in a given week using a methodology drawn from Milusheva et al. to calculate the risk of imported malaria [28]. This resulted in an estimate of the incidence of ‘imported’ cases per ward for each week. This was rasterized to the same resolution and extent as the other covariates and included in the model as a dynamic covariate lagged by 1 week, i.e. incidence of imported cases in the week prior.

Figure 2 - Estimated incidence of imported cholera cases for the first 6 weeks of the outbreak in Harare.

These static and dynamic covariates were included in the model as linear effects. Covariates were retained in the model if their inclusion led to a decrease in AIC of >2. In addition, to examine the hypothesis that faults in the sewer network may have led to contamination of the piped water network, we included an interaction between the proportion of households with piped water and distance to sewer network. We also included a spatiotemporal effect, modeled using a 3-dimensional tensor product between a low-rank Gaussian process [31] smooth on latitude and longitude and a cubic regression spline smooth on week. An additional penalty term was added to each covariate to allow non-significant variables to essentially be selected out of the model.
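The AIC retention rule described above can be sketched as follows. This is an illustrative helper and not the authors' code: the function names are hypothetical, AIC uses the standard definition 2k - 2 log L (a penalized additive model would use effective degrees of freedom for k), and the example values are the AICs reported in the Results for the final model (56,034) and for the same model without the lagged imported incidence (56,219).

```python
def aic(log_likelihood, n_params):
    """Standard Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood

def retain_covariate(aic_without, aic_with, min_decrease=2.0):
    """Keep a covariate only if adding it lowers AIC by more than min_decrease."""
    return (aic_without - aic_with) > min_decrease

# AIC values as reported in the Results section of the paper.
keep_lag_incidence = retain_covariate(aic_without=56219, aic_with=56034)
```

Here the decrease is 185 AIC units, far above the threshold of 2, so the lagged imported incidence would be retained under this rule.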
Model fit was evaluated by plotting the observed number of cases per time period against the number predicted by the model. In addition, we constructed a hexgrid of approximately 1km width and compared the number of observed cases across time periods per grid cell to the number predicted.

Scenario Modeling

To explore the potential impact of different public health intervention scenarios (Table 1), we adjusted the relevant covariate and used the final model to predict a counterfactual. For this exercise we refit the final model without the imported incidence variable, because this variable would itself change under the different scenarios: if there were no sewer bursts, transmission might be predicted to be lower, which in turn would reduce the incidence of imported infections to other wards. For each scenario, we also ranked the wards (3rd administrative level) by the expected change in number of cases that would result from that scenario. This allowed us to identify where interventions could have been targeted and the expected impact of such targeting.
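The counterfactual step can be illustrated with a simplified sketch. It assumes a purely log-linear model in which expected cases in a grid cell equal population times the exponential of a linear predictor; the spatiotemporal smooth in the paper's model is ignored. All function names, coefficients and cell values below are hypothetical and chosen only to show the mechanics of overwriting a covariate layer and comparing predictions.

```python
import math

def expected_cases(cells, coefs):
    """Sum of population * exp(linear predictor) over grid cells."""
    total = 0.0
    for cell in cells:
        eta = coefs["intercept"] + sum(
            coefs[name] * cell[name] for name in coefs if name != "intercept"
        )
        total += cell["population"] * math.exp(eta)
    return total

def proportional_change(cells, coefs, covariate, new_value):
    """Proportional change in expected cases when one covariate is overwritten."""
    baseline = expected_cases(cells, coefs)
    scenario_cells = [{**cell, covariate: new_value} for cell in cells]
    return (expected_cases(scenario_cells, coefs) - baseline) / baseline

# Hypothetical two-cell city and hypothetical coefficients.
cells = [
    {"population": 1000, "sewer_burst_density": 1.0},
    {"population": 1000, "sewer_burst_density": 0.0},
]
coefs = {"intercept": -6.0, "sewer_burst_density": 2.0}

# Scenario: sewer burst density set to 0 everywhere.
change = proportional_change(cells, coefs, "sewer_burst_density", 0.0)
```

In this toy setting the change is negative because the only cell with bursts has its risk multiplier removed; restricting the overwrite to a subset of cells would mimic the spatially targeted scenarios.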
We also calculated the population of each ward to allow us to create an efficiency index, estimated as:

$$\text{Efficiency} = \frac{\%\ \text{reduction in cases}}{\%\ \text{of population targeted}}$$

Positive values therefore indicate fewer cases per share of the population targeted, which is the sign convention used in Table 3.

| Scenario | Change to covariate layers |
|---|---|
| No sewer bursts throughout city | Sewerburst density set to 0 throughout Harare |
| No sewer bursts within 200m of the water line | Sewerburst density set to 0 within 200m of waterline |
| No sewer bursts in high (>median) density burst areas | Sewerburst density set to 0 where sewerburst density >median |
| No sewer bursts in highest (75th percentile) density burst areas | Sewerburst density set to 0 where sewerburst density >75th percentile |
| Piped water provided to entire city | Proportion with piped water set to 1 |
| Piped water provided to areas within 500m of the existing piped water network | Proportion with piped water set to 1 within 500m of the water network |
| Sewer access provided to entire city | Distance to sewer and sanitation risk both set to 0 |
| Sewer access provided to areas within 500m of the existing sewer network | Distance to sewer and sanitation risk both set to 0 within 500m of sewer network |
| Piped water and sewer access provided to entire city | Proportion with piped water set to 1, distance to sewer and sanitation risk both set to 0 |
| Piped water provided within 500m of existing piped water network and sewer access provided within 500m of existing sewer network | Proportion with piped water set to 1 within 500m of the water network, distance to sewer and sanitation risk both set to 0 within 500m of sewer network |

Table 1: Scenarios modeled and corresponding changes to covariate layers

Results

Figure 3 shows the distribution of cases temporally and spatially. Cases were concentrated towards the first few weeks in the south-west area of the city in the Glen View and Budiriro neighborhoods.
Figure 3 - Distribution of cholera cases across Harare temporally by week (left) and spatially across all weeks (right)

All covariates except for water burst density were retained in the final model, including the interaction between proportion with piped water and distance to the sewer network. Table 2 shows the coefficients for the final model. The final model had an AIC value of 56,034. The same model without lag incidence of imported cases had an AIC of 56,219 and a spatiotemporal only model (i.e. with no covariates) had an AIC of 57,943. This indicates that the addition of covariates and lag incidence improved the model fit.

Table 2. Results from the point process modeling of cholera risk in Harare.

| Term | Coefficient | 95% CI | p-value |
|---|---|---|---|
| Intercept | -5.7930 | -8.17, -3.41 | <0.0001 |
| Geological vulnerability | -2.9880 | -3.75, -2.23 | <0.0001 |
| Distance to medium sized market (decimal degrees) | -0.0005 | -0.000643, -0.000399 | < 2e-16 |
| Sanitation risk | 4.2810 | 3.53, 5.03 | < 2e-16 |
| Proportion with piped water | 2.2090 | -3.96, 8.38 | 0.4831 |
| Proportion with unimproved/other water | -81.5300 | -100, -62.7 | < 2e-16 |
| Poverty | 14.3600 | 10.3, 18.5 | <0.0001 |
| Private borehole density | -36.2900 | -61.6, -11.0 | 0.0049 |
| Sewer burst density | 3.1790 | 2.38, 3.98 | <0.0001 |
| Distance to water network | -199.3000 | -231, -167 | < 2e-16 |
| Distance to sewer network | 20.3800 | -15.4, 56.2 | 0.2645 |
| Lag incidence of imported cases | 246.7000 | 176, 318 | <0.0001 |
| Proportion with piped water * Distance to sewer network | -729.1000 | -830, -628 | < 2e-16 |

Model validation showed good alignment between the fitted values and the observed numbers of cases, both per week and per 1km hexgrid cell (Figure 4).

Figure 4 - Observed versus predicted numbers of cases per week (left) and per ~1km hexgrid cell (right).

The modeling suggested that sanitation risk, poverty, sewer burst density and incidence of imported cases were all risk factors for cholera (Table 2).
In contrast, increasing geological vulnerability, distance to medium sized market, proportion with unimproved or other water source, private borehole density and distance to water network were all protective. The modeling also suggested that there was an interaction between the proportion of households with piped water and distance to the sewer network. Figure 5 describes this interaction, showing the relationship between the proportion of households with piped water and cholera risk at different distances from the sewer network when all other covariates are set to their mean. This shows that at a distance of 0km to the sewer network, risk is slightly higher, the higher the proportion of households with piped water. However, this relationship switches direction in areas >2km from the sewer network, where the higher the proportion of households with piped water, the lower the risk of cholera.

Figure 5 - The effect of proportion with piped water at different distances to the sewer network. The y axis represents the log rate when holding all other covariates at their mean value. Higher values indicate higher cholera risk.

Figure 6 shows the predicted incidence at 100m resolution across Harare for the first 6 weeks, in which 88.6% of cases were recorded. Incidence is markedly higher towards the south west part of the city around Glen View and Budiriro neighborhoods.

Figure 6 - Predicted incidence of cholera across Harare for the first 6 weeks of the outbreak (September 1st – October 13th 2018). Note the log-scaled colour palette.

Scenario Modeling

The results of the scenario modeling are shown in Table 3. These show that if there had been no sewer bursts, case numbers could have been 87% lower. Furthermore, targeting prevention of sewer bursts to those areas in which burst density was highest (above the 75th percentile) would have nearly the same effect as prevention throughout the city, but would only require focusing on areas that contain 29% of the population.
This was the most efficient of the strategies explored. As the modeling suggested that piped water is a risk factor when households are very close to the sewer network, the scenarios that included improving access to piped water and/or sewers generally led to a predicted rise in cases.

| Scenario | Proportion change in cases (% change) | Population targeted | Proportion of population | Efficiency (% change in cases/% of pop targeted) |
|---|---|---|---|---|
| No sewer bursts throughout city | -0.87 (-87%) | 2599530 | 1 | 0.87 |
| No sewer bursts within 200m of the water line | -0.85 (-85%) | 1554432 | 0.6 | 1.41 |
| No sewer bursts in high (over median) density burst areas | -0.87 (-87%) | 1455466 | 0.56 | 1.55 |
| No sewer bursts in highest (75th percentile) density burst areas | -0.86 (-86%) | 764361 | 0.29 | 2.97 |
| Piped water provided to entire city | 8.20 (820%) | 2599530 | 1 | -8.20 |
| Piped water provided to areas within 500m of the existing piped water network | 7.55 (755%) | 1726325 | 0.66 | -11.44 |
| Sewer access provided to entire city | 1.71E+16 | 2599530 | 1 | -1.71E+16 |
| Sewer access provided to areas within 500m of the existing sewer network | -0.54 | 1459135 | 0.56 | 0.96 |
| Piped water and sewer access provided to entire city | 5.03E+16 | 2599530 | 1 | -5.03E+16 |
| Piped water provided within 500m of existing piped water network and sewer access provided within 500m of existing sewer network | 2.91 (291%) | 1830207 | 0.7 | -4.15 |

Table 3: Scenarios and results of respective changes in water and sanitation coverage across the city in terms of reducing cholera risk.

In terms of opportunities to spatially target intervention options, results show that for all four scenarios involving preventing sewer bursts, the same five wards were ranked 1st to 5th (Figure 7; see footnote 3). Targeting these five wards could have prevented nearly 70% of cases.

Footnote 3: Ranking was conducted by calculating the difference between the total cases in the ward predicted under a given scenario and the observed cases in the ward, aggregating the cases from all the grid cells in a ward.
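As a worked check on the efficiency column of Table 3, the sketch below recomputes the index from the rounded values printed in the table. The function name and dictionary layout are hypothetical; the index is taken to be the negative of the proportional change in cases divided by the proportion of the population targeted, which matches the signs in the table. Because the inputs are the rounded published figures, a recomputed value can differ from the table in the last digit.

```python
def efficiency(proportion_change_in_cases, proportion_population_targeted):
    """Reduction in cases per share of population targeted (positive = fewer cases)."""
    return -proportion_change_in_cases / proportion_population_targeted

# (proportion change in cases, proportion of population targeted), from Table 3.
sewer_burst_scenarios = {
    "No sewer bursts throughout city": (-0.87, 1.0),
    "No sewer bursts within 200m of the water line": (-0.85, 0.6),
    "No sewer bursts in high (over median) density burst areas": (-0.87, 0.56),
    "No sewer bursts in highest (75th percentile) density burst areas": (-0.86, 0.29),
}

scores = {name: efficiency(change, share)
          for name, (change, share) in sewer_burst_scenarios.items()}

# Scenario with the largest reduction per share of population targeted.
most_efficient = max(scores, key=scores.get)
```

The same function applied to the piped water scenarios returns negative values, reflecting a predicted rise in cases rather than a reduction.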
Figure 7 - Left: the five wards in which targeting interventions would prevent the largest number of cases for the scenarios A - no sewer bursts, B - no sewer bursts within 200m of the water line, C - no sewer bursts in high (median) density burst areas and D - no sewer bursts in highest (75th percentile) density burst areas. Right: the cumulative reduction (%) in cases achieved, and population targeted, from targeting wards in order of priority.

Discussion

There are a few key findings that come out of this research and have direct policy implications. The modeling suggested that sanitation risk, poverty, sewer burst density and incidence of imported cases were all risk factors for cholera. Meanwhile, increasing geological vulnerability, distance to medium sized market, proportion with unimproved or other water source, private borehole density and distance to water network were also correlated, but negatively, meaning they acted as protective factors. The correlations of cholera risk with poor sanitation more generally [12] and with sewer bursts [20] are well known in the literature, and sewer bursts were specifically identified as an issue in a previous cholera outbreak in Zimbabwe [35]. The association with poverty is also unsurprising [36]. Factors associated with poverty and market size, such as population density, overcrowding and low educational status, are often identified in the literature [37], [38]. The protective character of proximity to the water network is also supported by a similar study in Lusaka, Zambia [12]. Surprising findings from this work include the fact that the proportion of those with an unimproved water source is a protective factor and that an increase in geological vulnerability is also a protective factor. The former can be explained by the proximity of piped water sources to sewers with a high number and density of bursts.
However, the association of increased geological vulnerability with reduced cholera risk is unexpected, not substantiated in the literature and requires further exploration. The modeling suggests that there was an interaction between the proportion of households with piped water and distance to the sewer network. In the case of Harare, it suggests that closer to the sewer network (at 0km), the risk of cholera is slightly higher, the higher the proportion of households who have piped water. However, this relationship switches direction in areas >2km from the sewer network, where the higher the proportion of households with piped water, the lower the risk of cholera. This implies that there is a higher risk of having a piped water connection for households which are proximate to the sewer network in Harare, which we believe to be due to the high rate of sewer bursts in the city at the time of the outbreak. The most accurate way of testing whether this is the case would be through water quality testing data at different points along the network. Although water quality testing data was provided for this analysis, there were insufficient data points to be able to make such a comparison. In the absence of such data, this analysis can still provide some useful insights into investments that can reduce sewer bursts particularly in areas which are closer to the piped water network, and thereby offer the greatest health benefits in terms of reducing cholera risk. We identify that spatially, the risk of sewer bursts on this occasion was concentrated in five wards in the south west of the city. On the assumption that sewer bursts do not occur at random, but are spatially correlated due to underlying weaknesses in the network in those areas, targeting these five wards promises to reduce cases across the city by up to 70%. 
Using the data from the bursts reported and analyzed, there appeared to be limited additional benefits (in terms of reducing cholera risk) in investing in repairing sewer bursts throughout the city, as this would have led to a 0.87 reduction in risk, versus a 0.86 reduction in risk when concentrating on the areas where the top 25% of sewer bursts are concentrated. The efficiency and cost effectiveness of reducing cholera risk is therefore much greater when concentrating on areas that fall above the 75th percentile of sewer burst density. Similarly, providing piped water only to within 500 meters of the existing network is predicted to raise cases by less (a proportional change of 7.55) than extending access to the whole city (8.2). Finally, we also see how extending access to the sewer network in its state at the time of the data collection in fact seems to increase cholera risk. This finding may seem perverse given that according to the Joint Monitoring Program of the WHO, which sets the standards for sanitation globally, sewered sanitation is at the top of the improved sanitation ladder. However, it serves to highlight the importance of proper maintenance of existing sanitation networks rather than necessarily emphasizing new construction. This is particularly important in a context where the institutional capacity and financial resources to maintain large-scale sanitation systems may be lacking.

An important limitation of the study is that the analysis presented relies on observational data and thus we draw correlations and associations, rather than drawing causal inferences. This clearly has important implications when interpreting the scenario analysis. Additionally, there are likely to be some errors in case data such as the locations and times of case identification.
Similarly, a number of layers used in the analysis, such as proportion of households with access to piped water, were generated using DHS data, which have been spatially obfuscated. As a result, this may have introduced error into the layer, which could obscure, or falsely lead to, associations with cases. In addition, sewer and water burst density was based on spatiotemporally referenced complaints data, which is obviously a proxy measure with a number of reported and recorded issues. Data from direct inspections of sewer and water networks would have been a more reliable measure but such data was not available to the researchers. The focus of this analysis has been to study what factors might be associated with a cholera outbreak and could explain the spatial heterogeneity in the outbreak. An area of future research would be to explore what may lead to some of these factors, especially sewer bursts, in order to be able to predict areas at risk for future outbreaks. Above all, this analysis shows the importance of emphasizing maintenance above and beyond new construction, especially when thinking of extending an already poorly maintained system. Acknowledgments The authors acknowledge with thanks the financial support received from the Global Water Security and Sanitation Partnership (GWSP). The authors also acknowledge the cooperation of officials from City of Harare (CoH) Water and GIS Departments, in particular Toine Ramaker from VEI, Dr. Manangazira from the Ministry of Health, Zimbabwe; in country counterparts at the World Health Organization (WHO); and the Country Management Unit (CMU) of the World Bank in Zimbabwe, in particular then Country Manager Mukami Kariuki, Senior Operations Officer Tonderai Fadzai Naome Mukonoweshuro and Health Specialist Consultant Chenjerai Sisimayi for facilitating contact with counterparts and local dissemination of findings. 
The findings, interpretations and conclusions expressed in this paper do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank or the governments they represent. The World Bank does not guarantee the accuracy of the data included in this work.
Supporting Information
Figure S1. Observed and predicted numbers of cases over a ~1km hexgrid.
Table S1: Details of covariates in the modeling process
| Source | Data Type | Number of cases / records |
|---|---|---|
| City of Harare | Cholera Cases | 9,890 cases |
| City of Harare | Market and Public Facilities | 362 |
| City of Harare | Water Bursts | 7,179 water reports |
| City of Harare | Sewer Bursts | 24,713 sewer reports |
| City of Harare | Boundaries of admin zones | 165 suburbs mapped |
| City of Harare | Water Quality Sampling | 23 sample point names matched with weekly water quality reports |
| City of Harare | Water and sewer network | Shape files |
| City of Harare | Public boreholes | 481 geo-coded |
| Household Data (DHS, 2015) | Drinking water source/sanitation facility | 60 WASH clusters, 44 inside Harare |
| Groundwater Specialist | Groundwater Vulnerability | 2 raster maps with groundwater risk level |
| Manyame, Upper Mazowe, Nyagui Sub catchment Councils | Private Boreholes | 1,848 private boreholes geocoded: 1,721 in Harare and 127 outside the city |
| FB HDX | Population density | Raster |
| World Bank from ZIMSTAT | Poverty data | N/A |