The World Bank Economic Review, 36(1), 2022, 1–18
https://doi.org/10.1093/wber/lhab008
Article
Downloaded from https://academic.oup.com/wber/article/36/1/1/6238549 by LEGVP Law Library user on 08 December 2023

Local Water Quality, Diarrheal Disease, and the Unintended Consequences of Soda Taxes
Emilio Gutierrez and Adrian Rubli

Abstract
Could taxing sugar-sweetened beverages in areas where clean water is unavailable lead to increases in diarrheal disease? An excise tax introduced in Mexico in 2014 led to a significant 6.6 percent increase in gastrointestinal disease rates in areas lacking safe drinking water throughout the first year of the tax, with evidence of a diminishing impact in the second year. Suggestive evidence of a differential increase in the consumption of bottled water by households without access to safe water two years post-tax provides a potential explanation for this declining pattern. The costs implied by these results are small, particularly compared to tax revenues and the potential public health benefits. However, these findings inform the need for accompanying soda taxes with policy interventions that guarantee safe drinking water for vulnerable populations.

JEL classification: I18, I15, Q53, H23
Keywords: water quality, sugar taxes, gastrointestinal diseases, avoidance behavior

1. Introduction
Developing countries face a disease-burden mix of acute and chronic conditions. In these settings, the need to provide clean drinking water to reduce diarrheal diseases (Kremer et al. 2011; Dupas and Miguel 2017) coexists with recent actions aimed at reducing obesity, like taxing sugar-sweetened beverages (SSBs) (Cawley 2015; Allcott, Lockwood, and Taubinsky 2019). Figure 1 presents two motivating, descriptive pictures. From a total of 39 countries that have implemented a national tax on SSBs since 1940, 31 have done so over the last decade, and 22 correspond to developing countries.
Furthermore, as expected, these countries have a higher mortality rate from gastrointestinal diseases (GIDs), which has been linked to the lower availability of safe drinking water in these settings.

Figure 1. Sugar-Sweetened Beverage Taxes around the World over Time
Source: Authors’ own analysis from data reported in Allcott, Lockwood, and Taubinsky (2019), the United Nations World Economic Situation and Prospects classification of developing countries, and diarrheal disease rates recovered from https://ourworldindata.org.
Note: Panel A shows the number of countries implementing a tax on sugar-sweetened beverages (SSBs) by year since 2011. In total, there are 31 countries implementing a tax in this time frame (versus 8 from 1940 to 2007). Countries are classified by development status. Panel B shows the age-adjusted yearly gastrointestinal disease (GID) death rates per 100,000 in countries implementing a tax on SSBs by year of implementation, using measures from 2010 and plotted on a logarithmic scale for clarity.

Emilio Gutierrez is an associate professor at Instituto Tecnologico Autonomo de Mexico (ITAM), Department of Economics, Mexico City, Mexico; his email address is emilio.gutierrez@itam.mx. Adrian Rubli (corresponding author) is an assistant professor at Instituto Tecnologico Autonomo de Mexico (ITAM), Department of Business Administration, Mexico City, Mexico; his email address is adrian.rubli@itam.mx. The authors thank Eric Edmonds (the editor) and three anonymous referees, as well as Anna Aizer, Jay Bhattacharya, Janet Currie, Andrew Foster, Joshua Graff Zivin, Michelle Marcus, Emily Oster, Jody Sindelar, and seminar participants at the Frontiers of Health Economics Research in Latin America workshop (iHEA 2017), Latin American and Caribbean Economic Association (LACEA 2017), Northeastern Universities Development Consortium (NEUDC 2018), and ITAM-UCSD conference for insightful comments and Ricardo Enrique Miranda and Daniela Soto for expert research assistance. The authors acknowledge support from the Asociación Mexicana de Cultura. A supplementary online appendix is available with this article at The World Bank Economic Review website.
© The Author(s) 2021. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Broadly speaking, the effectiveness of SSB taxes depends on consumers’ substitution patterns (Allcott, Lockwood, and Taubinsky 2019). Many empirical studies have documented substitution from SSBs to other bottled beverages (Fletcher, Frisvold, and Tefft 2010, 2013; Nakhimovsky et al. 2016). In theory, however, substitution towards water may lead to unintended costs if households lack access to safe drinking water (Roache and Gostin 2017), a common feature in developing settings.1 This paper aims to fill this gap by asking whether an SSB tax in areas where access to safe drinking water is low leads to increased diarrheal disease.
To answer the question, this study analyzes a nationwide tax on SSBs introduced in Mexico on January 1, 2014, by comparing diarrheal disease rates at public clinics over time in areas lacking access to clean water relative to those with access.2 Mexico is an ideal setting to explore this question because (a) the tax was introduced nationwide, limiting cross-border shopping, (b) many regions in Mexico still lack access to safe drinking water (CONAGUA 2016; DHAyS 2017), and (c) existing evidence suggests that consumers responded to this tax by decreasing their consumption of SSBs and increasing their (bottled and non-bottled) water consumption (Colchero et al. 2015; Grogger 2017; Colchero, Molina, and Guerrero-López 2017; Aguilar, Gutierrez, and Seira 2019).

This paper exploits health data from all public outpatient clinics, data on piped-water access at the electoral precinct level from the census, and data on surface-water quality from government monitoring stations, using Thiessen polygons to extrapolate water quality measures to the whole country.3 Areas with low access to safe drinking water are defined as those with low access to piped water and bad surface-water quality. Due to data restrictions, it is assumed that all piped water is sufficiently clean for human consumption, consistent with studies on a large piped-water chlorination program from the 1990s (Bhalotra et al. 2017).

1 There is limited empirical evidence on this. Onufrak et al. (2014) presents suggestive evidence by documenting that Hispanics in the United States who mistrust their local tap water are twice as likely to consume SSBs as those who perceive it to be clean.
2 To the best of our knowledge, the tax was not accompanied by campaigns for clean water or public service announcements with information about disinfecting water.
3 Electoral precincts are the smallest administrative unit in Mexico (over 64 thousand in 2010).
Using a difference-in-differences framework, the empirical strategy contrasts diarrheal disease rates seen at public clinics over time in facilities located in areas with low access to piped water and bad surface-water quality against all other clinics in Mexico. The econometric specifications include time-period fixed effects to account for seasonality, and estimate the impact from changes within healthcare centers over time by including clinic fixed effects. Alternative definitions of areas with poor drinking-water quality deliver similar results. Supportive evidence that the parallel pre-trends assumption holds is also shown, as well as specifications with additional controls that address potential time-varying confounders.

There is a statistically significant but localized effect of the SSB tax on diarrheal disease rates seen at public clinics of 6.6 percent in areas lacking safe drinking water during 2014 relative to 2013. Point estimates for 2015 are smaller across specifications and some are not statistically significant. To shed light on this pattern over time, section Potential Explanations for the Short-Lived Impact presents some evidence consistent with delayed avoidance behavior, with households in areas with low-quality drinking water differentially increasing their consumption of bottled beverages (mostly, bottled water) relative to other households two years post-tax. While alternative explanations cannot be ruled out, the evidence suggests that the decline in the effect two years post-tax could be due to households switching towards bottled beverages.

A simple back-of-the-envelope calculation aids in contextualizing the costs potentially implied by these findings. The main results are extrapolated to get the aggregate GIDs attributable to the tax.
Survey data provide a rough calculation of the average cost per GID episode. Hence, at most 92 thousand GID cases in the first two years could be attributed to the tax, at a cost of around USD 4.75 million. The total cost of these cases of diarrheal disease is very small relative to the SSB tax revenues and to the potential health gains. Therefore, while not an argument against SSB taxes, this paper’s findings do indicate that in contexts where individuals lack safe drinking water, introducing such policies may have unintended consequences. This issue may be more salient in countries where more households lack clean water and mortality rates due to diarrhea are higher, such as the Philippines and South Africa, where SSB taxes were introduced in 2018. The paper most similar to this study is Ritter (2019), which analyzes soda price changes across different regional markets in Peru, documenting that a decrease in soda prices is correlated with increases in soda consumption and decreases in self-reported diarrheal disease. This paper complements Ritter (2019) in at least three ways. First, the results inform about the generalizability of the relationship between GIDs and soda prices. Second, this study exploits the introduction of a tax on SSBs as the source of variation in prices and consumption. Finally, by exploiting high-frequency administrative data at a fine geographic level, the dynamics in medically diagnosed GIDs can be explored in detail.4 Additionally, this paper contributes to the literature linking local water quality to diarrheal disease (Galiani, Gertler, and Schargrodsky 2005; Cutler and Miller 2005; Gamper-Rabindran, Khan, and Timmins 2010; Garg et al. 2018) and to studies that estimate substitution patterns attributed to SSB taxes and price variations (Fletcher, Frisvold, and Tefft 2010, 2013; Finkelstein et al. 2013; Cawley et al. 2019). 
By identifying the causal link between SSBs and diarrheal disease in contexts with low water quality, this paper bridges these literatures. This connection has been overlooked since, until recently, these taxes have only been implemented and analyzed in developed countries. Thus, the findings herein also contribute by identifying this unintended consequence of a common policy.

The remainder of the paper is organized as follows. Section Background presents some context. Section Data describes the main data sources. Section Empirical Strategy introduces the identification strategy. Section First Stage Evidence of the Tax provides some first stage evidence of the tax on prices and quantities. Section Effect on Outpatient Diarrheal Disease Rates presents the main results. Section Potential Explanations for the Short-Lived Impact provides suggestive evidence consistent with avoidance behavior through differential purchases of bottled beverages. Section Discussion and Conclusion discusses the findings and concludes.

4 Related work has also analyzed the interaction between breastfeeding and water cleanliness in developing settings. Keskin, Shastry, and Willis (2017) finds that Bangladeshi women breastfeed longer in areas without access to safe drinking water, while Anttila-Hughes et al. (2018) documents increases in infant mortality rates following the introduction of baby formula in areas where availability of clean water is low.

2. Background
In Mexico, over 70 percent of adults are overweight or obese (National Health Survey ENSANUT, 2012). In late 2013, the Mexican Congress approved an excise tax (colloquially called the “soda tax”) of 1 peso (USD 0.06) per liter on SSBs as part of the Special Tax on Production and Services (IEPS, by its Spanish acronym), effective January 1, 2014.
IEPS defines SSBs as sodas, nectars, and concentrates with added sugar, and powdered drink mixes. Beverages sweetened with non-caloric sugar substitutes and dairy products were exempt from the tax.

Previous literature, focusing mostly on children, has documented that consumption of SSBs has been historically high in Mexico. Stern et al. (2014) uses dietary recall surveys in 1999 and 2012 to describe trends in caloric beverages. Barquera et al. (2010) finds that pre-school and school-aged children obtain 28 and 21 percent of their energy from caloric beverages, respectively. To complement this picture, table S1.1 in the supplementary online appendix presents some descriptives on consumption of bottled beverages pre-tax, indeed showing high purchases of SSBs across income levels.

In terms of the SSB tax, Grogger (2017) estimates a pass-through of the tax of over 100 percent.5 Existing studies calculate a 10–12 percent increase in the average price of SSBs and a 6 percent decline in consumption post-tax (Colchero et al. 2015; Aguilar, Gutierrez, and Seira 2019). Colchero et al. (2016) further shows that the consumption of taxed beverages fell by up to 9 percent among low socioeconomic status (SES) urban households, while untaxed bottled beverages increased by just 2 percent. This suggests that at least some households substituted towards non-bottled drinking water.6 While anecdotally the market for bottled water in Mexico is big, a large fraction of households rely on non-bottled water, especially low SES households (see table S1.1 in the supplementary online appendix).7

5 Total pass-through has not always been observed in other settings due to consumers making SSB purchases in other jurisdictions. See, for example, Cawley and Frisvold (2017).
6 Similar descriptive evidence based on consumption perceptions from survey data is shown in the supplementary online appendix (table S1.3).
7 See, for example, https://www.forbes.com.mx/agua-embotellada-el-negocio-multimillonario-que-mexico-no-necesita/ (accessed September 11, 2018).

3. Data
This paper relies on two main data sets. The first corresponds to medically diagnosed diarrheal disease cases at public healthcare clinics. The second aggregates census-level information on access to piped water with measures of surface-water quality from monitoring stations.

3.1. Health Outcomes
GID cases and rates at each public outpatient clinic were obtained from the Ministry of Health (SSA) for 2009 to 2015. This information is collected on a weekly basis, and contains all new diarrheal disease diagnoses at the outpatient clinic level. This dataset records doctors’ diagnoses.8 All clinics are geocoded by merging information from SSA’s publicly available Infrastructure Dataset for 2014.

8 Diagnoses are registered using ICD-10 codes. GIDs include codes from A00 to A09 according to SSA’s classification. This is consistent with the literature (in particular for Mexico, see Agüero and Beleche 2017).

The public healthcare system in Mexico is divided into separate, disjoint subsystems targeting different segments of the population. The analysis is restricted to the four principal subsystems: SSA (through Seguro Popular insurance), IMSS, IMSS-Oportunidades, and ISSSTE.9 This amounts to 15,634 clinics, with the excluded ones making up around 1 percent of public healthcare services (ENSANUT 2012). Importantly, individuals eligible for public healthcare services are assigned to and required to seek care for all non-life-threatening conditions at the clinic closest to their home. Supplementary online appendix S2 presents complementary information on public outpatient clinics.
These data effectively measure GID rates at public clinics, which is a subset of all GIDs, since some individuals may not seek medical care and others may use a private provider. Survey data show that over half of sick individuals seek medical care and around 64 percent of those seeking care go to a public clinic (table S4.1 in the supplementary online appendix). Correlates of public healthcare seeking, shown in fig. S4.1, motivate the inclusion of time-varying controls for household characteristics in a robustness check.10 The empirical strategy relies on the assumption that the probability of seeking care is not changing differentially over time across areas. Evidence in supplementary online appendix S4 supports this assumption, showing that the rate of healthcare-seeking behavior does not change with disease prevalence in this context.

3.2. Water Access and Quality
Data on households’ access to piped water is obtained from the 2010 census at the electoral precinct (sección electoral) level, the smallest administrative unit in Mexico, with a total of over 64,000 in the country. The average precinct has 383 households; the median precinct has 320. For each precinct, this dataset reports the fraction of households in 2010 obtaining their water from sources outside the home. This includes households obtaining piped water from a neighbor or a communal tap, as well as non-piped water (from vendors, surface water sources, such as rivers, lakes, and dams, or from wells). See supplementary online appendix S2 for more information.

It is assumed that all piped water in Mexico is sufficiently safe for consumption, due to data restrictions on tap-water quality, and due to evidence that a large national water program in the early 1990s saw important impacts in chlorinating piped water (Bhalotra et al. 2017).
The analysis therefore focuses on water quality from surface water sources.

Surface-water quality measures are obtained from 2,071 fresh-water monitoring stations throughout the country belonging to the regulator CONAGUA (National Water Commission). The data include a measure of biochemical oxygen demand (BOD), chemical oxygen demand (COD), and total suspended solids (TSS) per station per year from 2012 to 2014.11 For each, CONAGUA reports the precise measure, as well as a classification into five categories (very polluted, polluted, acceptable, good, and excellent), based on CONAGUA’s established thresholds. See supplementary online appendix S2 for more details on water quality.

For each public outpatient clinic, the fraction of households without access to piped water inside the home is assigned from the precinct in which the clinic is located.12 Thiessen polygons extrapolate the water quality data from the monitoring stations, effectively assigning quality measures to each clinic from the single nearest station. Collapsing to quarterly observations yields a balanced panel of 15,634 clinics × 28 quarters.

9 The Mexican Institute for Social Security (IMSS) provides healthcare for formal workers and their families; IMSS-Oportunidades is the rural branch of IMSS, linked to the cash transfer program Oportunidades; the Institute for Social Security and Services for State Workers (ISSSTE) corresponds to government workers; and Seguro Popular provides coverage for informal workers and the unemployed, through SSA’s own network of clinics and hospitals. The remaining smaller subsystems are for workers of the national oil company, the marines, and the army.
10 Overall, the estimates suggest that lower SES individuals are more likely to seek public healthcare.
11 While it is unclear how the government chose to locate these stations in 1996, official documents claim that they provide “reliable and representative information”. See http://dgeiawf.semarnat.gob.mx:8080/ibi_apps/WFServlet?IBIF_ex=D3_R_AGUA05_03&IBIC_user=dgeia_mce&IBIC_pass=dgeia_mce (accessed October 14, 2019).
12 Results are robust to assigning piped-water access from a spatially weighted average of precincts within a 2 km radius of the clinics (table S3.1 in the supplementary online appendix).

4. Empirical Strategy
Given the available water measures, the first step is to define areas with access to poor-quality drinking water. Then the empirical strategy compares the evolution of GID rates across areas over time.

4.1. Defining Areas with Low Access to Safe Drinking Water
Areas with low access to safe drinking water are defined as those without access to piped water and with bad-quality surface water. Since access and quality are measured in different units and for ease of interpretation, the main specification relies on a simple binary classification. To shed light on the underlying variation, additional regressions using alternative binary definitions of access to safe drinking water are also shown.

For the main specification, clinics are classified into low and high access to tap water based on the median of the precinct-level distribution, considering the 13,732 precincts that have at least one public outpatient clinic. Regarding surface-water quality, each measure (BOD, COD, and TSS) is averaged over 2012 and 2013, and then classified into good and bad quality based on the median of the station-level distribution. Lastly, bad surface-water quality is defined as having all three measures above the median.

This exercise considers all three available quality measures. First, there is no clear consensus in the economics literature as to which measure is best. For example, Duflo et al.
(2013) uses both BOD and COD measures for pollution in India, while Lipscomb and Mobarak (2016) and Sigman (2002) focus on BOD only in other settings. Second, including all three should minimize any noise and strengthen the signal of the true underlying water quality.13

Given these binary classifications of access to tap-water and surface-water quality, areas with low access to safe drinking water are then defined as the intersection of low access to piped water and bad surface-water quality. This yields 1,596 clinics in areas with poor drinking-water quality.

Additional specifications consider various other binary definitions. First, the official thresholds provided by the government regulator CONAGUA are used for the binary classification of water quality at each monitoring station instead of the median. Second, 2014 water quality data are included. Third, the mean of the distribution of piped-water access is used as the cutoff instead of the median. Although there are similarities in these classifications across definitions, there is also significant spatial variation (see fig. S2.10 in the supplementary online appendix).

Table 1 shows summary statistics under the main definition of areas with poor drinking-water quality. By construction, clinics differ by access to piped water and surface-water quality measures. There are also differences in terms of precinct-level sociodemographic characteristics, with clinics in areas with poor drinking-water quality significantly poorer in this regard. This is also related to the fact that these clinics are more likely to be SSA clinics. Importantly, there is no statistically significant difference in distance to the assigned monitoring station. Altogether, table 1 suggests that clinics in places with low access to safe drinking water are in low SES and more rural areas.
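The binary classification described in section 4.1 can be illustrated with a minimal sketch. It assumes hypothetical record layouts (the field names `share_no_piped`, `loc`, `bod`, `cod`, and `tss` are invented here), treats "low access" as a share strictly above the precinct median, and stands in for the Thiessen-polygon step with a nearest-station search, which assigns the same station as the polygon containing the clinic. The toy numbers are made up and are not the authors' data or code.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import mean, median

MEASURES = ("bod", "cod", "tss")


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def bad_quality_stations(stations):
    """Flag stations whose 2012-2013 average is above the station median on all three measures."""
    avg = {s["id"]: {m: mean(s[m]) for m in MEASURES} for s in stations}
    cut = {m: median(a[m] for a in avg.values()) for m in MEASURES}
    return {sid: all(a[m] > cut[m] for m in MEASURES) for sid, a in avg.items()}


def poor_drinking_water(clinics, stations):
    """Return {clinic id: True if low piped-water access AND bad surface-water quality}."""
    bad = bad_quality_stations(stations)
    # One value per precinct, so precincts with several clinics count once in the median.
    by_precinct = {c["precinct"]: c["share_no_piped"] for c in clinics}
    cutoff = median(by_precinct.values())
    flags = {}
    for c in clinics:
        # Nearest station = the station whose Thiessen polygon contains the clinic.
        nearest = min(stations, key=lambda s: haversine_km(c["loc"], s["loc"]))
        flags[c["id"]] = c["share_no_piped"] > cutoff and bad[nearest["id"]]
    return flags


# Hypothetical toy data: each measure lists the 2012 and 2013 readings.
stations = [
    {"id": "A", "loc": (19.0, -99.0), "bod": [10, 12], "cod": [40, 44], "tss": [30, 34]},
    {"id": "B", "loc": (20.0, -100.0), "bod": [40, 44], "cod": [150, 160], "tss": [200, 220]},
    {"id": "C", "loc": (21.0, -101.0), "bod": [20, 22], "cod": [60, 70], "tss": [50, 60]},
]
clinics = [
    {"id": "c1", "precinct": "p1", "share_no_piped": 0.90, "loc": (20.1, -100.1)},
    {"id": "c2", "precinct": "p2", "share_no_piped": 0.20, "loc": (19.9, -99.9)},
    {"id": "c3", "precinct": "p3", "share_no_piped": 0.95, "loc": (19.1, -99.1)},
    {"id": "c4", "precinct": "p4", "share_no_piped": 0.50, "loc": (21.0, -101.1)},
]
flags = poor_drinking_water(clinics, stations)
print(flags)
```

In this toy example only clinic `c1` is flagged: it sits in a precinct above the median share of households without piped water and its nearest station exceeds the median on all three measures. Requiring both conditions is what makes the flagged group a strict intersection, as in the main definition.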
Crucially, the identification strategy will rely on assuming (and partially testing for) similar trends over time across groups of clinics, not similar levels.

13 Results are robust to just using the BOD measures (table S3.2 in the supplementary online appendix), which are the most commonly used in the literature.

Table 1. Summary Statistics

| | Good-quality drinking water: Mean | Std. dev. | Poor-quality drinking water: Mean | Std. dev. | Difference: Diff. | Std. error |
|---|---|---|---|---|---|---|
| Panel A: Clinic outcomes and characteristics | | | | | | |
| Gastrointestinal disease rate per 100,000 | 117.52 | (186.67) | 105.25 | (180.88) | −12.28*** | (0.93) |
| IMSS clinic | 0.07 | (0.26) | 0.01 | (0.11) | −0.06*** | (0.00) |
| IMSS-Oportunidades clinic | 0.23 | (0.42) | 0.20 | (0.40) | −0.04*** | (0.00) |
| ISSSTE clinic | 0.03 | (0.18) | 0.00 | (0.05) | −0.03*** | (0.00) |
| SSA clinic | 0.66 | (0.47) | 0.79 | (0.41) | 0.13*** | (0.00) |
| Distance to monitoring station, km | 18.89 | (20.25) | 18.91 | (18.11) | 0.02 | (0.10) |
| Panel B: Characteristics at the precinct level | | | | | | |
| Fraction households without water inside | 0.53 | (0.36) | 0.88 | (0.10) | 0.34*** | (0.00) |
| Fraction households without electricity | 0.06 | (0.11) | 0.08 | (0.15) | 0.02*** | (0.00) |
| Fraction households without sewerage | 0.28 | (0.30) | 0.43 | (0.30) | 0.16*** | (0.00) |
| Fraction households without bathroom inside | 0.13 | (0.17) | 0.27 | (0.24) | 0.15*** | (0.00) |
| Average precinct population | 1,794.61 | (1,540.45) | 1,655.41 | (1,215.54) | −139.20*** | (7.54) |
| Fraction without any healthcare coverage | 0.61 | (0.26) | 0.74 | (0.24) | 0.13*** | (0.00) |
| Average years of schooling | 5.03 | (2.05) | 4.12 | (1.21) | −0.90*** | (0.01) |
| Panel C: Water quality from monitoring stations | | | | | | |
| Biochemical oxygen demand, 2012 | 17.69 | (44.43) | 36.47 | (58.72) | 18.79*** | (0.28) |
| Biochemical oxygen demand, 2013 | 21.39 | (178.21) | 39.62 | (153.06) | 18.23*** | (0.98) |
| Chemical oxygen demand, 2012 | 55.86 | (116.13) | 107.77 | (166.97) | 51.91*** | (0.74) |
| Chemical oxygen demand, 2013 | 78.02 | (393.97) | 155.81 | (405.23) | 77.79*** | (2.19) |
| Total suspended solids, 2012 | 30.78 | (75.44) | 82.25 | (140.85) | 51.47*** | (0.47) |
| Total suspended solids, 2013 | 77.62 | (207.33) | 278.40 | (413.73) | 200.78*** | (1.21) |
| River | 0.67 | (0.47) | 0.68 | (0.47) | 0.00 | (0.00) |
| Dam, lake, or lagoon | 0.26 | (0.44) | 0.31 | (0.46) | 0.04*** | (0.00) |
| Stream, canal, spring, or reservoir | 0.06 | (0.24) | 0.02 | (0.14) | −0.04*** | (0.00) |
| Observations | 393,064 | | 44,688 | | 437,752 | |
| Total clinics | 14,038 | | 1,596 | | 15,634 | |

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (CONAGUA), and the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: This table presents summary statistics at the clinic level. The mean and standard deviation for each group of clinics is shown, as well as the difference and the standard error for a test of difference in means. Panel A shows clinic-level outcomes and characteristics. Note that clinics belong to different subsystems of public healthcare (IMSS, IMSS-Oportunidades, ISSSTE, SSA) targeting different populations. Panel B shows household characteristics at the precinct level from the 2010 census. Panel C shows descriptives for water quality obtained from government monitoring stations, including the location of the stations by water source. ***p < 0.01, **p < 0.05, *p < 0.1

4.2. Econometric Specification
The analysis follows a difference-in-differences (DiD) strategy for identification:

\[
[\text{GID rate}]_{cqy} = \sum_{\tau=2009}^{2015} \beta_{\tau} \left( [\text{poor drinking water}]_{c} \times \mathbb{1}[y=\tau] \right) + \lambda_{c} + \theta_{qy} + \varepsilon_{cqy}, \tag{1}
\]

where [GID rate]_cqy is the diarrheal disease rate per 100,000 at public outpatient clinic c in quarter q of year y, [poor drinking water]_c is an indicator for being in an area with low access to clean drinking water, 1[·] is the indicator function, λ_c are clinic fixed effects, θ_qy are time-period dummies, and ε_cqy is the error term. Standard errors are clustered at the clinic level to allow for serial correlation.
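To make the interpretation of the β_τ coefficients concrete, the following sketch computes them on a made-up balanced panel. It relies on a standard simplification: with a balanced panel, a time-invariant binary indicator, clinic fixed effects, and period fixed effects, the OLS estimate of β_τ relative to an omitted base year should coincide with the change in the gap between group means. The choice of 2013 as the base year, the row layout, and all numbers are assumptions for illustration; this is not the authors' estimation code, and it produces no standard errors.

```python
from collections import defaultdict


def event_study(rows, base_year=2013):
    """Return {year: beta} for a balanced panel.

    rows: iterable of (clinic_id, year, quarter, poor_water, gid_rate),
    where poor_water is 1 for clinics flagged as lacking safe drinking
    water and 0 otherwise (hypothetical layout).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _clinic, year, _quarter, poor_water, gid_rate in rows:
        totals[(poor_water, year)] += gid_rate
        counts[(poor_water, year)] += 1

    years = sorted({year for _group, year in totals})
    # Gap between the two group means in each year.
    gap = {
        year: totals[(1, year)] / counts[(1, year)]
        - totals[(0, year)] / counts[(0, year)]
        for year in years
    }
    # Differential change in that gap relative to the base year.
    return {year: gap[year] - gap[base_year] for year in years if year != base_year}


def toy_panel():
    """Six clinics, 2012-2015, quarterly; planted effects of 5 (2014) and 2 (2015)."""
    rows = []
    for clinic in range(6):
        poor_water = 1 if clinic < 2 else 0
        for year in range(2012, 2016):
            for quarter in range(1, 5):
                # Clinic level + seasonal pattern + common yearly trend.
                rate = 100.0 + 3.0 * clinic + 2.0 * quarter + 1.5 * (year - 2012)
                if poor_water and year == 2014:
                    rate += 5.0
                if poor_water and year == 2015:
                    rate += 2.0
                rows.append((clinic, year, quarter, poor_water, rate))
    return rows


betas = event_study(toy_panel())
print(betas)
```

Because clinic levels, seasonality, and the common trend cancel in the double difference, the function recovers the planted effects of 5 and 2 and a zero coefficient for the pre-period year 2012, which mirrors the logic of the pre-trend check.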
Given that the distribution of the GID rate is heavily skewed, as shown in fig. 2, the outcome variable is winsorized at the 5 percent level throughout. As a robustness check, results are shown with the raw GID rate and with a more conservative winsorization of 1 percent, yielding similar estimates.

The coefficients of interest are given by β_τ, particularly for 2014 and 2015, as they represent differential changes in GID rates for clinics in areas with poor drinking-water quality relative to other areas for each year. The clinic and time-period fixed effects imply that the effects are estimated from changes within clinics over time, net of overall seasonal effects.

Figure 2. Distribution of Raw Gastrointestinal Disease Rate
Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA).
Note: This graph shows the distribution of the raw gastrointestinal disease rate per 100,000 at public outpatient clinics (15,634 clinics observed over 28 quarters).

Obtaining a consistent estimate relies on the identifying assumption that there are no time-varying omitted variables correlated with both GIDs and the implementation of the tax in areas identified as having low access to safe drinking water. Hence, the empirical strategy relies on assuming that absent the tax, GID rates at clinics in areas without access to safe drinking water would have followed a similar trend to the one observed at other clinics. Asking whether the β_τ coefficients pre-tax are statistically indistinguishable from zero offers a partial test of this identifying assumption.

As shown in table 1, clinics in areas with low access to safe drinking water are more likely to be in low SES and rural places.
If unobservable factors correlated with SES are changing concurrently with the tax, then the estimates may be biased. A battery of time-varying household controls in additional specifications address these concerns of differential trends in low SES areas. Specifically, an indicator for each year-quarter is interacted with indicators for being in quartile p of the precinct-level distribution of access to electricity, sewerage, having a bathroom inside the home, total population, share without any healthcare coverage, and average years of schooling.

An additional specification that includes flexible geographic controls allays concerns of any differential weather or epidemiological shocks. Indicators for each year-quarter are interacted with an indicator for a grid cell defined by latitude and longitude degrees. There are 5,098 cells in total, measuring approximately 103 by 110 km.

Additional specifications and robustness checks are also shown. First, results are similar under alternative binary definitions of areas with poor-quality drinking water, as well as under two continuous measures. Second, findings are robust to not winsorizing or using a 1 percent winsorization level, and to dropping clinics that are far away from their assigned station. Third, results hold when assigning piped-water access from precincts within a 2 km radius of the clinics, and when using only BOD measures for water quality. Lastly, supplementary analyses show that general healthcare demand-and-supply changes cannot explain the results, by calculating the effect on placebo conditions and showing that there are no differential changes in staffing and infrastructure at public clinics.

An additional concern related to measurement error comes from the fact that GIDs are only observed conditional on seeking care at a public clinic.
Although the text refers to the GID rate throughout, it is understood that this is not the overall epidemiological prevalence, since some bouts of diarrhea may not be associated with any doctor visit. This may be a concern only if the likelihood of seeking medical care at the clinic conditional on being sick changes differentially over time across clinics. Analyses of survey data discussed in supplementary online appendix S4 suggest that this is not the case.

In terms of the independent variable, clinics may be misclassified along the piped-water access and surface-water quality dimensions due to the different moments in time when the data were collected. First, over such a short time period, any gains in these dimensions must be relatively small, with a (roughly) equal probability of classifying clinics correctly or incorrectly. This would lead to attenuation bias. Second, if improvements in access and quality are indeed sufficiently large, then high-access/good-quality clinics would likely be classified as low access/bad quality. This, too, would attenuate the results.

Lastly, since 2010, Mexico has implemented many policies to tackle rising obesity rates, including introducing healthy food options at schools, regulating the marketing of high-caloric food items to children, and requiring front-of-package labeling of nutritional facts on all foods and beverages (Barquera, Campos, and Rivera 2013). However, it is unlikely that these policies, many of which are not concurrent with the tax, would have a direct effect on GIDs or would differentially affect areas classified as lacking access to safe drinking water.

5. First Stage Evidence of the Tax
Before exploring whether the soda tax resulted in increased diarrheal disease in areas lacking access to safe drinking water, it is important to establish that the tax indeed led to a decline in consumption of SSBs and an increase in the consumption of local water. Unfortunately, the latter cannot be observed since no survey collects reliable information on all water consumed and its sources. Therefore, this section relies on showing (a) an increase in the prices of SSBs, (b) a decline in purchases of SSBs, and (c) trends in purchases of bottled water.

Data on SSB and bottled water prices were obtained from the consumer price index constructed by the Mexican National Statistics Office (INEGI). These are monthly data from 2011 to 2016, covering 46 cities in Mexico. SSBs are defined here as all sodas, while water includes both regular and sparkling. The data are at the product (barcode) level, for a total of 1,464 products. The general price index is used to deflate prices to the January 2011 general price level. Figure S1.1 in the supplementary online appendix shows the distribution of these prices.

This exercise estimates the following regression:

\[
\ln \text{price}_{jkmy} = \sum_{\tau=2011}^{2016} \alpha_{\tau} \, \mathbb{1}[y=\tau] + \xi_{jk} + \omega_{m} + \nu_{jkmy}, \tag{2}
\]

where price_jkmy is the price per liter of a product j in a city k in month m of year y, ξ_jk are product-city fixed effects, ω_m are month of the year indicators, and ν_jkmy is the error term. Standard errors are clustered at the product-city level. The α_τ coefficients show how the log price changes over time, accounting for within-year seasonality and constant differences across product-city pairs. Since the tax was implemented nationally, a control group is lacking and the reader should be cautious when interpreting this evidence as causal.

The first two panels of fig. 3 show the estimates of equation (2) for SSB and bottled water prices.
Prices increased by around 10 percent for SSBs in the years after the tax relative to the excluded year of 2013 (panel A). Given the average price per liter in 2013 of 10.39 pesos, this is consistent with full pass-through of the 1 peso per liter tax. Trends for bottled water suggest a slight decline in prices during the whole 2011–2016 period, with no clear trend change in 2014 (panel B). Overall, this evidence is consistent with previous findings (Colchero et al. 2015; Grogger 2017; Aguilar, Gutierrez, and Seira 2019).

Figure 3. First Stage Evidence on Prices and Purchases of Bottled Beverages

Source: Authors’ analysis based on data from the inegi consumer price index, the 2008–2016 National Household Income and Expenditures Survey (enigh) rounds, surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).

Note: Panels A and B show trends in prices of sugar-sweetened beverages (SSBs) and bottled water before and after the tax. SSBs are defined as bottled sodas, and water includes regular and sparkling. Plots show the coefficients of a regression of the log of prices per liter (in January 2011 Mexican pesos) on year indicators, month of the year dummies, and product-city fixed effects. Average prices per liter in 2013 were 10.39 and 5.90 pesos, respectively. Panels C and D show trends in purchases of taxed beverages and bottled water. Each series corresponds to a separate regression. Taxed beverages are defined as all sodas, and energy and sports drinks. Bottled water includes regular and sparkling water. The graphs show the point estimates of a regression of the inverse hyperbolic sine of household-level purchases on survey round indicators, log income, indicators for whether the house has access to electricity, sewerage, and a bathroom inside, indicators for household head’s years of education, indicators for household size, fraction of household members that are adults, indicators for size of the locality, and municipality fixed effects. We distinguish between areas with poor- and good-quality drinking water. Survey weights are included. Error bars show 95 percent confidence intervals using robust standard errors clustered at the product-city or municipality level.

The repeated cross-sections of household data from the biennial 2008–2016 National Household Income and Expenditures Survey (enigh) rounds, representative at the national level, provide a picture of consumption changes. A disadvantage of these data relative to the data used for the main results is that each household’s location is observed only at the municipality level. However, two important advantages are that the surveys include rural areas (which are usually absent in scanner data), and access to piped water is directly observed at the household level.

Using the same definitions provided in section Empirical Strategy, households are classified by access to safe drinking water. For these data, definitions rely on piped-water access information at the household level, and water quality is assigned as outlined above, aggregating up to municipalities by taking the spatially weighted average of the intersection of Thiessen polygons and municipalities.
The outcomes of interest are weekly purchases in liters of all taxed drinks (sodas and energy drinks) and liters of bottled water (this includes all presentation sizes of regular water, including large 20-liter jugs, as well as all types of sparkling water and club soda).14 These household-level outcomes are transformed using the inverse hyperbolic sine function.15

14 The survey does not allow distinctions between regular and (untaxed) diet soda. However, sodas with artificial sweeteners represent a small fraction of total soda consumption. For example, diet sodas produced by Coca Cola in Mexico represent only 4 percent of their total soda sales (https://www.elfinanciero.com.mx/empresas/refrescos-sanos-le-dan-punch-a-coca-cola-mexico, accessed October 16, 2019). We also exclude all juices since the survey only distinguishes between “natural” and bottled juice, while the tax applies only to juices with added sugar.

15 Specifically, the inverse hyperbolic sine of $y$ is given by $\ln\left(y + \sqrt{y^{2} + 1}\right)$. Unlike the log function, this transformation is well defined at zero.

Trends over time are shown separately for households in areas with and without safe drinking water by estimating an equation similar to equation (2), including household-level controls, excluding the month indicators that are not available in the survey, and swapping the product-city fixed effects for municipality effects. Standard errors are clustered at the municipality level. The last two panels of fig. 3 present these results. Estimates for SSBs show a decline in purchases post-tax (panel C). The point estimates suggest a larger decline in areas with poor-quality drinking water, but a test does not reject the null hypothesis that the effects are the same.
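As a minimal illustration of the transformation applied to the purchase outcomes, the following Python sketch evaluates the inverse hyperbolic sine directly and compares it with the log. The function name `ihs` and the example values are assumptions made for illustration only; they are not taken from the authors' code.

```python
import math

def ihs(y):
    """Inverse hyperbolic sine, ln(y + sqrt(y^2 + 1))."""
    return math.log(y + math.sqrt(y * y + 1.0))

# Illustrative weekly purchases in liters, including a household that buys nothing.
purchases = [0.0, 1.0, 9.68]

for liters in purchases:
    # ln(2y) is the large-y approximation; it is undefined at zero, unlike ihs.
    log_approx = math.log(2 * liters) if liters > 0 else float("nan")
    print(liters, round(ihs(liters), 4), round(log_approx, 4))
```

The transformation keeps households with zero purchases in the sample, while for larger quantities it behaves much like a log, which is why coefficients are often read as approximate proportional changes.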
Consumption of bottled water appears to have increased in areas with unsafe drinking water post-tax (panel D). Overall, these findings also echo the results in the literature.

6. Effect on Outpatient Diarrheal Disease Rates

The central finding of this paper, shown in table 2, column (1), is that the GID rate increased by 7.2 cases per 100,000 in clinics in areas with poor-quality drinking water relative to other clinics, in 2014 (the first year of the tax) relative to 2013 (the excluded year). Given the mean in low-quality drinking-water clinics pre-tax, this is a differential 6.6 percent significant increase in GIDs, or an average of 40 additional cases per clinic. The estimate for 2015 is smaller and statistically indistinguishable from zero (the point estimate suggests 17.5 additional cases per clinic). The hypothesis that the post-tax estimates are equal can be rejected. The hypothesis that the pre-tax estimates are all simultaneously zero cannot, which helps to allay concerns about differential pre-trends.

Interpretation of magnitudes. This main finding can be interpreted by calculating the local average treatment effect (LATE), by dividing the estimates by the share of compliers. Since surface-water quality is the same for all households at a given location, compliers are determined by access to piped water only. Given that 88 percent of households in areas classified as having poor drinking water lack piped water inside the home (table 1), the LATE is a 7.5 percent increase in the GID rate in 2014. Using the share of households obtaining water from surface water and wells instead yields a LATE of 23 percent, while considering only surface water yields 89 percent.16 These magnitudes are in line with the literature.
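The arithmetic behind the 6.6 and 7.5 percent figures can be sketched in a few lines of Python. The coefficient (7.188) and pre-tax mean (109.6) are those reported for column (1) of table 2, and the 88 percent complier share is the figure cited from table 1; the function names are hypothetical and serve only this illustration.

```python
def percent_effect(coefficient, baseline_mean):
    """Express a difference-in-differences coefficient relative to the pre-tax mean."""
    return 100.0 * coefficient / baseline_mean

def late(intent_to_treat, complier_share):
    """Scale the average effect by the share of compliers."""
    return intent_to_treat / complier_share

itt_2014 = percent_effect(7.188, 109.6)   # cases per 100,000 over the baseline rate
late_2014 = late(itt_2014, 0.88)          # households lacking piped water inside the home

print(round(itt_2014, 1), round(late_2014, 1))  # prints 6.6 7.5
```

The smaller the assumed share of compliers, the larger the implied effect, which is why the alternative complier definitions in the text produce much larger LATE values.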
Ritter (2019) calculates that a 10 percent decline in the price of sodas in Peru is associated with a 0.5 percentage point decrease in self-reported diarrheal disease. Given the baseline prevalence, this amounts to a 71 percent decline. In a different setting but with clinic-level data, Ashraf et al. (2017) finds that for a 24-day period of piped-water outages in Lusaka, there are 23.7 additional GID cases at the local clinic, equivalent to a 16 percent increase given the baseline mean. Focusing on infant mortality, Anttila-Hughes et al. (2018) finds that the introduction of baby formula increased mortality by 10 percent, while Bhalotra et al. (2017) shows that the program that chlorinated piped water in Mexico in the early 1990s led to an average 50 percent decline in mortality, and up to 80 percent in areas with good-quality pipes.

16 The census only asks households their main source of water. We assume that any household with piped water inside will report it as their main source. However, households without piped access inside may obtain water from multiple sources. Hence, we consider the share without piped water inside to be more reliable.

Additional controls. Columns (2) and (3) of table 2 present specifications that allay concerns about potential time-varying confounders. Clinics in places with poor-quality drinking water are located in more rural, poorer areas. Therefore, column (2) includes time-varying socioeconomic controls, by interacting indicators for each period with indicators for being in quartile p of the precinct-level distribution of household access to electricity, sewerage, share with a bathroom inside the home, total population, share without any healthcare coverage, and average years of schooling. Results echo the main findings, with slightly larger coefficients: a differential 7.9 percent increase in 2014, and a significant 4.1 percent increase in 2015. Once again, one can reject that these effect sizes are equal and cannot reject that pre-tax effects are jointly zero.

To address the possibility of differential weather shocks, such as changes in rainfall or temperatures, column (3) further adds time-varying geographic controls, by interacting indicators for each period with indicators for latitude–longitude degree grid cells.17 Results are once again very similar, with a differential 6.8 percent increase in 2014 and a marginally significant 3.6 percent increase in 2015.

Table 2. Effect of the Soda Tax on Diarrheal Disease Rates

| | Main definition (1) | Main definition (2) | Main definition (3) | conagua threshold (4) | Include 2014 (5) | Mean access (6) | Single station (7) | Three stations (8) |
|---|---|---|---|---|---|---|---|---|
| Post-tax years | | | | | | | | |
| Poor drinking water × 2015 | 3.154 (2.052) | 4.528** (2.175) | 3.971* (2.294) | 5.285* (2.870) | 3.303 (2.100) | 1.833 (1.936) | 0.847*** (0.322) | 1.116*** (0.280) |
| Poor drinking water × 2014 | 7.188*** (1.662) | 8.711*** (1.755) | 7.510*** (1.842) | 4.432** (2.134) | 6.911*** (1.713) | 5.507*** (1.571) | 0.603*** (0.218) | 1.015*** (0.226) |
| Pre-tax years | | | | | | | | |
| Poor drinking water × 2012 | 2.145 (1.637) | 2.601 (1.735) | 2.725 (1.792) | 0.565 (2.127) | 1.181 (1.740) | 1.171 (1.554) | −0.116 (0.230) | −0.210 (0.206) |
| Poor drinking water × 2011 | 2.825 (1.998) | 1.651 (2.096) | 3.332 (2.162) | 0.850 (2.671) | 1.619 (2.046) | 1.793 (1.882) | −0.288 (0.281) | −0.399 (0.262) |
| Poor drinking water × 2010 | 3.226 (2.170) | 1.316 (2.297) | 1.098 (2.436) | 3.214 (2.730) | 2.770 (2.281) | 2.305 (2.043) | 0.087 (0.337) | 0.066 (0.289) |
| Poor drinking water × 2009 | −0.324 (2.351) | −1.783 (2.475) | −2.215 (2.636) | −3.058 (3.204) | 0.023 (2.458) | −0.861 (2.224) | −0.158 (0.383) | −0.337 (0.331) |
| Observations | 437,752 | 437,752 | 437,192 | 437,752 | 437,752 | 437,752 | 437,752 | 437,752 |
| Clinics in areas with poor-quality drinking water | 1,596 | 1,596 | 1,596 | 720 | 1,558 | 1,818 | | |
| R-squared | 0.800 | 0.802 | 0.814 | 0.800 | 0.800 | 0.800 | 0.800 | 0.800 |
| Household controls | | Yes | Yes | | | | | |
| Geographic controls | | | Yes | | | | | |
| Mean dependent variable | 109.6 | 109.6 | 109.6 | 94.45 | 114.4 | 110.3 | 121.5 | 121.5 |
| Coefficient tests: | | | | | | | | |
| H0: β2014 = β2015 | 0.024 | 0.024 | 0.070 | 0.742 | 0.050 | 0.028 | 0.430 | 0.675 |
| H0: βk = 0 ∀ k ≤ 2012 | 0.185 | 0.225 | 0.117 | 0.179 | 0.557 | 0.351 | 0.566 | 0.113 |

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).

Note: This table shows difference-in-differences estimates for a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters). The outcome is the clinic gastrointestinal disease rate per 100,000, winsorized at the 5 percent level. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each year are shown, with 2013 as the excluded year; standard errors are in parentheses. Household controls are indicators for each year-quarter interacted with indicators for being in quartile p of the precinct-level distribution of access to electricity, sewerage, a bathroom inside the home, total precinct population, share without any healthcare coverage, and average schooling. Geographic controls are indicators for each year-quarter interacted with indicators for latitude–longitude grid cells. Columns (1)–(3) use the main definition for areas with poor-quality drinking water. Columns (4)–(6) consider alternative binary definitions. Columns (7)–(8) construct continuous measures from the nearest monitoring station (column 7), and taking a weighted average of the three nearest stations (column 8). Robust standard errors clustered at the clinic level. The mean of the dependent variable for clinics in areas with poor-quality drinking water pre-tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1
The coefficient tests still show significantly different effects post-tax and statistically insignificant pre-tax estimates.

17 Note that including these controls results in 20 clinics dropped from the estimation procedure due to being singletons within each grid cell by quarterly date.

Alternative definitions of low access to safe drinking water. Table 2, columns (4)–(6) consider alternative binary definitions of the classification of areas lacking clean drinking water. Column (4) uses government-established thresholds for the binary classification of quality, column (5) includes 2014 data, and column (6) uses the mean to classify access to piped water. Columns (7) and (8) further consider continuous measures, by creating a continuous index using principal components analysis. Column (7) uses water quality measures from the nearest monitoring station as before, and column (8) uses the inverse-distance weighted average of the three nearest stations to each clinic. Both indices are standardized with a mean of 0 and a standard deviation of 1.

These results show that the effects are stable and similar across specifications. For the binary definitions in columns (4)–(6), the estimated effect in 2014 ranges from a 4.7 to a 6 percent differential increase in clinic GID rates, relative to the 6.6 percent at baseline. With the exception of column (4), the 2015 estimates are insignificant and statistically different from the ones for 2014. For the continuous indices, the estimates show that for a one standard deviation change in the continuous measure, GID rates in 2014 increased by 0.6 and 1.0 in columns (7) and (8), respectively. The smaller magnitudes obtained under these continuous metrics suggest that this measure is less precise in classifying exposure.18 However, estimates are positive and significant for both 2014 and 2015. Across all specifications, the pre-tax coefficients are individually and jointly indistinguishable from zero.

Figure 4. Event Study of the Effect of the Soda Tax on Diarrheal Disease Rates

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).

Note: These graphs show the coefficients from an event study for a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters). The outcome is the clinic gastrointestinal disease rate per 100,000, winsorized at the 5 percent level. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each quarter for two years before and after the tax are shown, with quarter 4 of 2013 as the excluded period. Panel A corresponds to the main definition of poor-quality drinking water, with the lighter-colored coefficient series corresponding to specifications with additional controls. Panel B contrasts the main definition with alternative binary definitions. Robust standard errors clustered at the clinic level. Error bars show 95 percent confidence intervals. The mean of the dependent variable for clinics in areas with poor-quality drinking water pre-tax is 110.

Event study plots. To visualize the dynamics of the effects more clearly, fig. 4 plots estimates from a regression similar to equation (1), interacting the indicator for low access to clean drinking water with indicators for each period instead of each year. The darker series corresponds to the baseline specification, while the lighter-colored series include additional controls and use alternative definitions of areas with poor-quality drinking water (see above).
All plots show small and insignificant coefficients pre-tax, positive and (mostly) significant estimates throughout the first five quarters post-tax, and small and indistinguishable from zero coefficients by the end of 2015. These plots corroborate the main findings, showing the dynamics of the effect post-tax and lending credibility to the identification strategy by showing the pre-tax estimates.

18 Figure S2.9 in the supplementary online appendix shows the distribution of these indices, distinguishing between clinics under the main binary definition. While there is considerable variation in the distance between the indices for the binary groups, the average difference between them as measured by the continuous indices is around 0.75. Hence, average effects under this metric are much smaller than the previous estimates.

Table 3. Robustness Checks on the Effect of the Soda Tax on Diarrheal Disease Rates

| | Baseline (1) | Winsorization: none (2) | Winsorization: 1% (3) | Excluding large distances: 99th p. (4) | Excluding large distances: 95th p. (5) |
|---|---|---|---|---|---|
| Post-tax years | | | | | |
| Poor drinking water × 2015 | 3.154 (2.052) | −1.807 (5.425) | 1.926 (3.649) | 2.741 (2.033) | 2.414 (2.067) |
| Poor drinking water × 2014 | 7.188*** (1.662) | 8.942* (4.777) | 11.888*** (3.047) | 7.156*** (1.672) | 7.068*** (1.695) |
| Pre-tax years | | | | | |
| Poor drinking water × 2012 | 2.145 (1.637) | 6.116 (5.172) | 5.481* (3.328) | 1.665 (1.639) | 1.793 (1.655) |
| Poor drinking water × 2011 | 2.825 (1.998) | 13.826** (6.679) | 7.697* (4.095) | 2.601 (2.009) | 2.733 (2.037) |
| Poor drinking water × 2010 | 3.226 (2.170) | 16.761** (7.033) | 9.464** (4.180) | 2.780 (2.173) | 2.599 (2.210) |
| Poor drinking water × 2009 | −0.324 (2.351) | 9.908 (8.652) | 3.116 (4.592) | −0.800 (2.364) | −1.110 (2.396) |
| Observations | 437,752 | 437,752 | 437,752 | 433,384 | 415,856 |
| Clinics in areas with poor-quality drinking water | 1,596 | 1,596 | 1,596 | 1,583 | 1,543 |
| R-squared | 0.800 | 0.692 | 0.774 | 0.802 | 0.802 |
| Mean dependent variable | 109.6 | 166.3 | 146.6 | 109.3 | 109.8 |
| Coefficient tests: | | | | | |
| H0: β2014 = β2015 | 0.024 | 0.013 | 0.002 | 0.012 | 0.009 |
| H0: βk = 0 ∀ k ≤ 2012 | 0.185 | 0.151 | 0.096 | 0.230 | 0.194 |

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).

Note: This table shows robustness checks on the main results for a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters). The outcome is the clinic gastrointestinal disease rate per 100,000, winsorized at the 5 percent level. Columns (2)–(3) consider alternative winsorization levels: no winsorization and 1 percent, respectively. Columns (4)–(5) exclude clinics that were assigned a monitoring station from the top 1 percent of the distance distribution (distance greater than 97 km) and the top 5 percent (57 km), respectively. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each year are shown, with 2013 as the excluded year; standard errors are in parentheses. Robust standard errors clustered at the clinic level. The mean of the dependent variable for clinics in areas with poor-quality drinking water pre-tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1

Robustness checks. Table 3 presents two robustness checks on the main effect. First, columns (2) and (3) consider alternative winsorization levels: none and 1 percent, respectively. Second, columns (4) and (5) exclude clinics above the 99th and 95th percentiles of the clinic-level distribution of distance to the assigned monitoring station, respectively. Column (1) replicates the baseline result.

Overall, these findings are consistent with the main results. Using the raw GID rate, the estimates tend to be noisier. For example, the 2014 effect is significant only at the 10 percent level. Precision increases when winsorizing at 1 percent.
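To make the role of winsorization concrete, the following Python sketch clips the tails of a list of clinic-level rates. It assumes symmetric clipping of both tails at the stated percentage, which is one common convention; the text here does not spell out the exact rule used, and the function name and sample values are hypothetical.

```python
def winsorize(values, pct):
    """Clip the lowest and highest pct percent of values to the nearest retained value.

    Assumes 0 <= pct < 50 and symmetric treatment of both tails.
    """
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * pct / 100)                 # observations clipped in each tail
    low, high = ordered[k], ordered[n - 1 - k]
    return [min(max(v, low), high) for v in values]

# Hypothetical GID rates per 100,000 with one extreme clinic-quarter.
rates = [10] * 19 + [1000]
print(sum(rates) / len(rates))             # raw mean is pulled up by the outlier
clipped = winsorize(rates, 5)
print(sum(clipped) / len(clipped))         # mean after winsorizing at 5 percent
```

Clipping extreme clinic-quarters reduces the influence of a few very large rates on the estimates, which is consistent with the raw-rate specification being noisier than the winsorized ones.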
However, in column (3), a test that the pre-tax coefficients are jointly zero rejects the null at the 90 percent confidence level (but not at the 95 percent).19 Results in columns (4) and (5) are very similar to the baseline estimates, and the hypothesis that pre-tax coefficients are jointly zero cannot be rejected. Supplementary online appendix S3 presents additional robustness checks.

19 Note that in both columns (2) and (3), if anything, GID rates were differentially decreasing before the tax, which could suggest that the impacts of the tax in these columns are downward biased.

Summary. Overall, table 2 and fig. 4 show that clinics in areas defined as having poor-quality drinking water experienced a differential increase in GID rates in 2014, ranging from 4.7 to 7.9 percent. The evidence for 2015 is much weaker, with coefficients that are smaller, less significant, and statistically different from the 2014 estimates. This suggests that the effect was short-lived, and section Potential Explanations for the Short-Lived Impact explores a potential mechanism for this pattern. Results are robust to the inclusion of controls and stable under alternative definitions of areas with poor-quality drinking water. Lastly, the pre-tax coefficients provide supporting evidence that the common trends assumption holds.20

7. Potential Explanations for the Short-Lived Impact

The main results show an important increase in GID rates at clinics in areas lacking safe drinking water in 2014, and a smaller and weaker effect in 2015, suggesting that the effect dissipated over time. This section explores some potential reasons for this pattern, focusing on the possibility of affected households increasing their purchases of bottled beverages, and showing suggestive evidence consistent with this idea.
However, alternative explanations cannot be ruled out.

It is plausible that individuals in areas with poor-quality drinking water learned about their local water quality after switching away from SSBs due to the tax. With knowledge of local pollution, individuals may adopt avoidance behaviors such as increasing their consumption of bottled beverages, as shown, for example, by Graff Zivin, Neidell, and Schlenker (2011).

Mirroring the first stage evidence shown in section First Stage Evidence of the Tax, this exercise also exploits data from the 2008–2016 enigh rounds and classifies households by using the household-specific information on tap-water access and summarizing the surface-water quality data at the municipality level with spatially weighted averages. To show differential patterns in purchases of bottled beverages between areas with and without access to safe drinking water, equation (1) is estimated on the repeated cross-sections, with municipality and survey-round fixed effects. The outcome variables are the inverse hyperbolic sine of weekly household purchases of liters of SSBs and bottled water.

Results are presented in table 4. Even-numbered columns include additional household-level controls. For all bottled beverages (SSB and water) in columns (1) and (2), estimates show that households in areas lacking safe drinking water did not differ in their consumption response from other households in 2014. This is consistent with all households responding in the same manner to the shock of the tax. However, coefficients for the 2016 interaction are large, positive, and significant, indicating that households in areas with poor-quality drinking water differentially increased their consumption by 18–19 percent relative to other areas. The 2014 and 2016 effects are statistically different from one another, and, reassuringly, the hypothesis that the pre-tax estimates are jointly zero cannot be rejected.
These results seem mostly driven by bottled water (columns 3 and 4), with an estimated differential increase of 17–20 percent in 2016. Altogether, the results in table 4 suggest that households in areas without access to safe drinking water differentially increased their consumption of bottled beverages two years after the tax, consistent with the decline in the effect on GIDs estimated above, and driven mostly by bottled water. Other alternative and complementary explanations cannot be ruled out. It is also possible that individuals developed tolerance or immunity to their local water contaminants, or that they learned how to treat similar bouts of diarrheal disease without visiting the local clinic. It may also be that doctors at public clinics informed patients of simple measures they can take to decrease their likelihood of infection, such as boiling water or using iodine tablets.21 Unfortunately, systematic data on individuals’ knowledge on water quality pre- and post-tax is unavailable. 20 Additional results in supplementary online appendix S5 show that the tax is not associated with any changes in GID hospitalization rates, suggesting that these additional cases of diarrheal disease were successfully contained at the out- patient level. However, these results are merely suggestive since data limitations allow us to observe hospitalizations at a subset of public hospitals only. 21 Anecdotal evidence from a few informal interviews with public clinic doctors and information from official government guidelines suggest that this is the case. We conducted 12 telephone interviews with IMSS doctors asking about com- mon recommendations, such as boiling water or using disinfectants, for GID patients. More details are available upon 16 Gutierrez and Rubli Table 4. 
Differential Changes in Consumption of Bottled Beverages SSBs and bottled water Bottled water SSBs (1) (2) (3) (4) (5) (6) Post-tax years Poor drinking water × 2016 0.167* 0.183** 0.176* 0.211** 0.050 0.039 Downloaded from https://academic.oup.com/wber/article/36/1/1/6238549 by LEGVP Law Library user on 08 December 2023 (0.090) (0.087) (0.103) (0.098) (0.052) (0.051) Poor drinking water × 2014 0.025 0.025 0.086 0.099 −0.026 −0.035 (0.097) (0.095) (0.111) (0.106) (0.060) (0.059) Pre-tax years Poor drinking water × 2010 0.026 0.038 0.071 0.100 −0.018 −0.026 (0.087) (0.087) (0.093) (0.092) (0.060) (0.059) Poor drinking water × 2008 0.012 0.030 0.036 0.068 −0.021 −0.025 (0.090) (0.088) (0.093) (0.091) (0.060) (0.058) Observations 153,064 153,064 153,064 153,064 153,064 153,064 R-squared 0.128 0.169 0.135 0.150 0.125 0.173 Controls Yes Yes Yes Mean dependent variable 9.68 9.68 6.90 6.90 2.79 2.79 Coefficient tests: H0 : β 2014 = β 2016 0.028 0.013 0.240 0.127 0.041 0.046 H0 : βk = 0 ∀ k ≤ 2012 0.947 0.906 0.708 0.544 0.938 0.897 Source: Authors’ analysis based on data from the 2008–2016 National Household Income and Expenditures Survey (enigh) rounds, surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi). Note: This table shows difference-in-differences estimates of changes in consumption of bottled beverages over time for households in areas with poor-quality drinking water versus other areas, with 2012 as the excluded year. The outcome variable is the inverse hyperbolic sine of liters purchased. The unit of observation is a household-year. Sugar-sweetened beverages (SSBs) are defined as all sodas, and energy and sports drinks. Bottled water includes regular and sparkling water. 
Even- numbered columns include household-level controls (log income, whether the house has access to electricity, sewerage, and a bathroom inside, indicators for household head’s years of education, indicators for household size, fraction of household members that are adults, and indicators for size of the locality). Robust standard errors clustered at the municipality level. The mean of the dependent variable (in liters) for households in areas with poor-quality drinking water pre-tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1 In summary, there are many potential reasons why the estimated effect of the tax on GIDs in areas with low access to safe drinking water decreased in 2015. The evidence in this section is consistent with households differentially increasing their consumption of bottled water in 2016 relative to other households. Hence, this type of avoidance behavior may have contributed to the observed decline in the effects over time. 8. Discussion and Conclusion This paper identifies a differential 4.7 to 7.5 percent significant increase in GID rates in areas with low access to safe drinking water during the first year of the soda tax in Mexico across various specifications, and a differential 1.7 to 5.6 percent increase in the second year, that is generally statistically insignificant. Based on the point estimates from the main specification (table 2 column 1), this amounts to 64,000 additional cases in the whole country in 2014, and 28,000 in 2015 (40 additional GID cases per clinic in 2014, and 17.5 in 2015). Performing a simple back-of-the-envelope calculation contrasts the cost of the increase in GIDs relative to the observed tax revenues, and relative to the potential health gains from the tax, in order to put these findings into perspective. On average, GID patients at public clinics pay 41 pesos in transportation, fees, and medicines, and spend 129 minutes getting to the clinic, waiting, and with the doctor (ensanut 2012). 
With an average hourly wage of 33.40 pesos (National Occupation and Employment Survey, ENOE 2014), this amounts to 112.81 pesos per episode. An additional upper bound of two full days of unpaid sick leave implies a total cost of 647 pesos (USD 52) per GID episode. Given the estimates, the total cost of the SSB tax due to GIDs during 2014–2015 was 59.4 million pesos (USD 4.75 million), or 0.33 percent of the government’s revenue from this tax in 2014 alone.22 Alternatively, this represents only 0.05 percent of the yearly obesity cost in Mexico, estimated at 120 billion pesos (Molina et al. 2015).

Overall, these findings are not an argument against implementing SSB taxes. Nevertheless, they do inform the need for accompanying them with targeted policies that guarantee safe drinking water for vulnerable populations. This insight may matter even more in settings where access to clean water is lower than in Mexico, such as in the Philippines and South Africa, where SSB taxes were introduced in 2018.

This paper thus emphasizes the unintended consequences of SSB taxes in areas where the local water supply is unsafe for consumption. These results contribute to the literature by showing a causal link between an SSB tax and increased diarrheal disease in areas without access to clean drinking water, and by documenting suggestive evidence consistent with households responding by increasing their consumption of bottled beverages to avoid the local unsafe drinking water.

request. For government guidelines, see, for example, https://www.gob.mx/salud/prensa/la-secretaria-de-salud-emite-recomendaciones-para-evitar-enfermedades-diarreicas-colera-y-golpe-de-calor (accessed August 30, 2017).

References

Agüero, J. M., and T. Beleche. 2017.
“Health Shocks and their Long-Lasting Impact on Health Behaviors: Evidence from the 2009 H1N1 Pandemic in Mexico.” Journal of Health Economics 54: 40–55.
Aguilar, A., E. Gutierrez, and E. Seira. 2021. “The Effectiveness of Sin Food Taxes: Evidence from Mexico.” Journal of Health Economics, forthcoming.
Allcott, H., B. B. Lockwood, and D. Taubinsky. 2019. “Should We Tax Sugar-Sweetened Beverages? An Overview of Theory and Evidence.” Journal of Economic Perspectives 33 (2): 202–27.
Anttila-Hughes, J. K., L. C. Fernald, P. J. Gertler, P. Krause, and B. Wydick. 2018. “Mortality from Nestle’s Marketing of Infant Formula in Low and Middle-Income Countries.” Working Paper No. w24452, National Bureau of Economic Research.
Ashraf, N., E. Glaeser, A. Holland, and B. M. Steinberg. 2021. “Water, Health and Wealth: The Impact of Piped Water Outages on Disease Prevalence and Financial Transactions in Zambia.” Economica, forthcoming.
Barquera, S., F. Campirano, A. Bonvecchio, L. Hernández-Barrera, J. A. Rivera, and B. M. Popkin. 2010. “Caloric Beverage Consumption Patterns in Mexican Children.” Nutrition Journal 9 (1): 47.
Barquera, S., I. Campos, and J. Rivera. 2013. “Mexico Attempts to Tackle Obesity: The Process, Results, Push Backs and Future Challenges.” Obesity Reviews 14 (S2): 69–78.
Bhalotra, S. R., A. Diaz-Cayeros, G. Miller, A. Miranda, and A. S. Venkataramani. 2017. “Urban Water Disinfection and Mortality Decline in Developing Countries.” Working Paper No. w23239, National Bureau of Economic Research.
Cawley, J. 2015. “An Economy of Scales: A Selective Review of Obesity’s Economic Causes, Consequences, and Solutions.” Journal of Health Economics 43: 244–268.
Cawley, J., and D. E. Frisvold. 2017. “The Pass-Through of Taxes on Sugar-Sweetened Beverages to Retail Prices: The Case of Berkeley, California.” Journal of Policy Analysis and Management 36 (2): 303–326.
Cawley, J., D. Frisvold, A. Hill, and D. Jones. 2019.
“The Impact of the Philadelphia Beverage Tax on Purchases and Consumption by Adults and Children.” Journal of Health Economics 67: 102225.
Colchero, M. A., M. Molina, and C. M. Guerrero-López. 2017. “After Mexico Implemented a Tax, Purchases of Sugar-Sweetened Beverages Decreased and of Water Increased: Difference by Place of Residence, Household Composition, and Income Level.” The Journal of Nutrition 147 (8): 1552–1557.
Colchero, M. A., B. M. Popkin, J. A. Rivera, and S. W. Ng. 2016. “Beverage Purchases from Stores in Mexico under the Excise Tax on Sugar Sweetened Beverages: Observational Study.” BMJ 352: h6704.

22 The SSB tax revenue in 2014 was 18 billion pesos (http://finanzaspublicas.hacienda.gob.mx/es/Finanzas_Publicas/Estadisticas_Oportunas_de_Finanzas_Publicas, accessed May 13, 2017).

Colchero, M. A., J. C. Salgado, M. Unar-Munguía, M. Molina, S. Ng, and J. A. Rivera-Dommarco. 2015. “Changes in Prices after an Excise Tax to Sweetened Sugar Beverages Was Implemented in Mexico: Evidence from Urban Areas.” PLoS One 10 (12): e0144408.
CONAGUA. 2016. “Atlas del Agua en Mexico.” Technical Report, National Water Commission in Mexico, Comision Nacional del Agua.
Cutler, D., and G. Miller. 2005. “The Role of Public Health Improvements in Health Advances: The Twentieth-Century United States.” Demography 42 (1): 1–22.
DHAyS. 2017. “Informe Sobre Violaciones a Los Derechos Humanos Al Agua Potable y Al Saneamiento en Mexico.” Technical Report.
Duflo, E., M. Greenstone, R. Pande, and N. Ryan. 2013. “Truth-Telling by Third-Party Auditors and the Response of Polluting Firms: Experimental Evidence from India.” Quarterly Journal of Economics 128 (4): 1499–545.
Dupas, P., and E. Miguel. 2017. “Impacts and Determinants of Health Levels in Low-Income Countries.” Handbook of Economic Field Experiments 2: 3–93.
Finkelstein, E. A., C.
Zhen, M. Bilger, J. Nonnemaker, A. M. Farooqui, and J. E. Todd. 2013. “Implications of a Sugar-Sweetened Beverage (SSB) Tax When Substitutions to Non-Beverage Items Are Considered.” Journal of Health Economics 32 (1): 219–39.
Fletcher, J. M., D. E. Frisvold, and N. Tefft. 2010. “The Effects of Soft Drink Taxes on Child and Adolescent Consumption and Weight Outcomes.” Journal of Public Economics 94 (11): 967–74.
Fletcher, J., D. Frisvold, and N. Tefft. 2013. “Substitution Patterns Can Limit the Effects of Sugar-Sweetened Beverage Taxes on Obesity.” Preventing Chronic Disease 10: e38.
Galiani, S., P. Gertler, and E. Schargrodsky. 2005. “Water for Life: The Impact of the Privatization of Water Services on Child Mortality.” Journal of Political Economy 113 (1): 83–120.
Gamper-Rabindran, S., S. Khan, and C. Timmins. 2010. “The Impact of Piped Water Provision on Infant Mortality in Brazil: A Quantile Panel Data Approach.” Journal of Development Economics 92 (2): 188–200.
Garg, T., S. E. Hamilton, J. P. Hochard, E. P. Kresch, and J. Talbot. 2018. “(Not So) Gently Down the Stream: River Pollution and Health in Indonesia.” Journal of Environmental Economics and Management 92: 35–53.
Graff Zivin, J., M. Neidell, and W. Schlenker. 2011. “Water Quality Violations and Avoidance Behavior: Evidence from Bottled Water Consumption.” American Economic Review 101 (3): 448–53.
Grogger, J. 2017. “Soda Taxes and the Prices of Sodas and Other Drinks: Evidence from Mexico.” American Journal of Agricultural Economics 99 (2): 481–98.
Keskin, P., G. K. Shastry, and H. Willis. 2017. “Water Quality Awareness and Breastfeeding: Evidence of Health Behavior Change in Bangladesh.” Review of Economics and Statistics 99 (2): 265–80.
Kremer, M., J. Leino, E. Miguel, and A. P. Zwane. 2011. “Spring Cleaning: Rural Water Impacts, Valuation, and Property Rights Institutions.” Quarterly Journal of Economics 126 (1): 145–205.
Lipscomb, M., and A. M. Mobarak. 2016.
“Decentralization and Pollution Spillovers: Evidence from the Re-Drawing of County Borders in Brazil.” Review of Economic Studies 84 (1): 464–502.
Molina, H. S., I. A. Pérez, A. A. Alonso, J. P. D. Martínez, M. P. Castellanos, C. F. del Valle Laisequilla, and J. G. R. García et al. 2015. “Carga Económica de la Obesidad y sus Comorbilidades en Pacientes Adultos en México.” PharmacoEconomics Spanish Research Articles 12 (4): 115–22.
Nakhimovsky, S. S., A. B. Feigl, C. Avila, G. O’Sullivan, E. Macgregor-Skinner, and M. Spranca. 2016. “Taxes on Sugar-Sweetened Beverages to Reduce Overweight and Obesity in Middle-Income Countries: A Systematic Review.” PLoS One 11 (9): e0163358.
Onufrak, S. J., S. Park, J. R. Sharkey, and B. Sherry. 2014. “The Relationship of Perceptions of Tap Water Safety with Intake of Sugar-Sweetened Beverages and Plain Water among US Adults.” Public Health Nutrition 17 (01): 179–85.
Ritter, P. I. 2020. “Soda Consumption in the Tropics: The Trade-Off between Obesity and Diarrhea in Developing Countries.” Working Paper No. 2018-16, University of Connecticut, Department of Economics.
Roache, S. A., and L. O. Gostin. 2017. “The Untapped Power of Soda Taxes: Incentivizing Consumers, Generating Revenue, and Altering Corporate Behavior.” International Journal of Health Policy and Management 6 (9): 489.
Sigman, H. 2002. “International Spillovers and Water Quality in Rivers: Do Countries Free Ride?” American Economic Review 92 (4): 1152–9.
Stern, D., C. Piernas, S. Barquera, J. A. Rivera, and B. M. Popkin. 2014. “Caloric Beverages Were Major Sources of Energy among Children and Adults in Mexico, 1999–2012.” Journal of Nutrition 144 (6): 949–56.
Supplementary Online Appendix

Local Water Quality, Diarrheal Disease, and the Unintended Consequences of Soda Taxes

Emilio Gutierrez and Adrian Rubli

Appendix S1 summarizes the existing literature on the effects of the 2014 soda tax in Mexico and shows descriptive evidence on perceived consumption changes from the national health survey. Appendix S2 looks more carefully at our data sources, presenting various descriptives and detailing the variation that we exploit and how our final dataset at the clinic level relates to the multiple water measures. Appendix S3 presents additional robustness checks on our main results. We also present regressions that verify that clinic-level staffing and infrastructure are not changing differentially in areas with poor drinking-water quality with the introduction of the tax. Appendix S4 presents evidence to support the claim that our measure of gastrointestinal diseases (GIDs), which relies on clinic reports and not the total epidemiological prevalence, is not introducing a bias in our estimates. Appendix S5 presents estimates on hospitalization outcomes, suggesting that the effects of our main finding are fully contained at the outpatient level.

S1. Additional First Stage Evidence of the Soda Tax

Table S1.1 shows summary statistics for Mexican households’ purchases of both taxed drinks and bottled water prior to the tax, using data from the 2008, 2010, and 2012 National Household Income and Expenditures Survey (ENIGH) rounds. Taxed drinks include all sodas and energy drinks; bottled water includes all presentation sizes of bottled water and club soda. Areas with poor water quality are defined as in the main estimation.
Table S1.1 shows that 64 percent of households in areas not defined as lacking access to clean water and 57 percent in areas with poor water quality purchased soda in the week prior to the survey, compared to 33 and 26 percent of households making purchases of bottled water, respectively. This is consistent with our areas without access to clean drinking water being lower socioeconomic status (SES) areas. Households in poor water quality areas purchased about 20 percent fewer liters of sugar-sweetened beverages (SSBs) and 30 percent fewer liters of bottled water. These statistics show that not all households purchase bottled water, and less so if they are in areas without access to clean drinking water. Note that in contrast to the ENIGH data, retail panels typically do not include rural areas, where bottled water consumption is less common.

Table S1.1. Pre-Tax Purchases of Beverages

                                          Drinking-water quality
                                          Poor        Good        Difference
Fraction with taxed drinks purchases      0.57        0.64        0.07***
                                          (0.49)      (0.48)      (0.00)
Taxed drinks purchased (L)                2.75        3.42        0.67***
                                          (4.39)      (4.67)      (0.00)
Taxed drinks, if purchased (L)            4.82        5.39        0.57***
                                          (4.87)      (4.87)      (0.00)
Fraction with bottled water purchases     0.26        0.33        0.07***
                                          (0.44)      (0.47)      (0.00)
Bottled water purchased (L)               6.98        10.04       3.06***
                                          (16.71)     (22.40)     (0.01)
Bottled water, if purchased (L)           26.66       30.47       3.81***
                                          (23.27)     (30.01)     (0.02)
Observations                              6,772       58,879      65,651

Source: Authors’ analysis based on data from the 2008–2016 National Household Income and Expenditures Survey (ENIGH) rounds, surface-water quality measures collected by the National Water Commission (CONAGUA), and the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: This table shows weekly purchases of beverages in liters (L) by households prior to the sugar-sweetened beverage (SSB) tax. We show the fraction of households with positive purchases, the average amount purchased, and the average amount conditional on purchasing a positive quantity. Taxed drinks include all sodas and energy drinks. Bottled water includes all bottled water and club soda. We present statistics for households in areas with poor and good quality drinking water. Survey weights are included in the calculations. The mean and standard deviation are shown. Stars denote significance from a difference in means test. ***p < 0.01, **p < 0.05, *p < 0.1

Table S1.2. Previously Estimated Impacts of the SSB Tax in Mexico on SSB Prices and Consumption

Aguilar, Gutierrez, and Seira (2019)
  Short-run effects: 100% pass-through of the tax on prices, and 6% decrease in purchases throughout 2014; an initial sharp drop in consumption is followed by an increasing trend
  Long-run effects: N.A.
Colchero, Molina, and Guerrero-López (2017)
  Short-run effects: 6.3% reduction in purchases and 2% reduction in probability of purchasing SSBs during 2014
  Long-run effects: N.A.
Colchero et al. (2016)
  Short-run effects: average 6% reduction in purchases throughout 2014, decreasing at an increasing rate with a 12% decline in December 2014
  Long-run effects: N.A.
Colchero et al. (2015)
  Short-run effects: 0.95–1.12 pesos per liter increase in SSB prices, 11% increase in carbonated SSB prices throughout 2014; evidence of full pass-through of the tax
  Long-run effects: N.A.
Grogger (2017)
  Short-run effects: 12.3–14.1% price increase during 2014; over 100% pass-through
  Long-run effects: Estimates hold up to June 2015, no increasing or decreasing trend

Source: Authors’ compilation and classification of related literature.
Note: This table summarizes the main findings of the papers that have studied the effect of the sugar-sweetened beverage (SSB) tax in Mexico on both prices and consumption of SSBs. We distinguish between short- and long-run effects (roughly one year after the implementation of the tax). Cells with “N.A.” indicate that the paper did not estimate those effects.
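The comparisons drawn from table S1.1 can be checked with a few lines of arithmetic. The sketch below copies the reported means from the table and computes two things: the percentage gap in liters purchased between poor- and good-quality areas, and the product of the purchase share and the conditional mean, which should roughly match the unconditional mean. The variable and function names are illustrative, and small discrepancies are expected because the published figures are rounded.

```python
# Means copied from table S1.1 (weekly liters per household, pre-tax).
table = {
    "poor": {"ssb_share": 0.57, "ssb_liters": 2.75, "ssb_liters_if_any": 4.82,
             "water_share": 0.26, "water_liters": 6.98, "water_liters_if_any": 26.66},
    "good": {"ssb_share": 0.64, "ssb_liters": 3.42, "ssb_liters_if_any": 5.39,
             "water_share": 0.33, "water_liters": 10.04, "water_liters_if_any": 30.47},
}


def percent_fewer(poor_value, good_value):
    """Percentage by which the poor-area mean falls short of the good-area mean."""
    return 100 * (good_value - poor_value) / good_value


ssb_gap = percent_fewer(table["poor"]["ssb_liters"], table["good"]["ssb_liters"])
water_gap = percent_fewer(table["poor"]["water_liters"], table["good"]["water_liters"])

# Unconditional mean implied by (share purchasing) x (mean among purchasers).
implied = {
    (area, drink): row[f"{drink}_share"] * row[f"{drink}_liters_if_any"]
    for area, row in table.items()
    for drink in ("ssb", "water")
}

print(f"SSB liters gap:   {ssb_gap:.1f} percent")
print(f"Water liters gap: {water_gap:.1f} percent")
for (area, drink), value in implied.items():
    reported = table[area][f"{drink}_liters"]
    print(f"{area:>4} {drink:<5} implied {value:.2f} vs reported {reported:.2f}")
```

The two gaps round to the 20 and 30 percent quoted in the text, and the implied unconditional means sit within rounding distance of the reported ones.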
We present additional supporting evidence for the first stage effect of the soda tax. First, we rely on previously estimated impacts. Second, we present some descriptives from the National Health Survey (ENSANUT).

For our literature review, we were able to identify five papers that analyzed the effects of the tax on prices and consumption. A summary of the main findings is shown in table S1.2. Altogether, these studies consistently find a 100 percent pass-through of the tax on SSB prices throughout the first year of the tax. There is little evidence on the long-run effects on prices. Grogger (2017) finds estimates for the first six months of 2015 that are similar to the full pass-through of 2014.

We complement this information by presenting the distribution of SSB prices from 2011 to 2016, using data from the Mexican consumer price index, in fig. S1.1. We also plot the distribution for bottled water prices. Figure S1.1 shows that the median price per liter of SSBs was 10.77 pesos pre-tax and 11.84 pesos post-tax, consistent with the 1 peso per liter tax. The median price per liter of bottled water is very similar over time, with 5.82 pesos pre-tax and 5.69 post-tax. The lower prices of bottled water mostly reflect per liter prices of the large presentations, such as 20 liter jugs.

In terms of consumption, findings are consistent across studies with an estimated 6 percent decline in SSB purchases during 2014. However, there are mixed results in terms of the dynamics of this effect throughout the first year. Using an extrapolation method, Colchero et al. (2016) find that there is an initial drop in consumption that gets larger over time. On the other hand, Aguilar, Gutierrez, and Seira (2019) implement a synthetic control and find the opposite: after an initial sharp drop, the decrease in SSB purchases gets smaller in absolute value over time. A large drawback is that neither study analyzes data after 2014, limiting our sense of how SSB purchases were affected by the tax in the long run.

Identifying the long-run effects of a nationwide policy is not a simple task. This is likely the reason why studies have not attempted to calculate them.

Figure S1.1. Distribution of Prices of Bottled Beverages
Source: Authors’ analysis based on data from the National Statistics Office (INEGI) consumer price index.
Note: These graphs show the distribution of prices for sugar-sweetened beverages (SSBs) and bottled water. SSBs are defined as bottled sodas, and water includes regular and sparkling. Prices per liter are measured in constant January 2011 Mexican pesos. Each plot distinguishes between pre-tax (2011 to 2013) and post-tax (2014 to 2016) years. The median price pre- and post-tax is shown.

Table S1.3. Distribution of Perceptions of Consumption Changes after the Tax

                                      Water consumption (in percent)
SSB consumption                       Went down   The same    Went up     Total
Panel A: High Socioeconomic Status
  Went down                           2           9           34          45
  The same                            3           19          18          40
  Went up                             5           5           5           15
  Total                               10          33          57
Panel B: Low Socioeconomic Status
  Went down                           3           12          23          38
  The same                            3           31          13          47
  Went up                             4           6           5           15
  Total                               10          49          41

Source: Authors’ analysis based on data from the 2016 National Health Survey (ENSANUT).
Note: This table shows the distribution of individuals answering questions about how they have perceived changes in their consumption of both water and sugar-sweetened beverages (SSBs) during the two years after the tax. Panel A shows high socioeconomic status individuals, corresponding to those in the top tercile (definition provided in the survey). Panel B shows the bottom tercile. Survey weights are included in the calculations.
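The joint distribution in table S1.3 can be turned into two conditional shares that summarize substitution between SSBs and water. The sketch below uses the cell percentages from the table. The nested-dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Cell percentages from table S1.3.
# Outer key: perceived change in SSB consumption.
# Inner key: perceived change in water consumption.
panels = {
    "high_ses": {
        "down": {"down": 2, "same": 9, "up": 34},
        "same": {"down": 3, "same": 19, "up": 18},
        "up": {"down": 5, "same": 5, "up": 5},
    },
    "low_ses": {
        "down": {"down": 3, "same": 12, "up": 23},
        "same": {"down": 3, "same": 31, "up": 13},
        "up": {"down": 4, "same": 6, "up": 5},
    },
}


def substitution_shares(panel):
    """Conditional shares involving the cell 'SSB down and water up'."""
    both = panel["down"]["up"]
    water_up_total = sum(row["up"] for row in panel.values())
    ssb_down_total = sum(panel["down"].values())
    return {
        # Among those whose water intake rose, the share who cut SSBs.
        "cut_ssb_given_water_up": both / water_up_total,
        # Among those who cut SSBs, the share whose water intake rose.
        "water_up_given_cut_ssb": both / ssb_down_total,
    }


shares = {name: substitution_shares(panel) for name, panel in panels.items()}
for name, result in shares.items():
    print(name, {key: round(value, 3) for key, value in result.items()})
```

In both terciles more than half of those reporting higher water consumption also report lower SSB consumption, and both shares are somewhat smaller in the low SES panel.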
Overall, the existing literature allows us to confidently infer that the tax had a strong effect of around 10 percent on prices, as well as a 6 percent decrease in purchases of SSBs during 2014, the first year of the tax. However, we are less sure about effects in 2015, where the literature is silent with respect to consumption effects, although at least it seems that pass-through was maintained.

We complement this with one last piece of evidence that may indicate substitution patterns between soda and water by presenting summary statistics from the 2016 ENSANUT. This was a special round of the usual health survey administered every six years. This particular round focused on questions regarding consumption of beverages, which had not been previously recorded. This survey asked about perceived changes in consumption after the implementation of the tax. Although we recognize that this is an imperfect measure, we believe that it sheds light on the possibility that water and SSBs are substitutes to some degree in our context.

Table S1.3 shows the distribution of individuals reporting that their water and SSB consumption went down, stayed the same, or went up, for the top and bottom SES terciles as registered in the survey. Note that we are unable to identify areas with and without access to clean drinking water from the data in this survey, but we take SES terciles as a crude approximation. This table indicates that the majority of the increase in water consumption in the two years after the tax was implemented corresponds to individuals who decreased their SSB consumption. Furthermore, the numbers suggest that substitution occurred across all SES groups, although perhaps to a lesser degree among low SES individuals.

S2. Additional Information on Data

This section provides more details on the data used in the main text.
We first discuss piped-water access, then surface-water quality measures, health outcomes at the public outpatient clinics, and finally the assignment of water variables to the healthcare facilities.

S2.1. Access to Piped Water

Table S2.1 shows descriptive statistics at the electoral precinct level for the variables measuring access to piped water in the 2010 census for the total number of precincts in Mexico. Households are asked to report the main source of water they use. This table shows that on average 63 percent of households within a precinct have access to piped water inside their home. The remaining 37 percent is then broken down by other sources: around 24 percent obtain piped water from neighbors or a communal tap, 1 percent buy water from vendors, and the remaining 12 percent use ground and surface water from wells (8 percent) and from rivers, lakes and dams (3 percent). We also show similar descriptives for the subset of precincts used in our estimating sample (i.e., precincts with a public clinic). On average, there is a larger fraction of households in our sample precincts obtaining water from outside the home. We interpret this as suggesting that public clinics are not skewed towards locating in very urban and developed areas.

Figure S2.1 shows histograms of lack of access to piped water inside the home, using precinct-level data from the 2010 census. Not all precincts have a public outpatient clinic. For our main sample, we have 13,732 precincts. These plots show the distribution for our sample. Figure S2.1a corresponds to the density of the fraction of households in an electoral precinct without piped water at home. This distribution has two humps, one at each extreme. This is consistent with a large number of precincts having full or almost full access to piped water, with another relatively big mass of precincts that have almost no access to piped water at home. The latter are mostly rural. Figure S2.1b shows the same distribution over the number of households instead of the fraction. Figure S2.2 shows a map of our access to piped water variable at the precinct level, stratifying by quartiles. Overall, the map shows considerable spatial variation in piped-water access.

Table S2.1. Access to Piped Water at the Precinct Level

                                                All precincts            Precincts in sample
                                                Mean      Std. dev.      Mean      Std. dev.
Percentage of households getting water from:
  Sources outside the home                      37.03     (36.83)        57.94     (35.40)
    Piped water from neighbors/communal tap     24.10     (28.14)        38.21     (31.06)
    Water from vendors                          1.24      (6.26)         1.29      (5.90)
    Ground and surface water                    11.70     (24.50)        18.45     (27.59)
      Wells                                     8.47      (19.71)        13.12     (22.65)
      Rivers, lakes and dams                    3.22      (11.59)        5.33      (13.87)
Total observations                              64,532                   13,732

Source: Authors’ analysis based on data from the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: This table shows precinct-level averages for access to piped water according to the 2010 census. Precincts in the sample correspond to those that have at least one public outpatient clinic.

Figure S2.1. Distribution of Piped-Water Access
Source: Authors’ analysis based on data from the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: These graphs show histograms for piped-water access for the estimating sample. The graph on the left shows the distribution of the fraction of households in an electoral precinct that do not have access to piped water inside the home for all precincts. The graph on the right shows the number of households in a precinct without access to piped water inside the home (i.e., the fraction of households without access multiplied by the number of households in each electoral precinct). The thick dashed line represents the median of the distributions, and the thin dashed line corresponds to the mean.

Figure S2.2. Geographic Distribution of Access to Piped Water
Source: Authors’ analysis based on data from the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: This map shows the geographic distribution of the piped-water access variable, corresponding to the fraction of households in an electoral precinct without access to piped water inside the home. The map splits the data into quartiles.

Figure S2.3. Water Monitoring Stations and Thiessen Polygons
Source: Authors’ analysis based on surface water-quality measures collected by the National Water Commission (CONAGUA).
Note: This map shows the 2,071 monitoring stations in the sample, as well as their corresponding Thiessen polygons.

Figure S2.4. Water Sources for Monitoring Stations in Sample
Source: Authors’ analysis based on surface water-quality measures collected by the National Water Commission (CONAGUA).
Note: This graph shows the number of monitoring stations that correspond to each water source where quality is being measured.

S2.2. Surface-Water Quality

As described in the main text, our data for surface-water quality come from monitoring stations belonging to the Mexican government. While there are over 3,000 stations, we discard those that are located at salt-water sources. In total, we are left with 2,071 monitoring stations in our main sample, from which we construct our Thiessen polygons. Figure S2.3 shows a map with the location of these monitoring stations, as well as the corresponding Thiessen polygons.
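A Thiessen (Voronoi) polygon contains every location that is closer to its own station than to any other station, so finding the polygon a clinic falls in is equivalent to finding its nearest station. The sketch below illustrates that equivalence with hypothetical coordinates on a projected plane. The station and clinic names, the coordinates, and the use of straight-line distance are all assumptions for illustration, not the procedure used in the paper.

```python
import math

# HYPOTHETICAL coordinates on a projected plane (for example, kilometers).
stations = {
    "station_a": (0.0, 0.0),
    "station_b": (10.0, 0.0),
    "station_c": (0.0, 10.0),
}
clinics = {
    "clinic_1": (1.0, 1.0),
    "clinic_2": (8.0, 2.0),
    "clinic_3": (2.0, 9.0),
}


def assign_station(point, station_coords):
    """Return (station_id, distance) for the station nearest to `point`.

    The nearest station is, by construction, the one whose Thiessen polygon
    contains the point.
    """
    station_id = min(station_coords, key=lambda sid: math.dist(point, station_coords[sid]))
    return station_id, math.dist(point, station_coords[station_id])


assignment = {cid: assign_station(xy, stations) for cid, xy in clinics.items()}
for clinic_id, (station_id, distance) in assignment.items():
    print(f"{clinic_id} -> {station_id} ({distance:.2f} units away)")
```

Keeping the distance alongside the assigned station makes it straightforward to drop clinics that lie far from any station, as in a robustness check based on distance percentiles.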
The higher density of stations seems to coincide with higher population densities in the central parts of the country.

These stations are located across a variety of water sources. Figure S2.4 shows the distribution of these water sources for our monitoring stations. The graph shows that the majority (62 percent) of monitoring stations are located at rivers. This is followed by dams (15 percent), lagoons (9 percent), and lakes (9 percent).

Table S2.2 shows descriptive statistics of the surface-water quality measures at these stations. As described in the main text, there are three measures per monitoring station: biochemical oxygen demand (BOD), chemical oxygen demand (COD), and total suspended solids (TSS). The mean, standard deviation, and median for each year (2012, 2013, and 2014) are shown for each measure. Note that some stations have missing values for some measures in some years. We also report summary statistics for the 2012–2013 average and the three-year average of each measure.

Figure S2.5 shows how each measure has changed over time. Each plot corresponds to one of the three measures, and shows point estimates from a regression of the continuous measure on year indicators and monitoring-station fixed effects. The plot for BOD shows that there was not a lot of change in this measure over the course of these three years. The plot for COD shows an increase from 2012 to 2013. Lastly, the plot for TSS shows the most variation, with an increase from 2012 to 2013, followed by a decline in 2014.

The three maps in fig. S2.6 show the Thiessen polygons constructed for each station, as well as the spatial distribution of the BOD, COD, and TSS measures, taking the average over 2012 and 2013.

Table S2.2. Summary Statistics of Surface-Water Quality

                                      Mean      Std. Dev.   Median    Obs.
Biochemical oxygen demand (mg/L)
  2012                                15.70     38.90       4.00      1,694
  2013                                17.29     115.44      4.90      1,876
  2014                                16.65     107.82      5.00      1,906
  2012–2013 average                   17.30     108.12      5.20      1,955
  Three-year average                  17.51     109.25      5.23      1,914
Chemical oxygen demand (mg/L)
  2012                                54.99     107.71      22.00     1,696
  2013                                72.34     278.24      35.60     1,879
  2014                                70.07     238.75      37.20     1,905
  2012–2013 average                   68.76     247.71      33.88     1,956
  Three-year average                  69.06     249.59      34.07     1,915
Total suspended solids (mg/L)
  2012                                42.82     104.76      18.00     1,843
  2013                                86.03     172.09      34.00     2,014
  2014                                72.16     231.73      27.00     2,071
  2012–2013 average                   69.20     144.38      31.33     2,119
  Three-year average                  69.49     144.95      32.00     2,071

Source: Authors’ analysis based on surface-water quality measures collected by the National Water Commission (CONAGUA).
Note: This table shows descriptives for the surface-water quality measures across 2,071 monitoring stations used in the main sample.

Figure S2.5. Evolution of Water Quality Measures over Time
Source: Authors’ analysis based on surface-water quality measures collected by the National Water Commission (CONAGUA).
Note: These graphs show how the water quality measures change over time for the three years from 2012 to 2014. Each plot shows the point estimates from a regression of the quality measure on indicators for each year and monitoring-station fixed effects. Error bars show 95 percent confidence intervals using robust standard errors.

Figure S2.6. Geographic Distribution of Water Quality Measures
Source: Authors’ analysis based on surface-water quality measures collected by the National Water Commission (CONAGUA).
Note: These maps show the geographic distribution of the surface-water quality variables. Each graph shows each of the three water quality measures, averaged over 2012 and 2013, at the Thiessen polygon level. All graphs split the data into quartiles.
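The estimates plotted in fig. S2.5 come from a regression of each quality measure on year indicators and monitoring-station fixed effects. A minimal sketch of that kind of regression is given below, using the within transformation (demeaning by station) so that stations with missing years are handled correctly. The data are synthetic and noiseless, constructed only to show the mechanics. The function name is illustrative, and the sketch omits the robust standard errors behind the error bars in the figure.

```python
from collections import defaultdict


def year_effects(observations, years=(2013, 2014)):
    """Within (station fixed-effects) estimates for two year indicators.

    `observations` is a list of (station_id, year, value) tuples. The year
    that is not listed in `years` acts as the omitted base year.
    """
    by_station = defaultdict(list)
    for station, year, value in observations:
        by_station[station].append((year, value))

    # Demean the outcome and both year dummies within each station.
    rows = []
    for records in by_station.values():
        n = len(records)
        y_bar = sum(v for _, v in records) / n
        d_bar = [sum(1.0 for yr, _ in records if yr == y) / n for y in years]
        for yr, v in records:
            d1 = (1.0 if yr == years[0] else 0.0) - d_bar[0]
            d2 = (1.0 if yr == years[1] else 0.0) - d_bar[1]
            rows.append((v - y_bar, d1, d2))

    # Solve the 2x2 normal equations by Cramer's rule.
    s11 = sum(d1 * d1 for _, d1, _ in rows)
    s12 = sum(d1 * d2 for _, d1, d2 in rows)
    s22 = sum(d2 * d2 for _, _, d2 in rows)
    s1y = sum(d1 * y for y, d1, _ in rows)
    s2y = sum(d2 * y for y, _, d2 in rows)
    det = s11 * s22 - s12 * s12
    return {
        years[0]: (s1y * s22 - s12 * s2y) / det,
        years[1]: (s11 * s2y - s12 * s1y) / det,
    }


# SYNTHETIC data: value = station effect + year effect (2012: 0, 2013: 3, 2014: 1.5).
# Station "C" has no 2012 reading, so the panel is unbalanced.
data = [
    ("A", 2012, 10.0), ("A", 2013, 13.0), ("A", 2014, 11.5),
    ("B", 2012, 20.0), ("B", 2013, 23.0), ("B", 2014, 21.5),
    ("C", 2013, 8.0), ("C", 2014, 6.5),
]
effects = year_effects(data)
print(effects)
```

Because the synthetic values contain no noise, the function recovers the year effects relative to 2012 used to build the data, which makes it a convenient check on the demeaning step.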
Table S2.3 shows the criteria used by the National Water Commission (CONAGUA) in classifying each of the three surface-water quality measures into five categories of cleanliness. In one of our robustness checks, we use this stratification to classify stations into those with good surface-water quality (“excellent” and “good” categories) and bad quality (below “good”).

Table S2.3. Surface Water-Quality Thresholds

                  Biochemical oxygen    Chemical oxygen    Total suspended
                  demand (mg/L)         demand (mg/L)      solids (mg/L)
Excellent         ≤3                    ≤10                ≤25
Good              3–6                   10–20              25–75
Acceptable        6–30                  20–40              75–150
Polluted          30–120                40–200             150–400
Very polluted     >120                  >200               >400

Source: Information collected by the National Water Commission (CONAGUA).
Note: This table shows the thresholds established by CONAGUA in classifying each of the three surface-water quality measures.

Figure S2.7. Electoral Precincts and Public Outpatient Clinics
Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA) and the 2010 census boundaries for electoral precincts obtained from the National Statistics Office (INEGI).
Note: This map shows the 15,634 public outpatient clinics in the sample, as well as their corresponding electoral precinct.

Since it is unclear to us how CONAGUA chose these thresholds (or even why five categories and not just two), we conservatively use the “good” category as our cutoff to construct our binary measure of quality. Standards in developed settings suggest that the “acceptable” category put forward by CONAGUA tends to fall above established thresholds. For example, the Canadian Ministry of the Environment establishes a threshold of 5.5–6.5 mg/L for BOD in warm-water ecosystems. Michigan’s Department of Environmental Quality indicates water appears cloudy for TSS levels from 40–80 mg/L.
Utah, according to the Environmental Protection Agency (EPA), establishes a threshold of 90 mg/L for TSS.

S2.3. Public Outpatient Clinics

Figure S2.7 shows a map with the location of the public outpatient clinics in our sample, as well as the electoral precincts in which they fall.

S2.4. Assignment of Water Variables to Public Outpatient Clinics

Figure S2.8 shows the distribution of distance from each public outpatient clinic to the monitoring station. The plot depicts the 90th, 95th, and 99th percentiles, which are used in a robustness check.

Figure S2.9 shows the distribution of the continuous treatment indices that we construct as part of our sensitivity analysis. Each plot shows the distribution separately for clinics that were classified as with and

Figure S2.8. Distance from Clinics to Monitoring Stations
Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA) and monitoring station locations collected by the National Water Commission (conagua).
Note: This graph shows the distribution of distance from the public outpatient clinics to their assigned monitoring station. The dashed lines show the 90th, 95th, and 99th percentiles of the distribution.

Figure S2.9. Distribution of the Continuous Indices of Access to Clean Drinking Water
Source: Authors’ analysis based on surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: This graph shows the distribution of the continuous indices for quality of local drinking water separately for areas with poor- and good-quality water, based on the main binary definition. Each index has been standardized to have a mean of 0 and a standard deviation of 1. The values of each index in the graph are capped at the 99th percentile for clarity.
The graph on the left constructs this index from the water quality measures of the single nearest monitoring station to the clinic. The graph on the right considers the inverse-distance weighted average of the water measures from the three nearest stations.

without access to safe drinking water according to the baseline binary definition. Figure S2.9a constructs the index from the water quality measures of the monitoring station that is closest in distance to the clinic. Figure S2.9b takes the inverse-distance weighted average of the water quality measures from the three nearest stations to the clinic. These plots show that there is considerable overlap, although there is a larger mass towards the left for areas classified as having access to safe drinking water and a larger mass towards the right for those without access.

Figure S2.10 shows the geographic distribution of areas with and without access to safe drinking water. Figure S2.10a shows the main binary definition. Figure S2.10b uses the official thresholds provided by the

Figure S2.10. Geographic Distribution of Areas with Poor Quality Drinking Water
Source: Authors’ analysis based on surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: These maps show the geographic distribution of areas with poor- versus good-quality drinking water based on various binary definitions. Blank areas indicate intersections of precincts and Thiessen polygons without public outpatient clinics. The top-left map considers the main definition, while the other maps show alternative definitions. The top-right map uses the conagua thresholds instead of the median for quality. The bottom-left map includes 2014 measures of quality.
The bottom-right map uses the mean of the precinct-level distribution of access to piped water as the cutoff instead of the median.

government regulator conagua for the binary classification of water quality at each monitoring station instead of the median. Figure S2.10c includes the 2014 water quality data. Figure S2.10d considers the definition that uses the mean of the distribution of piped-water access as the cutoff instead of the median. Note that although there are similarities in these classifications across definitions, there is also significant spatial variation.

S3. Additional Robustness Checks

We present event study plots, mirroring those in the main text, for robustness checks that consider alternative winsorization levels and that exclude clinics in the top 1 and 5 percent of the distribution of distance to the assigned monitoring station. Figure S3.1a considers specifications where the outcome has been winsorized at more conservative levels, namely winsorized at the 1 percent level and not winsorized at all. Figure S3.1b plots coefficient estimates for subsamples where we exclude clinics based on distance to the assigned water monitoring station. Overall, these plots mirror the main results and show that our findings are robust to these specifications.

Table S3.1 considers assigning piped-water access from precincts within a 2 km radius of the public clinics instead of the single precinct where the clinic is located. We assign a spatially weighted average of access from the precincts that fall within this radius. We present our estimates as before, with the main

Figure S3.1.
Event Study of the Effect of the Soda Tax on Gastrointestinal Disease (GID) Rates: Robustness Checks on Winsorization Levels and Distance to the Monitoring Stations
Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: These graphs show robustness checks on the main results. The plot on the left considers winsorizing the outcome variable (clinic GID rate per 100,000) at different levels. The dark coefficient series corresponds to the outcome in the main specification, winsorized at the 5 percent level. The lighter-colored series winsorize the outcome at more conservative levels: 1 percent and not winsorized. The graph on the right plots estimates from a difference-in-differences (DiD) on a balanced panel of outpatient clinic-quarters, excluding clinics assigned to far-away monitoring stations. The dark coefficient series corresponds to the main result including all clinics. The lighter-colored series exclude clinics by distance to the assigned monitoring station: top 1 percent (distance greater than 97 km; 156 clinics excluded), top 5 percent (distance greater than 57 km; 781 clinics excluded), and top 10 percent (distance greater than 44 km; 1,563 clinics excluded). Coefficients for the interaction of the indicator for poor-quality drinking water areas and each quarter for two years before and after the tax was introduced are shown, with quarter 4 of 2013 as the excluded period. Robust standard errors clustered at the clinic level. Error bars show 95 percent confidence intervals. The mean of the dependent variable for clinics in areas with poor drinking water prior to the tax for the baseline specification is 110.
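To make the winsorization levels compared above concrete, the following minimal sketch caps a list of rates at a chosen percentile. It is an illustration only: the text does not state whether one or both tails were winsorized, so capping the upper tail by default (with an optional two-sided mode) is an assumption, as are the linear-interpolation percentile rule and the function names.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) of a pre-sorted list, using linear interpolation."""
    if not sorted_values:
        raise ValueError("percentile of an empty list is undefined")
    position = (len(sorted_values) - 1) * q / 100.0
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_values) - 1)
    fraction = position - lower_index
    low, high = sorted_values[lower_index], sorted_values[upper_index]
    return low + fraction * (high - low)

def winsorize(values, level, two_sided=False):
    """Cap values at the (100 - level)th percentile.

    level=0 leaves the data unchanged. With two_sided=True, values are also
    floored at the level-th percentile. Which tails the original analysis
    treated is not stated, so both modes are assumptions.
    """
    if level == 0:
        return list(values)
    ordered = sorted(values)
    upper_cap = percentile(ordered, 100 - level)
    lower_cap = percentile(ordered, level) if two_sided else float("-inf")
    return [min(max(v, lower_cap), upper_cap) for v in values]

# Hypothetical clinic-quarter GID rates per 100,000.
rates = [float(v) for v in range(1, 101)]
main_spec = winsorize(rates, 5)       # 5 percent level, as in the main specification
conservative = winsorize(rates, 1)    # 1 percent level
unwinsorized = winsorize(rates, 0)    # not winsorized
```

Lower levels leave more of the extreme observations intact, which is why the 1 percent and unwinsorized series are described as more conservative.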
Table S3.1. Robustness Check on the Difference-in-Differences (DiD) Effect of the Soda Tax on Gastrointestinal Disease (GID) Rates: Assigning Water Access within a 2 Kilometer Radius

                                                    (1)         (2)         (3)
Post-tax years
Poor drinking water × 2015                          2.641*      1.889       1.161
                                                    (1.491)     (1.558)     (1.769)
Poor drinking water × 2014                          2.975**     2.798**     2.133
                                                    (1.215)     (1.258)     (1.412)
Pre-tax years
Poor drinking water × 2012                          −1.588      −1.077      −0.242
                                                    (1.166)     (1.212)     (1.369)
Poor drinking water × 2011                          −1.093      −0.484      1.255
                                                    (1.453)     (1.511)     (1.742)
Poor drinking water × 2010                          −1.998      −1.184      −0.826
                                                    (1.619)     (1.687)     (1.970)
Poor drinking water × 2009                          −3.895**    −3.316*     −3.333
                                                    (1.778)     (1.843)     (2.143)
Observations                                        437,752     437,752     437,192
Clinics in areas with poor quality drinking water   3,301       3,301       3,301
R-squared                                           0.800       0.802       0.814
Household controls                                              Yes         Yes
Geographic controls                                                         Yes
Mean dependent variable                             110.6       110.6       110.6
Coefficient tests:
H0: β2014 = β2015                                   0.789       0.480       0.505
H0: βk = 0 ∀ k = 2009, ..., 2012                    0.202       0.327       0.196

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: This table shows robustness checks on the main results from estimating equation (1) on a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters). The outcome is the clinic gastrointestinal disease rate per 100,000, winsorized at the 5 percent level. Access to piped water is assigned from the spatially weighted average of electoral precincts within a 2 km radius of the clinic. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each year are shown, with 2013 as the excluded year.
Robust standard errors clustered at the clinic level. The mean of the dependent variable for clinics in areas with poor-quality drinking water pre-tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1.

Table S3.2. Robustness Check on the Difference-in-Differences (DiD) Effect of the Soda Tax on Gastrointestinal Disease (GID) Rates: Using Biochemical Oxygen Demand Measures Only

                                                    (1)         (2)         (3)
Post-tax years
Poor drinking water × 2015                          −3.755**    −1.643      1.517
                                                    (1.603)     (1.737)     (1.884)
Poor drinking water × 2014                          2.158*      4.031***    2.603*
                                                    (1.301)     (1.402)     (1.551)
Pre-tax years
Poor drinking water × 2012                          1.502       0.956       1.582
                                                    (1.290)     (1.394)     (1.513)
Poor drinking water × 2011                          4.468***    2.071       1.829
                                                    (1.599)     (1.706)     (1.805)
Poor drinking water × 2010                          4.557***    0.923       −0.837
                                                    (1.718)     (1.865)     (2.007)
Poor drinking water × 2009                          2.447       −1.083      −3.111
                                                    (1.873)     (2.030)     (2.212)
Observations                                        437,752     437,752     437,192
Clinics in areas with poor quality drinking water   3,398       3,398       3,398
R-squared                                           0.800       0.802       0.814
Household controls                                              Yes         Yes
Geographic controls                                                         Yes
Mean dependent variable                             119.9       119.9       119.9
Coefficient tests:
H0: β2014 = β2015                                   0.000       0.000       0.489
H0: βk = 0 ∀ k = 2009, ..., 2012                    0.031       0.430       0.131

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: This table shows robustness checks on the main results from estimating equation (1) on a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters). The outcome is the clinic gastrointestinal disease rate per 100,000, winsorized at the 5 percent level. Surface-water quality is measured only with biochemical oxygen demand. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each year are shown, with 2013 as the excluded year.
Robust standard errors clustered at the clinic level. The mean of the dependent variable for clinics in areas with poor-quality drinking water pre-tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1

specification and subsequent columns adding household and geographic time-varying controls as in the main text. These estimates show similar results, reassuring us that the way we assigned piped-water access is not driving the results.

Table S3.2 repeats this exercise assigning piped-water access as in the main text, but considering only BOD measures for quality. Although no consensus exists, BOD seems to be the most prevalent water

Table S3.3. Difference-in-Differences (DiD) Effect of the Soda Tax on Gastrointestinal Disease (GID) Rates: Placebo Check on Unrelated Diseases and Conditions

                               Accidents   Sexually       Chronic     All
                                           Transmitted
                                           Diseases
                               (1)         (2)            (3)         (4)
Post-tax years
Poor drinking water × 2015     −0.120*     −1.015         −0.784      −1.919
                               (0.051)     (0.640)        (0.737)     (1.105)
                               [0.079]     [0.188]        [0.317]     [0.158]
Poor drinking water × 2014     −0.028      −0.769         0.754       −0.043
                               (0.040)     (0.520)        (0.602)     (0.870)
                               [0.683]     [0.396]        [0.406]     [0.980]
Pre-tax years
Poor drinking water × 2012     0.017       0.230          0.399       0.646
                               (0.037)     (0.517)        (0.613)     (0.838)
                               [0.901]     [0.901]        [0.901]     [0.832]
Poor drinking water × 2011     −0.025      0.398          1.035       1.407
                               (0.047)     (0.605)        (0.742)     (1.026)
                               [0.772]     [0.772]        [0.465]     [0.465]
Poor drinking water × 2010     0.081       −0.718         1.823*      1.187
                               (0.047)     (0.610)        (0.802)     (1.090)
                               [0.188]     [0.386]        [0.080]     [0.386]
Poor drinking water × 2009     0.114*      −1.754*        1.440       −0.200
                               (0.048)     (0.640)        (0.850)     (1.147)
                               [0.099]     [0.059]        [0.139]     [0.891]
Observations                   437,752     437,752        437,752     437,752
R-squared                      0.454       0.574          0.762       0.763
Mean dependent variable        6.220       15.26          55.96       77.44

Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the
National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: This table shows placebo tests of the main results from estimating equation (1) on a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters) for unrelated conditions. The outcomes considered are accidents and external injuries, sexually transmitted diseases, chronic diseases, and the sum of all three. All outcomes are measured as rates per 100,000, winsorized at the 5 percent level. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each year are shown, with 2013 as the excluded year. Robust standard errors, clustered at the clinic level, are shown in parentheses. Romano–Wolf step-down adjusted p-values for multiple hypothesis testing are shown in brackets. Stars denote significance from the latter. The mean of the dependent variable for clinics in areas with poor-quality drinking water pre-tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1

quality measure in the literature. We again find similar results, but note that the pre-tax coefficients are significantly different from zero when we do not include our time-varying controls. Overall, this table allays any concerns with our choice of including all three measures.

We then perform a series of placebo checks on conditions unaffected by the soda tax to test that no other general supply-and-demand changes are driving our results. Since other infectious diseases may be affected by changes in GIDs (Agüero and Beleche 2017), we abstain from using them in this analysis. Instead, we first consider accidents and external conditions (ICD-10 codes from S00 to T98), sexually transmitted diseases (STDs, with ICD-10 codes from A53 to A60, as well as B18, B20, and B97), and chronic conditions (ICD-10 codes E11–E14, I10–I15, and M15–M19). We also present results for the sum of these three placebo conditions.
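The ICD-10 ranges that define the three placebo conditions can be sketched as a small lookup. This is an illustrative helper, not the authors' code: the category labels and function names are hypothetical, and it assumes diagnoses are recorded with a letter followed by two digits (optionally followed by further subdivision characters).

```python
import re

# Three-character ICD-10 ranges as listed in the text; labels are illustrative.
PLACEBO_RANGES = {
    "accidents": [("S00", "T98")],
    "std": [("A53", "A60"), ("B18", "B18"), ("B20", "B20"), ("B97", "B97")],
    "chronic": [("E11", "E14"), ("I10", "I15"), ("M15", "M19")],
}

def three_character_category(code):
    """Reduce a code such as 'e11.9' to its three-character category 'E11'."""
    match = re.match(r"^([A-Z])(\d{2})", code.strip().upper())
    return match.group(1) + match.group(2) if match else None

def placebo_category(code):
    """Return the placebo condition a diagnosis belongs to, or None."""
    category = three_character_category(code)
    if category is None:
        return None
    for label, ranges in PLACEBO_RANGES.items():
        for low, high in ranges:
            # Fixed-width letter-plus-two-digit strings compare correctly as text.
            if low <= category <= high:
                return label
    return None

# Hypothetical diagnoses from a clinic record.
diagnoses = ["S72.0", "A54", "E11.9", "A09", "B19"]
labels = [placebo_category(code) for code in diagnoses]
```

The combined placebo outcome in the text is then the sum of the three condition counts, which corresponds to counting every diagnosis with a non-empty label.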
Table S3.3 shows the results from estimating equation (1) on these outcomes, which have also been winsorized at the 5 percent level. Each of the first three columns corresponds to a different condition, while column (4) considers the sum of all three. We present Romano–Wolf step-down adjusted p-values robust to multiple hypothesis testing. We find no significant effects for the first year of the soda tax. We also find no evidence of any significant increases in 2015, although accidents seem to slightly decrease (significant at the 10 percent level). Figure S3.2 complements these results by showing the corresponding event study plots. Overall, these placebo checks suggest that our results are not

Figure S3.2. Event Study of the Effect of the Soda Tax on Gastrointestinal Disease (GID) Rates: Placebo Check on Unrelated Diseases and Conditions
Source: Authors’ analysis based on data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: These graphs show placebo tests of the main results from estimating an event study on a balanced panel of outpatient clinic-quarters (15,634 clinics × 28 quarters) for unrelated conditions. The outcomes considered are accidents and external injuries, sexually transmitted diseases, chronic diseases, and the sum of all three. All outcomes are measured as rates per 100,000, winsorized at the 5 percent level. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each quarter for two years before and after the tax was introduced are shown, with quarter 4 of 2013 as the excluded period. Robust standard errors clustered at the clinic level. Error bars show 95 percent confidence intervals.
The mean of the dependent variable for clinics in areas with poor drinking water prior to the tax is 6 for accidents, 15 for sexually transmitted diseases, 56 for chronic conditions, and 77 for the sum.

confounded by other healthcare policies that might have increased public supply or general demand for healthcare.

To further show that public clinic staffing and infrastructure did not change differentially in areas with poor-quality drinking water, we obtain clinic-level data on staffing and infrastructure at a yearly level for both 2013 and 2014. Unfortunately, data for previous years are not available. We estimate a regression akin to our main specification. Since we cannot test for pre-trends given that we only observe one year pre-tax, we include flexible controls that are meant to capture the possibility of differential trends. For this, we construct quartiles of the variables measuring the fraction of households without access to electricity, without a bathroom inside the house, and without access to the sewerage system. We then interact an indicator for each of these quartiles with an indicator for 2014. This is analogous to the controls in our main specification. Formally, our estimating equation for this exercise is given by

$$ y_{ct} = \beta \left( (\text{poor drinking water})_c \times \mathbb{1}[t = 2014] \right) + \lambda_c + \theta_t + \sum_{p=1}^{4} \gamma_p \left( X_c^p \times \mathbb{1}[t = 2014] \right) + \nu_{ct}, \qquad \text{(S3.1)} $$

Table S3.4. Difference-in-Differences (DiD) Estimates on Public Outpatient Clinic Resources

                                          Poor drinking                           Mean
                                          water × 2014    Obs.       R-squared    dep. var.
Panel A: Material resources
Number of examination rooms               −0.008          31,268     0.976        1.391
                                          (0.006)
Number of X-ray machines                  −0.000          31,268     0.974        0.002
                                          (0.000)
Number of mammogram machines              −0.000          31,268     0.975        0.001
                                          (0.000)
Panel B: Human resources
Total number of doctors                   0.016           31,268     0.958        1.697
                                          (0.011)
Number of general practitioner doctors    0.003           31,268     0.938        0.883
                                          (0.012)
Number of family medicine doctors         −0.001          31,268     0.996        0.050
                                          (0.002)
Number of medical interns                 0.000           31,268     0.916        0.003
                                          (0.002)
Number of pediatricians                   −0.000          31,268     0.902        0.004
                                          (0.000)
Number of obstetricians (OB/GYN)          −0.000          31,268     0.923        0.001
                                          (0.000)
Total number of nurses                    0.007           31,268     0.950        1.810
                                          (0.012)
Number of general nurses                  −0.011          31,268     0.927        0.536
                                          (0.012)
Number of specialized nurses              −0.002          31,268     0.967        0.018
                                          (0.002)

Source: Authors’ analysis based on infrastructure data from public outpatient clinics obtained from the Ministry of Health (SSA), surface-water quality measures collected by the National Water Commission (conagua), and the 2010 census statistics obtained from the National Statistics Office (inegi).
Note: This table shows DiD estimates on public clinic resources for the balanced panel of clinic infrastructure data (15,634 clinics × 2 years). Each row corresponds to a different regression based on equation (S3.1) for the outcome variable listed. All outcomes are transformed using the inverse hyperbolic sine function. The first column reports the coefficient of interest, measuring the differential change from 2013 to 2014 in areas with poor-quality drinking water relative to the rest. The last column shows the mean of the dependent variable in areas with poor drinking water in 2013, prior to the soda tax. Regressions include clinic fixed effects, an indicator for the post-tax year, and flexible household socioeconomic status (SES) controls as described in equation (S3.1).
Standard errors are clustered at the clinic level. ***p < 0.01, **p < 0.05, *p < 0.1

where $y_{ct}$ is a resource outcome at clinic c in year t, $\lambda_c$ are clinic fixed effects, $\theta_t$ is an indicator for each year, $X_c^p$ is a vector of indicators for whether household characteristics (i.e., access to electricity, indoor bathroom, and sewerage) in the precinct where clinic c is located fall in quartile p of the distribution, and $\nu_{ct}$ is the error term. We transform the outcome variables using the inverse hyperbolic sine function. We cluster our standard errors at the clinic level as before.

We include 12 variables measuring clinic resources as our outcomes of interest. First, we consider variables that refer to infrastructure and material resources: number of examination rooms at the clinic, number of X-ray machines, and number of mammogram machines.1 The second set of resources refers to medical staffing at the clinic: total number of doctors, total general practitioners, total family medicine doctors, total number of interns, total number of pediatricians, total OB/GYN doctors, total number of nurses, total general nurses, and total specialized nurses.

1 Note that we exclude very specialized machinery that is reported in the data, such as medical linear accelerators and hyperbaric oxygen beds. We also exclude less specialized material resources that are not widely used in our clinics, such as MRI machines and CAT scanners.

Table S3.4 shows the estimates for this exercise. All coefficients are small relative to the pre-tax mean in clinics in areas with poor-quality drinking water, and statistically indistinguishable from zero. Panel A indicates that there is no differential change in material resources over time. The null results reassure us that changes in material resources are not the drivers of our main results.
Panel B shows that medical staffing does not change differentially in areas with and without access to safe drinking water, providing evidence that our main results are not driven by differential changes in human resources. Overall, table S3.4 provides convincing evidence that we are not confounding the effect of the soda tax in areas where people lack access to safe drinking water with any differential supply-side changes.

S4. Prevalence of Gastrointestinal Diseases (GIDs) and Likelihood of Seeking Outpatient Care

Our estimates rely on using outpatient GID rates as our dependent variable. Evidently, this may not be an accurate representation of the overall prevalence of diarrheal disease in a clinic’s catchment area. As such, we are careful to interpret our findings as effects on GIDs seen by public clinics. This means that we are identifying a lower bound on all possible new GID cases, since not everyone who is sick seeks medical attention.

In terms of the validity of our strategy, this is only a concern if individuals around clinics with and without access to safe drinking water differentially changed their likelihood of seeking medical care when sick around the introduction of the tax. The question then is whether the mapping of unobserved GID prevalence to observed GID visits at public clinics is effectively changing over time, specifically as GIDs become more prevalent.

To shed light on this potential issue, we turn to survey data from the National Health Survey (ensanut). This is a nationally representative survey, carried out every six years (except for the special 2016 round that focused on nutrition). We explore data from the 2006 and 2012 rounds. Although a 2000 round exists, the questionnaire format and sampling design differ substantially.
Unfortunately, the 2016 round focused only on nutrition and chronic diseases, excluding questions on disease and healthcare utilization. Nevertheless, we believe that this exercise is informative.

We first show descriptive statistics of healthcare-seeking behavior when sick in table S4.1. We show the fraction of sick individuals seeking any type of care, and then decompose by public versus private providers. We consider different types of diseases. These summary statistics show that 55 percent of individuals that were sick with a GID sought medical attention, with 35 percent going to a public clinic and the remaining 20 percent seeking care with a private provider.

Table S4.1. Self-Reported Healthcare-Seeking Behavior When Sick

                                 Conditional on being sick with
                                 Any condition   GID       Respiratory   Chronic   Accident
Fraction seeking healthcare      0.53            0.55      0.45          0.75      0.58
                                 (0.50)          (0.50)    (0.50)        (0.43)    (0.49)
Fraction seeking public care     0.36            0.35      0.29          0.59      0.43
                                 (0.48)          (0.48)    (0.46)        (0.49)    (0.50)
Fraction seeking private care    0.17            0.20      0.15          0.16      0.14
                                 (0.38)          (0.40)    (0.36)        (0.37)    (0.35)
Observations                     48,799          2,738     24,599        3,343     2,238

Source: Authors’ analysis based on data from the 2006 and 2012 National Health Survey (ensanut) rounds.
Note: This table shows the fraction of sick individuals reporting that they sought medical attention. The first column considers all sick individuals. Subsequent columns restrict to individuals sick with particular conditions or ailments. Survey weights are included in the calculations. The mean and standard deviation are shown.

Figure S4.1. Correlates of Public Healthcare Seeking Conditional on Being Sick with a Gastrointestinal Disease (GID)
Source: Authors’ analysis based on data from the 2006 and 2012 National Health Survey (ensanut) rounds.
Note: This graph shows correlates of seeking healthcare.
The sample is restricted to individuals that were sick with a GID. The outcome variable is an indicator for seeking public healthcare. The excluded categories for age and education are children 0–12 years old and primary education or less, respectively. The healthcare coverage variables allow for respondents to report having both public and private coverage. The regression includes municipality and year fixed effects. Standard errors are clustered at the municipality level, with bars denoting 95 percent confidence intervals.

To get a better sense of what correlates with seeking public healthcare when sick with a GID, we estimate a regression with municipality and year fixed effects where we include all individual- and household-level observable characteristics from the survey. This includes various age groups, sex, marital status, work status, education, healthcare coverage, and dwelling characteristics. The dependent variable is equal to 1 if the person sought public healthcare, and zero if they did not seek any care or went to a private provider. Figure S4.1 shows these point estimates.

There are four takeaways from this exercise. First, we find no correlation between age and seeking public care. Second, although the point estimates indicate a negative correlation between higher education levels and seeking public care, these estimates are not significant. Note, however, that if we restrict the sample to all individuals that got medical care (public versus private), then the estimates are larger and more significant. Third, we find, as expected, that having public healthcare coverage correlates positively with seeking public care, while having private coverage or no coverage has a negative correlation. These estimates are not significant and have very large standard errors. Note that we allow for individuals to report having both public and private coverage.
Lastly, we find that two of our four dwelling characteristics correlate positively with seeking public care. Overall, this exercise shows that there are few individual-level correlates that matter for healthcare-seeking behavior, and further motivates the inclusion of time-varying household controls as a robustness check.

We now turn to characterizing changes in healthcare-seeking behavior by estimating the following equation using the individual-level data for both rounds:

$$ y_{imr} = \beta_1 \, \text{sick}_{imr} + \beta_2 \, \text{rate}_{mr} + \beta_3 \, (\text{sick} \times \text{rate})_{imr} + X_{imr} \gamma + \lambda_m + \theta_r + \eta_{imr}, \qquad \text{(S4.1)} $$

Table S4.2. Relationship between Seeking Medical Attention and Being Sick with a Gastrointestinal Disease (GID)

                              (1)          (2)          (3)          (4)          (5)
Panel A: Seeking attention at a public clinic
Sick with GID                 0.3071***    0.3039***    0.3042***    0.3239***    0.3094***
                              (0.0108)     (0.0108)     (0.0107)     (0.0175)     (0.0175)
GID rate per 1,000                         0.0004***    0.0005***    0.0005***    0.0004***
                                           (0.0001)     (0.0002)     (0.0002)     (0.0001)
Sick with GID × GID rate                                             −0.0023      −0.0027
                                                                     (0.0019)     (0.0018)
Observations                  401,450      401,450      401,450      401,450      363,074
R-squared                     0.0153       0.0185       0.0273       0.0274       0.0374
Panel B: Seeking attention at a private clinic
Sick with GID                 0.1852***    0.1844***    0.1848***    0.1881***    0.1913***
                              (0.0082)     (0.0082)     (0.0082)     (0.0116)     (0.0122)
GID rate per 1,000                         0.0003***    0.0004***    0.0004***    0.0003***
                                           (0.0001)     (0.0001)     (0.0001)     (0.0001)
Sick with GID × GID rate                                             −0.0004      −0.0019*
                                                                     (0.0010)     (0.0010)
Observations                  401,450      401,450      401,450      401,450      363,074
R-squared                     0.0115       0.0133       0.0200       0.0200       0.0219
Municipality FE                                         Yes          Yes          Yes
Year FE                                                 Yes          Yes          Yes
Base controls                              Yes          Yes          Yes          Yes
Additional controls                                                               Yes

Source: Authors’ analysis based on data from the 2006 and 2012 National Health Survey (ensanut) rounds.
Note: This table shows the correlation between seeking medical attention and being sick with a GID by estimating equation (S4.1).
Panel A focuses on public clinics, and Panel B on private care. Observations are individuals in a given municipality-year. The dependent variable is an indicator for seeking medical attention at a public clinic (or private clinic) for any symptoms, unconditional on reporting being sick. GID rate per 1,000 is the prevalence of GIDs in a given municipality-year. Base controls include age, gender, whether the individual lives in a house with a dirt floor, electricity, piped water, and sewerage, as well as municipality-year level averages of these last four characteristics. Additional controls, for which a few missing values are recorded, include education indicators and indicators for health insurance status. Robust standard errors clustered at the municipality level. ***p < 0.01, **p < 0.05, *p < 0.1

where $y_{imr}$ is an indicator for whether individual i in municipality m in survey round r sought medical care at a public clinic, $\text{sick}_{imr}$ is an indicator for being sick with a GID, $\text{rate}_{mr} = \frac{1}{N_{mr}} \sum_{j \neq i} \text{sick}_{jmr}$ is the GID rate excluding individual i, $X_{imr}$ is a vector of controls, $\lambda_m$ are municipality fixed effects, $\theta_r$ are indicators for each round, and $\eta_{imr}$ is the error term. Note that this is a repeated cross-section, where we cannot track the same individuals over time.

We recognize that these estimates only allow us to identify correlations within the data. However, these simple relationships may be very informative. The coefficient $\beta_1$ indicates by how much the observed probability of going to the public clinic changes when an individual is sick with a GID (this is the overall average healthcare-seeking behavior when sick relative to non-sick healthcare-seeking behavior). The coefficient $\beta_2$ measures changes in the likelihood of seeking care as the prevalence of GIDs increases. Lastly, the coefficient $\beta_3$ indicates whether this probability changes differentially for individuals that are sick with a GID in areas with varying prevalence of GIDs.
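The leave-one-out GID rate and the interaction regressor can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the record layout and function name are hypothetical, the rate is scaled per 1,000 to match the table, survey weights are ignored, and the sum over other respondents is divided by the number of other respondents in the municipality-round cell (the text does not spell out the exact definition of the denominator).

```python
from collections import defaultdict

def leave_one_out_rates(records, per=1000):
    """GID rate in each respondent's municipality-round, excluding the respondent.

    `records` is a list of dicts with keys 'municipality', 'round', and
    'sick' (1 if sick with a GID, else 0). Cells with a single respondent
    have no other respondents, so their rate is returned as None.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [number sick, number of respondents]
    for record in records:
        cell = (record["municipality"], record["round"])
        totals[cell][0] += record["sick"]
        totals[cell][1] += 1

    rates = []
    for record in records:
        sick_total, n = totals[(record["municipality"], record["round"])]
        if n < 2:
            rates.append(None)
        else:
            rates.append(per * (sick_total - record["sick"]) / (n - 1))
    return rates

# Hypothetical respondents: five in one municipality-round, one alone in another.
records = [
    {"municipality": 1, "round": 2006, "sick": 1},
    {"municipality": 1, "round": 2006, "sick": 0},
    {"municipality": 1, "round": 2006, "sick": 0},
    {"municipality": 1, "round": 2006, "sick": 0},
    {"municipality": 1, "round": 2006, "sick": 1},
    {"municipality": 2, "round": 2006, "sick": 0},
]
rates = leave_one_out_rates(records)

# Interaction regressor (sick x rate) for respondents with a defined rate.
interactions = [r["sick"] * rate for r, rate in zip(records, rates) if rate is not None]
```

Excluding the respondent's own status keeps the regressor from being mechanically correlated with the sick indicator, which is the purpose of defining the rate over j ≠ i.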
We are especially interested in $\beta_3$. If we find a positive and significant coefficient, this would mean that the probability of seeking care when sick with a GID increases with the overall prevalence of GIDs in an individual’s municipality. This would then suggest that clinic reports of GIDs increase mechanically whenever the prevalence of GIDs increases. If instead we find a statistical zero, then an individual’s decision to seek care when sick is independent of the overall GID rate, regardless of the general effect of GID rates on this likelihood. This would be reassuring, since it would suggest that the mapping of GID prevalence to our clinic reports does not change with changes in GID rates.

Panel A in table S4.2 shows the results from estimating equation (S4.1), with an indicator for seeking medical care at a public clinic as the dependent variable. We begin in column (1) by simply showing the correlation between the likelihood of seeking care at a public clinic and being sick with a GID. In columns (2) and (3), we successively add the GID rate and base controls, as well as municipality and survey-round fixed effects. These three columns show a positive and significant link between being sick with a GID and seeking care at a public clinic. The magnitude is relatively stable: being sick with a GID is associated with a probability of seeking care that is about 30 percentage points higher than for individuals who are not sick with a GID. Columns (2) and (3) also indicate that an additional GID case per 1,000 individuals in a given municipality-year is associated with a small but significant increase in the likelihood of seeking care for any kind of disease. Column (4) adds the interaction between the indicator for whether the individual is sick with a GID and the local GID rate. The estimate is small, negative, and statistically indistinguishable from zero.
Column (5) includes additional individual-level controls. The results remain unchanged. The fact that we do not find a significant coefficient, together with the point estimate being small, suggests that there is no differential change in the likelihood of seeking care at a public clinic when sick with a GID as the local prevalence of GIDs increases. This provides reassurance that observing GID visits at public clinics, rather than the full prevalence of GIDs, does not introduce an important bias in our results.

A related concern would be that individuals alter their decisions with respect to seeking private care. Panel B in table S4.2 shows similar results, using an indicator for seeking care at a private clinic as the dependent variable. We again find that being sick with a GID increases the probability of seeking private care, as does the local GID rate. We do not find significant effects of the interaction term at the conventional 95 percent confidence level. This suggests that there are no differential changes in seeking private care when the local prevalence of disease changes.

S5. Effect on Hospitalizations

We obtain hospital discharge records for a subset of public hospitals, corresponding to those administered directly by SSA. These publicly available data contain each patient’s date of admission, as well as the final diagnosis based on ICD-10 codes. There are 766 SSA hospitals in this dataset. Unfortunately, the hospital discharge data for the other public subsystems only register the year, and not the actual dates of hospitalization, and do not provide ICD-10 codes. Note, however, that SSA tends to provide healthcare to lower-SES groups, and that according to the 2012 ENSANUT, 40 percent of all hospitalizations occurred at an SSA hospital.²

Our main results correspond to the effect of the tax on GID rates at public outpatient clinics. We now turn our attention to a worse health outcome: hospitalizations.
Our goal is to assess whether the increase in GID rates at the outpatient level was severe enough to necessitate inpatient care. Given the data limitations, we must exercise caution when interpreting these results. First, findings for this subset of hospitals may not be representative of the whole country. Second, since patients may seek inpatient care at hospitals that are farther away, and we cannot link patients from clinics to hospitalized patients, the classification into areas with and without access to safe drinking water may be noisier.

² Private hospitalizations account for 15 percent of all inpatient care, which means that almost half of all public hospitalizations are at an SSA hospital.

Table S5.1. Difference-in-Differences (DiD) Effect of the Soda Tax on Gastrointestinal Disease (GID) Hospitalization Rates at Ministry of Health (SSA) Hospitals

| | Hospitalization rate: All | Hospitalization rate: <6 years old | Length of stay: GIDs | Length of stay: All |
|---|---|---|---|---|
| | (1) | (2) | (3) | (4) |
| Post-tax years | | | | |
| Poor drinking water × 2015 | −0.329 | 0.562 | −0.096 | 0.168 |
| | (11.999) | (5.734) | (0.274) | (0.222) |
| Poor drinking water × 2014 | −0.660 | −1.850 | −0.226 | −0.036 |
| | (5.043) | (1.950) | (0.264) | (0.094) |
| Pre-tax years | | | | |
| Poor drinking water × 2012 | 1.536 | 0.187 | −0.311 | −0.119 |
| | (4.414) | (2.337) | (0.222) | (0.143) |
| Poor drinking water × 2011 | 4.691 | −0.473 | −0.204 | 0.871 |
| | (6.750) | (3.190) | (0.275) | (1.014) |
| Poor drinking water × 2010 | 5.815 | 0.555 | −0.508* | 1.571 |
| | (8.473) | (3.818) | (0.273) | (1.387) |
| Poor drinking water × 2009 | 5.866 | 1.807 | −0.275 | 0.958 |
| | (8.223) | (4.620) | (0.316) | (1.112) |
| Observations | 21,448 | 21,448 | 21,448 | 21,448 |
| R-squared | 0.685 | 0.657 | 0.399 | 0.628 |
| Mean dependent variable | 71.57 | 35.42 | 1.41 | 2.81 |
| Coefficient tests: | | | | |
| H0: β2014 = β2015 | 0.978 | 0.667 | 0.678 | 0.367 |
| H0: βk = 0 ∀ k = 2009, ..., 2012 | 0.955 | 0.975 | 0.001 | 0.675 |

Source: Authors’ analysis based on hospital discharge records for hospitals under direct SSA administration, surface-water quality measures collected by the National Water Commission (CONAGUA), and the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: This table shows the results on SSA hospitalizations from estimating equation (1) on a balanced panel of SSA hospital-quarters (766 hospitals × 28 quarters). The outcome in columns (1) and (2) is hospitalizations due to gastrointestinal diseases (GIDs) per 100,000, for the full population in column (1) and for children under 6 in column (2), winsorized at the 5 percent level. The outcomes in columns (3) and (4) correspond to the average length of hospital stays (calculated as the difference between discharge and admission dates) due to GIDs and all hospitalizations, respectively. Coefficients for the interaction of the indicator for areas with poor-quality drinking water and each year are shown, with 2013 as the excluded year. Robust standard errors clustered at the hospital level. The mean of the dependent variable for hospitals in areas with poor-quality drinking water prior to the tax is shown. ***p < 0.01, **p < 0.05, *p < 0.1

We assign piped-water access and surface-water quality measures to each hospital as before, resulting in a balanced panel of 766 hospitals × 28 quarters. We consider two outcomes. First, we use hospitalization rates due to GIDs, winsorized at the 5 percent level. We distinguish between all hospitalizations and hospitalizations of children under six years old, since they are the group most at risk of dehydration. Second, we calculate the average length of hospital stays for GIDs only and for all hospitalizations. The first measure captures the extensive margin, while the second focuses on the intensive one.
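The two transformations behind these outcomes can be sketched as follows. This is an illustrative sketch rather than the authors' procedure: the function names are hypothetical, and treating "winsorized at the 5 percent level" as clamping 5 percent of observations in each tail is an assumption, since the text does not say whether one or both tails are clamped.

```python
from datetime import date


def winsorize(values, pct=0.05):
    """Clamp the lowest and highest pct share of values to the nearest kept value.

    Assumes a non-empty list and symmetric clamping in both tails.
    """
    ordered = sorted(values)
    k = int(pct * len(ordered))  # observations clamped in each tail
    low, high = ordered[k], ordered[len(ordered) - 1 - k]
    return [min(max(v, low), high) for v in values]


def mean_length_of_stay(stays):
    """Average of (discharge - admission) in days over (admission, discharge) pairs."""
    days = [(discharge - admission).days for admission, discharge in stays]
    return sum(days) / len(days)


# Hypothetical hospitalization rates and hospital stays.
rates = winsorize(list(range(20)))
stays = [
    (date(2014, 1, 1), date(2014, 1, 3)),
    (date(2014, 1, 5), date(2014, 1, 6)),
]
print(rates[0], rates[-1], mean_length_of_stay(stays))
```

Winsorizing caps extreme hospital-quarter rates instead of dropping them, so the panel stays balanced while outliers carry less weight in the estimates.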
Our rationale for including all hospitalizations here is that more diarrheal disease may also affect the severity of illness for those who were hospitalized for reasons unrelated to GIDs.

Table S5.1 presents the results from estimating equation (1) on this dataset. Across all four columns (corresponding to the four outcomes outlined above), we find insignificant and small coefficients for the post-tax years. We also find mostly insignificant effects during the pre-tax years; the exception is length of stay for GIDs in column (3), where the joint test of zero pre-tax coefficients reports a p-value of 0.001. Figure S5.1a presents the corresponding event study plots, as in the main text.

Taken together, the findings in table S5.1 suggest that the SSB tax had no discernible effect on hospitalization rates and length of stay in areas with low access to tap water and bad surface-water quality. Although the increase in soda prices did lead to more GIDs, as evidenced by our main findings, these outbreaks seem to have been successfully controlled and contained at the outpatient level.

Figure S5.1. Event Study of the Effect of the Soda Tax on Gastrointestinal Disease (GID) Hospitalization Rates at Ministry of Health (SSA) Hospitals

Source: Authors’ analysis based on hospital discharge records for hospitals under direct SSA administration, surface-water quality measures collected by the National Water Commission (CONAGUA), and the 2010 census statistics obtained from the National Statistics Office (INEGI).
Note: These graphs show results on SSA hospitalizations from estimating an event study on a balanced panel of SSA hospital-quarters (766 hospitals × 28 quarters). The outcomes for the graph on the left are GID hospitalization rates per 100,000 for the whole population and for children under 6 years old, winsorized at the 5 percent level. The outcomes for the graph on the right are average lengths of stay for GID hospitalizations and all hospitalizations. Coefficients for the interaction of the indicator for areas with poor-quality water and each quarter for two years before and after the tax was introduced are shown, with quarter 4 of 2013 as the excluded period. Robust standard errors are clustered at the hospital level. Error bars show 95 percent confidence intervals. The mean of the dependent variable for hospitals in poor-quality water areas prior to the tax is 72 for hospitalization rates, 35 for hospitalization rates of children under 6, 1.4 for length of hospital stays for a GID, and 2.8 for length of all hospital stays.

References

Agüero, J. M., and T. Beleche. 2017. “Health Shocks and Their Long-Lasting Impact on Health Behaviors: Evidence from the 2009 H1N1 Pandemic in Mexico.” Journal of Health Economics 54: 40–55.
Aguilar, A., E. Gutierrez, and E. Seira. 2021. “The Effectiveness of Sin Food Taxes: Evidence from Mexico.” Journal of Health Economics, forthcoming.
Colchero, M. A., M. Molina, and C. M. Guerrero-López. 2017. “After Mexico Implemented a Tax, Purchases of Sugar-Sweetened Beverages Decreased and of Water Increased: Difference by Place of Residence, Household Composition, and Income Level.” The Journal of Nutrition 147 (8): 1552–57.
Colchero, M. A., B. M. Popkin, J. A. Rivera, and S. W. Ng. 2016. “Beverage Purchases from Stores in Mexico under the Excise Tax on Sugar Sweetened Beverages: Observational Study.” BMJ 352: h6704.
Colchero, M. A., J. C. Salgado, M. Unar-Munguía, M. Molina, S. Ng, and J. A. Rivera-Dommarco. 2015. “Changes in Prices after an Excise Tax to Sweetened Sugar Beverages Was Implemented in Mexico: Evidence from Urban Areas.” PLoS One 10 (12): e0144408.
Grogger, J. 2017. “Soda Taxes and the Prices of Sodas and Other Drinks: Evidence from Mexico.” American Journal of Agricultural Economics 99 (2): 481–98.