Demand for Information on Environmental Health Risk, Mode of Delivery, and Behavioral Change: Evidence from Sonargaon, Bangladesh

Millions of villagers in Bangladesh are exposed to arsenic by drinking contaminated water from private wells. Testing for arsenic can encourage switching from unsafe wells to safer sources. This study describes results from a cluster randomized controlled trial conducted in 112 villages in Bangladesh to evaluate the effectiveness of different test selling schemes at inducing switching from unsafe wells. At a price of about USD0.60, only one in four households purchased a test. Sales were not increased by informal inter-household agreements to share water from wells found to be safe, or by visual reminders of well status in the form of metal placards mounted on the well pump. However, switching away from unsafe wells almost doubled in response to agreements or placards relative to the one in three proportion of households who switched away from an unsafe well with simple individual sales.


Policy Research Working Paper 9194
This paper is a product of the Knowledge and Strategy Team, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at alessandro.tarozzi@upf.edu.

Introduction
Poor health stands out as a common feature of life in less developed countries (LDCs). Several factors contribute to the persistence of the problem, including poor availability and high cost of good quality health care, insufficient investment in prevention, and the frequent reliance on ineffective and sometimes unnecessarily expensive treatments (see Dupas 2012, Dupas and Miguel 2017, and Tarozzi 2016 for recent reviews). Information campaigns on health risks are sometimes seen as an appealing tool in environmental and health policy. This is because they can be relatively inexpensive to run when compared to other options such as investments in infrastructure or public health measures needed to eliminate the risk at its root. Some health conditions are in fact preventable if appropriate risk-mitigating behavior is adopted. However, governments in LDCs may lack the resources or the political will to carry out even simple information campaigns (let alone campaigns that provide reports specific to each household), and information alone is often not sufficient to promote positive changes in behavior.
This paper describes the results of a randomized controlled trial (RCT) carried out in the Sonargaon sub-district of Bangladesh to examine the impact of different ways of selling a contaminant test on risk-avoiding behavior. Households were offered tests that measure tube well water contamination with arsenic, a common occurrence in the area. The primary objective was to determine whether a novel mode of test delivery, leveraging within-village solidarity networks, could increase health-protective behavioral responses relative to the standard delivery of private information to well users. In a first group of 49 randomly selected villages, field tests were offered at a (subsidized) price of BDT45 (about USD0.60 at current nominal exchange rates, close to the price of one kg of rice in Dhaka), an amount estimated to be just enough to cover the salary of the surveyors hired for the project. 1 In an additional subset of 48 villages, surveyors received incentives to offer tests, at the same price of BDT45, to groups of buyers: group members were asked to sign an informal agreement according to which those with safe wells would share their well with others in the group whose well water was found to be unsafe. The agreement was not legally binding, but the prior was that it would increase rates of switching from unsafe sources through two mechanisms: first, by making sharing more likely through a form of soft commitment and, second, by facilitating the spread of information about the safety of wells, thereby making it easier to identify safe options within the village. While a large literature documents the importance of village networks to cope with shocks, including health shocks (see Fafchamps 2011 for a review), we are not aware of other work studying how informal networks can help in creating opportunities to reduce environmental health risk. 2
1 Throughout the paper, Bangladesh Takas (BDT) are converted into United States Dollars (USD) using a nominal exchange rate of 80BDT/USD, and a PPP exchange rate of 23.145, as indicated in World Bank (2015, Table 2.1).
2 In broadly related work, Goldberg et al. (2018) show that peer networks can be leveraged to improve screening for tuberculosis in Indian urban areas.
The study also examines the impact of a second mode of information delivery, in the form of metal placards attached to the well spout to convey test results. Budget limitations, however, only allowed the inclusion of 15 villages in this experimental arm, reducing statistical power. In these villages, individuals who purchased a test at the same price of BDT45 were also given a metal placard of a color depending on the arsenic level: blue for arsenic up to 10 ppb (parts per billion, or micrograms per liter), the World Health Organization guideline for arsenic in drinking water; green if above 10 and up to 50 ppb; and red if 'unsafe', that is, above the national government standard of 50 ppb. Similar metal placards have been used before in some testing campaigns (Opar et al. 2007), as a more durable alternative to the routine strategy (also adopted during the past nationwide testing campaign in Bangladesh) of applying to the well spout red or green paint that often becomes invisible within a year (Pfaff et al. 2017). Such visible indicators are a reminder about the status of specific tube wells with respect to arsenic and can spread this information throughout the village. In different contexts, other researchers have found large impacts of reminders on health-related behavior, for instance through the use of SMS messages, see Pop-Eleches et al. (2011) and Raifman et al. (2014).
However, the cost of the placards (about BDT80) is high enough to increase significantly the total cost of testing. It was therefore important to determine whether the placards made any difference relative to the alternative solution (adopted in the two experimental arms described earlier) of informing the household with a less expensive laminated card indicating the test result and encouraging the household to keep the card in the house.
Despite much progress on numerous health indicators (Chowdhury et al. 2013), Bangladesh remains in the midst of a severe health crisis due to the widespread presence of naturally occurring arsenic (As) in shallow aquifers (see Ahmed et al. 2006, Johnston et al. 2014, and Pfaff et al. 2017). The problem, due to the widespread presence in the country of geological conditions conducive to accumulation of arsenic in groundwater, is compounded by millions of households in rural areas relying on water from privately owned, unregulated shallow tube wells for drinking and cooking. Using nationwide data from 2009, Flanagan et al. (2012) estimated that, in a country of more than 150 million people, about 20 million were likely exposed to arsenic levels above the official Bangladesh standard of 50 ppb, while almost one third of the population was likely exposed to levels above the significantly lower WHO guideline of 10 ppb.
The most visible health consequences of chronic exposure to arsenic from drinking tube well water in South Asia, such as cancerous skin lesions and loss of limb, were recognized in the state of West Bengal, India, in the mid-1980s (Smith et al. 2000). Long-term studies in neighboring Bangladesh have since shown that arsenic exposure increases mortality due to cardiovascular disease, and may inhibit intellectual development in children and be detrimental for mental health (Wasserman et al. 2007, Argos et al. 2010, Rahman et al. 2010, Chen et al. 2011, Chowdhury et al. 2016). These health effects are accompanied by significant economic impacts: exposure to arsenic has been estimated to reduce household labor supply by 8% (Carson et al. 2011) and household income by 9% for every earner exposed (Pitt et al. 2015), while Flanagan et al. (2012) calculated that a predicted arsenic-related mortality rate of 1 in every 18 adult deaths represents an additional economic burden of USD13 billion in lost productivity alone over the next 20 years.
Piped water from regulated and monitored supplies would likely be the most effective policy answer, but such a solution would require immense investments in infrastructure that may not be sustainable or cost-effective for the foreseeable future, so that identifying short-term mitigation strategies remains essential. The consensus view is that household-level water treatment, dug wells, and rain-water harvesting are not viable alternatives for lowering arsenic exposure because of the cost and logistics of maintaining such systems in rural South Asia (Howard et al. 2006; Johnston et al. 2014; Sanchez et al. 2016). In contrast, despite being the main source of arsenic exposure, tube wells remain the most effective way of providing safe drinking water to the rural population of Bangladesh in the short to medium term (Krupoff et al. 2020). With the exception of the most severely affected areas of the country, the spatial distribution of high- and low-arsenic wells is highly mixed, even over small distances. At the same time, whether a well is contaminated with arsenic or not rarely changes over time (McArthur et al. 2010). Therefore, exposure among users of arsenic-contaminated wells can often be avoided by switching to a nearby safe well, be it a shallow private well or a deeper (which usually means safer) community well (van Geen et al. 2002; van Geen et al. 2003). Using data from Araihazar, a sub-district bordering the location of this study, Jamil et al. (2019) estimate that blanket testing campaigns that inform households about the arsenic contamination of all private wells were significantly more cost-effective at reducing arsenic exposure than the provision of piped water, or the construction of deep wells by the government.
Previous campaigns aimed at testing tube well water for arsenic have only partially succeeded at promoting risk-avoiding behavior, highlighting the need to devise novel strategies to achieve this goal. The nationwide BAMWSP blanket testing campaign tested close to 5 million wells using field kits, and identified them as 'safe' or 'unsafe', according to the Bangladesh standard of 50 ppb, by painting the well spout with green or red paint, respectively. Several studies have documented switching rates from an unsafe to a safe well after testing of between one-third and three-quarters, with higher switching rates in trials that provided information campaigns on arsenic health risks and repeat visits, in some cases with objective measures of exposure taken in the form of urine samples (Chen et al. 2007; Madajewicz et al. 2007; Opar et al. 2007; Bennear et al. 2013; Balasubramanya et al. 2014; Inauen et al. 2014; Pfaff et al. 2017). Despite these partial successes, a substantial fraction of households continues to use unsafe wells today, and it is thus important to identify mechanisms to increase risk-mitigating responses. In addition, millions of new wells have sprouted in the country, and in most cases users do not know the arsenic level of the water, because campaigns such as BAMWSP have not been replicated, and a market for tests barely exists. There are a few commercial laboratories in Dhaka with the capability to test wells for arsenic, but few rural households are aware of these services. 3 The cost of well testing is greatly reduced and the logistics are greatly simplified by the use of field kits, which have become increasingly reliable and easy to use (van Geen et al. 2014), but even these tests are rarely available in the villages.
In the context of this study, only about one in four households purchased a test, regardless of the offer type, despite the low (and subsidized) sale price and widespread awareness about the arsenic problem, and despite little prior awareness about the safety status of individual wells. This is consistent with a growing literature that documents low demand for health-protecting technologies in developing countries for a variety of such products, ranging from insecticide-treated nets (Cohen and Dupas 2010, Dupas 2014, Tarozzi et al. 2009) to de-worming drugs (Kremer and Miguel 2007) and water disinfectant (Ashraf et al. 2010). This work complements the literature on the demand for health-protecting technologies by looking at demand for health-related information that can be exploited by households to reduce risk. The focus here is on the offer of information that is specific to the buyer (the test measures arsenic contamination in the water from a specific well), in contrast to general information (for instance, on the likelihood of arsenic contamination, or the health risks associated with unsafe water). While this article studies demand for information on environmental factors, earlier work has considered the demand for information on health status, see in particular Cohen et al. (2015), Bai et al. (2017), Thornton (2008), and Gong (2015). These studies suggest that even among households willing to pay for information, behavioral responses may not be optimal from a public health perspective, so that it is important to study whether the mode of delivery of information can help achieve desirable policy objectives.
This article is related to Barnwal et al. (2017), who estimate a very steep demand curve for arsenic tests in Bihar, India, another location with a groundwater arsenic problem. That study found that uptake was 25% at a price of INR40 (about BDT49), which is about the same as what the present study estimates at a very similar price of BDT45. Unlike Barnwal et al. (2017), this study does not analyze how demand changes with price, but it examines the role of non-price factors on demand and behavioral responses to information. The results show that demand was not sensitive to the introduction of informal agreements or the use of placards, but conditional on demand, these nudges led to large and significant increases in switching among users of unsafe wells relative to simpler, private sales.
The paper proceeds as follows. The next section provides additional background information on the extent of the arsenic problem in the study area and describes the experimental design. Section 3 describes the data collection protocol, presents selected summary statistics, and shows that by chance the means of some covariates were not balanced at baseline, highlighting the importance of controlling for baseline characteristics in our estimates (the adjusted and unadjusted estimates remain qualitatively similar). Section 4 presents the conceptual framework that guided the study design and the interpretation of the results provided in Section 5. The cost effectiveness of the interventions is evaluated in Section 6. Finally, Section 7 concludes and highlights limitations of the results.

Program Description and Study Design
This study was carried out in Sonargaon, a 171 km² sub-administrative unit (or upazila) of Narayanganj

Study Area and Program Description
The study area for this trial was initially formed by the list of all 128 villages in Sonargaon with more than 10 wells and with a 40-90% share of unsafe wells based on the BAMWSP testing conducted years earlier. The lower bound was chosen to focus on areas where a sizeable fraction of new untested wells were likely to be unsafe, while the upper bound was designed to avoid areas where switching to safe wells was not likely to be a viable option for most households.
Between January and June 2016, surveyors conducted home visits to identify all wells in the selected villages, regardless of whether they had been tested before. Privately owned wells were linked to the household that owned them, while public wells were linked to the main caretaker or user. Almost all wells (98.6%) were privately owned, and for simplicity in the rest of the paper the term 'owner' will be used to refer to the household who owned the well, or to the household who was the primary user of community wells.
During the home visits, surveyors explained the risk of consuming arsenic-contaminated tube well water to an adult, typically the most senior woman, and offered to test the well for a fee.
When a test was purchased, tube well water was tested in the field using the Arsenic Econo-Quick (EQ) test kit, which has been shown to be reliable and can deliver results within ten minutes.
In a second group of villages (B), surveyors were asked to sell the tests to groups of buyers, rather than to individual households. When a well owner was identified, surveyors would propose the formation of a group of buyers of at least three and up to ten households, while individual sales were not allowed. 6 Surveyors would help group formation, for instance by proposing a sale to all households within the same compound (or bari), and then coordinating the inclusion of additional buyers via mobile phones. After a group was formed and an informal well-sharing agreement was reached by all group members, each household would pay BDT45. The agreement had no legal standing, and was meant to serve as a soft commitment device. Our study design called for an agreement in writing, but in practice most buyers were uncomfortable about signing a document, so in a large majority of cases a verbal agreement took place instead. All members were informed of the test results for all wells within the group.
5 In practice, demand (and thus testers' compensation) was lower than expected, see footnote 15.
6 Most groups included 7-10 buyers, although some had as few as three and some had 11.
In a third group of villages (C), households were again assigned to receive individual test offers at BDT45 (as in group A), but in the case of purchase a color-coded stainless steel placard was attached to the well's pump-head. Placards displayed both in text and color whether the arsenic concentration was up to 10 ppb (blue), 25 or 50 ppb (green), or above 50 ppb (red). As shown in Figure 1, the placards displayed two hands holding drinking cups, one hand holding a drinking cup, or a large cross over a hand holding a drinking cup, depending on the arsenic concentration.
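The color rule for the placards can be written as a small function. The following is only an illustrative sketch in Python: the function name `placard_color` is hypothetical, and the thresholds are the 10 ppb WHO guideline and the 50 ppb national standard stated in the text.

```python
def placard_color(arsenic_ppb: float) -> str:
    """Map an arsenic reading (in ppb) to the placard color described in the text."""
    if arsenic_ppb < 0:
        raise ValueError("arsenic concentration cannot be negative")
    if arsenic_ppb <= 10:
        return "blue"   # within the WHO guideline of 10 ppb
    if arsenic_ppb <= 50:
        return "green"  # above the WHO guideline, within the national standard
    return "red"        # 'unsafe': above the national standard of 50 ppb


if __name__ == "__main__":
    # Illustrative readings only, not study data.
    for reading in (0, 10, 25, 50, 100):
        print(reading, placard_color(reading))
```

Readings of exactly 10 ppb and 50 ppb fall in the lower category, matching the "up to" wording used in the text.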
The split of the test fee between the tester and supervisor in groups B and C was the same as in group A, but in B the project gave an additional bonus of BDT12 per sale to testers, to compensate for the additional effort necessary to coordinate group formation. 7

Power Calculation and Study Design
The trial was not registered, and was not based on a pre-analysis plan. Comparison of demand for and responses to tests between arms A and B was the primary objective of the study. Data from earlier work in the neighboring Araihazar sub-district (see Bennear et al. 2013) were used to estimate an intra-
Given that the available funding allowed the inclusion of a larger number of villages, it was possible to include the additional experimental arm C, for which, however, sample size was dictated by budget constraints rather than power calculations.
Villages were randomly assigned to treatment arms by the principal investigators, after stratification, using the statistical software Stata. Strata were determined by whether the share of unsafe wells in the BAMWSP testing campaign carried out years earlier was below or above the median, and by union (an administrative unit). 8 There were two deviations from the experimental protocol. First, while programming the mobile application used for data collection, 27 villages were assigned by mistake to a treatment different from the original one. The partial re-assignment of treatments was thus due to a programming error and not to the incorrect implementation of the protocol in the field. In addition, the checks for balance in covariates are very similar based on originally assigned or actual treatment (see below). For this reason treatment status is defined as actual treatment in the analysis.
7 The experimental design also included two exploratory arms, with only six villages each, where tests were sold individually either at a village-level price of BDT45 or BDT90, but with payment required only in case of 'good news', that is, in case of arsenic level no higher than 50 ppb. The inclusion of these proof-of-concept conditional sales was motivated by the aversion expressed by several members of focus groups to the idea of 'paying for bad news'. Sales conditional on the results may have thus increased demand (a prediction strongly supported by the observed purchase rates), although the conditional payment also generates a reduction in the (expected) price and a different selection into purchase conditional on beliefs about the safety of the well water. Because of these confounding factors and because of the very small number of villages assigned to these sales, the results are not discussed in detail in this article but they are available upon request from the authors.
The second deviation is that surveyors were unable to differentiate a village from the one adjacent to it in four cases. While data were collected from households in these four villages and the ones adjacent to them, only pairs of villages could be distinguished, and both villages in each pair received the same treatment. In the statistical analysis there are thus effectively 112 clusters divided into experimental arms A (49 clusters), B (48), and C (15). For simplicity, in the rest of the paper these clusters will be referred to as 'villages'. 9 It should be emphasized that, while surveyors completed a census of wells in study villages, our data do not represent a census of households. The choice to survey only owners-who were anyway a majority-was due to budgetary constraints, but an implication is that one cannot study whether the choice of the primary source of drinking water changed also among non-owners.
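The stratified assignment described in this section can be illustrated with a short sketch. The authors carried out the randomization in Stata; the Python version below is a stand-in written for illustration only. The function names, the toy village data, and the allocation pattern `arm_pattern` are all hypothetical, and the sketch does not attempt to reproduce the actual allocation of villages to arms A, B, and C.

```python
import random
from collections import defaultdict
from statistics import median


def stratified_assignment(villages, arm_pattern=("A", "A", "A", "B", "B", "B", "C"), seed=2016):
    """Randomize villages to arms separately within each stratum.

    villages: list of dicts with keys 'id', 'union', 'share_unsafe'.
    arm_pattern: hypothetical allocation pattern dealt out within each stratum.
    Returns a dict mapping village id to arm label.
    """
    rng = random.Random(seed)
    # Strata: union crossed with above/below the median share of unsafe wells.
    cutoff = median(v["share_unsafe"] for v in villages)
    strata = defaultdict(list)
    for v in villages:
        strata[(v["union"], v["share_unsafe"] > cutoff)].append(v["id"])

    assignment = {}
    for key in sorted(strata):
        ids = sorted(strata[key])
        rng.shuffle(ids)            # random order of villages within the stratum
        pattern = list(arm_pattern)
        rng.shuffle(pattern)        # random order of arms within the stratum
        for position, village_id in enumerate(ids):
            assignment[village_id] = pattern[position % len(pattern)]
    return assignment


def make_toy_villages(n=128, n_unions=9, seed=1):
    """Generate made-up villages with a 40-90% share of unsafe wells."""
    rng = random.Random(seed)
    return [
        {"id": i, "union": rng.randrange(n_unions), "share_unsafe": rng.uniform(0.40, 0.90)}
        for i in range(n)
    ]


if __name__ == "__main__":
    arms = stratified_assignment(make_toy_villages())
    print({arm: list(arms.values()).count(arm) for arm in "ABC"})
```

Fixing the seed makes the assignment reproducible, which is convenient when the allocation later has to be compared with the treatment actually delivered in the field.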

Data
A team of testers who had at least completed secondary education was recruited locally. During the home visits when sale offers were made, testers also administered a short household baseline survey and recorded information on sales and, in case of purchase, the result of the test. Additionally, surveyors recorded GPS coordinates of all wells and noted down whether there were any visible labels attached to the well indicating the status with respect to arsenic. The baseline questionnaire included a household roster, basic questions on socio-economic status, detailed questions on the well, and a number of questions related to knowledge and practice in relation to drinking water and arsenic risk.
Testers also recorded whether, according to the respondent, the arsenic status of the well water was safe, unsafe, or unknown. Information about a total of 12,606 wells was recorded and the household survey was completed for all but three well owners.
A limitation of the data is that, in case of group sales in arm B, surveyors did not keep accurate records of who belonged to which group. In other words, while the data indicate for which wells a test was purchased, one cannot study the characteristics of buyers belonging to the same group, or to what extent well sharing was actually taking place within each group as a consequence of the test results.
The endline survey was completed between August 2016 and January 2017. 10 The average time between baseline and endline surveys was 7.7 months, and 86% of households had their follow-up interview between seven and nine months after the baseline interview. During the endline survey, surveyors were instructed to return to the wells identified during the test sales and record whether the household owning the well was still using it as a primary source of water for cooking and drinking.
In case of a negative answer, the surveyor asked the respondent to accompany him/her to the actual source and would record the new GPS location and the presence of any visible indicators of arsenic status (for instance one of the metal placards distributed during our intervention). The surveyor would also ask the respondent about the perceived safety of the new source as well as about the primary reason for switching to a different source. Switching behavior was thus self-reported, but earlier work in the neighboring Araihazar sub-district found that switching behavior recorded in a way similar to this study was actually consistent with urinary arsenic concentrations, an objective biomarker of exposure (Chen et al. 2007). Unfortunately, the records on the location of the new source of drinking water do not allow a precise measurement of the extent to which switching was associated with a reduction in arsenic risk.
The smartphone's GPS sensor, with a precision of 10 m at best, cannot uniquely identify a specific well among those surveyed due to the density of wells within a typical village in Bangladesh. The GPS data were still used to estimate the distance from the old to the new well used for drinking. Table 1 shows selected summary statistics measured at baseline. Throughout this paper, unless otherwise noted, the analysis is restricted to the large majority of households (91%) that used their own well at baseline, as this is the sample for which baseline water source and post-intervention switching can be determined. The baseline source was not recorded for non-users, and during the endline survey they were only asked again if they used their own well for drinking and cooking. 11 All summary statistics except those on the first row of Table 1 are thus calculated for households who used water from their own well for drinking and cooking.
8 Unions are the third smallest administrative unit in Bangladesh, and are formed by several mouzas, which in turn are composed of two to three villages. The 128 study villages belong to nine unions.
9 The Appendix shows that the randomization led to large spatial variation in treatments, see Figure A.1.
10 Unlike the baseline survey, where the wages of surveyors and the supervisor were covered mainly from test fees, the cost of the follow-up survey was paid for by the project.
11 Wells not used for drinking were on average significantly shallower, and thus more likely to be contaminated with high levels of arsenic. Of the 1,193 wells not used for drinking, only 14 (about 1%) were believed to be safe by the owner.

Summary Statistics at Baseline
Household heads had low levels of educational attainment on average, and most households were poor, with only 17% of the houses having a concrete roof (an indicator of wealth), while the rest had tin or (in rare cases) mud roofs.
The average well owner in our study area lived in a village where 75% of the wells tested by BAMWSP between 1999 and 2000 were unsafe with respect to arsenic. Despite the BAMWSP blanket testing campaign, a large majority of respondents (76%) did not know whether their well was safe or unsafe with respect to arsenic. Only 7% of them thought that their well was unsafe, while the remaining 17% reported believing that their well was safe. Whereas about a quarter of respondents indicated that they knew the status of their well, more than 99% of wells lacked any visible sign of their status with respect to arsenic such as red or green paint. Information on the safety of the minority of wells that had been tested was thus not immediately observable by other households, although in principle knowledge could have been shared with others privately. Using geographic coordinates, we estimate that before our pay-for-test campaign the average well owner had about 0.02 wells labeled as safe within 50 m, out of an average of nearly 12 wells within that distance.
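Neighbor counts of this kind can be computed from GPS coordinates with a great-circle distance. The sketch below is an illustration in Python, not the authors' procedure: the haversine formula, the function names, the field names (`lat`, `lon`, `labeled_safe`), and the choice to exclude the well itself from its own count are all assumptions made here.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def neighbors_within(wells, radius_m=50.0):
    """For each well, count other wells (and other wells labeled safe) within radius_m.

    wells: list of dicts with keys 'id', 'lat', 'lon', 'labeled_safe'.
    Returns a dict mapping well id to (n_wells, n_labeled_safe).
    """
    counts = {}
    for w in wells:
        n_all = n_safe = 0
        for other in wells:
            if other["id"] == w["id"]:
                continue  # assumption: a well is not its own neighbor
            if haversine_m(w["lat"], w["lon"], other["lat"], other["lon"]) <= radius_m:
                n_all += 1
                n_safe += bool(other["labeled_safe"])
        counts[w["id"]] = (n_all, n_safe)
    return counts


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    toy = [
        {"id": "a", "lat": 23.6500, "lon": 90.6000, "labeled_safe": False},
        {"id": "b", "lat": 23.6503, "lon": 90.6000, "labeled_safe": True},
        {"id": "c", "lat": 23.6510, "lon": 90.6000, "labeled_safe": False},
    ]
    print(neighbors_within(toy))
```

The pairwise loop is quadratic in the number of wells, which is workable for a sample of roughly twelve thousand wells but would call for a spatial index on much larger data. With a GPS precision of 10 m at best, counts near the 50 m boundary are necessarily approximate.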
The immense public health challenge due to widespread arsenic contamination of well water has been widely discussed and advertised in Bangladesh, and this is reflected in the data. Almost all respondents replied 'yes' to the question "[h]ave you ever heard about arsenic in tube well water?" Similarly, all but a handful of respondents replied yes when asked "[a]re you aware of the health risks of drinking tube well water containing arsenic?" On average, wells were shallow (179 feet, or 55 meters) and about nine years old, consistent with many wells having been installed after the BAMWSP blanket testing. The average reported installation cost of wells in the sample was BDT7,560, or about USD100 (USD323 using the PPP exchange rate from World Bank 2015). Well depth is a key predictor of installation costs: in the data, the elasticity of cost with respect to depth is 0.72 (s.e. 0.04). The BDT45 price charged for the test in this study thus represented slightly more than one half of one percent of the installation cost.
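The currency conversions and the price-to-cost comparison in this paragraph are simple arithmetic, sketched below in Python for illustration. The exchange rates are those the paper reports using (80 BDT per USD nominal, 23.145 PPP); the function and variable names are hypothetical.

```python
NOMINAL_BDT_PER_USD = 80.0   # nominal rate used throughout the paper
PPP_BDT_PER_USD = 23.145     # PPP rate cited from World Bank (2015)


def bdt_to_usd(amount_bdt, bdt_per_usd=NOMINAL_BDT_PER_USD):
    """Convert an amount in Bangladesh Takas to US dollars at the given rate."""
    return amount_bdt / bdt_per_usd


def price_share_percent(price_bdt, cost_bdt):
    """Price as a percentage of a reference cost."""
    return 100.0 * price_bdt / cost_bdt


if __name__ == "__main__":
    test_price_bdt = 45.0       # price of one arsenic test
    install_cost_bdt = 7560.0   # average reported well installation cost

    print("test price, nominal USD:", bdt_to_usd(test_price_bdt))
    print("installation cost, nominal USD:", bdt_to_usd(install_cost_bdt))
    print("installation cost, PPP USD:", bdt_to_usd(install_cost_bdt, PPP_BDT_PER_USD))
    print("test price as % of installation cost:",
          price_share_percent(test_price_bdt, install_cost_bdt))
```

The test price comes out at a little under 0.6% of the installation cost, in line with the "slightly more than one half of one percent" stated in the text. Small differences between these outputs and the dollar figures quoted in the text may reflect rounding of the inputs.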
Well-sharing was already common in the study area: while the average household had fewer than four members, the average number of individuals using water from a well for drinking was 8.8, and in more than half of the sample wells the number of users was larger than household size. 12

Variable Balance across Experimental Arms
Column 5 of Table 1 shows the p-value for the null hypothesis of equality of means across the three treatment arms. The null is rejected in 5 of 26 cases at the 10% level. The differences are due to chance, and because baseline data were collected while offering well tests, balance on variables measured at baseline could not be enforced through stratification or re-randomization. Some of the differences are substantively important. While overall 19% of household heads in the study area had no schooling, this number drops to 5% in treatment C and is close to 26% in arm A. The fraction of respondents who did not know the status of their well ranged from 68% in arm A to 90% in arm C, while the fraction of wells perceived as safe ranged from 4% in C to 22% in A. Both the group-specific means and the tests of significance are very similar if the estimation is repeated using treatment as initially randomized rather than actual treatment (the two sometimes differ because of errors in programming the smartphones used for surveying). 13 The overlap in the distribution of covariates between arms can be gauged more systematically following the approach described in Imbens and Rubin (2015, Ch. 14). First, for each covariate in Table 1, differences between means for each pair of experimental arms normalized by a measure of variance are estimated, see Appendix A.1 for details. While the usual t-statistics used to construct the tests for balance have a denominator that shrinks to zero when sample size grows large (because the standard errors become smaller), this is not the case for the normalized differences, where the denominator is the simple average of two arm-specific standard deviations. Imbens and Rubin argue that these latter statistics are more relevant than the t-statistics for assessing whether simple adjustment methods such as controlling for covariates or matching estimators can adequately remove bias due to covariate imbalance.
One can also calculate, for each pair of arms, a 'multivariate difference' estimated with a Mahalanobis distance that aggregates all the individual differences. Although Imbens and Rubin do not propose formal tests based on these statistics to gauge balance, they argue that balance is excellent in an empirical illustration where all standardized differences are smaller than 0.3 and the multivariate measure is 0.44. In contrast, simple regression adjustments are deemed to be likely inadequate to eliminate bias in cases where some standardized differences are larger than 0.50 and the multivariate measure is 1.5 or above.
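As a rough illustration of why normalized differences, unlike t-statistics, do not grow mechanically with sample size, the sketch below computes both for a single binary covariate. It uses the textbook Imbens and Rubin (2015) form of the normalized difference, with the square root of the average of the two arm-specific sample variances in the denominator, which may differ in detail from the definition in Appendix A.1. The data and the function names are hypothetical and are not taken from the study; the multivariate (Mahalanobis) measure is not covered.

```python
import math
import statistics


def normalized_difference(x_a, x_b):
    """Difference in means scaled by the square root of the average variance."""
    mean_gap = statistics.fmean(x_b) - statistics.fmean(x_a)
    pooled_sd = math.sqrt((statistics.variance(x_a) + statistics.variance(x_b)) / 2)
    return mean_gap / pooled_sd


def t_statistic(x_a, x_b):
    """Difference in means scaled by its standard error (unequal variances)."""
    mean_gap = statistics.fmean(x_b) - statistics.fmean(x_a)
    std_err = math.sqrt(
        statistics.variance(x_a) / len(x_a) + statistics.variance(x_b) / len(x_b)
    )
    return mean_gap / std_err


# Hypothetical indicator (1 = household head has no schooling) in two arms.
arm_a = [0, 0, 1, 0, 1, 0, 0, 1]
arm_c = [0, 0, 0, 0, 1, 0, 0, 0]

# Same composition, ten times as many observations per arm.
arm_a_large = arm_a * 10
arm_c_large = arm_c * 10

nd_small = normalized_difference(arm_a, arm_c)
nd_large = normalized_difference(arm_a_large, arm_c_large)
t_small = t_statistic(arm_a, arm_c)
t_large = t_statistic(arm_a_large, arm_c_large)

print(f"normalized difference: n=8 {nd_small:.3f}, n=80 {nd_large:.3f}")
print(f"t-statistic:           n=8 {t_small:.3f}, n=80 {t_large:.3f}")
```

With the composition of each arm held fixed, the normalized difference stays roughly stable as the sample grows, while the t-statistic increases in absolute value because its denominator shrinks.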
The normalized differences, reported in Appendix Table A.1, show that overall there is good balance between arms A and B: there is no variable for which the standardized difference is larger than 0.3, and the aggregate measure of balance is 0.604. In contrast, lack of balance is more problematic when one compares either arm A or B to arm C. In comparing A and C, consistent with the formal tests of equality, the differences are particularly large for schooling of the head and beliefs about well safety, with standardized differences larger than 0.5 in absolute value. The multivariate difference is also relatively large and equal to 1.1. The comparisons between B and C also show that four of 22 standardized differences are larger than 0.3, with a multivariate difference equal to 0.720.
Because there is lack of balance in some characteristics such as beliefs or schooling that may affect behavior, results that control for observed covariates will also be analysed. The estimates are qualitatively robust to such inclusion, although the point estimates are in some cases affected, and the standardized differences described above suggest that some caution should be exercised in particular when making comparisons that involve group C.
Attrition is analyzed at the bottom of Table 1. Overall, 8.8% of households could not be matched to the endline data, either because of true attrition (6 percentage points) or because errors in identifiers, which appear as duplicates in the data, did not allow the match. The null of equality among the three arms is not rejected at conventional levels for any of the attrition measures.

Conceptual Framework
Before discussing the results, it is useful to consider the main factors likely to influence purchase choices and, conditional on test results, risk-mitigating behavior. This section does not describe a formal model but rather offers a simple conceptual framework to interpret the results in terms of expected differences between experimental arms and in terms of mechanisms.
Willingness to pay for a test likely requires the existence of three conditions: first, that the test provides new information; second, the perception that there are health and/or economic costs associated with continued use of arsenic-contaminated water; third, that in case of 'bad news' there will be mitigation strategies available (e.g., the possibility of switching to a nearby safer well). All these conditions were present in the empirical context of this study.
The first condition is satisfied by the large majority of respondents (76%) who did not know whether their well water was safe or unsafe to drink (Table 1). In addition, very few wells had visible signs of safety status such as paint, so even households with strong priors about the safety of their well water may have valued the possibility to demonstrate water quality to others by displaying test results. 14 That the tests could provide new information also required trust, as there is growing evidence that lack of trust in health-related information may hinder the adoption of behavior that could reduce health risks (Cohen et al. 2015, Bennett et al. 2017, Alsan and Wanamaker 2017, Martinez-Bravo and Stegmann 2019). Although the data do not include measures of trust, water tests were not a novelty because many wells had been tested in the past in Sonargaon. In addition, earlier work carried out in the neighboring Araihazar sub-district found switching rates from unsafe to safe wells after testing of between one-third and three-quarters (see Section 1 for references), consistent with a high degree of trust in the results.
The second condition (relevance of the information) is also satisfied because virtually all respondents knew in a general sense about the presence and health risks of arsenic in tube well water. Data on risk perceptions collected in neighboring Araihazar in 2008 indicated that a majority of respondents were aware not only of the serious nature of arsenic risk, but also that the risk becomes more severe with prolonged exposure, see Tarozzi (2016, Figure 4) for details.
Finally, the third condition (perceived availability of alternative sources of drinking water) was also likely satisfied given that well-sharing was already practiced in the area (see Section 3). To some extent households were therefore already aware of the possibility of using neighbours' well water for drinking. Data from neighboring Araihazar show that the BAMWSP testing campaign conducted about 10 years earlier led to a substantial degree of well sharing (Balasubramanya et al. 2014). On the one hand, the figures in Table 1 show that about three quarters of wells had been found to be unsafe by BAMWSP in the study area, and this may have reduced the perceived chance to have safe options nearby in case of bad news. On the other hand, the average household had more than 10 other wells within a short 50 m radius and about 30 within 100 m, and such a dense network of wells likely increased the perceived likelihood of having safe wells close by.
Overall, differences in switching rates between any two experimental arms could have emerged either from different selection into purchase or from the way information was provided (in which case even identical buyers may have reacted differently). The decision to change source was still likely to depend primarily on any new information made available by the test results. Hence, regardless of the experimental arm, very little switching was expected to take place from untested wells (driven perhaps by 'free riders' who moved to wells nearby found to be safe), and even less from wells that were found to be safe. The prior was also that the difficulty in predicting arsenic contamination without a test would mean that the likelihood of having an unsafe well, even if conditional on purchase, would be uncorrelated with the mode of sale and thus similar between groups. Finally, conditional on finding out that one's water is unsafe, the expectation was that the soft commitment and the easier access to within-group information on safe alternatives (in group B), and the salience and visibility of the tags posted on wells (in group C), would lead a larger proportion of households to stop drinking from the tested well relative to the individual sales in group A.
14 In principle, such value need not be positive, for instance if knowledge of a high-arsenic well is perceived as lowering land value, or signaling poor health among household members, or is more generally stigmatized. In Bihar, India, Barnwal et al. (2017) find that placards indicating unsafe arsenic levels were more likely to be removed by households than those indicating low levels of arsenic two years after installation, although such behavior may have also been justified by the desire not to be reminded constantly about the health risks. However, earlier research has shown that households very rarely refuse testing when this is offered for free, even when the results are posted on the wells (see for instance
In contrast, priors about the switching rates in B relative to C were not as clear. However, the effectiveness of C at inducing switching from unsafe wells rested at least in part on households not removing the metal plates from the pump heads, thereby maintaining the ability of the plate to make safety status visible and to discourage drinking from unsafe wells.
Regardless of the sale method, switching from unsafe wells should be more likely when safer alternatives are available nearby. If households recognize that health risks increase with the arsenic concentration in the water, then switching from unsafe wells should increase with the arsenic level, a desirable pattern that has been observed elsewhere, see Madajewicz et al. (2007). The method of information delivery may also affect the choice of the new source conditional on switching. In particular, the labeling of wells in C could make safe wells easier to identify for the whole village relative to B, possibly leading to the choice of safer wells. Finally, it is possible that buyers that were not using their well for this purpose at the time of the sales started doing so if they learned that the water was safe.
Predictions for differences in demand between arms were not as sharp: while factors leading to higher willingness to react to information were expected to also lead to higher willingness to pay for it, key factors such as perceived health risks and availability of alternative sources were likely to become more salient after the realization of the test result. However, consistent with the conceptual framework described above, households that reported knowing the safety of their well water were expected to be less likely to purchase a test.

Results
This section first describes the estimated effect of the selling schemes on the demand for testing. Next, it describes the information on arsenic levels that was revealed by the tests, and finally it discusses to what extent such information changed household behavior in terms of choice of water source for cooking and drinking. In describing the results the focus is primarily on households who used the well as primary water source for cooking and drinking at baseline, given that, for households who did not, the baseline records do not indicate what the main source was.

Demand
Of the 11,410 households who used their own well for cooking and drinking at baseline and who were offered an arsenic test, 2,829 (25%) bought a test under one of the selling schemes. To estimate the average treatment effect of selling schemes B and C relative to A, the following equation is estimated using a linear probability model:
$$buy_{svh} = \beta_B B_v + \beta_C C_v + X_{svh}'\gamma + \delta_s + \varepsilon_{svh} \qquad (1)$$
where $buy_{svh}$ is equal to one if household $h$ in village $v$ and stratum $s$ bought a test at baseline and zero otherwise, $B_v$ and $C_v$ are village-specific indicator variables for the respective treatments, $X_{svh}$ is a set of predetermined household and tube well characteristics (with coefficient vector $\gamma$), and $\varepsilon_{svh}$ is an error term. To account for the stratified design, regressions also include strata fixed effects ($\delta_s$). Recall that treatment was stratified by the prevalence of unsafe wells based on BAMWSP data and by union. The coefficients of interest are $\beta_B$ and $\beta_C$, which capture the causal impact on demand of selling schemes B and C, relative to A, respectively. All standard errors and statistical inference are robust to the presence of intra-village correlation of residuals.
Figure 2 shows graphically the simple comparison of take-up rates across arms without the inclusion of controls or strata fixed effects. A first clear result is that neither the group sales nor the addition of the metal placard made any appreciable difference for demand. A second finding is that demand was overall quite low, with about one quarter of households purchasing the test in each of the three experimental arms. As in many earlier studies looking at demand for health-related preventive products, even a small fee led to low demand, despite the potentially vital information provided by the tests. 15
Table 2 displays the corresponding regression results. Column 1 shows that, as expected, the small differences in demand between arms A, B and C are not statistically significant. Column 2 shows that the results are quite robust to the inclusion of controls.
Because missing values in one or more of the controls lead to the loss of about 20% of observations, in column 3 the model is estimated without controls but using only the observations with complete data used in column 2. In this case, $\hat{\beta}_B$ barely changes, consistent with the overall good balance between arms A and B suggested by the Imbens and Rubin approach. In contrast, $\hat{\beta}_C$ doubles in magnitude from 3 to 6 percentage points (s.e. 0.034) and it becomes significant, although only at the 10% level. Recall that arm C appears to be different mostly because, on average and relative to the other arms, (a) household heads had better education and (b) the fraction of wells whose safety was unknown at the time of the test sales was higher and the fraction believed to be safe was lower. In column 2 it can be seen that low schooling predicts lower demand, while the higher prevalence of wells of unknown safety is positively associated with it, conditional on other observed characteristics. Both these factors suggest that the omission of controls may have biased demand upwards in arm C, although the point estimates remain very close.
15 The low demand also raises concerns on the sustainability of a test-for-fee selling scheme at this price. On an average work day, surveyors visited 25 well owners to offer As tests (with surveyor-specific averages ranging from 12 to 36 visits), but only sold six tests (with surveyor-specific averages ranging from 3.5 to 10.8 tests). Recall that the test price was chosen so that surveyors would earn a wage similar to that earned in the neighboring district of Araihazar for similar work by selling 15 tests a day. The low demand thus implies that the actual wage fell short of the expected one.
The coefficient estimates for the controls in column 2 cannot be interpreted causally, but it can be noted that beliefs about the safety status of the well strongly predict demand: well owners thinking that their well is safe had little to gain from buying a test, and indeed they were 12 percentage points less likely to purchase the test (p-value< 0.01). The belief that the water was unsafe also decreased the probability of purchase, although by less than half as much (β = −0.049, p-value< 0.01). This is overall consistent with the conceptual framework, where it was highlighted that a key factor for willingness to pay for the test is that its result will provide new information.

Test Results
Although the purchase rate was low, the intervention generated a large increase in the number of tested wells in Sonargaon. Before looking at the responses to the information made available by the tests, it is useful to first describe such information. The test results are summarized in Table 3, which also includes the detailed figures on switching behavior that will be described later, and so the statistics are calculated for the 10,412 households (91.3% of the total) that could be tracked in the endline survey.
Of these, 2,417 purchased tests during the intervention.
Overall, 19% (455/2417) of the tested wells which had been used for cooking and drinking at baseline had unsafe arsenic levels relative to the national standard of 50 ppb. 16 Recall that these results are conditional on demand so that the randomization across treatments does not guarantee similar distributions across arms, even in large samples. However, the distribution of arsenic was overall similar between arms A and B, the two largest arms. Arm C had more unsafe wells (27%, versus 19 and 16% in arms A and B, respectively), although the null of equality among these three arms cannot be rejected at standard levels (p-value= 0.32). Consistent with the existence of a degree of awareness about arsenic risk, at baseline group C was by far the one with the smallest fraction of respondents thinking that their well was safe, although the fraction believing that the well was unsafe was fairly similar between groups, see Table 1. The larger share of unsafe wells in arm C may thus have been the result of lack of balance at baseline arising by chance, possibly due to the small number of clusters (15) in this treatment arm.
Overall, at the time of the endline survey, of all the wells found to be unsafe, 30% had at least one well identified as safe within 25 m after the testing, 57% had at least one within 50 m, and 78% had at least one within 100 m. This confirms that, in principle, switching to a nearby safe well was a feasible strategy to mitigate arsenic risk for the large majority of households. In addition, and consistent with the similarity across arms in the prevalence of purchases and unsafe results, the different testing strategies produced very similar frequencies of safe alternatives in the vicinity of high-arsenic wells. At distances of 25, 50, and 100 meters, such frequencies ranged across arms from 27 to 33%, from 55 to 60% and from 73 to 85%, respectively, and the null of equality is never rejected at standard levels. Note also that these figures may underestimate substantially the potential role of switching to reduce arsenic risk, given that they do not take into account the likely presence of safe wells that were not tested.
16 The prevalence of unsafe wells was much lower than the 40-90% observed at the time of the BAMWSP testing campaign, about 10 years earlier. This is consistent with a degree of learning over time about local arsenic risk and how to avoid it, in particular by digging deeper wells, more expensive but perhaps made more affordable by economic development. Indeed, the data indicate that a majority of wells were of recent construction and were deeper than the older ones. The data also show that the beliefs about the safety of their well water among the minority of respondents who reported to know it were strongly correlated with actual test results. Detailed results for these findings are available upon request from the authors.
Figure 3 shows the raw switching rates (and corresponding confidence intervals) observed in each experimental arm, without any control or strata fixed effects. Corresponding regression results are shown in Table 4. Consistent with the conceptual framework, little well switching was found among households who did not buy a test, and among those who found that the well they used for drinking was safe. The estimates in Table 3 show that barely anyone moved away from a well identified as safe (12/1,962), while less than 3% of untested wells (224/7,995) stopped being used for drinking. The two bottom bar charts in Figure 3 show that for these two groups the switching rates were similarly very small in all arms.

Responses to test results
The primary outcome of interest of this study was the response of users of unsafe wells, but because such responses are conditional on the choice to purchase a test and on the test result, the intent-to-treat (ITT) estimates are presented first, to describe unconditional switching rates. In Figure 3.1 it can be seen that while standard individual sales (A) led 3.7% of households to switch water source, the fraction was 4.3% with group sales (16% higher) and 6.4% (73% higher) when metal plates were attached to the well spout in case of purchase. The regression results are displayed in columns 1-2 of Table 4, where the estimated models are as in equation (1). When controls are included (column 2), the difference in switching rates between B and A is 0.01 (95% C.I. [−0.012, 0.03]), while the difference between C and A is 0.03 (95% C.I. [0.003, 0.065]). Consistent with Figure 3.1, both estimates suggest that group sales and especially placards increased switching rates relative to individual sales, but the point estimates are small, and the null of equality is rejected, at the 5% level, only for arm C.
Overall, the unconditional switching rates were small in each arm, reducing the statistical power when making between-arm comparisons. This is in large part because about three quarters of households did not purchase a test, and a large majority of tested wells were found to be safe. Contrary to expectations, higher arsenic levels among unsafe wells do not predict more switching. Using 100 ppb as the omitted category, the coefficients on dummies for the arsenic level being equal to 200, 300 or 500/1000 ppb are actually negative and in some cases very large and statistically significant. This finding is consistent with most households gauging safety primarily in a binary way, an unfortunate possibility given that in reality arsenic health risk is to first order proportional to arsenic concentration. 17 Figure 4 shows indeed that overall switching rates were well approximated by a step function with a jump at 100 ppb.
Note also that very high arsenic levels were not rare in the sample: although less than 30% of tested wells were unsafe, more than half of these had arsenic levels above 100 ppb.
The estimates in column 6 also include as regressor a dummy for the (endogenous) presence of a safe well within 50 meters, where a neighboring well is defined as safe when it was identified as such by the research team. Recall that, at baseline, very few wells could be identified as safe by visible signs such as placards or paint on the well spout. In this model some observations are lost due to implausible entries for the geo-location of some wells. Controls are also included for the total number of wells in a 50-meter radius, and the dummy for safe wells is interacted with the treatment indicators. Among owners of unsafe wells in arm A, as expected, having a safe alternative nearby increases switching. The coefficient is large (27 percentage points) and significant at the 1% level. In group B, this association is weaker, given that the interaction (= −0.19) is negative and its magnitude is about two-thirds of that observed in arm A, although it is estimated imprecisely and is thus not significant at standard levels. This is consistent with informal group agreements leading some households to share wells with other members, with less concern for geographical distance, something which may have happened if geographical proximity was a poor proxy for sorting into the same risk-sharing group. This remains, however, a conjecture, given that the data do not allow determining with certainty if the well being used at endline belonged to a group member. The interaction between distance to a safe well and the treatment C dummy is again negative (= −0.08) but smaller and not significant at standard levels.
Of the 217 'switchers', almost all (214) listed safety concerns as the primary reason for their decision. Although the arsenic level in the new source of drinking water for switchers cannot be determined, about one third of these (79/217) had switched to a different well which was itself perceived by the respondent as being unsafe, while 88 had switched to a well reported as being safe, and the remaining 50 households did not know the status of the well. In principle even a switch to an unsafe well, if the new well is safer, can reduce exposure to arsenic, but this finding suggests that in the study area a degree of arsenic exposure remained even among a sizeable fraction of households who reacted to the new information by switching to a different water source for drinking and cooking.
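The counts reported so far can be combined into a back-of-the-envelope accounting of the pooled unconditional switching rate. The sketch below is only an arithmetic check: it pools the three arms, which is an aggregation made here and not one reported in the text, and it assumes that the 217 switchers from unsafe wells refer to the same tracked sample as the 455 unsafe wells in Table 3. Variable names are illustrative.

```python
# Counts quoted in the text for households tracked at endline.
tracked_households = 10_412
tested_wells = 2_417
unsafe_wells = 455
safe_wells = 1_962
untested_wells = 7_995

switchers_unsafe = 217    # stopped drinking from a well found to be unsafe
switchers_safe = 12       # moved away from a well found to be safe
switchers_untested = 224  # moved away from an untested well

# Internal consistency of the sample split.
assert safe_wells + unsafe_wells == tested_wells
assert tested_wells + untested_wells == tracked_households

total_switchers = switchers_unsafe + switchers_safe + switchers_untested
pooled_itt_rate = total_switchers / tracked_households
conditional_rate_unsafe = switchers_unsafe / unsafe_wells

print(f"share tested:                      {tested_wells / tracked_households:.1%}")
print(f"share unsafe among tested:         {unsafe_wells / tested_wells:.1%}")
print(f"switching rate, unsafe wells only: {conditional_rate_unsafe:.1%}")
print(f"pooled unconditional switching:    {pooled_itt_rate:.1%}")
```

The pooled unconditional rate that results lies within the range of the arm-specific ITT rates quoted above, and it is an order of magnitude smaller than the switching rate conditional on an unsafe test result.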

Mechanisms
These results confirm the conceptual framework, according to which group signing or metal placards would lead to more switching relative to privately provided information. This section provides evidence to support possible mechanisms behind the findings, although the arguments are tentative as it is not possible to separate conclusively the relative role of the increase in the information about alternatives versus the soft commitment (in arm B) or the added salience of the placards (in arm C). There are two key limitations. First, the data include respondents' beliefs about the well used by their household, and also include the arsenic levels of all tested wells, but the latter was not necessarily known to households.
As a consequence, one cannot measure how each intervention changed the whole information set for each household. Second, in arm B the data do not include group composition. It is thus possible to examine neither the nature of the specific groups, nor whether households whose water turned out to be unsafe were being allowed to drink water from wells belonging to other members of the same group.
Despite these limitations, the data suggest that the added salience provided by the placards in arm C played a role in explaining the higher switching rates. In principle, owners who did not want to be reminded of their well water being unsafe, or did not want the information to be known to others, could have removed the placards, although detaching the metal wire holding them to the pump head would have required some effort. However, this behavior was rare. At the time of the return visits, the vast majority of the 348 placards installed on the well spout at the time of the test were still in place, regardless of their color. Of the 95 red placards installed on unsafe wells, 90 were still visible, while no placard was visible in two wells and a 'black' placard (perhaps a data entry error) was found on the remaining three. Almost all blue and green placards remained similarly in place during the study period. This suggests that the testing campaign led to a persistent increase in the salience and visibility of information in villages included in arm C. This result stands in contrast with Barnwal et al. (2017), who found that placards indicating unsafe arsenic levels in Bihar, India, were significantly more likely to be removed by households, although such actions were observed two years after installation, a much longer time interval relative to the average of eight months in this study.
The data also indicate that the placards were more effective than result cards only (as in arm A) at reminding users of the arsenic status of their well water, again suggesting increased salience. At the time of the endline survey, almost 90% of buyers correctly reported whether the water was found to be unsafe, but while learning about unsafe water was similar in arms A and B, it appeared to be better in C, consistent with the role of placards as reminders. Among respondents whose well water was found to be unsafe, the fraction who correctly identified it as such was 83% in arm A (135/162), 88% in B (122/139), and a remarkable 98% in arm C (83/85).
There is also evidence that the placard allowed switchers to make better choices, while group sales, despite inducing more switching relative to individual sales, may have induced households to share wells within the group despite the existence of better options outside of the group. Of the 217 households who stopped drinking from their unsafe well, only 88 (41%) had switched to a well that they believed to be safe. Looking at this by arm, the fraction was 47, 27, and 53% in arms A, B and C, respectively. It is possible that switchers from unsafe wells in B started drinking water from wells safer than the original one (unfortunately it cannot be checked if this was the case), but that in C switchers were almost twice as likely as in B to change to a safe well suggests that the placards played a role in allowing better choices. This is also consistent with data on the distance to the new well. Recall that when a respondent reported a change in the main source of water for drinking since baseline, the surveyor would ask to be accompanied to the new source, whose GPS location would then be recorded.
Unfortunately, such GPS records were clearly incorrect for about 40% of the 217 switchers (this was evident because the new source was located too far from the household residence, usually even outside the village borders). Among the 124 observations with likely correct records, distances from the new source were almost identical in arms A and B (on average 68 and 80 meters, respectively, with the p-value of the difference = 0.518), while the distance was substantially larger in C (190 meters, with the p-value of the difference with respect to A < 0.01). 18 In sum, the data suggest that placards (C) likely made households more aware of the risks associated with drinking from their own unsafe wells and allowed better choices, sometimes at the cost of longer distances traveled to fetch drinking water. In contrast, there is less that can be said about the mechanisms that made group sales (B) relatively successful, although the key factors delineated in the conceptual framework are consistent with the results.

Responses Among Non-Users
The results discussed so far are related to the large majority of households who used their well for cooking and drinking at baseline. Perhaps not surprisingly, demand was significantly lower among 'non-users' (12%, vs. 25% among users), and for these households there is no record of their main source of water for drinking. Switching behavior is thus harder to analyze, also because the sample is small. However, our data allow us to determine that many of these households reacted to 'good news' by switching to the well they were not initially using. In the sample, 139 non-users purchased a test, and of these 126 (91%) were re-interviewed at endline. Of these, exactly half (63) found out that their well was safe (As ≤ 50), and all but 3 reported that they were using the well for drinking and cooking at the time of the endline. 19 In contrast, only nine of the 63 with unsafe wells reported that they were using the well. However, for them it is not known if the well they were using at baseline was found to have an arsenic level even higher than their own.
18 This finding also suggests that the results are not driven by courtesy bias in reporting switching behavior. In principle, safety information provided publicly (as in arm C) or to a group (as in B) may have induced some respondents to over-report switching behavior if this was perceived as socially desirable. If the higher switching rates in B and C relative to A had been driven by courtesy bias, one would have expected to observe shorter distances in the two former

Cost Effectiveness
This section evaluates the cost-effectiveness of the different sale strategies. Because the RCT did not include an arm with free provision (unlike earlier studies that only included free provision, see for instance Madajewicz et al. 2007 or Bennear et al. 2013), the merit of free provision as compared to our sales strategy is gauged by assuming a range of switching rates consistent with earlier studies. 20 In addition, recall that the change in arsenic contamination for 'switchers' cannot be estimated reliably, given that the data only include the arsenic level of the initial source. Consistent with figures from this study, the calculations assume that tests cost USD0.30 (or BDT24 using an exchange rate of 1USD/80BDT), and that personnel is paid BDT45 per test delivered, with an additional bonus of BDT12 for group sales. An amount of BDT80 per test is added in arm C, to account for the cost of the placards. Consistent with the experimental results, a take-up rate of 25% is set in arms A, B and C, while consistent with results from earlier blanket testing a 100% testing rate is used when tests are offered for free. Again using estimates from the RCT, the calculations use a 30% switching rate among users of unsafe wells in arm A, while for arms B and C they use the estimates adjusted for controls and strata fixed effects in column 4 of Table 4. That is, switching rates are assumed to be 0.49 in arm B and 0.70 in arm C. In the case of free provision, switching rates are varied from 0.3 to 0.75, consistent with findings from earlier work that evaluated switching after free provision. In the study area, the fraction of unsafe wells varied in the 16-27% range, while the earlier BAMWSP figures in these same villages varied from 40 to 90%. To cover a wide range of possibilities, calculations are shown using a fraction of unsafe wells that is either low (20%), medium (40%) or high (80%).
The results are summarized in Table 5, assuming that a policy maker is deciding how best to allocate a total and fixed budget of USD10,000.
While free provision maximizes switching opportunities within a given locality, charging a fee allows for a larger coverage, at the cost of reducing uptake among those with low willingness or ability to pay.
Given the fixed budget, the total number of tests ranges from a maximum of 33,333 with individual sales, to a minimum of 7,692 with sales of tests supplied with a metal placard, so that the placards make arm C even more expensive (per test) than free provision without placards. Under the simplifying assumption that the probability of uncovering an unsafe well does not depend on the mode of supply, these figures imply that individual sales (A) would be the strategy that maximizes the number of unsafe wells uncovered, followed by group sales (B) and free provision, while sales with placard (C) would be the worst in this respect. However, the relative performance of the strategies changes once the different switching rates are taken into account. In particular, given the high switching rates observed in B and C, and the relatively low cost of group sales (B), it is group sales that maximize the number of unsafe wells that cease to be used for drinking. Individual sales (A) are second-best, followed by either free provision (under the high-switching scenario) or sales with placards (C), while free provision is the worst under the assumption of switching rates as low as those observed in arm A. Note that the relative performance of the different strategies does not depend on the prevalence of unsafe wells. In contrast, the average cost 'per switch' from an unsafe well decreases with the fraction of unsafe wells, as this leads to an increase in the number of households who may benefit from switching, while the fraction that does is not affected under our assumptions.
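As a rough check on the figures above, the following Python sketch recomputes how many tests the fixed budget can buy under each strategy and ranks the strategies by the number of unsafe wells that cease to be used. It assumes that the budget only has to cover the cost of each test net of the fee paid by the household, and that this fee is BDT45. The fee is not stated in this section; it is an assumption chosen because it reproduces the 33,333 and 7,692 test counts quoted in the text. All variable names are illustrative.

```python
# Illustrative sketch of the fixed-budget comparison; not the authors' code.
BDT_PER_USD = 80
BUDGET_BDT = 10_000 * BDT_PER_USD   # USD10,000 budget, in BDT
KIT, TESTER_PAY = 24, 45            # BDT per test (from the text)
GROUP_BONUS, PLACARD = 12, 80       # extra BDT per test in arms B and C (from the text)
PRICE = 45                          # ASSUMPTION: fee paid by the household, in BDT

# Net cost to the program per test delivered, in BDT.
net_cost = {
    "A": KIT + TESTER_PAY - PRICE,
    "B": KIT + TESTER_PAY + GROUP_BONUS - PRICE,
    "C": KIT + TESTER_PAY + PLACARD - PRICE,
    "free": KIT + TESTER_PAY,       # no fee is collected
}
tests = {arm: BUDGET_BDT / cost for arm, cost in net_cost.items()}

# Switching rates among users of unsafe wells (free provision: low and high scenarios).
switch_rate = {"A": 0.30, "B": 0.49, "C": 0.70, "free_low": 0.30, "free_high": 0.75}


def switches(strategy, unsafe_share):
    """Unsafe wells no longer used for drinking under a given strategy."""
    arm = "free" if strategy.startswith("free") else strategy
    return tests[arm] * unsafe_share * switch_rate[strategy]


# The ranking is the same for any unsafe share, since the share scales all rows equally.
ranking = sorted(switch_rate, key=lambda s: switches(s, 0.20), reverse=True)
for s in ranking:
    print(f"{s:10s} switches at 20% unsafe: {switches(s, 0.20):8.0f}")
```

Under these assumptions the ordering is group sales, individual sales, free provision with high switching, placards, and free provision with low switching, in line with the discussion above.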
These estimates also show that even the strategies with the highest average cost per unsafe well averted are highly cost-effective. The cost ranges from USD1.15 (group sales with high prevalence of unsafe wells) to USD14.4 (free provision with low switching rates and low prevalence of unsafe wells).
By comparison, Pitt et al. (2015) estimated a present discounted value of per-household gains from switching to safe water sources over twenty years ranging from USD1400 to USD1000 for discount rates of 3% to 8%. Such estimates only take into account income gains that result from avoiding productivity losses due to consumption of arsenic-contaminated water, while they ignore the additional utility gains from better health and reduced mortality. Argos et al. (2010) estimate substantial increases in all-cause mortality over a 10-year period associated with high arsenic content of drinking water, with hazard ratios ranging from 1.09 to 1.68 relative to 'safe' wells with arsenic below 10 ppb. In addition, Keskin et al. (2017) find that testing campaigns also reduced mortality among young children because arsenic risk induced mothers to breastfeed longer. Such reductions in mortality would make testing even more cost-effective. 21 Note also that testing would remain cost-effective even if only a fraction of switchers actually moved to a safe source.
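The range of average costs per switch quoted above can be reproduced with a short Python sketch. The average cost per switch is taken to be the net program cost per test divided by the product of the share of unsafe wells and the switching rate, so it does not depend on the size of the budget. As before, the household fee of BDT45 is an assumption that is not stated in this section, and the names used are illustrative.

```python
# Illustrative sketch of the cost-per-switch calculation; not the authors' code.
BDT_PER_USD = 80
PRICE = 45  # ASSUMPTION: fee paid by the household, in BDT

# (net program cost per test in BDT, switching rate among users of unsafe wells)
strategies = {
    "A (individual)": (24 + 45 - PRICE, 0.30),
    "B (group)": (24 + 45 + 12 - PRICE, 0.49),
    "C (placard)": (24 + 45 + 80 - PRICE, 0.70),
    "free, low switching": (24 + 45, 0.30),
    "free, high switching": (24 + 45, 0.75),
}
unsafe_shares = {"low": 0.20, "medium": 0.40, "high": 0.80}


def cost_per_switch(net_cost_bdt, switch_rate, unsafe_share):
    """Average USD spent per unsafe well that stops being used for drinking."""
    return net_cost_bdt / BDT_PER_USD / (unsafe_share * switch_rate)


costs = {
    (name, label): cost_per_switch(cost, rate, share)
    for name, (cost, rate) in strategies.items()
    for label, share in unsafe_shares.items()
}
cheapest = min(costs, key=costs.get)
dearest = max(costs, key=costs.get)
print("lowest :", cheapest, round(costs[cheapest], 2))
print("highest:", dearest, round(costs[dearest], 2))
```

Even the highest figure in this grid is small relative to the per-household gains of USD1000 to USD1400 cited from Pitt et al. (2015).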
Last but not least, it must be highlighted that while group sales with cost-sharing appears to be the most cost-effective strategy for a given budget, it comes at a high cost in terms of equity. This is of course a by-product of low demand, which leads three quarters of households not to learn about the safety status of their well. For instance, even under the scenario of only 20% of unsafe wells, while 4,444 unsafe wells would be identified, and 2,178 of them would no longer be used for drinking, we also find that in the same communities where sales took place there would be an additional 13,333 unsafe wells that would not be tested. Equity concerns, if individual sales are not allowed, may be particularly serious for households who are isolated, either geographically or socially.
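The equity figures in this example follow from the same assumptions, as the Python sketch below illustrates for group sales with 20% of wells unsafe. It assumes a household fee of BDT45 (not stated in this section) and, as in the text, that the share of unsafe wells is the same among tested and untested wells. Variable names are illustrative.

```python
# Illustrative sketch of the equity example for group sales (arm B); not the authors' code.
BUDGET_BDT = 10_000 * 80          # USD10,000 at BDT80 per USD
PRICE = 45                        # ASSUMPTION: fee paid by the household, in BDT
NET_COST_B = 24 + 45 + 12 - PRICE # kit + tester pay + group bonus, net of the fee
TAKE_UP = 0.25                    # share of households that buy a test
UNSAFE_SHARE = 0.20               # low-prevalence scenario
SWITCH_RATE_B = 0.49              # switching among users of unsafe wells in arm B

tests_sold = BUDGET_BDT / NET_COST_B
wells_offered = tests_sold / TAKE_UP                 # wells in communities reached
unsafe_identified = tests_sold * UNSAFE_SHARE
unsafe_abandoned = unsafe_identified * SWITCH_RATE_B
unsafe_untested = (wells_offered - tests_sold) * UNSAFE_SHARE

print(f"unsafe wells identified:        {unsafe_identified:,.0f}")
print(f"unsafe wells no longer used:    {unsafe_abandoned:,.0f}")
print(f"unsafe wells left untested:     {unsafe_untested:,.0f}")
```

The untested unsafe wells are three times the number identified, which is simply the ratio of non-buyers to buyers at a 25% take-up rate.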

Conclusions
Information on household-specific environmental health risks can be a relatively inexpensive policy tool to mitigate those risks. However, the design of information campaigns often has to contend with resistance to behavioral change even when the presence of such risks has been revealed to target households. This may be especially true in developing countries, where poverty, low literacy and other constraints may severely limit the effectiveness of such campaigns, especially if targeted information is only supplied for a fee. These considerations are salient in Bangladesh, a country where millions of people use water from shallow tube wells for cooking and drinking, and where a large fraction of such water is estimated to be contaminated by naturally occurring arsenic in concentrations high enough to have serious health consequences in case of long-term exposure. This is generating one of the most severe public health crises worldwide. Given that wells with unsafe water are often located at walking distance from safe wells, the provision of information on well-specific arsenic levels represents a potentially life-saving tool, allowing households to mitigate arsenic risk by simply changing their primary source of drinking water.
This article has described the results from a randomized field experiment to study the effect of different arsenic test selling schemes on test uptake and well switching. Despite the fees, a team of surveyors managed to test 2,800 of a total of about 11,400 wells. This uncovered the presence of hundreds of wells with arsenic levels above the threshold adopted by the Government of Bangladesh. Overall, about half of the users of contaminated wells decided to switch to a different source of drinking water. Relatively subtle differences in the way information was sold and provided, while barely affecting demand, led to very substantial gaps in behavioral responses. Relative to simple, individual sales where test results were provided privately, both (i) group sales that leveraged informal local solidarity networks, and (ii) the addition of metal placards posted on the wells more than doubled switching rates among users of unsafe wells. These findings should be useful for the design of information campaigns that aim at providing measures of risk exposure that vary at the household level. In the context of this study, information was supplied only to households who chose to purchase a test, but similar considerations may also be relevant when information is provided for free, for instance through blanket testing campaigns such as the one conducted now more than 10 years ago by BAMWSP.
A number of caveats and limitations should, however, be emphasized. First, data limitations do not allow us to conclusively disentangle the mechanisms underlying the results, although the observed patterns are consistent with a conceptual framework where the adoption of health-protecting behavior is increased by pre-commitment to share drinking water (despite the absence of enforcing mechanisms), by the ease of access to information on safe sources, and by 'reminders' on water safety provided by placards affixed to the tube well spouts.
Second, the data do not include objective measures of exposure to arsenic, and so (unlike some earlier studies) it is not possible to determine if the self-reported changes in the main source of drinking water were reflected in actual reductions in exposure. This also limits the ability to evaluate the cost-effectiveness of the interventions.
Third, the trial did not include an arm where tests were provided free of charge, and so it is necessary to resort to a number of assumptions when making cost-effectiveness comparisons between sales and free provision. With this caveat, the results suggest that, under some scenarios, cost sharing could lead to a significantly larger number of unsafe wells no longer being used for drinking for a given budget, but this would come at the expense of equity, as many wells would remain untested due to low demand. In addition, free, blanket testing campaigns may also increase switching rates among owners of unsafe wells by increasing the number of known safer alternatives available nearby.
Fourth, to the extent that the results can be extrapolated to the rest of the country, tests-for-fee campaigns can only provide a partial solution to the public health crisis due to arsenic in shallow aquifers. In the study area, about three quarters of wells remained untested. That the vast majority of well owners did not know the safety status of their well suggests that the share of unsafe wells among these untested wells was similar to that among tested ones and, therefore, in the 15-30% range.
These rates are much larger than the unconditional ITT estimates of the impact of the selling schemes on switching rates, which are in the 3-6 percentage point range. This large discrepancy suggests that, despite the many tests sold, switching rates achieved by the test sales program remained well below the likely fraction of unsafe wells.
Fifth, despite the likely selection into purchase of households more responsive to arsenic-related information, about half of users of unsafe wells were still using the same source at the time of the return visit. Further, among those who switched to a different source, many switched to a well that was either still unsafe (although possibly safer) or with unknown contamination levels. That more guidance is needed to facilitate switching to safe water sources is also consistent with findings from the neighboring Araihazar sub-district, where Pfaff et al. (2017) document that following the BAMWSP blanket testing campaign, about 30% of households whose well water was found to be unsafe had switched to other wells identified as unsafe or of unknown status.
Sixth, this study is silent as to whether demand was limited by the strategy of approaching women (usually the most senior woman in the household) to offer the tests. Miller and Mobarak (2013) show that women in Bangladesh were less likely than men to purchase improved cookstoves to reduce indoor pollution, despite their stronger preference for the new technology justified by them bearing much of the health costs of traditional stoves. These factors may have mediated the differences in switching rates between arms. For instance, it is possible that the strength of women's bargaining power in the choice to buy the tests, or in the choice of water source for drinking, was affected by group dynamics (especially in arm B) or by the public nature of the test results (especially in arm C).
In sum, and until game-changers such as regulated piped water become widely available, much remains to be learned about the optimal design of campaigns for the provision of information on environmental health risks. The findings discussed in this article suggest that facilitating the spread of information on safe options, reminders, and mechanisms that leverage the presence of peer groups may represent promising ways to maximize the adoption of risk-avoiding behavior.

Figure 1: Metal Placards
Notes: The three pictures show examples of the stainless steel placards that were attached in case of test purchase to tube well spouts in arm C. The pictures show placards attached to tested wells that were found to be, from left to right, safe (blue, As ≤ 10 ppb), marginally safe (green, 10 < As ≤ 50 ppb), and unsafe (red, As > 50 ppb), respectively.

Figure legend: % Switching; % at given arsenic level.
The figures show, for each experimental arm and for tested wells, the prevalence of each arsenic level as identified by the test (light grey bars, in ppb, or micrograms per litre) and the fraction of households who were no longer using the well at endline (dark grey bars). The field tests identified the arsenic level as a value in the set As ∈ {0, 10, 25, 50, 100, 200, 300, 500, 1000}. The values on the horizontal axis are not drawn to scale. Results of As = 1000 were rare and hence 500 and 1000 were pooled together. Wells with arsenic below the thick vertical line were the safest, while those with arsenic above the second and thin vertical line were labeled unsafe. A household was described as having switched if, at the time of the endline survey, the respondent stated that the main source of water used for drinking and cooking was no longer the well used at baseline.

Notes: Authors' calculations from baseline data (January to June 2016). The unit of observation is the primary household attached to a specific well. The number of clusters (villages) in the five arms are 49 (arm A, n = 5,550 wells), 48 (B, n = 5,314) and 15 (C, n = 1,739). Except for the first variable ("Drinks from well at baseline") all variables are summarized for households who used the specific well for cooking and drinking at baseline. Differences in the number of observations across these variables are explained by missing entries during the data collection. The p-values in column 5 are for tests of the null of equal means across treatment arms (robust to intra-village correlation). Asterisks denote test significance: *** p<0.01, ** p<0.05, * p<0.1. Table A.1 in the appendix shows the detailed results for the normalized differences described in the text.

Strata fixed effects include union fixed effects and a dummy = 1 in villages where the % of unsafe wells in the village (estimated by BAMWSP) was below the median. Standard errors are clustered at the village level. The smaller sample size in column 3 relative to columns 1-2 is due to missing values in one or more controls, while in column 4 controls are not included but only observations with complete data are used. Significance: *** p<0.01, ** p<0.05, * p<0.1.

Notes: Authors' calculations using information from a total of 10,412 wells that were used at baseline for drinking and cooking purposes. Excluded from the analysis are 768 wells used by households that could not be re-contacted at endline, and 406 wells with a duplicate ID at baseline which can thus not be matched to endline data on switching decisions.

Notes: Authors' estimations from baseline and endline data. The Intent-to-Treat results in columns 1 and 2 show switching rates not conditional on purchase or test result, including all households who used the well at baseline and who could be matched between baseline and endline surveys. All regressions in columns 3-6 include only observations for which the well was used for cooking and drinking purposes at baseline, a test was purchased, and the test indicated unsafe levels of arsenic in the water (As > 50 ppb). In column 2 (relative to column 1), and in column 4 (relative to 3) the decrease in the number of observations is due to missing values in controls, and in column 6 some additional observations are lost because the GPS location was not recorded correctly. The model in column 5 is the same as in column 3 but uses only observations with complete data used in column 4. All regressions are estimated using a linear probability model where the dependent variable is a dummy equal to one if the well was no longer used for cooking and drinking at endline. Standard errors are clustered at the village level. Asterisks denote statistical significance: *** p<0.01, ** p<0.05, * p<0.1.

The estimates show the responses to different sale strategies, assuming a total budget of USD10,000. Each test is assumed to cost USD0.30 (BDT24), while testers are assumed to be paid BDT45 per test delivered, with an additional bonus of BDT12 for group sales. BDT80 per test are added in arm C, to account for the cost of the metal placards. The take-up rate is assumed to be 25% in arms A, B and C, and 100% when tests are offered for free. Switching rates among users of unsafe wells are assumed to be 30%, 49% and 70% in arms A, B, and C, respectively. In the case of free provision, switching rates are varied from 0.3 to 0.75, consistent with earlier studies in neighboring areas.

Notes: Authors' calculations from baseline data (January to June 2016). The unit of observation is the primary household attached to a specific well. The number of clusters (villages) in the five arms are 49 (arm A, n = 5,550 wells), 48 (B, n = 5,314) and 15 (C, n = 1,739). Except for the first variable ("Drinks from well at baseline") all variables are summarized for households who used the specific well for cooking and drinking at baseline. Differences in the number of observations across these variables are explained by missing entries during the data collection. The p-values in column 7 are for tests of the null of equal means across treatment arms (robust to intra-village correlation). Asterisks denote test significance: *** p<0.01, ** p<0.05, * p<0.1. The normalized differences in columns 8-10 are calculated as in equation (2), while the multivariate standardized differences in the last row are calculated as in equation (3), see Section 3.1 for details.