Policy Research Working Paper                                    11059




    Design of Partial Population Experiments
     with an Application to Spillovers in Tax Compliance

                               Guillermo Cruces
                               Dario Tortarolo
                             Gonzalo Vazquez-Bare




Development Economics                               A verified reproducibility package for this paper is
Development Research Group                          available at http://reproducibility.worldbank.org,

February 2025                                       click here for direct access.
Policy Research Working Paper 11059


  Abstract
 This paper develops a framework to analyze partial popula-                                effects, and optimal cluster assignment probabilities. All
 tion experiments, a generalization of the cluster experimental                            the results apply to cluster experiments, a particular case
 design where clusters are assigned to different treatment                                 of the framework. The paper sets up a potential outcomes
 intensities. The framework allows for heterogeneity in clus-                              framework to interpret the OLS estimands as causal effects.
 ter sizes and outcome distributions. The paper studies the                                It implements the methods in a large-scale experiment to
 large-sample behavior of OLS estimators and cluster-ro-                                   estimate the direct and spillover effects of a communication
 bust variance estimators and shows that (i) ignoring cluster                              campaign on property tax compliance. The analysis reveals
 heterogeneity may result in severely underpowered exper-                                  an increase in tax compliance among individuals directly
 iments and (ii) the cluster-robust variance estimator may                                 targeted with the mailing, as well as compliance spillovers
 be upward-biased when clusters are heterogeneous. The                                     on untreated individuals in clusters with a high proportion
 paper derives formulas for power, minimum detectable                                      of treated taxpayers.




 This paper is a product of the Development Research Group, Development Economic. It is part of a larger effort by the
 World Bank to provide open access to its research and make a contribution to development policy discussions around
 the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors
 may be contacted at dtortarolo@worldbank.org. A verified reproducibility package for this paper is available at http://
 reproducibility.worldbank.org, click here for direct access.




                                                                                         RESEA
                                                                                    CY
                                                                               LI                R
                                                                                                 CH
                                                                          PO




                                                                    TRANSPARENT
                                                                                ANALYSIS
                                                                                                     S
                                                                           W




                                                                                                 R




                                                                               R                     E
                                                                            O




                                                                                    KI           P
                                                                                         NG PA




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
                 Design of Partial Population Experiments
               with an Application to Spillovers in Tax Compliance*


              Guillermo Cruces, U. of Nottingham & CONICET-CEDLAS-UNLP
                                    Dario Tortarolo, World Bank DECRG
                                Gonzalo Vazquez-Bare, UC Santa Barbara




           JEL CODES: C01, C93, H71 , H26 , H21 , O23.
           KEYWORDS: partial population experiments, spillovers, randomized controlled trials, cluster ex-
       periments, two-stage designs, property tax, tax compliance.




    * We thank Yuehao Bai, Youssef Benzarti, Augustin Bergeron, Javier Birchenall, Matias Cattaneo, Max Farrell, Kelsey

Jack, Heather Royer, Doug Steigerwald and Alisa Tazhitdinova for valuable discussions and suggestions, and seminar partici-
pants at the 2021 National Tax Association conference, IFS, CEDLAS-UNLP, and the 2022 Advances with Field Experiments
conference. We thank Julian Amendolaggine and Juan Luis Schiavoni for their invaluable support throughout the project. We
thank Bruno Cr´ epon and Roland Rathelot for their help in obtaining their data. The views expressed in this paper are entirely
those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Develop-
ment/World Bank and its afﬁliated organizations, or those of the Executive Directors of the World Bank or the governments
they represent. Corresponding author: Dario Tortarolo, E-mail: dtortarolo@worldbank.org. This project was reviewed and
approved in advance by the Institutional Review Board at the University of Nottingham. The design for this experiment was
preregistered in the AEA RCT Registry (RCT ID: AEARCTR-0006569). All remaining errors are our own.
1        Introduction

Randomized controlled trials (RCTs) are extensively used in economics. A large fraction of these experi-
ments are based on the assumption that the treatment assignment of one unit or subject does not inﬂuence
the outcomes of others. The assumption of no interference, however, may be violated in many settings.
In such cases, identifying and measuring spillovers between units is crucial for understanding the nature
and magnitude of interactions between subjects, as well as for accurately assessing the direct impact of
the treatment.
    While the early experimental literature considered the impact on untreated units in an ex-post manner
(e.g. Miguel and Kremer, 2004), ﬁeld experiments incorporating spillover effects into their design have
gained traction in applied research. In settings where units are grouped into independent clusters, such as
schools, villages, or ﬁrms, a common design is the partial population design. Partial population designs
are a generalization of the clustered design wherein clusters assigned to different treatment intensities
or saturations are compared to pure control clusters with no treated units (Mofﬁt, 2001; Duﬂo and Saez,
2003; Hudgens and Halloran, 2008; Hirano and Hahn, 2010; Baird et al., 2018). The variation in treatment
intensity allows researchers to disentangle the direct and indirect effects of a treatment. In this paper, we
provide a framework to analyze this type of experiment when clusters are heterogeneous.
    We consider two dimensions of cluster heterogeneity that have important practical implications: het-
erogeneity in cluster sizes and heterogeneity in outcome distributions across clusters (distributional het-
erogeneity)1 . When analyzing an experiment with heterogeneous clusters, correctly accounting for this
heterogeneity is crucial for several reasons. On the one hand, variance formulas have to be adjusted ac-
cordingly, and failing to do so may result in severely underpowered experiments. On the other hand,
cluster heterogeneity can affect the accuracy of the large sample normal approximation, and inference
based on this approximation can be misleading when clusters are very heterogeneous (Carter, Schnepel
and Steigerwald, 2017; Djogbenou, MacKinnon and Ørregaard Nielsen, 2019; Hansen and Lee, 2019;
Sasaki and Wang, 2022; Chiang, Sasaki and Wang, 2023).
    With these challenges in mind, our paper provides ﬁve contributions. First, in Theorem 1, we derive an
asymptotic distributional approximation for OLS regression estimators in a setting with between-cluster
heterogeneity. We consider a double-array asymptotic setting where cluster sizes are allowed, but not re-
quired, to grow with the sample size. We provide conditions under which OLS estimators are consistent
for cluster-size-weighted averages of within-cluster differences in means, and are asymptotically normal.
We also show that, in the presence of distributional heterogeneity, the usual cluster-robust variance estima-
tor is generally upward-biased, and hence inference based on this estimator is conservative (Proposition
1). While similar results have been obtained in design-based settings with non-random potential outcomes
(see e.g. Hudgens and Halloran, 2008; Basse and Feller, 2018; Abadie et al., 2022; Jiang, Imai and Malani,
1
    We note that our framework allows for general forms of between-cluster heterogeneity, but assumes that outcomes are iden-
    tically distributed within each cluster. The generalization of our results to the case where outcome distributions are heteroge-
    neous within a cluster is left for future research.


                                                                  2
2023), to our knowledge we are the ﬁrst to show this result in a superpopulation setting under distributional
heterogeneity.
    Our second contribution is to derive explicit, closed-form formulas to conduct power and minimum
detectable effect (MDE) calculations under the two aforementioned sources of cluster heterogeneity. We
then consider an intermediate setting where clusters differ in size but not in their outcome distributions,
which simpliﬁes power and minimum detectable effects calculations and can be applied more easily when
baseline outcome data is not available. We show how our formulas generalize those available in the
existing methodological literature on experimental design (Duﬂo, Glennerster and Kremer, 2007; Hirano
and Hahn, 2010; Baird et al., 2018) by allowing for multiple treatment intensities, cluster heterogeneity,
heteroskedasticity and general forms of intracluster correlation in outcomes and treatments.
    Our third contribution is to derive optimal assignment probabilities determining the proportion of clus-
ters to be assigned to each treatment saturation (Theorem 2). We provide a tractable, closed-form solution
to the optimal choice problem of minimizing a weighted average of estimators’ variances. We also discuss
how alternative optimality criteria may be used in combination with our variance formulas using numerical
methods.
    Our fourth contribution is to set up a potential outcomes framework with within-cluster spillovers,
heterogeneous treatment effects, and heterogeneous clusters. We use this framework to provide sufﬁcient
conditions for OLS estimands to recover causal direct and spillover effects.
    Fifth, based on our framework, we designed and conducted a large-scale ﬁeld experiment to estimate
direct and spillover effects of a randomized communication campaign on property tax compliance in Ar-
gentina. Our experiment sent personalized letters to randomly selected dwellings with reminders about
taxes due, information about the status of the account, due dates, past due debt, and payment methods.
While there is ample evidence on the effect of tax reminders on compliance and collection (Antinyan and
Asatryan, 2024), our goal was to ﬁnd evidence on relatively elusive spillover effects from information
campaigns on tax collection. We designed the experiment based on our methodological results to capture
spillover effects of our mailings on neighbors who live in the same street blocks of treated individuals but
who did not receive a letter. Our results reveal higher payment rates for treated individuals, but also for
their untreated neighbors in the same street block, compared to accounts in pure control blocks where no
one received the letter. Spillover effects are lower in magnitude but still substantial and precisely estimated
in high-saturation street blocks, especially when accounting for expected (pre-registered) heterogeneity in
past compliance: payment rates of untreated accounts in high saturation blocks with above median past
compliance increased by 2.6 percentage points, compared to direct effects of about 5.1 percentage points.


Comparison with current literature. Our paper contributes to a growing literature on experimental
design (Duﬂo, Glennerster and Kremer, 2007; Bruhn and McKenzie, 2009; Bugni, Canay and Shaikh,
2018, 2019; Bai, 2022) and in particular to the literature on design and analysis of experiments under


                                                      3
spillovers or interference (Hirano and Hahn, 2010; Athey, Eckles and Imbens, 2018; Baird et al., 2018;
Basse, Feller and Toulis, 2019; Jiang, Imai and Malani, 2023; Puelz et al., 2022; Viviano, 2024; Leung,
2022; Liu, 2023). More speciﬁcally, our results generalize those of Hirano and Hahn (2010), Hudgens
and Halloran (2008) and Baird et al. (2018) by allowing for cluster heterogeneity, heteroskedasticity,
general treatment assignment mechanisms and within-group correlation structures and alternative criteria
for optimal treatment assignment.
    In related work, Athey, Eckles and Imbens (2018), Basse, Feller and Toulis (2019) and Puelz et al.
(2022) derive randomization inference tests for a general class of null hypotheses under interference. A
closely related study is Jiang, Imai and Malani (2023), who analyze two-stage completely randomized
experiments and provide randomization-based variance estimators and sample size formulas. Our re-
sults complement this literature by considering different estimands, different assignment mechanisms and
by conducting super-population-based large-sample (instead of design-based) inference in a double ar-
ray asymptotic framework. Our approach allows us to determine the role of cluster heterogeneity in the
asymptotic behavior of the treatment effect estimators.
    Our paper is also related to the literature on inference in clustered experiments, which are a particular
case of partial population experiments with only two saturations and no within-cluster treatment variation.
Bugni et al. (2023) study inference in clustered experiments with non-ignorable cluster sizes and derive
variance estimators and valid inference procedures in a setup with random cluster sizes. We further discuss
the relationship between our results and that paper in Section 3.5.
     We also contribute to a large empirical literature on property taxes and a small but growing empirical
literature on spillover effects in tax compliance. On property taxes, recent contributions include Brock-
meyer et al. (2020) study of Mexico City, Bergeron, Tourek and Weigel (2024) and Weigel (2020) for
the Democratic Republic of Congo, and Krause (2020) for Haiti, among others. The latter two are ran-
domized controlled trials, and in both cases, the authors address the presence of spillovers, but in ex-post
analysis rather than in the experimental designs. The effect of social interactions in tax compliance inter-
ventions has remained a relatively elusive issue in the broader experimental compliance literature. Some
notable exceptions are Pomeranz (2015), who detects enforcement spillovers up the VAT chain in Chilean
ﬁrms, Drago, Mengel and Traxler (2020) who study enforcement spillovers of TV licensing inspections
on untreated households in Austria, and Boning et al. (2020) who analyze direct and network effects from
in-person visits by revenue ofﬁcers on visited and non-visited ﬁrms in the United States (see the review
in Pomeranz and Vila-Belda, 2019, for more studies covering spillover effects). In Argentina, a recent
study by Carrillo, Castro and Scartascini (2021) ﬁnds neighborhood spillover effects from a program that
randomly awarded 400 taxpayers with the repair of a sidewalk. Whereas these papers ﬁnd spillover effects
in tax compliance, their original experiments were not designed to capture these effects. We build on these
pioneering works with an intervention designed with the purpose of capturing spillovers.
   The paper is organized as follows. Section 2 illustrates the practical importance of cluster heterogeneity
when conducting power calculations. In Section 3, we set up our framework and derive the main results. In

                                                     4
Section 4, we implement our methods in a large-scale randomized communication campaign, we describe
the administrative data used in the analysis, the empirical strategy, and evidence of direct and spillover
effects. Section 5 provides some practical recommendations for designing and analyzing partial population
experiments. Section 6 concludes.



2    Why is Cluster Heterogeneity Important?

We consider a population where units are grouped into mutually exclusive and independent clusters. Com-
mon examples of this type of clustering are students in schools (Miguel and Kremer, 2004; Beuermann
et al., 2015), family members in households (Barrera-Osorio et al., 2011; Foos and de Rooij, 2017), job
seekers in local labor markets (Cr´epon et al., 2013), employees in ﬁrms or organizations (Duﬂo and Saez,
2003), or households in neighborhoods, villages or other geographic administrative units (Angelucci and
De Giorgi, 2009; Ichino and Sch¨  undeln, 2012; Haushofer and Shapiro, 2016; Gin´ e and Mansuri, 2018). In
our application, a local property tax reminder information campaign, the population of interest consists of
taxpayers in residential blocks. Within this population, we study an experimental design where treatment
assignments can vary both between and within clusters.
    Figure 1 shows the distribution of cluster sizes in six partial population experiments, including our
analysis sample and ﬁve published papers (Cr´    epon et al., 2013; Gin´  e and Mansuri, 2018; Haushofer and
Shapiro, 2016; Ichino and Sch¨   undeln, 2012; Imai, Jiang and Malani, 2021). The ﬁgure reveals substantial
variation in cluster sizes. When cluster sizes are heterogeneous, it is likely that the distribution of outcomes
will vary across clusters as well. For instance, one may expect the mean and the variance of the outcome to
be different in large clusters compared to small clusters. We refer to the variation in outcome distributions
across clusters as distributional heterogeneity.
                                                                                       ˆ, such as a difference
    Intuitively, with heterogeneous clusters, the variance of an estimator of interest β
in means between units in treated and untreated clusters (we deﬁne the estimators of interest precisely in
the next section), can be decomposed into four parts:

                               ˆ] ≈ variance under uncorrelated observations
                             V[β                                                                            (1)
                                   + clustering with equally-sized clusters                                 (2)
                                   + cluster size heterogeneity                                             (3)
                                   + cluster distributional heterogeneity                                   (4)

The ﬁrst term is the variance that would be obtained if observations were uncorrelated within clusters.
The second term is an adjustment factor that accounts for the within-cluster correlation, often known as
the “design effect” or the “Moulton factor” (after Moulton, 1986) that depends on the average cluster
size. The term in the third line represents the additional variation due to the heterogeneity in cluster sizes,


                                                       5
which intuitively accounts for the variance of cluster sizes (Moulton, 1986, also derives this adjustment
for a random effects model). Finally, the last component accounts for the between-cluster heterogeneity
in outcome distributions. While the need to account for within-cluster correlations (lines (1) and (2)) is
well-understood for designing and analyzing clustered experiments, the adjustment terms that account for
cluster heterogeneity are typically assumed away by the literature on experimental design (e.g. Bloom,
2005; Duﬂo, Glennerster and Kremer, 2007; Hirano and Hahn, 2010; Baird et al., 2018).
    To numerically illustrate the importance of appropriately accounting for cluster heterogeneity in this
design, we consider the simple setting of a cluster RCT (which is a particular case of a partial population
experiment) where “a few” clusters are “large”. Speciﬁcally, we consider a sample of 200 clusters, indexed
by g = 1, . . . , 200, each having size ng . The ﬁrst 10 clusters contain 100 units, ng = 100, and the
remaining 190 clusters contain 25 units each, ng = 25 (these values are chosen to match the median
values in the literature in Figure 1). We assume the treatment has no effect, and the outcome of unit
                                                                                              iid
i = 1, . . . , ng in cluster g is given by a random effects model: Yig = αg + νg + ωig , νg ∼ N (0, 1/2),
     iid
ωig ∼ N (0, 1/2) with νg independent of ωig and where αg is a (non-random) intercept with αg = 0
if ng = 25 and αg = 1 if ng = 100. This model implies that the average outcome is E[Yig ] = 1 in
large clusters and E[Yig ] = 0 in small clusters. In addition, V[Yig ] = 1 and the within-cluster correlation
between outcomes is cor(Yig , Yjg ) = 0.5.
    Figure 2 plots three power functions for the difference in means between treated and untreated clusters
that a researcher may consider when designing this experiment. The short-dashed curve represents the
power function that is obtained when ignoring both sources of heterogeneity, that is, considering only the
terms in lines (1) and (2) of the variance formula. Using this formula, the MDE at 80% power, given
this sample size, is 0.29 standard deviations. However, when accounting for the variation in cluster sizes,
the corresponding power function is represented by the long-dashed curve. According to this curve, the
power to detect an effect of 0.29 is not 80% but 69%, so the experiment is underpowered. Furthermore,
the true power function that accounts for both sources of heterogeneity (sizes and outcome distributions)
is represented by the solid curve. This curve shows that the true power to detect an effect of 0.29 in this
setting with heterogeneous clusters is 48%, signiﬁcantly below the desired power of 80%. This numerical
exercise shows how ignoring heterogeneity may result in severely underpowered experiments. We provide
further examples of the importance of accounting for heterogeneity in Section 4.



3     Analysis of Partial Population Experiments

3.1    Setup

We consider a sample of observations (units) that are divided into mutually independent clusters g =
1, . . . , G, where each cluster g contains ng observations i = 1, . . . , ng and the total sample size is n =


                                                      6
     G
       ng . We view cluster sizes as non-random (see Bugni et al., 2023; Sasaki and Wang, 2022, for
     g =1
an alternative sampling approach where cluster sizes are random). In a partial population experiment,
clusters are randomly divided into categories or saturations denoted by Tg ∈ {0, 1, 2, . . . , M }, where by
convention Tg = 0 denotes a pure control cluster (i.e. a cluster where no unit is treated). Let P[Tg = t] =
qt ∈ (0, 1) denote the probability that cluster g is assigned to saturation t. Within each cluster, a binary
treatment Dig is assigned to units with probability P[Dig = 1|Tg = t] where P[Dig = 0|Tg = 0] = 1.2
We let Dg = (D1g , D2g , . . . , Dng g ) be the vector of unit-level treatment assignments in cluster g , D =
(D1 , . . . , DG ) and T = (T1 , . . . , TG ) . Figure A.3 provides an example of a partial population design
with four saturations. Notice that both standard RCTs with independent observations and cluster RCTs
are particular cases of partial population experiments, as we further illustrate in Section 3.5.
    The observed outcome of interest for unit i in cluster g is denoted by Yig and we let Yg = (Y1g , . . . , Yng g )
be the vector of observed outcomes in cluster g . In partial population experiments, the estimands of in-
terest are typically comparisons of average outcomes between treated or untreated units in treated clusters
to pure control units, E[Yig |Dig = d, Tg = t] − E[Yig |Tg = 0], pooled across clusters. In the ﬁrst part
of the paper, we take these estimands as given since they are the most commonly analyzed estimands in
the empirical literature. In Section 3.6, we set up a potential outcomes framework to rigorously justify
the causal interpretation of these estimands. Let µg (d, t) = E[Yig |Dig = d, Tg = t] be the conditional
expectation of the outcome in cluster g given assignment (d, t). We consider the following sample means
estimators:
                                          G
                                               1
                                          g =1 (Tg = t)
                                                          ng
                                                                         1
                                                          i=1 Yig (Dig = d)                 g   1t    d¯d
                                                                                                  g Ng Yg
                            ˆ(d, t) =
                            µ                                                       =                                       (5)
                                                   1                 1                          g 1g Ng
                                            G               ng                                      t  d
                                            g =1 (Tg = t)   i=1 (Dig = d)


where 1t g = 1(Tg = t), Ng =
                             d
                                    i 1(Dig = d) and Yg =
                                                         ¯d        i Yig 1(Dig = d)/Ng , deﬁned whenever
                                                                                      d

  d
Ng   > 0. These estimators are commonly computed by running an OLS regression of the outcome on
a full set of indicators (1(Dig = d, Tg = t))(d,t) , without an intercept. Thus, in what follows, we refer
to these estimators as OLS estimators. Our parameter of interest is the vector of cluster-size-weighted
average of cluster-speciﬁc differences in means:

                                                         G
                                                             ng
                                          βn (d, t) =           (µg (d, t) − µg (0, 0)) .                                   (6)
                                                        g =1 n


We note that our framework can easily accommodate other parameters with different weighting schemes,
such as the simple average across clusters G
                                           g =1 (µg (d, t) − µg (0, 0)) /G.




2
    In practice, some desired saturations may not coincide with the observed proportion of treated units for some cluster sizes.
    For instance, if P[Dig = 1|Tg = t]=0.5 but ng is odd, the observed proportion of treated cannot be exactly 0.5. Appendix A.4
    proposes an assignment mechanism that ensures that the expected proportion of treated coincides with P[Dig = 1|Tg = t].


                                                                 7
3.2       Asymptotic Behavior of OLS Estimators

We now study the asymptotic distribution of the OLS estimators deﬁned in Equation (5) and functions
thereof. We consider a double-array asymptotic setting where the cluster sizes are allowed, but not re-
quired, to grow with the sample size. This type of approximation is more appropriate than the bounded
cluster size approach when groups can be large and heterogeneous in size, but we note that the settings
with bounded cluster sizes and/or equally-sized clusters are nested as particular cases of our analysis.3 We
consider the following sampling scheme.

Assumption 1 (Sampling)

    (i) (Yg , Dg , Tg )G
                       g =1 are mutually independent across g .


    (ii) For each g and for all i = 1, . . . , ng , E[Yig |Dig = d, Tg = t] = µg (d, t) for all (d, t) and for all
         such that E[|Yig | |Dig = d, Tg = t] < ∞.

(iii) For each g and for all i = 1, . . . , ng , P[Dig = d|Tg = t] = pg (d|t) and P[Dig = d, Djg = d |Tg =
      t] = pg (d, d |t) for all d, d and t.


Part (i) states that clusters are mutually independent, a standard assumption in the clustering literature.
Notice that we do not require clusters to be identically distributed, so outcome distributions can be het-
erogeneous across clusters. Part (ii) states that average conditional outcomes are the same for all units
in the same cluster. In what follows we deﬁne µg (d, t) = µg (d, t) for = 1 to reduce notation. Part
(iii) states that the unit-level treatment probabilities are the same within a cluster. Note that within-cluster
assignments may be correlated.
   Next, let D(i)g = (Djg )j =i denote the vector of treatments excluding unit i and D(ij )g = (Dkg )k=(i,j )
denote the vector of treatments excluding units i and j . We introduce the following restriction on the
conditional moments of the outcome.

Assumption 2 (Exchangeability) For all i, j and g ,

    (i) E[Yig |Dig = d, Tg = t, D(i)g ] = E[Yig |Dig = d, Tg = t]

    (ii) E[Yig Yjg |Dig = d, Djg = d , Tg = t, D(ij )g ] = E[Yig Yjg |Dig = d, Djg = d , Tg = t].


This assumption is a high-level condition stating that, conditional on own treatment assignment and the
cluster-level assignment Tg , the ﬁrst and second moments of Yig do not vary with the peers’ treatment
indicators.4 As shown in Theorem 1 below, these conditions guarantee that the OLS estimator is consistent
3
  The number of parameters remains ﬁxed in our setup. See Vazquez-Bare (2023) for an alternative approach in which the
  number of parameters is allowed to grow with the sample size.
4
  We refer to unit i’s “peers” as all the units other than i in the same cluster.


                                                          8
for a weighted average of cluster-speciﬁc conditional means and that the outcome variance only depends
on (d, t), the variation in treatment assignment that is controlled by the experimental design. Assumption
2 can be interpreted as a requirement that the assignment (Dig , Tg ) contains all the relevant variation in the
outcome moments, so that the spillovers model is “correctly speciﬁed”. To further justify this assumption,
in Section 3.6 we show that this condition is guaranteed when peers are assumed to be exchangeable,
so that potential outcomes only depend on the proportion of treated peers and not on their identities.
This exchangeability assumption is very common in the spillovers literature. This requirement may be
violated, for example, in networks where units have different degrees of network centrality, and thus both
the proportion of and the identities of the treated units matter.
       Finally, we restrict cluster heterogeneity in the following way.
Assumption 3 (Cluster heterogeneity and bounded moments)
                                                                                             2/r
     (i) For some 2 ≤ r < ∞, as n → ∞, maxg n2
                                             g /n → 0 and                           g   nr
                                                                                         g         /n ≤ C < ∞.
                                                               ˜ < ∞.
    (ii) For some > r, supi,g,d,t E[|Yig | |Dig = d, Tg = t] ≤ C

Condition (i) is taken from Hansen and Lee (2019). The ﬁrst part ensures that the largest cluster is small
relative to the total sample size, so no cluster dominates the sample. The second part of condition (i) is
a regularity condition that rules out unbounded r-th moments in the distribution of cluster sizes. As an
example, setting r = 4 restricts the fourth moment of the cluster size distribution, which rules out heavy
tails.5 Condition (ii) is a standard regularity condition that ensures that the -th conditional moment of the
outcome is bounded.
   In what follows, we use “→P ” to denote convergence in probability “plimn→∞ ” to denote probability
limits, “→D ” to denote convergence in distribution and · to denote the Euclidean norm. We deﬁne any
generic (2M + 1)-dimensional vector v as:

                      v = (v (d, t))(d,t) = (v (0, 0), v (0, 1), . . . , v (0, M ), v (1, 1), . . . , v (1, M ))

                                  ˆ n = (ˆ
Consider the vector of estimators µ      µ(d, t))(d,t) from (5) and deﬁne the vector:

                                                                              g   ng pg (d|t)µg (d, t)
                                µp     p
                                 n = (µn (d, t))(d,t) ,      µp
                                                              n (d, t) =                               .
                                                                                     g ng pg (d|t)


Deﬁne the (2M + 1) × (2M + 1) covariance matrix Ωn with elements:

                                      1          g 1g Ng Ng Cov Yg , Yg |Tg , Dg
                                              E 1t  t   d d           ¯d ¯d
            Ωn ((d, t), (d , t )) =
                                      n   g                          pn (d |t )
                                                             ¯n (d|t)¯
                                                       qt qt p
                                      1                                                              g Ng , 1g Ng )
                                              (µg (d, t) − µn (d, t))(µg (d , t ) − µn (d , t ))Cov(1t  d    t  d
                                  +
                                      n   g                                          pn (d |t )
                                                                             ¯n (d|t)¯
                                                                      qt q t p
5
    Notice that condition (i) holds automatically when group sizes are seen as ﬁxed or bounded in the asymptotic analysis.


                                                                  9
where p¯n (d|t) := g ng pg (d|t)/n. In what follows we use Ωn (d, t) to refer to the diagonal elements of
Ωn . We introduce the following technical conditions to guarantee invertibility of the covariance matrix
and to ensure the denominators of the estimators are bounded below.

Assumption 4 (Invertibility conditions)

 (i) The minimum eigenvalue of Ωn is bounded away from 0.

                                                         ¯n (d|t) :=
 (ii) For any (d, t) such that pg (d|t) > 0 for some g , p             g   ng pg (d|t)/n ≥ c > 0.


    The following theorem characterizes the asymptotic distribution and variance of the OLS estimators
in (5). Let I2M +1 is a (2M + 1)-dimensional identity matrix.
                                                         −1/2
                                                              √
                                      ˆ n − µp
Theorem 1 If Assumptions 1 to 4 hold, µ      n →P 0 and Ωn         ˆ n − µp
                                                                n (µ      n ) →D N (0, I2M +1 ).



All the proofs can be found in Appendix D. Because the estimator µ  ˆ can be obtained through a saturated
OLS regression including one regressor per distinct treatment assignment, Theorem 1 can be thought of
as generalizing the results in Hansen and Lee (2019) to a speciﬁc type of nonparametric regression where
coefﬁcients are heterogeneous across clusters.


3.3    Estimation and Inference for Differences in Means

Theorem 1 has two main implications. First, each µ   ˆ(d, t) estimates a weighted average of cluster-speciﬁc
means µg (d, t), where the weights depend on the cluster size ng and the within-cluster probability of
                                                                                  a
treatment pg (d|t). Second, the distribution of µ
                                                ˆ n can be approximated as µ ˆ n ∼ N (µp  n , Ωn /n) where the
variance matrix Ωn allows for heterogeneity in cluster sizes and outcomes distributions, heteroskedasticity,
different treatment assignment probabilities across clusters and intracluster correlation in both outcomes
and unit-level treatment assignments. This result can be applied to obtain an asymptotic distributional
approximation and variance formulas for functions of µ    ˆ n , such as subvectors, linear combinations (like
the pooled and slope effects proposed by Baird et al., 2018) or nonlinear functions thereof, applying the
delta method when needed.
    Theorem 1 implies that the difference-in-means estimators β ˆ(d, t) = µ ˆ(d, t) − µ
                                                                                      ˆ(0, 0) consistently
estimate:
                                         ng pg (d|t)µg (d, t)     ng µg (0, 0)
                          βnp
                              (d, t) = g                      − g
                                            g ng pg (d|t)            n
which is different from our parameter of interest (6) because treatment probabilities may differ across
                                                                                                     p
clusters. When the treatment probabilities are equal across clusters, pg (d|t) = p(d|t) for all g , βn (d, t) =
βn (d, t) so the parameter of interest can be consistently estimated by OLS. Thus, in settings with hetero-
geneous clusters, the experimenter may prefer designs in which the within-cluster treatment probabilities


                                                      10
do not vary across clusters with the same assignment Tg = t, or to reweight the estimators by the inverse
of pg (d|t).
   When conducting inference and hypothesis testing, the variance of the estimators of interest is com-
monly estimated using a cluster-robust variance estimator. In this setting, and ignoring ﬁnite-sample
degrees-of-freedom adjustments, the cluster-robust variance estimator of Ωn is:
                            Ç               å−1                                        Ç               å−1
                 ˆ cr = n
                 Ω                  1g 1g                         ˆ )(Yg − 1g µ
                                                      1g (Yg − 1g µ           ˆ ) 1g           1g 1g               (7)
                                g                 g                                        g


where 1g = (11g , . . . , 1ng g ) is an ng × (2M + 1) matrix and 1ig = (1(Dig = d, Tg = t))(d,t) is an
ng -dimensional column vector. Based on this matrix estimator, the cluster-robust variance estimator for
the difference in means β  ˆ(d, t) is V                          ˆ cr (0, 0) using that Ω
                                                   ˆ cr (d, t) + Ω
                                      ˆcr (d, t) = Ω                                    ˆ cr ((d, t), (d , t )) = 0 for
t = t . The following result shows that, in a setting with distributional heterogeneity, the cluster-robust
variance estimator for the difference in means can be conservative.

Proposition 1 Let Vn (d, t) = Ωn (d, t) + Ωn (0, 0) − 2Ωn ((d, t), (0, 0)) denote the true asymptotic variance
                                            Ä                     ä
   ˆ(d, t). Under Assumptions 1 to 4, plim V
of β                                          ˆcr (d, t)/Vn (d, t) ≥ 1.
                                              n→∞


    The reason why the cluster-robust variance estimator can be conservative is that the true asymptotic
variance can be approximated as:

                                        ng pg (d|t) 2                          pg (d, d|t)
                                                              ß                                     ™
                              1
                 Vn (d, t) ≈                        σ (d, t) 1 + ρg (d, d, t)              (ng − 1)
                             qt     g   np¯n (d|t)2 g                           pg (d|t)
                             1          ng 2
                           +               σ (0, 0) {1 + ρg (0, 0, 0)(ng − 1)}
                             q0     g    n g
                                        ng pg (d|t)                                pg (d, d|t)
                                                                             ß                         ™
                              1                                   p        2
                          +                         (µg (d, t) − µn (d, t)) 1 +                (ng − 1)            (8)
                              qt    g   np¯n (d|t)2                                  pg (d|t)
                            1          n2g
                          +                (µg (0, 0) − µp n (0, 0))
                                                                    2
                            q0      g n
                                                                                                    ò2
                                    n2
                                     g pg (d|t)
                                       ï
                                                                  p                        p
                          −                        (µg (d, t) − µn (d, t)) − (µg (0, 0) − µn (0, 0)) .
                                g   n p   ¯n (d|t)

The ﬁrst two lines in Equation (8) represent the average within-cluster variation in outcomes for the units
in treated and pure control clusters, respectively. The third and fourth lines represent the between-cluster
variation in average outcomes for treated and control clusters, respectively. Finally, the ﬁfth line can
be interpreted as a the between-cluster variance of the difference in means, weighted by the relative
probabilities of treatment in each cluster. Note that when pg (d|t) = p        ¯n (d|t), this last term becomes
      2                       2
  g ng (βg (d, t) − βn (d, t)) /n. The last three lines in this formula equal zero when outcome distributions
are homogeneous between clusters, as we discuss below.
    The last term in Equation (8) is not estimable because it depends on the within-cluster difference in

                                                             11
means between assignments, which is never observed. The cluster-robust variance estimator is, asymptot-
ically:
                                                                                                        ò2
                                         n2
                                          g pg (d|t)
                                            ï
          ˆcr (d, t) ≈ Vn (d, t) +
          V                                                          p                         p
                                                       (µg (d, t) − µn (d, t)) − (µg (0, 0) − µn (0, 0))      (9)
                                     g   n p  ¯n (d|t)

                                                   ˆcr (d, t) can be asymptotically upward-biased, and thus
which is Equation (8) without the last term. Thus, V
inference based on this variance estimator can be conservative. Similar results have also been obtained
in design-based causal inference settings with non-random potential outcomes, see for example Hudgens
and Halloran (2008); Basse and Feller (2018); Abadie et al. (2022) and Jiang, Imai and Malani (2023).
Proposition 1 shows that an analogous result holds in a superpopulation setting when clusters exhibit dis-
tributional heterogeneity. In particular, when outcome distributions are homogeneous across clusters, this
additional term disappears and inference based on the cluster-robust variance estimator is asymptotically
exact, as we discuss further in Section 3.5.


3.4   Power Calculations and Optimal Design
                                                           p
By Theorem 1, the power of a two-sided hypothesis test of βn (d, t) = 0, can be approximated by:
                                         Å√     p                ã    Å√ p                  ã
                  p                           nβn (d, t)                nβn (d, t)
               Γ(βn (d, t))   ≈1−Φ             √         + z1−α/2 + Φ    √         − z1−α/2                  (10)
                                                 V                         V

for some appropriately chosen asymptotic variance V , where z1−α/2 is the (1 − α/2)-quantile from the
standard normal distribution. To use the true variance formula, the researcher may replace V by Equation
(8). This variance depends on the within-cluster variances, intra-cluster correlation and the between-
cluster variation in outcomes, which can be imputed using baseline data, the cluster size distribution,
which is observable, and the cluster- and unit-level assignment probabilities which are chosen by the
researcher. One issue with this choice of variance formula is that, as shown in Proposition 1, the variance
estimator that is actually used when conducting inference may be upward biased, which may result in an
underpowered study. To avoid this issue, the researcher may instead conduct power calculations using the
variance formula in Equation (9).
    The number of saturations M and the within-cluster treatment probabilities pg (d|t) and pg (d, d|t) play
a crucial role in identiﬁcation, as they determine the type of comparisons that can be made between treated
and control units. The choice of these parameters can be guided by previous knowledge or assumptions on
how the conditional average outcome varies as a function of the treatment saturation. For instance, if this
function is assumed to be linear or close to linear, two saturations would be enough to identify the shape
of this function, whereas if the function can be approximated by a quadratic function, one would need
three saturations, and so on. In turn, the choice of within-cluster treatment probabilities pg (d|t) depends
on the slope of the conditional average outcome as a function of the treatment saturation. For instance,


                                                         12
with three saturations M = {0, 1, 2}, if average outcomes are expected to jump around some value p   ¯ but
                                                                            ¯ and pg (1|2) > p
                                    ¯, the researcher can choose pg (1|1) < p
to be relatively ﬂat below or above p                                                        ¯ to increase
the chance of detecting these changes. Without knowledge of how this function is expected to change, the
researcher may spread these probabilities approximately uniformly, choosing some “low”, “intermediate”
and “high” treatment probabilities. While we do not provide formal guidance on choosing M and the
within-cluster treatment probabilities, our results in Theorem 1 and power function (10) can be used to
compare the power and MDEs of competing designs.
     We now propose a method to optimally choose the cluster-level assignment probabilities {qt }M       t=0 .
                                                                                     M
Given M and the within-group treatment probabilities, optimally choosing {qt }t=0 requires deﬁning an
optimality criterion that determines how the variances of all the estimators of interest are aggregated. The
literature on optimal design of experiments has proposed several criteria (see e.g. Silvey, 1980; Melas,
2006; Berger and Wong, 2009). We consider A-optimality, which minimizes the trace of the variance-
covariance matrix of the difference in means estimators (β   ˆ(d, t))(d,t>0) (or equivalently, the average of
the asymptotic variances).6 The justiﬁcation of this criterion is that the trace of the variance-covariance
matrix can be seen as a measure of the size of the conﬁdence ellipsoid (i.e. the multidimensional conﬁ-
dence interval) for the vector of parameters of interest. One advantage of A-optimality is its tractability,
as the optimal choice has a simple closed-form solution in this setting. In the theorem below, we consider
a generalized version of A-optimality that allows the researcher to assign different weights to different
variances.

Theorem 2 Let ω = (ωdt )(d,t>0) be a known vector of weights with ωdt ≥ 0, ω1t + ω0t > 0,                               t>0 (ω0t   +
ω1t ) = 1. Consider the optimal design problem:

                                                             M     ¶                               ©
                                              min                                          ˆ(1, t)]
                                                                          ˆ(0, t)] + ω1t V[β
                                                                    ω0t V[β
                                        q0   ,q ,...,q
                                               1         M
                                                             t=1


with qt > 0, M   t=0 qt = 1 using the variance formula in Equation (8) or (9). The optimal assignment
probabilities are given by:
                                               √
                        ∗                          B0                       ∗                Bt (ω )
                       q0 (ω )   =√                                    ,   qt (ω ) = √           √         ,   t > 0,
                                      B0 +                   Bt (ω )                  B0 +         Bt (ω )
                                               t>0                                           t>0


where
                                     2
                     B0 =        ng σg (0, 0) {1 + ρg (0, 0, 0)(ng − 1)} + ng (µg (0, 0) − µn (0, 0))2
                             g




6
    Notice that this criterion is different from the one in Baird et al. (2018), who minimize the average standard error. We propose
    this alternative method as it is in line with the theoretical literature on optimal design, while also allowing for a simple,
    closed-form solution to the optimal design problem.


                                                                            13
and for t > 0,

                                   ng pg (1|t) 2                        pg (1, 1|t)
                                              ï          ß                                   ™
                  Bt (ω ) = ω1t                 σg (1, t) 1 + ρg (1, t)             (ng − 1)
                                g p ¯n (1|t)2                            pg (1|t)
                                                               pg (1, 1|t)
                                                           ß                          ™ò
                                                         2
                                + (µg (1, t) − µn (1, t)) 1 +               (ng − 1)
                                                                 pg (1|t)
                                   ng pg (0|t) 2                        pg (0, 0|t)
                                              ï          ß                                   ™
                          +ω0t                  σg (0, t) 1 + ρg (0, t)             (ng − 1)
                                g p ¯n (0|t)2                            pg (0|t)
                                                               pg (0, 0|t)
                                                           ß                          ™ò
                                                         2
                                + (µg (0, t) − µn (0, t)) 1 +               (ng − 1) .
                                                                 pg (0|t)

Theorem 2 provides the formula for the optimal cluster assignment probabilities that minimize a weighted
average of estimators variances. By choosing the vector (ωdt )(d,t>0) , the researcher can assign lower (or
zero) weights to some parameters that are not of interest, and larger weights to parameters that are deemed
more important. For instance, to focus on comparisons between untreated units in treated clusters and pure
controls, the researcher can set ω1t = 0 for all t.
    While A-optimality has the advantage of a simple closed form solution, there are other optimality
criteria that may be desirable in different settings. Optimization problems based on these alternative
criteria do not have closed form solutions in general, but can be solved numerically using our variance
formulas. See Silvey (1980), Melas (2006) and Berger and Wong (2009) for further details and discussions.
    It should be noted that researchers may often need to incorporate different sets of constraints (such
as logistical, budgetary, political or administrative constraints) when choosing assignment probabilities.
These restrictions can be incorporated when choosing qt , either directly into the optimization problem in
Theorem 2 or on a case-speciﬁc basis. For example, in the experiment we describe in the next section, the
total number of treated units was set by the government agency. We set up a system of equations incor-
porating this restriction to control the variance of the smallest treatment cells (i.e. the noisiest estimators).
See Section 4.3 for details.
    Finally, we note that our optimality criterion does not incorporate baseline covariates. A strand of
the literature on experiments has considered alternative designs that include observed covariates in the
treatment assignment mechanism as a way to improve balance and increase precision. A common design
in these settings is the matched pairs design (Imai, King and Nall, 2009; Bai, 2022; Liu, 2023), where
each cluster is paired with another one with similar covariates and then treatment is randomized within
each pair. Our results provide a different and complementary approach that does not require covariates,
and instead allows the researcher to assign different weights to the different variances of the estimators of
interest.




                                                       14
3.5    Power Calculations under Distributional Homogeneity

The formulas for Ωn from Theorem 1 and Equations (8) and (9) can be difﬁcult to implement when the
researcher does not have access to baseline outcome data. We now introduce an additional assumption that
simpliﬁes the variance formulas and makes them easier to implement in the absence of this information.
Speciﬁcally, the following assumption rules out between-cluster heterogeneity in conditional outcome
moments.

Assumption 5 (Between-Cluster Moment Homogeneity) E[Yig |Dig = d, Tg = t] = µ (d, t) and E[Yig Yjg |Dig =
                       ˜ (d, d , t) for all g , (d, d , t) and for any for which the moments exist.
d, Djg = d , Tg = t] = c


As before, we write µ1 (d, t) = µ(d, t) to reduce notation. Under this additional assumption we obtain the
following result.

Corollary 1 Suppose Assumptions 1 to 5 hold. Then µp
                                                   n (d, t) = µ(d, t) for all (d, t), Theorem 1 holds and
the variance Ωn takes the following form:
                                                    ®                                         ´
                                      nσ 2 (d, t)                   g ng (ng − 1)pg (d, d|t)
                    Ωn (d, t) =                       1 + ρ(d, t)                               , t > 0,
                                 qt g ng pg (d|t)                          g ng pg (d|t)
                                            ®              Ç      2
                                                                        å´
                                 σ 2 (0, 0)                   g ng
                    Ωn (0, 0) =                1 + ρ(0, 0)          −1        ,
                                     q0                       n
                                                                 ng (ng − 1)pg (0, 1|t)
           Ωn ((0, t), (1, t)) = nσ (0, t)σ (1, t)ρ(0, 1, t) g                              , t > 0,
                                                             g ng pg (0|t)    g ng pg (1|t)

          Ωn ((d, t), (d , t )) = 0,   t=t

and where σ 2 (d, t) = V[Yig |Dig = d, Tg = t], ρ(d, t) = cor(Yig , Yig |Dig = d, Djg = d, Tg = t),
pg (d, d |t) = P[Dig = d, Djg = 1|Tg = t], and ρ(0, 1, t) = Cov(Yig , Yig |Dig = 0, Djg = 1, Tg = t). In
                Ä                     ä
addition, plim V  ˆcr (d, t)/Vn (d, t) = 1.
          n→∞


Corollary 1 has three main implications. First, under between-cluster homogeneity, the difference-in-
means estimators are consistent for the population differences in means β (d, t) = µ(d, t)−µ(0, 0). Second,
it shows that under this additional assumption, the cluster-robust variance estimator is consistent and thus
inference based on this estimator is asymptotically exact. Third, it provides a simpliﬁed variance formula
that allows for heterogeneity in cluster sizes and within-cluster probabilities, conditional heteroskedasticity
and intracluster correlation in outcomes and treatments, but does not depend on cluster-speciﬁc average
outcomes like the variance formula in Theorem 1. This simpliﬁed formula can be readily used to conduct




                                                       15
power and MDE calculations for the parameters of interest. Speciﬁcally,
                                                              ®                                             ´
                                                 2
                              ˆ(d, t)] ≈       σ   ( d, t)                    g n g (n g −  1) p g (d, d|t)
                           V[ β                                 1 + ρ(d, t)
                                         qt g ng pg (d|t)                             g ng pg (d|t)
                                                      ®              Ç      2
                                                                                   å´                                        (11)
                                         σ 2 (0, 0)                     g ng
                                       +                 1 + ρ(0, 0)          −1
                                            nq0                         n

which only depends on the variance and conditional intracluster correlation in outcomes (as in any standard
power calculation), the assignment probabilities, which are chosen by the experimenter, and the sample
distribution of cluster sizes, which is observable. This variance can be fed into the power formula (10) to
calculate power and MDEs. We discuss practical implementation issues in more detail in Sections 4 and
5.
     As a word of caution, we note that, just like ignoring cluster size heterogeneity, incorrectly imposing
Assumption 5 when conducting power calculations can result in variances and MDEs that are too small
because they ignore between-cluster variability in outcomes. While this assumption may be strong in some
settings, most of the formulas for experimental design available in the literature rely on it. To illustrate
this point, the following examples show how our general formulas simplify to the ones proposed in the
literature under further assumptions.

Example 1 (Standard RCT with a binary treatment) Suppose that each cluster has one unit (ng = 1),
and there are two saturations so that each (single-unit) cluster is assigned to treatment or control with
probability q and 1 − q respectively. In this case, qt = q , q0 = 1 − q , g ng pg (1|1) = n and under
Assumptions 1 to 5, V[β   ˆ(1, 1)] ≈ σ 2 (1, 1)/nq + σ 2 (0, 0)/n(1 − q ). In addition, under the homoskedas-
                                                    ˆ(1, 1)] ≈ σ 2 /nq (1 − q ) which is Equation (6) in Duﬂo,
ticity assumption σ 2 (1, 1) = σ 2 (0, 0) = σ 2 , V[β
Glennerster and Kremer (2007).

Example 2 (Cluster RCT) Suppose that clusters are assigned to two saturations Tg ∈ {0, 1} and that all
units within the same cluster receive the same treatment. In this case, pg (1|1) = pg (1, 1|1) = 1 and under
Assumptions 1 to 4,

                                                      ¯g0 |Tg = 0]
                                       ¯g1 |Tg = 1] V[Y                                                         ã2
                                 ®                                                                                           ´
                            n2       V[Y                                      µg (1) − µp       µg (0) − µp
                                                                            Å
       ˆ(1, 1)] ≈            g                                                          n (1)             n (0)
     V[β                                           +               + q 0 q1                   +
                       g    n2              q1             q0                        q1                q0

where µg (1) = µg (1, 1), µg (0) = µg (0, 0) and similarly for the remaining terms. This formula is analo-
gous to the one derived by Bugni et al. (2023) for what they call the size-weighted cluster-level average
treatment effect, up to a term in their formula that accounts for the stratiﬁcation procedure.7 Furthermore,
if Assumption 5 holds,
                                       ®                 Ç         å´              ®             Ç                    å´
                      2
          ˆ(1, 1)] ≈ σ (1, 1)                                gn2
                                                               g        σ 2 (0, 0)                              gn2
                                                                                                                  g
        V[β                                1 + ρ(1, 1)           −1   +              1 + ρ(0, 0)                    −1   .
                        nq                                   n          n(1 − q )                               n
7
    See also Liu (2023) for an analysis of stratiﬁcation in two-stage experiments.


                                                                   16
Finally, suppose clusters are equally-sized, ng = n     ¯ , and assume a random effects structure so that
σ (1, 1) = σ (0, 0) = σ + τ and ρ(1, 1) = ρ(0, 0) = τ 2 /(σ 2 + τ 2 ). In this case, V[β
 2              2             2    2                                                           ˆ(1, 1)] ≈
 nτ 2 + σ 2 )/[q (1 − q )Gn
(¯                        ¯ ] which is Equation (9) in Duﬂo, Glennerster and Kremer (2007).

Example 3 (Homoskedastic case with two treatment saturations) Suppose there are only two satura-
tions, so that M ∈ {0, 1}, as in Duﬂo and Saez (2003). Let q = P[Tg = 1] and p = P[Dig = 1|Tg = 1].
Assume that σ 2 (d, t) = 1 and ρ(d, t) = 0 for all (d, t). In this case, for assignment (d, t) = (0, 1), under
Assumptions 1 to 5, V[β ˆ(0, 1)] ≈ (1 − pq )/[(1 − p)q (1 − q )] which corresponds to the variance formula
in Hirano and Hahn (2010).

Example 4 (Random effects structure with equally-sized clusters) Suppose that all clusters are equally
sized, ng = n¯ for all g , and consider a random effects covariance structure so that σ 2 (d, t) = σ 2 + τ 2 ,
ρ(d, t) = τ 2 for all (d, t). In addition, suppose that the within-cluster assignment given Tg = t sets
a ﬁxed number of treated units n   ¯ pt in each cluster, which implies that P[Dig = 1, Djg = 1|Tg =
         npt − 1)/(¯
t] = pt (¯           n − 1). In this case, for assignment (1, t), under Assumptions 1 to 5, V[β   ˆ(1, t)] ≈
            ¶ Ä           ä           Ä         ä©
             ¯ ρ q1t + q10 + (1 − ρ) pt1qt + q10 /n which corresponds to Equation (3) in Baird et al.
(σ 2 + τ 2 ) n
(2018).


3.6     A Potential Outcomes Framework

In this section we introduce a potential outcomes framework to study the causal interpretation of the OLS
estimands discussed in the previous sections. Let Yig (d, dg , t) denote unit i’s (random) potential outcomes
where d denotes own treatment, dg ∈ {0, 1}ng −1 is a vector denoting unit i’s peers’ treatments and t
denotes the cluster-level assignment. To be able to compare outcomes across clusters, our ﬁrst assumption
is an exclusion restriction stating that the cluster level assignment t does not directly affect potential
outcomes.

Assumption 6 (Exclusion restriction) Yig (d, dg , t) = Yig (d, dg ) for all (d, dg , t).

While this assumption is required to identify treatment effects using variation across clusters, to our
knowledge we are the ﬁrst to make it explicit. This potential outcome structure allows for within-cluster
spillovers, an assumption often known as stratiﬁed interference (Hudgens and Halloran, 2008). Speciﬁ-
                                                                                                                         ˜g )
cally, Yig (1, dg ) − Yig (0, dg ) is the direct effect of the treatment on unit i in cluster g , Yig (0, dg ) − Yig (0, d
is the spillover effect on an untreated unit and Yig (1, dg ) − Yig (1, d    ˜ g ) is the spillover effect on a treated
unit. The observed outcome of interest for unit i in cluster g is denoted by Yig = d,dg Yig (d, dg )1(Dg =
(d, dg )).
   Next, we assume that the vector of treatment assignments (Dg , Tg ) is independent of the vector of
potential outcomes, which is guaranteed by random assignment of the treatment.

                                                  ⊥ (Dg , Tg ).
Assumption 7 (Independence) (Yig (d, dg ))(d,dg ) ⊥

                                                             17
    Finally, we assume that peers are exchangeable. Under this assumption, potential outcomes can depend
ﬂexibly on the proportion of treated peers, as long as they do not depend on the peers’ identities. This
assumption reduces the dimensionality of potential outcomes and is ubiquitous when analyzing spillovers
(see Vazquez-Bare, 2023, and references therein for further discussion). In what follows let 1g be an
(ng − 1)-dimensional column vector of ones.

Assumption 8 (Exchangeability) For all dg , Yig (d, dg ) = Yig (d, πg ) where πg = 1g dg /(ng − 1) is the
proportion of unit i’s treated peers.


   The following result links moments of observed outcomes to average potential outcomes.

Proposition 2 Under Assumptions 6 to 8, letting Sig =              j =i   Djg ,

                                           ng −1    ï   Å                 ãò
                                                                  sg
             E[Yig |Dig = d, Tg = t] =             E Yig d,                    P[Sig = sg |Dig = d, Tg = t]
                                           sg =0                ng − 1

for any such that the expectations are well deﬁned.


    Proposition 2 implies that the conditional mean E[Yig |Dig = d, Tg = t] in cluster g equals an average
of average potential outcomes over the proportions of treated peers that are consistent with the assignment
mechanism. In particular, if the treatment assignment mechanism exactly determines the proportion of
treated units, so that P[Sig = sg |Dig = d, Tg = t] = 1 for some sg , each observed conditional mean
point-identiﬁes the average potential outcome.
                        1      n g
Theorem 3 Let Ng          = i=1    Dig be the total number of treated units in cluster g and deﬁne yg =
(Yig (d, dg ))i,d,dg . Suppose that:

 (i) Assumptions 6 to 8 hold.

 (ii) (yg , Dg , Tg )G                                                                                   ˜l
                     g =1 are mutually independent across g ; for each g and for all i, E[Yig (d, π )] = µ g (d, π )
      for all (d, π ) and for all such that E[Yig (d, π )] < ∞; Assumption 1(iii) holds.

                                                                            ˜ < ∞.
(iii) Assumption 3(i) holds and for some > r, maxi,g,d,π E[|Yig (d, π )|] ≤ C
       1                                           1
(iv) Ng   is nonrandom conditional on Tg , with P[Ng = ng pg (1|t)|Tg = t] = 1 and pg (d|t) = p(d|t) for
     all g .

Then, Theorem 1 holds and

                                                                            ng p(1|t) − d
                                                                    ï Å                   ã             ò
                            ng                                  ng
         βn (d, t) :=          (µg (d, t) − µg (0, 0)) =           E Yig d,                 − Yig (0, 0) .
                        g   n                              g    n              ng − 1



                                                           18
Theorem 3 provides conditions on the potential outcomes and experimental design to guarantee that The-
orem 1 holds. By Proposition 2 and Condition (iii) of Theorem 3, moments of observed outcomes for
(d, t) can be replaced by moments of potential outcomes for (d, (ng p(1|t) − d)/(ng − 1)) so the formulas
in Theorem 1 can be readily applied, as long as the variance matrix is invertible. In addition, Theorem 3
ensures that differences in means have a causal interpretation: each βn (d, t) equals a cluster-size-weighted
average of average differences in potential outcomes. By Theorem 1, these parameters can be consistently
estimated by OLS.
    When cluster sizes vary, average potential outcomes may vary across clusters, even within the set of
clusters with the same assignment Tg = t and when the observed proportion of treated units is ﬁxed. To
see this, consider the following example. Suppose there are two cluster sizes, ng = 16 and ng = 20,
and consider clusters with pg (1|t) = 0.5 so that half the units are assigned to treatment. In clusters with
ng = 16, the total number of treated units will be 8 and thus the proportion of treated peers for each
unit is 8/15 ≈ 0.535 for untreated units and 7/16 = 0.438 for treated units. On the other hand, in
clusters with ng = 20 there will be 10 treated units and thus the proportion of treated peers is 10/19 ≈
0.526 for untreated units and 9/19 ≈ 0.474 for treated units. Hence, an untreated unit in a cluster with
treatment intensity pg (1|t) = 0.5 will have a proportion of 0.533 treated peers if the cluster size is 16,
and a proportion of 0.526 treated peers if the cluster size is 20, so the proportions are slightly different
even though the treatment assignment is the same. As a result, to be able to use the simpliﬁed formula
in Corollary 1, the outcome homogeneity assumption needs to be strengthened to ensure that average
potential outcomes are invariant to small perturbations in the proportion of treated units, as shown below.

Theorem 4 Suppose that the conditions for Theorem 3 hold, and that:

                      ˜ (d, π ) and E[Yig (d, π )Yjg (d , π )] = c
 (i) E[Yig (d, π )] = µ                                          ˜ (d, d , π, π ) for all g and (d, d , π, π ).

                                                                                                        ˜ (d, πg (d|t)) − µ
 (ii) For each (d, d , t) there exists a π (d|t) such that for πg (d|t) = (ng pg (1|t)−d)/(ng −1), maxg µ                 ˜(
      0 and
                                           ˜ (d, d , π (d|t), π (d |t)) = 0.
          ˜ (d, d , πg (d|t), πg (d |t)) − c
     maxg c

Then Corollary 1 holds and β (d, t) := µ(d, t) − µ(0, 0) = E [Yig (d, π (d|t)) − Yig (0, 0)].

Condition (i) above states that, for a given (d, π ), potential outcome moments do not vary across clusters,
whereas condition (ii) formalizes the requirement that potential outcome moments are invariant to pertur-
bations in the proportion of treated peers generated by the variation in cluster sizes, that is, the function
is locally ﬂat. Intuitively, in the example from the previous paragraph, this condition implies for instance
that E[Yig (0, 0.533)] = E[Yig (0, 0.526)]. While this second condition may be unlikely to hold exactly in
practice, it can be a reasonable approximation when πg (d|t) shows little variation across g for each (d, t)
(which happens for example when clusters are not very small) and/or the function µ     ˜ (d, π ) is relatively
ﬂat around relevant values of π . Under these conditions, Corollary 1 can be applied to estimate direct and
spillover effects by OLS and to conduct power calculations.

                                                         19
4     Estimating Spillovers in Tax Compliance

4.1    Background

There is a large literature on nudges and tax compliance (Antinyan and Asatryan, 2024), but there is
relatively scant evidence on the social interaction effects behind these interventions. We designed and
implemented an intervention based on the framework presented in the previous sections to illustrate its
potential to capture social interaction effects in tax compliance.
    The intervention took place in a large municipality of Argentina where dwellings are billed and re-
quired to pay a municipal property tax on a monthly basis (the Tasa por Servicios Generales). The treat-
ment consisted of a one-page personalized letter with information on the current billing period, past due
debt, and how to pay online or in person.8
     The randomized treatment assignment was conducted in two stages—ﬁrst at the street block level
(clusters), and then at the taxpayer account/dwelling level (units). In the ﬁrst stage, we randomly divided
blocks into four categories with different intensity of treatment, as depicted in Figures A.2 and A.3: (1)
pure control blocks where no accounts were treated, (2) blocks with 20% of the accounts treated, (3) blocks
with 50% of the accounts treated, and (4) blocks with 80% of the accounts treated. These different treat-
ment intensities were designed to assess whether spillovers depend on the saturation of our information
campaign at the block level (namely, low, medium, and high saturation levels).9 In the second stage, we
randomly selected accounts within the latter three groups of blocks according to their treatment saturation
to receive the letter. The experiment was run on residential dwellings present in the municipality in 2019.
The timeline of the intervention is displayed in Figure A.4. The letters were delivered between September
28th and October 7th, 2020, corresponding to payments due on October 9th, 2020, as well as past due debt
(if any).


4.2    Administrative Data

We use a combination of administrative databases provided by the revenue agency of the municipality
where the experiment took place. The main database is constructed from the monthly bills issued to
account holders between January 2018 and December 2020. The unit of observation is an account (cuenta),
which coincides with a dwelling unit. The data contain the following billing details and demographic
characteristics of the account holder (titular): account number (unique ID), address, block number, name
of locality (neighborhood), year and month of the bill (12 bills per year), monthly fee (in pesos), paid fee

8
  Figure A.1 in the appendix provides an anonymized example of the intervention letter. Our simple design emphasized action-
  relevant information, in accordance with De Neve et al. (2021) who show that simpliﬁed tax letters are an effective way to
  increase tax compliance.
9
  The choice p1 = 20%, p2 = 50% and p3 = 80% attempts to balance parsimony with the ﬂexibility to detect nonlinearities in
  direct and spillover effects without having to estimate too many parameters. See Section 3.4 for further discussion.


                                                            20
(amount in pesos), due date, date of payment, days overdue, means of payment (cash or electronic), type
of account (residential, retail store, factory), gender of the account holder, age of the account holder, linear
front meters of the lot/property, assessed value of the property.
     The municipality authorities required us to target blocks with eight to ﬁfty accounts, neither very
sparse nor very dense, which was the target for their mailing campaign. Figure 1 shows the distribution
of accounts per block. Table A2 shows some descriptive statistics for the year 2019. Our sample size
consists of 68,808 accounts distributed in 3,982 blocks. The frequency of payments is highly polarized.
About 45 percent of the accounts paid the twelve 2019 monthly bills, and about 35 percent did not pay any
bill at all.10 We call these two core groups always payers and never payers, respectively. The proportion of
always payers is relatively low (45 percent) and, therefore, leaves room for potential behavioral responses
from non-compliant and partially-compliant neighbors, and this was compounded by the context of the
pandemic, during which lockdown measures reduced payments even from highly compliant individuals.
    For the randomization, power calculations, and simulations, we use baseline data from the year 2019.
We rely on three different pre-treatment outcomes: (i) an indicator equal to 1 if the account paid the twelve
monthly bills of 2019, (ii) an indicator equal to 1 if the account paid at least one bill in 2019, and (iii) an
indicator equal to 1 if the account paid six bills or more in 2019.


4.3        Experimental Design and MDEs

Following the notation in Section 3, the block-level treatment indicator is denoted by Tg ∈ {0, 1, 2, 3}
with distribution P[Tg = t] = qt for t = 0, 1, 2, 3 where Tg = 0 indicates the pure control blocks, Tg = 1
indicates the blocks with 20% treated, Tg = 2 indicates blocks with 50% treated, and Tg = 3 indicates
blocks with 80% treated. The account-level treatment indicator is Dig ∈ {0, 1}.
     We use an independent within-cluster treatment assignment and constant within-cluster treatment prob-
abilities. In the absence of data from a pilot experiment, we assume equal moments across assignments
  2            2
σg  (d, t) = σg  (0, 0), µg (d, t) − µn (d, t) = µg (0, 0) − µn (0, 0) and ρg (d, t) = ρg (0, 0) for all g, d, t. We
further assume that the intracluster correlation is constant across clusters. We then impute all these mag-
nitudes based on our baseline data. The parameters of interest are the difference in means between treated
or untreated units in each treated group and the pure control units, βn (d, t) for d = 0, 1 and t = 1, 2, 3.
    The municipality authorities requested that the total number of letters sent be set to L = 25, 061. To
incorporate this constraint into the choice of the saturation probabilities qt , we set up a system of equations
as follows. The expected number of treated units is n1 = n(0.2q1 + 0.5q2 + 0.8q3 ). Since the assignments
Tg = 1 and Tg = 3 can be seen as symmetric, we set q1 = q3 . Finally, we add an equation that ensures
that the variance of the effect at 50% saturation is equal to the variance for the “small” cells (treated units
in 20% clusters and untreated units in 80% clusters), so that V[β   ˆ(d, 2)] = V[β  ˆ(0, 3)] = V[β ˆ(1, 1)]. This

10
     For the full distribution, see Figure B.15.


                                                        21
gives a third equation of the form q2 = Rq3 where R is a constant obtained from our variance formulas.
Our system of equations consists of four equations: (i) L = n(0.2q1 + 0.5q2 + 0.8q3 ), (ii) q1 = q3 , (iii)
q2 = Rq3 and (iv). We use the results in Theorem 1 to approximate the variances and calculate the ratio
R.
    For our MDE calculations, and to illustrate our methods, we consider three scenarios: one with “sub-
stantial” heterogeneity (scenario 1), one with “moderate” heterogeneity (scenario 2), and one with “lim-
ited” heterogeneity (scenario 3). The ﬁrst one uses our raw data set that contains all clusters with eight
households or more. This raw data contains one cluster with a very large number of units. This is the
scenario where cluster heterogeneity is most substantial. The second scenario considers an intermediate
case where we drop all clusters with more than 500 units. This second data set still exhibits substantial
heterogeneity but eliminates one extreme outlier. Finally, the third scenario considers clusters of size be-
tween eight and 50, which is the sample we use in our experiment. While scenarios 1 and 2 are not used in
our empirical analysis, we use them to illustrate how ignoring cluster heterogeneity can result in severely
underpowered experiments.
    These three scenarios are described in Table 1. In scenario 1, the average cluster size is around 21
households, but the data set contains one very large outlier with 2,754 units. This large cluster makes the
standard deviation of cluster sizes very large, indeed larger than the average size, so our theoretical results
indicate that the adjustment for heterogeneity will make a substantial difference when calculating power
and MDEs. In scenario 2, we remove this outlier from the data and this reduces the standard deviation of
cluster size, slightly below but very close to the average cluster size. Finally, in scenario 3, while cluster
sizes are still heterogeneous and range from eight to 50, the standard deviation is about half the average
size. Thus, we may expect the power and MDE adjustment for cluster heterogeneity to be less sizeable
in this case. In each scenario, we consider the results obtained with our general formula from Theorem
1 (“Het”), the formulas that rule out between-cluster moment heterogeneity from Corollary 1 (“Homog”)
and the formulas that assume homogeneous, equally sized clusters (“Equal”). We emphasize that this last
case imposes an incorrect assumption in all three scenarios, as cluster sizes are not homogeneous in our
sample.
     Table 1 shows the cluster assignment probabilities and MDEs for the binary outcomes of interest. We
refer to the corresponding MDEs for the parameters βn (0, 1), βn (0, 2) and βn (0, 3) as M DE1 , M DE2 and
M DE3 , respectively (the MDEs for βn (1, t) are symmetric and therefore not reported). Our calculations
reveal that in scenario 1, the MDEs vary dramatically and range from 0.02 to almost 0.15 depending on
the assumptions one is willing to make about cluster heterogeneity. In scenario 2, the difference in MDEs
is less pronounced but still substantial: the MDEs under full heterogeneity are about twice as large as the
case with equally-sized and homogeneous clusters. Reassuringly, in scenario 3, the one that we use in
our experiment, the MDEs are much more robust to the different assumptions about cluster heterogeneity,
although ignoring heterogeneity may still result in MDEs that are about 30% smaller than the ones that
account for it.


                                                      22
    For comparison, we repeat these MDE calculations using the optimal cluster assignment probabilities
from Theorem 2 to assess the extent to which the constraint in the number of treated units affects the power
of our experiment. The results are shown in the bottom panels of Table 1. These calculations reveal that
our results are very robust: using the optimal cluster assignment probabilities instead of the constrained
probabilities would give different proportions of clusters in each saturation, but similar MDEs. The ﬁnal
sample sizes for our experiment are shown in Table A1. We assign 1,102 cluster to pure control, 1,100
clusters to the 20% and 80% saturation and 680 clusters to the 50% saturation. Note that the 20% and
80% saturations are “oversampled” relative to the 50% saturation because they contain small cells (20%
treated and 20% untreated, respectively).


4.4     Empirical Results

4.4.1    Direct and Spillover Effects on the Treated Tax Bill

We begin the empirical analysis by estimating direct and spillover effects on payments of the October
2020 property tax bill. The due date was October 9th, and the letters were delivered between September
28th and October 7th. We show graphical evidence of the causal effect of the intervention in Figure 3 and
summarize the point estimates in Table 2.11 Figure 3 presents the coefﬁcients and 95% conﬁdence intervals
from a saturated regression that estimates the day by day difference in payment rates between treated and
untreated units relative to accounts in pure control blocks.12 The ﬁgure focuses on high-saturation blocks
with 80% treated units. The complete analysis with the three saturation groups is presented in Figure B.8
and Table 2.
    The top left panel of Figure 3 reveals a clear positive effect of the intervention on tax compliance of
treated accounts. The payment rate of treated units started to diverge from the pure control group as soon
as the intervention began. This treatment effect reached a magnitude of about 4.5 percentage points exactly
by the due date of the current billing period, and it stayed relatively constant thereafter.13 The top right
panel of Figure 3 shows a clear spillover effect of the intervention on untreated accounts in high-saturation
blocks. It is smaller in size than the main direct effect but still substantial. The payment rates increase by
about 1.1 percentage points, and the effect is statistically signiﬁcant in the early days of the intervention,
losing signiﬁcance from the due date onward for the full sample.
    Table 2 presents the direct and spillover effect coefﬁcients for the three saturation groups. Panels A,
B, and C display the effects in blocks where 80%, 50%, and 20% of accounts were treated, respectively.

11
   Appendix Section A.3 conﬁrms that our groups are balanced and comparable. Appendix Section B.2 further shows that our
   tax communication campaign also increased the subscriptions to receive an electronic bill by e-mail and induced taxpayers to
   pay past-due tax bills.
12
   The top panels of Figures B.6 and B.7 display the corresponding payment rates in levels, i.e., the cumulative share of individ-
   uals paying the October 2020 bill over time for treated, untreated, and pure control units.
13
   Appendix Section B.3 shows that pure control blocks are not (indirectly) affected by adjacent treated blocks and thus provide
   a valid counterfactual for our analysis.


                                                               23
The omitted category comprises pure control blocks with untreated accounts only. Columns (1) and (4)
show the coefﬁcients and block-clustered standard errors for October 2020 bill payments on two different
dates: October 3 (early payments) and October 31 (includes overdue payments). As a benchmark for our
treatment effects, the last row reports the average payment rate in pure control blocks at each of these dates
(i.e., the constant of each regression).
    The results in Table 2 indicate that in the early stage of the intervention high-saturation blocks with
80% treated accounts present statistically signiﬁcant direct and spillover effects of about 1 percentage
point. This is relatively large, considering that only 5.2% of neighbors in pure control blocks had paid their
October 2020 bill by this date. Naturally, as time passes and more individuals pay their bills (reaching
34.4% in pure control blocks by the end of the month), small effects become harder to detect. Thus, while
the spillover effect on untreated units remains unchanged in size, it statistical signiﬁcance diminishes over
time. Conversely, the direct effect on treated units rises to 4.5 percentage points, representing 13.2% of
the payment rate in pure control blocks.14


4.4.2    Heterogeneous Effects

The results from the full experimental sample presented in Section 4.4.1 revealed modest spillover effects
in the high saturation group, primarily in the early days of the intervention. However, as outlined in
our pre-analysis plan, treatment effects are likely to vary along a fundamental dimension, namely pre-
treatment tax compliance behavior. In this section, we study heterogeneous effects along this dimension
by dividing the sample into blocks that exhibited average past compliance (i.e., payments) in 2019 above
and below the block median.15
    Columns (2)-(3) and (5)-(6) of Table 2 break down the results from columns (1) and (4), respectively,
into blocks with below and above median 2019 (i.e., pre-intervention) compliance. The direct effects at
the end of the month are generally larger but not substantially different: for blocks with 80%, 50%, and
20% saturation, direct effects are about 5.1, 5.7, and 4.4 percentage points for street blocks above the
median average compliance in 2019, compared to about 4.1, 4.8 and 5.4 for those below the median. The
differences are relatively small for early payments.
   The division of the sample into these two groups reveals a much starker contrast for spillover effects.
As in the main analysis in column (1) of Table 2, there is a spillover effect in early payments for the
80% saturation group but only for blocks above median compliance in 2019. The middle and bottom
panels of Figure 3 make this pattern all the more apparent. This effect is relatively large: 1.58 percentage
points, larger in fact than the direct effect of 1.06. The end-of-month spillover effect is much larger: 2.56
percentage points, about half of the direct effect in the high-saturation group (5.09 percentage points).

14
   Figure B.9 and Table A4 validate our analysis by showing experimental balance (i.e., no effects) on the pre-intervention bill
   of September 2020.
15
   See Appendix Section B.4 for more details and the intuition behind this exercise.


                                                              24
5    Recommendations for Practice

In this section, we outline the steps for designing partial population experiments and refer to an example
of spillovers on student test scores of an education intervention, the distribution of One Laptop per Child
(as in Beuermann et al. 2015) to illustrate these steps.
Step 1. First, the researcher needs to select the number of saturations M (i.e., categories with different
intensity of treatment) and the within-cluster treatment probabilities {pg (d|t)}(g,d,t) (i.e., the proportion of
within-clusters treated units), as discussed in Section 3.4. The choice of M could be guided by previous
knowledge or by assumptions on how conditional average outcomes vary as a function of the treatment
saturation. Consider, for example, the case of One Laptop per Child (OLPC)-type experiment in which
each cluster is a school. Assuming spillovers are linear as a function of saturations, a pure control group
of schools with untreated pupils, and two groups of schools with different degrees of intensity of treatment
or saturation (low and high) would sufﬁce. To test for non-linear spillovers as a function of saturation, the
researcher should specify at least three saturation levels (low, medium, and high). Our framework does
not provide speciﬁc guidance on these choices, but highlights the trade-off between the level of detail in
which this function can be traced and the availability of units in each treatment assignment - i.e., there
might not be enough schools or laptops to distribute to test many different saturation levels. Our power
formulas quantify these trade-offs in terms of the statistical power for different designs.
Step 2. Use baseline data to assess the degree of cluster size heterogeneity. In the OLPC case, cluster
size consists of each school’s enrollment which may vary substantially, especially across districts or geo-
graphical areas (for instance, if there are large urban and smaller rural schools in the population). In some
cases, the researcher may consider excluding clear outliers to satisfy the “no cluster too large” require-
ment, Assumption 3(i). In the OLPC context, it may be necessary to exclude one particularly large school
from the experiment. It should be kept in mind that excluding outliers generally changes the population
of interest and thus affects the external validity of the estimates. In addition to accounting for variation in
cluster sizes, the researcher may need to account for distributional heterogeneity, i.e., variation in outcome
distributions across clusters (see Theorem 1). The variation in outcome distributions across clusters may
be assessed using baseline outcome data and possibly some distributional assumptions on outcomes.
Step 3. Select the variance formula for the power and MDE calculations. This step may be based on the
general formula in Equation (9) or on the simpliﬁed formula in Equation (11) under distributional homo-
geneity. These formulas may be further simpliﬁed under additional assumptions on the data-generating
process or the experimental design. For instance, one may assume equal within-cluster probabilities across
g . Another possible simplifying assumption is homoskedasticity, σ 2 (d, t) = σ 2 and ρ(d, t) = ρ for all
(d, t), which means, for example, that the variance and intra-school correlation in student test scores are
the same across treatment assignments. This assumption may be reasonable when the effects of the treat-
ment (e.g., the effects of OLPC on test scores) are approximately constant across units.
Step 4. Choose the cluster-level assignment probabilities {qt }M
                                                               t=0 —that is, what proportion of clusters


                                                       25
to assign to each of the saturation levels deﬁned in point 1 above. These probabilities can be chosen
using Theorem 2 when the goal is to minimize a weighted average of estimators variances, incorporating
ad-hoc constraints as in Section 4.3, or based on another optimization criteria (and possibly numerical
methods) as discussed in Section 3.4. One common ad-hoc constraint is having a ﬁxed number of treated
units. In this case, researchers may rely on a system of equations as in Section 4.3. For example, in the
OLPC RCT example, the government may have mandated the distribution of exactly 10,000 laptops for
the experiment. In that case, if we have two saturation groups of 25% and 75% treated pupils within a
school (i.e., clusters), and the saturation probabilities are q1 and q2 , respectively, researchers should use an
equation that represents the treatment units as 10, 000 = n(0.25q1 + 0.75q2 ), where n is the total number
of pupils. Another condition may, for example, equalize the variance of the estimators in the “small”
                                                                                   î        ó      î        ó
cells (i.e., treated units in 25% and untreated units in 75% class), that is: V β    ˆ(1, 1) = V β   ˆ(0, 2) . See
Section 4.3.
Step 5. Use the power formula in Equation (10) together with the variance formula chosen in step 3 and
the cluster probabilities in step 4 to calculate power and/or MDEs.
    It should be noted that our framework encompasses several other common settings that are particular
cases of partial population experiments. For example, our formulas can be used for designing clustered
RCTs, such as an intervention where all students in treated schools receive an OLPC laptop. Finally, we
note that our results allow for general between-cluster heterogeneity but assume that outcome distributions
are homogeneous within each cluster. We leave the generalization of our framework to within-cluster
heterogeneity for future work.



6    Conclusion

We provide a general framework to analyze and design partial population experiments with heterogeneous
clusters. We derive an asymptotic approximation and variance formulas for general clustered experimental
designs, allowing for multiple treatment intensities, general forms of intracluster correlation, and two
sources of cluster heterogeneity: heterogeneity in cluster sizes and distributional heterogeneity. We then
apply our results to analyze inference and to conduct power and MDE calculations in partial population
experiments, and derive formulas for optimal group-level assignment probabilities. Our formulas are easy
to adapt to other experimental designs.
    We estimate total and neighborhood spillover effects of a randomized communication campaign on
property tax compliance in a large municipality of Argentina where neighbors must pay a monthly bill
on their real estate. We estimate direct effects on monthly payments and analyze whether the campaign
creates spillover effects on neighbors who live nearby within a treated block but who do not receive a letter.
We ﬁnd evidence of direct and spillover effects on property tax payment rates. Our results reveal higher
payment rates of treated and untreated accounts relative to neighbors in pure control blocks where nobody


                                                       26
received the communication letter. We ﬁnd that spillover effects are stronger in blocks that exhibited a
higher degree of tax compliance in the pre-treatment period. This application showcases the usefulness of
our methodological framework for designing partial population experiments.



References
Abadie, Alberto, Susan Athey, Guido W Imbens, and Jeffrey M Wooldridge. 2022. “When Should
 You Adjust Standard Errors for Clustering?” Quarterly Journal of Economics, 138(1): 1–35.

Angelucci, Manuela, and Giacomo De Giorgi. 2009. “Indirect Effects of an Aid Program: How Do Cash
 Transfers Affect Ineligibles’ Consumption?” American Economic Review, 99(1): 486–508.

Antinyan, Armenak, and Zareh Asatryan. 2024. “Nudging for Tax Compliance: a Meta-Analysis*.”
 The Economic Journal, ueae088.

Athey, Susan, Dean Eckles, and Guido W. Imbens. 2018. “Exact P-values for Network Interference.”
  Journal of the American Statistical Association, 113(521): 230–240.

Baird, Sarah, Aislinn Bohren, Craig McIntosh, and Berk Ozler.¨      2018. “Optimal Design of Experi-
  ments in the Presence of Interference.” The Review of Economics and Statistics, 100(5): 844–860.

Bai, Yuehao. 2022. “Optimality of Matched-Pair Designs in Randomized Controlled Trials.” American
  Economic Review, 112(12): 3911–3940.

Barrera-Osorio, Felipe, Marianne Bertrand, Leigh L. Linden, and Francisco Perez-Calle. 2011. “Im-
  proving the Design of Conditional Transfer Programs: Evidence from a Randomized Education Exper-
  iment in Colombia.” American Economic Journal: Applied Economics, 3(2): 167–195.

Basse, Guillaume, and Avi Feller. 2018. “Analyzing two-stage experiments in the presence of interfer-
  ence.” Journal of the American Statistical Association, 113(521): 41–55.

Basse, G W, A Feller, and P Toulis. 2019. “Randomization tests of causal effects under interference.”
  Biometrika, 106(2): 487–494.

Berger, Martijn P.F., and Weng-Kee Wong. 2009. An Introduction to Optimal Designs for Social and
  Biomedical Research. Wiley.

Bergeron, Augustin, Gabriel Tourek, and Jonathan Weigel. 2024. “The State Capacity Ceiling on Tax
  Rates: Evidence from Randomized Tax Abatements in the DRC.” NBER working paper.

Beuermann, Diether W., Julian Cristia, Santiago Cueto, Ofer Malamud, and Yyannu Cruz-Aguayo.
  2015. “One Laptop per Child at Home: Short-Term Impacts from a Randomized Experiment in Peru.”
  American Economic Journal: Applied Economics, 7(2): 53–80.

                                                   27
Bloom, Howard S. 2005. “Randomizing Groups to Evaluate Place-Based Programs.” Learning More from
  Social Experiments: Evolving Analytic Approaches, 115–172. Russell Sage Foundation.

Boning, William C., John Guyton, Ronald Hodge, and Joel Slemrod. 2020. “Heard it through the
  grapevine: The direct and network effects of a tax enforcement ﬁeld experiment on ﬁrms.” Journal of
  Public Economics, 190(C).

Brockmeyer, A, A Estefan, K Ramirez Arras, and J.C. Suarez Serrato. 2020. “Taxing Property in
  Developing Countries: Theory and Evidence from Mexico.” IFS Working Paper.

Bruhn, Miriam, and David McKenzie. 2009. “In Pursuit of Balance: Randomization in Practice in
  Development Field Experiments.” American Economic Journal: Applied Economics, 1(4): 200–232.

Bugni, Federico A., Ivan A. Canay, and Azeem M. Shaikh. 2018. “Inference under Covariate-Adaptive
  Randomization.” Journal of the American Statistical Association, 113(524): 1784–1796.

Bugni, Federico A, Ivan A. Canay, and Azeem M. Shaikh. 2019. “Inference under Covariate-Adaptive
  Randomization with Multiple Treatments.” Quantitative Economics, 10(4): 1747–1785.

Bugni, Federico A., Ivan A. Canay, Azeem M. Shaikh, and Max Tabord-Meehan. 2023. “Inference
  for Cluster Randomized Experiments with Non-ignorable Cluster Sizes.” arXiv:2204.08356.

Carrillo, Paul E., Edgar Castro, and Carlos Scartascini. 2021. “Public good provision and property tax
  compliance: Evidence from a natural experiment.” Journal of Public Economics, 198: 104422.

Carter, Andrew V., Kevin T. Schnepel, and Douglas G. Steigerwald. 2017. “Asymptotic Behavior of a
  t-Test Robust to Cluster Heterogeneity.” Review of Economics and Statistics, 99(4): 698–709.

Chiang, Harold, Yuya Sasaki, and Yulong Wang. 2023. “On the Inconsistency of Cluster-Robust Infer-
 ence and How Subsampling Can Fix It.” arXiv:2308.10138.

  epon, Bruno, Esther Duﬂo, Marc Gurgand, Roland Rathelot, and Philippe Zamora. 2013. “Do La-
Cr´
  bor Market Policies have Displacement Effects? Evidence from a Clustered Randomized Experiment.”
  Quarterly Journal of Economics, 128(2): 531–580.

De Neve, Jan-Emmanuel, Cl´    ement Imbert, Johannes Spinnewijn, Teodora Tsankova, and Maarten
  Luts. 2021. “How to Improve Tax Compliance? Evidence from Population-Wide Experiments in Bel-
  gium.” Journal of Political Economy, 129(5): 1425–1463.

Djogbenou, Antoine A., James G. MacKinnon, and Morten Ørregaard Nielsen. 2019. “Asymptotic
  theory and wild bootstrap inference with clustered errors.” Journal of Econometrics, 212(2): 393–412.

Drago, Francesco, Friederike Mengel, and Christian Traxler. 2020. “Compliance Behavior in Net-
  works: Evidence from a Field Experiment.” American Economic Journal: Applied Economics,
  12(2): 96–133.

                                                  28
Duﬂo, Esther, and Emmanuel Saez. 2003. “The Role of Information and Social Interactions in Retire-
 ment Plan Decisions: Evidence from a Randomized Experiment.” Quarterly Journal of Economics,
 118(3): 815–842.

Duﬂo, Esther, Rachel Glennerster, and Michael Kremer. 2007. “Using Randomization in Development
 Economics Research: A Toolkit.” In Handbook of Development Economics. Vol. 4 of Handbook of
 Development Economics, , ed. T. Paul Schultz and John A. Strauss, 3895–3962. Elsevier.

Foos, Florian, and Eline A. de Rooij. 2017. “All in the Family: Partisan Disagreement and Electoral
  Mobilization in Intimate Networks—A Spillover Experiment.” American Journal of Political Science,
  61(2): 289–304.

Gin´e, Xavier, and Ghazala Mansuri. 2018. “Together We Will: Experimental Evidence on Female Vot-
  ing Behavior in Pakistan.” American Economic Journal: Applied Economics, 10(1): 207–235.

Hansen, Bruce E., and Seojeong Lee. 2019. “Asymptotic theory for clustered samples.” Journal of
 Econometrics, 210(2): 268–290.

Haushofer, Johannes, and Jeremy Shapiro. 2016. “The Short-term Impact of Unconditional Cash Trans-
 fers to the Poor: Experimental Evidence from Kenya.” Quarterly Journal of Economics, 131(4): 1973–
 2042.

Hirano, Keisuke, and Jinyong Hahn. 2010. “Design of Randomized Experiments to Measure Social
  Interaction Effects.” Economics Letters, 106(1): 51–53.

Hudgens, Michael G., and M. Elizabeth Halloran. 2008. “Toward Causal Inference with Interference.”
 Journal of the American Statistical Association, 103(482): 832–842.

                                     ¨
Ichino, Nahomi, and Matthias Schundeln.    2012. “Deterring or Displacing Electoral Irregularities?
  Spillover Effects of Observers in a Randomized Field Experiment in Ghana.” Journal of Politics,
  74(1): 292–307.

Imai, Kosuke, Gary King, and Clayton Nall. 2009. “The Essential Role of Pair Matching in Cluster-
  Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation.”
  Statistical science, 24(1): 29–53.

Imai, Kosuke, Zhichao Jiang, and Anup Malani. 2021. “Causal Inference With Interference and Non-
  compliance in Two-Stage Randomized Experiments.” Journal of the American Statistical Association,
  116(534): 632–644.

Jiang, Zhichao, Kosuke Imai, and Anup Malani. 2023. “Statistical Inference and Power Analysis for
  Direct and Spillover Effects in Two-Stage Randomized Experiments.” Biometrics, 79(3): 2370–2381.



                                                29
Krause, Benjamin. 2020. “Balancing Purse and Peace: Tax Collection, Public Goods and Protests.”
 Mimeo.

Leung, Michael P. 2022. “Rate-optimal cluster-randomized designs for spatial interference.” The Annals
  of Statistics, 50(5): 3064 – 3087.

Liu, Jizhou. 2023. “Inference for Two-stage Experiments under Covariate-Adaptive Randomization.”
  arXiv:2301.09016.

Melas, Viatcheslav B. 2006. Functional Approach to Optimal Experimental Design. Springer New York.

Miguel, Edward, and Michael Kremer. 2004. “Worms: Identifying Impacts on Education and Health in
 the Presence of Treatment Externalities.” Econometrica, 72(1): 159–217.

Mofﬁt, Robert. 2001. “Policy Interventions, Low-level Equilibria and Social Interactions.” In Social Dy-
 namics. , ed. Stephen N. Durlauf and Peyton Young, 45–82. MIT Press.

Moulton, Brent R. 1986. “Random group effects and the precision of regression estimates.” Journal of
 Econometrics, 32(3): 385–397.

Pomeranz, Dina. 2015. “No Taxation without Information : Deterrence and Self-Enforcement in the
  Value Added Tax.” American Economic Review, 105(8): 2539–2569.

Pomeranz, Dina, and Jos´  e Vila-Belda. 2019. “Taking State-Capacity Research to the Field: Insights
  from Collaborations with Tax Authorities.” Annual Review of Economics, 11(1): 755–781.

Puelz, David, Guillaume Basse, Avi Feller, and Panos Toulis. 2022. “A graph-theoretic approach to ran-
  domization tests of causal effects under general interference.” Journal of the Royal Statistical Society:
  Series B, 84(1): 174–204.

Sasaki, Yuya, and Yulong Wang. 2022. “Non-Robustness of the Cluster-Robust Inference: with a Pro-
  posal of a New Robust Method.” arXiv:2210.16991.

Silvey, Samuel D. 1980. Optimal Design: An Introduction to the Theory for Parameter Estimation.
   Springer Netherlands.

Vazquez-Bare, Gonzalo. 2023. “Identiﬁcation and Estimation of Spillover Effects in Randomized Exper-
  iments.” Journal of Econometrics, 237(1): 105237.

Viviano, Davide. 2024. “Policy design in experiments with unknown interference.” working paper.

Weigel, Jonathan L. 2020. “The Participation Dividend of Taxation: How Citizens in Congo Engage
 More with the State When it Tries to Tax Them.” Quarterly Journal of Economics, 135(4): 1849–1903.




                                                    30
Figures and Tables

                                       Figure 1: Distribution of cluster sizes in six partial population experiments

             300
                                                                                                                 50



                                                                                                                 40

             200
 Frequency




                                                                                                     Frequency
                                                                                                                 30



                                                                                                                 20
             100


                                                                                                                 10



                     0                                                                                               0
                             10         20                    30                40        50                                 0             100                                 200             300
                                                  Cluster (block) size                                                                           Cluster (local agency) size


(a) This paper; n = 68, 808; G = 3, 982; n/G =                                                            epon et al. (2013); n = 21, 431; G = 235;
                                                                                                    (b) Cr´
17.3; sd(ng ) = 8.25.                                                                               n/G = 91.2; sd(ng ) = 42.2.
             30                                                                                                  6




             20                                                                                                  4
 Frequency




                                                                                                     Frequency




             10                                                                                                  2




                 0                                                                                               0
                         0         20                    40                     60        80                             0            20                 40                          60        80
                                                Cluster (village) size                                                                       Cluster (neighborhood) size


(c) Haushofer and Shapiro (2016); n = 2, 880; G =                                                          e and Mansuri (2018); n = 2, 637; G = 67;
                                                                                                    (d) Gin´
123; n/G = 23.4; sd(ng ) = 14.8.                                                                    n/G = 39.4; sd(ng ) = 16.7.

             4
                                                                                                                 80




             3                                                                                                   60
 Frequency




                                                                                                     Frequency




             2                                                                                                   40




             1                                                                                                   20




             0                                                                                                       0
                     0            10                   20                  30        40                                          0   20             40                   60               80   100
                                             Cluster (constituency) size                                                                            Cluster (village) size


                  undeln (2012); n = 868; G = 39;
(e) Ichino and Sch¨                                                                                 (f) Imai, Jiang and Malani (2021); n = 11, 089;
n/G = 22.3; sd(ng ) = 9.6.                                                                          G = 437; n/G = 25.4; sd(ng ) = 16.7.
Notes: This ﬁgure shows the distribution of cluster sizes in six partial population experiments; n denotes the total sample
size; G denotes the number of clusters; n/G denotes the average cluster size; sd(ng ) denotes the standard deviation of cluster
sizes. Average values across studies are n = 7, 781; G = 180; n/G = 40.4; sd(ng ) = 20. Median values across studies are
n = 2, 880; G = 123; n/G = 25.6; sd(ng ) = 16.7. The data source for Cr´          epon et al. (2013) is: DARES (2010) “Enquˆete
aupr`                         ` la prestation d’insertion jeunes diplˆ
                   ´ ligibles a
     es des jeunes e                                                   es,” Progedo-Adisp. doi:10.13144/lil-1596.
                                                                     om´




                                                                                               31
                                 Figure 2: Power functions - numerical illustration




                     1.0
                                                           power = 0.80
                     0.8



                                                           power = 0.69
                     0.6
             Power




                                                           power = 0.48
                     0.4
                     0.2




                                                                                          MDE = 0.29
                     0.0




                                 −0.4           −0.2            0.0             0.2            0.4

                                                       Difference in means

Notes: This ﬁgure illustrates how ignoring heterogeneity can result in severely underpowered experiments. We consider the
simple setting of a cluster RCT with a few “large” clusters and variation in the distribution of outcomes across clusters. We
assume 200 clusters, with 10 clusters containing 100 units each and the remaining 190 clusters containing 25 units each. The
ﬁgure plots three power functions corresponding to different variance formulas: the short-dashed curve depicts the power
function for the variance formula that accounts for clustering assuming equally-sized clusters. The long-dashed curve depicts
the power function using a variance formula that accounts for variation in cluster sizes. The solid curve depicts the power
function using a variance formula that accounts for heterogeneity in both cluster sizes and outcome distributions. Given this
sample size, the MDE at 80% power, ignoring cluster heterogeneity, is 0.29. Accounting for cluster size heterogeneity decreases
the power to detect an effect of 0.29 from 80% to 69%. Accounting for both sources of heterogeneity decreases the power further
to 48%.




                                                             32
                                  Figure 3: Direct and spillover effects on property tax payments in high-saturation blocks


                                                    Direct Effects                                                                              Spillover Effects
                                               Treated vs. Pure Control                                                                    Untreated vs. Pure Control

                            7   Intervention                                                                                       7   Intervention
                                   begins       Due date of                                                                               begins      Due date of
                                                 Oct'20 bill                                                                       6                   Oct'20 bill
                            6
                                                                                                                                   5
  Treatment effect (p.p.)




                                                                                                         Treatment effect (p.p.)
                            5
                                                                                                                                   4
                            4
                                                                                                                                   3
                            3                                                                                                      2
                            2                                                                                                      1

                            1                                                                                                      0

                            0
                     25sep2020           02oct2020      09oct2020    16oct2020   23oct2020   30oct2020                      25sep2020           02oct2020     09oct2020    16oct2020   23oct2020   30oct2020
                                                              Calendar date                                                                                         Calendar date


                                                                                 Above Median Compliance

                                                                                                                                   7   Intervention
                            7   Intervention                                                                                              begins      Due date of
                                   begins       Due date of                                                                                            Oct'20 bill
                                                 Oct'20 bill                                                                       6
                            6
  Treatment effect (p.p.)




                                                                                                         Treatment effect (p.p.)



                                                                                                                                   5
                            5
                                                                                                                                   4
                            4
                                                                                                                                   3
                            3
                                                                                                                                   2
                            2
                                                                                                                                   1
                            1
                                                                                                                                   0
                            0
                     25sep2020           02oct2020      09oct2020    16oct2020   23oct2020   30oct2020                      25sep2020           02oct2020     09oct2020    16oct2020   23oct2020   30oct2020
                                                              Calendar date                                                                                         Calendar date



                                                                                 Below Median Compliance

                            7   Intervention                                                                                       7   Intervention
                                   begins       Due date of                                                                               begins      Due date of
                                                 Oct'20 bill                                                                       6                   Oct'20 bill
                            6
                                                                                                                                   5
  Treatment effect (p.p.)




                                                                                                         Treatment effect (p.p.)




                            5                                                                                                      4
                                                                                                                                   3
                            4
                                                                                                                                   2
                            3                                                                                                      1
                                                                                                                                   0
                            2

                            1

                            0
                     25sep2020           02oct2020      09oct2020    16oct2020   23oct2020   30oct2020                      25sep2020           02oct2020     09oct2020    16oct2020   23oct2020   30oct2020
                                                              Calendar date                                                                                         Calendar date



Notes: These ﬁgures show the coefﬁcients and 95% conﬁdence intervals from a saturated regression that computes, at each
calendar day, the payment rate difference between treated and untreated groups relative to the pure control group (i.e., blocks
where no accounts were treated). We focus the attention on blocks where 80% of the units were treated. The three left ﬁgures
exhibit the direct effects on treated accounts, and the three right ﬁgures present the spillover effects on untreated accounts. The
top panel includes all the observations in high-saturation blocks, the middle panel focuses on blocks with baseline compliance
above the median, and the bottom panel focuses on blocks with baseline compliance below the median. We deﬁne compliance
as the share of bills paid by block in 2019. The median compliance is 0.56 (see Figure B.15). Standard errors are clustered
by block. The ﬁrst vertical bar shows the due date for the September 2020 bill. This corresponds to a bill issued and due for
payment before our intervention began, thus serving as a placebo. The second vertical bar indicates the start of the intervention.
The letters were delivered between September 28th and October 7th.
                                                                                                    33
                  Table 1: MDEs with constrained and optimal choice of cluster probabilities

                                 Scenario 1                         Scenario 2                         Scenario 3
                             Het Homog Equal                    Het Homog Equal                    Het Homog Equal
 Restricted qt
  q0                        0.408      0.408      0.408        0.388      0.388      0.388        0.272      0.272      0.272
  q1                        0.199      0.209      0.230        0.219      0.231      0.239        0.273      0.283      0.286
  q2                        0.194      0.173      0.131        0.173      0.149      0.135        0.182      0.162      0.157
  q3                        0.199      0.209      0.230        0.219      0.231      0.239        0.273      0.283      0.286
 MDEs
  M DE1                     0.145      0.047      0.022        0.042      0.027      0.022        0.033      0.025      0.023
  M DE2                     0.146      0.051      0.029        0.047      0.033      0.029        0.039      0.032      0.030
  M DE3                     0.146      0.051      0.030        0.047      0.034      0.030        0.040      0.033      0.032
 Optimal qt
  q0                        0.364      0.352      0.314        0.348      0.328      0.313        0.332      0.315      0.309
  q1                        0.212      0.219      0.238        0.221      0.231      0.238        0.229      0.237      0.240
  q2                        0.211      0.211      0.210        0.210      0.210      0.210        0.210      0.210      0.210
  q3                        0.212      0.219      0.238        0.221      0.231      0.238        0.229      0.237      0.240
 MDEs
  M DE1                     0.145      0.047      0.023        0.043      0.028      0.023        0.033      0.026      0.024
  M DE2                     0.145      0.049      0.025        0.045      0.030      0.025        0.036      0.028      0.027
  M DE3                     0.146      0.051      0.030        0.047      0.034      0.030        0.040      0.033      0.032
 Scenario Statistics
  n                                   84,175                             81,961                             68,808
  G                                   4,139                              4,138                               3,982
  min ng                                8                                  8                                    8
  max ng                              2,754                               376                                  50
  mean(ng )                            20.5                               19.8                               17.3
  sd(ng )                              45.9                               17.5                                8.3

Notes: This table shows the cluster assignment probabilities and MDEs for the binary outcomes of interest. The parameters
of interest are the difference in means between untreated units in each treated group and the pure control units, βn (0, t), with
t = 1, 2, 3 indicating the groups with 20%, 50%, and 80% treated units, respectively. We refer to the corresponding MDEs for
the parameters βn (0, 1), βn (0, 2) and βn (0, 3) as M DE1 , M DE2 and M DE3 , respectively. Scenario 1 exhibits “substantial”
heterogeneity, scenario 2 has “moderate” heterogeneity, and scenario 3 presents “limited” heterogeneity. n denotes the sample
size; G denotes the number of clusters; min ng and max ng show the smallest and largest cluster; mean(ng ) is the average
cluster size; sd(ng ) is the standard deviation of cluster sizes. In each scenario, we consider the results obtained with our
general formula from Theorem 1 (“Het”), the formulas that rule out between-cluster moment heterogeneity from Corollary 1
(“Homog”) and the formulas that assume homogeneous, equally-sized clusters (“Equal”). Panels 1 and 2 show the constrained
and optimal cluster assignment probabilities and their corresponding MDEs. Panels 3 and 4 show the optimal cluster assignment
probabilities and their corresponding MDEs.




                                                              34
                         Table 2: Direct and spillover effects on property tax payments


                                               Early Payments                          On-time & Late Payments
                                                     Below          Above                          Below          Above
                                        All         Median          Median           All          Median          Median
                                        (1)            (2)            (3)             (4)            (5)            (6)

 A. Blocks with 80% treated
  Treated                            0.96***         0.86**         1.06**         4.55***        4.12***        5.09***
                                      (0.28)         (0.34)          (0.42)         (0.74)         (0.79)          (0.81)
  Untreated                           1.10**          0.55          1.58**           0.79           -1.25         2.56**
                                      (0.43)         (0.50)          (0.67)         (1.01)         (1.16)          (1.27)
 B. Blocks with 50% treated
  Treated                            1.07***         1.24**           1.02         4.87***        4.81***        5.67***
                                      (0.41)         (0.50)          (0.62)         (0.93)         (1.07)          (1.08)
  Untreated                           -0.02           0.10           -0.03          -0.10           1.34           -0.76
                                      (0.34)         (0.43)          (0.50)         (0.91)         (1.00)          (1.14)
 C. Blocks with 20% treated
  Treated                             0.69*           0.85*           0.52         4.97***        5.41***        4.40***
                                      (0.42)         (0.52)          (0.63)         (0.99)         (1.21)          (1.27)
  Untreated                            0.11          0.68**          -0.42          -0.18           0.61           -1.09
                                      (0.26)         (0.33)          (0.38)         (0.72)         (0.77)          (0.82)

 Payment rate of pure control          5.15           3.63            6.49          34.37          23.53           43.91
 Observations                         68,806         32,361         36,445         68,806          32,361         36,445
 Number of clusters (blocks)          3,981           2,013          1,968          3,981          2,013           1,968

Notes: This table shows the results from saturated OLS regressions. The dependent variable in columns (1)-(3) is an indicator
for paying the October 2020 bill by October 3rd (early payments); while in columns (4)-(6) we use an indicator for paying the
October 2020 bill by October 31st (includes early, on time, and overdue payments). Columns (2) and (3) break the main result
from column (1) into blocks below and above median compliance in 2019. We deﬁne compliance as the share of bills paid by
block in 2019 with median value of 0.56 (see Figure B.15). Each column corresponds to a separate regression. The omitted
category corresponds to blocks where no accounts were treated (pure control). Panel A shows the results for blocks where 80%
were treated, panel B for blocks with 50% treated, and panel C for blocks with 20% treated. The letters were delivered between
September 28th and October 7th. The due date for the October 2020 bill was October 9th. The row Payment rate of pure control
displays the constant of each regression, corresponding to the average payment rate in blocks with no treated units. Standard
errors clustered by blocks are reported in parentheses. * p<0.10, ** p<0.05, *** p<0.01




                                                              35
           Supplementary Materials for:
     “Design of Partial Population Experiments
with an Application to Spillovers in Tax Compliance”




                        A-1
A      Further Details on Experimental Design

A.1      Additional Material

                                    Figure A.1: Example of the intervention letter




                                                                                                                     ID: XXXXX




                                                  CAP. MADARIAGA N°                        LOCALIDAD: 11 de Septiembre
                                         1657

                                                XXXXXX/7




                                                             XXXXX/7

                                                Cuota 10 vencimiento 10 de octubre 2020:                   347,29
                                                Deuda año en curso*:                                      1.702,58
                                                Deuda años anteriores*:                                    289,54
                                                * Al 15/09/2020




Notes: This ﬁgure shows an anonymized example of the letters sent during the intervention between September 28th and
October 7th, 2020. The headline reads: “Your municipal taxes are now available on the electronic bill.” The information below
the headline contains the name of the account holder, the address, and the account number. The main text of the letter reads:
“We would like to tell you that now in Tres de Febrero your municipal General Service Fee (TSG) bill is 100% digital. In other
words, paper is no longer used. You can access it and pay for it from your cell phone or computer. In this way, we take care
of each other by reducing circulation and we also take care of the environment. It is a difﬁcult situation and we appreciate the
effort you are making to keep up with your taxes, because that translates directly into constructions and services that do not
stop in your neighborhood. We inform you of the status of your account and show you how easy it is:” The table below this
text shows the account number, the amount due in the October 2020 billing period, the amount of past due debt from previous
months of 2020, and the amount of past due date from earlier years. The large box below the table explains: (1) how to sign up
for electronic billing, and (2) how to pay the bill and the different means of payment (online or in person). Finally, below the
box, the text reads: “For questions, contact us at reclamos.mistasas@tresdefebrero.gov.ar. If this letter arrived by mistake at
your address, inform us in that same email. Many thanks!”




                                                                             A-2
                       Figure A.2: Map of the municipality with the experimental design




Notes: This ﬁgure shows a map of the municipality where the 2-level randomized communication campaign took place. We
highlight the group-level assignment of blocks (cuadras) with different colors: pure control blocks with 0% treated (light
green), blocks with 20% treated accounts (green), blocks with 50% treated (blue), and blocks with 80% treated (dark blue). We
use gray for blocks that were not part of the experiment (e.g., industrial or commercial blocks).




                                                            A-3
                                        Figure A.3: A Partial Population Design

                                                       Sample of clusters

                                               q0        q1                     q2       q3


                            Pure control         20% treated              50% treated             80% treated


                                                0.8           0.2         0.5          0.5       0.2         0.8

                                   C             C            T           C             T         C          T
Notes: In a partial population design, clusters are ﬁrst randomly assigned to different treatment intensities or saturations. Within
each cluster, units are randomly assigned to treatment with a probability equal to their cluster saturation. The ﬁgure above shows
an example of a partial population design with four saturations, including pure control clusters with no treated units.




                        Figure A.4: Timeline of the randomized communication campaign

                                 25,000 letters delivered


                  September 28                               October 7                         October 9           Timeline
                    First day                                 Last day                        October 2020           2020
                  of campaign                               of campaign                        bill is due



A.2      Sample Sizes and Descriptive Statistics

                                                     Table A1: Sample sizes

                                                            Blocks         Control Obs           Treated Obs
                           Tg   =0     Pure control         1, 102                   19, 103                 0
                           Tg   =1     20% treated          1, 100                   15, 060            3, 853
                           Tg   =2     50% treated             680                    5, 905            5, 897
                           Tg   =3     80% treated          1, 100                    3, 677           15, 311
                           Total                            3, 982                   43, 745           25, 061


Notes: This table shows the ﬁnal sample sizes used in our experiment. We limit the analysis to clusters of size ranging between
8 and 50 property tax accounts per street-block.




                                                                    A-4
                              Table A2: Descriptive statistics in 2019 (baseline year)


                                           Blocks              Obs              Mean                SD               ICC

 Paid the twelve bills in 2019              3, 981           68, 808            0.449             0.497             0.062
 Paid at least one bill in 2019             3, 981           68, 808            0.650             0.477             0.071
 Paid six bills or more in 2019             3, 981           68, 808            0.572             0.495             0.073

Notes: This table shows descriptive statistics about the frequency of payments in 2019. This is the baseline year we used for the
randomization, power calculations, and simulations. The data set is restricted to blocks with size between 8 and 50 accounts.
Figure 1 shows the distribution of accounts per block. Our sample size consists of 68,808 accounts distributed in 3,982 blocks.
The frequency of payments is very polarized. About 45 percent of the accounts paid the twelve bills and about 35 percent
did not pay any bill. We call these two core groups always payers and never payers, respectively. The perfect compliance
rate of 45 percent is presumably low and, therefore, leaves room for potential behavioral responses from non-compliant and
partially-compliant neighbors.




                                                              A-5
Figure A.5: Distribution of payment date for treated, untreated, and pure control (October 2020 billing
period)
                                                           (a) Treated vs. Pure Control

                                    .15
                                                       Due date of
                                                                                                         Treated
                                                       October bill                                      Pure Control




                                     .1
                         Fraction




                                    .05




                                     0
                                          30sep2020   14oct2020   28oct2020   11nov2020 25nov2020 09dec2020 23dec2020



                                                         (b) Untreated vs. Pure Control

                                    .15
                                                       Due date of
                                                                                                         Untreated
                                                       October bill                                      Pure Control




                                     .1
                         Fraction




                                    .05




                                     0
                                          30sep2020   14oct2020   28oct2020   11nov2020 25nov2020 09dec2020 23dec2020


Notes: These ﬁgures show the fraction of individuals paying the October 2020 bill before and after the due date (October 9th,
2020). Panel (a) shows the distribution of payments for treated units (in blue) relative to pure control units (in red). We pool
together treated units from Tg = 1, 2, 3. Panel (b) shows the distribution of payments for untreated units (in blue) relative to
pure control units (in red). We pool together untreated units from Tg = 1, 2, 3. The area of each histogram integrates to one. A
larger bar on a particular date means that the payment frequency of the corresponding group is higher than the other group.




                                                                              A-6
A.3    Balance Checks

We ran balance test checks to verify the comparability of the treated, untreated, and pure control groups
in terms of demographic and account-related characteristics in 2019. We jointly estimate the parameters
of interest through the following saturated OLS regression:

                                  3                                3
                     Xig = α +         θt 1(Tg = t)(1 − Dig ) +         τt 1(Tg = t)Dig + εig             (12)
                                 t=1                              t=1


where Xig is one of the account holder or dwelling characteristics contained in our baseline data. We allow
εig to be correlated within blocks and use a cluster-robust variance estimator. In this regression, θt captures
the average difference of Xig of untreated units in groups with Tg = t relative to the pure control group,
and τt captures the average difference of Xig of treated units in groups with Tg = t relative to the pure
control group. The results are reported in Table A3 and reassuringly conﬁrm that our groups are highly
balanced. The null effect on timely payments (i.e., excluding past-due payments) of the September 2020
bill—the bill prior to our intervention— sheds further light on the balance between groups (see Figure
B.9).




                                                      A-7
                                     Table A3: Balance test saturated regressions


                          Property       Front       House        Tenant       Tenant         Bill       N Bills       Digital
                           Value        Metres        type         Male         Age         amount      paid 2019     payment
                             (1)          (2)          (3)          (4)          (5)          (6)           (7)          (8)

 A. Blocks with 80% treated:
  Treated                   0.01        −8.27        −0.00        −0.00        −0.14         2.81         0.05         −0.00
                           (0.02)       (17.77)      (0.00)       (0.01)       (0.40)       (7.81)       (0.09)         (0.01)
  Untreated                 0.00        −1.76         0.00          0.00       −0.53         6.27        −0.06         −0.00
                           (0.02)       (20.70)      (0.01)       (0.01)       (0.53)       (12.95)      (0.12)         (0.01)
 B. Blocks with 50% treated:
  Treated                   0.01         12.65       −0.00        −0.00        −0.47         1.16         0.03           0.00
                           (0.02)       (20.38)      (0.01)       (0.01)       (0.50)       (9.21)       (0.11)         (0.01)
  Untreated                 0.01         25.30       −0.00        −0.00        −0.42         1.88         0.02           0.01
                           (0.02)       (20.66)      (0.01)       (0.01)       (0.48)       (9.66)       (0.11)         (0.01)
 C. Blocks with 20% treated:
  Treated                   0.02        32.57*       −0.01          0.01        0.10         5.94         0.07         −0.01
                           (0.02)       (16.79)      (0.01)       (0.01)       (0.54)       (9.55)       (0.12)         (0.01)
  Untreated                 0.02         19.14       −0.01        −0.01         0.12         1.32         0.00           0.00
                           (0.02)       (14.05)      (0.00)       (0.01)       (0.40)       (7.77)       (0.09)         (0.01)

 Mean Pure Control         13.64        841.50        0.91          0.62        19.15       368.66        6.71           0.35
 Observations              64,932       68,808       68,808       46,419       52,714       68,808       68,808        38,112
 Number of clusters        3,979         3,981        3,981        3,973        3,976        3,981        3,981         3,968

 Notes: This table shows balance test regressions to formally test for differences in observable characteristics between the
treatment and control groups. Each column corresponds to a separate regression (equation (12) in the text). The dependent
variables in each column are: (1) the log of assessed property value; (2) the front metres of the property; (3) an indicator for
the property being a house versus a house with a store; (4) whether the tenant is male; (5) a proxy for the tenant’s age (ﬁrst two
digits of the ID); (6) the amount paid in the bill corresponding to December 2019 (including zeroes); (7) the number of bills
paid in 2019 (the maximum is 12); (8) for those who paid, whether they did so digitally. The row Mean Pure Control displays
the constant of each regression, corresponding to the average of the dependent variable for accounts in blocks with no treated
units (Tg = 0). Missing/non-missing indicators for the dependent variables with missing observations (columns 1, 4, 5 and
8) are also balanced between groups (results not reported). Standard errors clustered by blocks are reported in parentheses. *
p<0.10, ** p<0.05, *** p<0.01




A.4      Further Details on Within-Group Assignment Mechanisms

Fixed Margins. The within-group treatment is often assigned by choosing a ﬁxed (i.e. nonrandom)
number of treated units within each group. Given Tg = t, suppose the researcher wants to assign a


                                                              A-8
proportion pt of, or a total of ng pt , units to treatment. Assigning exactly ng pt units to treatment is not
possible when ng pt is not an integer. We propose the following procedure to deal with this issue. Deﬁne
an independent binary random variable ξg and let the number of treated units in cluster g be:

                                       1
                                      Ng = ng pt + ξg 1(ng pt ∈
                                                              / N).

so that ξg plays the role of an adjusting factor that randomly rounds the number of treated up or down.
Given Tg = t, set the probability that ξg = 1 to:
                                                
                                                0                  if ng pt ∈ N
                           Pg [ξg = 1|Tg = t] =
                                                n p − n p          if ng pt ∈
                                                                             / N.
                                                   g t  g t


This implies that, given Tg = t, the expected number of treated units in group g is ng pt and that Pg [Dig =
1|Tg = t] = pt . Then, given Tg = t, the expected number of treated units in group g is ng pt and that
Pg [Dig = 1|Tg = t] = pt . More precisely,

                            1
                         E[Ng |Tg = t] = ng pt + E[ξg |Tg = t]1(ng pt ∈
                                                                      / N)
                                        = ng pt + (ng pt − ng pt )1(ng pt ∈
                                                                          / N)
                                        = ng pt

using that ng pt = ng pt when ng pt ∈ N. It follows that:
                                  ñ
                                  1
                                          ô
                                 Ng
                               E    Tg = t = P[Dig = 1|Tg = t] = pt
                                 ng

                                                                                    0         1
which doesn’t vary across groups conditional on Tg = t. On the other hand, deﬁning Ng = ng − Ng , we
have that:                    ñ 0         ô
                               Ng
                            E      Tg = t = P[Dig = 0|Tg = t] = 1 − pt .
                               ng
Next, for this assignment mechanism,
                                                       ñ
                                                       1
                                                             Ç    1
                                                                          å         ô
                                                      Ng         Ng −1
                     P[Dig = 1, Djg   = 1|Tg = t] = E                         Tg = t
                                                      ng         ng − 1
                                                        1 2               1
                                                    E[(Ng ) |Tg = t] − E[Ng |Tg = t]
                                                  =
                                                              ng (ng − 1)




                                                    A-9
where

            ) |Tg = t] = E[( ng pt + ξg 1(ng pt ∈
          1 2
      E[(Ng                                     / N))2 |Tg = t]
                         = n2g pt 1(ng pt ∈ N)
                                2


                                                                                      Pg [ξg = 0|Tg = t] 1(ng pt ∈
                           Ä                                                      2
                                                                                                        ä
                         + ( ng pt + 1)2 Pg [ξg = 1|Tg = t] + ng pt                                              / N)
                         = n2g pt 1(ng pt ∈ N)
                                2


                                                                                  (1 − ng pt − ng pt ) 1(ng pt ∈
                           Ä                                                  2
                                                                                                      ä
                         + ( ng pt + 1)2 (ng pt − ng pt ) + ng pt                                              / N).

Similarly,
                                                                      0 2               0
                                                                  E[(Ng ) |Tg = t] − E[Ng |Tg = t]
                        P[Dig = 0, Djg = 0|Tg = t] =
                                                                            ng (ng − 1)

where

                    0 2                     1 2
                E[(Ng ) |Tg = t] = E[(ng − Ng ) |Tg = t] = n2       1 2              2
                                                            g + E[(Ng ) |Tg = t] − 2ng pt


Notice that even if P[Dig = d|Tg = t] does not change across g , the joint probabilities do. Nevertheless,
these terms can be calculated for any sample using the chosen probabilities pt and the cluster sizes {ng }G
                                                                                                          g =1 .



Bernoulli Trials. Alternatively, the within-cluster treatment may be assigned to each unit independently
as a “coin ﬂip” with probability pt . Under this mechanism, independence between treatment indicators
implies that:

                                          P[Dig = 1|Tg = t] = P[Dig = 1|Tg = t] = pt
                             P[Dig = d, Djg = d|Tg = t] = P[Dig = d|Tg = t]2 .

which do not vary over g . It follows that:
                                                                                             Ç          å
                   g   ng (ng − 1)P[Dig = d, Djg = d|Tg = t]               1−d                   g n2
                                                                                                    g
                                                             = pd
                                                                t (1 − pt )                           −1
                               g ng P [ Dig = d |Tg = t]                                          n

Then the variances are approximated by:
                                 ®                        Ç         å´            ®         Ç                  å´
                      2
          ˆ0t ] ≈ σ (0t)                                      gn2
                                                                g        σ 2 (00)                        gn2
                                                                                                           g
       V[ β                          1 + ρ0t (1 − pt )            −1   +            1 + ρ00                  −1
                 nqt (1 − pt )                                n            nq0                           n

and                              ®                Ç         å´            ®         Ç                      å´
                        2
               ˆ1t ] ≈ σ (1t)                         gn2
                                                        g        σ 2 (00)                            gn2
                                                                                                       g
             V[β                     1 + ρ1t pt           −1   +            1 + ρ00                      −1   .
                       nqt pt                         n            nq0                               n



                                                              B-10
B     Additional Empirical Results

B.1   Further Details and Figures for Main Results




                                      B-11
                                                  Figure B.6: Payment rates: Treated groups vs Pure control blocks
                                                                             (a) Payment rates in levels

                                                                                Due date of
                                          40       Intervention                October 2020                    Treated groups
                                                      begins                   billing period
            Cumulative payment rate (%)




                                          30                                                                        Pure Control



                                          20



                                          10



                                              0
                                           25sep2020             02oct2020      09oct2020        16oct2020          23oct2020      30oct2020
                                                                                       Calendar date

                                                                    (b) Difference relative to pure control group


                                                  Intervention
                                                                               Due date of                              T50
                                          5          begins
                                                                              October 2020
                                                                              billing period                            T80
                                                                                                                        T20
                                          4
            Treatment effect (p.p.)




                                          3


                                          2


                                          1


                                          0
                                          25sep2020          02oct2020         09oct2020        16oct2020       23oct2020          30oct2020
                                                                                      Calendar date
Notes: These ﬁgures show the effect of the intervention on payments of the October 2020 bill for treated groups. Panel (a)
shows the cumulative share of individuals paying the October 2020 bill over time. The brown dashed line shows the payment
rate for pure control units. The blue dashed line corresponds to treated units in group Tg = 1 (blocks with 20% treated). The
black dashed line corresponds to treated units in group Tg = 2 (blocks with 50% treated). The red solid line corresponds to
treated units in group Tg = 3 (blocks with 80% treated). Panel (b) shows, for each calendar date, the difference between each
treated group and the pure control group (treatment effect coefﬁcients). The letters were delivered between September 28th
and October 7th. The ﬁrst vertical bar denotes the start of the intervention. The due date was October 9th and is indicated with
another vertical bar.




                                                                                        B-12
                                                  Figure B.7: Payment rates: Untreated groups vs Pure control blocks
                                                                             (a) Payment rates in levels

                                                                                Due date of
                                                   Intervention
                                          40          begins
                                                                               October 2020
                                                                               billing period
                                                                                                              Untreated groups
            Cumulative payment rate (%)




                                          30                                                                        Pure Control


                                          20



                                          10



                                              0
                                           25sep2020             02oct2020      09oct2020        16oct2020          23oct2020      30oct2020
                                                                                       Calendar date

                                                                    (b) Difference relative to pure control group

                                                                               Due date of
                                                  Intervention
                                          5          begins
                                                                              October 2020
                                                                              billing period
                                                                                                                      Treated
                                                                                                                      (pooled)
                                          4
            Treatment effect (p.p.)




                                          3


                                          2


                                          1                                                                  T80

                                                                                                             T50
                                          0
                                                                                                             T20
                                          25sep2020          02oct2020         09oct2020        16oct2020          23oct2020       30oct2020
                                                                                      Calendar date
Notes: These ﬁgures show the effect of the intervention on payments of the October 2020 bill for untreated groups. Panel (a)
shows the cumulative share of individuals paying the October 2020 bill over time. The brown dashed line shows the payment
rate for pure control units. The blue dashed line corresponds to untreated units in group Tg = 1 (blocks with 20% treated). The
black dashed line corresponds to untreated units in group Tg = 2 (blocks with 50% treated). The red solid line corresponds
to untreated units in group Tg = 3 (blocks with 80% treated). Panel (b) shows, for each calendar date, the difference between
each untreated group and the pure control group (treatment effect coefﬁcients). For comparison, the gray solid line shows the
treatment effects for treated units (pooled from Tg = 1, 2, 3). The letters were delivered between September 28th and October
7th. The ﬁrst vertical bar denotes the start of the intervention. The due date was October 9th and is indicated with another
vertical bar.




                                                                                        B-13
                                  Figure B.8: Direct effects on treated accounts and spillover effects on untreated accounts


                                                    Direct Effects                                                                              Spillover Effects
                                               Treated vs. pure controls                                                                   Untreated vs. pure controls

                            7   Intervention                                                                                       7   Intervention
                                   begins       Due date of                                                                               begins      Due date of
                                                 Oct'20 bill                                                                       6                   Oct'20 bill
                            6
                                                                                                                                   5
  Treatment effect (p.p.)




                                                                                                         Treatment effect (p.p.)
                            5
                                                                                                                                   4
                            4
                                                                                                                                   3
                            3                                                                                                      2
                            2                                                                                                      1

                            1                                                                                                      0

                            0
                     25sep2020           02oct2020      09oct2020    16oct2020   23oct2020   30oct2020                      25sep2020           02oct2020     09oct2020    16oct2020   23oct2020   30oct2020
                                                              Calendar date                                                                                         Calendar date


                                                   Blocks with 50% treated                                                                               Blocks with 50% treated
                            7   Intervention                                                                                       7   Intervention
                                   begins       Due date of                                                                               begins      Due date of
                                                 Oct'20 bill                                                                       6                   Oct'20 bill
                            6
                                                                                                                                   5
  Treatment effect (p.p.)




                                                                                                         Treatment effect (p.p.)

                            5
                                                                                                                                   4
                            4                                                                                                      3
                            3                                                                                                      2

                            2                                                                                                      1
                                                                                                                                   0
                            1

                            0
                     25sep2020           02oct2020      09oct2020    16oct2020   23oct2020   30oct2020                      25sep2020           02oct2020     09oct2020    16oct2020   23oct2020   30oct2020
                                                              Calendar date                                                                                         Calendar date


                                                   Blocks with 20% treated                                                                               Blocks with 20% treated
                            7   Intervention                                                                                       7   Intervention
                                   begins       Due date of                                                                               begins      Due date of
                                                 Oct'20 bill                                                                       6                   Oct'20 bill
                            6
                                                                                                                                   5
  Treatment effect (p.p.)




                                                                                                         Treatment effect (p.p.)




                            5
                                                                                                                                   4
                            4
                                                                                                                                   3
                            3
                                                                                                                                   2
                            2                                                                                                      1
                            1                                                                                                      0

                            0

                     25sep2020           02oct2020      09oct2020    16oct2020   23oct2020   30oct2020                      25sep2020           02oct2020     09oct2020    16oct2020   23oct2020   30oct2020
                                                              Calendar date                                                                                         Calendar date


Notes: These ﬁgures show the coefﬁcients and 95% conﬁdence intervals from a saturated regression that computes, at each
calendar day, the payment rate difference between each treated and untreated group relative to the pure control group (i.e.,
blocks where no accounts were treated). The top panel shows the effect on treated (left) and untreated (right) units in blocks
with 80% treated (Tg = 3). The middle panel shows the effect on treated (left) and untreated (right) units in blocks with 50%
treated (Tg = 2). The bottom panel shows the effect on treated (left) and untreated (right) units in blocks with 20% treated
(Tg = 3). These point estimates coincide with those reported in panel (b) of Figures B.6 and B.7. Standard errors are clustered
by block. The ﬁrst vertical bar denotes the start of the intervention. The due date for the October 2020 bill was October 9th and
is indicated with another vertical bar. The letters were delivered between September 28th and October 7th.




                                                                                                  B-14
                                 Figure B.9: Placebo. Direct and spillover effects for the pre-intervention Sep’20 bill


                                                Treated groups                                                                   Untreated groups
                                               vs. pure controls                                                                 vs. pure controls
                                            Blocks with 80% treated                                                                    Blocks with 80% treated
                                                                               Intervention                                                                               Intervention
                            7      Due date of                                    begins                                7     Due date of                                    begins
                                  Sept 2020 bill                                                                        6    Sept 2020 bill
                            6
                            5                                                                                           5
  Treatment effect (p.p.)




                                                                                              Treatment effect (p.p.)
                                                                                                                        4
                            4
                                                                                                                        3
                            3
                                                                                                                        2
                            2
                                                                                                                        1
                            1                                                                                           0
                            0



                     01sep2020        08sep2020     15sep2020      22sep2020     29sep2020                       01sep2020       08sep2020     15sep2020      22sep2020     29sep2020
                                                   Calendar date                                                                              Calendar date


                                            Blocks with 50% treated                                                                    Blocks with 50% treated
                                                                               Intervention                                                                               Intervention
                            7                                                     begins                                7     Due date of                                    begins
                                   Due date of
                                  Sept 2020 bill                                                                        6    Sept 2020 bill
                            6
                                                                                                                        5
  Treatment effect (p.p.)




                                                                                              Treatment effect (p.p.)

                            5

                            4                                                                                           4

                            3                                                                                           3

                            2                                                                                           2
                                                                                                                        1
                            1
                                                                                                                        0
                            0



                     01sep2020        08sep2020     15sep2020      22sep2020     29sep2020                       01sep2020       08sep2020     15sep2020      22sep2020     29sep2020
                                                   Calendar date                                                                              Calendar date


                                            Blocks with 20% treated                                                                    Blocks with 20% treated
                                                                               Intervention                                                                               Intervention
                            7      Due date of                                    begins                                7     Due date of                                    begins
                                  Sept 2020 bill                                                                        6    Sept 2020 bill
                            6
                                                                                                                        5
  Treatment effect (p.p.)




                                                                                              Treatment effect (p.p.)




                            5

                            4                                                                                           4

                            3                                                                                           3

                            2                                                                                           2

                            1                                                                                           1

                            0                                                                                           0



                     01sep2020        08sep2020     15sep2020      22sep2020     29sep2020                       01sep2020       08sep2020     15sep2020      22sep2020     29sep2020
                                                   Calendar date                                                                              Calendar date


Notes: These ﬁgures show the coefﬁcients and 95% conﬁdence intervals from a saturated regression that computes, at each
calendar day, the payment rate difference between each treated and untreated group relative to the pure control group (i.e.,
blocks where no accounts were treated). The top panel shows the effect on treated (left) and untreated (right) units in blocks
with 80% treated (Tg = 3). The middle panel shows the effect on treated (left) and untreated (right) units in blocks with 50%
treated (Tg = 2). The bottom panel shows the effect on treated (left) and untreated (right) units in blocks with 20% treated
(Tg = 3). Standard errors are clustered by block. The ﬁrst vertical bar shows the due date for the September 2020 bill. This
corresponds to a bill issued and due for payment before our intervention began, thus serving as a placebo. The second vertical
bar indicates the start of the intervention. The letters were delivered between September 28th and October 7th.




                                                                                       B-15
        Table A4: Placebo. Direct and spillover effects for the pre-intervention September 2020 bill


 Dependent variable:                                          Placebo bill (September 2020)
 Pr(pay the bill)                              All                      Below Median                  Above Median
                                               (1)                            (2)                            (3)

 A. Blocks with 80% treated
  Treated                                      0.12                          0.10                           0.28
                                              (0.69)                        (0.73)                         (0.81)
  Untreated                                   -0.30                          -1.55                          0.78
                                              (0.95)                        (1.09)                         (1.24)
 B. Blocks with 50% treated
  Treated                                      0.76                          1.54                           0.69
                                              (0.88)                        (0.99)                         (1.12)
  Untreated                                    0.26                          0.81                           0.36
                                              (0.88)                        (0.94)                         (1.15)
 C. Blocks with 20% treated
  Treated                                      0.85                          1.32                           0.27
                                              (0.93)                        (1.11)                         (1.24)
  Untreated                                    0.07                          0.27                          -0.32
                                              (0.68)                        (0.72)                         (0.80)

 Payment rate of pure control                 29.70                          20.05                         38.19
 Observations                                68,806                         32,361                        36,445
 Number of clusters (blocks)                  3,981                          2,013                         1,968

Notes: Notes: This table shows the results from saturated OLS regressions using as dependent variable an indicator for paying
the September 2020 bill by September 15th (pre-intervention). Each column corresponds to a separate regression. The omitted
category corresponds to blocks where no accounts were treated (pure control). Panel A shows the results for blocks where
80% were treated, panel B for blocks with 50% treated, and panel C for blocks with 20% treated. Columns (2) and (3) split
the sample from column (1) into blocks below and above median compliance in 2019, respectively. We deﬁne compliance as
the share of bills paid by block in 2019 with median value of 0.56 (see Figure B.15). The estimates reported in column (1)
correspond exactly to the numbers shown in Figure (B.9). The row Payment rate of pure control displays the constant of each
regression, corresponding to the average payment rate in blocks with no treated units. Standard errors clustered by blocks are
reported in parentheses. * p<0.10, ** p<0.05, *** p<0.01




                                                            B-16
B.2        Other Margins

Subscriptions to electronic billing. We ﬁnd evidence that our tax communication campaign also in-
creases the subscriptions to receive an electronic bill by e-mail.1 These effects are greater in high-
saturation blocks, albeit small in absolute value. Appendix Section B.2.1 presents graphical evidence
of direct and spillover effects (Figure B.10), which are then summarized in Table A5, although spillover
effects in this outcome are much more tenuous.
Backward and forward payments. We also ﬁnd that the effects of our letters are not solely concentrated
on the October 2020 billing period (the bill targeted by our intervention). Section B.2.2 presents graphical
evidence that the letters also increased the payment rates in subsequent billing periods. Perhaps more
strikingly, we also show that some neighbors made backward payments to cancel past-due debt from
previous billing periods. This is especially prominent after April 2020 when the COVID-19 lockdown
measures were established in Argentina (See Figure B.11).


B.2.1       Effects on Subscriptions to Electronic Billing

The communication campaign also included information about how to sign up for electronic billing, a
system introduced in June 2020. We brieﬂy analyze the effect of our mailing on subscription to this
service.
   We rely on a database that contains the individuals who signed up for the electronic billing option.
This database goes through December 2020 and contains the account number, date of subscription, and
email address. This source is linked with the main data through the unique account identiﬁer.
     We analyze the intervention’s effect on subscriptions to electronic billing and present convincing
graphical evidence that the tax communication campaign increased subscriptions to receive an electronic
bill by e-mail. These effects are greater in high-saturation blocks, albeit small in absolute value.
    The results are summarized in Figure B.10, which follows a similar structure as Figure B.8 but for
e-bill subscriptions. We run dynamic difference-in-differences comparing subscription rates between each
treated and each untreated group relative to pure control blocks, day by day (ﬁxing September 27, 2020,
as the baseline date).
    Four important points are worth highlighting: (1) trends are generally parallel, as we estimate no
signiﬁcant differences between the treatment and control groups prior to the intervention; (2) the difference
in subscription rates between treated accounts and pure control blocks experiences a noticeable break at
the time we started sending letters, which is reassuring and implies that the effects we estimate are indeed
caused by our experiment; (3) direct effects are greater in high-saturation blocks with 50% and 80% treated
units relative to low-saturation blocks where only 20% received the letter. As happened with payment

1
    Note that nudging individuals to sign up for e-billing was an explicit content of the letter (see Figure A.1).


                                                                 B-17
rates, this could be interpreted as a spillover effect, whereby the intervention creates interference between
treated units strengthening the effect of the letter; and (4) although less clear than the left-hand-side panels
for treated units, the right-hand-side panels of Figure B.10 also suggest the presence of spillover effects in
subscriptions to e-billing for untreated accounts in high-saturation blocks. As was the case with payment
rates, these effects are harder to detect. They are precisely estimated but only signiﬁcant at the 5% level at
the beginning of the intervention.
    Lastly, Table A5 summarizes the corresponding diff-in-diffs estimates reported in Figures B.10, with
the same structure as Table 2.2 To benchmark our estimates, in the last row we report the share of e-bill
subscribers in pure control blocks on September 27 (our baseline date). For treated accounts, the table
shows an immediate effect in the three saturation groups that increases over time. This effect is higher
in blocks with 80% treated units, consistent with interference that strengthens the effect. In such blocks,
the total effect reaches 0.86 percentage points by the end of October. Although this represents about 20%
of the baseline 4.25% share of e-bill subscribers, we ﬁnd it striking that so few individuals switched to
the digital bill. In the case of untreated accounts, spillover effects on subscription rates are smaller and,
therefore, much harder to detect than in the analysis of payment rates. The clearest effect arises in blocks
with 50% treated accounts with a spillover effect of 0.25 percentage points, signiﬁcant at the 10% level.
The somewhat absence of spillovers in this case can be explained by the fact that the outcome of analysis
(subscription rate) has very low take up, making it harder for interference between neighbors to emerge.
    In sum, we ﬁnd that our tax communication campaign also generates direct effects and spillover effects
among neighbors in subscriptions to electronic billing. These effects are greater in high-saturation blocks,
albeit small in absolute value.




2
    Column (1) validates the experiment by showing a placebo saturated regression that compares subscription rates between each
    group and the pure control group on September 17, before the intervention began. None of the coefﬁcients are statistically
    signiﬁcant or large in magnitude.


                                                              B-18
Figure B.10: Direct effects on treated accounts and spillover effects on untreated accounts (subscriptions
to e-billing). Difference in differences


                                                           Treated groups                                                                         Untreated groups
                                                          vs. pure controls                                                                       vs. pure controls

                                                        Blocks with 80% treated                                                                            Blocks with 80% treated
                                                                                                                                1          Intervention
                             1          Intervention                                                                                          begins            Due date of
                                           begins            Due date of                                                                                         Oct'20 bill
                                                              Oct'20 bill
  Treatment effect (p.p.)




                                                                                                     Treatment effect (p.p.)
                                                                                                                               .5
                            .5




                             0                                                                                                  0




                            -.5                                                                                                -.5
                            16sep2020                  30sep2020             14oct2020   28oct2020                             16sep2020                  30sep2020             14oct2020   28oct2020
                                                                   Calendar date                                                                                      Calendar date


                                                        Blocks with 50% treated                                                                            Blocks with 50% treated
                                                                                                                                1          Intervention
                                                                                                                                              begins            Due date of
                             1          Intervention                                                                                                             Oct'20 bill
                                                             Due date of
                                           begins             Oct'20 bill
  Treatment effect (p.p.)




                                                                                                     Treatment effect (p.p.)




                                                                                                                               .5
                            .5



                                                                                                                                0
                             0



                            -.5                                                                                                -.5
                            16sep2020                  30sep2020             14oct2020   28oct2020                             16sep2020                  30sep2020             14oct2020   28oct2020
                                                                   Calendar date                                                                                      Calendar date


                                                        Blocks with 20% treated                                                                            Blocks with 20% treated
                             1          Intervention                                                                            1          Intervention
                                           begins            Due date of                                                                      begins            Due date of
                                                              Oct'20 bill                                                                                        Oct'20 bill
  Treatment effect (p.p.)




                                                                                                     Treatment effect (p.p.)




                            .5                                                                                                 .5




                             0                                                                                                  0




                            -.5                                                                                                -.5
                            16sep2020                  30sep2020             14oct2020   28oct2020                             16sep2020                  30sep2020             14oct2020   28oct2020
                                                                   Calendar date                                                                                      Calendar date


Notes: These ﬁgures show the coefﬁcients and 95% conﬁdence intervals from dynamic difference-in-differences regressions
where the outcome of interest is a dummy equal to one if the account is subscribed to an electronic bill. All the coefﬁcients
are estimated with respect to September 27th, 2020 (baseline date) and relative to the pure control group (i.e., blocks where no
accounts were treated). The top panel shows the effect on treated (left) and untreated (right) units in blocks with 80% treated
(Tg = 3). The middle panel shows the effect on treated (left) and untreated (right) units in blocks with 50% treated (Tg = 2).
The bottom panel shows the effect on treated (left) and untreated (right) units in blocks with 20% treated (Tg = 1). Standard
errors are clustered by block. The ﬁrst vertical bar denotes the start of the intervention. The due date for the October 2020 bill
was October 9th and is indicated with another vertical bar. The letters were delivered between September 28th and October 7th.




                                                                                                 B-19
                   Table A5: Total effects and spillover effects for subscriptions to e-billing


 Dependent variable:                             Placebo:                                  Intervention:
 Pr(subscribe to e-bill)                         By Sep 20                       Early                      By Oct 31
                                                     (1)                          (2)                           (3)

 A. Blocks with 80% treated
  Treated                                          -0.02                       0.31***                       0.86***
                                                   (0.04)                        (0.06)                       (0.12)
  Untreated                                         0.04                          0.11                         0.08
                                                   (0.03)                        (0.08)                       (0.15)
 B. Blocks with 50% treated
  Treated                                           0.03                        0.18**                       0.81***
                                                   (0.03)                        (0.08)                       (0.18)
  Untreated                                        -0.07                          0.10                         0.25*
                                                   (0.05)                        (0.06)                       (0.13)
 C. Blocks with 20% treated
  Treated                                          -0.04                         0.15*                       0.57***
                                                   (0.05)                        (0.08)                       (0.19)
  Untreated                                        -0.01                          0.05                         -0.09
                                                   (0.03)                        (0.04)                       (0.09)

 Mean of Pure Control at baseline                   4.25                          4.25                         4.25
 Observations                                     137,612                      137,612                       137,612
 Number of clusters (blocks)                       3,981                         3,981                         3,981

 Notes: This table shows the results from a saturated dynamic difference-in-differences regression where the dependent variable
is an indicator for subscribing to electronic billing. The regression computes the outcome difference between each of the treated
and untreated groups relative to the pure control group for each calendar date relative to September 27th, 2020 (baseline date).
The estimates correspond exactly to the numbers shown in Figure (B.10). Column (1) shows the results for e-bill subscriptions
made before the letters were delivered (placebo); Column (2) shows the results for early subscriptions right after the letters
started to be delivered (by October 3); Column (3) shows the results for subscriptions made up to the end of October 2020. The
letters were delivered between September 28 and October 7. The due date for the October 2020 bill was October 9th. The row
Mean of Pure Control displays the constant of the regression, corresponding to the average susbcription rate for units in blocks
with no treated units on September 27, 2020. Standard errors clustered by blocks are reported in parentheses. * p<0.10, **
p<0.05, *** p<0.01




                                                             B-20
B.2.2       Timing of Payments and Due Bills

For completeness, we analyze the effects of the intervention on backward and forward payments corre-
sponding to billing periods before and after month 10, the month of our intervention. These results are
summarized in Figure B.11.
    Intuitively, neighbors can pay their property tax bill at any time before or after the due date, and
hence, payments from previous billing periods can also be affected by our intervention.3 To illustrate
this, the left panels of Figure B.11 only consider timely payments, deﬁned as bills paid before the 27th
of the corresponding month. We set any payment made after the 27th as unpaid in our data. Hence, pre-
intervention bills mechanically exclude any past-due payment triggered by our intervention. In contrast,
the right panels of Figure B.11 consider timely as well as past-due payments made until December 2020
and, thus, capture backward payments triggered by our intervention (e.g., individuals that decide to pay
the October 2020 bill as well as previous unpaid bills after receiving the letter).
    The top ﬁgures show payment rates in levels for treated units (black line) and pure control units (gray
line), for 24 consecutive monthly bills between January 2019 and December 2020. Treated units are
pooled from groups Tg = 1, 2, 3. The bottom ﬁgures report total treatment effects—i.e., the difference
between treated and pure control units—and 95% conﬁdence intervals for the 24 billing periods. The ﬁrst
vertical bar denotes the start of the COVID-19 pandemic in Argentina, and the second vertical bar ﬂags
the October’20 bill targeted by our intervention.
    Four important points are worth noting: (1) Overall, payment rate levels are low. The top left panel
shows that about 48% of households pay their bill before the 27th of each month. This share is relatively
constant until March 2020, when the COVID-19 pandemic hit Argentina and payment rates decreased
sharply to 23%; (2) a similar pattern emerges when we consider timely and past-due payments. The reason
why levels are higher and decrease over time is that as time goes by, it is more likely that individuals cancel
unpaid bills; (3) placebo direct effects (red line), based on payment rates constructed with timely payments
only, are precisely estimated and not different from zero for the 21 pre-intervention bills. For the October
2020 bill, however, timely payments are 4.4 p.p. higher in treated units relative to control blocks. This is
reassuring and implies that our sample is balanced and that the effects we estimate are indeed caused by
our experiment; and (4) when we account for past-due payments, the blue line shows that our intervention
nudged some individuals to catch up with unpaid bills. The difference in payment rates between treated
and pure control accounts experiences a noticeable increase in the pandemic billing periods from April
2020 onward. Although the October bill, when the intervention took place, presents the highest effect (4.2
p.p.), the letters also had some residual positive effects in November and December.




3
    The treatment letter included past due balances and could therefore induce neighbors to make backward payments to cancel
    debt.


                                                            B-21
                                        Figure B.11: Direct effects on pre- and post-intervention bills


                               Timely payments                                                                           Timely and past-due
                                     only                                                                                    payments

                                                                          (a) Payment rates in levels

   % paid the bill                                                                                % paid the bill
   60                                                          COVID-19            Intervention                                                               COVID-19            Intervention
                                                             billing periods            bill      60                                                        billing periods            bill
   55                                                                                             55

   50                                                                                             50

   45                                                                                             45
                     Treated (pooled)                                                                               Treated (pooled)
   40                                                                                             40
                     Pure Control                                                                                   Pure Control
   35                                                                                             35

   30                                                                                             30

   25                                                                                             25

   20                                                                                             20
        2019m1   2019m4   2019m7   2019m10   2020m1        2020m4   2020m7     2020m10   2021m1        2019m1   2019m4    2019m7   2019m10   2020m1       2020m4   2020m7     2020m10   2021m1
                                          Billing Period                                                                                 Billing Period



                                                           (b) Difference relative to pure control group

   Treatment                                                                                      Treatment
   effect (p.p.)                                                                                  effect (p.p.)
    6                                                          COVID-19                            6                                                          COVID-19
                                                             billing periods                                                                                billing periods
    5                                                                                              5
                                                                                   Oct 2020:                                                                                      Oct 2020:
                                                                                    4.4 p.p.                                                                                       4.2 p.p.
    4                                                                                              4

    3                                                                                              3

    2                                                                                              2
    1                                                                                              1
    0
                                                                                                   0
   -1
                                                                                                  -1
     2019m1      2019m4   2019m7   2019m10   2020m1    2020m4       2020m7     2020m10   2021m1     2019m1      2019m4    2019m7   2019m10   2020m1   2020m4       2020m7     2020m10   2021m1
                                          Billing Period                                                                                 Billing Period




Notes: These ﬁgures show the effect of the communication campaign on payment rates of pre- and post-intervention bills. The
left panels only consider timely payments, deﬁned as bills paid before the 27th of the corresponding month (i.e., any payment
made after the 27th is considered unpaid). Hence, pre-intervention bills mechanically exclude any past-due payment triggered
by our intervention. The right panels consider timely as well as past-due payments made until December 2020 and, thus, capture
backward payments triggered by our intervention (e.g., individuals who, after receiving the letter, pay the October 2020 bill as
well as previous unpaid bills). The top ﬁgures show payment rates in levels for treated units (black line) and pure control units
(gray line), for 24 consecutive monthly bills between January 2019 and December 2020. Treated units are pooled from groups
Tg = 1, 2, 3. The bottom ﬁgures report total treatment effects—i.e., the difference between treated and pure control units—and
95% conﬁdence intervals for the 24 billing periods. The letters were delivered between September 28th and October 7th. The
vertical bar denotes the start of the COVID-19 pandemic in Argentina. Each coefﬁcient is estimated in separate regressions.
Standard errors are clustered at the block level. The red line shows no difference on timely payments for pre-intervention bills.
In contrast, when we account for past-due payments, the blue line shows that our intervention nudged some individuals to catch
up with unpaid bills from April 2020 onwards.




                                                                                           B-22
B.3        Are Untreated Blocks Affected by the Intervention?

A crucial aspect of partial population experiments is the unit within which the experimenter will test
the presence of spillovers. In some settings, these are relatively straightforward to establish: electoral
precincts for political outcomes, towns for regional policies, and schools or school districts for educational
interventions. In our application, we aim to measure information spillovers among taxpayers. Discussions
with municipal tax authorities and with taxpayers, as well as the context of our intervention, led us to
select city street blocks as the relevant clusters for potential information spillovers about tax reminders
and deadlines and their effects on tax compliance. Speciﬁcally, the campaign was motivated by the sharp
drop in compliance in April 2020 induced by the severe lockdown imposed in the Greater Buenos Aires
area in Argentina during the COVID pandemic in a context where most payments were made in person
(see Figure B.11). The lockdown was strongly enforced, and as a result, citizens’ mobility was severely
limited, which justiﬁes the choice of the city street block—a relatively small cluster—as the relevant unit
for information spillover since it reﬂects the limited physical interactions generated by the lockdown. A
further justiﬁcation is the city’s street layout, which consists mainly of relatively homogeneous straight
streets with orthogonal intersections in square/rectangular city blocks (see Figure A.2).
    A potential concern with this setup is that the city street block may not be the relevant unit to capture
information spillovers. The random assignment process and the city’s physical layout imply that taxpayers
in pure control street blocks (i.e., blocks where no one received a tax reminder) were still adjacent and/or
surrounded by blocks with treated taxpayers, as shown by inspection of the map in Figure A.2. Inter-
ference between adjacent blocks is possible, and this would induce a downward bias in our results, since
individuals in pure control (untreated) blocks would be affected by the information campaign via spillovers
from adjacent (treated) blocks. Our empirical setup allows for an auxiliary test to rule out this concern and
establish that units in pure control blocks indeed provide a valid counterfactual in our analysis.4
    To test the robustness of untreated blocks as pure controls, we leverage our experimental assignment
process, which implies that the “intensity” of treatment in the surrounding blocks is random by deﬁnition.
Pure control units are by chance surrounded by blocks with varying degrees of treatment intensity (0%,
20%, 50%, or 80%), and thus by a random number of treated taxpayers. If there is interference between
treated and untreated blocks, we should observe that pure control payment rates increase with the exposure
of untreated blocks to treated blocks.
    We construct our measure of the potential exposure of a street-block to the intervention in two steps.
First, we use GIS software to calculate a buffer of 100, 200, and 300 meters around the centroid of each
street-block (see the three ﬁgures in the top panel of Figure B.12), given the typical street block length of
100 meters. Second, for each street-block and radius, we calculate the share of properties receiving a letter
(treated) relative to all the properties in the buffer zone. The three ﬁgures in the middle panel of Figure

4
    This is an auxiliary analysis in the sense that while it exploits features of our experimental design, it does not correspond to
    our pre-registered analysis and only represents an ex-post robustness check.


                                                                B-23
B.12 display the distribution of the share of treated units around pure control blocks.
    With this exposure measure at hand, we test whether payment rates of the October 2020 bill in pure
control blocks increase with the exposure to the proportion of treated units in surrounding blocks. The
ﬁgures in the bottom panel of Figure B.12 present parametric and non-parametric evidence of this rela-
tionship. Each panel shows a binned scatterplot of payment rates of the October 2020 bill (y-axis) by
equally-sized bins of exposure to treated units within the buffer zone (x-axis). Reassuringly, the rela-
tionship is ﬂat, and it is robust to increasing the size of the buffer zone to 200 and 300 meters. This is
conﬁrmed by the small linear regression coefﬁcients and large p-values reported in these ﬁgures.
    Our main results indicated that we only found spillover effects in our main research design for high
saturation blocks with high previous compliance, as illustrated by the results in Figure 3 and Table 2.
We conduct a similar analysis with the exposure measure for the 100-meter buffer in Figure B.13. The
parametric and non-parametric results presented there conﬁrm a ﬂat gradient for untreated blocks with
both high and low compliance in 2019, further conﬁrming that untreated blocks were not affected by the
intervention even when considering this heterogeneity.
    Finally, for completeness, we also study the relationship between payment rates and exposure to ad-
jacent treated blocks in blocks where 80% of the units were treated, again for the 100-meter buffer. The
results of this exercise are reported in Figure B.14. The left panel corresponds to the October 2020 bill
affected by our intervention, whereas the middle and right panels correspond to pre-intervention bills of
July and August 2020. In all these cases, the relationship between exposure and payment rates is ﬂat and
statistically not signiﬁcant for both the pure control blocks (with blue dots and blue linear ﬁt) and the 80%
saturation blocks (with red triangles and a red linear ﬁt). Interestingly, the vertical distance between the
red and blue linear ﬁt in the left panel captures the treatment effect of our experiment, which is clearly
uniform in the exposure measure.
    Taken together, the results from the exercise in this section indicate that pure control blocks were
not affected by adjacent treated blocks, and thus provide a valid counterfactual for the analysis. In more
general terms, information spillovers do not seem to have happened at a higher degree of aggregation than
the city street block. When combined with the presence of information spillovers documented in the main
body of the paper, the city street block seems to have been the relevant level of information dissemination
for this campaign.




                                                   B-24
                                                 Figure B.12: Robustness of untreated blocks as pure controls


                                100m buffer                                                            200m buffer                                                              300m buffer

                                                                                                      (a) Illustration




                          (b) Exposure distribution (share of treated accounts around pure treatment blocks)
    Share of                         100m buffer                                 Share of                       200m buffer                                 Share of                       300mts buffer
    accounts                                                                     accounts                                                                   accounts
     .08                                                                          .08                                                                        .08




     .06                                                                          .06                                                                        .06




     .04                                                                          .04                                                                        .04




     .02                                                                          .02                                                                        .02




          0                                                                            0                                                                          0
               0     .1         .2     .3   .4      .5      .6      .7     .8              0    .1         .2     .3   .4      .5      .6      .7     .8              0    .1         .2     .3   .4      .5      .6      .7     .8
                    Share of treated accounts around pure control blocks                       Share of treated accounts around pure control blocks                       Share of treated accounts around pure control blocks


                                                  (c) Payment rates for pure controls as a function of exposure
    Payment                                                                      Payment                                                                    Payment
      rate                           100m buffer                                   rate                         200m buffer                                   rate                         300m buffer
    .5




                                                                                 .5




                                                                                                                                                            .5




                                                          b=.000                                                                     b=.049                                                                     b=.008
                                                         (p=.979)                                                                   (p=.271)                                                                   (p=.881)
    .45




                                                                                 .45




                                                                                                                                                            .45
    .4




                                                                                 .4




                                                                                                                                                            .4
    .35




                                                                                 .35




                                                                                                                                                            .35
    .3




                                                                                 .3




                                                                                                                                                            .3
    .25




                                                                                 .25




                                                                                                                                                            .25
    .2




                                                                                 .2




                                                                                                                                                            .2




              .05         .15          .25       .35          .45          .55         .05           .15          .25       .35          .45          .55         .05           .15          .25       .35          .45          .55
                                     Exposure: 100mts                                                           Exposure: 200mts                                                            Exposure: 300m



Notes: The top three panels illustrate the way we compute buffer zones around the centroid of each street-block using GIS
tools in our data. We consider radiuses of 100 meters (left panel), 200 meters (middle panel), and 300 meters (right panel).
The middle panel three ﬁgures show the distribution of accounts in pure control street-blocks according to their exposure to
treated accounts. The bottom three panels show binned scatterplots of payment rates of the October 2020 bill (y-axis) in pure
control blocks and their exposure to treated units within the buffer zone (x-axis). The x-axis is grouped into equally-sized
bins. The coefﬁcient and p-value of each regression are also reported in each panel. The regressions ﬂexibly control for a
cubic polynomial of the number of properties in the buffer zone. This variable is highly correlated with payment rates, and its
omission leads to omitted variable bias.

                                                                                                                  B-25
Figure B.13: Payment rates and exposure of untreated blocks above and below median 2019 compliance,
100 meters buffer

    Payment                 All                           Payment            Above Median                         Payment           Below Median
      rate                                                  rate              Compliance                            rate             Compliance
    .55




                                                          .55




                                                                                                                  .55
                                                                                                 b=-0.018
                                                                                                (p=0.687)
    .5




                                                          .5




                                                                                                                  .5
                                          b=.000
                                         (p=.979)
    .45




                                                          .45




                                                                                                                  .45
    .4




                                                          .4




                                                                                                                  .4
    .35




                                                          .35




                                                                                                                  .35
                                                                                                                                                         b=-0.013
                                                                                                                                                        (p=0.745)
    .3




                                                          .3




                                                                                                                  .3
    .25




                                                          .25




                                                                                                                  .25
    .2




                                                          .2




                                                                                                                  .2
    .15




                                                          .15




                                                                                                                  .15
          .05   .15     .25       .35      .45      .55         .05   .15      .25       .35       .45      .55         .05   .15      .25       .35       .45      .55
                      Exposure: 100mts                                       Exposure: 100mts                                        Exposure: 100mts



Notes: This ﬁgure shows binned scatterplots of payment rates (y-axis) in pure control blocks by equally-sized bins of exposure
to treated units within a buffer zone of 100 meters (x-axis). The left panel replicates the bottom left panel of Figure B.12.
The middle and right panels split pure control blocks into blocks with above- and below-median compliance deﬁned in 2019,
respectively. The regressions ﬂexibly control for a cubic polynomial of the number of properties in the buffer zone. This
variable is highly correlated with payment rates, and its omission leads to omitted variable bias.




Figure B.14: Payment rates and exposure of untreated blocks and blocks with 80% treated units, 100
meters buffer

    Payment                                               Payment             July 2020                           Payment            August 2020
      rate            October 2020                          rate                                                    rate
                                                                            Pre intervention                                        Pre intervention
    .5




                                                                                                                  .5
                                                          .5
    .45




                                                                                                                  .45
                                                          .45




                                    Blocks with
                                    80% treated
    .4




                                                                                                                  .4
                                                          .4
    .35




                                                                                                                  .35
                                                          .35




                                     Untreated
                                      blocks
    .3




                                                                                                                  .3
                                                          .3
    .25




                                                                                                                  .25
                                                          .25




          .05   .15     .25       .35      .45      .55         .05   .15      .25       .35       .45      .55         .05   .15      .25       .35       .45      .55
                      Exposure: 100mts                                       Exposure: 100mts                                        Exposure: 100mts



Notes: This ﬁgure shows binned scatterplots of payment rates (y-axis) by equally-sized bins of exposure to treated units within
a buffer zone of 100 meters (x-axis). The left panel shows the gradient in both untreated blocks (blue dots) and blocks with
80% treated units (red triangles) for the October 2020 bill (the one affected by the intervention). The middle and right panels
correspond to the pre-intervention bills of July and August 2020, respectively. The regressions ﬂexibly control for a cubic
polynomial of the number of properties in the buffer zone. This variable is highly correlated with payment rates and its
omission leads to omitted variable bias.




                                                                              B-26
B.4    Heterogeneity and Pre-treatment Tax Compliance

Section 4.4.2 analyzes heterogeneous effects based on pre-treatment tax compliance behavior in the year
2019. The distribution of the 68,806 accounts by the number of bills paid in 2019 is bi-modal, with a core
group of neighbors not paying any bills (35%) and another group paying all of them (45%). Panel (a) of
Figure B.15 shows the individual-level distribution. We deﬁne past compliance by computing the average
number of payments of the twelve monthly bills for 2019 in each block. Panel (b) of Figure B.15 shows
the block-level distribution. We use this measure to divide our sample into two groups – those above and
those below the median block average payment rate.
    The logic of this exercise goes as follows. A large fraction of neighbors who typically paid their bills
stopped doing so during the pandemic in the ﬁrst few months of 2020. This decrease in compliance was
stronger in blocks that had higher compliance in 2019. Hence, we argue that such a core group of “good
compliers” is more likely to be nudged to pay by our intervention, and where spillover effects are more
likely to manifest.
   Figure B.16 suggests that 2018 and 2019 are comparable in terms of compliance, but compliance
decreased substantially in 2020 because of the pandemic—the sharp fall corresponds to the lockdown
measures put in place. Figure B.17 shows that payment rates in 2020 decreased more in blocks with higher
compliance in 2019. In contrast, 2018 and 2019 show similar levels of compliance. This set of ﬁgures
thus helps us rationalize the reasoning behind the heterogeneity analysis presented in Section 4.4.2.
    Table 2 conﬁrms that spillover effects are driven by blocks with baseline compliance above the median
in high saturation blocks (80% treated). Spillover effects are more muted and insigniﬁcant in medium
(50% treated) and low (20% treated) saturation blocks, however. Reassuringly, the ﬁrst two columns also
show no effects for the pre-intervention bill of September 2020, either above or below the median.




                                                   B-27
                 Figure B.15: Distribution of bill payments in 2019 for individuals and blocks
                               (a) Number of monthly bills paid in 2019 (by individuals)

  Share of
  accounts
   .5



   .4                                                                                                                45%


   .3           35%


   .2



   .1



    0
                 0         1     2        3         4        5          6        7      8        9    10       11     12
                                                        Number of bills paid in 2019


                                       (b) Share of bills paid in 2019 (by blocks)

  N blocks
   300

                                                                                                 Mean = 0.59
                                                                                                 p50 = 0.56
                                                                                                  N = 3,981
   200




   100




        0
            0         .1         .2           .3         .4           .5          .6        .7       .8         .9         1
                                                   Share of bills paid in 2019 - by block

Notes: Panel (a) shows the distribution of the 68,806 accounts by the number of bills paid in 2019. The distribution is bi-modal
with a core group of neighbors not paying any bills (35%) and another group paying all of them (45%). Panel (b) uses the
information from panel (a) to compute the share of total bills paid in 2019 for each block. We use this measure of block-level
compliance for the heterogeneity analysis, to split our sample into blocks below and above the median of 0.56 (see Table 2).
These two ﬁgures and values look very similar for the year 2018.




                                                                B-28
                  Figure B.16: Compliance in the ﬁrst nine months of 2018, 2019, and 2020
                                                    (a) 2018 vs 2019

  N blocks

   300                                                                                                             2018
                                                                                                                   2019




   200




   100




      0
          0          .1        .2         .3        .4           .5          .6      .7         .8        .9         1
                                                   Share of bills paid - by block


                                                    (b) 2019 vs 2020

  N blocks
                                                                                                                   2020
   300
                                                                                                                   2019




   200




   100




      0
          0          .1        .2         .3        .4           .5          .6      .7         .8        .9         1
                                                   Share of bills paid - by block

Notes: These ﬁgures show compliance in the ﬁrst 9 billing periods of the year. For each block, we compute the share of total
bills paid out of 9. Panel (a) compares 2018 and 2019, and panel (b) compares 2019 and 2020. We restrict the analysis to the
ﬁrst 9 bills because our intervention takes place in October. To make it comparable, the numerator excludes overdue payments
(i.e., payments made after the due date of each month). The ﬁgure suggests that 2018 and 2019 are comparable in terms of
compliance and that compliance decreased substantially in 2020 because of the pandemic.




                                                           B-29
      Figure B.17: Payment rates in 2020 decreased more in blocks with higher compliance in 2019


                                                     1                                                         45° line

               Share of bills paid in 2018 or 2020   .9

                                                     .8
                                                                                                                  2018
                                                     .7

                                                     .6                                                           2020
                                                     .5

                                                     .4

                                                     .3

                                                     .2

                                                     .1

                                                     0
                                                          0   .1   .2   .3    .4       .5     .6     .7   .8      .9      1
                                                                          Share of bills paid in 2019

Notes: This ﬁgure compares compliance in 2018 or 2020 (vertical axis) relative to 2019 (horizontal axis) at the block level. To
that end, we split the sample of blocks into ten evenly-spaced groups using the share of payments in 2019 (horizontal axis). For
each bin, we then compute the average share of payments in 2018, 2019, and 2020. The red triangles compare 2018 to 2019,
and the blue circles compare 2020 to 2019. The 45◦ line corresponds to the situation where compliance remains unchanged over
time. The ﬁgure suggests that the drop in compliance in 2020 highlighted in Figure B.16 is more prominent for higher levels of
baseline compliance. That is, blocks that had high compliance in 2019 are those where the payment rate decreased the most in
the ﬁrst nine months of 2020. In contrast, 2018 and 2019 display similar levels of compliance. This stylized fact suggests that
blocks with high compliance in 2019 (and low compliance in 2020) are more likely to be nudged by our intervention and, thus,
where spillovers are more likely to manifest.




                                                                                 B-30
C      Additional Numerical Illustration
Figure 1 summarizes the distribution of cluster sizes in ﬁve published studies employing partial population designs:
  epon et al. (2013), Gin´
Cr´                                                                                        undeln (2012) and
                         e and Mansuri (2018), Haushofer and Shapiro (2016), Ichino and Sch¨
Imai, Jiang and Malani (2021).

    For this numerical illustration, we calculate the estimators standard errors and minimum detectable effects based
on our formulas from Section 3 using the cluster size distribution of these four studies. We refer to these magnitudes
as “adjusted” standard errors and MDEs, since they are adjusted for cluster size variation. For comparison, we also
calculate the “unadjusted” standard errors and MDEs using average cluster size and assuming that the variance of
group size is equal to zero, that is, ignoring cluster size heterogeneity. To make the results comparable, we use as a
benchmark the design in our application to tax compliance, which has four saturations: p0 = 0, p1 = 0.2, p2 = 0.5,
p3 = 0.8. We compute the optimal probabilities {q0 , q1 , q2 , q3 } using Theorem 2. We assume for simplicity that
outcomes are homoskedastic with σ 2 (dt, dt) = 1 for all d, t so that effects are measured in standard deviations, and
consider four values for the intraclass correlation, ρ ∈ {0.1, 0.2, 0.5, 0.8}. The parameter of interest is the spillover
effect on untreated units in groups with 80% treated.

                                            Table A6: Numerical results

                                     Standard error                            MDE
                                   Adj. Unadj. Ratio                  Adj.     Unadj.      Ratio
                      ρ = 0.1
                       GM         0.1262 0.1181         1.0687       0.3536    0.3308     1.0689
                       HS         0.1053 0.0932         1.1307       0.2951    0.2610     1.1307
                       IS         0.1768 0.1667         1.0608       0.4954    0.4670     1.0608
                       IJM        0.0569 0.0497         1.1453       0.1595    0.1393     1.1450
                      ρ = 0.5
                       GM         0.2593 0.2393         1.0835       0.7265    0.6705     1.0835
                       HS         0.2098 0.1783         1.1761       0.5877    0.4997     1.1761
                       IS         0.3437 0.3171         1.0840       0.9630    0.8884     1.0840
                       IJM        0.1136 0.0950         1.1961       0.3183    0.2661     1.1962
                      ρ = 0.8
                       GM         0.3252 0.2997         1.0851       0.9112    0.8397     1.0851
                       HS         0.2622 0.2218         1.1818       0.7345    0.6215     1.1818
                       IS         0.4284 0.3941         1.0869       1.2002    1.1042     1.0869
                       IJM        0.1420 0.1181         1.2024       0.3979    0.3309     1.2025


    The numerical results are shown in Table A6. When the intraclass correlation is low (ρ = 0.1), accounting
for cluster size heterogeneity increases standard errors and MDEs between 6.8% and 14.5%. The problem worsens
for larger intraclass correlations. When ρ = 0.5, adjusted standard errors and MDEs are between 8.3% and 19.6%
larger, and between 8.5% and 20.2% larger when ρ = 0.8.




                                                         D-31
D       Proofs

D.1     Setup and Deﬁnitions

Following the notation in the paper, consider clusters g = 1, . . . , G with cluster size ng , units i = 1, . . . , ng and total
sample size n =       g   ng . The cluster-level treatment assignment is Tg ∈ {0, . . . , M } with P[Tg = t] = qt , and the
individual-level treatment indicator Dig with P[Dig = d|Tg = t] = pg (d|t). Within each cluster, the total number of
                                           i 1(Dig = d), and conditional on Ng > 0, the within-cluster average
                                      d =
units receiving treatment Dig = d is Ng                                      d

outcome under Dig = d is Y ¯ d = ng Yig 1(Dig = d)/N d .
                                   g          i=1                              g

    Letting 1dt
             ig = 1(Dig = d, Tg = t),                       ig )(d,t) and 1g = (1ig , . . . , 1ng g ) , the vector of OLS estimators
                                                    1ig = (1dt
for the sample means is:
                                                                 −1
                                         µ
                                         ˆn =            1g 1g                 1g Yg = (N)−1               1g Yg
                                                    g                     g                        g

where N = diag (N (d, t))(d,t) is a diagonal matrix with entries N (d, t) =                            g    g Ng and where 1g = 1(Tg = t).
                                                                                                           1t  d            t

                                                                               2 (d, t), Cov(Y , Y |D
    Also deﬁne E[Yig |Dig = d, Tg = t] = µg (d, t), V[Yig |Dig = d, Tg = t] = σg              ig  jg  ig =
d, Djg = d , Tg = t) = cg (d, d , t) with Cov(Yig , Yjg |Dig = d, Djg = d, Tg = t) = cg (d, t) and similarly
ρg (d, d , t) = cg (d, d , t)/(σg (d, t)σg (d , t)) and ρg (d, t) = ρg (d, d, t). Finally, let pg (d, d |t) = P[Dig = d, Djg =
d |Tg = t].



D.2     Auxiliary Results

Lemma 1 (Convergence of Sample Sizes) Under Assumptions 1 and 3,
                           ï ò−1                                      ïò
                      N     N                                        N
                        ×E       →P I2M +1 ,                       E     = diag qt                 ng pg (d|t)/n             .
                      n     n                                        n                         g                     (d,t)



Proof. For any (d, t),

    1                     1                         1
            1t                     V[1t                        V 1t                     1
                                                              ¶ î               ó     î                 ó©
                d                        d                               d               t       d
V            g Ng =                   g Ng ] =                    g E[ Ng  |Tg ]  + E    g V [ Ng  |Tg ]
    n   g                 n2   g                    n2    g
                     1
                   = 2                 n2         2
                                        g pg (d|t) qt (1 − qt ) + qt ng pg (d|t)(1 − pg (d|t)) + qt ng (ng − 1) pg (d, d|t) − pg (d|t)
                                                                                                                                      2
                    n          g
                                              n2
                                               g                              ng
                   = qt (1 − qt )                pg (d, t)2 + qt                 pg (d|t)(1 − pg (d|t))
                                          g   n2                    g         n2
                          ng (ng − 1)
                   + qt          2
                                       pg (d, d|t) − pg (d|t)2
                        g      n
                              2
                        Ç       å
                           g ng
                   =O              = o(1).
                          n2




                                                                        D-32
since       g   n2   2
                 g /n ≤ maxg ng /n → 0. Therefore, by Markov’s inequality,

        ñ        ï ò−1             ô                                     Å                     ã2
            N     N                                                            N (d, t)/n
   P          ×E       − I2M +1 > ε = P                                                     − 1 > ε2
            n     n                                                d,t
                                                                              E[N (d, t)/n]

                                                                          1                               1                                    ε    E[N (d, t)]
                                                        ≤          P                 1t  d
                                                                                      g Ng − E                          1t  d
                                                                                                                         g Ng      >√
                                                             d,t
                                                                          n     g                         n    g                             2M + 1     n
                                                                                          Ç                              å2
                                                          (2M + 1)2                                  n                             1
                                                        ≤                                                                      V              1t  d
                                                                                                                                               g Ng
                                                             ε2                     d,t
                                                                                              qt   g g pg (d|t)
                                                                                                    n                              n     g

                                                             (2M + 1)3 1        1
                                                        ≤         2
                                                                      · · max V                                     1t  d
                                                                                                                     g Ng → 0
                                                                ε      c d,t    n                              g


using that        g   ng pg (d|t)/n is bounded below.

                    ¯ d ) Under Assumptions 1 and 2,
Lemma 2 (Moments of Yg


                                     g E[Yg |Tg = t, Dg ] = 1g µg (d, t)
                                    1t     ¯d                   t

                                                               1t                                                              1d
                                                                                                                                ig 1jg
                                                                                                                                    d
                                    1t
                                     g V [ ¯
                                           Yg
                                             d
                                               |Tg = t, Dg ] =     σ (d, t) + 2cg (d, t)1t
                                                                 g 2
                                                                  d g                    g                                       d )2
                                                                   Ng                                               i    j>i
                                                                                                                               (Ng

                           1t
                       î                                     ó
                                d 2   ¯d                  2
                   E        g (Ng ) V[Yg |Tg = t, Dg ] = σg (d, t)qt ng pg (d|t) + cg (d, t)ng (ng − 1)qt pg (d, d|t).



                                       ig = 1(Dig = d),
Proof. By direct calculation, letting 1d

                                                                                                                1t
                                                        ñ                                            ô
                                                             1
                  1t   ¯d
                   g E[Yg |Tg       = t, Dg ] = 1  t
                                                   gE         d
                                                                         Yig d1 Tg = t, Dg =
                                                                             ig                                 d
                                                                                                                 g
                                                                                                                               E[Yig |Tg = t, Dg ]1d
                                                                                                                                                   ig
                                                            Ng     i
                                                                                                               Ng         i

                                             =1    t
                                                   g µg (d, t)


where the last equality follows from Assumption 2. Similarly,

                                        1t
1t   ¯d
 g V[Yg |Tg = t, Dg ] =
                                         g
                                      d )2
                                                   V[Yig |Tg = t, Dig = d]1d
                                                                           ig + 2                                    ig 1jg Cov(Yig , Yjg |Dig = d, Djg = d)
                                                                                                                    1d   d
                                    (Ng        i                                                      i       j>i

                                    1t                                                1 1
                                                                                     d d
                                                   + 21t
                                     g 2                                             ig jg
                                =      σ (d, t)
                                      d g              g cg (d, t)                     d 2
                                    Ng                                    i    j>i
                                                                                   ( N g)


and the third expression follows from taking expectation.

Lemma 3 (Convergence of squared sums) Given a vector of random variables Xg = (X1g , . . . , Xng g ) and (Xg )G
                                                                                                              g =1 ,
                  ng                               1
let Xg =          i=1 Xig                 X 2 . Suppose that: (i) (Xg )G
                              and deﬁne Tn =                           g =1 are independent across g ; (ii) Assumption
                                        îg g ó     n
3(i) holds; (iii) For some > r, supi,g E |Xig | < ∞. Then |Tn /E[Tn ] − 1| →P 0.




                                                                               D-33
Proof. This proof follows those of Theorems 2 and 3 in Hansen and Lee (2019). Write
                                                    2                                                                 ng
                    Tn      1                      Xg                    1         2              Xig
                          =                    1                       =          Zg ,   Zig :=           ,   Zg :=            Zig .
                   E[Tn ]   n          g   E   n       g
                                                            2
                                                           Xg            n    g                 E[Tn ]1/2             i=1


Fix ε > 0. We show that for n large enough, E [|Tn /E[Tn ] − 1|] < ε and the result follows by Markov’s inequality.
Set δ = ε2 /4. Then, using that:

                              1                            1                                    1
                      E                 2
                                       Zg =1=                             2
                                                                       E[Zg 1(Zg
                                                                               2
                                                                                 > nδ )] +                 2
                                                                                                        E[Zg 1(Zg
                                                                                                                2
                                                                                                                  ≤ nδ )]
                              n    g                       n       g                            n   g


we have:
                              ï                ò
                                   Tn                           1
                      E                  −1        ≤E                         2
                                                                             Zg 1(Zg
                                                                                   2             2
                                                                                     > nδ ) − E[Zg 1(Zg
                                                                                                      2
                                                                                                        > nδ )]
                                  E[Tn ]                        n       g

                                                                1
                                                   +E                         2
                                                                             Zg 1(Zg
                                                                                   2             2
                                                                                     ≤ nδ ) − E[Zg 1(Zg
                                                                                                      2
                                                                                                        ≤ nδ )]            .
                                                                n       g


    and by the triangle inequality,
                              ï                ò
                                   Tn                      2
                      E                  −1        ≤                      2
                                                                       E[Zg 1(Zg
                                                                               2
                                                                                 > nδ )]                                               (13)
                                  E[Tn ]                   n   g

                                                        1
                                                   +      E                   2
                                                                             Zg 1(Zg
                                                                                   2             2
                                                                                     ≤ nδ ) − E[Zg 1(Zg
                                                                                                      2
                                                                                                        ≤ nδ )]            .           (14)
                                                        n              g


Consider the term (13). For r ≥ 2,

                                                               |Zg |r Ä
                                                                   ñ                                         ô
           1                                 1
                          1                                             1 |Zg | > (nδ )
                                                                                                           ä
                      2     2                                                      r −2            r/2−1
                   E[Zg   (Zg      > nδ )] =               E
           n   g                             n          g     |Zg |r−2
                                                          1
                                                                                   1
                                                                         î            Ä                      äó
                                                                                 r                       1/2
                                               ≤                      E   | Z g |      | Z g | >  ( nδ )
                                                   n(nδ )r/2−1 g
                                                                            ñ        r Ç                          åô
                                                          1                     Zg             Zg        (nδ )1/2
                                               ≤                       r
                                                                      n E              1             >
                                                   nr/2 δ r/2−1 g g             ng             ng           ng
                                                                                     r                  Ç             å1/2
                                                          1                     Zg             Zg               n
                                               ≤                       r
                                                                      n E              1             >                     δ 1/2
                                                   nr/2 δ r/2−1 g g             ng             ng          maxg n2  g
                                                                    r                    r                 Ç             å1/2
                                                       1        g ng               Zg            Zg               n
                                               ≤            ·         sup E                1           >                      δ 1/2
                                                   δ r/2−1 nr/2 g                  ng            ng            maxg n2 g
                                                                            r                  Ç               å1/2
                                                    C r/2             Zg             Zg                n
                                               ≤            sup E             1            >                        δ 1/2
                                                   δ r/2−1 g          ng             ng           maxg n2    g




                                                                               D-34
where the last equality follows from Assumption 3(i). Now, by Condition (iii), for > r,
                                                                                      î       ó
                                                         î            ó       supi,g E |Xig |
                                        sup E |Zig |                      =                                   < ∞.
                                            i,g                                    E[Tn ]l/2

Thus by Lemma 1 in Hansen and Lee (2019), there is a B large enough such that:
                                                        ñ         r       Ç                  åô
                                                             Zg               Zg                           εδ r/2−1
                                        sup E                         1          >B                  ≤
                                            g                ng               ng                            2C r/2

and by Assumption 3(i) there is an n large enough such that:
                                                                      Ç              å1/2
                                                                            n
                                                            B≤                                   δ 1/2 ,
                                                                          maxg n2
                                                                                g

from which:
                                                r                         Ç                  å1/2
                                    Zg                       Zg                 n                                       εδ r/2−1
                          sup E                     1           >                                        δ 1/2      ≤            .
                           g        ng                       ng               maxg n2
                                                                                    g                                    2C r/2

Therefore,

                                                        1                            ε
                                                                     2
                                                                  E[Zg 1(Zg
                                                                          2
                                                                            > nδ )] ≤ .
                                                        n     g                      2

Next, consider the term (14). We have that:

                                                                                                                                                            2 1/2
                                                                                                                                                            
 1                                                                             1 
   E          2
             Zg 1(Zg
                   2             2
                     ≤ nδ ) − E[Zg 1(Zg
                                      2
                                        ≤ nδ )]                            ≤     E                    2
                                                                                                     Zg 1(Zg
                                                                                                           2             2
                                                                                                             ≤ nδ ) − E[Zg 1(Zg
                                                                                                                              2
                                                                                                                                ≤ nδ )]                         
 n      g                                                                      n             g
                                                                                                                          1/2
                                                                            1
                                                                           = V                 2
                                                                                              Zg     1     2
                                                                                                         (Zg     ≤ nδ )
                                                                            n            g
                                                                                                                                                                1/2
                                                                             1
                                                                                                           1                            1               2
                                                                                                 î                                                          ó
                                                                                                      2         2                   2     2
                                                                           =                 E       Zg       (Zg   ≤ nδ ) −     E[Zg   (Zg   ≤ nδ )]
                                                                             n       g
                                                                                                                                1/2
                                                                             1
                                                                           ≤                 E     4
                                                                                                  Zg      1   2
                                                                                                            (Zg   ≤ nδ )
                                                                             n       g


where the ﬁrst line uses Jensen’s inequality, the second line uses the deﬁnition of variance, the third line uses the
fact that clusters are independent and the fourth line uses that for any random variable A, V[W ] ≤ E[W 2 ]. Next,
          4 1(Z 2 ≤ nδ ) = Z 2 1(Z 2 ≤ nδ )
use that Zg                                                    2 1(Z 2 ≤ nδ ) ≤ nδZ 2 and thus
                                                              Zg
               g            g     g                                 g              g

                                                                      1/2                                           1/2
                     1                                                                       1                                          ε
                               E    4
                                   Zg   1 2
                                        (Zg             ≤ nδ )                ≤δ   1/2                       2
                                                                                                          E[Zg ]          ≤ δ 1/2 =
                     n     g                                                                 n       g                                  2




                                                                              D-35
since          2 ]/n = 1. Collecting these results,
            E[Zg
        g

                                                                ï               ò
                                                                     Tn
                                                            E              −1        ≤ε
                                                                    E[Tn ]

as required.



D.3     Proof of Theorem 1

For any (d, t),

                                          g   1t  d ¯d     p
                                               g Ng (Yg − µn (d, t))
         ˆ(d, t) − µp
         µ          n (d, t) =
                                                   N (d, t)
                                          g   1       ¯d
                                               t N d (Y
                                               g g     g− µg (d, t))                1
                                                                                    t d
                                                                                g ( g Ng   − qt ng pg (d|t))(µg (d, t) − µp
                                                                                                                          n (d, t))
                                  =                                  +
                                                   N (d, t)                                          N (d, t)

where the second equality uses that:


                qt ng pg (d|t)(µg (d, t) − µp
                                            n (d, t)) = qt                ng pg (d|t)µg (d, t) − µp
                                                                                                  n (d, t)           ng pg (d|t)   = 0.
            g                                                         g                                      g


Next,

                                  g   1t  d ¯d
                                       g Ng (Yg − µg (d, t))                1
                                                                      − qt ng pg (d|t))(µg (d, t) − µp
                                                                              t d
                                                                          g ( g Ng                   n (d, t))
   ˆ(d, t) −
   µ            µp
                 n (d, t)   =                           +
                                        N (d, t)                                N (d, t)
                              E[N (d, t)] 1       g Ng (Yg − µg (d, t)) + (1g Ng − qt ng pg (d|t))(µg (d, t) − µn (d, t))
                                                 1t  d ¯d                   t d                                 p
                            =             ·
                               N (d, t)     n g                           qt g ng pg (d|t)/n
                                E[N (d, t)] 1
                            =              ·            ψg (d, t)
                                 N (d, t)    n      g


where

                       1t
                        g Ng (Yg − µg (d, t)) + (1g Ng − qt ng pg (d|t))(µg (d, t) − µn (d, t))
                           d ¯d                   t d                                 p
        ψg (d, t) =                                                                                              ,     E[ψg (d, t)] = 0
                                                                   ¯n (d|t)
                                                                qt p




                                                                      D-36
     ¯n (d|t) =
with p                g   ng pg (d|t)/n and

                                     1
                                                   g (Ng ) V[Yg |Tg = t, Dg ] + (µg (d, t) − µn (d, t)) V[1g Ng ]
                                                E 1t
                                               ¶ î                           ó                                   ©
          V[ψg (d, t)] =                               d 2   ¯d                               p        2   t d
                                 2p
                                qt ¯n (d|t)2
                                   2
                                                            n (d, t))Cov(1g Ng (Yg − µg (d, t)), 1g Ng )
                                                                             t d ¯d
                            +  2p           (µg (d, t) − µp                                           t d
                              qt ¯n (d|t)2
                                    1
                            = 2               σ 2 (d, t)qt ng pg (d|t) + cg (d, t)ng (ng − 1)qt pg (d, d|t)
                                 ¯n (d|t)2 g
                              qt p
                              (µg (d, t) − µp n (d, t))
                                                        2
                            +         2p                    qt (1 − qt )n2         2
                                                                         g pg (d|t) + qt ng pg (d|t)(1 − pg (d|t))
                                     qt ¯n (d|t)2
                              (µg (d, t) − µp n (d, t))
                                                        2
                            +         2p                    qt ng (ng − 1) pg (d, d|t) − pg (d|t)2 .
                                     qt ¯n (d|t)2

From this,

              1                          1
          V               ψg (d, t) =               V[ψg (d, t)]
              n   g                      n2    g
                                                1               ng 2
                                     =                            σ (d, t)qt pg (d|t)
                                          2p
                                         qt ¯n (d|t)2      g    n2 g
                                           1                    ng (ng − 1)
                                     + 2                                    cg (d, t)qt pg (d, d|t)
                                         ¯n (d|t)2
                                      qt p                 g         n2
                                       qt (1 − qt )             n2
                                                                 g
                                     +  2p                         pg (d|t)2 (µg (d, t) − µp
                                                                                           n (d, t))
                                                                                                     2
                                      qt  ¯n (d|t)2        g    n2
                                            qt                  ng
                                     + 2                           pg (d|t)(1 − pg (d|t))(µg (d, t) − µp
                                                                                                       n (d, t))
                                                                                                                2
                                      qt p¯n (d|t)2        g    n2
                                              qt                ng (ng − 1)
                                     +    2                                 pg (d, d|t) − pg (d|t)2 (µg (d, t) − µp
                                                                                                                  n (d, t))
                                                                                                                           2
                                            ¯n (d|t)2
                                         qt p              g         n2
                                                       2
                                           Ç               å
                                                    g ng
                                     =O                        = o(1)
                                                    n2

       2 (d, t) and |µ (d, t) − µ (d, t)| are bounded by Assumption 3, p
since σg                                                               ¯n (d|t) is bounded from below and maxg ng /n →
                      g          n
0. This implies that:
                                                            ˆ(d, t) − µp
                                                           |µ          n (d, t)| →P 0

for all (d, t), which gives the consistency result. Next, stack the elements ψg (d, t) in a vector ψg and note that

                                                        1                        1
                                                Ωn = V √                  ψg =           E[ψg ψg ]
                                                         n            g          n   g




                                                                          D-37
where

                                                                                                                                       pg (d, d|t)
                                                                                           ß                    Å                                  ã
               1                    2                         n                                 2
                       E[ψg (d, t) ] =                                        2             ng σg (d, t)pg (d|t) 1 + ρg (d, t)(ng − 1)
               n   g                          qt         g   ng pg (d|t)              g                                                 pg (d|t)

                                        + ng (µg (d, t) − µp         2                   2
                                                           n (d, t)) ng (1 − qt )pg (d|t) + pg (d|t)(1 − pg (d|t)))

                                        + (ng − 1)Cov(1d   ig , 1jg |Tg = t)
                                                                            ä©
                                                                 d
                                                                               ,
    1                                              n                 , t)ng pg (d, d |t)
                                                             g cg (d, d
            E[ψg (d, t)ψg (d , t)] =
    n   g                                     qt         g ng pg (d|t)      g ng pg (d |t)

                                                                      − µpn (d, t))(µg (d , t) − µn (d , t))Cov(1g Ng , 1g Ng )
                                              n                                                    p             t d t d
                                                       g (µg (d, t)
                                        +                                                                                       ,
                                                                       qt     g ng pg (d|t)      g ng pg (d |t)
                                                             n2                                 p                        p
   1                                               n     g    g pg (d|t)pg (d |t )(µg (d, t) − µn (d, t))(µg (d , t ) − µn (d , t ))
            E[ψg (d, t)ψg (d , t )] = −                                                                                              .
   n    g                                                                     g ng pg (d|t)     g ng pg (d |t )


and this variance matrix is invertible because its minimum eigenvalue is bounded below by assumption. Finally,
write

                                                         1                                 1
                                                                    ψg (d, t) =                           ψig (d, t)
                                                         n     g                           n    g    i

where

                                                                                                 g 1ig
                                                                                                1t  d
                                          ®                                                 Ç                            å                      ´
                              1
            ψig (d, t) =                      11  t d
                                                  g ig (Yig        − µg (d, t)) +                         − qt pg (d|t) (µg (d, t) −   µp
                                                                                                                                        n (d, t))   .
                            ¯n (d|t)
                         qt p                                                                       ng

Then we have that for > r ≥ 2,
                                                             Å î                                                   ã
                       î            ó1/          1                                ó1/
                                                                                                      p         1/
                   E |ψig (d, t)|             ≤               E |Yig − µg (d, t)|     + |µg (d, t) − µn (d, t)|      <∞
                                                qt c

uniformly over i, g, d, t since as moments are uniformly bounded and using Minkowski’s inequality and Assumption
3. Thus,
                                                                                           /2
                                        sup E[ ψig ] ≤ (2M + 1)                                 sup E[|ψig (d, t)| ] < ∞
                                        i,g                                                     i,g,d,t

and by Theorem 2 in Hansen and Lee (2019),

                                                          1/2 1
                                                       Ω−
                                                        n    √                    ψg →D N (0, I(2M +1) ).
                                                               n          g


To complete the proof, notice that by Lemma 1 and the Slutsky theorem:

         −1/2 √                −1    −1/2 1                                                     −1/2 1
        Ωn      n(µ    ˆp
                  ˆn − µn ) = N E[N]Ωn   √                                                ψg = Ωn   √                  ψg + oP (1) →D N (0, I(2M +1) )
                                           n                                      g                   n           g


as required.




                                                                                      D-38
D.4     Proof of Proposition 1

By Equation (7),

                             ˆ cr (d, t) = n          g   1t   d 2 ¯d    ˆ(d, t))2
                                                           g (Ng ) (Yg − µ
                             Ω
                                                                       N (d, t)2
                                                           g 1g
                                                          1t              i 1ig (Yig − µ                                          1d
                                                                       Ä                       äÄ                                                     ä
                                                              t               d
                                                      g                                ˆ(d, t))                               i    ig (Yig − µ
                                                                                                                                             ˆ(d , t ))
                ˆ cr ((d, t), (d , t )) = n
                Ω
                                                                                                N (d, t)N (d , t )

                ˆ cr (d, t, d t ) = 0 for t = t . For the main diagonal terms, recall that:
and notice that Ω

                               1
                                                     g (Ng ) V[Yg |Tg , Dg ] + (µg (d, t) − µn (d, t)) V[1g Ng ] .
                                                  E 1t
                                                 ¶ î                        ó                                   ©
           Ωn (d, t) =                                   d 2   ¯d                                     2   t d
                           2 ¯n (d|t)2
                         nqt p               g


Adding and subtracting µp
                        n (d, t) and expanding the square, the variance estimator is:

                                             Å                 ã2
                                                    n                       1
                             ˆ cr (d, t) =
                             Ω                                                       1t   d 2 ¯d     p
                                                                                      g (Ng ) (Yg − µn (d, t))
                                                                                                              2
                                                                                                                                                          (15)
                                                 N (d, t)                   n    g
                                                                                     1
                                       + (µp          ˆ(d, t))2
                                           n (d, t) − µ                                              1t   d 2
                                                                                                      g (Ng )                                             (16)
                                                                                     n          g

                                                                                        1
                                       + 2(µp
                                            n (d, t) − µ
                                                       ˆ(d, t))                                      1t   d 2 ¯d     p
                                                                                                      g (Ng ) (Yg − µn (d, t))                            (17)
                                                                                        n       g


where                                                                                                   
                                                                                                                   g   n2
                                                                                                                        g
                                             |µ
                                              ˆ(d, t) −         µp
                                                                 n (d, t)|           = OP                                     .
                                                                                                                   n2

as shown in the proof of Theorem 1. By Lemma 1 and the continuous mapping theorem,
                                                 Å              ã2
                                                        n                               1
                                                                        =        2p          (1            + oP (1)).
                                                     N (d, t)                   qt ¯n (d|t)2
On the other hand, for term (16),

                         1                                 1                                           2
                                  E[1t   d 2
                                     g (Ng ) ] =                               g 1ig ] +
                                                                            E[1t  d
                                                                                                                                     g 1ig 1jg ]
                                                                                                                                  E[1t  d d
                         n    g                            n   g        i
                                                                                                       n       g        i   j>i
                                                           1                                          1
                                                      =                ng qt pg (d|t) +                            ng (ng − 1)qt pg (d, d|t)
                                                           n   g                                      n    g
                                                                                   qt
                                                           ¯n (d|t) +
                                                      = qt p                                        ng (ng − 1)pg (d, d|t)
                                                                                   n        g
                                                                                2
                                                               Ç                  å
                                                                        gng
                                                      =O
                                                                        n

                   g 1ig , by Lemma 3,
and letting Xig = 1t  d


                                                           1
                                                           n      1t    d 2
                                                                   g (Ng )
                                                                   g
                                                                              = 1 + oP (1)
                                                               g E[1g (Ng ) ]
                                                       1             t   d 2
                                                       n


                                                                                 D-39
so

                                       1                                                                   1
          (µp          ˆ(d, t))2
            n (d, t) − µ                        1t   d 2     p
                                                                        ˆ(d, t))2
                                                 g (Ng ) = (µn (d, t) − µ                                               E[1t   d 2
                                                                                                                           g (Ng ) ](1 + oP (1))
                                       n   g                                                               n     g
                                                                                          2                       2
                                                                                                                                          maxg n2
                                                                                Ç             å       Ç             å                 Ç                å
                                                                                       g ng                   g ng                              g
                                                                       = OP                       O                       ≤ OP                             = oP (1)
                                                                                       n2                     n                             n

under Assumption 3. For term (17),

      1                                                  1                                                          1
              1t   d 2 ¯d     p
               g (Ng ) (Yg − µn (d, t)) =                             1t   d 2 ¯d
                                                                       g (Ng ) (Yg − µg (d, t)) +                             1t   d 2               p
                                                                                                                               g (Ng ) (µg (d, t) − µn (d, t)).
      n   g                                              n       g                                                  n    g


Now,
                                                                  1t
                                                                 î                                        ó
                                                                       d 2 ¯d
                                                             E     g (Ng ) (Yg − µg (d, t)) = 0

and

                                                                                                                                                      n2
                                                                                                                                               Ç               å
                     1                                                            1
                             1                                                                                                                       g g
                                                                                                      î                               ó
                                  t   d 2 ¯d                                                              ¯d
                 E                g (Ng ) (Yg           − µg (d, t))            ≤             n2
                                                                                               gE         Y g   − µg (d, t)               =O
                     n   g                                                        n      g                                                           n

so by Markov’s inequality,
                                                                                                                      n2
                                                                                                              Ç               å
                                           1
                                                        1t   d 2 ¯d
                                                         g (Ng ) (Yg            − µg (d, t)) = OP
                                                                                                                    g  g
                                                                                                                                  .
                                           n    g                                                                    n

On the other hand,

                         1                                                                                                                      n2
                                   1t   d 2               p
                                    g (Ng ) (µg (d, t) − µn (d, t)) ≤ max |µg (d, t) − µp
                                                                                        n (d, t)|
                                                                                                                                                 g
                         n    g
                                                                       g                                                                       n

which implies
                                                                                                                         n2
                                                                                                                Ç             å
                                           1
                                                         1t   d 2 ¯d     p
                                                          g (Ng ) (Yg − µn (d, t)) = OP
                                                                                                                          g
                                           n        g                                                                   n

and therefore
                                                                                               
                                                                                                                                n2
                                                                                                                         Ç                å
                       1                                                                                  n2
 (µp
   n (d, t) − µ
              ˆ(d, t))             1t   d 2 ¯d
                                    g (Ng ) (Yg               −      µp
                                                                      n (d, t))   = OP
                                                                                                      g    g
                                                                                                                    OP
                                                                                                                                 g
                       n      g                                                                       n2                       n
                                                                                                                                                    
                                                                                                      g   n6
                                                                                                           g                          n2
                                                                                                                                       g                   g   n2
                                                                                                                                                                g
                                                                                  = OP                              ≤ OP          max    ·                          = oP (1).
                                                                                                      n4                           g  n                  n2

Thus,
                                                Å                      ã2
                                                           n                1
                             ˆ cr (d, t) =
                             Ω                                                        1t   d 2 ¯d     p        2
                                                                                       g (Ng ) (Yg − µn (d, t)) + oP (1).
                                                        N (d, t)            n     g


Next, under Assumption 3 and by Lemma 3

                                               1
                                               n         g    1t    d 2 ¯d    p
                                                               g (Ng ) (Yg − µn (d, t))
                                                                                       2
                                                                                           = 1 + oP (1)
                                                             g 1g (Ng ) (Yg − µn (d, t))
                                                 1               t   d 2 ¯d     p        2
                                           E    n


                                                                                      D-40
and therefore:
                                   Å              ã2
                                          n                    1
                   ˆ cr (d, t) =
                   Ω                                   E                 1t   d 2 ¯d     p
                                                                          g (Ng ) (Yg − µn (d, t))
                                                                                                  2
                                                                                                    (1 + oP (1)) + oP (1)
                                       N (d, t)                n    g

                                   1         E    1           ¯ d − µp (d, t))2
                                                   t (N d )2 (Y
                                                   g   g       g       n
                             =                             2p                                (1 + oP (1)) + oP (1).
                                   n    g                qt  ¯n (d|t)2

But

  1       E    1t   d 2 ¯d     p
                g (Ng ) (Yg − µn (d, t))
                                        2
                                                           1         E    1t   d 2    ¯d    p        2
                                                                           g (Ng ) E (Yg − µn (d, t)) |Tg , Dg
                      2p                              =                                       2p
  n   g              qt ¯n (d|t)2                          n    g                            qt ¯n (d|t)2
                                                                             1t
                                                                         î               Ä                                        äó
                                                                                        ¯gd |Tg , Dg + (µg (d, t) − µp
                                                                                  d 2 V Y                                     2
                                                        1            E        g (Ng )                                n (d, t))
                                                      =                                               2p
                                                        n       g                                    qt ¯n (d|t)2
                                                           1         E    1t   d 2   ¯d
                                                                           g (Ng ) V Yg |Tg , Dg + V 1t  d               p
                                                                                                      g Ng (µg (d, t) − µn (d, t))
                                                                                                                                  2
                                                      =                                                      2p
                                                           n    g                                           qt ¯n (d|t)2

                                                        1            E    1t  d 2 (µ (d, t) − µp (d, t))2
                                                                           g Ng     g          n
                                                      +                               2p
                                                        n       g                    qt ¯n (d|t)2
                                                                                   n2    pg (d|t) 2
                                                                                      Å           ã
                                                                                    g                                      2
                                                      = Ωn (d, t) +                                 (µg (d, t) − µp
                                                                                                                  n (d, t))
                                                                               g   n p   ¯n (d|t)

which implies that:
                                                                                   ã2                           2
                           ˆ cr (d, t)                         n2       pg (d|t)        (µg (d, t) − µp
                                                                    Å
                           Ω                                    g                                      n (d, t))
                                       =1+                                                                        + oP (1).
                            Ω(d, t)                    g       n        ¯n (d|t)
                                                                        p                       Ωn (d, t)

                                                                                                 ˆ cr (0, 0). The true
                                                                                   ˆ cr (d, t) + Ω
                                                                      ˆcr (d, t) = Ω
Finally, consider the variance estimator for the difference in means, V
variance is:

          Vn (d, t) = Ωn (d, t) + Ωn (0, 0) − 2Ω(d, t, 0, 0)
                                                  n2      pg (d|t)
                                                       Å           ã
                                                     g
                    = Ωn (d, t) + Ωn (0, 0) + 2                      (µg (d, t) − µn (d, t))(µg (0, 0) − µn (0, 0)).
                                                g n       ¯n (d|t)
                                                          p

Therefore,
                                                                                                      ã2
          ˆcr (d, t)               n2       pg (d|t)
                                        Å
          V                         g                              p                         p               1
                     =1+                             (µg (d, t) − µn (d, t)) − (µg (0, 0) − µn (0, 0))             + oP (1).
          Vn (d, t)          g     n        ¯n (d|t)
                                            p                                                            Vn (d, t)

so that
                                                                        ˆcr (d, t)
                                                                        V
                                                                    plim           ≥ 1.
                                                                    n→∞ Vn (d, t)

as required.




                                                                               D-41
D.5     Proof of Theorem 2

Based on Theorem 1, the variance for the difference in means can be approximated as:

                                       ng pg (d|t)                              pg (d, d|t)
                                                   ï          ß                                     ™
                 ˆ(d, t)] ≈ 1
              V[ β                                     2
                                                     σg (d, t) 1 + ρg (d, d, t)             (ng − 1)
                            qt          2 ¯ (d|t)2                               pg (d|t)
                                   g n p   n

                                                              pg (d, d|t)
                                                        ß                          ™ò
                                                      2
                            + (µg (d, t) − µn (d, t)) 1 +                 (ng − 1)
                                                               pg (d|t)
                               1      ng 2
                            +              σ (0, 0) {1 + ρg (0, 0, 0)(ng − 1)} + ng (µg (0, 0) − µn (0, 0))2
                              q0 g n 2 g
                                                                                                   ò2
                                  n2g pg (d|t)
                                      ï
                            −       2 p
                                                 (µg (d, t) − µn (d, t)) − (µg (0, 0) − µn (0, 0))
                               g n      ¯n (d|t)

where the last term does not depend on {qt }t so after dropping this term and rescaling by n2 , the minimization
problem is equivalent to:
                                                  M
                                                        Bt (ω ) B0
                                        min                    +    = f (q0 , q1 , . . . , qM )
                                     q0 ,q1 ,...,qM
                                                    t=1
                                                          qt     q0

subject to qt > 0,      t qt   = 1 where B0 and Bt (ω ) are deﬁned in the statement of the theorem. The ﬁrst-order
conditions for each qt , t > 0 are given by:
                                                                                               
                                  ∂f    Bt (ω ) B0                                      ∗         Bt (ω ) ∗
                                      =− 2 + 2 =0                          ⇐⇒          qt =              q
                                  ∂qt    qt     q0                                                 B0 0

Since   t>0 qt   = 1 − q0 ,                                                     
                                                        ∗            ∗             Bt (ω )
                                                  1−   q0       =   q0
                                                                         t>0
                                                                                    B0

and thus:
                                         √                                               √
                        ∗                    B0                      ∗                       Bt
                       q0   =√                              ,       qt   =√                                 ,   t > 0.
                                 B0 +         t>0 Bt (ω )                      B0 +           t>0 Bt (ω )

On the other hand, the second-order conditions for t > 0 are given by:

                                        ∂2f   2Bt (ω ) 2B0                          ∂2f     2B0
                                          2 =     3   + 3 ,                                = 3
                                        ∂qt     qt      q0                         ∂qt ∂ql   q0

and therefore the Hessian matrix H can be written as:
                                      Å                         ã Å     ã
                                        2B1 (ω )       2BM (ω )     2B0
                             H = diag       3    ,...,    3      +    3   1M 1M
                                           q1            qM          q0

where 1M is an M × 1 vector of ones. Thus, for any non-zero M × 1 vector v,

                        M                                                      M                                  M          2
                                2
                       2Bt (ω )vt
                                    Å
                                      2B0
                                          ã                                                 2
                                                                                   2Bt (ω )vt
                                                                                                Å
                                                                                                  2B0
                                                                                                      ã
            v Hv =          3     +     3   v 1M 1M v =                                 3     +     3                   vt       >0
                   t=1
                           q1          q0                                      t=1
                                                                                       q1          q0             t=1




                                                                    D-42
using that Bt (ω ) > 0 for all t so the Hessian is positive deﬁnite as required.



D.6     Proof of Corollary 1

This proof follows from Theorem 1 and Proposition 1 setting µg (d, t) = µp
                                                                         n (d, t) = µ(d, t) throughout.




D.7     Proof of Proposition 2

Let Dg (d, t) denote the set of possible values for D(i)g given Dig = d and Tg = t. Then,

  E[Yig |Dig = d, Tg = t] =                   E[Yig |Dig = d, D(i)g = dg , Tg = t]P[D(i)g = dg |Dig = d, Tg = t]
                               dg ∈Dg (d,t)

                           =                  E[Yig (d, dg )|Dig = d, D(i)g = dg , Tg = t]P[D(i)g = dg |Dig = d, Tg = t]
                               dg ∈Dg (d,t)

                           =                  E[Yig (d, dg )]P[D(i)g = dg |Dig = d, Tg = t]
                               dg ∈Dg (d,t)
                               n g −1    ï Å                 ãò
                                                      sg
                           =            E Yig d,                  P[Sig = sg |Dig = d, Tg = t]
                               sg =0
                                                    ng − 1

where the ﬁrst equality follows by the law of iterated expectations, the second equality plugs in the potential out-
comes under the exclusion restriction (Assumption 6), the third equality uses independence (Assumption 7), and the
fourth equality uses exchangeability (Assumption 8).



D.8     Proof of Theorem 3

We verify the conditions for Theorem 1. First, condition (i) implies that Proposition 2 holds. Second, condition (ii)
and Proposition 2 imply Assumption 1. Next, condition (iii) implies that:

                                                                       1 − D = s , D = d|T = t]
                                                                   P[ Ng
                                    P[Sig = sg , Dig = d|Tg = t]            ig    g    ig       g
 P[Sig = sg |Dig = d, Tg = t] =                                  =
                                              pg (d|t)                         pg (d|t)
                                    P[Ng1 = s + d, D = d|T = t]
                                                                                              P[Dig = d|Tg = t]
                                                                    = 1(sg + d = ng pg (1|t))
                                             g         ig     g
                                  =
                                                 pg (d|t)                                          pg (d|t)
                                  = 1(sg = ng pg (1|t) − d)




                                                              D-43
                                                      î Ä n p (1|t)−d äó
                                                                g
and thus by Proposition 2, E[Yig |Dig = d, Tg = t] = E Yig d, g n g −1
                                                                         . This fact also implies that

               E[Yig |Dig = d, Tg = t, D(i)g ] =            E[Yig |Dig = d, Tg = t, D(i)g = dg ]1(D(i)g = dg )
                                                       dg

                                                   =        E[Yig (d, (dg 1g − d)/(ng − 1))]1(D(i)g = dg )
                                                       dg

                                                   =        E[Yig (d, (sg /(ng − 1))]1(Sig = sg )
                                                       sg

                                                   = E[Yig (d, (ng pg (1|t) − d)/(ng − 1))]

and an analogous argument gives the result for the joint moments, so Assumption 2 holds. Next, condition (iii)
implies that Assumption 3 holds, so all the requirements for Theorem 1 are satisﬁed. Finally, by Proposition 2 and
condition (iv),

                                                                                ng p(1|t) − d
                                                                        ï Å                   ãò
                          ng                   ng                   ng                                   ng
      βn (d, t) : =          µg (d, t) −          µg (0, 0) =          E Yig d,                  −          E[Yig (0, 0)]
                      g   n                g   n                g   n              ng − 1            g   n

which completes the proof.



D.9      Proof of Theorem 4

This result follows from the fact that conditions (i) and (ii) imply Assumption 5 and thus under the conditions for
Theorem 3, Corollary 1 holds.




                                                                D-44