Policy Research Working Paper 6824
Designing Experiments to Measure Spillover
Effects
Sarah Baird
Aislinn Bohren
Craig McIntosh
Berk Özler
The World Bank
Development Research Group
Poverty and Inequality Team
March 2014
Abstract
This paper formalizes the design of experiments intended specifically to study spillover effects.
By first randomizing the intensity of treatment within clusters and then randomly assigning
individual treatment conditional on this cluster-level intensity, a novel set of treatment
effects can be identified. The paper develops a formal framework for consistent estimation of
these effects, provides explicit expressions for power calculations, and shows that the power
to detect average treatment effects declines precisely with the quantity that identifies the
novel treatment effects. A demonstration of the technique is provided using a cash transfer
program in Malawi.
This paper is a product of the Poverty and Inequality Team, Development Research Group. It is part of a larger effort by
the World Bank to provide open access to its research and make a contribution to development policy discussions around
the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be
contacted at bozler@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Designing Experiments to Measure Spillover Effects*
Sarah Baird†, J. Aislinn Bohren‡, Craig McIntosh§, Berk Özler¶
February 7, 2014
KEYWORDS: Experimental Design, Networks, Cash Transfers
JEL: C93, O22, I25
* We are grateful for the useful comments received from seminar participants at Caltech, Monash, Namur,
Paris School of Economics, Stanford, University of British Columbia, UC Berkeley, University of Colorado,
University of Melbourne, and Yale. We thank the Global Development Network, Bill & Melinda Gates
Foundation, National Bureau of Economic Research Africa Project, World Bank’s Research Support Budget, and
several World Bank trust funds (Gender Action Plan, Knowledge for Change Program, and Spanish Impact
Evaluation fund) for funding.
† George Washington University and University of Otago, sarah.baird@otago.ac.nz
‡ University of Pennsylvania, abohren@sas.upenn.edu
§ University of California, San Diego, ctmcintosh@ucsd.edu
¶ World Bank and University of Otago, berk.ozler@otago.ac.nz
Multiple economic disciplines have begun to explore the empirical issues raised by spillover
eﬀects from one individual to another. What Charles Manski (1993) refers to as endogenous
eﬀects are explored in diﬀerent ways – by empirical studies permitting general equilibrium
eﬀects, the analysis of medical treatments that provide herd immunity, or by studies of
network eﬀects. An increasingly useful lens on this problem is experimental policy trials
that explicitly consider interference between individuals. Once we permit interference in this
context, the impact of a program only on its beneﬁciaries becomes an unsatisfying answer
to the real policy impact. Thus, it becomes more important to understand spillovers and
the overall eﬀect on the entire population. What if a program creates beneﬁts to some only
by diverting them from others? How do individuals respond to the intensity of treatment
within a population? Does the study even have an unpolluted counterfactual?
The possibility of interference between individuals has traditionally been seen as the
Achilles heel of randomized experiments; standard experimental designs are unable to identify
and measure spillovers.[1] Given these concerns, a new wave of empirical work has emerged
in the past decade that tries to relax the strong assumption of no interference, that is, the
assumption that individuals are not affected by the treatment status of others. This literature includes studies
that uncover network eﬀects using experimental variation across treatment groups (Matteo
Bobba and Jeremie Gignoux 2013; Edward Miguel and Michael Kremer 2004), leave some
members of a group untreated (Manuela Angelucci and Giacomo De Giorgi 2009; Felipe
Barrera-Osorio, Marianne Bertrand, Leigh Linden and Francisco Perez-Calle 2011; Gus-
tavo J. Bobonis and Frederico Finan 2009; Esther Duﬂo and Emmanuel Saez 2003; Rafael
Lalive and M. A. Cattaneo 2009), exploit plausibly exogenous variation in within-network
treatments (Philip S. Babcock and John L. Hartman 2010; Lori A. Beaman 2012; Tim-
othy G. Conley and Christopher R. Udry 2010; Esther Duﬂo and Emmanuel Saez 2002;
Kaivan Munshi 2003), or intersect an experiment with pre-existing networks (Abhijit Baner-
jee, Arun G. Chandrasekhar, Esther Duﬂo and Matthew O. Jackson 2013; Jiehua Chen,
Macartan Humphreys and Vijay Modi 2010; Karen Macours and Renos Vakis 2008; Emily
Oster and Rebecca Thornton 2012).
A partial population experiment (Robert A. Moﬃtt 2001), in which some clusters are
assigned to control and a subset of individuals are oﬀered treatment within clusters assigned
to treatment, partially overcomes this challenge and yields valid estimates of treatment
and spillover eﬀects. But such experiments provide no exogenous variation in treatment
[1] In the presence of spillovers, the blocked design produces biased estimates. The clustered design is not
biased, but provides no information to estimate the extent of spillovers.
saturation to estimate the extent to which program effects are driven by the percentage of
individuals treated in treatment clusters.[2] Consequently, the most recent empirical approach
has been to conduct a two-level randomization in which the share of individuals assigned to
treatment within treated clusters is directly varied.[3]
In this paper, we provide a formal presentation of these randomized saturation (RS)
designs. We deﬁne the relevant set of treatment eﬀects a researcher should consider in the
presence of spillovers, and present a clear set of assumptions under which the RS design
can consistently measure these eﬀects. In a RS design, each cluster is randomly assigned a
treatment saturation, and each individual within the cluster is randomly assigned a treatment
status, given the assigned cluster saturation. This design allows for the consistent estimation
of a rich set of treatment and spillover eﬀects across the distribution of treatment saturations,
which include the intention to treat eﬀect (ITT), spillovers on the non-treated (SNT), and
the total causal eﬀect (TCE) as well as novel estimands such as the treatment eﬀect on the
uniquely treated (TUT) and the spillover eﬀect on the treated (ST). In the process, a RS
design also allows the researcher to discover the extent to which observed correlations in
outcomes within clusters were caused by endogenous eﬀects, thereby oﬀering a solution to
the reﬂection problem, albeit after the fact, and informing the design of future studies.
Next, we develop a technical framework to guide researchers through the various choices
in RS designs. All (non-trivial) RS designs yield consistent estimates of treatment and
spillover eﬀects, but the power of each design varies with the saturation proﬁle and the share
of clusters assigned to each saturation. We impose a random eﬀects variance structure and
derive explicit expressions for the minimum detectable treatment and slope eﬀects, which
determine the statistical power of diﬀerent RS designs. Power is a function of some standard
quantities, such as the eﬀect size and the intra-cluster correlation (ICC) of outcomes, as well
as of some unique features of the RS design such as the share of individuals assigned to each
treatment saturation and the variance of saturations. Power for the average ITT and SNT
is decreasing in precisely the quantity that identiﬁes the novel eﬀects, namely, variation in
[2] Most extant partial population experiments feature cluster-level saturations that are either endogenous
(Oportunidades) or fixed (Duflo and Saez (2003), where they are typically set at 50%).
PROGRESA/Oportunidades (Mexico) is perhaps the most-studied example of a partial population experiment.
This program features a treatment decision at the cluster (village) level and an objective poverty eligibility
threshold at the household level, so both eligible and ineligible individuals in treatment villages can be
compared to their counterparts in the pure control group. PROGRESA has been used to examine spillover
effects in several contexts (Jennifer Alix-Garcia, Craig McIntosh, Katharine R. E. Sims and Jarrod R. Welch
2013; Angelucci and De Giorgi 2009; Bobonis and Finan 2009). Other partial population experiments include
Duflo and Saez (2003) and Peter Kuhn, Peter Kooreman, Adriaan Soetevent and Arie Kapteyn (2011).
[3] Abhijit Banerjee, Raghabendra Chattopadhyay, Esther Duflo, Daniel Keniston and Nina Singh (2012);
Bruno Crepon, Esther Duflo, Marc Gurgand, Roland Rathelot and Philippe Zamora (2013); Xavier Gine
and Ghazala Mansuri (2012); Betsy Sinclair, Margaret McConnell and Donald P. Green (2012).
the intensity of treatment across clusters. RS designs therefore generate a tradeoﬀ compared
with the standard blocked, clustered, and partial-population designs: while they allow the
researcher to identify novel eﬀects on both the treated and the non-treated, this comes at the
cost of reduced power to detect average eﬀects. With input on the magnitude of ICCs and
the relative importance of the various estimands, we use the power calculations to provide
insight on design choices such as the optimal degree of variation in saturations and the size
of the pure control.[4]
We conclude the theoretical presentation with three additional uses of the RS design.
First, we show that one can recover an estimate of the treatment on the compliers eﬀect
(TOC) by assuming that the observed spillover eﬀects on those not oﬀered treatment are a
reasonable proxy for the spillovers experienced by non-compliers. This technique is critical
because interference between units within clusters violates the exclusion restriction in the
standard technique of instrumenting for treatment with randomized assignment to identify
the TOC. Second, we consider experiments that use within-cluster controls to form the
counterfactual. Imposing a functional form assumption on the saturation distribution allows
the researcher to project the desired counterfactual outcome: untreated clusters with a
saturation of zero. This value can be used to correct the naive estimate of the ITT, even
in studies without a pure control. Finally, we show that an RS design implemented on a
non-overlapping network also produces exogenous variation in the treatment saturation of
overlapping networks (for example, social groups), variation that is generally superior to
what would be obtained from blocked or clustered designs.
We close with an empirical application of these techniques using a cash transfer exper-
iment in Malawi, wherein the fraction of eligible school-aged girls oﬀered treatment was
randomized across clusters. The study seeks to understand whether cash transfers could
help adolescent girls improve schooling outcomes as well as delay marriage and pregnancy.
In previous work, we have shown that, compared with a pure control group, conditional cash
transfers (CCTs) signiﬁcantly improved schooling outcomes while unconditional cash trans-
fers (UCTs) caused substantial reductions in marriage and fertility rates among program
beneficiaries (Sarah Baird, Craig McIntosh and Berk Özler 2011). In this paper, we exploit
the sample of within-cluster controls and the RS design to investigate spillover eﬀects on
both program beneﬁciaries and eligible non-beneﬁciaries. Spillovers are a central concern for
two distinct reasons. First, a large literature indicates that schooling cash transfer programs
can alter the welfare of non-beneﬁciaries due to congestion eﬀects in the classroom (Jere R
Behrman, Piyali Sengupta and Petra Todd 2005), shifts in local norms around education
[4] In the Supplemental Appendix, we provide a Matlab program that allows a researcher to calculate the
power of different potential RS designs.
(George A. Akerlof and Rachel E. Kranton 2002), income spillovers (Manuela Angelucci, Gi-
acomo De Giorgi, Marcos A. Rangel and Imran Rasul 2010), or general equilibrium changes to
prices (Jesse M. Cunha, Giacomo De Giorgi and Seema Jayachandran 2011) and production
(Alix-Garcia et al. 2013). Second, and more speciﬁc to the Malawian context, cash transfers
can decrease young women’s dependence on men for ﬁnancial assistance (Winford Masanjala
2007) and/or the need for ‘transactional sex’ (Michelle J. Poulin 2007; Ann Swidler and
Susan Watkins 2007), thereby reducing the incidence of teen pregnancies and early marriages
among program beneﬁciaries, but with ambiguous spillovers to non-beneﬁciaries in the same
communities.
We ﬁnd that while average spillover eﬀects are muted for all outcomes, they generally
intensify with treatment saturation: positive treatment eﬀects on beneﬁciaries are accom-
panied by positive spillovers on non-beneﬁciaries, which increase with treatment intensity.
On the other hand, treatment eﬀects among beneﬁciaries themselves decline with treatment
saturation. More importantly, we ﬁnd no evidence for higher rates of marriage or preg-
nancy among within-cluster controls, suggesting that diversionary eﬀects do not counter the
documented beneﬁcial eﬀects of UCTs on these outcomes. Finally, taking advantage of ex-
ogenous variation generated by the RS design in the number of treated friends of individuals,
we conﬁrm that spillover eﬀects are similarly muted in social networks.
The remainder of the paper is structured as follows. Section 1 formally models a random-
ized saturation design, outlines the assumptions required to use this design, deﬁnes novel
estimands related to spillovers, presents closed-form expressions for the power of these es-
timands, and discusses the critical design tradeoﬀs. Sections 2.1 and 2.2 discuss the use
of randomized saturation designs in the absence of a pure control group, while Section 2.3
demonstrates the use of randomized saturation designs in a broader class of networks. Sec-
tion 3 presents an application of the technique and Section 4 concludes. All proofs are in
the Appendix.
1 A Randomized Saturation Design
One of the most basic design choices in any multi-level experiment is the question of allocating
treatment to N individuals distributed across C clusters. The conventional wisdom focuses
on the ‘design eﬀect’, whereby a positive correlation between the outcomes of individuals
in the same cluster, i.e. intra-cluster correlation (ICC), causes a power loss if treatment is
assigned at the cluster level. It would be easy to conclude that a blocked design, in which half
of the individuals in each cluster are treated and the other half serve as the counterfactual, is
preferable. Critically, however, individuals in the same cluster may behave similarly because
they are inﬂuenced by the behavior of others in the group (endogenous eﬀects), their behavior
reﬂects the exogenous characteristics of the group (contextual eﬀects), or because they share
similar characteristics or face similar institutional environments (correlated eﬀects) (Manski
1993). The entire thrust of the ‘reﬂection problem’ introduced by Manski (1993) is the
impossibility of separating these eﬀects using observational data that is typically available
to the researcher at baseline. If only contextual or correlated eﬀects are responsible for the
observed ICC, indeed the blocked design proves optimal. However, if endogenous eﬀects are
present, then a blocked design is the wrong choice because the counterfactual is contaminated
by interference from treated individuals. A clustered design, in which some clusters are
assigned to treatment and others to control, would produce unbiased treatment effect estimates if
there is no interference across clusters, but with the loss of statistical power arising from
cross-cluster identiﬁcation. Thus this most basic of design choices ends up on the horns of
the reﬂection problem: because neither the blocked nor the clustered design actually reveals
the extent of interference, researchers learn little from a given study as to the optimal design
of subsequent studies. The RS design provides a solution to this conundrum.
A randomized saturation (RS) design is an experiment with two stages of randomization.
Take as given a set of $N$ individuals divided into $C$ non-overlapping groups, or clusters.[5]
The first stage randomizes the treatment saturation of each cluster, and the second stage
randomizes the treatment status of each individual in the cluster, according to the realized
saturation of the cluster. Formally, in the first stage, each cluster $c = 1, \ldots, C$ is assigned a
treatment saturation $\pi_c \in \Pi \subset [0, 1]$ according to the distribution $F$, with mean $\mu = E[\pi]$
and variance $\eta^2 = \mathrm{Var}(\pi)$. In the second stage, each individual $i = 1, \ldots, n$ in cluster $c$ is
assigned a treatment status $T_{ic} \in \{0, 1\}$, where $T_{ic} = 1$ represents a treated individual.[6] The
realized treatment saturation of stage 1 specifies the distribution of the treatment status in
stage 2 for each cluster, $P(T_{ic} = 1 \mid \pi_c = \pi) = \pi$. Let $f$ be the probability mass function for
distribution $F$.[7] A RS design $\omega$ is completely characterized by the pair $\{\Pi, f\}$.
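The two stages described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `assign_rs_design` is hypothetical, clusters are assumed to be of equal size, and stage 2 is assumed to treat a fixed number `round(pi_c * n)` of individuals per cluster (one of several ways to obtain $P(T_{ic} = 1 \mid \pi_c = \pi) = \pi$).

```python
import random

def assign_rs_design(n_clusters, cluster_size, saturations, probs, seed=0):
    """Two-stage randomized saturation assignment (illustrative sketch).

    saturations: support Pi of treatment saturations, e.g. [0.0, 1/3, 2/3]
    probs:       f(pi), the share of clusters assigned to each saturation
    Returns a list of (cluster index, pi_c, [T_ic for each individual]).
    """
    rng = random.Random(seed)
    design = []
    for c in range(n_clusters):
        # Stage 1: draw the cluster's saturation pi_c from f.
        pi_c = rng.choices(saturations, weights=probs, k=1)[0]
        # Stage 2 (assumption): treat exactly round(pi_c * n) individuals,
        # chosen uniformly at random within the cluster.
        n_treated = round(pi_c * cluster_size)
        treated = set(rng.sample(range(cluster_size), n_treated))
        status = [1 if i in treated else 0 for i in range(cluster_size)]
        design.append((c, pi_c, status))
    return design

design = assign_rs_design(n_clusters=6, cluster_size=12,
                          saturations=[0.0, 1/3, 2/3], probs=[1/3, 1/3, 1/3])
for c, pi_c, status in design:
    print(c, round(pi_c, 2), sum(status))
```

Clusters that draw a saturation of zero are pure control clusters; untreated individuals in the remaining clusters are within-cluster controls.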
The saturation $\pi_c = 0$ represents a cluster with no treatment individuals, or a pure
control cluster. A within-cluster control is defined as an untreated individual in a cluster
with treated individuals: $S_{ic} = \mathbb{1}\{T_{ic} = 0, \pi_c > 0\}$. This results in the following distribution
[5] The RS design and the studies discussed here use a simple, spatially defined notion of ‘cluster’ that is
mutually exclusive and exhaustive. This is distinct from the issue of randomizing saturations with overlapping
social networks (Peter Aronow 2012), which typically require a more complex sequential randomization
routine (Panos Toulis and Edward Kao 2013). However, an additional advantage of this design is that it will
also create exogenous variation in the saturation of any network that is correlated with the given cluster, even
if this other network is overlapping. This is discussed in more depth in Section 2.3.
[6] This notation implicitly assumes each cluster is of equal size. This is for notational convenience; the
results easily extend to unequally sized clusters.
[7] For expositional simplicity, we present the theoretical results in a discrete saturation support framework,
although the analysis easily generalizes to continuous or mixed distributions.
over the three possible treatment statuses:
Treatment Individual: $P(T_{ic} = 1) = \mu$
Pure Control: $P(S_{ic} = 0, T_{ic} = 0) = \psi$
Within-cluster Control: $P(S_{ic} = 1) = 1 - \mu - \psi := \mu_S$
where $\psi := f(0)$. We say a randomized saturation design has a pure control if $\psi > 0$.
A RS design introduces correlation between the treatment status of two individuals in
the same cluster. This correlation is proportional to the variance of the cluster-level treatment
saturations, $\rho_T = \eta^2 / (\mu(1 - \mu))$, where $\eta^2$ can be split into the variance in treatment
saturation across treated clusters, $\eta_T^2 = \mathrm{Var}(\pi \mid \pi > 0)$, and the variance from pure control
clusters:
$$\eta^2 = (1 - \psi)\,\eta_T^2 + \frac{\psi}{1 - \psi}\,\mu^2$$
Section 1.3.1 shows that in the presence of intra-cluster correlation (ICC), $\eta^2$ affects the
power of the design.
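As a numerical check on these quantities, the sketch below computes $\mu$, $\psi$, $\eta^2$, $\eta_T^2$ and $\rho_T$ for a design given as a discrete support and mass function, and confirms the variance decomposition. The helper name `design_moments` and the example designs are hypothetical; the function assumes $0 < \mu < 1$ and $\psi < 1$.

```python
def design_moments(saturations, probs):
    """Moments of an RS design {Pi, f}; assumes 0 < mu < 1 and psi < 1."""
    pairs = list(zip(saturations, probs))
    mu = sum(pi * f for pi, f in pairs)                 # mean saturation
    eta2 = sum(f * (pi - mu) ** 2 for pi, f in pairs)   # Var(pi)
    psi = sum(f for pi, f in pairs if pi == 0)          # share of pure control clusters
    mu_T = mu / (1 - psi)                               # E[pi | pi > 0]
    eta2_T = sum(f / (1 - psi) * (pi - mu_T) ** 2
                 for pi, f in pairs if pi > 0)          # Var(pi | pi > 0)
    rho_T = eta2 / (mu * (1 - mu))                      # within-cluster treatment correlation
    return {"mu": mu, "psi": psi, "eta2": eta2, "eta2_T": eta2_T, "rho_T": rho_T}

m = design_moments([0.0, 1/3, 2/3], [1/3, 1/3, 1/3])
decomposed = (1 - m["psi"]) * m["eta2_T"] + m["psi"] / (1 - m["psi"]) * m["mu"] ** 2
print(m["eta2"], decomposed)

# Limiting cases: a clustered design has rho_T = 1, a blocked design has rho_T = 0.
clustered = design_moments([0.0, 1.0], [0.6, 0.4])
blocked = design_moments([0.4], [1.0])
print(clustered["rho_T"], blocked["rho_T"])
```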
The RS design nests several common experimental designs, including the clustered,
blocked and partial population designs.[8] The blocked design is biased in the presence of
spillovers, and it is not possible to measure spillovers with either the blocked or the clustered
design. Therefore, we must
put some restrictions on the RS design in order to be able to identify treatment and spillover
effects. We say a RS design is non-trivial if it has at least two saturations, at least one of
which is strictly interior.
Definition 1. A randomized saturation design is non-trivial if $|\Pi| \geq 2$ and $\exists\, \pi \in \Pi$ such
that $\pi \in (0, 1)$.
Multiple saturations guarantee a comparison group to determine whether effects vary with
treatment saturation, and an interior saturation guarantees the existence of within-cluster
controls to identify spillovers on the untreated ($\mu_S > 0$). Note that the blocked and clustered
designs are trivial, while the partial population design is non-trivial.
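Definition 1 translates directly into a one-line check on the saturation support. The function name `is_nontrivial` and the example supports (with a treatment probability of one half) are illustrative assumptions.

```python
def is_nontrivial(saturations):
    """Definition 1: at least two saturations, at least one strictly interior."""
    support = set(saturations)
    return len(support) >= 2 and any(0 < pi < 1 for pi in support)

print(is_nontrivial([0.0, 1.0]))        # clustered design
print(is_nontrivial([0.5]))             # blocked design
print(is_nontrivial([0.0, 0.5]))        # partial population design
print(is_nontrivial([0.0, 1/3, 2/3]))   # RS design with a pure control
```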
Remark 1. Before turning to our formal framework, it is important to clarify the popula-
tion in which the researcher is measuring spillovers. The RS design deﬁnes the treatment
saturation of a cluster as the share of the study sample that is oﬀered treatment. If spillovers
[8] Fixing the probability of treatment at $P$, the clustered design corresponds to $\Pi = \{0, 1\}$ and $f(1) = P$,
the blocked design corresponds to $\Pi = \{P\}$ and $f(P) = 1$, and the partial population design corresponds
to $\Pi = \{0, \pi\}$ and $f(\pi) = P/\pi$. In the clustered design, there is perfect correlation between the treatment
status of two individuals in the same cluster, and in the blocked design, there is no correlation. Note $\eta_T^2 = 0$
for all three.
occur within the study sample, then this is the appropriate saturation measure.[9] Alternatively,
if there is a ‘gateway to treatment’ and not all eligible individuals are sampled into
the study, or spillovers occur on a larger population within the cluster, then it is necessary
to distinguish between the true treatment saturation (the share of treated individuals in the
spillover network) and the assigned treatment saturation (the share of treated individuals in
the study population).[10] If sampling rates or the share of the spillover population eligible
for treatment are constant across clusters, the true saturation is the sampling rate times the
assigned saturation. If the sampling rates are driven by cluster characteristics, then the true
saturation is endogenous. In this case, the researcher can instrument for the true saturation
with the assigned saturation. To streamline the remainder of the theoretical analysis, we
assume that the assigned and true saturations coincide.
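For the constant-sampling-rate case in Remark 1, the relationship is a simple product. The sketch below uses the Gine and Mansuri (2012) figures quoted in the paper's footnote (every fourth household sampled, 80 percent offered treatment); the function name `true_saturation` is hypothetical.

```python
def true_saturation(sampling_rate, assigned_saturation):
    """True saturation when the sampling rate is constant across clusters."""
    return sampling_rate * assigned_saturation

# One in four households sampled, 80 percent of the sample offered treatment.
print(true_saturation(0.25, 0.80))
```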
1.1 Deﬁning Treatment and Spillover Eﬀects
Let $Y_{ic}$ represent the outcome for individual $i$ in cluster $c$. In a general framework, outcomes
can depend in an arbitrary way on an individual’s own treatment status, as well as the
treatment status of all other individuals in the study:
$$Y_{ic} = g\left(T_{ic}, R_{ic}, \{T_{jd}, R_{jd}\}_{jd \neq ic}; X_{ic}, \varepsilon_{ic}\right)$$
where $R_{ic} \in \{0, 1\}$ indicates whether an individual complies with treatment, $X_{ic}$ is a vector
of covariates and $\varepsilon_{ic}$ is an error term.[11]
To use the RS design for causal inference requires an assumption on how the treatment
status of others impacts Yic . We relax the stable unit treatment value assumption (SUTVA)
within clusters, but maintain it across clusters: spillovers may ﬂow within a cluster, but do
not ﬂow between clusters. This ensures that pure control clusters provide a valid counter-
factual for treated clusters and that cross cluster comparisons can identify how spillovers
depend on the intensity of treatment saturation.
[9] For example, Banerjee et al. (2012) study interventions to improve performance among constables in
Rajasthan police stations. Sinclair, McConnell and Green (2012) study sending social-pressure mailings to
registered voters in a congressional district.
[10] For example, Gine and Mansuri (2012) sample every fourth household in a neighborhood, and randomly
offer treatment to 80 percent of these households. This causes the true treatment saturation to be 20 percent
rather than the assigned 80 percent. Other examples: unemployed individuals on official unemployment
registries form a small portion of all unemployed individuals in an administrative region (Crepon et al.
2013); neighborhoods eligible for infrastructure investments comprise only 3 percent of all neighborhoods
(Craig McIntosh, Tito Alegria, Gerardo Ordonez and Rene Zenteno 2013); and malaria prevention efforts
target vulnerable individuals, who account for a small share of total cluster population (GF Killeen, TA
Smith, HM Ferguson, H Mshinda, S Abdulla et al. 2007).
[11] As is standard, $R_{ic}$ is only observed for individuals with $T_{ic} = 1$.
Assumption 1. There is no cross-cluster interference in outcomes: $Y_{ic}$ is independent of
$\{T_{jd}, R_{jd}\}$ for all $d \neq c$.
Assumption 1 simplifies the framework so that outcomes only depend on the treatment of
other individuals in the same cluster,
$$Y_{ic} = g\left(T_{ic}, R_{ic}, \{T_{jc}, R_{jc}\}_{j \neq i}; X_{ic}, \varepsilon_{ic}\right).$$
Given Assumption 1, we can formally deﬁne several treatment and spillover eﬀect mea-
sures, both at speciﬁc saturations and pooled across multiple saturations. The Intention
to Treat (ITT) eﬀect is the diﬀerence between the expected outcome for individuals of-
fered treatment in a cluster with saturation π and the expected outcome for pure control
individuals,
$$ITT(\pi) := E(Y_{ic} \mid T_{ic} = 1, \pi_c = \pi) - E(Y_{ic} \mid T_{ic} = 0, \pi_c = 0).$$
The corresponding term for the Spillover on the Non-Treated (SNT) effect is the difference
between the expected outcome for individuals not offered treatment in a cluster with
saturation $\pi$ and the expected outcome for pure control individuals,
$$SNT(\pi) := E(Y_{ic} \mid T_{ic} = 0, \pi_c = \pi) - E(Y_{ic} \mid T_{ic} = 0, \pi_c = 0).$$
The Total Causal Effect (TCE) measures the overall cluster-level difference between
treated and pure control clusters,
$$TCE(\pi) := E(Y_{ic} \mid \pi_c = \pi) - E(Y_{ic} \mid \pi_c = 0) = \pi \cdot ITT(\pi) + (1 - \pi) \cdot SNT(\pi).$$
Individuals oﬀered treatment will experience two types of treatment eﬀects, a direct
treatment eﬀect from the program as well as a spillover eﬀect that arises from the treatment
of other individuals in their cluster. A natural way to formalize these two eﬀects is to
decompose the ITT into two components. The Treatment on the Uniquely Treated
(TUT) measures the ITT on a sole individual offered treatment within a cluster,[12]
$$TUT := E(Y_{ic} \mid T_{ic} = 1, \pi_c = 0) - E(Y_{ic} \mid T_{ic} = 0, \pi_c = 0) = ITT(0).$$
The Spillover on the Treated (ST) measures the saturation-dependent spillover effect on
[12] The saturation of a cluster includes all treated individuals in the cluster. When the size of a cluster is
finite, it is impossible to simultaneously have a treatment individual and a saturation of zero; technically,
$ITT(1/n)$ captures the isolated impact of treatment. We use $TUT = ITT(0)$ for notational simplicity.
individuals offered treatment,
$$ST(\pi) := E(Y_{ic} \mid T_{ic} = 1, \pi_c = \pi) - E(Y_{ic} \mid T_{ic} = 1, \pi_c = 0).$$
The ITT is the sum of these two components, $ITT(\pi) = TUT + ST(\pi)$.
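To make the saturation-specific definitions concrete, the following sketch computes the ITT, SNT, TCE and (when a TUT value is supplied) ST from three group means. The function name `estimands_at_saturation` and all numbers are hypothetical and are not taken from the Malawi application.

```python
def estimands_at_saturation(mean_treated, mean_untreated, mean_pure_control, pi, tut=None):
    """Saturation-specific effects from group means at saturation pi.

    mean_treated:      mean outcome of individuals offered treatment (T=1, pi_c=pi)
    mean_untreated:    mean outcome of within-cluster controls (T=0, pi_c=pi)
    mean_pure_control: mean outcome in pure control clusters (pi_c=0)
    tut:               optional value for TUT = ITT(0), needed to recover ST(pi)
    """
    itt = mean_treated - mean_pure_control
    snt = mean_untreated - mean_pure_control
    tce = pi * itt + (1 - pi) * snt
    st = None if tut is None else itt - tut
    return {"ITT": itt, "SNT": snt, "TCE": tce, "ST": st}

# Hypothetical means: 0.70 (treated), 0.55 (within-cluster control), 0.50 (pure control).
effects = estimands_at_saturation(0.70, 0.55, 0.50, pi=0.4, tut=0.25)
print(effects)
```

With these inputs a negative ST indicates that the effect on treated individuals shrinks as more of their cluster is treated.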
It is also possible to pool across saturations and estimate an average effect for the entire
experiment. Given a RS design $\omega$, define $ITT_\omega$ as the difference between the expected
outcome for individuals offered treatment in each saturation $\pi$, weighted by the share of
treated clusters with saturation $\pi$, and the expected outcome for pure control individuals,
$$ITT_\omega := \sum_{\pi \in \Pi \setminus \{0\}} E(Y_{ic} \mid T_{ic} = 1, \pi_c = \pi)\,\frac{f(\pi)}{1 - \psi} - E(Y_{ic} \mid T_{ic} = 0, \pi_c = 0) = \sum_{\pi \in \Pi \setminus \{0\}} ITT(\pi)\,\frac{f(\pi)}{1 - \psi},$$
with analogous definitions for $SNT_\omega$, $TCE_\omega$ and $ST_\omega$. This measure depends on the
distribution and support of saturations, and will vary across RS designs.[13]
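The pooled measure is a weighted average of the saturation-specific effects with weights $f(\pi)/(1 - \psi)$. A minimal sketch, with a hypothetical function name `pooled_effect` and made-up effect sizes:

```python
def pooled_effect(effect_by_saturation, f):
    """Pooled effect: sum over pi > 0 of effect(pi) * f(pi) / (1 - psi).

    effect_by_saturation: {pi: ITT(pi)} (or SNT, TCE, ST) for treated saturations
    f:                    {pi: share of clusters assigned saturation pi}
    """
    psi = f.get(0.0, 0.0)  # share of pure control clusters
    return sum(effect * f[pi] / (1 - psi)
               for pi, effect in effect_by_saturation.items() if pi > 0)

f = {0.0: 0.4, 0.25: 0.3, 0.75: 0.3}   # hypothetical design
itt = {0.25: 0.10, 0.75: 0.20}         # hypothetical ITT(pi) values
print(pooled_effect(itt, f))
```

Changing `f` while holding the saturation-specific effects fixed changes the pooled value, which is why the measure is indexed by the design.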
We can now formalize what we refer to as spillover effects. There are spillover effects on
the untreated (treated) if there exists a $\pi$ such that $SNT(\pi) \neq 0$ ($ST(\pi) \neq 0$). A sufficient
condition to test for the presence of spillovers is that the pooled effect satisfies $SNT_\omega \neq 0$ or $ST_\omega \neq 0$.
1.2 Consistent Estimates of Treatment and Spillover Eﬀects
Next, we establish that a RS design yields consistent estimates of treatment and spillover
eﬀects, both at individual saturations and pooled across multiple saturations. Suﬃcient
conditions for consistency are a design with a pure control and an interior saturation, and
no interference between clusters.
Result 1. Assume Assumption 1 and let $\omega$ be a non-trivial randomized saturation design
with a pure control. Then $\omega$ generates unbiased, consistent estimators for $ITT(\pi)$, $SNT(\pi)$
and $TCE(\pi)$ at each $\pi \in \Pi$.
In order to estimate the pooled effects described in Section 1.1, we must introduce weights.
Pooling the data unintentionally places a disproportionate weight on treated
individuals in high saturation clusters and untreated individuals in low saturation clusters.
Saturation weights correct for this distortion.[14]
[13] We make this dependence explicit by indexing the pooled measure with $\omega$; this index is suppressed at
times for expositional simplicity.
[14] One could define many different pooled effects, including the pooled effect that results from using
Definition 2. Saturation weights apply weight $s^T_\pi = 1/\pi$ to treated individuals and weight
$s^U_\pi = 1/(1 - \pi)$ to untreated individuals in treated clusters.
For example, a cluster with $\pi = 2/3$ has twice as many treated individuals as a cluster
with $\pi = 1/3$. Weighting the treated individuals by $s^T_{2/3} = 3/2$ and $s^T_{1/3} = 3$ allows one to
calculate a pooled estimate that places equal weight on both clusters, rather than twice as
much weight on the $\pi = 2/3$ clusters.
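The example with $\pi = 1/3$ and $\pi = 2/3$ can be reproduced numerically. The sketch below assumes two clusters of six individuals and made-up outcomes for the treated; `saturation_weight` and `weighted_mean` are hypothetical helper names.

```python
def saturation_weight(treated, pi):
    """1/pi for treated, 1/(1 - pi) for untreated individuals in treated clusters."""
    return 1 / pi if treated else 1 / (1 - pi)

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical treated individuals as (outcome, saturation) pairs:
# 2 treated in the pi = 1/3 cluster, 4 treated in the pi = 2/3 cluster.
treated = [(1.0, 1/3)] * 2 + [(2.0, 2/3)] * 4
outcomes = [y for y, _ in treated]
weights = [saturation_weight(True, pi) for _, pi in treated]

unweighted = sum(outcomes) / len(outcomes)   # tilts toward the pi = 2/3 cluster
weighted = weighted_mean(outcomes, weights)  # equal weight on each cluster
print(unweighted, weighted)
```

The unweighted mean sits closer to the high-saturation cluster's outcome, while the weighted mean is the simple average of the two cluster means.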
Result 2. Assume Assumption 1 and let $\omega$ be a non-trivial randomized saturation design with
a pure control. Then using saturation weights, $\omega$ generates unbiased, consistent estimators
for $ITT_\omega$ and $SNT_\omega$, and without saturation weights, $TCE_\omega$.
We need an additional condition on the RS design to obtain a consistent estimate of the
TUT and ST. It is possible to estimate the TUT by either including clusters with very low
saturations, or imposing a functional form on $ITT(\pi)$ and deriving $TUT = ITT(0)$ from
estimates at other saturations.
Result 3. Assume Assumption 1 and let $\omega$ be a non-trivial randomized saturation design
with a pure control. If $\widehat{TUT}$ is unbiased and consistent, then $\widehat{ST}(\pi) = \widehat{ITT}(\pi) - \widehat{TUT}$ and
$\widehat{ST}_\omega = \widehat{ITT}_\omega - \widehat{TUT}$ are unbiased, consistent estimators.
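One way to implement the functional-form route to the TUT is to assume that $ITT(\pi)$ is linear in $\pi$, fit a line through the saturation-specific estimates, and read off the intercept. Linearity is an assumption made here purely for illustration; the function name `extrapolate_tut` and the estimates are hypothetical.

```python
def extrapolate_tut(saturations, itt_estimates):
    """OLS fit of ITT(pi) = a + b * pi; under linearity the intercept a is ITT(0) = TUT."""
    n = len(saturations)
    mean_pi = sum(saturations) / n
    mean_itt = sum(itt_estimates) / n
    sxy = sum((p - mean_pi) * (y - mean_itt) for p, y in zip(saturations, itt_estimates))
    sxx = sum((p - mean_pi) ** 2 for p in saturations)
    slope = sxy / sxx
    intercept = mean_itt - slope * mean_pi
    return intercept, slope

saturations = [0.25, 0.50, 0.75]
itt_hat = [0.20, 0.15, 0.10]           # hypothetical ITT(pi) estimates
tut_hat, slope = extrapolate_tut(saturations, itt_hat)
# Spillover on the treated at each saturation: ST(pi) = ITT(pi) - TUT.
st_hat = [y - tut_hat for y in itt_hat]
print(tut_hat, slope, st_hat)
```

The validity of the resulting TUT and ST estimates rests entirely on the assumed functional form, so checking the fit against alternatives is advisable when enough saturations are available.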
1.3 Calculating Variances: Stratified Interference and Random Effects
Estimating the variance of treatment and spillover eﬀects requires an assumption on the
nature of interference between units and the variance of the data generating process. Within
a cluster, we observe a single realization of the many potential conﬁgurations of individual
treatment assignment at a given saturation.[15] We follow Eric J. Tchetgen and Tyler VanderWeele
(2010) in using the ‘Stratified Interference’ assumption proposed by Michael Hudgens
and Elizabeth Halloran (2008). This assumption says that the outcome of an individual is
independent of the identity of the other individuals assigned to treatment.
Assumption 2. Fixing $\{\pi_c, T_{ic}, R_{ic}, X_{ic}, \varepsilon_{ic}\}$, $Y_{ic} = y$ for any permutation of the treatment
status of individuals $j \neq i$.
[14, continued] unweighted data. The definition we propose has two advantages: (i) it is comparable across treatment
and within-cluster controls, in that the pooled ITT and SNT give the same weight $f(\pi)/(1 - \psi)$ to each
saturation-specific effect $ITT(\pi)$ or $SNT(\pi)$, and (ii) it facilitates an easy test for the shape of the effect
(linearity, convexity, etc.) by comparing the pooled ITT to the ITT at the expected saturation.
[15] This is not an issue with non-interference, as each unit has only two potential outcomes.
This assumption signiﬁcantly simpliﬁes the analysis and allows inference without possessing
information about the underlying network structure within a cluster.[16]
Second, we parameterize the nature of interference within clusters with a random effects
error structure.
Assumption 3. The data generating process has a random effects error structure, with
$\varepsilon_{ic} = v_c + w_{ic}$, common cluster component $v_c \sim (0, \tau^2)$, individual component $w_{ic} \sim (0, \sigma^2)$
and $(v_c, w_{ic})$ orthogonal to $(\pi_c, T_{ic}, R_{ic}, X_{ic})$.
A random eﬀects framework combined with a RS design decomposes the clustering of out-
comes into two components: (i) the extent to which outcomes are endogenously driven by
treatment of others in the same cluster, and (ii) the statistical random eﬀect in outcomes,
which reduces the power of the clustered estimates but does not imply interference between
units.
Remark 2. This approach mirrors regression techniques typically used to analyze economic
and medical experiments, and enables a direct comparison of the power of RS designs to
the power of the canonical blocked and clustered designs, making explicit the impact that
randomizing saturations has on power. It diﬀers from the approach taken by the recent
statistics literature (Hudgens and Halloran 2008), as well as in the paper most similar to ours
(Sinclair, McConnell and Green 2012), both of which use randomization inference techniques
(Ronald A. Fisher 1935).
Given Assumptions 1, 2 and 3, we can express Yic as:
$$Y_{ic} = g(T_{ic}, R_{ic}, \pi_c; X_{ic}) + v_c + w_{ic}.$$
The random eﬀects assumption provides the additional structure needed to characterize the
relationship between the RS design, the data generating process and the Minimum Detectable
Eﬀect (MDE), the smallest treatment or spillover eﬀect that it is possible to distinguish from
zero (Howard S. Bloom 1995). Suppose that the true eﬀect is nonzero for some treatment
or spillover eﬀect β . Given statistical signiﬁcance level α, the null hypothesis that β = 0 is
rejected with probability γ (the power) for values of β that exceed:
$$MDE = [t_{1-\gamma} + t_\alpha] \cdot SE(\hat{\beta}).$$
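As a rough illustration of this expression, the sketch below computes the multiplier $t_{1-\gamma} + t_\alpha$ and the implied MDE. It assumes a two-sided test and replaces t critical values with standard normal quantiles (reasonable when the number of clusters is large); both choices, and the function names, are assumptions made for illustration rather than anything specified in the paper.

```python
from statistics import NormalDist


def mde_multiplier(alpha: float = 0.05, power: float = 0.80) -> float:
    """Return t_{1-gamma} + t_alpha using standard normal quantiles.

    Assumptions (not from the paper): two-sided test, normal
    approximation to the t distribution.
    """
    z = NormalDist()
    t_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    t_power = z.inv_cdf(power)          # quantile that delivers the target power
    return t_alpha + t_power


def mde(se_beta: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Minimum detectable effect for an estimator with standard error se_beta."""
    return mde_multiplier(alpha, power) * se_beta
```

With a 5% two-sided test and 80% power the multiplier is roughly 2.8, so an estimator with a standard error of 0.1 can detect effects of about 0.28 or larger.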
16
In the absence of this assumption, a researcher would need to observe the complete network structure
in each cluster, understand the heterogeneity in networks across clusters, and use a model of network-driven
spillovers to simulate the variance in outcomes that could be generated by these networks.
In the next two subsections, we characterize the MDE for the treatment and spillover eﬀect
measures deﬁned in Section 1.1, show how the MDE depends on the structure of the RS
design, establish properties of the optimal RS design to measure each eﬀect (the design that
yields the smallest MDE), and illustrate the trade-oﬀ between measuring pooled and slope
eﬀects.
1.3.1 The Minimum Detectable Pooled Eﬀect
A simple regression-based estimator of the pooled eﬀects is:
Yic = β0 + β1 Tic + β2 Sic + φXic + εic (1)
For any non-trivial RS design with a pure control, this model identiﬁes the pooled treatment
eﬀect and the pooled spillover eﬀect on untreated individuals, but not the pooled spillover
effect on treated individuals. The coefficients depend on the empirical distribution of saturations; given design ω, equation 1 with saturation weights returns $\widehat{\overline{ITT}}_\omega = \hat{\beta}_1$ and $\widehat{\overline{SNT}}_\omega = \hat{\beta}_2$, and equation 1 without saturation weights returns $\widehat{\overline{TCE}}_\omega = (\mu/(1-\psi))\hat{\beta}_1 + ((1-\mu-\psi)/(1-\psi))\hat{\beta}_2$.
The following theorem characterizes the MDE of the pooled ITT and SNT.
Theorem 1. Assume Assumptions 1, 2 and 3 and let ω be a non-trivial randomized saturation design with a pure control. Then, given statistical significance level α and power γ, the MDE of $\overline{ITT}_\omega$ is:
$$MDE^T_\omega = (t_{1-\gamma} + t_\alpha)\sqrt{\frac{1}{nC}\left\{(n-1)\tau^2\left[\frac{1}{(1-\psi)\psi} + \frac{1-\psi}{\mu^2}\eta_T^2\right] + (\tau^2+\sigma^2)\frac{\psi+\mu}{\mu\psi}\right\}}$$
The MDE of $\overline{SNT}_\omega$ ($MDE^S_\omega$) is similar, substituting $\mu_S$ for $\mu$.
The MDE depends on the size of the treatment and control group, and the within-cluster variation in treatment status, $\eta_T^2$. This expression illustrates the relationship between the
random eﬀects structure and the RS design. The ﬁrst term in the brackets captures the
variation in β due to the common cluster component of the error term, and the second
term captures the variation in β due to individual variation. Introducing randomization into
the treatment saturation of clusters results in a power loss when there is a common cluster
component to the error. Otherwise, if τ 2 = 0, the standard error only depends on the size of
the treatment and control groups, but is independent of how treatment is distributed across
clusters.
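The role of the two terms in Theorem 1 can be seen by evaluating the expression directly. The sketch below is a hypothetical helper, not the authors' program: it reads the bracketed expression as a variance (so the MDE takes its square root), and uses a default multiplier of 2.80, which assumes a two-sided 5% test with 80% power under a normal approximation.

```python
import math


def pooled_mde(n, C, psi, mu, eta2_T, tau2, sigma2, multiplier=2.80):
    """Pooled MDE of the ITT following the expression in Theorem 1.

    n, C      : individuals per cluster, number of clusters
    psi, mu   : share of pure controls, share of treated individuals
    eta2_T    : the saturation-variation term eta_T^2 from the theorem
    tau2      : variance of the common cluster error component
    sigma2    : variance of the individual error component
    """
    # Variation in the estimate due to the common cluster component.
    cluster_term = (n - 1) * tau2 * (
        1 / ((1 - psi) * psi) + (1 - psi) / mu**2 * eta2_T
    )
    # Variation in the estimate due to individual-level noise.
    individual_term = (tau2 + sigma2) * (psi + mu) / (mu * psi)
    return multiplier * math.sqrt((cluster_term + individual_term) / (n * C))


# With tau2 = 0 the MDE does not depend on eta2_T; with tau2 > 0 it rises with it.
no_icc = [pooled_mde(20, 100, 0.4, 0.3, e, 0.0, 1.0) for e in (0.0, 0.05)]
with_icc = [pooled_mde(20, 100, 0.4, 0.3, e, 0.2, 0.8) for e in (0.0, 0.05)]
```

For the spillover effect on the untreated, the same function can be called with the within-cluster control share in place of `mu`.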
Figure 1. Partial Population Design
Suﬃcient tests for the presence of treatment eﬀects and spillover eﬀects on the untreated
are IT T ω = 0 and SN T ω = 0. The following set of Corollaries derive the optimal RS
design to test for these eﬀects. Consider the partial population design in which a cluster is
treated with probability 1 − ψ , and treated clusters all have the same treatment saturation
P . This design minimizes the variation in treatment saturation, and therefore, the MDE for
treatment and spillover eﬀects.
Corollary 1. Let Ω be the set of non-trivial RS designs with a pure control and suppose $\tau^2 > 0$. Then, fixing µ and ψ, the design with $\Pi = \{0, P = \mu/(1-\psi)\}$ and $f = \{\psi, 1-\psi\}$ (a partial population design) jointly minimizes $MDE^T_\omega$ and $MDE^S_\omega$.
The optimality of a partial population design stems from a positive ICC.
Choosing the optimal treatment saturation P involves a trade-oﬀ. The power of the
pooled ITT increases with P , while the power of the pooled SNT decreases with P . The
relative importance of detecting these two eﬀects, as well as their expected magnitudes, will
determine the optimal P .
Corollary 2. Let Ω be the set of non-trivial RS designs with a pure control and suppose $\tau^2 > 0$. Then, fixing ψ, a partial population experiment with $P = 1/2$ minimizes $MDE^T_\omega + MDE^S_\omega$. In this design, $MDE^T_{PP} = MDE^S_{PP}$.
The optimal size of the control group depends on the relative magnitude of the common
cluster component of error to the individual component of error.
Corollary 3. Let Ω be the set of non-trivial RS designs with a pure control. The size of the control group that minimizes $MDE^T_\omega + MDE^S_\omega$ depends on $\tau^2$, $\sigma^2$ and $n$:
1. If $\tau^2 = 0$, then $\psi^* = \sqrt{2} - 1 \approx 0.41$.
2. If $\sigma^2 = 0$, then $\psi^* = \sqrt{n(1+n)} - n$, which converges to 1/2.
3. If $\tau^2 > 0$ and $\sigma^2 > 0$, then $\psi^* \in (\sqrt{2} - 1, \sqrt{n(1+n)} - n)$.
The optimal size of the control therefore lies in a relatively narrow range. Designating about
40% of individuals as pure controls yields the smallest sum of standard errors when there
is no common cluster component to the error, while designating close to 50% is preferable
when there is no individual component to error. It is always optimal to have the control be
more than a third because it serves as the counterfactual for both treatment and spillover
groups. As τ 2 increases, the optimal number of control clusters increases. This comparative
static arises because the variance in $\hat{\beta}$ due to individual error is inversely proportional to the total number of individuals in each treatment group, while the variance in $\hat{\beta}$ due to correlated error is inversely proportional to the total number of clusters in each treatment group.
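The three cases in Corollary 3 can be checked numerically. The sketch below assumes a partial population design with P = 1/2 (so the treated and within-cluster control shares both equal (1 − ψ)/2 and the saturation-variation term is zero), reads the Theorem 1 expression as a variance under a square root, and grid-searches over ψ. The function names and the grid search are illustrative choices, not the authors' procedure.

```python
import math


def sum_mde_pp(psi, n, tau2, sigma2):
    """MDE^T + MDE^S in a partial population design with P = 1/2.

    Constants that do not depend on psi (the t multiplier and 1/(nC))
    are dropped because they do not change the minimizer.
    """
    mu = (1 - psi) / 2  # equal treated and within-cluster control shares
    var = (n - 1) * tau2 / ((1 - psi) * psi) \
        + (tau2 + sigma2) * (psi + mu) / (mu * psi)
    return 2 * math.sqrt(var)  # the two MDEs coincide when P = 1/2


def optimal_psi(n, tau2, sigma2, grid=20_000):
    """Grid search for the control share that minimizes the summed MDE."""
    candidates = (k / grid for k in range(1, grid))
    return min(candidates, key=lambda psi: sum_mde_pp(psi, n, tau2, sigma2))


only_individual = optimal_psi(n=20, tau2=0.0, sigma2=1.0)  # case 1
only_cluster = optimal_psi(n=20, tau2=1.0, sigma2=0.0)     # case 2
mixed = optimal_psi(n=20, tau2=0.2, sigma2=0.8)            # case 3
```

Under these assumptions the grid search reproduces the closed forms in cases 1 and 2, and the mixed case falls between them.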
Moving away from the partial population design to a design with variation in the treat-
ment saturation leads to a power loss in the ability to measure pooled eﬀects. Corollary 4
characterizes the rate at which this power loss occurs.
Corollary 4. Fix µ and ψ. Then $Var(\beta)$ increases linearly with respect to $\eta_T^2$.
Taken together, these corollaries provide important insights on experimental design. If
the researcher is only interested in detecting treatment eﬀects and spillover eﬀects on the
untreated, then a partial population experiment has the smallest MDE, and Corollary 3
speciﬁes the optimal control group size. However, partial population designs have the draw-
back that they only measure eﬀects at a single saturation. When researchers care about
the eﬀects at multiple saturations, they will need to introduce variation in the treatment
saturation. Corollary 4 establishes the rate at which the power of the pooled eﬀects declines
from this increase in treatment saturation variance.
1.3.2 The Minimum Detectable Slope Eﬀect
Now suppose that a researcher would like to determine how treatment and spillover effects vary with treatment intensity, or to measure spillover effects on the treated. This section
presents two methods to estimate these measures: (1) a non-parametric model that estimates
an individual treatment and spillover eﬀect at each non-zero saturation; and (2) a linearized
model that estimates the ﬁrst order eﬀect that changing the treatment saturation has on
treatment and spillover eﬀects. Identiﬁcation of these models requires a RS design with
multiple interior treatment saturations and a pure control.
The Minimum Detectable Slope Eﬀect (MDSE) is the smallest rate of change δ in the
effect, with respect to π, that it is possible to distinguish from zero. Suppose that the true slope is nonzero. Given statistical significance level α, the null hypothesis that the effect is constant, δ = 0, is rejected with probability γ for values of δ that exceed:
$$MDSE = [t_{1-\gamma} + t_\alpha] \cdot SE(\hat{\delta}).$$
A Non-Parametric Model: A regression based estimator for the treatment and spillover
eﬀect at each saturation can be obtained through:
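$$Y_{ic} = \beta_0 + \sum_{\pi \in \Pi\setminus\{0\}} \beta_{1\pi} T_{ic} \cdot 1\{\pi_c = \pi\} + \sum_{\pi \in \Pi\setminus\{0\}} \beta_{2\pi} S_{ic} \cdot 1\{\pi_c = \pi\} + \phi \cdot X_{ic} + \varepsilon_{ic}, \quad (2)$$
which returns $\widehat{ITT}(\pi) = \hat{\beta}_{1\pi}$, $\widehat{SNT}(\pi) = \hat{\beta}_{2\pi}$ and $\widehat{TCE}(\pi) = \pi\hat{\beta}_{1\pi} + (1-\pi)\hat{\beta}_{2\pi}$ for each $\pi \in \Pi\setminus\{0\}$.17 The support of the RS design determines which saturation-specific estimates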
are identiﬁed, but unlike equation 1, the deﬁnition of the coeﬃcients is independent of the
empirical distribution of saturations f (π ). This model introduces the possibility to test for
the presence of spillover eﬀects on treated individuals. A hypothesis test of β1πj = β1πk
determines whether the ITT varies with the treatment saturation. By deﬁnition, β1πk −
β1πj = ST (πk ) − ST (πj ), so this hypothesis also tests for the presence of spillover eﬀects on
treated individuals. Similarly, β2πj = β2πk tests whether the SNT varies with the treatment
saturation.
We can also use equation 2 to estimate the change in spillover eﬀects between saturations.
Given saturations $\pi_j$ and $\pi_k$, the rate of change of the spillover effect on treated individuals is $\delta^T_{jk} = (\beta_{1\pi_k} - \beta_{1\pi_j})/(\pi_k - \pi_j)$, with an analogous definition for the within-cluster controls. If spillover effects are affine, then this is a measure of the slope of the spillover effect, $dITT(\pi)/d\pi$ or $dST(\pi)/d\pi$; in the case of a non-linear spillover effect, one can view $\delta^T_{jk}$ as a first order approximation of the slope.
Similar to Theorem 1, we can characterize the MDSE of the ITT and SNT between any pair of saturations $\pi_j, \pi_k \in \Pi$, which is proportional to $SE(\hat{\delta}^T_{jk})$ or $SE(\hat{\delta}^S_{jk})$.18
Theorem 2. Assume Assumptions 1, 2 and 3 and let ω be a randomized saturation design with κ ≥ 2 interior saturations. Then, given statistical significance level α and power γ, the MDSE between saturations $\pi_j$ and $\pi_k$ for the treated group is:
$$MDSE^T_\omega(\pi_j, \pi_k) = \frac{t_{1-\gamma} + t_\alpha}{\pi_k - \pi_j}\sqrt{\frac{1}{nC}\left\{(n-1)\tau^2\left[\frac{1}{f(\pi_j)} + \frac{1}{f(\pi_k)}\right] + (\tau^2+\sigma^2)\left[\frac{1}{\mu_j} + \frac{1}{\mu_k}\right]\right\}}$$
where $\mu_k := \pi_k f(\pi_k)$. A similar expression characterizes the MDSE for the within-cluster control group as $MDSE^S_\omega$, substituting $\mu^S_k := (1-\pi_k) f(\pi_k)$ for $\mu_k$.
17
No saturation weights are necessary to estimate individual saturation effects.
18
Recall the MDSE of the ITT and ST are equivalent, by definition. It is also possible to calculate the MDE of $ITT(\pi)$ and $SNT(\pi)$ for each saturation π; this result is similar to the pooled MDE and is presented in the Appendix.
As the distance between two saturations increases, it is possible to detect smaller slope
eﬀects. At the same time, increasing the spread of saturations has a countervailing eﬀect by
making the number of treatment (within-cluster control) individuals very small at low (high)
saturations. The latter eﬀect dominates at saturations close to zero or one. When the cluster
component of error is large, the share of clusters assigned to each saturation, f (πj ), plays a
larger role in determining the MDSE - a more equal distribution leads to a smaller MDSE.
When the individual component of error is large, the share of treated and control individuals
assigned to each saturation, µj , is more important. Note that while a pure control is required
to identify treatment and spillover eﬀects at each saturation in equation 2, it is not required
to identify the slope eﬀects.
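The trade-off between spreading saturations apart and thinning out the treated or control groups can be illustrated by evaluating the Theorem 2 expression at a few pairs of saturations. The helper below is a hypothetical sketch: it reads the bracketed expression as a variance under a square root and uses a default multiplier of 2.80 (two-sided 5% test, 80% power, normal approximation), all of which are assumptions made for illustration.

```python
import math


def mdse(pi_j, pi_k, f_j, f_k, n, C, tau2, sigma2, treated=True, multiplier=2.80):
    """MDSE between saturations pi_j < pi_k following Theorem 2.

    f_j, f_k : share of clusters assigned to each saturation
    treated  : True for the treated group, False for within-cluster controls
    """
    # Share of the sample in the relevant group at each saturation:
    # pi * f(pi) for the treated, (1 - pi) * f(pi) for within-cluster controls.
    if treated:
        mu_j, mu_k = pi_j * f_j, pi_k * f_k
    else:
        mu_j, mu_k = (1 - pi_j) * f_j, (1 - pi_k) * f_k
    var = ((n - 1) * tau2 * (1 / f_j + 1 / f_k)
           + (tau2 + sigma2) * (1 / mu_j + 1 / mu_k)) / (n * C)
    return multiplier / (pi_k - pi_j) * math.sqrt(var)


design = dict(f_j=0.25, f_k=0.25, n=20, C=100, tau2=0.1, sigma2=0.9)
narrow = mdse(0.40, 0.60, **design)   # saturations close together
medium = mdse(0.20, 0.80, **design)   # wider spread
extreme = mdse(0.02, 0.98, **design)  # very few treated at the low saturation
```

In this illustrative design the medium spread has the smallest MDSE: widening the gap helps at first, but pushing the lower saturation toward zero leaves too few treated individuals there.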
There are two steps to the design choice for the non-parametric model: selecting which
saturations to use (the support of Π), and deciding how to allocate individuals into each
saturation bin (the distribution f (π )). A researcher can either ﬁx a hypothesized slope size
and determine how far apart saturations must be to detect this slope, or ﬁx the distance
between two saturations and calculate the smallest detectable slope size. Although a partial
population design with a saturation of π = 1/2 is optimal for detecting pooled eﬀects, this
design does not identify slope eﬀects. Moving away from the partial population design to
a design with two interior saturations, Corollary 5 determines how we should assign the
saturations.
Corollary 5. Let Ω be the set of RS designs with at least two interior saturations. Then, fixing $f(\pi_j) = f(\pi_k)$, the saturations $(\pi_j^*, \pi_k^*)$ that minimize $MDSE^T_\omega(\pi_j, \pi_k) + MDSE^S_\omega(\pi_j, \pi_k)$ are symmetric about 1/2. The optimal distance $\Delta^* = \pi_k^* - \pi_j^*$ depends on $\tau^2$, $\sigma^2$ and $n$:
1. If $\tau^2 = 0$, then $\Delta^* = \sqrt{2}/2 \approx 0.71$.
2. If $\tau^2 > 0$, then $\Delta^* \in (\sqrt{2}/2, 1)$ and $\lim_{n\to\infty} \Delta^* = 1$.
3. $\Delta^*$ is increasing in $\tau^2$ and $n$, and decreasing in $\sigma^2$.
Therefore, $\pi_j^* = (1-\Delta^*)/2$ and $\pi_k^* = (1+\Delta^*)/2$.
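The optimal spacing in Corollary 5 can also be found by a simple grid search. The sketch below restricts attention to saturations symmetric about 1/2 with equal cluster shares, drops constants that do not affect the minimizer, and reads the Theorem 2 expression as a variance under a square root; the function names and the cluster share of 0.25 are assumptions for illustration.

```python
import math


def sum_mdse(delta, n, tau2, sigma2, f=0.25):
    """MDSE^T + MDSE^S for saturations (1 - delta)/2 and (1 + delta)/2."""
    pi_j, pi_k = (1 - delta) / 2, (1 + delta) / 2
    cluster = (n - 1) * tau2 * (2 / f)  # equal cluster shares at both saturations

    def se(share_j, share_k):
        individual = (tau2 + sigma2) * (1 / (share_j * f) + 1 / (share_k * f))
        return math.sqrt(cluster + individual)

    # Treated shares are pi, within-cluster control shares are 1 - pi.
    return (se(pi_j, pi_k) + se(1 - pi_j, 1 - pi_k)) / delta


def optimal_delta(n, tau2, sigma2, grid=20_000):
    """Grid search over the distance between the two saturations."""
    candidates = (k / grid for k in range(1, grid))
    return min(candidates, key=lambda d: sum_mdse(d, n, tau2, sigma2))


no_cluster_error = optimal_delta(n=20, tau2=0.0, sigma2=1.0)
small_clusters = optimal_delta(n=20, tau2=0.2, sigma2=0.8)
large_clusters = optimal_delta(n=200, tau2=0.2, sigma2=0.8)
```

Under these assumptions the search returns a distance close to $\sqrt{2}/2$ when there is no cluster-level error, and a wider spacing once $\tau^2 > 0$, growing with cluster size.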
Although Theorem 2 is generally too intractable to yield broader analytical insights about
optimal design questions, it is possible to numerically calculate the MDSE for designs with
more than two saturations. Given κ saturations, a researcher could use Theorem 2 to answer
questions like (i) ﬁxing equal sized bins f (π1 ) = ... = f (πκ ), what is the optimal spacing of
saturations; or (ii) ﬁxing equally spaced saturations π1 , ..., πκ , what share of clusters should
be assigned to each bin? This model also allows for hypothesis tests on the shape of the
IT T (π ) and SN T (π ). For example, a test of concavity requires three interior saturations.
It is possible to use the expression for the M DE (π ) to calculate the optimal control
group size numerically, given an estimate for τ 2 and σ 2 .19 Similar to the pooled model, the
optimal size of the control group will be smaller in the presence of only individual error
than in the presence of only cluster-level error, and will lie in between for intermediate
error distributions. The optimal control will be smaller than the size of any treatment
saturation ψ ∗ < f (π ), but will be larger than any treatment or within-cluster control group,
ψ ∗ > max{πf (π ), (1 − π )f (π )}.
An Aﬃne Model: It is also possible to measure slope eﬀects by imposing a functional
form on the shape of the IT T (π ) and SN T (π ). For example, we could use an aﬃne model
to estimate the ﬁrst order slope eﬀect:
Yic = δ0 + δ1 Tic + δ2 Sic + δ3 (Tic ∗ πc ) + δ4 (Sic ∗ πc ) + φ · Xic + εic (3)
This regression identifies the TUT as the intercept of the treatment effect, $\widehat{TUT} = \hat{\delta}_1$. The coefficients $\delta_3$ and $\delta_4$ are slope terms estimating how effects change with the saturation, $d\widehat{ST}(\pi)/d\pi = \hat{\delta}_3$ and $d\widehat{SNT}(\pi)/d\pi = \hat{\delta}_4$. The intercept $\delta_2$ estimates spillover effects at
saturation zero. There should be no spillover eﬀect on untreated individuals if the saturation
of treatment is zero (SN T (0) = 0 by deﬁnition), so δ2 = 0 serves as a hypothesis test for the
linearity of the spillover relationship. A test for dST /dπ = dSN T /dπ is given by an F-test
of the hypothesis that δ3 = δ4 .
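For reference, the effects implied by equation 3 can be written out explicitly. This is only a restatement under the affine specification, using the definitions $TUT = ITT(0)$ and $ST(\pi) = ITT(\pi) - TUT$ from earlier in the section; it is not an additional result.

```latex
\begin{align*}
ITT(\pi) &= \delta_1 + \delta_3 \pi, & SNT(\pi) &= \delta_2 + \delta_4 \pi, \\
TUT &= ITT(0) = \delta_1, & ST(\pi) &= ITT(\pi) - TUT = \delta_3 \pi .
\end{align*}
```

Written this way, the restriction $SNT(0) = 0$ corresponds to $\delta_2 = 0$, and equal slopes for treated and untreated individuals correspond to $\delta_3 = \delta_4$.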
Similar to Theorem 2, identification of equation 3 requires a RS design with two interior saturations and a pure control. We present an analogous result to Theorem 2 in the Appendix, which characterizes the analytical expression for the MDSE, proportional to $SE(\hat{\delta}_3)$ and $SE(\hat{\delta}_4)$.
It is also possible to test for linearity, or identify non-linear relationships with a similar regression to equation 3. For example, including a squared term $T_{ic} * \pi_c^2$ would identify a quadratic relationship. In simulations, the affine MDSE is smaller than the non-parametric MDSE for detecting these higher moments. Another advantage of the affine model is that it can be estimated with data from a RS design in which saturations are assigned from a continuum.20
19
The expression for $MDE(\pi)$ is in the proof of Theorem 2 in the Appendix.
Figure 2. Trade-off between Pooled MDE and MDSE
The optimal RS design for a pooled analysis stands in sharp contrast to that for a slope
analysis, most obviously in the extent of variation in treatment saturation. A graphical
representation of the tradeoﬀ between detecting pooled and slope eﬀects is presented in
Figure 2. The optimal RS design to identify both slope and pooled eﬀects will depend on
the relative importance that the researcher places on each eﬀect, as well as the expected
size of each eﬀect. To facilitate actual implementation of an RS experiment, we created a
Matlab program to calculate the minimum detectable eﬀects in the pooled, non-parametric
and aﬃne models for diﬀerent designs. The researcher speciﬁes the relative importance of
measuring (i) pooled versus slope eﬀects and (ii) treatment versus spillover on the untreated
eﬀects. The program then calculates the optimal support of the RS design, Π, and the
optimal allocation of clusters to each saturation bin, f (π ).
1.4 Estimating the Treatment on the Compliers Eﬀect
This section returns to the general framework of Section 1.2, and derives the Treatment
on the Compliers (TOC) effect in a model with spillovers.21 The TOC is the difference between the expected outcome for individuals who comply with treatment and the expected outcome for pure control individuals who would have complied with treatment,
$$TOC(\pi) = E(Y_{ic} \mid T_{ic} = 1, R_{ic} = 1, \pi_c = \pi) - E(Y_{ic} \mid T_{ic} = 0, R_{ic} = 1, \pi_c = 0).$$
A similar expression defines the pooled effect $\overline{TOC}$.
20
This design is necessary when using a Chow test to identify threshold effects.
21
This is more commonly known as the Treatment on the Treated (TOT) effect. Throughout this paper, we use the term ‘treated’ to refer to the group offered treatment; therefore, to avoid confusion, we refer to the impact on those actually receiving treatment as the Treatment on the Compliers effect.
In a model with spillovers, the non-compliers in a treatment cluster may be aﬀected
by the treatment of compliers, and don’t necessarily have the same expected outcome as
non-compliers in a control cluster. Deﬁne the Spillover on the Non-Compliers (SNC)
as:
SN C (π ) = E (Yic | Tic = 1, Ric = 0, πc = π ) − E (Yic | Tic = 0, Ric = 0, πc = 0),
a spillover term which is conceptually similar to the SNT. Combining these expressions, the
T OC (π ) can be expressed as the diﬀerence between the IT T (π ) and SN C (π ), weighted by
the compliance rate r(π ):22
$$TOC(\pi) = \frac{ITT(\pi) - (1 - r(\pi))\,SNC(\pi)}{r(\pi)}.$$
These expressions have no empirical counterpart because compliance in the control is
not observed, and interference between units invalidates the usual strategy of estimating
the TOC from the ITT. With no interference, SN C (π ) = 0, and the standard approach of
instrumenting for compliance with being offered the treatment produces a valid estimate of the $TOC(\pi)$. With interference, we need an estimate of $SNC(\pi)$ to estimate the $TOC(\pi)$.
An alternative way forward is to assume that spillovers on within-cluster non-compliers
are similar to spillovers on within-cluster controls, which are empirically identiﬁable.23
Assumption 4. SN C (π ) = SN T (π )
This assumption eﬀectively replaces the IV estimator’s assumption that SN C (π ) = 0 with
an estimate of the spillover eﬀect on untreated individuals, and allows us to recover an
estimate of the T OC (π ).24
22
If compliance varies across the saturation distribution, then changes in IT T (π ) will be driven by this as
well as changes in the underlying T OC (π ) and SN C (π ). Indeed, in some cases, such as adoption of a new
technology, the most important saturation-driven heterogeneity may come from variation in uptake across
the saturation distribution.
23
Unlike many extant partial population experiments in which the within-cluster controls are ineligible
for the treatment, in a RS design the within-cluster controls come from the same population as the treatment
sample, so this assumption may be more warranted.
24
Crepon et al. (2013) estimate the treatment on the treated eﬀect by assuming that the externality on
an untreated worker is independent of his treatment status, which is equivalent to Assumption 4.
Result 4. Assume Assumptions 1 and 4. A non-trivial randomized saturation design with a pure control yields a consistent estimate of the TOC at saturation π,
$$\widehat{TOC}(\pi) = \frac{\widehat{ITT}(\pi) - (1 - \hat{r}(\pi))\,\widehat{SNT}(\pi)}{\hat{r}(\pi)}$$
where $\hat{r}(\pi) = \sum_{i,c} 1\{T_{ic} = 1, R_{ic} = 1, \pi_c = \pi\} / \sum_{i,c} 1\{T_{ic} = 1, \pi_c = \pi\}$ is a consistent estimate of the compliance rate at saturation π.
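A minimal sketch of the Result 4 estimator follows, taking the saturation-specific ITT and SNT estimates as given. The function name, argument names and example numbers are hypothetical; the only substantive assumption is Assumption 4, which lets the SNT stand in for the spillover on non-compliers.

```python
def toc_hat(itt_hat, snt_hat, n_compliers, n_offered):
    """Treatment on the Compliers at one saturation, as in Result 4.

    itt_hat, snt_hat : estimated ITT(pi) and SNT(pi) at that saturation
    n_compliers      : offered individuals who took up treatment (T=1, R=1)
    n_offered        : all individuals offered treatment (T=1)
    """
    r_hat = n_compliers / n_offered  # estimated compliance rate at pi
    if r_hat <= 0:
        raise ValueError("compliance rate must be positive")
    # Remove the spillover experienced by non-compliers, then rescale.
    return (itt_hat - (1 - r_hat) * snt_hat) / r_hat


with_spillover = toc_hat(itt_hat=0.10, snt_hat=0.02, n_compliers=80, n_offered=100)
no_spillover = toc_hat(itt_hat=0.10, snt_hat=0.00, n_compliers=80, n_offered=100)
```

When the estimated SNT is zero the expression collapses to the ITT divided by the compliance rate, which is the usual no-interference scaling.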
If the compliance rate is constant with respect to treatment saturation, then an analogous expression exists for $\overline{TOC}$ as a function of $\overline{ITT}$ and $\overline{SNT}$.25
Similar to the ITT, we can break the TOC into two eﬀects: a direct treatment eﬀect from
the program, the Treatment on the Unique Complier (TUC), and a spillover eﬀect, the
Spillover on the Compliers (SC). An analogous result to Result 3 identiﬁes these eﬀects.
Returning to the random effects model and maintaining Assumption 4, we can back out estimates of the TOC, SC and TUC. From equation 2, $\widehat{TOC}(\pi) = (\hat{\beta}_{1\pi} - (1 - \hat{r}(\pi))\hat{\beta}_{2\pi})/\hat{r}(\pi)$. If we assume the compliance rate is constant with respect to π, equation 1 identifies $\overline{TOC} = (\hat{\beta}_1 - (1-\hat{r})\hat{\beta}_2)/\hat{r}$. Equation 3 identifies $d\widehat{SC}(\pi)/d\pi = (\hat{\delta}_3 - (1-\hat{r})\hat{\delta}_4)/\hat{r}$ and $\widehat{TUC} = \hat{\delta}_1/\hat{r}$. Cross-equation hypothesis testing can be performed using either Seemingly Unrelated Regression or GMM.
In conclusion, the RS framework provides an empirical resolution of why units within a
cluster behave similarly. A study that ﬁnds high ICCs but no spillover eﬀects can attribute
clustering to correlated or contextual eﬀects, while a study with the same ICCs but large
spillovers should attribute clustering to endogenous eﬀects. In this way the randomization of
saturations resolves the reﬂection problem (albeit after the fact), and informs optimal design
of subsequent experiments in similar contexts.
2 Extensions of the RS Design
2.1 Using Within-cluster Controls as Counterfactuals
Suppose there is no evidence of spillovers on untreated individuals – the estimate of SN T (π )
is a precise zero for all π. Then the within-cluster controls are not subject to interference from the treatment and they can be used as counterfactuals.
25
Estimating the pooled TOC is tricky if the compliance rate varies with treatment saturation:
$$\overline{TOC} = \sum_{\pi \in \Pi\setminus\{0\}} \left[\frac{1}{r(\pi)} ITT(\pi) - \frac{1 - r(\pi)}{r(\pi)} SNC(\pi)\right] \frac{f(\pi)}{1-\psi}.$$
It is not possible to express $\overline{TOC}$ as a function of $\overline{ITT}$ and $\overline{SNT}$. We must either estimate $ITT(\pi)$ and $SNC(\pi)$ for each π, or weight observations to take into account the varying compliance rate.
Assumption 5. SN T (π ) = 0 for all π ∈ Π.
This assumption is testable using any RS design that identifies a consistent estimate $\widehat{SNT}(\pi)$.
When Assumption 5 holds, the researcher can pool within-cluster and pure controls, and
estimate a simpler model to measure treatment eﬀects:
Yic = β0 + β1 Tic + φ · Xic + εic (4)
Given RS design ω, this regression returns $\widehat{\overline{ITT}}_\omega = \hat{\beta}_1$.26 Power is significantly improved by the larger counterfactual, particularly when the ICC is high.
Theorem 3 characterizes the pooled MDE when the within-cluster controls are included
in the counterfactual.
Theorem 3. Assume Assumptions 1, 2, 3 and 5 and let ω be a randomized saturation design. Then, given statistical significance level α and power γ, the MDE of $\overline{ITT}_\omega$ is:
$$MDE^T_\omega = (t_{1-\gamma} + t_\alpha)\sqrt{\frac{1}{nC}\left\{\frac{1 + \rho(n-1)}{\mu(1-\mu)}\tau^2 + \frac{1}{\mu(1-\mu)}\sigma^2\right\}}$$
where $\rho = \eta^2/(\mu(1-\mu))$ is the correlation in treatment status between two individuals in the same cluster.
Theorem 3 nests the familiar expressions for the MDE of the blocked and clustered
designs, and provides context for two well-known results. First, fixing the treatment probability µ, the expression for the MDE is increasing in the variance of the treatment saturation $\eta^2$, and is minimized when this variation is zero, which corresponds to the blocked design. Second,
ﬁxing η 2 , the MDE is minimized when µ(1 − µ) is maximized, which occurs at µ = 1/2.
Therefore, in the absence of spillovers, the optimal design is a blocked study with equal size
treatment and control groups.
An immediate result of Theorem 3 is that the power of the pooled treatment eﬀect in
any RS design lies between the power of the treatment eﬀect in the blocked and clustered
designs.
Corollary 6. Let ω be a randomized saturation design with treatment probability µ. Then
$$MDE^T_B < MDE^T_\omega < MDE^T_C,$$
where $MDE^T_B$ is the MDE in a blocked design with saturation µ and $MDE^T_C$ is the MDE in a clustered design with share of treatment clusters µ.
26
Saturation weights are necessary if there are spillover effects on treated individuals, $ST(\pi) \neq 0$ for some $\pi \in \Pi$.
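The ordering in Corollary 6 can be illustrated by evaluating the Theorem 3 expression for three designs that each treat half the sample. The sketch below takes $\eta^2$ to be the variance of the cluster saturation across clusters, reads the bracketed expression as a variance under a square root, and uses a default multiplier of 2.80 (two-sided 5% test, 80% power, normal approximation). These readings and the function names are assumptions for illustration.

```python
import math


def mde_pooled_controls(n, C, mu, eta2, tau2, sigma2, multiplier=2.80):
    """Pooled MDE when within-cluster controls join the counterfactual (Theorem 3)."""
    rho = eta2 / (mu * (1 - mu))  # within-cluster correlation in treatment status
    var = ((1 + rho * (n - 1)) * tau2 + sigma2) / (n * C * mu * (1 - mu))
    return multiplier * math.sqrt(var)


def saturation_variance(saturations, shares):
    """Variance of the cluster saturation given the share of clusters at each value."""
    mean = sum(f * p for f, p in zip(shares, saturations))
    return sum(f * (p - mean) ** 2 for f, p in zip(shares, saturations))


n, C, mu, tau2, sigma2 = 20, 100, 0.5, 0.2, 0.8
blocked = mde_pooled_controls(n, C, mu, 0.0, tau2, sigma2)
rs = mde_pooled_controls(
    n, C, mu, saturation_variance([0, 1 / 3, 2 / 3, 1], [0.25] * 4), tau2, sigma2
)
clustered = mde_pooled_controls(
    n, C, mu, saturation_variance([0, 1], [0.5, 0.5]), tau2, sigma2
)
```

With a positive cluster-level error component, the RS design sits strictly between the blocked and clustered designs in this example.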
2.2 Using the RS Design to Estimate the Pure Control Outcome
If a study has no pure control group, the counterfactual is at the mercy of within-cluster
spillovers. In this context, the RS design has the distinct advantage of allowing a researcher
to test for the presence of spillover eﬀects and estimate the unperturbed counterfactual. If
the spillover eﬀect is continuous at zero, the researcher can use the variation in treatment sat-
uration to project what would happen to untreated individuals as the saturation approaches
zero.27 With this unperturbed counterfactual in hand, we can then estimate $\widehat{SNT}$, and use this value to correct the estimate of the $\widehat{ITT}$.
Assumption 6 provides a simple way to estimate the pure control by assuming that the
outcome variable is linear with respect to treatment saturation.
Assumption 6. E (Y |T, π ) is an aﬃne (linear) function of π .
While it is possible to use a more ﬂexible functional form and the speciﬁcation can be tested,
the linear case provides simple intuition for the technique.28
Given Assumption 6, it is natural to estimate:
Yic = δ0 + δ1 Tic + δ2 ∗ πc + δ3 (Tic ∗ πc ) + φ · Xic + εic (5)
Given an RS design ω with no pure control, estimate equation 4 with saturation weights and equation 5. The hypothesis test $\delta_2 = 0$ then determines whether there is variation in the control outcome across saturations. If spillovers are present on untreated individuals, then the counterfactual needs to be corrected. The coefficient $\hat{\delta}_0$ is an estimate of the desired ‘pure’ control outcome, $E(Y_{ic} \mid T_{ic} = 0, \pi_c = 0)$, while $\hat{\beta}_0$ is an estimate of the within-cluster control outcome actually used as the counterfactual, $E(Y_{ic} \mid T_{ic} = 0, \pi_c > 0)$. The difference between $\hat{\beta}_0$ and $\hat{\delta}_0$ is the $\widehat{SNT}$, which can be used to derive an unbiased estimate of the $\widehat{ITT}$.
27
Although continuity is a reasonable assumption, it is not universally applicable. Consider signalling in
a ground-hog colony. Individuals are ‘treated’ by being alerted to the presence of a nearby predator, and the
possible individual-level outcomes are ‘aware’ and ‘not aware’. The animal immediately signals danger to
the rest of the colony, and control outcomes will be universally ‘aware’ for any positive treatment saturation,
but ‘unaware’ when the saturation is exactly zero.
28
In a panel diﬀerence in diﬀerence regression, the quantity giving the desired counterfactual would be
the un-interacted ‘post-treatment’ dummy. This is the change the control group would have experienced at
saturation zero.
Figure 3. Treatment Saturation of Alternate Network
Result 5. Assume Assumptions 1, 2, 3 and 6, and let ω be a randomized saturation design with no pure control and κ ≥ 2 interior saturations. Then ω generates consistent estimators $\widehat{\overline{ITT}}_\omega = \hat{\beta}_1 + \hat{\beta}_0 - \hat{\delta}_0$ and $\widehat{\overline{SNT}}_\omega = \hat{\beta}_0 - \hat{\delta}_0$, where $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\delta}_0$ are the estimates from equation 4 with saturation weights and equation 5.
Similar estimates for the IT T and SN T at a speciﬁc saturation are generated by estimating
equation 4 on a single saturation.
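The correction in Result 5 can be sketched with saturation-level cell means. The helper below is a simplified stand-in for the two regressions, not the authors' estimator: it fits the affine control-outcome line to cell means weighted by the share of clusters at each saturation, and treats the saturation-weighted pooled means as the equation 4 coefficients. Individual-level OLS on equation 5 would weight cells differently, although the two coincide when the cell means are exactly affine. All names and numbers are hypothetical.

```python
def pure_control_correction(saturations, shares, control_means, treated_means):
    """Project the pure control outcome and correct the pooled ITT and SNT.

    saturations   : interior saturations pi (no pure control in the design)
    shares        : share of clusters at each saturation, summing to one
    control_means : mean outcome of within-cluster controls at each pi
    treated_means : mean outcome of treated individuals at each pi
    """
    # Weighted least-squares line through the control means; under
    # Assumption 6, E[Y | T=0, pi] = delta0 + delta2 * pi.
    pbar = sum(f * p for f, p in zip(shares, saturations))
    ybar = sum(f * y for f, y in zip(shares, control_means))
    sxy = sum(f * (p - pbar) * (y - ybar)
              for f, p, y in zip(shares, saturations, control_means))
    sxx = sum(f * (p - pbar) ** 2 for f, p in zip(shares, saturations))
    delta2 = sxy / sxx
    delta0 = ybar - delta2 * pbar  # projected control outcome at pi = 0

    beta0 = ybar  # pooled within-cluster control mean
    beta1 = sum(f * y for f, y in zip(shares, treated_means)) - beta0
    return {"delta0": delta0, "SNT": beta0 - delta0, "ITT": beta1 + beta0 - delta0}


# Illustrative cell means: controls follow 1.0 + 0.3*pi, treated 1.5 + 0.3*pi.
result = pure_control_correction(
    saturations=[1 / 3, 2 / 3],
    shares=[0.5, 0.5],
    control_means=[1.1, 1.2],
    treated_means=[1.6, 1.7],
)
```

In this example the naive treated-minus-control difference is 0.5, while the corrected ITT adds back the 0.15 spillover that contaminates the within-cluster controls.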
The RS design opens up unique empirical possibilities even when there is no pure control
group. This is particularly important for settings in which a pure control is not feasible due
to regulatory requirements or other exogenous restrictions.29
2.3 Spillover Eﬀects in Overlapping Networks
The RS design we present must be implemented in a non-overlapping network (such as vil-
lages or schools), but many networks of interest do not satisfy this strong requirement (such
as peer networks or extended families). However, an RS design implemented on a non-
overlapping network also produces exogenous variation in the treatment saturation of over-
lapping networks, variation that is always superior to what would be obtained from a blocked
design and generally superior to clustered designs. This variation depends on the structure of
both networks – it increases as the correlation between the two networks increases. As imple-
menting a RS design using non-overlapping clusters is much more straightforward than the
sequential randomization required to conduct a RS design in overlapping networks (Toulis
and Kao 2013), this provides an attractive way of generating random variation in treatment
saturation even when the true network of interest is overlapping.
Figure 3 illustrates the treatment saturation distributions in an overlapping network that
results from implementing either a blocked, clustered or RS design on a non-overlapping
network.30 Using an overlapping network with ﬁve links per individual, we plot the share of
individuals at each treatment saturation in the non-overlapping network, where the treatment
saturation captures the share of an individual’s links who receive treatment. We use the probability that a link in the overlapping network connects two individuals in the same cluster (the unit of the non-overlapping network) to measure the correlation between networks.31
29
For example, in McIntosh et al. (2013), a Mexican government rule required that each participating cluster (municipality) be guaranteed at least one treated sub-unit (neighborhood).
30
We use a blocked design in which 50% of individuals in each cluster are treated, a clustered design in which 50% of clusters are treated at either 100% or 0% saturation, and a RS design in which an equal share of clusters are treated at saturations 0%, 33%, 67% or 100%. Each assignment rule results in the same overall fraction (one half) of the sample being treated.
As can be seen in Figure 3, the blocked design produces little overall variation in treatment
saturations; the saturations are centered around 50%, independent of the correlation. The
clustered design suﬀers from the opposite problem: because treatment has taken place at
the cluster level, it is dominated by nodes that have either high or low treatment saturations
when there is correlation between networks. Finally, the RS design produces a more even
distribution of saturations when there is correlation between networks. In the limit, when
there is no correlation between networks, the three designs produce the same saturation
distributions (left panel of Figure 3).
3 Empirical Application
The Schooling, Income, and Health Risk (SIHR) study is a randomized saturation experiment designed
to understand the role that Conditional and Unconditional Cash Transfers (CCTs and UCTs)
play in improving schooling outcomes and reducing early marriage and pregnancy among
unmarried, school-age females. We now present an analysis of all of the estimands developed
in this paper using the RS design to understand how these programs altered outcomes for
the within-cluster controls as well as for the treated. The study took place in the Zomba
district of Malawi. Before the start of the intervention, 176 EAs were selected from urban
(Zomba city, 29 EAs) and rural (147 EAs) strata for inclusion in the study. 32
In the 176 study EAs, each dwelling was visited to take a census of all never-married
females aged 13-22 years. Within this eligible population we deﬁned two cohorts: those
enrolled in school at baseline (baseline schoolgirls), and those not enrolled in school at
baseline (baseline dropouts). All baseline dropouts were selected for inclusion in the study
due to the small size of this cohort (approximately ﬁve per EA, accounting for about 15%
of the target population), while we sampled within the larger cohort of baseline schoolgirls.
The percentage of this cohort randomly selected for inclusion in the study was just above
60% and varied by geographical stratum and age group. 33 This sampling procedure yielded
3,796 individuals, who were enrolled in the study and completed baseline interviews at the
end of 2007. Of these study participants, 889 were baseline dropouts and 2,907 were
baseline schoolgirls, whom we analyze here.
Out of the 176 EAs, 88 EAs were assigned to pure control and 88 to treatment. All
31 The specific structure of the network is irrelevant. Any network with the same number of links and
correlation measure will achieve the same saturation distribution.
32 Each EA contains an average of 250 households spanning several villages.
33 The sampling rate varied from 14% to 45% in urban EAs and 70% to 100% in rural ones.
[Figure 4 schematic: the study EAs are split into Control Enumeration Areas (N=88) and Treatment Enumeration Areas (N=88). Treatment EAs are assigned to 0% Saturation (N=15), 33% Saturation (N=24; 15 CCT, 9 UCT), 66% Saturation (N=25; 16 CCT, 9 UCT), or 100% Saturation (N=24; 15 CCT, 9 UCT). Within each column, the two study strata (Baseline Schoolgirls and Baseline Dropouts) are classified as Pure Control, Within-village Control, CCT, or UCT.]
Figure 4. Research Design
Shaded cells indicate treatment and numbers give sample sizes at the individual level per cell. Household
transfer amounts were randomized at the EA level, with monthly values of $4, $6, $8, or $10. Participant transfer amounts
were randomized at the individual level, with monthly values of $1, $2, $3, $4, or $5.
baseline dropouts in treatment EAs were offered CCTs. The randomized saturation experiment
as well as the UCT/CCT experiment was conducted only among baseline schoolgirls.
CCT saturations were randomized in 46 EAs, UCT saturations were randomized in 27 EAs,
and in 15 EAs only baseline dropouts were treated. 34 Of the EAs assigned to CCT, 15 were treated
at 33%, 16 at 67%, and 15 at 100%, while there were 9 UCT EAs
in each saturation bin. The 15 EAs in which only baseline dropouts were treated provide a
0% CCT saturation, measuring the spillover from CCT treatment of baseline dropouts on
baseline schoolgirls. Within each EA, we then selected the integer number of treatments
that made the EA-level sample saturation as close as possible to that assigned. Figure 4
presents a schematic of the randomized saturation study design.
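The rounding step described above, choosing the integer number of treatments that brings the EA-level saturation closest to the assigned one, can be sketched as follows. This is only an illustration: the function names are hypothetical, and the tie-breaking rule (round halves up) is an assumption, since the text does not state one.

```python
import math


def treated_count(sample_size: int, assigned_saturation: float) -> int:
    """Integer number of treated units whose share is closest to the assigned saturation."""
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    if not 0.0 <= assigned_saturation <= 1.0:
        raise ValueError("assigned_saturation must lie in [0, 1]")
    # Assumption: exact ties are rounded up; the paper does not state a tie-breaking rule.
    return min(sample_size, math.floor(sample_size * assigned_saturation + 0.5))


def realized_saturation(sample_size: int, assigned_saturation: float) -> float:
    """Share of the EA sample actually treated after rounding to an integer count."""
    if sample_size == 0:
        return 0.0
    return treated_count(sample_size, assigned_saturation) / sample_size
```

For a hypothetical EA with 17 sampled schoolgirls and an assigned saturation of 67%, this treats 11 girls, so the realized sample saturation (about 65%) differs slightly from the assigned one.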
In the CCT arm, households were oﬀered cash transfers of between $5 and $15 per month
if the study participant attended school at least 80% of the days her school was in session
during the past month. The UCT arm featured the same transfer system, but the cash
transfers were oﬀered unconditionally.35 The cash transfer program started in early 2008,
and continued for two years.
We performed the RS experiment only among baseline schoolgirls sampled into the study,
meaning that the inclusion rules and sampling rates form a ‘gateway to treatment’ for the
true saturation within both the eligible population and the overall population. This has
two distinct implications. First, while the conduit for the spillover eﬀects may be an inel-
igible group such as potential male partners, a gateway-to-treatment study can only hope
34 Due to funding constraints for the transfers, the study included a larger pure control group than would
have been ideal for power purposes alone.
35 See Baird, McIntosh and Özler (2011) for more details on intervention design.
to capture spillover effects that are both generated and experienced by eligibles. Second, a
sampling rate of less than 100% pushes down treatment saturations within the entire eligible
population relative to those assigned within the study sample. Because this study featured
a high overall sampling rate of 68%, the true saturations track the assigned saturations closely
while being lower in level. The correlation coefficient between the assigned and true saturations at the EA level
is 0.86, while the assigned saturations are essentially orthogonal to the sampling weights,
with a correlation coefficient of 0.03. The actual saturation experiment is thus dampened
by roughly one-third relative to the assigned experiment. To recover marginal effects in the correct units
of the true saturation, we instrument for the true saturations with the assigned saturations.
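The logic of instrumenting the true saturation with the assigned saturation can be illustrated on simulated data. The sketch below is not the authors' estimation code: the sample size, the sampling-rate distribution, the slope of 0.5, and all variable names are hypothetical. With one instrument and one instrumented regressor, the two-stage least squares slope equals the reduced-form slope divided by the first-stage slope, which is what the code computes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clusters = 2000  # hypothetical number of clusters

# Assigned saturations are randomized; sampling rates are independent of the assignment.
assigned = rng.choice([0.0, 1 / 3, 2 / 3, 1.0], size=n_clusters)
sampling_rate = rng.uniform(0.4, 1.0, size=n_clusters)  # hypothetical distribution
true_saturation = assigned * sampling_rate

true_slope = 0.5  # hypothetical marginal effect per unit of true saturation
outcome = 1.0 + true_slope * true_saturation + rng.normal(0.0, 0.1, size=n_clusters)


def ols_slope(y, x):
    """Slope from an OLS regression of y on a constant and x."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


first_stage = ols_slope(true_saturation, assigned)  # close to the mean sampling rate
reduced_form = ols_slope(outcome, assigned)         # effect per unit of assigned saturation
iv_slope = reduced_form / first_stage               # effect per unit of true saturation
```

In this simulation the reduced-form slope is dampened by the first-stage coefficient, and dividing by it rescales the estimate into units of the true saturation.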
Our analysis utilizes data from three sources. First, the annual SIHR Household Survey
provides three rounds of data (baseline, 12-month follow-up, and 24-month follow-up). This
survey provides data on the core respondent’s marital status and fertility, as well as her
network of friends. Second, we visited the schools of all study participants who reported
being enrolled in school during the 12-month follow-up interviews and collected data on
their enrollment and attendance directly from their schools.36 Finally, to obtain an objective
measure of learning, we administered independent tests for English, mathematics, and
cognitive skills to study participants in their homes at the 24-month follow-up. The tests
were developed by a team of experts at the Human Sciences Research Council according to
the Malawian curricula for these subjects for Standards 5-8 and Forms 1-2.37 The outcomes
used in the empirical analysis, then, are enrollment, average test scores, and self-reported
marriage and pregnancy.
Table 1 shows balance tests with the same specifications to be used in the analysis of
spillover effects. All results are shown separately for CCTs and UCTs, providing cross-sectional
baseline comparisons at the individual level while clustering standard errors at the
EA level to account for the design effect. The set of 10 variables for which we examine baseline
balance between various treatment groups is the same set reported in Baird, McIntosh and
Özler (2011). Panel A shows the simple balance tests; the spillover sample is generally similar
to the pure control at baseline. In Panel B we include the linear slope terms, meaning that
the top half of Panel B tests for the difference of the 0% saturation (observed in the CCT,
extrapolated in the UCT) from the pure control. This provides a falsification test for the intercept
and slope terms to be used in the saturation analysis. Overall, the experiment appears well
balanced.
36 While a school survey was also conducted at the 24-month follow-up, this was done only for a random
sub-sample of study participants due to budget constraints. Hence, the outcome variable of number of terms
enrolled goes from a minimum of zero to a maximum of three, an indicator of school attendance during the
first year of the program.
37 Primary school in Malawi is from Standard 1 to 8, while secondary school is from Form 1 to 4.
3.1 Analysis of Treatment and Spillover Eﬀects
We now present the treatment and spillover eﬀects that can be identiﬁed using the random-
ized saturation design. Table 2 estimates equations (1) and (3), with two modiﬁcations.
First, we allow the CCT and UCT arms to have separate treatment and spillover eﬀects.
Second, we instrument for the true saturation within the eligible population using the ran-
domly assigned saturation in the sample so as to provide marginal eﬀects in the units of the
true saturation. Our analysis includes all baseline schoolgirls, controlling for a basic set of
baseline covariates and clustering standard errors at the EA level. We present two sets of
results for each outcome, first showing simple ITT and SNT effects by estimating equation
(1) in the odd-numbered columns, and then proceeding to test for the presence of saturation
slope effects using equation (3) in the even-numbered columns. The bottom two panels of
Table 2 explicitly calculate the treatment effects that were developed in Section 1. 38
The regression coefficients on the treatment saturations give the linearized slope effects
for each outcome. The pooled ITT is presented in columns 1 and 2 and the SNT in columns
3 and 4. The TUT is the intercept term, given by the first two rows in the even-numbered
columns. We can divide the TUT by the respective compliance rates to calculate the TUC,
the treatment effect on the unique complier, and calculate the ToC, the pooled treatment
on the compliers effect, using Assumption 4. These two estimands allow us to calculate
the pooled spillovers on the compliers: SC = ToC − TUC. Finally, we perform F-tests
on each of these estimands, which are linear combinations of regression coefficients across
equations. Estimation using Seemingly Unrelated Regressions with OLS models
or two-step GMM with IV models provides identical results for the significance levels in the
bottom panels of Table 2.
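As a worked illustration of the rescaling described above, the sketch below divides a TUT by a compliance rate and forms SC as a difference. The function names are hypothetical. The inputs are the CCT enrollment TUT of 0.25 terms and the CCT compliance rate of 77.4% reported in this section; no ToC value is assumed, because computing it relies on Assumption 4, which is not restated here.

```python
def effect_on_unique_complier(tut: float, compliance_rate: float) -> float:
    """TUC: the treatment-on-the-uniquely-treated effect rescaled by the compliance rate."""
    if not 0.0 < compliance_rate <= 1.0:
        raise ValueError("compliance_rate must lie in (0, 1]")
    return tut / compliance_rate


def spillover_on_compliers(toc: float, tuc: float) -> float:
    """SC = ToC - TUC, the pooled spillover on compliers."""
    return toc - tuc


# CCT enrollment: TUT of 0.25 terms and a 77.4% compliance rate.
tuc_cct_enrollment = effect_on_unique_complier(tut=0.25, compliance_rate=0.774)
```

Under these inputs the TUC is roughly 0.32 terms, larger than the TUT because the same average effect is attributed to the 77.4% of offered girls who complied.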
The cluster-level pooled spillover effects on the non-treated (SNT) are given by the coefficients
on the within-cluster control indicators in the odd-numbered columns. Despite
the sizable pooled intention to treat effects (ITT), we find no average spillover effects on
the non-treated. Furthermore, for each statistically significant treatment effect the average
spillover effect on within-cluster controls has the same sign, indicating no evidence of detrimental
spillover effects among untreated individuals in treated clusters. The total causal
effect, which is a weighted average of the ITT and SNT, presented in the bottom two
panels, confirms this finding: the TCE closely tracks the ITT in statistical significance and
typically appears close to the ITT multiplied by 0.65, the average treatment saturation
in clusters with any treatment. Saturation effects presented in the even-numbered columns
suggest that these spillovers on the non-treated increase with treatment saturation, although
38 The compliance rate was 77.4% for the CCT arm and 99% for the UCT arm.
none of these slope estimates are signiﬁcant at the 10% level.
When we move to examining spillover effects on the treated (ST), we find some evidence
that the beneficial treatment effects decline with treatment intensity. For example, the
treatment on the uniquely treated (TUT) effect on enrollment in the CCT group is 0.25
terms, while the pooled ITT estimate is 0.133. Effects on test scores are evident in estimates
presented in columns 3 and 4. Similarly, for marriage and pregnancy, the TUT effects in the
UCT arm are consistently higher (in absolute value) than the pooled ITT effects, suggesting
that beneficial intention to treat effects wear off as more eligible individuals are treated
within the cluster. Intriguingly, this indicates that $\frac{d\,SNT(\pi)}{d\pi}$ and $\frac{d\,ST(\pi)}{d\pi}$ have opposite signs
for all four outcome variables, suggesting a welfare tradeoff between treated and untreated
units that becomes more pronounced as the treatment saturation increases.
Underlying the estimation of the saturation slope terms in Table 2 are the discrete distributions
of the treatment saturations assigned in our experiment: 33%, 67%, 100%, and a 0%
CCT cell that estimates the spillover on schoolgirls from CCT baseline dropout treatment
alone. We calculate the non-parametric cell-specific ITT(π) for each treatment saturation
and SNT(π) for each saturation below 100%. Table 3 presents this fully granular analysis of
impact, showing coefficient estimates for each combination of treatment arm and saturation
separately, using the non-parametric regression model in equation (2). In the first column
we provide the average true saturation rate within the eligible population for each assigned
saturation bin. The impact estimates in columns 2-5 reinforce the findings from Table 2,
which used the affine model to estimate saturation effects: spillover effects on the non-treated
are generally strongest (and have the same sign as the pooled intention to treat effects) for
the cells with the highest treatment saturation (see, e.g., column 2 row 9 or column 4 row
11). Furthermore, again consistent with the earlier findings from Table 2, intention to treat
effects are highest in the cells with low saturation, becoming insignificant for the highest
saturations (see, e.g., the ITT(π) estimates for schooling outcomes in the CCT arm and
those for marriage and pregnancy in the UCT arm in the top panel of Table 3).
3.2 Analysis of Spillover Eﬀects in Friends Network
The ﬁnding of weak spillovers within relatively large spatial units could mask the presence
of stronger spillovers within social networks. Using data collected at baseline on the closest
friends of each study participant, we show that spillover eﬀects are equally muted within
this more intimate social network.
At baseline, we asked each study participant to list their ﬁve closest friends and to provide
some basic information about these friends. We matched the friends to our study sample
to determine their treatment status. Restricting our sample to the set of individuals who
(a) lived in study EAs, (b) were eligible for the treatment (i.e. never-married females aged
13-22), and (c) are either themselves in the study sample or were listed at baseline among the
ﬁve closest friends gave us a sample of 8,981 individuals in 176 EAs. As described in Section
II.C, because there is a positive correlation between the locations of the study participants
and their friends, the RS design generates exogenous variation in treatment intensity within
each individual’s social network.
In Table 4, we present program eﬀects as a function of the number of treated friends.
The covariates included in the analysis of social networks must reﬂect the fact that we failed
to link friends in a non-negligible number of cases and that we only observe treatment status
for friends linked to our sample. To account for this endogenous variation in our ability to
link friends in the study sample, we include fully ﬂexible controls for the distribution of the
number of matched friends. The ﬁndings mirror those presented in Table 2: for beneﬁciaries
and non-beneﬁciaries alike, none of the outcomes is responsive to the number of treated
friends. The similarity in the pattern of spillovers in the spatial and social networks provides
some support for stratiﬁed interference described in Assumption 2.
Returning to clusters deﬁned over spatial units, such as EAs, the purity of a cross-cluster
counterfactual will be compromised if the regional intensity of treatment has an eﬀect on
outcomes. To test for this, we conclude by following Miguel and Kremer (2004) and Bobba
and Gignoux (2013) in using GIS data on the locations of the EA centroids to count the
number of treatment and control EAs within distance bands of <3km and 3-6km from each
EA. Since the number of treated EAs is randomized conditional on the total number within
each band, we can use this variation to look for cross-cluster spillovers that would violate
Assumption 1. Table 5 demonstrates that this cash transfer experiment did not generate
strong cross-cluster eﬀects. Coeﬃcient estimates for the number of treated EAs within the
two distance bands are always small and statistically insigniﬁcant, implying that there are
no spillovers for enrollment, test scores, marriage, or fertility across clusters (columns 1, 3,
5, and 7).39 Exploiting incidental randomization across clusters we conﬁrm Assumption 1
and, as in the within-cluster analysis, ﬁnd little evidence of spillover eﬀects.
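The distance-band regressors described above can be built from centroid coordinates along the following lines. This is a sketch under stated assumptions rather than the authors' GIS procedure: it assumes centroids are already projected to planar coordinates in kilometres, treats the bands as half-open intervals [0, 3) and [3, 6), and uses a hypothetical function name.

```python
import numpy as np


def count_eas_in_bands(coords_km, treated, bands=((0.0, 3.0), (3.0, 6.0))):
    """For each EA, count treated and total neighbouring EAs within each distance band."""
    coords = np.asarray(coords_km, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Pairwise Euclidean distances between centroids (assumes planar coordinates in km).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # an EA is not its own neighbour

    counts = {}
    for lower, upper in bands:
        in_band = (dist >= lower) & (dist < upper)
        counts[(lower, upper)] = {
            "treated": (in_band & treated[None, :]).sum(axis=1),
            "total": in_band.sum(axis=1),
        }
    return counts


# Four hypothetical EA centroids on a line, three of them treated.
example = count_eas_in_bands(
    coords_km=[(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (10.0, 0.0)],
    treated=[True, True, False, True],
)
```

Regressing outcomes on the treated counts while controlling for the total counts uses only the randomized part of the variation, which is the comparison the text relies on.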
39 In contrast to Bobba and Gignoux (2013), who find large spillover effects of PROGRESA in Mexico but
only on treated individuals, we find no consistent evidence that program beneficiaries experience spillovers
from adjacent clusters that are any different from those experienced by untreated individuals (columns 1, 3, 5, and 7). In other
words, the ST and the SNT measured cross-cluster are both zero.
4 Conclusion and Discussion
In recent years, empirical researchers have become increasingly concerned with the problem
of interference between subjects. Experiments designed to rigorously estimate spillovers
open up a fascinating set of research questions and provide policy-relevant information about
program design. Research designs and RCTs that fail to account for spillovers can be biased;
ﬁnding meaningful treatment eﬀects but failing to observe deleterious spillovers can lead to
misconstrued policy conclusions. This paper attempts to push the frontier of research designs
by formalizing the analysis of randomized saturation experiments.
The beneﬁt of randomizing treatment saturations is the ability to generate direct ex-
perimental evidence on the nature of spillover and threshold eﬀects. The cost of doing so
is statistical power. Having laid out the assumptions necessary to estimate both the mean
and variance of spillover eﬀects, we develop explicit, closed-form expressions for the power
of RS experiments. We ﬁrst provide a general expression for power when we seek to esti-
mate treatment and spillover eﬀects jointly. The power loss from randomizing saturations
is directly related to the variation in treatment saturation, and so is an inherent feature of
the design. Our explicit power calculation formulae provide concrete guidance for optimal
research design depending on whether the researcher is primarily interested in measuring
pooled treatment and spillover eﬀects or slope eﬀects (which necessitates more partially
treated clusters). When spillover eﬀects are found to be muted, this bolsters the credibility
of causal inference from clustered designs.
Our empirical application provides little evidence of spillover effects within clusters, or
indeed across clusters. This suggests that the significant decreases in marriage and fertility
amongst schoolgirls in the unconditional cash transfer treatment group (Baird, McIntosh and
Özler 2011) are causal in a larger sense, and are not arising because the treatment diverts
such behavior to other girls in the study. For marriage and pregnancy, the coefficient on
treatment saturations for the within-cluster controls is in fact negative, indicating a slight
protective effect of the program on nearby individuals who do not receive the treatment.
The framework presented here serves as an important guide to policy questions. For
example, if a researcher is implementing a program with fixed resources and can either treat
100% of five villages or 50% of ten villages, which treatment allocation will maximize the
total benefit? In the Malawi cash transfer program, our results suggest that the two allocations would
have the same total effect, and the TCE of the program is closely approximated by the ITT
times the average saturation rate, independent of how individuals are assigned to treatment.
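The allocation question above can be made concrete with a small calculation. The sketch below sums π·ITT(π) + (1 − π)·SNT(π) over all eligibles in the treated villages. The village size of 100 and the sloped effect functions in the second case are hypothetical; the flat ITT of 0.133 with zero spillovers loosely mirrors the pattern reported for CCT enrollment.

```python
import math


def total_benefit(n_villages, eligibles_per_village, saturation, itt, snt):
    """Summed effect over all eligibles (treated and untreated) in the treated villages."""
    per_eligible = saturation * itt(saturation) + (1.0 - saturation) * snt(saturation)
    return n_villages * eligibles_per_village * per_eligible


# Case 1: effects do not vary with saturation and there are no spillovers.
def flat_itt(pi):
    return 0.133


def no_spillover(pi):
    return 0.0


full = total_benefit(5, 100, 1.0, flat_itt, no_spillover)
half = total_benefit(10, 100, 0.5, flat_itt, no_spillover)


# Case 2 (hypothetical): ITT declines and SNT rises with saturation.
def declining_itt(pi):
    return 0.25 - 0.15 * pi


def rising_snt(pi):
    return 0.05 * pi


full_sloped = total_benefit(5, 100, 1.0, declining_itt, rising_snt)
half_sloped = total_benefit(10, 100, 0.5, declining_itt, rising_snt)
```

In the first case both allocations deliver the same total, matching the conclusion drawn for Malawi. In the second, purely illustrative case, spreading treatment across ten villages delivers more, which is the kind of difference saturation slopes would reveal.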
Small policy trials conducted on a subset of the population can miss important scale or
congestion eﬀects that will accompany the full-scale implementation of a program. To the
extent that varying the cluster-level saturation leads to diﬀerential impacts on prices, norms,
and congestion eﬀects, the randomized saturation design provides an experimental framework
that can bolster both external and internal validity.
References
Akerlof, George A., and Rachel E. Kranton. 2002. “Identity and Schooling: Some
Lessons for the Economics of Education.” Journal of Economic Literature, 40(4): 1167–
1201.
Alix-Garcia, Jennifer, Craig McIntosh, Katharine R. E. Sims, and Jarrod R.
Welch. 2013. “The Ecological Footprint of Poverty Alleviation: Evidence from Mexico’s
Oportunidades Program.” The Review of Economics and Statistics, 95(2): 417–435.
Angelucci, Manuela, and Giacomo De Giorgi. 2009. “Indirect Eﬀects of an Aid Pro-
gram: How Do Cash Transfers Aﬀect Ineligibles’ Consumption?” American Economic
Review, 99(1): 486–508.
Angelucci, Manuela, Giacomo De Giorgi, Marcos A. Rangel, and Imran Ra-
sul. 2010. “Family networks and school enrolment: Evidence from a randomized social
experiment.” Journal of Public Economics, 94(3-4): 197–221.
Aronow, Peter. 2012. “A General Method for Detecting Interference in Randomized Experiments.”
Sociological Methods & Research, 41(1): 3–16.
Babcock, Philip S., and John L. Hartman. 2010. “Networks and Workouts: Treatment
Size and Status Specific Peer Effects in a Randomized Field Experiment.” NBER Working Paper 16581.
Baird, Sarah, Craig McIntosh, and Berk Özler. 2011. “Cash or Condition? Evidence
from a Cash Transfer Experiment.” The Quarterly Journal of Economics, 126(4): 1709–1753.
Banerjee, Abhijit, Arun G. Chandrasekhar, Esther Duﬂo, and Matthew O. Jack-
son. 2013. “The Diﬀusion of Microﬁnance.” Science, 341(6144).
Banerjee, Abhijit, Raghabendra Chattopadhyay, Esther Duflo, Daniel Keniston,
and Nina Singh. 2012. “Can Institutions be Reformed from Within? Evidence from
a Randomized Experiment with the Rajasthan Police.” NBER Working Paper 17912.
Barrera-Osorio, Felipe, Marianne Bertrand, Leigh Linden, and Francisco Perez-
Calle. 2011. “Improving the Design of Conditional Cash Transfer Programs: Evidence
from a Randomized Education Experiment in Colombia.” American Economic Journal:
Applied Economics, 3(2): 167–195.
Beaman, Lori A. 2012. “Social Networks and the Dynamics of Labour Market Out-
comes: Evidence from Refugees Resettled in the U.S.” The Review of Economic Studies,
79(1): 128–161.
Behrman, Jere R, Piyali Sengupta, and Petra Todd. 2005. “Progressing through
PROGRESA: An Impact Assessment of a School Subsidy Experiment in Rural Mexico.”
Economic Development and Cultural Change, 54(1): 237–75.
Bloom, Howard S. 1995. “Minimum Detectable Eﬀects: A Simple Way to Report the
Statistical Power of Experimental Designs.” Evaluation Review, 19(5): 547–556.
Bobba, Matteo, and Jeremie Gignoux. 2013. “Policy Evaluation in the Presence of
Spatial Externalities: Reassessing the Progresa Program.” Working Paper.
Bobonis, Gustavo J., and Frederico Finan. 2009. “Neighborhood Peer Eﬀects in Sec-
ondary School Enrollment Decisions.” The Review of Economics and Statistics, 91(4): 695–
716.
Chen, Jiehua, Macartan Humphreys, and Vijay Modi. 2010. “Technology Diffusion
and Social Networks: Evidence from a Field Experiment in Uganda.” Working Paper.
Conley, Timothy G., and Christopher R. Udry. 2010. “Learning about a New Tech-
nology: Pineapple in Ghana.” American Economic Review, 100(1): 35–69.
Crepon, Bruno, Esther Duﬂo, Marc Gurgand, Roland Rathelot, and Philippe
Zamora. 2013. “Do Labor Market Policies have Displacement Eﬀects? Evidence from a
Clustered Randomized Experiment.” The Quarterly Journal of Economics, 128(2): 531–
580.
Cunha, Jesse M., Giacomo De Giorgi, and Seema Jayachandran. 2011. “The Price
Effects of Cash Versus In-Kind Transfers.” NBER Working Paper 17456.
Duﬂo, Esther, and Emmanuel Saez. 2002. “Participation and investment decisions in
a retirement plan: the inﬂuence of colleagues’ choices.” Journal of Public Economics,
85(1): 121–148.
Duﬂo, Esther, and Emmanuel Saez. 2003. “The Role Of Information And Social Inter-
actions In Retirement Plan Decisions: Evidence From A Randomized Experiment.” The
Quarterly Journal of Economics, 118(3): 815–842.
Fisher, Ronald A. 1935. The Design of Experiments. Oxford, England: Oliver & Boyd.
Gine, Xavier, and Ghazala Mansuri. 2012. “Together we will: experimental evidence
on female voting behavior in Pakistan.” Working Paper.
Hudgens, Michael, and Elizabeth Halloran. 2008. “Towards Causal Inference with
Interference.” Journal of the American Statistical Association, 103(482): 832–842.
Killeen, GF, TA Smith, HM Ferguson, H Mshinda, S Abdulla, et al. 2007. “Pre-
venting childhood malaria in Africa by protecting adults from mosquitoes with insecticide-
treated nets.” PLoS Med, 4(7): e229.
Kuhn, Peter, Peter Kooreman, Adriaan Soetevent, and Arie Kapteyn. 2011. “The
Eﬀects of Lottery Prizes on Winners and Their Neighbors: Evidence from the Dutch
Postcode Lottery.” American Economic Review, 101(5): 2226–2247.
Lalive, Rafael, and M. A. Cattaneo. 2009. “Social Interactions and Schooling Decisions.”
The Review of Economics and Statistics, 91(3): 457–477.
Macours, Karen, and Renos Vakis. 2008. “Changing Households’ Investments and As-
pirations through Social Interactions: Evidence from a Randomized Transfer Program in
a Low-Income Country.” World Bank Working Paper 5137.
Manski, Charles. 1993. “Identiﬁcation of Endogenous Social Eﬀects: The Reﬂection Prob-
lem.” Review of Economic Studies, 60(3): 531–542.
Masanjala, Winford. 2007. “The Poverty-HIV/AIDS Nexus in Africa: A Livelihood Approach.”
Social Science and Medicine, 64(5): 1032–1041.
McIntosh, Craig, Tito Alegria, Gerardo Ordonez, and Rene Zenteno. 2013. “In-
frastructure Impacts and Budgeting Spillovers: The Case of Mexico’s Habitat Program.”
Working Paper.
Miguel, Edward, and Michael Kremer. 2004. “Worms: Identifying Impacts on Educa-
tion and Health in the Presence of Treatment Externalities.” Econometrica, 72(1): 159–217.
Moﬃtt, Robert A. 2001. “Policy Interventions, Low-Level Equilibria And Social Interac-
tions.” 45–82. MIT Press.
Munshi, Kaivan. 2003. “Networks in the Modern Economy: Mexican Migrants in the U.S.
Labor Market.” Quarterly Journal of Economics, 118(2): 549–599.
Oster, Emily, and Rebecca Thornton. 2012. “Determinants of Technology Adoption:
Peer Eﬀects in Menstrual Cup Take-Up.” Journal of the European Economic Association,
10(6): 1263–1293.
Poulin, Michelle J. 2007. “Sex, money, and premarital relationships in southern Malawi.”
Social Science and Medicine, 65(11): 2383–2393.
Sinclair, Betsy, Margaret McConnell, and Donald P. Green. 2012. “Detecting
Spillover Eﬀects: Design and Analysis of Multilevel Experiments.” American Journal of
Political Science, 56(4): 1055–1069.
Swidler, Ann, and Susan Watkins. 2007. “Ties of Dependence: AIDS and Transactional
Sex in Rural Malawi.” Studies in Family Planning, 38(3): 147–163.
Tchetgen, Eric J., and Tyler VanderWeele. 2010. “On Causal Inference in the Presence
of Interference.” Statistical Methods in Medical Research, 21(1): 55–75.
Toulis, Panos, and Edward Kao. 2013. “Estimation of Causal Peer Inﬂuence Eﬀects.”
Journal of Machine Learning Research, 28. Proceedings of the 30th International Confer-
ence on Machine Learning Research.
A Mathematical Appendix
A.1 Proofs from Section 1.2, 1.4 and 2.2
Proof of Result 1: Let $\bar{y}_{1,\pi}$ and $\bar{y}_{0,\pi}$ be the sample averages for treated and untreated
observations, respectively, in clusters with saturation $\pi$. Note for $\pi > 0$,
$$\bar{y}_{1,\pi} = \frac{1}{nC\pi f(\pi)} \sum_{c=1}^{C}\sum_{i=1}^{n} Y_{ic}\,\mathbf{1}_{T_{ic}=1,\,\pi_c=\pi}$$
with an analogous definition for $\bar{y}_{0,\pi}$ and $\bar{y}_{0,0}$. Then $\widehat{ITT}(\pi) = \bar{y}_{1,\pi} - \bar{y}_{0,0}$ converges to
$E(Y_{ic} \mid T_{ic}=1, \pi_c=\pi) - E(Y_{ic} \mid T_{ic}=0, \pi_c=0) = ITT(\pi)$ by the strong law of large
numbers. The results for $\widehat{SNT}(\pi) = \bar{y}_{0,\pi} - \bar{y}_{0,0}$ and $\widehat{TCE}(\pi) = \pi\,\widehat{ITT}(\pi) + (1-\pi)\,\widehat{SNT}(\pi)$
are analogous. Q.E.D.
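The estimators in Result 1 are differences in cell means, which the following sketch computes from individual-level records. It is an illustration with hypothetical names and toy data, and it assumes that clusters with saturation 0 are pure controls and that every positive saturation has at least one treated observation.

```python
from collections import defaultdict


def cell_estimates(records):
    """Cell-specific ITT(pi), SNT(pi), and TCE(pi) from (outcome, treated, saturation) records."""
    totals = defaultdict(lambda: [0.0, 0])  # (treated, saturation) -> [sum of outcomes, count]
    for outcome, treated, saturation in records:
        cell = totals[(bool(treated), saturation)]
        cell[0] += outcome
        cell[1] += 1
    mean = {key: total / count for key, (total, count) in totals.items()}

    pure_control = mean[(False, 0.0)]  # untreated observations in saturation-0 clusters
    estimates = {}
    for saturation in sorted({sat for _, sat in mean if sat > 0}):
        itt = mean[(True, saturation)] - pure_control
        if (False, saturation) in mean:
            snt = mean[(False, saturation)] - pure_control
            tce = saturation * itt + (1.0 - saturation) * snt
        else:
            # No within-cluster controls (e.g. 100% saturation): SNT is not defined.
            snt = None
            tce = itt if saturation == 1.0 else None
        estimates[saturation] = {"ITT": itt, "SNT": snt, "TCE": tce}
    return estimates


# Toy data: two observations per cell.
toy_records = [
    (1.0, 0, 0.0), (1.0, 0, 0.0),  # pure control
    (2.0, 1, 0.5), (2.0, 1, 0.5),  # treated, 50% saturation
    (1.5, 0, 0.5), (1.5, 0, 0.5),  # within-cluster controls, 50% saturation
    (1.8, 1, 1.0), (1.8, 1, 1.0),  # treated, 100% saturation
]
toy_estimates = cell_estimates(toy_records)
```

Each estimate is a simple difference from the pure control mean, so consistency follows from the law of large numbers applied cell by cell, as in the proof.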
Proof of Result 2: Let $\bar{y}_1$ and $\bar{y}_{0,\pi>0}$ be the sample averages for treated and within-cluster
control observations across all saturations, respectively, weighted with saturation
weights. Note that
$$\bar{y}_1 = \frac{1}{\sum_{i,c} s^T_{\pi_c}\mathbf{1}_{T_{ic}=1,\,\pi_c>0}} \sum_{c=1}^{C}\sum_{i=1}^{n} Y_{ic}\, s^T_{\pi_c}\,\mathbf{1}_{T_{ic}=1,\,\pi_c>0} = \frac{1}{1-\psi}\sum_{\Pi\setminus 0} f(\pi)\,\bar{y}_{1,\pi}$$
Therefore,
$$\bar{y}_1 - \bar{y}_{0,0} = \frac{1}{1-\psi}\sum_{\Pi\setminus 0} f(\pi)\left(\bar{y}_{1,\pi} - \bar{y}_{0,0}\right) = \frac{1}{1-\psi}\sum_{\Pi\setminus 0} f(\pi)\,\widehat{ITT}(\pi)$$
and from Result 1, $\widehat{ITT}(\pi)$ is a consistent, unbiased estimate of $ITT(\pi)$. Therefore, $\widehat{ITT} = \bar{y}_1 - \bar{y}_{0,0}$
converges to $ITT$. Similarly, $\widehat{SNT} = \bar{y}_{0,\pi>0} - \bar{y}_{0,0}$ converges to $SNT$.
One must estimate TCE from a pooled estimate of the ITT and SNT without saturation
weights, because the shifting composition of the sample is integral to the definition of
the TCE. Let $\bar{y}_{\pi>0}$ be the sample average for pooled treated and within-cluster control
observations across all saturations.
$$\bar{y}_{\pi>0} = \frac{1}{\sum_{i,c}\mathbf{1}_{\pi_c>0}} \sum_{c=1}^{C}\sum_{i=1}^{n} Y_{ic}\,\mathbf{1}_{\pi_c>0} = \frac{1}{1-\psi}\sum_{\Pi\setminus 0}\left[\pi f(\pi)\,\bar{y}_{1,\pi} + (1-\pi) f(\pi)\,\bar{y}_{0,\pi}\right]$$
Therefore,
$$\bar{y}_{\pi>0} - \bar{y}_{0,0} = \frac{1}{1-\psi}\sum_{\Pi\setminus 0}\left[f(\pi)\,\pi\,\widehat{ITT}(\pi) + (1-\pi) f(\pi)\,\widehat{SNT}(\pi)\right] = \frac{1}{1-\psi}\sum_{\Pi\setminus 0} f(\pi)\,\widehat{TCE}(\pi)$$
and from Result 1, $\widehat{TCE}(\pi)$ is a consistent, unbiased estimate of $TCE(\pi)$. Therefore,
$\widehat{TCE} = \bar{y}_{\pi>0} - \bar{y}_{0,0}$ converges to $TCE$. Note that while it was possible to directly estimate
$\widehat{TCE}(\pi)$ from $\widehat{ITT}(\pi)$ and $\widehat{SNT}(\pi)$ in Result 1, it is not possible to directly estimate
$\widehat{TCE}$ from $\widehat{ITT}$ and $\widehat{SNT}$. Q.E.D.
$$TCE = \frac{\mu}{1-\psi}\,ITT^{unweighted} + \frac{1-\mu-\psi}{1-\psi}\,SNT^{unweighted}$$
Proof of Result 3: Analogous to Result 1. Q.E.D.
Proof of Result 4: Given Assumption 4, there is a consistent estimate of the SNC. The
rest of the proof is analogous to Result 1. Q.E.D.
Proof of Result 5: Given Assumption 6, we can identify the slope of the ITT and SNT.
The rest of the proof is analogous to Result 1. Q.E.D.
A.2 Preliminary Calculations
This section provides background material used to derive the MDE and MDSE in Theorems
1, 2 and 3.
A.2.1 Form of the MDE
The MDE depends on the standard error of $\hat{\beta}$:
$$MDE = [t_{1-\gamma} + t_{\alpha}] \cdot SE(\hat{\beta})$$
To compute the MDE, we need to determine $SE(\hat{\beta})$. This depends on the data generating
process and the randomization structure. Consider a model with a random effects error
structure:
$$y_{ic} = x_{ic}\beta + v_c + w_{ic}$$
for a vector of treatment status covariates $x_{ic}$, where $v_c$ is the common cluster component
of error and $w_{ic}$ is the individual error. Let $X_c'X_c = \sum_{i=1}^{n} x_{ic}'x_{ic}$ and $u_c = (u_{1c}, \ldots, u_{nc})'$,
where $u_{ic} = v_c + w_{ic}$. Then the standard error of $\hat{\beta}$ is:
$$SE(\hat{\beta}) = \sqrt{\frac{1}{nC}\, A^{-1}BA^{-1}}$$
(for an individual coefficient, the square root of the corresponding diagonal element), where
$$A := \operatorname{plim}\frac{1}{nC}\sum_{c=1}^{C} X_c'X_c \quad\text{and}\quad B := \operatorname{plim}\frac{1}{nC}\sum_{c=1}^{C} X_c'u_cu_c'X_c$$
Given that all clusters are identical ex-ante, $\frac{1}{N}\sum_{c=1}^{C}E[X_c'X_c] = \frac{1}{C}\sum_{c=1}^{C}\frac{1}{n}E[X_c'X_c] = \frac{1}{n}E[X_c'X_c]$,
where $N = nC$. Also note that $A$ and $B$ are independent of whether one takes $n \to \infty$ or
$C \to \infty$. Therefore, using the formulas for matrices $A$ and $B$ yields:
$$A = \frac{1}{n}E[X_c'X_c] \quad\text{and}\quad B = \frac{1}{n}E[X_c'u_cu_c'X_c]$$
The matrix $B$ can be decomposed into two matrices, $B = (n-1)\tau^2 D + (\tau^2+\sigma^2)A$, which
leads to the expression:
$$SE(\hat{\beta}) = \sqrt{\frac{n-1}{nC}\,\tau^2 A^{-1}DA^{-1} + \frac{1}{nC}\,(\tau^2+\sigma^2)\,A^{-1}}$$
where $A$ and $D$ depend on the RS design. We will utilize this expression to calculate $SE(\hat{\beta})$
for different effects and RS designs.
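The matrix $D$ is not defined in this subsection. One reading consistent with the stated decomposition, offered here as an inference rather than as the authors' definition, is that $D$ collects the cross-moments of the covariates of two different individuals in the same cluster. The steps below use only the random effects structure of $E[u_cu_c']$ and the independence of treatment status and errors; $\mathbf{1}$ denotes an $n \times 1$ vector of ones.

```latex
\begin{aligned}
B &= \frac{1}{n}E\!\left[X_c'\,E[u_cu_c']\,X_c\right]
   = \frac{1}{n}E\!\left[X_c'\left(\tau^2\mathbf{1}\mathbf{1}' + \sigma^2 I_n\right)X_c\right] \\
  &= \frac{\tau^2}{n}E\!\left[\Big(\sum_{i=1}^{n}x_{ic}\Big)'\Big(\sum_{j=1}^{n}x_{jc}\Big)\right] + \sigma^2 A \\
  &= \frac{\tau^2}{n}E\!\left[\sum_{i=1}^{n}x_{ic}'x_{ic} + \sum_{i\neq j}x_{ic}'x_{jc}\right] + \sigma^2 A \\
  &= \tau^2 A + (n-1)\,\tau^2\,E[x_{ic}'x_{jc}] + \sigma^2 A
   = (n-1)\,\tau^2 D + (\tau^2+\sigma^2)\,A,
\qquad D := E[x_{ic}'x_{jc}],\; i \neq j .
\end{aligned}
```

Under this reading, the entries of $D$ are within-cluster moments such as $E[T_{ic}T_{jc}]$ and $E[T_{ic}S_{jc}]$, which is why those expectations are needed to evaluate the standard error.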
A.2.2 Form of Matrices
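$$u_cu_c' = \begin{bmatrix} u_{1c}^2 & u_{1c}u_{2c} & \cdots & u_{1c}u_{nc} \\ u_{1c}u_{2c} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ u_{1c}u_{nc} & \cdots & \cdots & u_{nc}^2 \end{bmatrix}$$
$$E[u_cu_c'] = \begin{bmatrix} \tau^2+\sigma^2 & \tau^2 & \cdots & \tau^2 \\ \tau^2 & \tau^2+\sigma^2 & & \vdots \\ \vdots & & \ddots & \vdots \\ \tau^2 & \cdots & \cdots & \tau^2+\sigma^2 \end{bmatrix} = \tau^2\,\mathbf{1}\mathbf{1}' + \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & & \vdots \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & \cdots & \sigma^2 \end{bmatrix}$$
where $\mathbf{1}$ is an $n \times 1$ vector of ones.
Given $x_{ic} = \begin{bmatrix} 1 & T_{ic} & S_{ic} & \cdots \end{bmatrix}$ (a $1 \times k$ vector), we can write:
$$X_c = \begin{bmatrix} x_{1c} \\ x_{2c} \\ \vdots \\ x_{nc} \end{bmatrix} = \begin{bmatrix} 1 & T_{1c} & S_{1c} & \cdots \\ 1 & T_{2c} & S_{2c} & \cdots \\ \vdots & & & \\ 1 & T_{nc} & S_{nc} & \cdots \end{bmatrix} \quad\text{an } n \times k \text{ matrix}$$
$$X_c'X_c = \sum_{i=1}^{n} x_{ic}'x_{ic} = \sum_{i=1}^{n}\begin{bmatrix} 1 & T_{ic} & S_{ic} & \cdots \\ T_{ic} & T_{ic}^2 & T_{ic}S_{ic} & \cdots \\ S_{ic} & T_{ic}S_{ic} & S_{ic}^2 & \cdots \\ \vdots & & & \ddots \end{bmatrix} \quad\text{a } k \times k \text{ matrix}$$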
A.2.3 Relevant Expectations
Distribution of treatment status: 40
$$\begin{aligned}
E[T_{ic}] &= P(T_{ic}=1) = \mu \\
E[S_{ic}] &= 1-\mu-\psi = \mu_S \\
E[C_{ic}] &= \psi \\
E[T_{ic}^x] &= E[T_{ic}] = \mu \\
E[S_{ic}^x] &= E[S_{ic}] = \mu_S \\
E[T_{ic}S_{ic}] &= 0
\end{aligned}$$
40 We implicitly assume that realized saturation is equal to assigned saturation, i.e. $\pi_c = \frac{1}{n}\sum_{i=1}^{n}T_{ic}$, so
given assigned saturation, there is no variation in realized saturation.
Variance of treatment status:
$$\begin{aligned}
Var[T_{ic}] &= E[T_{ic}^2] - E[T_{ic}]^2 = \mu(1-\mu) \\
Var[S_{ic}] &= (1-\mu-\psi)(\mu+\psi) \\
Var[C_{ic}] &= \psi(1-\psi)
\end{aligned}$$
Within cluster treatment status (for $i \neq j$):
$$E[T_{ic}T_{jc}] = P(T_{ic}=1, T_{jc}=1) = \sum_{\Pi} P(T_{ic}=1, T_{jc}=1, \pi_c=\pi) = \sum_{\Pi}\pi^2 f(\pi) = E[\pi^2]$$
where the second equality follows from the chain rule of probability and the third
equality follows from the fact that randomization at the individual level is independent
within a cluster, i.e. $T_{ic}$ is independent of $T_{jc}$, conditional on $\pi_c$.
$$\begin{aligned}
E[S_{ic}S_{jc}] &= P(S_{ic}=1, S_{jc}=1) \\
&= \sum_{\Pi} P(S_{ic}=1, S_{jc}=1, \pi_c=\pi) \\
&= \sum_{\Pi\setminus\{0\}} (1-\pi)^2 f(\pi) \\
&= E[(1-\pi)^2] - \psi \\
&= 1 - 2\mu + E[\pi^2] - \psi
\end{aligned}$$
$$E[C_{ic}C_{jc}] = \psi$$
$$\begin{aligned}
E[T_{ic}S_{jc}] &= P(T_{ic}=1, S_{jc}=1) \\
&= \sum_{\Pi} P(T_{ic}=1, S_{jc}=1, \pi_c=\pi) \\
&= \sum_{\Pi\setminus\{0\}} \pi(1-\pi) f(\pi) \\
&= E[\pi(1-\pi)] - 0\cdot\psi \\
&= \mu - E[\pi^2]
\end{aligned}$$
38
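The within-cluster moments can be checked numerically. The sketch below evaluates them with exact fractions for an illustrative three-bin design, $\Pi = \{0, 1/3, 2/3\}$ with equal weights. The design and all variable names are assumptions chosen for illustration; they are not part of the derivation.

```python
from fractions import Fraction as F

# Illustrative design (an assumption): saturation bins and their probabilities f(pi).
saturations = [F(0), F(1, 3), F(2, 3)]
weights = [F(1, 3), F(1, 3), F(1, 3)]
pairs = list(zip(saturations, weights))

mu = sum(p * w for p, w in pairs)            # E[pi] = P(T_ic = 1)
e_pi2 = sum(p**2 * w for p, w in pairs)      # E[pi^2] = E[T_ic T_jc]
psi = sum(w for p, w in pairs if p == 0)     # share of pure control clusters

# Direct sums over treated clusters only (pi > 0).
e_ss_direct = sum((1 - p)**2 * w for p, w in pairs if p > 0)
e_ts_direct = sum(p * (1 - p) * w for p, w in pairs if p > 0)

# Closed forms from the text.
e_ss_formula = 1 - 2 * mu + e_pi2 - psi
e_ts_formula = mu - e_pi2

print(mu, e_pi2, psi)              # 1/3 5/27 1/3
print(e_ss_direct, e_ss_formula)   # 5/27 5/27
print(e_ts_direct, e_ts_formula)   # 4/27 4/27
```

Exact rational arithmetic is used so that the two sides of each identity can be compared with `==` rather than a floating-point tolerance.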
Across cluster treatment status:
\[ E[T_{ic} T_{jd}] = \mu^2 \]
since $E[T_{ic} T_{jd}] = E[T_{ic}]\,E[T_{jd}]$ by independence.
\begin{align*}
E[S_{ic} S_{jd}] &= (1-\mu-\psi)^2 \\
E[C_{ic} C_{jd}] &= \psi^2
\end{align*}
Correlation with saturation $\pi_c$ is
\begin{align*}
E[T_{ic} \pi_c^x] &= \sum_{\Pi} \pi^x P(T_{ic}=1, \pi_c=\pi) = \sum_{\Pi} \pi^{x+1} f(\pi) = E[\pi^{x+1}] \\
E[T_{ic}^x \pi_c] &= E[T_{ic} \pi_c] \\
E[T_{ic} T_{jc} \pi_c^x] &= \sum_{\Pi} \pi^x P(T_{ic}=1, T_{jc}=1, \pi_c=\pi) = \sum_{\Pi} \pi^{x+2} f(\pi) = E[\pi^{x+2}] \\
E[S_{ic} \pi_c^x] &= \sum_{\Pi} \pi^x P(S_{ic}=1, \pi_c=\pi) = \sum_{\Pi \setminus \{0\}} \pi^x (1-\pi) f(\pi) = E[(1-\pi)\pi^x] \\
E[S_{ic}^x \pi_c] &= E[S_{ic} \pi_c] \\
E[S_{ic} S_{jc} \pi_c^x] &= \sum_{\Pi} \pi^x P(S_{ic}=1, S_{jc}=1, \pi_c=\pi) = \sum_{\Pi} \pi^x (1-\pi)^2 f(\pi) = E[\pi^x (1-\pi)^2] \\
E[T_{ic} S_{jc} \pi_c^x] &= \sum_{\Pi} \pi^x P(T_{ic}=1, S_{jc}=1, \pi_c=\pi) = \sum_{\Pi} \pi^x \pi (1-\pi) f(\pi) = E[\pi^{x+1}(1-\pi)]
\end{align*}
Correlation of treatment status between two girls in the same cluster:
\[ \rho_T = \frac{E[T_{ic} T_{jc}] - E[T_{ic}]\,E[T_{jc}]}{Var[T_{ic}]} = \frac{\eta^2}{\mu(1-\mu)} \]
\begin{align*}
\rho_S &= \frac{E[S_{ic} S_{jc}] - E[S_{ic}]\,E[S_{jc}]}{Var[S_{ic}]} \\
&= \frac{(1 - 2\mu + \eta^2 + \mu^2 - \psi) - (1-\mu-\psi)^2}{(1-\mu-\psi)(\mu+\psi)} \\
&= \frac{\eta^2 + \psi(1 - 2\mu - \psi)}{(1-\mu-\psi)(\mu+\psi)}
\end{align*}
\[ \rho_C = 1 \]
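Distribution of $u_{ic}$:
\begin{align*}
E[u_{ic}^2] &= \tau^2 + \sigma^2 \\
E[u_{ic} u_{jc}] &= \tau^2 \text{ if } i \neq j, \text{ which is } Cov(u_{ic}, u_{jc}) \\
E[u_{ic} u_{jd}] &= 0 \text{ if } c \neq d
\end{align*}
$T_{ic}$ or $S_{ic}$ is independent of $u_{ic}$ $\Rightarrow$ $E[f(u_{ic})\,g(T_{ic})] = E[f(u_{ic})] \cdot E[g(T_{ic})]$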
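Sum of the error terms and treatment status within each cluster:
\[ \frac{1}{n} E\Big[\Big(\sum_{i=1}^{n} u_{ic}\Big)^2\Big] = E[u_{ic}^2] + (n-1)E[u_{jc} u_{ic}] = n\tau^2 + \sigma^2 \]
\[ \frac{1}{n} E\Big[\Big(\sum_{i=1}^{n} u_{ic}\Big)\Big(\sum_{i=1}^{n} T_{ic} u_{ic}\Big)\Big] = E[u_{ic}^2 T_{ic}] + (n-1)E[u_{jc} u_{ic} T_{ic}] = (n\tau^2 + \sigma^2)\mu \]
\[ \frac{1}{n} E\Big[\Big(\sum_{i=1}^{n} u_{ic}\Big)\Big(\sum_{i=1}^{n} S_{ic} u_{ic}\Big)\Big] = (n\tau^2 + \sigma^2)(1-\mu-\psi) \]
since $T_{ic}$ and $S_{ic}$ are independent of $u_{ic}$.
\begin{align*}
\frac{1}{n} E\Big[\Big(\sum_{i=1}^{n} T_{ic} u_{ic}\Big)^2\Big] &= E[u_{ic}^2 T_{ic}^2] + (n-1)E[u_{jc} u_{ic} T_{ic} T_{jc}] \\
&= (\tau^2+\sigma^2)\mu + (n-1)\tau^2 E[\pi^2]
\end{align*}
\[ \frac{1}{n} E\Big[\Big(\sum_{i=1}^{n} S_{ic} u_{ic}\Big)^2\Big] = (\tau^2+\sigma^2)(1-\mu-\psi) + (n-1)\tau^2 \big(1 - 2\mu + E[\pi^2] - \psi\big) \]
\[ \frac{1}{n} E\Big[\Big(\sum_{i=1}^{n} T_{ic} u_{ic}\Big)\Big(\sum_{i=1}^{n} S_{ic} u_{ic}\Big)\Big] = (n-1)\tau^2 \big(\mu - E[\pi^2]\big) \]
since $T_{ic} S_{ic} = 0$ for all $i, c$.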
A.2.4 Variance of Treatment Saturation
The marginal distribution of saturations across treatment clusters (removing control clusters) is:
\[ g(\pi) = \frac{f(\pi)}{1-\psi} \]
with support $\Pi \setminus \{0\}$.
\[ E_g[\pi^2] = \frac{1}{1-\psi} \sum_{\Pi \setminus \{0\}} \pi^2 f(\pi) \]
\[ E_g[\pi] = \frac{1}{1-\psi} \sum_{\Pi \setminus \{0\}} \pi f(\pi) \]
The component of total variation in cluster saturation due to variation in the saturation of treated clusters is:
\begin{align*}
\eta_T^2 := Var(\pi \mid \pi > 0) &= E_g[\pi^2] - E_g[\pi]^2 \\
&= \frac{1}{1-\psi} \sum_{\Pi \setminus \{0\}} \pi^2 f(\pi) - \Big(\frac{1}{1-\psi} \sum_{\Pi \setminus \{0\}} \pi f(\pi)\Big)^2 \\
&= \frac{1}{1-\psi}\,\eta^2 - \frac{\psi}{(1-\psi)^2}\,\mu^2
\end{align*}
Then the total variation can be expressed as the sum of the variation in the saturation of treated clusters and the variation between treated and control clusters, weighted by the size of the control group:
\[ \eta^2 = (1-\psi)\,\eta_T^2 + \frac{\psi}{1-\psi}\,\mu^2 \]
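The decomposition of $\eta^2$ can be verified on a small example. The sketch below uses an illustrative design with bins $\{0, 1/3, 2/3\}$ and equal weights (an assumption, as are the variable names), computes $\eta_T^2$ directly from the conditional distribution $g$, and compares it with the closed form.

```python
from fractions import Fraction as F

# Illustrative design (an assumption): saturation bins and probabilities f(pi).
saturations = [F(0), F(1, 3), F(2, 3)]
weights = [F(1, 3), F(1, 3), F(1, 3)]
pairs = list(zip(saturations, weights))

mu = sum(p * w for p, w in pairs)
psi = sum(w for p, w in pairs if p == 0)
eta2 = sum(p**2 * w for p, w in pairs) - mu**2   # total variance of saturation

# Moments under g(pi) = f(pi) / (1 - psi), i.e. among treated clusters only.
eg_pi = sum(p * w for p, w in pairs if p > 0) / (1 - psi)
eg_pi2 = sum(p**2 * w for p, w in pairs if p > 0) / (1 - psi)
eta2_T = eg_pi2 - eg_pi**2

# Closed form for eta_T^2, and eta^2 rebuilt from its two components.
eta2_T_formula = eta2 / (1 - psi) - psi / (1 - psi)**2 * mu**2
eta2_rebuilt = (1 - psi) * eta2_T + psi / (1 - psi) * mu**2

print(eta2, eta2_T, eta2_T_formula, eta2_rebuilt)   # 2/27 1/36 1/36 2/27
```

In this example, three quarters of the total variation ($\frac{1}{18}$ of $\frac{2}{27}$) comes from the contrast between treated and pure control clusters, and one quarter from variation among treated clusters.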
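A.3 Proof of Theorem 1
We want to compute matrices $A$ and $B$ for the model with $x_{ic} = \begin{pmatrix} 1 & T_{ic} & S_{ic} \end{pmatrix}$. Using the calculations from Section A.2, we can calculate:
\[ A = \frac{1}{n} E\left[\sum_{i=1}^{n} \begin{pmatrix} 1 & T_{ic} & S_{ic} \\ T_{ic} & T_{ic}^2 & T_{ic}S_{ic} \\ S_{ic} & T_{ic}S_{ic} & S_{ic}^2 \end{pmatrix}\right] = \begin{pmatrix} 1 & \mu & \mu_S \\ \mu & \mu & 0 \\ \mu_S & 0 & \mu_S \end{pmatrix} \]
\begin{align*}
B &= \frac{1}{n} E \begin{pmatrix} (\sum_i u_{ic})^2 & (\sum_i u_{ic})(\sum_i T_{ic}u_{ic}) & (\sum_i u_{ic})(\sum_i S_{ic}u_{ic}) \\ (\sum_i u_{ic})(\sum_i T_{ic}u_{ic}) & (\sum_i T_{ic}u_{ic})^2 & (\sum_i T_{ic}u_{ic})(\sum_i S_{ic}u_{ic}) \\ (\sum_i u_{ic})(\sum_i S_{ic}u_{ic}) & (\sum_i T_{ic}u_{ic})(\sum_i S_{ic}u_{ic}) & (\sum_i S_{ic}u_{ic})^2 \end{pmatrix} \\
&= (n-1)\tau^2 \begin{pmatrix} 1 & \mu & \mu_S \\ \mu & \eta^2+\mu^2 & \mu-\mu^2-\eta^2 \\ \mu_S & \mu-\mu^2-\eta^2 & \mu_S-\mu+\eta^2+\mu^2 \end{pmatrix} + (\tau^2+\sigma^2) A
\end{align*}
where each sum runs over $i = 1, \ldots, n$. Using Mathematica to compute $SE(\hat\beta) = \sqrt{\frac{1}{nC} A^{-1} B A^{-1}}$, taking the diagonal entries and plugging in the expression relating $\eta^2$ and $\eta_T^2$ yields the result. Q.E.D.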
Proof of Corollary 1: Fixing $\mu$ and $\psi$, $SE(\hat\beta_1)$ and $SE(\hat\beta_2)$ are both minimized at $\eta_T^2 = 0$. This corresponds to a partial population experiment with a control group of size $\psi$ and a treatment saturation of $P = \mu/(1-\psi)$. Q.E.D.
Proof of Corollary 2: Fixing $\psi$, a partial population design has the smallest sum of standard errors, for any treatment size $\mu$. Therefore, we can restrict attention to the set of partial population designs, and the expression for $Var(\hat\beta)$ simplifies to:
\[ MDE_\omega^T = [t_{1-\gamma} + t_\alpha] \cdot \sqrt{\frac{1}{nC} \left\{ (n-1)\tau^2 \frac{1}{(1-\psi)\psi} + (\tau^2+\sigma^2) \frac{\psi+\mu}{\mu\psi} \right\}} \]
\[ MDE_\omega^S = [t_{1-\gamma} + t_\alpha] \cdot \sqrt{\frac{1}{nC} \left\{ (n-1)\tau^2 \frac{1}{(1-\psi)\psi} + (\tau^2+\sigma^2) \frac{1-\mu}{(1-\mu-\psi)\psi} \right\}} \]
The sum of these expressions is minimized at $\mu = \mu_S = (1-\psi)/2$, which corresponds to a partial population experiment with $P = 1/2$. Q.E.D.
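The matrix computation behind these closed forms can be reproduced without a computer algebra system. The sketch below builds $A$ and $D$ for a partial population design, forms $\frac{1}{nC}\{(n-1)\tau^2 A^{-1}DA^{-1} + (\tau^2+\sigma^2)A^{-1}\}$ with exact fractions, and compares its diagonal entries with the bracketed terms in the two MDE expressions (that is, the squared standard errors, before taking the square root and multiplying by $t_{1-\gamma} + t_\alpha$). The parameter values, helper functions and variable names are illustrative assumptions.

```python
from fractions import Fraction as F

def inverse(M):
    """Gauss-Jordan inverse of a small square matrix, in exact arithmetic."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Illustrative inputs (assumptions): cluster size, clusters, variance components.
n, C = 20, 5
tau2, sigma2 = F(1, 2), F(2)
psi, P = F(1, 3), F(1, 2)          # control share and common treatment saturation

mu = P * (1 - psi)                 # share treated
mu_S = 1 - mu - psi                # share of within-cluster controls
e_pi2 = P**2 * (1 - psi)           # E[pi^2] for the design {0, P}

A = [[1, mu, mu_S], [mu, mu, 0], [mu_S, 0, mu_S]]
D = [[1, mu, mu_S],
     [mu, e_pi2, mu - e_pi2],
     [mu_S, mu - e_pi2, 1 - 2 * mu + e_pi2 - psi]]

A_inv = inverse(A)
ADA = matmul(matmul(A_inv, D), A_inv)
V = [[((n - 1) * tau2 * ADA[i][j] + (tau2 + sigma2) * A_inv[i][j]) / (n * C)
      for j in range(3)] for i in range(3)]

# Bracketed terms of the closed-form MDE expressions for a partial population design.
closed_T = ((n - 1) * tau2 / ((1 - psi) * psi)
            + (tau2 + sigma2) * (psi + mu) / (mu * psi)) / (n * C)
closed_S = ((n - 1) * tau2 / ((1 - psi) * psi)
            + (tau2 + sigma2) * (1 - mu) / ((1 - mu - psi) * psi)) / (n * C)

print(V[1][1] == closed_T, V[2][2] == closed_S)
```

With $P = 1/2$ the treated and within-cluster control groups have equal size, so the two variances coincide in this example.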
Proof of Corollary 3: In a partial population design with $\mu = (1-\psi)/2$,
\[ MDE_\omega^T = MDE_\omega^S = [t_{1-\gamma} + t_\alpha] \cdot \sqrt{\frac{1}{nC} \left\{ (n-1)\tau^2 \frac{1}{(1-\psi)\psi} + (\tau^2+\sigma^2) \frac{\psi+1}{(1-\psi)\psi} \right\}} \]
When there is no intra-cluster correlation, $\tau^2 = 0$, the expression simplifies to:
\[ [t_{1-\gamma} + t_\alpha] \cdot \sqrt{\frac{1}{nC}\,\sigma^2\,\frac{\psi+1}{(1-\psi)\psi}} \]
which is minimized at $\psi^* = \sqrt{2} - 1$. When there is no individual error, $\sigma^2 = 0$, the expression simplifies to:
\[ [t_{1-\gamma} + t_\alpha] \cdot \sqrt{\frac{1}{nC}\,\tau^2\,\frac{n+\psi}{(1-\psi)\psi}} \]
which is minimized at $\psi^* = \sqrt{n(1+n)} - n$. Note $\lim_{n \to \infty} \sqrt{n(1+n)} - n = 1/2$. Given that $(\psi+1)/((1-\psi)\psi)$ and $(\psi+n)/((1-\psi)\psi)$ are both convex with unique minimums, any weighted sum of these functions is minimized at a value $\psi^*$ that lies between the minimum of each function. Therefore, when $\tau^2 > 0$ and $\sigma^2 > 0$, $\psi^* \in (\sqrt{2} - 1, \sqrt{n(1+n)} - n)$. Q.E.D.
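The two boundary values of $\psi^*$ and the interior case can be checked numerically. The sketch below minimizes each objective by ternary search and compares the result with the closed forms; the cluster size, variance components and function names are illustrative assumptions, and the search routine is a generic one rather than anything specific to this paper.

```python
import math

def argmin_unimodal(g, lo=1e-6, hi=1 - 1e-6, iters=200):
    """Ternary search for the minimizer of a unimodal function on (lo, hi)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

n = 20                    # cluster size (assumption)
tau2, sigma2 = 0.5, 2.0   # variance components for the interior case (assumption)

# tau^2 = 0: objective proportional to (psi + 1) / ((1 - psi) psi).
psi_no_cluster = argmin_unimodal(lambda psi: (psi + 1) / ((1 - psi) * psi))
# sigma^2 = 0: objective proportional to (psi + n) / ((1 - psi) psi).
psi_no_indiv = argmin_unimodal(lambda psi: (psi + n) / ((1 - psi) * psi))
# Both components positive: weighted sum of the two objectives.
psi_mixed = argmin_unimodal(
    lambda psi: ((n - 1) * tau2 + (tau2 + sigma2) * (psi + 1)) / ((1 - psi) * psi))

print(psi_no_cluster, math.sqrt(2) - 1)
print(psi_no_indiv, math.sqrt(n * (n + 1)) - n)
print(psi_mixed)
```

The interior optimum falls strictly between the two boundary values, as the corollary states.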
Proof of Corollary 4: Follows directly from Theorem 1. Q.E.D.
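A.4 Proof of Theorem 2
We want to compute matrices $A$ and $B$ for the model with $x_{ic} = \begin{pmatrix} 1 & T_{1ic} & S_{1ic} & T_{2ic} & S_{2ic} \end{pmatrix}$ where $T_{1ic} = 1(T_{ic}=1, \pi_c=\pi_1)$, $S_{1ic} = 1(T_{ic}=0, \pi_c=\pi_1)$, $T_{2ic} = 1(T_{ic}=1, \pi_c=\pi_2)$ and so forth. Using the calculations from Section A.2 and defining $\mu_k := \pi_k f(\pi_k)$, $p_k := (1-\pi_k) f(\pi_k)$, $\eta_k := \pi_k^2 f(\pi_k)$ and $\rho_k := (1-\pi_k)^2 f(\pi_k) = p_k - \mu_k + \eta_k$, we can calculate:
\[ A = \frac{1}{n} E\left[\sum_{i=1}^{n} \begin{pmatrix} 1 & T_{1ic} & S_{1ic} & T_{2ic} & S_{2ic} \\ T_{1ic} & T_{1ic}^2 & 0 & 0 & 0 \\ S_{1ic} & 0 & S_{1ic}^2 & 0 & 0 \\ T_{2ic} & 0 & 0 & T_{2ic}^2 & 0 \\ S_{2ic} & 0 & 0 & 0 & S_{2ic}^2 \end{pmatrix}\right] = \begin{pmatrix} 1 & \mu_1 & p_1 & \mu_2 & p_2 \\ \mu_1 & \mu_1 & 0 & 0 & 0 \\ p_1 & 0 & p_1 & 0 & 0 \\ \mu_2 & 0 & 0 & \mu_2 & 0 \\ p_2 & 0 & 0 & 0 & p_2 \end{pmatrix} \]
\begin{align*}
B &= \frac{1}{n} E\left[ v_c v_c' \right], \qquad v_c = \begin{pmatrix} \sum_i u_{ic} \\ \sum_i T_{1ic} u_{ic} \\ \sum_i S_{1ic} u_{ic} \\ \sum_i T_{2ic} u_{ic} \\ \sum_i S_{2ic} u_{ic} \end{pmatrix} \\
&= (n-1)\tau^2 \begin{pmatrix} 1 & \mu_1 & p_1 & \mu_2 & p_2 \\ \mu_1 & \eta_1 & \mu_1-\eta_1 & 0 & 0 \\ p_1 & \mu_1-\eta_1 & \rho_1 & 0 & 0 \\ \mu_2 & 0 & 0 & \eta_2 & \mu_2-\eta_2 \\ p_2 & 0 & 0 & \mu_2-\eta_2 & \rho_2 \end{pmatrix} + (\tau^2+\sigma^2) A
\end{align*}
where each sum runs over $i = 1, \ldots, n$. Using Mathematica to compute $SE(\hat\beta) = \sqrt{\frac{1}{nC} A^{-1} B A^{-1}}$ and taking the diagonal entries yields the $MDE^T$ for each saturation $\pi_j$:
\begin{align*}
MDE_\omega^T(\pi_j) &= (t_{1-\gamma} + t_\alpha) \cdot \sqrt{\frac{1}{nC} \left\{ (n-1)\tau^2 \left( \frac{\eta_j}{\mu_j^2} + \frac{1}{\psi} \right) + (\tau^2+\sigma^2) \frac{\psi+\mu_j}{\psi\mu_j} \right\}} \\
&= (t_{1-\gamma} + t_\alpha) \cdot \sqrt{\frac{1}{nC} \left\{ (n-1)\tau^2 \left( \frac{1}{f(\pi_j)} + \frac{1}{\psi} \right) + (\tau^2+\sigma^2) \left( \frac{1}{\mu_j} + \frac{1}{\psi} \right) \right\}}
\end{align*}
Next, we can compute $MDSE_\omega^T(\pi_j, \pi_k)$ from
\[ SE(\hat\delta_{jk}^T) = SE\big[ (\hat\beta_{1\pi_k} - \hat\beta_{1\pi_j}) / (\pi_k - \pi_j) \big] = SE(\hat\beta_{1\pi_k} - \hat\beta_{1\pi_j}) / (\pi_k - \pi_j) \]
as
\[ Cov(\hat\beta_{1\pi_k}, \hat\beta_{1\pi_j}) = \frac{1}{nC}\,(n\tau^2+\sigma^2)\,\frac{1}{\psi} \]
\begin{align*}
Var(\hat\beta_{1\pi_k} - \hat\beta_{1\pi_j}) &= Var(\hat\beta_{1\pi_j}) + Var(\hat\beta_{1\pi_k}) - 2\,Cov(\hat\beta_{1\pi_k}, \hat\beta_{1\pi_j}) \\
&= \frac{1}{nC} \cdot \left\{ (n-1)\tau^2 \left( \frac{1}{f(\pi_j)} + \frac{1}{f(\pi_k)} \right) + (\tau^2+\sigma^2) \left( \frac{1}{\mu_j} + \frac{1}{\mu_k} \right) \right\}
\end{align*}
Plugging this into the expression for $MDSE^T$ yields the result. Calculating the $MDE^S$ and $MDSE^S$ is analogous. Q.E.D.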
Proof of Corollary 5: Fixing the size of each saturation bin $f(\pi_j) = f_j$ and $f(\pi_k) = f_k$ and the distance between two saturations $\pi_k - \pi_j = \Delta$, minimizing $MDSE_\omega^T(\pi_j, \pi_k) + MDSE_\omega^S(\pi_j, \pi_k)$ is equivalent to solving:
\[ \min_{\pi_j} \; \frac{1}{f_j \pi_j} + \frac{1}{f_k (\pi_j + \Delta)} + \frac{1}{f_j (1-\pi_j)} + \frac{1}{f_k (1 - \Delta - \pi_j)} \]
The minimum occurs at the $\pi_j^*$ that solves $\pi_j^*(1-\pi_j^*) f_j = (\pi_j^* + \Delta)(1 - \Delta - \pi_j^*) f_k$. When $f_j = f_k$, $\pi_j^* = (1-\Delta)/2$ and $\pi_k^* = \pi_j^* + \Delta = (1+\Delta)/2$, which is symmetric about $1/2$.
Fixing $f_j = f_k$, the $\Delta$ that minimizes $MDSE_\omega^T(\pi_j, \pi_k) + MDSE_\omega^S(\pi_j, \pi_k)$ is equivalent to solving:
\[ \min_{\Delta} \; \frac{1}{\Delta^2} \left\{ \frac{(n-1)}{n}\,\tau^2 + \frac{(\tau^2+\sigma^2)}{n} \cdot \frac{2}{(1-\Delta)(1+\Delta)} \right\} \]
The optimal $\Delta^*$ solves:
\[ \frac{(n-1)\tau^2}{2(\tau^2+\sigma^2)} = \frac{2(\Delta^*)^2 - 1}{(1 - (\Delta^*)^2)^2} \]
If $\tau^2 = 0$, then $2(\Delta^*)^2 - 1 = 0$, yielding $\Delta^* = \sqrt{2}/2$. Note that $(2\Delta^2 - 1)/((1-\Delta^2)^2)$ is monotonically increasing for $\Delta \in [0, 1)$, and strictly positive for $\Delta > \sqrt{2}/2$. When $\tau^2 > 0$, $((n-1)\tau^2)/(2(\tau^2+\sigma^2))$ is also strictly positive, increasing in $\tau^2$ and decreasing in $\sigma^2$. Therefore, $\Delta^* \in (\sqrt{2}/2, 1)$ for $\tau^2 > 0$ and finite $n$, and $\Delta^*$ is increasing in $\tau^2$ and $n$, and decreasing in $\sigma^2$. If $\tau^2 > 0$, then the left-hand side converges to $\infty$ as $n \to \infty$, which requires $\Delta^* \to 1$. Q.E.D.
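Because the right-hand side of the first-order condition is monotonically increasing on $[0, 1)$, $\Delta^*$ can be found by bisection. The sketch below is one way to do this; the function name `optimal_delta` and the parameter values are illustrative assumptions.

```python
import math

def optimal_delta(n, tau2, sigma2, tol=1e-12):
    """Solve (n-1) tau^2 / (2 (tau^2 + sigma^2)) = (2 d^2 - 1) / (1 - d^2)^2 for d in [0, 1)."""
    lhs = (n - 1) * tau2 / (2 * (tau2 + sigma2))

    def rhs(d):
        return (2 * d * d - 1) / (1 - d * d) ** 2

    lo, hi = 0.0, 1.0 - 1e-12      # rhs(lo) = -1 <= lhs, rhs(hi) is very large
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if rhs(mid) < lhs:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

delta_no_cluster = optimal_delta(n=20, tau2=0.0, sigma2=2.0)   # tau^2 = 0 case
delta_n20 = optimal_delta(n=20, tau2=0.5, sigma2=2.0)
delta_n50 = optimal_delta(n=50, tau2=0.5, sigma2=2.0)

print(delta_no_cluster, math.sqrt(2) / 2)
print(delta_n20, delta_n50)
```

The first line reproduces $\Delta^* = \sqrt{2}/2$ when $\tau^2 = 0$; the second illustrates that the optimal spread widens as the cluster size grows.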
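A.5 Proof of Theorem 3
We want to compute matrices $A$ and $B$ for the model with $x_{ic} = \begin{pmatrix} 1 & T_{ic} \end{pmatrix}$. Using the calculations from Section A.2, we can calculate:
\[ A = \frac{1}{n} E \begin{pmatrix} n & \sum_{i=1}^{n} T_{ic} \\ \sum_{i=1}^{n} T_{ic} & \sum_{i=1}^{n} T_{ic}^2 \end{pmatrix} = \begin{pmatrix} 1 & \mu \\ \mu & \mu \end{pmatrix} \]
and
\begin{align*}
B &= \frac{1}{n} E \begin{pmatrix} (\sum_{i=1}^{n} u_{ic})^2 & (\sum_{i=1}^{n} u_{ic})(\sum_{i=1}^{n} T_{ic}u_{ic}) \\ (\sum_{i=1}^{n} u_{ic})(\sum_{i=1}^{n} T_{ic}u_{ic}) & (\sum_{i=1}^{n} T_{ic}u_{ic})^2 \end{pmatrix} \\
&= \tau^2 (n-1) \begin{pmatrix} 1 & \mu \\ \mu & \eta^2+\mu^2 \end{pmatrix} + (\tau^2+\sigma^2) A
\end{align*}
This can be used to compute
\[ SE(\hat\beta_1) = \sqrt{\frac{1}{nC} \cdot \left\{ \left( \frac{1}{\mu(1-\mu)} + \frac{(n-1)\eta^2}{\mu^2(1-\mu)^2} \right) \tau^2 + \frac{1}{\mu(1-\mu)}\,\sigma^2 \right\}} \]
Using $\eta^2 = \rho\mu(1-\mu)$, we can express $SE(\hat\beta_1)$ in terms of $\mu$ and $\rho$.
\[ SE(\hat\beta_1) = \sqrt{\frac{1}{nC} \cdot \left\{ \frac{1 + \rho(n-1)}{\mu(1-\mu)}\,\tau^2 + \frac{1}{\mu(1-\mu)}\,\sigma^2 \right\}} \]
Fixing $\mu$, this expression is minimized at $\eta^2 = 0$ or $\rho = 0$. Q.E.D.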
Proof of Corollary 6: Follows directly from Theorem 3, noting that the blocked design
corresponds to ρ = 0 and the clustered design corresponds to ρ = 1. Q.E.D.
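To illustrate the comparison in Corollary 6, the sketch below evaluates the variance term of the Theorem 3 expression at $\rho = 0$ (blocked) and $\rho = 1$ (clustered). It treats the bracketed expression divided by $nC$ as the variance of $\hat\beta_1$, which is an assumption about how the formula is read; the function name and the parameter values are also illustrative.

```python
def var_beta1(n, C, mu, rho, tau2, sigma2):
    """Variance term of the Theorem 3 expression, written in terms of rho."""
    return (tau2 * (1 + rho * (n - 1)) + sigma2) / (n * C * mu * (1 - mu))

n, C, mu = 20, 5, 0.5          # cluster size, clusters, treated share (assumptions)
tau2, sigma2 = 0.5, 2.0        # cluster and individual error variances (assumptions)

blocked = var_beta1(n, C, mu, rho=0.0, tau2=tau2, sigma2=sigma2)
clustered = var_beta1(n, C, mu, rho=1.0, tau2=tau2, sigma2=sigma2)

# Ratio of the two variances versus the usual design-effect formula 1 + (n - 1) * ICC.
icc = tau2 / (tau2 + sigma2)
print(clustered / blocked, 1 + (n - 1) * icc)
```

Under these inputs the ratio of clustered to blocked variance equals $1 + (n-1)\,\tau^2/(\tau^2+\sigma^2)$, so randomized saturation designs with intermediate $\rho$ fall between the two extremes.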
A.6 Affine Model
Theorem 4. Assume Assumptions 1, 2 and 3 and let $\omega$ be a randomized saturation design with $\kappa \geq 2$ interior saturations and a pure control. Then, given statistical significance level $\alpha$ and power $\gamma$, the MDSE for the treated group is:
\[ MDSE_\omega^T = (t_{1-\gamma} + t_\alpha) \cdot \sqrt{\frac{1}{nC} \left\{ (n-1)\tau^2 h_1 + (\tau^2+\sigma^2) h_2 \right\}} \]
where
\[ h_1 = \frac{(\eta^2+\mu^2)^2 - 2\mu(\eta^2+\mu^2)E[\pi^3] + \mu^2 E[\pi^4]}{\big((\eta^2+\mu^2)^2 - \mu E[\pi^3]\big)^2} \quad \text{and} \quad h_2 = \frac{\eta^2+\mu^2}{(\eta^2+\mu^2)^2 - \mu E[\pi^3]} \]
A similar expression characterizes the MDSE for the within-cluster control group as $MDSE_\omega^S$.
Proof of Theorem 4: We want to compute matrices $A$ and $B$ for the model with
\[ x_{ic} = \begin{pmatrix} 1 & T_{ic} & T_{ic}\pi_c & S_{ic} & S_{ic}\pi_c \end{pmatrix} \]
Using the calculations from Section A.2, we can calculate:
\begin{align*}
A &= \frac{1}{n} E\left[\sum_{i=1}^{n} \begin{pmatrix} 1 & T_{ic} & T_{ic}\pi_c & S_{ic} & S_{ic}\pi_c \\ T_{ic} & T_{ic}^2 & T_{ic}^2\pi_c & T_{ic}S_{ic} & T_{ic}S_{ic}\pi_c \\ T_{ic}\pi_c & T_{ic}^2\pi_c & T_{ic}^2\pi_c^2 & T_{ic}S_{ic}\pi_c & T_{ic}S_{ic}\pi_c^2 \\ S_{ic} & T_{ic}S_{ic} & T_{ic}S_{ic}\pi_c & S_{ic}^2 & S_{ic}^2\pi_c \\ S_{ic}\pi_c & T_{ic}S_{ic}\pi_c & T_{ic}S_{ic}\pi_c^2 & S_{ic}^2\pi_c & S_{ic}^2\pi_c^2 \end{pmatrix}\right] \\
&= \begin{pmatrix} 1 & \mu & \eta^2+\mu^2 & 1-\mu-\psi & \mu-(\eta^2+\mu^2) \\ \mu & \mu & \eta^2+\mu^2 & 0 & 0 \\ \eta^2+\mu^2 & \eta^2+\mu^2 & E[\pi^3] & 0 & 0 \\ 1-\mu-\psi & 0 & 0 & 1-\mu-\psi & \mu-(\eta^2+\mu^2) \\ \mu-(\eta^2+\mu^2) & 0 & 0 & \mu-(\eta^2+\mu^2) & \eta^2+\mu^2-E[\pi^3] \end{pmatrix}
\end{align*}
\begin{align*}
B &= \frac{1}{n} E\left[ v_c v_c' \right], \qquad v_c = \begin{pmatrix} \sum_i u_{ic} \\ \sum_i T_{ic} u_{ic} \\ \sum_i T_{ic}\pi_c u_{ic} \\ \sum_i S_{ic} u_{ic} \\ \sum_i S_{ic}\pi_c u_{ic} \end{pmatrix} \\
&= (n-1)\tau^2 D + (\tau^2+\sigma^2) A
\end{align*}
where each sum runs over $i = 1, \ldots, n$ and
\[ D = \begin{pmatrix} 1 & \mu & E[\pi^2] & 1-\mu-\psi & \mu-E[\pi^2] \\ \mu & E[\pi^2] & E[\pi^3] & \mu-E[\pi^2] & E[\pi^2]-E[\pi^3] \\ E[\pi^2] & E[\pi^3] & E[\pi^4] & E[\pi^2]-E[\pi^3] & E[\pi^3]-E[\pi^4] \\ 1-\mu-\psi & \mu-E[\pi^2] & E[\pi^2]-E[\pi^3] & 1-2\mu+E[\pi^2]-\psi & \mu-2E[\pi^2]+E[\pi^3] \\ \mu-E[\pi^2] & E[\pi^2]-E[\pi^3] & E[\pi^3]-E[\pi^4] & \mu-2E[\pi^2]+E[\pi^3] & E[\pi^2]-2E[\pi^3]+E[\pi^4] \end{pmatrix} \]
Using Mathematica to compute $SE(\hat\delta) = \sqrt{\frac{1}{nC} A^{-1} B A^{-1}}$ and taking the diagonal entries yields the result. The $MDSE^T$ is a function of $SE(\hat\delta_3)$, while the $MDSE^S$ is a function of $SE(\hat\delta_4)$.
Table 1. Balance Tests

PANEL A: Pooled Tests. Cells report the coefficient with its standard error in parentheses.

| Dependent Variable: | Household Size (1) | Asset Index (2) | Female Headed Household (3) | Mobile Phone Ownership (4) | Age (5) | Highest Grade at Baseline (6) | Mother Alive (7) | Father Alive (8) | Never had Sex (9) | Ever Pregnant (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| CCT | -0.027 (0.195) | 0.330 (0.298) | -0.0889** (0.040) | -0.034 (0.051) | -0.251** (0.113) | -0.218 (0.169) | -0.0422* (0.025) | -0.002 (0.039) | -0.005 (0.027) | 0.004 (0.009) |
| UCT | 0.219 (0.162) | 0.464** (0.221) | -0.0773* (0.042) | -0.032 (0.064) | 0.138 (0.136) | 0.355** (0.154) | -0.001 (0.026) | 0.051 (0.038) | -0.017 (0.038) | 0.006 (0.009) |
| Within CCT EA Control | -0.155 (0.152) | 0.078 (0.326) | -0.0923** (0.036) | 0.014 (0.048) | 0.115 (0.130) | -0.058 (0.192) | 0.011 (0.027) | 0.056 (0.034) | 0.011 (0.030) | 0.007 (0.011) |
| Within UCT EA Control | -0.150 (0.236) | 0.392* (0.234) | -0.074 (0.049) | 0.055 (0.059) | 0.012 (0.170) | 0.136 (0.171) | 0.015 (0.036) | 0.003 (0.057) | 0.019 (0.025) | 0.005 (0.021) |
| Mean in Pure Control | 6.432 | 0.581 | 0.343 | 0.616 | 15.252 | 7.479 | 0.842 | 0.705 | 0.797 | 0.023 |
| Observations | 2,651 | 2,651 | 2,651 | 2,651 | 2,653 | 2,652 | 2,653 | 2,648 | 2,653 | 2,652 |

PANEL B: Linear Slope Term.

| Dependent Variable: | Household Size (1) | Asset Index (2) | Female Headed Household (3) | Mobile Phone Ownership (4) | Age (5) | Highest Grade at Baseline (6) | Mother Alive (7) | Father Alive (8) | Never had Sex (9) | Ever Pregnant (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| CCT | -0.691 (0.489) | -1.242** (0.498) | -0.027 (0.087) | -0.273*** (0.091) | -0.176 (0.277) | -0.408 (0.357) | -0.023 (0.072) | 0.105 (0.083) | -0.057 (0.054) | 0.032 (0.029) |
| UCT | -0.025 (0.377) | 0.374 (0.506) | -0.164* (0.087) | -0.021 (0.201) | 0.377 (0.372) | 0.613* (0.326) | -0.031 (0.068) | 0.058 (0.097) | -0.063 (0.097) | 0.019 (0.025) |
| Within CCT EA Control | -0.176 (0.195) | 0.002 (0.523) | -0.104*** (0.035) | -0.006 (0.060) | 0.065 (0.175) | 0.108 (0.243) | -0.037 (0.032) | 0.038 (0.036) | 0.001 (0.044) | 0.007 (0.018) |
| Within UCT EA Control | 0.333 (0.631) | -0.566 (0.482) | -0.054 (0.127) | -0.107 (0.152) | 0.329 (0.472) | 0.659* (0.385) | 0.082 (0.103) | -0.110 (0.157) | 0.0965* (0.050) | -0.062 (0.054) |
| EA saturation, CCT Treatment | 0.873 (0.556) | 2.068*** (0.623) | -0.082 (0.117) | 0.314*** (0.113) | -0.098 (0.306) | 0.248 (0.385) | -0.025 (0.074) | -0.141 (0.109) | 0.068 (0.068) | -0.037 (0.032) |
| EA saturation, UCT Treatment | 0.348 (0.425) | 0.123 (0.524) | 0.124 (0.121) | -0.016 (0.219) | -0.340 (0.437) | -0.370 (0.451) | 0.044 (0.081) | -0.010 (0.119) | 0.066 (0.126) | -0.018 (0.029) |
| EA saturation, Within-CCT Control | 0.074 (0.404) | 0.263 (1.086) | 0.039 (0.089) | 0.068 (0.146) | 0.169 (0.361) | -0.571 (0.566) | 0.167** (0.071) | 0.065 (0.090) | 0.033 (0.101) | -0.002 (0.041) |
| EA saturation, Within-UCT Control | -1.061 (1.130) | 2.099** (0.809) | -0.045 (0.325) | 0.354 (0.311) | -0.696 (0.979) | -1.148 (0.781) | -0.147 (0.181) | 0.249 (0.250) | -0.170 (0.132) | 0.147 (0.155) |
| Mean in Pure Control | 6.432 | 0.581 | 0.343 | 0.616 | 15.252 | 7.479 | 0.842 | 0.705 | 0.797 | 0.023 |
| Observations | 2,651 | 2,651 | 2,651 | 2,651 | 2,653 | 2,652 | 2,653 | 2,648 | 2,653 | 2,652 |

Regressions are OLS using Round 1 data with robust standard errors clustered at the EA level. All regressions are weighted with both sampling and saturation weights to make the results representative of the target population in the study EAs.
Table 2. Linear Spillover Analysis

Cells report the coefficient with its standard error in parentheses; significance stars are attached to the standard errors as in the original table.

| Dependent Variable: | Terms Enrolled, OLS (1) | Terms Enrolled, IV (2) | Average Test Score, OLS (3) | Average Test Score, IV (4) | Ever Married, OLS (5) | Ever Married, IV (6) | Ever Pregnant, OLS (7) | Ever Pregnant, IV (8) |
|---|---|---|---|---|---|---|---|---|
| CCT | 0.133 (0.0454)*** | 0.250 (0.113)** | 0.026 (0.00819)*** | 0.054 (0.0174)*** | -0.004 (0.021) | 0.001 (0.074) | 0.034 (0.028) | 0.088 (0.071) |
| UCT | 0.071 (0.049) | 0.267 (0.0933)*** | 0.008 (0.014) | 0.009 (0.043) | -0.069 (0.0247)*** | -0.144 (0.0482)*** | -0.056 (0.0229)** | -0.125 (0.0563)** |
| Within CCT EA Control | 0.018 (0.048) | -0.003 (0.082) | 0.022 (0.014) | -0.008 (0.015) | 0.009 (0.022) | 0.007 (0.034) | 0.008 (0.025) | -0.006 (0.032) |
| Within UCT EA Control | -0.088 (0.077) | 0.011 (0.169) | -0.015 (0.024) | -0.068 (0.049) | -0.004 (0.031) | 0.036 (0.070) | -0.018 (0.028) | 0.078 (0.072) |
| True CCT treatment saturation, instrumented w/ assigned | | -0.238 (0.227) | | -0.061 (0.0352)* | | -0.011 (0.128) | | -0.113 (0.141) |
| True UCT treatment saturation, instrumented w/ assigned | | -0.386 (0.167)** | | -0.002 (0.076) | | 0.154 (0.115) | | 0.144 (0.110) |
| True CCT control saturation, instrumented w/ assigned | | 0.138 (0.324) | | 0.176 (0.108) | | 0.009 (0.124) | | 0.087 (0.166) |
| True UCT control saturation, instrumented w/ assigned | | -0.294 (0.623) | | 0.162 (0.129) | | -0.123 (0.180) | | -0.288 (0.191) |
| Mean in Pure Control | 2.639 | | 0.456 | | 0.176 | | 0.247 | |
| Observations | 2,579 | 2,579 | 2,612 | 2,612 | 2,649 | 2,649 | 2,650 | 2,650 |

Estimates for CCT (average R3 compliance rate of 77.4%, compliance defined as attending school regularly):

| | Terms Enrolled | Average Test Score | Ever Married | Ever Pregnant |
|---|---|---|---|---|
| Intention to Treat (ITT) | 0.133*** | 0.026*** | -0.004 | 0.034 |
| Treatment on Uniquely Treated (TUT) | 0.250** | 0.054*** | 0.001 | 0.088 |
| Spillovers on the Treated (ST) | -0.117 | -0.028* | -0.005 | -0.054 |
| Treatment on Compliers (ToC) | 0.167*** | 0.027** | -0.008 | 0.042 |
| Treatment on Unique Complier (TUC) | 0.323** | 0.070*** | 0.001 | 0.113 |
| Spillover on Compliers (SC) | -0.156 | -0.043** | -0.009 | -0.072 |
| Spillover on Non-Treated (SNT) | 0.018 | 0.022 | 0.009 | 0.008 |
| Total Causal Effect (TCE) | 0.082** | 0.021*** | -0.002 | 0.016 |

Estimates for UCT (average R3 compliance rate of 99%, compliance defined as receiving transfer):

| | Terms Enrolled | Average Test Score | Ever Married | Ever Pregnant |
|---|---|---|---|---|
| Intention to Treat (ITT) | 0.071 | 0.008 | -0.069*** | -0.056** |
| Treatment on Uniquely Treated (TUT) | 0.267*** | 0.009 | -0.144*** | -0.125** |
| Spillovers on the Treated (ST) | -0.196** | -0.001 | 0.075 | 0.069 |
| Treatment on Compliers (ToC) | 0.072 | 0.008 | -0.069*** | -0.056** |
| Treatment on Unique Complier (TUC) | 0.270*** | 0.009 | -0.145*** | -0.126** |
| Spillover on Compliers (SC) | -0.197** | -0.001 | 0.076 | 0.070 |
| Spillover on Non-Treated (SNT) | -0.088 | -0.015 | -0.004 | -0.018 |
| Total Causal Effect (TCE) | -0.002 | -0.003 | -0.039** | -0.036* |

Test scores are standardized to mean zero and standard deviation one in the control group. Odd-numbered columns are OLS regressions, even-numbered columns are IV, using Round 3 data with robust standard errors clustered at the EA level. All regressions except for the TCE are weighted with both sampling and saturation weights to make the results representative of the target population in the study EAs; TCE regressions use sampling weights only. TCE estimated through separate regression of outcomes on a dummy for treatment at the EA level. Baseline values of the following variables are included as controls: age dummies, strata dummies, household asset index, highest grade attended, and an indicator for ever had sex. Significance levels for cross-equation F-tests calculated using multiple equation two-step GMM estimation. Parameter estimates statistically different than zero at 99 percent (***), 95 percent (**), and 90 percent (*) confidence.
Table 3. Granular Spillover Analysis

Cells report the coefficient with its standard error in parentheses; significance stars are attached to the standard errors as in the original table.

| Dependent Variable: | True Saturation in Cell (1) | Terms Enrolled (2) | Average Test Score (3) | Ever Married (4) | Ever Pregnant (5) |
|---|---|---|---|---|---|
| CCT 33% | 0.244 (0.0167)*** | 0.179 (0.0846)** | 0.050 (0.0119)*** | 0.021 (0.056) | 0.086 (0.0493)* |
| CCT 66% | 0.372 (0.0487)*** | 0.185 (0.0548)*** | 0.019 (0.00881)** | -0.041 (0.0244)* | 0.007 (0.043) |
| CCT 100% | 0.657 (0.101)*** | 0.084 (0.063) | 0.019 (0.012) | 0.004 (0.021) | 0.025 (0.040) |
| UCT 33% | 0.246 (0.0208)*** | 0.195 (0.0585)*** | 0.007 (0.029) | -0.105 (0.0305)*** | -0.082 (0.0380)** |
| UCT 66% | 0.507 (0.0452)*** | -0.003 (0.085) | 0.012 (0.017) | -0.076 (0.0348)** | -0.089 (0.0386)** |
| UCT 100% | 0.691 (0.0674)*** | 0.022 (0.070) | 0.007 (0.019) | -0.038 (0.035) | -0.022 (0.026) |
| Spillover CCT 0% | -0.011 (0.030) | 0.000 (0.085) | 0.001 (0.014) | 0.011 (0.035) | 0.010 (0.032) |
| Spillover CCT 33% | 0.247 (0.0163)*** | 0.027 (0.056) | 0.010 (0.008) | -0.004 (0.038) | -0.038 (0.033) |
| Spillover CCT 66% | 0.381 (0.0490)*** | 0.036 (0.073) | 0.066 (0.0376)* | 0.017 (0.029) | 0.045 (0.053) |
| Spillover UCT 33% | 0.244 (0.0207)*** | -0.058 (0.063) | -0.032 (0.029) | 0.009 (0.038) | 0.013 (0.035) |
| Spillover UCT 66% | 0.510 (0.0447)*** | -0.138 (0.163) | 0.015 (0.039) | -0.027 (0.046) | -0.071 (0.0389)* |
| Observations | 2,650 | 2,579 | 2,612 | 2,649 | 2,650 |
| R-Squared | 0.820 | 0.099 | 0.418 | 0.144 | 0.200 |

Test scores have been standardized to have a mean of zero and a standard deviation of one in the control group. Regressions are OLS models using Round 3 data with robust standard errors clustered at the EA level. All regressions are weighted with both sampling and saturation weights to make the results representative of the target population in the study EAs. Baseline values of the following variables are included as controls in the regression analyses: age dummies, strata dummies, household asset index, highest grade attended, and an indicator for ever had sex. Parameter estimates statistically different than zero at 99 percent (***), 95 percent (**), and 90 percent (*) confidence.
Table 4. Spillover Analysis in Social Networks

Cells report the coefficient with its standard error in parentheses; significance stars are attached to the standard errors as in the original table.

| | Terms Enrolled (1) | Average Test Scores (2) | Ever Married (4) | Ever Pregnant (5) |
|---|---|---|---|---|
| CCT | 0.164 (0.060)*** | 0.034 (0.024) | -0.017 (0.030) | 0.012 (0.037) |
| UCT | 0.122 (0.071)* | 0.036 (0.025) | -0.075 (0.024)*** | -0.065 (0.026)** |
| Within-Village Control CCT | 0.01 (0.063) | 0.016 (0.024) | 0.015 (0.030) | -0.003 (0.028) |
| Within-Village Control UCT | -0.12 (0.114) | -0.003 (0.019) | 0.005 (0.039) | -0.019 (0.048) |
| Number of Treated Friends for CCT Treatment Girls | -0.011 (0.055) | -0.021 (0.016) | 0.006 (0.023) | 0.015 (0.028) |
| Number of Treated Friends for UCT Treatment Girls | -0.031 (0.092) | -0.029 (0.019) | -0.006 (0.025) | -0.01 (0.024) |
| Number of Treated Friends for CCT Untreated Girls | 0.109 (0.106) | 0.008 (0.017) | -0.025 (0.047) | 0.044 (0.058) |
| Number of Treated Friends for UCT Untreated Girls | 0.196 (0.099)* | -0.002 (0.028) | -0.065 (0.042) | -0.052 (0.058) |
| Number of Treated Friends for Pure Control Girls | 0.021 (0.145) | -0.032 (0.023) | 0.036 (0.052) | 0.084 (0.076) |
| Number of friends who are dropouts | -0.15 (0.047)*** | -0.029 (0.008)*** | 0.085 (0.022)*** | 0.125 (0.022)*** |
| Number of friends in same cluster | 0.004 (0.015) | -0.02 (0.004)*** | 0.004 (0.006) | -0.006 (0.006) |
| 1 Matched Friend | -0.008 (0.046) | -0.025 (0.009)*** | 0.037 (0.019)** | 0.027 (0.022) |
| 2 Matched Friends | 0.001 (0.073) | 0.003 (0.014) | 0.055 (0.032)* | 0.049 (0.032) |
| 3 Matched Friends | -0.015 (0.098) | -0.003 (0.017) | 0.087 (0.038)** | 0.065 (0.041) |
| 4 Matched Friends | -0.123 (0.196) | 0.011 (0.030) | 0.08 (0.074) | 0.08 (0.089) |
| 5 Matched Friends | 0.227 (0.133)* | -0.016 (0.042) | -0.148 (0.057)** | -0.253 (0.066)*** |
| Constant | 2.639 (0.049)*** | 0.502 (0.012)*** | 0.133 (0.019)*** | 0.219 (0.019)*** |
| Observations | 2660 | 2620 | 2652 | 2653 |
| R-squared | 0.013 | 0.062 | 0.023 | 0.023 |

Analysis performed within social networks, as defined by the 'five closest friends' listed by core respondents at baseline. Regressions are OLS models with robust standard errors clustered at the EA level. All regressions are weighted with both sampling and saturation weights to make the results representative of the target population in the study EAs. Parameter estimates statistically different than zero at 99 percent (***), 95 percent (**), and 90 percent (*) confidence.
Table 5. Robustness check using cross-EA variation in treatment intensity

Cells report the coefficient with its standard error in parentheses; significance stars are attached to the standard errors as in the original table.

| Dependent Variable: | Terms Enrolled (1) | Terms Enrolled (2) | Average Test Score (3) | Average Test Score (4) | Ever Married (7) | Ever Married (8) | Ever Pregnant (9) | Ever Pregnant (10) |
|---|---|---|---|---|---|---|---|---|
| CCT | 0.119 (0.0431)*** | 0.126 (0.085) | 0.022 (0.00896)** | 0.008 (0.015) | 0.000 (0.023) | -0.023 (0.044) | 0.040 (0.026) | -0.011 (0.041) |
| UCT | 0.059 (0.050) | 0.052 (0.111) | 0.005 (0.013) | -0.019 (0.018) | -0.064 (0.0269)** | -0.090 (0.0484)* | -0.057 (0.0240)** | -0.114 (0.0508)** |
| Within CCT EA Control | 0.013 (0.047) | 0.016 (0.047) | 0.021 (0.014) | 0.023 (0.0134)* | 0.010 (0.023) | 0.011 (0.023) | 0.008 (0.026) | 0.008 (0.025) |
| Within UCT EA Control | -0.100 (0.074) | -0.095 (0.077) | -0.020 (0.023) | -0.015 (0.023) | 0.000 (0.034) | 0.002 (0.036) | -0.021 (0.029) | -0.020 (0.031) |
| # of treated EAs within 3 km | -0.021 (0.018) | -0.020 (0.020) | -0.005 (0.005) | -0.002 (0.006) | 0.005 (0.009) | 0.005 (0.012) | 0.004 (0.009) | 0.003 (0.011) |
| # of treated EAs between 3 & 6 km | 0.010 (0.013) | 0.019 (0.016) | 0.001 (0.003) | 0.006 (0.004) | -0.004 (0.006) | -0.002 (0.007) | -0.005 (0.006) | -0.003 (0.008) |
| # of total EAs within 3 km | 0.012 (0.012) | 0.011 (0.013) | 0.006 (0.00281)** | 0.004 (0.004) | -0.003 (0.006) | -0.004 (0.007) | 0.001 (0.006) | 0.002 (0.007) |
| # of total EAs between 3 & 6 km | -0.004 (0.007) | -0.008 (0.008) | -0.002 (0.002) | -0.004 (0.00216)* | 0.000 (0.003) | -0.001 (0.004) | 0.004 (0.003) | 0.002 (0.004) |
| Treated individual * # of treated EAs within 3 kilometers | | 0.003 (0.021) | | 0.005 (0.004) | | -0.004 (0.011) | | -0.007 (0.012) |
| Treated individual * # of treated EAs between 3 and 6 kilometers | | 0.001 (0.040) | | -0.006 (0.008) | | 0.013 (0.022) | | 0.009 (0.023) |
| Treated individual * # of total EAs within 3 kilometers | | -0.029 (0.026) | | -0.014 (0.00467)*** | | -0.007 (0.015) | | -0.004 (0.015) |
| Treated individual * # of total EAs between 3 and 6 kilometers | | 0.012 (0.014) | | 0.007 (0.00276)** | | 0.004 (0.008) | | 0.007 (0.007) |
| Observations | 2,579 | 2,579 | 2,612 | 2,612 | 2,649 | 2,649 | 2,650 | 2,650 |
| R-squared | 0.098 | 0.098 | 0.418 | 0.42 | 0.144 | 0.144 | 0.199 | 0.2 |

Regressions are OLS models using Round 3 data with robust standard errors clustered at the EA level. All regressions are weighted with both sampling and saturation weights to make the results representative of the target population in the study EAs. Baseline values of the following variables are included as controls in the regression analyses: age dummies, strata dummies, household asset index, highest grade attended, and an indicator for ever had sex. Parameter estimates statistically different than zero at 99 percent (***), 95 percent (**), and 90 percent (*) confidence.
%PROGRAM TO ESTIMATE POWER OF A RANDOMIZED SATURATION STUDY
%AUTHOR: Aislinn Bohren
%SUPPLEMENTAL MATERIAL TO "DESIGNING EXPERIMENTS TO MEASURE SPILLOVER
%EFFECTS" By S. Baird, A. Bohren, C. McIntosh, B. Ozler
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%USER INPUT
clear;clc;
n=20; %cluster size
C=5; %number of clusters
tau=0.5; %variance of cluster error
sigma=2; %variance of individual error
alpha=0.05 ; %significance
gamma=.9; %power
pi=[0,1/3,2/3]; %saturation bins
f=[1/3,1/3,1/3]; %distribution over bins
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%CALCULATIONS
varN=tau+sigma;
varCo=(n-1)*tau;
varC=n*tau+sigma;
%Note: eta=E[pi^2] here; so eta in paper is eta-mu^2
x=1;
powerT=zeros(1,length(x));
powerS=zeros(1,length(x));
powerTonly=zeros(1,length(x));
powerSonly=zeros(1,length(x));
powerT1=zeros(1,length(x));
powerT2=zeros(1,length(x));
powerS1=zeros(1,length(x));
powerS2=zeros(1,length(x));
MDSE_T=zeros(1,length(x));
MDSE_S=zeros(1,length(x));
MDSE_TAffine=zeros(1,length(x));
MDSE_SAffine=zeros(1,length(x));
MU=zeros(1,length(x));
ETA=zeros(1,length(x));
etaT=zeros(1,length(x));
for j=1:length(x);
%Calculate distribution statistics
mu_ind=zeros(1,length(pi));
p=zeros(1,length(pi));
eta_ind=zeros(1,length(pi));
rho=zeros(1,length(pi));
c_ind=zeros(1,length(pi));
d_ind=zeros(1,length(pi));
for i=1:length(pi);
mu_ind(i)=pi(i) *f(i);
eta_ind(i)=pi(i)^2 *f(i);
c_ind(i)=pi(i)^3 *f(i);
d_ind(i)=pi(i)^4 *f(i);
p(i)=(1-pi(i))*f(i);
rho(i)=(1-pi(i))^2 * f(i);
end;
mu=sum(mu_ind);
eta=sum(eta_ind);
c=sum(c_ind);
d=sum(d_ind);
psi=0;if pi(1)==0; psi=f(1);end;
MU(j)=mu;
ETA(j)=eta;
etaT(j)=(eta-mu^2)/(1-psi)-(psi/(1-psi)^2)*mu^2;
%Pooled S&T
A=[1,mu,1-mu-psi;mu,mu,0;1-mu-psi,0,1-mu-psi];
D=[1,mu,1-mu-psi;mu,eta,mu-eta;1-mu-psi,mu-eta,1-2*mu+eta-psi];
power=((1/(n*C)).*(varCo*A^(-1)*D*A^(-1)+varN*A^(-1))).^0.5;
disp('The pooled MDE_T is:')
powerT(j)=power(2,2)
disp('The pooled MDE_S is:')
powerS(j)=power(3,3)
%end;
%Pooled T only: need to correct n for proper comparison
A=[1,mu;mu,mu];
D=[1,mu;mu,eta];
power=((1/(n*C)).*(varCo*A^(-1)*D*A^(-1)+varN*A^(-1))).^0.5;
disp('The pooled MDE_T, including within-cluster controls in the counterfactual, is:')
powerTonly(j)=power(2,2)
%Pooled S only: need to correct n for proper comparison
A=[1,1-mu-psi;1-mu-psi,1-mu-psi];
D=[1,1-mu-psi;1-mu-psi,1-2*mu+eta-psi];
power=((1/(n*C)).*(varCo*A^(-1)*D*A^(-1)+varN*A^(-1))).^0.5;
powerSonly(j)=power(2,2);
if length(pi)>2;
%Non-parametric, 2 saturations (uses bins 2 and 3; bin 1 is the pure control)
%Each matrix row is on its own line so the literal parses as 5x5
A=[1,mu_ind(2),p(2),mu_ind(3),p(3);
mu_ind(2),mu_ind(2),0,0,0;
p(2),0,p(2),0,0;
mu_ind(3),0,0,mu_ind(3),0;
p(3),0,0,0,p(3)];
D=[1,mu_ind(2),p(2),mu_ind(3),p(3);
mu_ind(2),eta_ind(2),mu_ind(2)-eta_ind(2),0,0;
p(2),mu_ind(2)-eta_ind(2),rho(2),0,0;
mu_ind(3),0,0,eta_ind(3),mu_ind(3)-eta_ind(3);
p(3),0,0,mu_ind(3)-eta_ind(3),rho(3)];
%Variance-covariance matrix of the coefficients; SEs are roots of its diagonal
var2=(1/(n*C)).*(varCo*A^(-1)*D*A^(-1)+varN*A^(-1));
se2=sqrt(diag(var2));
powerT1(j)=se2(2);
powerS1(j)=se2(3);
powerT2(j)=se2(4);
powerS2(j)=se2(5);
%Slope SE = sqrt(Var(b_k)+Var(b_j)-2*Cov(b_k,b_j))/(pi_k-pi_j), as in the proof of Theorem 2
disp('The non-parametric MDSE_T is:')
MDSE_T(j)=sqrt(var2(2,2)+var2(4,4)-2*var2(4,2))/abs(pi(3)-pi(2))
disp('The non-parametric MDSE_S is:');
MDSE_S(j)=sqrt(var2(3,3)+var2(5,5)-2*var2(3,5))/abs(pi(3)-pi(2))
%Affine
A=[1,mu,eta,1-mu-psi,mu-eta;
mu,mu,eta,0,0;
eta,eta,c,0,0;
1-mu-psi,0,0,1-mu-psi,mu-eta;
mu-eta,0,0,mu-eta,eta-c];
D=[1,mu,eta,1-mu-psi,mu-eta;
mu,eta,c,mu-eta,eta-c;
eta,c,d,eta-c,c-d;
1-mu-psi,(mu-eta),(eta-c),1-2*mu+eta-psi,mu-2*eta+c;
mu-eta,(eta-c),(c-d),mu-2*eta+c,eta-2*c+d];
power3=((1/(n*C)).*(varCo*A^(-1)*D*A^(-1)+varN*A^(-1))).^0.5;
disp('The affine MDSE_T is:')
MDSE_TAffine(j)=power3(3,3)
disp('The affine MDSE_S is:')
MDSE_SAffine(j)=power3(5,5)
end;
end;