               the world bank economic review, vol. 15, no. 2 229–272


   What have we learned from a decade of empirical research on growth?

                      Growth Empirics and Reality
                        William A. Brock and Steven N. Durlauf

   This article questions current empirical practice in the study of growth. It argues that
   much of the modern empirical growth literature is based on assumptions about regres-
   sors, residuals, and parameters that are implausible from the perspective of both eco-
   nomic theory and the historical experiences of the countries under study. Many of these
   problems, it argues, are forms of violations of an exchangeability assumption that
   implicitly underlies standard growth exercises. The article shows that these implausible
   assumptions can be relaxed by allowing for uncertainty in model specification. Model
   uncertainty consists of two types: theory uncertainty, which relates to which growth
   determinants should be included in a model; and heterogeneity uncertainty, which
   relates to which observations in a data set constitute draws from the same statistical model.
   The article proposes ways to account for both theory and heterogeneity uncertainty.
   Finally, using an explicit decision-theoretic framework, the authors describe how one
   can engage in policy-relevant empirical analysis.



   There are more things in heaven and earth, Horatio,
   Than are dreamt of in your philosophy.          —William Shakespeare
                                                    Hamlet, act 1, scene 5
The objective of this article is ambitious—to outline a perspective on empirical
growth research that will both address some of the major criticisms to which
this research has been subjected and facilitate policy-relevant empirics. It is no
exaggeration to say that the endogenous growth models pioneered in Romer
(1986, 1990) and Lucas (1988) have produced a sea change in the sorts of ques-
tions around which macroeconomic research is focused. In empirical macroeco-
nomics, efforts to explain cross-country differences in growth behavior since
World War II have become a predominant area of research. The implications of
this work for policymakers are immense. For example, strong links exist between
national growth performance and international poverty and inequality. Differ-
    William A. Brock and Steven N. Durlauf are with the Department of Economics, University of
Wisconsin. Their e-mail addresses are wbrock@ssc.wisc.edu and sdurlauf@ssc.wisc.edu. The authors
thank the National Science Foundation, John D. and Catherine T. MacArthur Foundation, Vilas Trust,
and Romnes Trust for financial support. They thank François Bourguignon both for initiating this re-
search and for helpful suggestions, as well as Gernot Doppelhofer, Paul Evans, Cullen Goenner, Andros
Kourtellos, Artur Minkin, Eldar Nigmatullin, Xavier Sala-i-Martin, Robert Solow, seminar participants
at Carnegie-Mellon and Pittsburgh, and three anonymous referees for helpful comments. Chih Ming
Tan has provided superb research assistance. Special thanks go to William Easterly and Ross Levine
for sharing data and helping with replication of their results. An earlier version of this article was
presented at the World Bank conference “What Have We Learned from a Decade of Empirical Research
on Growth?” held on 26 February 2001.
   © 2001 The International Bank for Reconstruction and Development /         THE WORLD BANK




ences in per capita income across countries are substantially larger than those
within countries; Schultz (1998) concludes that two-thirds of (conventionally
measured) inequality across individuals internationally is due to intercountry
differences, so that efforts to reduce international inequality naturally focus on
cross-country growth differences. In turn, the academic community has used this
new empirical work as the basis for strong policy recommendations. A good
example is Barro (1996). Based on a linear cross-country growth regression of
the type so standard in this literature, Barro (1996, p. 24) concludes that
  The analysis has implications for the desirability of exporting democratic
  institutions from the advanced western economies to developing nations.
  The first lesson is that more democracy is not the key to economic growth.
  . . . The more general conclusion is that advanced western countries would
  contribute more to the welfare of poor nations by exporting their eco-
  nomic systems, notably property rights and free markets, rather than their
  political systems.
   Yet there is widespread dissatisfaction with conventional empirical methods
of growth analysis. Many critiques of growth econometrics have appeared in
recent years. A typical example is Pack (1994, pp. 68–69), who described a
litany of problems with cross-country growth regressions:

  Once both random shocks and macroeconomic policy variables are recog-
  nized as important, it is no longer clear how to interpret many of the expla-
  nations of cross-country growth. . . . Many of the right hand side variables
  are endogenous. . . . The production function interpretation is further
  muddled by the assumption that all countries are on the same international
  production frontier . . . regression equations that attempt to sort out the
  sources of growth also generally ignore interaction effects. . . . The recent spate
  of cross-country growth regressions also obscures some of the lessons that
  have been learned from the analysis of policy in individual countries.

Another is Schultz (1999, p. 71): “Macroeconomic studies of growth often seek
to explain differences in economic growth rates across countries in terms of lev-
els and changes in education and health human capital, among other variables.
However, these estimates are plagued by measurement error and specification
problems.” In fact, it seems no exaggeration to say that the growth literature in
economics is notable for the large gaps that persist between theory and empirics.
A recent (and critical) survey of the empirical literature, Durlauf and Quah (1999,
p. 295), concludes that
  the new empirical growth literature remains in its infancy. While the litera-
  ture has shown that the Solow model has substantial statistical power in
  explaining cross-country growth variation, sufficiently many problems exist
  with this work that the causal significance of the model is not clear. Fur-
  ther, the new stylized facts of growth, as embodied in nonlinearities and
  distributional dynamics have yet to be integrated into full structural econo-
  metric analysis.

    Our purposes in this article are threefold. First, we attempt to identify some
general methodological problems that we believe explain the widespread mis-
trust of growth regressions. Although the factors we identify are not exhaustive,
they do represent many of the most serious criticisms of conventional growth
econometrics of which we are aware. These problems are important enough to
at best seriously qualify and at worst invalidate many of the standard claims made
in the new growth literature concerning the identification of economic structure.
In particular, we argue that causal inferences as conventionally drawn in the
empirical growth literature require certain statistical assumptions that may eas-
ily be argued to be implausible. This assertion holds from the perspective of both
economic theory and the historical experiences of the countries under study. We
further argue that a major source of skepticism about the empirical growth lit-
erature, and one that incorporates many of the usual criticisms, is the failure of
certain statistical conditions representing forms of a property known as exchange-
ability to hold in conventional empirical growth exercises.
    Second, we argue that the exchangeability failures underlying many criticisms
of growth models may be constructively dealt with through explicit attention to
model uncertainty in the formulation of growth regressions; see Temple (2000)
for a complementary analysis. What we mean is the following: In estimating a
particular regression, the inferences are made conditional both on the data and
on the specification of the regression. The exchangeability objection to a regres-
sion amounts to questioning whether the specification of the regression is cor-
rect. The assumption that a particular specification is correct can be relaxed by
treating model specification as an additional unknown feature of the data, that
is, by explicitly incorporating model uncertainty in the statistical analysis. In
taking this approach, we follow some important recent developments in the
empirical growth literature—Fernandez, Ley, and Steel (1999) and Doppelhofer,
Miller, and Sala-i-Martin (2000)—in endorsing the use of Bayesian methods to
address explicitly the model uncertainty that we believe underlies the mistrust
of conventional growth regressions. This analysis does not address all the criti-
cisms we describe in the first part of the article. In particular, we argue that
questions of causality versus correlation, which are of first-order importance in
interpreting growth regressions, may only be addressed using substantive infor-
mation that originates outside the models under analysis. Nevertheless, account-
ing for exchangeability can strengthen the confidence that may be attached to
causal interpretations of regression exercises.
    Third, we argue that the appropriate use of empirical growth analyses for
policy analysis requires an explicit decision-theoretic formulation. Current empirical
practice in growth is therefore not “policy-relevant” in the sense that the
policy inferences of a given data analysis are decoupled from the analysis itself.
For example, one often sees a statistically insignificant coefficient used as evi-
dence that some policy is not important for growth or, conversely, the assertion
that statistical significance establishes the importance of some policy. We argue
that these types of claims are not appropriate. Ideally, empirical growth exer-
cises should employ cross-country growth data to compute predictive distribu-
tions for the consequences of policy outcomes, distributions that can then be
combined with a policymaker’s welfare function to assess alternative policy sce-
narios. A decision-theoretic approach to evaluating growth regressions can pro-
vide a better measure of the level of the evidence inherent in the available data,
especially for the construction of policy-relevant predictive structures through
empirical growth analyses.
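   A minimal Python sketch of this point follows. Everything in it is a hypothetical assumption rather than a result from the article: the predictive distribution of a policy's growth effect is taken to be normal with mean 0.5 and standard deviation 0.4, the policy costs 0.1 in the same units, and the policymaker's welfare function is linear (risk neutral). The function name `expected_net_gain` is likewise illustrative.

```python
import random
import statistics


def expected_net_gain(mean, sd, cost, draws=100_000, seed=0):
    """Monte Carlo expected welfare gain from adopting a policy.

    Assumes (hypothetically) a normal predictive distribution for the
    growth effect and a linear welfare function: welfare = effect - cost.
    """
    rng = random.Random(seed)
    gains = [rng.gauss(mean, sd) - cost for _ in range(draws)]
    return statistics.fmean(gains)


# Hypothetical inputs, not estimates from any data set.
mean, sd, cost = 0.5, 0.4, 0.1

t_ratio = mean / sd          # 1.25, below the conventional 1.96 threshold
gain = expected_net_gain(mean, sd, cost)
adopt = gain > 0             # decision rule: adopt if expected net gain is positive

print(f"t-ratio = {t_ratio:.2f}, expected net gain = {gain:.3f}, adopt = {adopt}")
```

Under these assumed numbers the coefficient would conventionally be reported as insignificant, yet the expected net gain is positive. A risk-averse welfare function or a higher cost could reverse the decision, which is why the welfare function has to be stated explicitly.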
   The title of this article intentionally echoes the classic Sims (1980) critique of
macroeconometric models. The growth literature does not suffer from the exact
type of “incredible” assumptions (Sims 1980) that were required to identify economic
structure through 1960s-style simultaneous equation models and whose
interpretation Sims was attacking. Yet this literature does rely on assumptions
that may be argued to be equally dubious and whose implausibility renders the
inferences typically claimed by empirical workers to be equally suspect.1 As will
be clear from our discussion, this article only begins to scratch the surface of a
policy-relevant growth econometrics. Our hope is that the ideas herein will fa-
cilitate new directions in growth research.
   At the same time, our purpose is not to argue that statistical analyses of cross-
country growth data are incapable of providing insights. Regression and other
forms of statistical analysis have several critical roles in the study of growth. One
role is the identification of interesting data patterns, patterns that can both stimu-
late economic theory and suggest directions along which to engage in country-
specific studies. Quah’s work (1996a, 1996b, 1997) is exemplary in this regard.
However, we focus explicitly on the role of empirical work in formulating policy
recommendations. In particular, a second goal of this article is to explore how
one can, by casting empirical analysis in an explicitly decision-theoretic frame-
work, develop firmer insights into the growth process.
   Throughout we will take an eclectic stance on how one should go about data
analysis. Many of our ideas are derived from the Bayesian statistics literature.
Yet the basic arguments we make are relevant to frequentist analyses. Our view
of data analysis is essentially pragmatic. Data analyses of the sort that are con-
ventional in economics should be thought of as evidence-gathering exercises
aimed at facilitating the evaluation of hypotheses and the development of policy-
relevant predictions for future trajectories of variables of interest. For example,
one starts with a proposition such as “the level of democracy in a country causally
influences the level of economic growth.” Once this statement is mathematically
instantiated (which means that ceteris paribus conditions are formalized, a
more or less convincing theoretical model or set of models of causal influence is
formulated in a form suitable for econometric implementation, etc.), the purpose
of an empirical exercise is to see whether the statement is more or less plausible
once the analysis has been conducted. The success or failure of an empirical
exercise rests on whether one’s prior views of the proposition have been altered
by the analysis and on whether the level of uncertainty around a conclusion is
low enough for the conclusion to be of policy relevance. Our position is that
one should evaluate statistical procedures on the basis of whether they successfully
answer the questions for which they are employed; we are unconcerned, at
least in this article, with abstract issues that distinguish frequentist and Bayesian
approaches, for example.

    1. A number of the issues we raise echo, at least in spirit, Freedman (1991, 1997), who has made
serious criticisms of the use of regressions to uncover causal structure in the social sciences.

   Many of our criticisms of the empirical growth literature apply in principle to
other empirical contexts. They take on particular force in the growth context
because of the complexity of the objects under study, the poor data available for
empirical growth work, and the qualitative nature of the theories that drive the
new growth literature.

                                 I. A Baseline Regression

The bulk of modern empirical work on growth has focused on cross-country
growth regressions of the type pioneered by Barro (1991) and Mankiw, Romer,
and Weil (1992). Although recent work has extended growth analysis to con-
sider panels (Evans 1998; Islam 1995; Lee, Pesaran, and Smith 1997), the argu-
ments we make relating to conventional empirical growth practice as well as our
proposed alternative approach are generally relevant to that context too, so long
as cross-section variation is needed for parameter identification. Hence we focus
on cross-sections.
   A generic form for various cross-country growth regressions is
(1)                          $g_i = X_i \gamma + Z_i \pi + \varepsilon_i$
where $g_i$ is real per capita growth in economy $i$ over a given period (typically
measured as the change in per capita income between the beginning and end of
the sample divided by the number of years that have elapsed). We have divided
the regressors into two types. $X_i$ represents variables whose presence is suggested
by the Solow growth model: a constant, initial income and a set of country-specific
savings and population growth rate controls. The Solow model is often treated
as a baseline from which to build up more elaborate growth models, hence these
variables tend to be common across studies. $Z_i$, in contrast, consists of variables
chosen to capture additional causal growth determinants that a researcher believes
are important and so generally differs across analyses.2

   2. See Galor (1996) for a discussion of the implications of different growth theories for convergence
and Bernard and Durlauf (1996) for an analysis of the economic and statistical meanings of convergence.

   Though this regression is typically applied to national aggregates, it can in
principle be applied to regions or sectors once $g_i$ is reinterpreted as a vector of
growth rates within a country. This is particularly important for policy analysis
when a given policy may affect different regions or population groups differ-
ently. Our conjecture is that such decompositions are important when evaluat-
ing growth policies with significant distributional consequences.
   In our discussion we assume that the motivation for the estimation of a re-
gression of the type given in equation 1 is policy driven. Specifically, we assume
that a policymaker is interested in using this equation to advise some country i
on whether it should change some policy instrument z. If the policymaker’s ob-
jective function depends on the growth rate in country i, he will presumably need
to understand the country’s overall growth process and hence to make inferences
about a number of aspects of equation 1 in addition to $\pi_z$, the coefficient on the
policy instrument. We return to this issue in section VI.
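   As a rough illustration of equation 1, the following Python sketch generates synthetic data and estimates the regression by OLS with NumPy. The sample size, the variable distributions, and the "true" coefficients are all made up for the example; only the split of the regressors into a Solow block and an additional policy variable follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100  # hypothetical number of countries

# Solow block X_i: constant, log initial income, log saving rate,
# log population growth (all distributions are invented for illustration).
X = np.column_stack([
    np.ones(n),
    rng.normal(8.0, 1.0, n),
    rng.normal(-1.7, 0.5, n),
    rng.normal(-2.7, 0.2, n),
])
# Additional determinants Z_i: a single hypothetical policy variable z.
Z = rng.normal(0.0, 1.0, (n, 1))

gamma_true = np.array([0.10, -0.01, 0.02, -0.02])  # made-up values
pi_true = np.array([0.005])                        # made-up value
g = X @ gamma_true + Z @ pi_true + rng.normal(0.0, 0.01, n)

# OLS on the stacked regressor matrix [X, Z].
W = np.hstack([X, Z])
coef, *_ = np.linalg.lstsq(W, g, rcond=None)
resid = g - W @ coef
sigma2 = resid @ resid / (n - W.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(W.T @ W)))

gamma_hat, pi_hat = coef[:4], coef[4:]
print("gamma_hat:", gamma_hat.round(4))
print("pi_z_hat:", pi_hat.round(4), "s.e.:", se[4:].round(4))
```

The last line reports the estimate and conventional standard error of the coefficient on the policy instrument, which is the quantity a policymaker would typically look at first.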

                            II. Econometric Issues

In this section we discuss three problems with the use of the baseline equation 1
in policymaking or other exercises in which one wishes to give a structural inter-
pretation to this regression. These problems all, at one level, occur because of
violations of the assumptions necessary to estimate equation 1 using ordinary
least squares (OLS) and interpret the estimated equation as the structural model
of growth dynamics implied by the augmented Solow model. Each of these criti-
cisms ultimately reduces to questioning whether growth regressions as conven-
tionally analyzed can provide the causal inferences that motivate such analyses.
As discussed in the introduction, growth regressions have been subjected to a
wide range of criticisms from many authors. We do not claim that any of the
criticisms are necessarily original to us; instead, we believe our contribution in
this section lies in the way we organize and unify these criticisms.
                          Open-Endedness of Theories
A fundamental problem with growth regressions is determining what variables to
include in the analysis. This problem occurs because growth theories are open-
ended. By open-endedness, we refer to the idea that the validity of one causal theory
of growth does not imply the falsity of another. So, for example, a causal relation-
ship between inequality and growth has no implications for whether a causal re-
lationship exists between trade policy and growth. As a result, well over 90 differ-
ent variables have been proposed as potential growth determinants (Durlauf and
Quah 1999), each of which has some ex ante plausibility. As there are at best about
120 countries available for analysis in cross-sections (the number may be far smaller
as a result of missing observations on some covariates), it is far from obvious how
to formulate firm inferences about any particular explanation of growth.
   This issue of open-endedness has not been directly dealt with in the literature.
Instead, a number of researchers have proposed ways to deal with the robust-
ness of variables in growth regressions. The basic idea of this approach is to
identify a set of potential control variables for inclusion in equation 1 as elements
of $Z_i$. Inclusion of a variable in the final choice of $Z_i$ requires that its associated
coefficient prove to be robust with respect to the inclusion of other
variables. Levine and Renelt (1992) introduced this idea to the growth litera-
ture, employing Edward Leamer’s ideas on extreme bounds analysis (see Leamer
1983 and Leamer and Leonard 1983). In extreme bounds analysis, a coefficient
is robust if the sign of its OLS estimate stays constant across a set of regressions
representing different possible combinations of other variables. Sala-i-Martin
(1997), arguing that extreme bounds analysis is likely to lead to the rejection of
variables that do influence growth, proposes computing likelihood-weighted
significance levels of coefficients across alternative regressions.
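   The sign-stability criterion described above can be sketched in Python as follows. This is a simplified reading of extreme bounds analysis that checks only whether the point estimate keeps one sign; the function `sign_robust`, the cap of three controls per regression, and the synthetic data are assumptions made for illustration, not Levine and Renelt's actual procedure.

```python
import itertools

import numpy as np


def ols_coef(y, W):
    """Return OLS coefficients of y on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    return coef


def sign_robust(y, X, focus, controls, max_controls=3):
    """Check whether the focus coefficient keeps one sign across all
    regressions that add up to `max_controls` of the candidate controls.

    X        : (n, k) always-included regressors (e.g. the Solow block)
    focus    : (n,) variable of interest
    controls : (n, m) pool of candidate control variables
    """
    m = controls.shape[1]
    estimates = []
    for size in range(max_controls + 1):
        for subset in itertools.combinations(range(m), size):
            W = np.column_stack([X, focus, controls[:, list(subset)]])
            estimates.append(ols_coef(y, W)[X.shape[1]])  # focus coefficient
    lo, hi = min(estimates), max(estimates)
    return (lo > 0 or hi < 0), lo, hi


# Synthetic example: the focus variable has a hypothetical true effect of 1.
rng = np.random.default_rng(1)
n = 100
X = np.ones((n, 1))
controls = rng.normal(size=(n, 5))
focus = rng.normal(size=n)
y = 1.0 * focus + controls[:, 0] + rng.normal(size=n)

robust, lo, hi = sign_robust(y, X, focus, controls)
print(robust, round(lo, 3), round(hi, 3))
```

Here the controls are independent of the focus variable by construction, so the estimate is stable. The picture changes once the controls are correlated with the focus variable.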
   These proposals for dealing with the plethora of growth theories are useful,
but neither is definitive as a way to evaluate model robustness.3 The reason is
simple. In these approaches a given coefficient will prove not to be robust if its
associated variable is highly collinear with variables suggested by other candi-
date growth theories. This is obvious for the Sala-i-Martin approach, because
collinearity affects significance levels. It is also true for extreme bounds analy-
sis, in the sense that a given coefficient is likely to be highly unstable when alter-
native collinear regressors are included alongside its corresponding regressor.
Hence these procedures will give sensible answers only if lack of collinearity is a
“natural�? property for a regressor that causally influences growth. Yet when one
thinks about theories of how various causal determinants of growth are them-
selves determined, it is clear that collinearity is a property that one might expect
to hold for important causal determinants of growth.4 This is easiest to see by
considering a recursive model for growth. Suppose that growth is causally determined
by a single regressor, $d_i$, and that this regressor in turn depends causally
on a third regressor, $c_i$, so that
(2)                          $g_i = d_i \gamma_d + \varepsilon_i$
                             $d_i = c_i \pi_c + \eta_i$.
It is easy to construct cases (which will depend on the covariance structure of $c_i$,
$\varepsilon_i$, and $\eta_i$) in which adding $c_i$ to the growth equation will render $d_i$ fragile.
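   A small simulation makes this concrete. The sketch below, in Python with NumPy, draws data from the recursive model in equation 2 under assumed parameter values (both slope coefficients equal to 1, and a disturbance in the second equation with a standard deviation of only 0.05, so that d is almost collinear with c). It then compares the standard error of the coefficient on d with and without c in the growth regression. All numbers and the helper `ols_with_se` are hypothetical.

```python
import numpy as np


def ols_with_se(y, W):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    resid = y - W @ coef
    sigma2 = resid @ resid / (len(y) - W.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(W.T @ W)))
    return coef, se


rng = np.random.default_rng(0)
n = 100
gamma_d, pi_c = 1.0, 1.0          # hypothetical true values

c = rng.normal(0.0, 1.0, n)
eta = rng.normal(0.0, 0.05, n)    # small var(eta): d is almost collinear with c
eps = rng.normal(0.0, 1.0, n)
d = pi_c * c + eta                # second line of equation 2
g = gamma_d * d + eps             # first line of equation 2

# Growth regression on d alone, then on d and c together (no constant,
# matching equation 2).
coef_short, se_short = ols_with_se(g, d[:, None])
coef_long, se_long = ols_with_se(g, np.column_stack([d, c]))

print("d only : coef %.2f, s.e. %.2f" % (coef_short[0], se_short[0]))
print("d and c: coef %.2f, s.e. %.2f" % (coef_long[0], se_long[0]))
```

In this design d is the only direct cause of growth, yet including c inflates the standard error on d roughly by the ratio of the standard deviation of d to that of the second-equation disturbance, so its estimated sign can easily change from sample to sample.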
    Important recent papers by Doppelhofer, Miller, and Sala-i-Martin (2000) and
Fernandez, Ley, and Steel (2001) have proposed ways to deal with regressor


    3. Leamer’s work on model uncertainty falls into two parts: a powerful demonstration of the im-
portance of accounting for such uncertainty in making empirical claims, and a specific suggestion, extreme
bounds analysis, for determining when regressors are fragile. The first constitutes a fundamental set of
ideas. The second is a particular way of instantiating Leamer’s deep ideas of accounting for model
uncertainty and is more easily subjected to criticism. By analogy, Rawls’s controversial use of minimax
arguments to infer what rules are just in a society does not diminish the importance of his idea of the
veil of ignorance. Economists have inappropriately used criticisms of extreme bounds analysis to ig-
nore the conceptual issues raised by Leamer’s work.
    4. Leamer is quite clear on this point. See Leamer (1978, p. 172) for further discussion.
choice and hence at least indirectly with model open-endedness through the use
of Bayesian model averaging techniques. We exploit the approach used in those
papers and therefore defer discussion of them until section V.
                                  Parameter Heterogeneity
A second problem with conventional growth analyses is the assumption of pa-
rameter homogeneity. The vast majority of empirical growth studies assume that
the parameters that describe growth are identical across countries. This assump-
tion is surely implausible. Does it really make sense to believe that a change in
the level of a civil liberties index has the same effect on growth in the United
States as in the Russian Federation? Although the use of panel data approaches
to growth has addressed one aspect of this problem by allowing for fixed effects
(Evans [1998] is particularly clear on this point), it has not addressed this more
general question.
   In some sense this criticism might seem unfair, as it presumably applies to any
socioeconomic data set. After all, economic theory does not imply that individual
units ought to be characterized by the same behavioral functions. That said, any
empirical analysis necessarily will require a set of interpretable statistical prop-
erties that are common across observations; when homogeneity assumptions are
or are not to be made is a matter of judgment. Our contention is that the assump-
tion of parameter homogeneity is particularly inappropriate in studying com-
plex heterogeneous objects, such as countries. See Draper (1997) for a general
discussion of these issues.
   Evidence of parameter heterogeneity has been developed in different contexts,
such as in Canova (1999); Desdoigts (1999); Durlauf and Johnson (1995); Durlauf,
Kourtellos, and Minkin (2000); Kourtellos (2000); and Pritchett (2000). These
studies use very different statistical methods, but each suggests that the assump-
tion of a single linear statistical growth model that applies to all countries is
incorrect.5 Put differently, the reporting of conditional predictive densities based
on the assumption that all countries obey a common linear model may under-
state the uncertainty present when the data are generated by a family of models;
Draper (1997) provides further analysis of this idea.
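   The following Python sketch gives a stylized illustration of the problem. It assumes, purely hypothetically, two groups of countries whose growth responds to a regressor with slopes of 0.8 and -0.2, and compares the within-regime OLS slopes with the slope from a pooled regression that imposes a common parameter. The regime labels, slopes, and noise level are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group = 50
slopes = {"regime A": 0.8, "regime B": -0.2}  # hypothetical regime-specific effects

x_all, g_all, fitted = [], [], {}
for name, slope in slopes.items():
    x = rng.normal(0.0, 1.0, n_per_group)
    g = slope * x + rng.normal(0.0, 0.3, n_per_group)
    fitted[name] = np.polyfit(x, g, 1)[0]     # within-regime OLS slope
    x_all.append(x)
    g_all.append(g)

# Pooled regression: the slope is forced to be common to all countries.
x_pool = np.concatenate(x_all)
g_pool = np.concatenate(g_all)
pooled_slope = np.polyfit(x_pool, g_pool, 1)[0]

print({k: round(float(v), 2) for k, v in fitted.items()},
      "pooled:", round(float(pooled_slope), 2))
```

The pooled slope lies between the two regime slopes and describes neither group, so predictions for any one country, and the precision attached to them, are conditional on a common model that may not apply to it.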
   There has been substantial interest in the empirical growth literature in in-
corporating forms of parameter heterogeneity when panel data are available.
Islam (1995) is an early analysis that allows constant terms to differ across country
growth processes for a panel in which growth is measured in five-year intervals.
In what appears to be the richest analysis of parameter heterogeneity to date,
Lee, Pesaran, and Smith (1997) show how to allow for parameter heterogeneity
for regressor slope parameters for a growth model employing annual data.



    5. Conventional growth analyses give some attention to parameter heterogeneity between rich and
poor countries: Barro (1996), for example, allows the effects of democracy on growth to differ between
rich and poor countries.

    The idea that panel data may be used to model rich forms of parameter het-
erogeneity is of course important; a comprehensive analysis is Pesaran and Smith
(1995). However, this approach is of limited use in empirical growth contexts,
because variation in the time dimension is typically small. This occurs for two
reasons. First, many of the variables used as proxies for new growth theories do
not vary over high frequencies. For some variables, such as political regime, this
is true by their nature; for others, this is due to measurement. In any event, this
means that cross-section variation must be used to uncover parameters. Second,
there is a conceptual question of the appropriate time horizon over which to
employ a growth model. High-frequency data will contain business cycle fac-
tors that are presumably irrelevant for long-run output movements. Hence it is
difficult to see how annual or biannual data, for example, can be interpreted in
terms of growth theories. In our view the use of long-run averages has a powerful
justification for identifying growth as opposed to cyclical factors.
                           Causality versus Correlation
A final source of skepticism about conventional growth empirics relates to a
problem endemic to all structural inference in social science—the question of
causality versus correlation. Many of the standard variables used to explain
growth patterns—democracy, trade openness, rule of law, social capital, and the
like—are as much outcomes of socioeconomic decisions and interrelationships
as growth itself is. Hence there is an a priori case that the use of OLS estimates of
the relationship between growth and such variables cannot be treated as structural
any more than coefficients produced by OLS regressions of price on quantity
can be. Yet the majority of empirical growth studies treat the various growth
controls as exogenous variables and so rely on ordinary or heteroskedasticity-
corrected least squares estimation. What is particularly ironic about the lack of
attention to endogeneity is that it was precisely this lack of attention in early
business cycle models that helped drive the development of rational expectations
econometrics.
   Recent econometric practice in growth has begun to employ instrumental
variables to control for regressor endogeneity. This is particularly common for
panel data sets where temporally lagged variables are treated as legitimate in-
struments. However, this trend toward using instrumental variables estimation
has not satisfactorily addressed this problem. The reason is that the failure to
account properly for the open-endedness of growth theories has important im-
plications for the validity of instrumental variables methods.
   What we mean by this is the following. For a regression of the form
(3)                          $y_i = R_i \gamma + \varepsilon_i$
the use of some set of instrumental variables $I_i$ as instruments for $R_i$ requires,
of course, that each element of $I_i$ be uncorrelated with $\varepsilon_i$. In the growth literature
this is not a condition typically employed to motivate the choice of instruments.
Instead, instruments are typically chosen exclusively because they are
in some sense exogenous, which operationally means that they are predetermined
with respect to $\varepsilon_i$. Predetermined variables, however, are not necessarily
valid instruments.
   As discussed in Durlauf (2000), a good example of this pitfall can be found in
Frankel and Romer (1996), which studies the relationship between trade and
growth. Frankel and Romer argue that because trade openness is clearly endog-
enous, it is necessary to instrument the trade openness variable in a cross-country
regression to consistently estimate the trade openness coefficient. To do this, they
use a geographic variable, area, as an instrument and argue in favor of its validity
that area is predetermined with respect to growth. Their argument that the in-
strument is predetermined is certainly persuasive. Nevertheless, it is hard to make
an argument that it is a valid instrument. Is it plausible that country land size is
uncorrelated with the omitted growth factors in their regression? The history
and geography literatures are replete with theories of how geography affects
political regime, development, and so on. For example, larger countries may be
more likely to be ethnically heterogeneous, leading to attendant social problems.
Alternatively, larger countries may have higher per capita military expenditures,
which means relatively greater shares of unproductive government investment,
higher distortionary taxes, or both. Our argument is not that any one of these
links is necessarily empirically salient, but that the use of land area as an instru-
ment presupposes the assumption that the correlations between land size and all
omitted growth determinants are in total negligible. It is difficult to see how such
an assumption can be defended when these omitted growth determinants are
neither specified nor evaluated.
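
The gap between a predetermined instrument and a valid one can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use Frankel and Romer's data, and every name and parameter value in it (`area`, `omitted`, `trade`, `leak`, the true coefficient 0.5) is hypothetical. The instrument is drawn before the error term, so it is predetermined, but when `leak` is nonzero it is also correlated with an omitted growth determinant.

```python
import numpy as np

def iv_estimate(y, x, z):
    """Simple IV estimator for one regressor and one instrument."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))

def simulate(n=200_000, beta=0.5, leak=0.0, seed=0):
    """Draw hypothetical data; `leak` ties the instrument to an omitted factor."""
    rng = np.random.default_rng(seed)
    area = rng.normal(size=n)                      # predetermined instrument
    omitted = leak * area + rng.normal(size=n)     # omitted growth determinant
    shock = rng.normal(size=n)                     # source of endogeneity
    trade = area + 0.8 * shock + rng.normal(size=n)
    growth = beta * trade + omitted + shock        # regression error = omitted + shock
    return iv_estimate(growth, trade, area)

valid = simulate(leak=0.0)      # instrument uncorrelated with omitted factors
invalid = simulate(leak=0.3)    # predetermined, yet correlated with them
print(f"true beta = 0.5, IV (valid) = {valid:.3f}, IV (invalid) = {invalid:.3f}")
```

In this design the IV estimate converges to the true coefficient plus the ratio of the instrument's covariance with the error to its covariance with the regressor, so even a modest correlation with omitted factors leaves a bias that does not shrink with sample size.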
   It is interesting to contrast the difficulties of identifying valid instruments in
growth contexts with the relative ease with which this is done in rational expec-
tations contexts. The reason for this difference is that rational expectations models
are typically closed in the sense that a particular theory will imply that some
combination of variables is a martingale difference with respect to some sequence
of information sets. For the purposes of data analysis, rational expectations
models therefore generate instrumental variables, that is, any variables observ-
able at the time expectations are formed, whose orthogonality to expectation
errors may be exploited to achieve parameter identification. Of course, rational
expectation models can be faulted for imposing sufficiently wide-ranging restric-
tions on the economic behavior under study that some of the assumptions nec-
essary for identification are not plausible; that is, for being insufficiently open-
ended in the sense we have described. So the problems associated with theory
open-endedness in growth are hardly nonexistent in other contexts.

                             III. Exchangeability

Inferences from any statistical model can only be made, of course, conditional
on various prior assumptions that translate the data under study into a particu-
lar mathematical structure. One way to evaluate the plausibility of inferences
drawn from empirical growth regressions is by assessing the plausibility of the
assumptions made in making this translation. In the empirical growth literature
it is easy to find examples where the assumptions employed to construct statistical
models are clearly untenable. For example, researchers typically assume that the
errors in a cross-section regression are jointly uncorrelated and orthogonal to the
model’s regressors.6 Do they really wish to argue that no omitted factors exist that
induce correlation across the innovations in the growth regressions associated with
the model? More generally, it is easy to see that parameter heterogeneity and
omitted variables, which, we argued in the previous section, are endemic to growth
regressions, can each lead to a violation of the error uncorrelatedness assumption,
the regressor orthogonality assumption, or both.
    On the other hand, econometrics has a long tradition of identifying mini-
mal sets of conditions under which coefficients and standard errors may be
consistently estimated. Examples include the emphasis on orthogonality conditions
between regressors and errors as the basis for OLS consistency (rather
than the interpretation of the OLS estimators as the maximum likelihood estimates
for a linear model with nonstochastic regressors and i.i.d. normal errors)
or the use of mixing conditions to characterize when central limit theorems
apply to dependent data (rather than the modeling of the series as a known
autoregressive moving average process). Hence any critique of cross-country
growth analyses that is based on the plausibility of particular statistical assump-
tions needs to argue that the violations of the assumptions invalidate the ob-
jectives of a given exercise.
    In this section we argue that of the three econometric issues we have raised, the
first two may be interpreted as examples of deviations of empirical growth models
from a statistical “ideal” that allows for the sorts of inferences researchers wish
to make in growth contexts. Our purpose is to establish a baseline for statistical
growth models such that if a model does not meet this standard, a researcher needs
to determine whether the reasons for this invalidate the goal of the empirical exer-
cise. Hence the baseline does not describe a necessary requirement for empirical
work, but instead helps define a strategy that we think empirical workers should
follow in formulating growth models. When a model does not meet this standard,
researchers should be prepared to argue that the violations of the standard do not
invalidate the empirical claims they wish to make. This standard is based on a
concept in probability known as exchangeability.7

    6. In the subsequent discussion, we focus on OLS estimation of growth regressions. In the empirical
growth literature examples can be found of heteroskedasticity corrections to relax assumptions of iden-
tical residual variances and instrumental variables to deal with violations of error/regressor orthogo-
nality. Our discussion is qualitatively unaffected by either of these alternatives to OLS.
    7. Bernardo and Smith (1994) provide a complete introduction to exchangeability. Draper and others
(1993) develop a detailed argument on the importance of exchangeability to statistical inference. Our
analysis is much indebted to their perspective.

                                              Basic Ideas
A formal definition of exchangeability is as follows.

Definition: Exchangeability. A sequence of random variables $\eta_i$ is exchangeable
if, for every finite collection $\eta_1, \ldots, \eta_K$ of elements of the sequence,
(4)    $\mu(\eta_1 = a_1, \ldots, \eta_K = a_K) = \mu(\eta_{\rho(1)} = a_1, \ldots, \eta_{\rho(K)} = a_K)$
where $\rho(\cdot)$ is any operator that permutes the K indices.8
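
The unconditional definition can be checked exactly on a textbook example that is not taken from this article: a Pólya urn. The sketch below assumes an urn that starts with one red and one black ball, with each drawn ball returned together with one more of the same color; the function name `polya_prob` is hypothetical. The draws are dependent, yet every reordering of a given sequence has the same probability.

```python
from fractions import Fraction
from itertools import permutations

def polya_prob(seq, red=1, black=1):
    """Exact probability of a 0/1 draw sequence from a Polya urn (1 = red)."""
    p = Fraction(1)
    for draw in seq:
        total = red + black
        if draw == 1:
            p *= Fraction(red, total)
            red += 1                  # return the ball plus one more red
        else:
            p *= Fraction(black, total)
            black += 1                # return the ball plus one more black
    return p

seq = (1, 1, 0, 1)
probs = {polya_prob(perm) for perm in permutations(seq)}
exchangeable = len(probs) == 1        # one probability shared by all reorderings

# Dependence: a red first draw makes a red second draw more likely.
p_first_red = polya_prob((1,))
p_both_red = polya_prob((1, 1))
independent = p_both_red == p_first_red ** 2
print(exchangeable, independent)
```

The example shows that exchangeability is a symmetry condition rather than an independence condition, which is why it can accommodate errors that share common unobserved influences.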
   Exchangeability is typically treated as a property of the unconditional prob-
abilities of random variables. In regression contexts, however, it is often more
natural to think in terms of the properties of random variables conditional on
some information set. For example, in a regression, one is interested in the prop-
erties of the errors conditional on the regressors. We therefore introduce a sec-
ond concept, F-conditional exchangeability.9

Definition: F-conditional exchangeability. For a sequence of random variables
$\eta_i$ and a collection of associated random vectors $F_i$, $\eta_i$ is F-conditionally
exchangeable if, for every finite collection $\eta_1, \ldots, \eta_K$ of elements of the sequence,
(5)    $\mu(\eta_1 = a_1, \ldots, \eta_K = a_K \mid \tilde{F}) = \mu(\eta_{\rho(1)} = a_1, \ldots, \eta_{\rho(K)} = a_K \mid \tilde{F})$
where $\rho(\cdot)$ is any operator that permutes the K indices and $\tilde{F} = \{F_1, \ldots, F_K\}$.
   If $F_i = \emptyset$ for all $i$ (the empty set), F-conditional exchangeability reduces to
exchangeability.
   Associated with exchangeability and F-conditional exchangeability is the idea
of partial exchangeability.

Definition: Partial exchangeability. A sequence of random variables $\eta_i$ is
partially exchangeable with respect to a sequence of random vectors $Y_i$ if, for
every finite collection $\eta_1, \ldots, \eta_K$ of elements of the sequence,
(6)    $\mu(\eta_1 = a_1, \ldots, \eta_K = a_K \mid Y_i = \bar{Y} \ \forall i \in \{1, \ldots, K\}) = \mu(\eta_{\rho(1)} = a_1, \ldots, \eta_{\rho(K)} = a_K \mid Y_i = \bar{Y} \ \forall i \in \{1, \ldots, K\})$
where $\rho(\cdot)$ is any operator that permutes the K indices.
  The key difference between exchangeability and partial exchangeability is the

    8. Throughout, $\mu(\cdot)$ is used to denote probability measures.
    9. F-conditional exchangeability was originally defined in Kallenberg (1982). Ivanoff and Weber
(1996) provide additional discussion. The notion of F-conditional exchangeability is rarely employed
in the statistics literature and is not mentioned in standard textbooks such as Bernardo and Smith (1994).
We believe the reason for this is that exchangeability analyses in the statistics literature generally focus
on whether the units under study are exchangeable, rather than whether they are exchangeable conditional
on certain characteristics, the more natural notion in economic contexts.

conditioning on common values of some random vectors $Y_i$ associated with the
$\eta_i$ in the partial exchangeability case. If $Y_i$ is a discrete variable, partial
exchangeability implies that a sequence may be decomposed into a finite or countable
number of exchangeable subsequences.
   Even though F-conditional exchangeability of model errors constitutes a stronger
assumption than is needed for many of the interpretations of OLS, this exchangeability
condition is nevertheless useful as a benchmark in the construction
and assessment of statistical models. We make this claim for two reasons.
First, this exchangeability concept helps organize discussions of the plausibility
of the invariance of conditional moments that lie at the heart of policy-relevant
predictive exercises. Draper (1987, p. 458) describes the critical role of exchangeability
in any predictive exercise:
   Predictive modeling is the process of expressing one’s beliefs about how
   the past and future are connected. These connections are established through
   exchangeability judgments: with what aspects of past experience will the
   future be more or less interchangeable, after conditioning on relevant fac-
   tors? It is not possible to avoid making such judgments; the only issue is
   whether to make them explicitly or implicitly.
Put in the context of growth analysis, the use of cross-country data to predict the
behavior of individual countries presupposes certain symmetry judgments about
the countries, judgments that are made precise by forms of exchangeability.
   Second, exchangeability is separately important because of its implications
for the appropriate statistical theory to apply in growth contexts. The reason
for this relates to a deep result in probability theory known as de Finetti’s
Representation Theorem.10 This theorem, formally stated in the technical appendix,
establishes that the sample path of a sequence of exchangeable random variables
behaves as if the random variables were generated by a mixture of i.i.d. processes.
For empirical practice, de Finetti’s Representation Theorem is important
because it links a researcher’s prior beliefs about the nature of the data under
analysis (specifically, the properties of regression errors) to a probability
structure that permits the researcher to interpret OLS estimates and associated
test statistics in the usual way.11
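
One concrete case of this equivalence can be verified numerically. The sketch below assumes the simplest exchangeable error structure, equicorrelated errors with a common variance, together with a regression that includes an intercept; the sample size, correlation, and coefficient values are hypothetical and chosen only for illustration. Under these assumptions the OLS coefficients coincide with the generalized least squares coefficients that use the true error covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 50, 0.4                      # hypothetical sample size and error correlation

# Design matrix: an intercept and two hypothetical growth regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

# Equicorrelated (hence exchangeable) errors: unit variances, common correlation rho.
sigma = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
errors = np.linalg.cholesky(sigma) @ rng.normal(size=n)
y = X @ np.array([1.0, 0.5, -0.3]) + errors

# OLS ignores the correlation; GLS weights by the inverse error covariance.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
sigma_inv = np.linalg.inv(sigma)
beta_gls = np.linalg.solve(X.T @ sigma_inv @ X, X.T @ sigma_inv @ y)

print(np.allclose(beta_ols, beta_gls))
```

The two estimators agree up to floating-point error because, with an intercept in the regression, multiplying the regressors by this covariance matrix leaves them in the same column space. The check covers point estimates only and says nothing about stochastic regressors or F-conditional exchangeability.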
     10. See Bernardo and Smith (1994, chs. 4 and 6) for an insightful discussion of the nature and im-
plications of the theorem and Aldous (1983) for a comprehensive mathematical development of vari-
ous forms of the theorem.
     11. Caution is needed in using de Finetti’s theorem to calculate the distributions of regression esti-
mators. For linear regression models of the form of equation 1, with normally distributed errors and
nonstochastic regressors, Arnold (1979, p. 194) shows that “many optimal procedures for the model
with i.i.d. errors are also optimal procedures for the model with exchangeably distributed errors . . . in
the univariate case the best linear unbiased estimator and the ordinary least squares estimator are equal
. . . as long as the experimenter is only interested in hypotheses about (the slope coefficients of the
regression) he may act as though the errors were i.i.d.” Further, if the errors are non-normal, de Finetti’s
theorem leads one to expect analogous asymptotic equivalences. Similarly, we believe that analogies to
de Finetti’s theorem can be developed for stochastic regressors and F-conditional exchangeability, al-
though as far as we know no such results have been established.

                          Exchangeability and Growth
How does exchangeability relate to the assumptions underlying cross-country
growth regressions? These models typically assume that once the included growth
variables in the model are realized, no basis exists for distinguishing the prob-
abilities of various permutations of residual components in country-level growth
rates, that is, these residuals are F-conditionally exchangeable, where F is the
modeler’s information set. Notice that F may include variables beyond those
included in a growth regression as well as knowledge about nonlinearities or
parameter heterogeneity in the growth process.
   Various forms of exchangeability appear, in our reading of the empirical
growth literature, to implicitly underlie many of the regression specifications.
An implicit (F-conditional) exchangeability assumption is made whenever the
empirical implementation of the growth trajectory for a single country from a
given theoretical model is turned into a cross-country regression (typically af-
ter linearizing) by allowing the trajectory’s state variables to differ across coun-
tries and appending an error term. Such an assumption of exchangeability has
substantive implications for how a researcher thinks about the relationship be-
tween a given observation and others in a data set. Suppose that a researcher
is considering the effect of a change in trade openness on a country, for ex-
ample, Tanzania, in Sub-Saharan Africa. How does the researcher employ es-
timates of the effects of trade on growth in other countries to make this assess-
ment? The answer depends on the extent to which the causal relationship
between trade and growth in Tanzania can be uncovered using data from other
countries.
   More generally, notice how a number of modeling assumptions that are standard
in conventional growth exercises are conceptually related to the assumptions
that the model errors $\varepsilon_i$ are F-conditionally exchangeable and that the
growth rates $g_i$ are partially exchangeable with respect to available information.
Specifically,

  1. The assumption that a given regression embodies all of a researcher’s
     knowledge of the growth process is related to the assumption that the errors
     in a growth regression are F-conditionally exchangeable.
  2. The assumption that the parameters in a growth regression are constant
     is related to the assumption that country-level growth rates are partially
     exchangeable.
  3. The justification for the use of ordinary (or heteroskedasticity-corrected)
     least squares, as is standard in the empirical growth literature, is related to
     the assumption that the errors in a growth regression are exchangeable
     (or are exchangeable after a heteroskedasticity correction).

  Our general claim is that exchangeability, in particular, F-conditional exchangeability
of model errors, is an “incredible” (Sims 1980) assumption in the
context of the standard cross-country regressions of the growth literature. (By a
standard regression, we refer to equation 1, in which a small number of regres-
sors are assumed to explain cross-country growth patterns.12) For exchangeability
to hold for a given regression and information set, the likelihood of a positive error
for a given country—say, Japan—would need to be the same as that for any other
country in the sample. In turn, for this to be true, no prior information could exist
about the countries under study that would render the distribution of the asso-
ciated growth residuals for these countries sensitive to permutations.
    To repeat, exchangeability is not necessary to justify the estimation methods
and structural interpretations conventionally given to cross-country growth re-
gressions.13 Hence our use of the term related in the three points above. What
exchangeability does is provide a baseline, based on economic theory and a
researcher’s prior knowledge of the growth process, by which to assess cross-
country regressions. Exchangeability is a valuable baseline for two reasons. First,
the conditions under which various types of exchangeability do or do not hold
for growth rates or model residuals can be linked to a researcher’s substantive
understanding of the growth process in ways that alternative sets of (purely sta-
tistical) assumptions on errors usually cannot be. In turn, once exchangeability
is believed to be violated, a researcher can naturally link the reasons that ex-
changeability fails to hold to the question of whether the estimation methods
used in the growth literature nevertheless can be expected to yield consistent
parameter estimates and standard errors.
    Second, exchangeability is important because it shifts the focus of specifica-
tion analysis away from the question of theory inclusion (determining which
variables need to be included in a growth regression to cover relevant structural
growth determinants) to the identification of groups of countries that obey a
common regression surface and hence can provide information on the growth
process. This shift of emphasis is important for two reasons. First, for many
growth determinants, the variables used to proxy for theories are very poor
measures. For example, in the standard Gastil index of political rights, often used
to measure levels of democracy, South Africa is ranked as high as or even higher
than (depending on the period) the Republic of Korea for the period 1972–84.
It is difficult to know what this means (political rights for whom?) and in what
sense this rank ordering is relevant for the aspects of democracy conducive to
growth. A more fruitful exercise is to identify groups of countries that obey a
common, parsimonious growth model. Put differently, if, as seems plausible,
many growth determinants such as political regime are common background
variables for subsets of countries, a more productive empirical strategy may be
to identify these subsets rather than to use crude empirical proxies for regime.
Second, to the extent that nonquantifiable factors, such as “culture” (see Landes


   12. To be fair, empirical growth papers often check the robustness of variables relative to a small
number of alternative controls, but such robustness checks do not address exchangeability per se.
   13. In section VI we return to the question of when the full force of exchangeability is useful for
policy analysis.

2000), matter for growth, the identification of partially exchangeable subsets
may be necessary for any sort of growth inferences.
   To see how exchangeability plays a role in the leap from the identification of
statistical patterns to structural inference, suppose that one runs the baseline
Solow regression and observes that regression errors for the countries in Sub-
Saharan Africa are predominately negative (as is the case). How does one inter-
pret this finding? One can either attribute the finding to chance (the errors are,
after all, zero mean with nonnegligible variance) or conclude that there was some-
thing about those countries that was not captured by the model. Easterly and
Levine (1997), for example, develop a comprehensive argument on the role of
ethnic divisions as a causal determinant of growth working from this initial fact.
Or, put differently, Easterly and Levine (1997), from prior knowledge about the
politics and cultures of these countries, developed their analysis on the basis that
the Solow errors were not exchangeable, that is, that there was something about
Sub-Saharan African countries that should have been incorporated into the Solow
model.
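
The choice between attributing a cluster of negative residuals to chance and concluding that something was omitted can be given a rough quantitative form. If the residuals were exchangeable, group labels would carry no information, so reshuffling residuals across countries gives a reference distribution for a group's mean residual. The sketch below uses simulated residuals rather than actual Solow regression errors; the sample of 80 countries, the group of 20, and the downward shift of two units are all hypothetical.

```python
import numpy as np

def permutation_pvalue(residuals, in_group, n_perm=10_000, seed=0):
    """One-sided p-value: is the group's mean residual unusually low?

    Under exchangeability of the residuals, any relabeling of countries is
    equally likely, so shuffled residuals give the reference distribution.
    """
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals, dtype=float)
    in_group = np.asarray(in_group, dtype=bool)
    observed = residuals[in_group].mean()
    k = int(in_group.sum())
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(residuals)
        if shuffled[:k].mean() <= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical residuals: 80 countries, the first 20 shifted down by two units.
rng = np.random.default_rng(42)
group = np.zeros(80, dtype=bool)
group[:20] = True
shifted = rng.normal(size=80)
shifted[group] -= 2.0
shifted -= shifted.mean()             # regression residuals sum to zero

p_shifted = permutation_pvalue(shifted, group)
print(f"p-value with a group effect: {p_shifted:.4f}")
```

A small p-value indicates only that the labels are informative, that is, that exchangeability is doubtful; deciding what was omitted still requires the kind of prior knowledge that Easterly and Levine bring to bear.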
   Does the requirement of exchangeability imply the impossibility of structural
inference whenever observational data are being studied? This would grossly
exaggerate the import of our critique. Exchangeability of errors is conceivable
for a wide range of models with observational data sets. For example, exchange-
ability seems to be a plausible assumption for statistical models based on the use
of individual-level data sets, such as the Panel Study of Income Dynamics (PSID),
once relevant information about the individuals under study is controlled for.
One reason for this relates to the units of analysis. A basic difference between
microeconomic data sets of this type and macroeconomic data sets of the type
used in growth analysis is that macroeconomic observations pertain to large
heterogeneous aggregates for which a great deal of information is known; infor-
mation that can imply that exchangeability does not hold. In addition, the large
size of individual-level data sets such as the PSID means that the range of possible
control variables is much greater than that for growth. By this we mean
something deeper than “the more data points, the more regressors may be
included.” Instead, we argue that large data sets of the type found in microeconomics
will contain observations on groups of individuals who are sufficiently
similar with respect to observables that they may be plausibly regarded as repre-
senting exchangeable observations.
   That said, we fully accept that exchangeability for observations on objects as
complicated as countries may well be problematic. Will our knowledge of the
histories and cultures of the countries in cross-country regressions ever be em-
bedded in the regressions to such an extent that the exchangeability requirement
is met? This question is at the heart of many of the controversies about the em-
pirical growth literature.
   To summarize, conventional growth econometrics has failed to consider the
ways in which appropriate exchangeability concepts may or may not hold for
the specific models analyzed. This failure in turn renders these studies difficult if
not impossible to interpret, because one must know whether any exchangeabil-
ity violations that are present invalidate the statistical exercise being conducted.
We therefore concur with Draper and others (1993, p. 1), who argue that
  statistical methods are concerned with combining information from differ-
  ent observational units and with making inferences from the resulting sum-
  maries to prospective measurements on the same or other units. These
  operations will be useful only when the units to be combined are judged to
  be similar (comparable or homogeneous) . . . judgments of similarity in-
  volve concepts more primitive than probability, and these judgments are
  central to preliminary activities that all statisticians must perform, even
  though probability specifications are absent or contrived at such a prelimi-
  nary stage.
                         Exchangeability and Causality
Though exchangeability is a useful benchmark for understanding some of the
major sources of skepticism about growth regressions, it does not bear in any
obvious way on the third of our general criticisms, the lack of attention to cau-
sality versus correlation in growth analysis. For example, following a nice ex-
ample due to Goldberger (1991), a regression of parental height on daughter
height can have a perfectly well-defined set of exchangeable errors, so that pa-
rental heights are partially exchangeable, yet the interpretation of the associated
regression coefficient is obviously noncausal. More generally, causality is a dif-
ferent sort of question than the other issues we have addressed, in that it cannot
be reduced to a question of whether the data fulfill a generic statistical property.
As Heckman (2000, p. 89) notes, “causality is a property of a model . . . many
models may explain the same data and . . . assumptions must be made to identify
causal or structural models.” And (2000, p. 91):
  Some of the disagreement that arises in interpreting a given body of data is
  intrinsic to the field of economics because of the conditional nature of causal
  knowledge. The information in any body of data is usually too weak to
  eliminate competing causal explanations of the same phenomenon. There
  is no mechanical algorithm for producing a set of “assumption free” facts
  or causal estimates based on those facts.
   In our subsequent discussion we do not address strategies for dealing with
questions of causality. Instead, we focus on model uncertainty, which presup-
poses that causality uncertainty within a given model has been addressed by
suitable assumptions by the analyst. In doing this, we are not diminishing the
importance of thinking about causal inference; instead, we believe that causal
arguments require judgments about economic theory and qualitative informa-
tion about the problem at hand that represent issues separate from those we
address.

              IV. A Digression on Noneconometric Evidence

Regression analyses of the type conventionally done are useful mechanisms for
summarizing data and uncovering patterns. These techniques are not, as cur-
rently employed, particularly credible ways to engage in causal inference. Be-
fore proceeding to econometric alternatives, we wish to point out the importance
of integrating different sources of information in the assessment of growth theo-
ries. These sources are often the basis on which exchangeability can be ques-
tioned in a particular context.
   The economic history literature is replete with studies that are of enormous
importance in adjudicating different growth explanations, yet this literature
usually receives only lip service in the growth literature.14 An exemplar of his-
torical studies that can speak to growth debates is Clark (1987), which explores
the sources of productivity differences between cotton textile workers in New
England and those workers in other countries in 1910. These differences were
immense—a typical New England textile worker was about six times as produc-
tive as his counterpart in China or India and more than twice as productive as
his counterpart in Germany. Clark painstakingly shows that these differences
cannot be attributed to differences in technology, education, or management.15
Instead, they seem to reflect cultural differences in work and effort norms. Such
studies have important implications for understanding why technology may not
diffuse internationally and how poverty traps may emerge, and should play a
far greater role in the empirical growth literature.
   Historical and qualitative studies also play a crucial role in the development
of credible statistical analyses. One reason for this is that these sorts of studies
provide information on the plausibility of identifying assumptions that are made
to establish causality. Further, our discussion on exchangeability and growth
analysis may be interpreted as arguing that a researcher needs to do one of two
things to claim that a regression provides causal information. The researcher must
make a plausible argument that, given the many plausible growth theories and
plausible heterogeneity in the way different causal growth factors affect different
countries, the errors in a particular growth regression are nevertheless exchangeable.
Or the researcher can argue that the violations of exchangeability in the
regression occur in ways that do not alter the conventional interpretation of the
coefficients and standard errors. To
some extent exchangeability judgments must be made prior to a statistical exer-
cise, as Draper and others (1993) note above. Where does information of this
type come from? Often from qualitative and historical work. Hence, the detailed
study of individual countries that is a hallmark of work by the World Bank, for
example, plays an invaluable role in allowing credible statistical analysis.


   14. There are notable exceptions, such as Easterly and Levine (1997) and Prescott (1998).
   15. See also Wolcott and Clark (1999), which provides detailed evidence that managerial differ-
ences cannot explain the low productivity in Indian textiles.

                     V. Modeling Model Uncertainty

The main themes of our criticisms of current econometric practice may be sum-
marized as two claims:
  1. The observations in cross-country growth regressions do not obey vari-
     ous exchangeability assumptions given the information available on the
     countries under study.
This implies that:
  2. Model uncertainty is not appropriately incorporated into empirical growth
     analyses.
   There are no panaceas for the interpretation problems we have described for
growth regressions. Although our formulation of model uncertainty can reduce
the dependence of empirical growth studies on untenable exchangeability or other
assumptions, growth regressions will always rely on untestable and possibly
controversial assumptions if causal or structural inferences are to be made. It
may be impossible, for example, to place every possible growth theory in a com-
mon statistical analysis, so critiques based on theory open-endedness will apply,
at some level, to our own suggestions. Further, we will not be able to model all
aspects of uncertainty about partial exchangeability of growth rates. However,
we do not regard this as a damning defect. Empirical work always relies on judg-
ment as well as formal procedures, what Draper and others (1993, p. 16) refer
to as “the role of leaps of faith” in constructing statistical models. What we wish
to do is reduce the number and magnitude of such leaps.
                               General Framework
We assume that the structural growth process for country i obeys a linear structure
that applies to all countries j that are members of class J(i). Suppose that
this model is described by a set of regressors S that we partition into a subset X
and a scalar z. Our analysis focuses on how to employ data to uncover $\beta_z$, the
coefficient that determines the effect of $z_i$ on country i’s growth. We work with
models of the form:
(7)    $g_j = S_j \zeta + \varepsilon_j = X_j \pi + z_j \beta_z + \varepsilon_j, \quad j \in J(i).$
When a given model represents the “true” or correct specification of the growth
process for countries in J, the sequence of residuals $\varepsilon_j$ will be F-conditionally
exchangeable. The information set F comprises the total available information to a
researcher about the countries. For our purposes, F will consist of a collection of
regressors available to a researcher; S is a subset of these. The idea that a model
consists of the specification of a set of growth determinants, ($S_j$), and the
specification of a set of countries with common parameters, J(i), that together render
the associated model errors F-conditionally exchangeable, will, as we shall see, parallel
our earlier discussion of the first two sources of criticisms of growth regressions.
It is skepticism about the claim that a particular model is correctly specified in
the sense we have described that renders many of the empirical claims in the
growth literature not credible.
    The standard approach to statistical analysis in the growth literature can be
thought of as using a single model M and given data set D to analyze model
parameters. Suppose that the goal of the exercise is to uncover information about
a particular parameter bz. From a frequentist perspective, this involves calculat-
ing an estimate of the parameter bz along with an associated standard error for
the estimate. From a Bayesian perspective, this involves calculating the poste-
rior density m(bz | D,M). We will employ the Bayesian framework in our subse-
quent discussion. That said, we will be interested in relating our analysis to
frequentist analyses of growth. For this reason we shall often employ a “leading
case” in the analysis. As described in the technical appendix, under some conditions
the posterior mean of the set of regression coefficients in equation 7 equals
the OLS estimates of the parameters and the posterior variance/covariance matrix
equals the variance/covariance matrix of the OLS estimates. We will use this
equivalence repeatedly in the next section.
                        Formulating Types of Model Uncertainty
Suppose that there exists a universe of models, M with typical element Mm, that
are possible candidates for the “true” growth model that generated the data under
study; the true model is assumed to lie in this set.16 This universe is generated
from two types of uncertainty. First, there is theory uncertainty. In particular,
we assume that there is a set X of possible regressors to include in a growth re-
gression whose elements correspond to alternative causal growth mechanisms.
In our framework a theory is defined as a particular choice of regressors for a
model of the form of equation 7. Second, there is heterogeneity uncertainty. By
this we mean that there is uncertainty as to which countries make up J(i), that is,
which countries are partially exchangeable with country i.17 In the presence of these types of
uncertainty a researcher will be interested not in m(bz | D,Mm) for a particular
Mm but in m(bz | D); the exception, of course, is when the correct growth theory
and the set of countries that are partially exchangeable with country i are known
with certainty to the modeler.
   This dichotomy of model uncertainty can, at least in principle, incorporate
other forms of uncertainty as well. Consider the question of nonlinearities in the
growth process. One could attempt to deal with functional form uncertainty
through the addition of regressors. Examples would include adding regressors
that are nonlinear functions of the initial set of theory-based regressors (appealing
to Taylor series-type or other approximations) or adding regressors whose
values are zero below some threshold and equal to a theory-based regressor above
that threshold (as suggested by such models as Azariadis and Drazen [1990]). In
this sense heterogeneity uncertainty is no different from theory uncertainty.

   16. It is possible to consider contexts where no model in M is correct, as discussed in Bernardo and
Smith (1994), but that is beyond the scope of this paper.
   17. These two types of uncertainty are not independent; for example, theory uncertainty may
induce heterogeneity uncertainty.

   It is possible to integrate theory uncertainty and some forms of heterogeneity
uncertainty into a common variable selection framework. Doing so has the im-
portant advantage that it allows us to draw on new developments in the statis-
tics literature stemming from an important paper by Raftery, Madigan, and
Hoeting (1997). By definition, theory uncertainty is a question of variable inclu-
sion. To see how to interpret heterogeneity uncertainty in a similar way, we
proceed as follows. For a given regressor set S, suppose that one believes that
the countries under study may be divided into two subsets with associated sub-
scripts A1 and A2 such that the countries within each subset are partially exchange-
able, but that countries in one subset may not be partially exchangeable with
countries in the other because of parameter heterogeneity. Each of these subsets
is characterized by a linear equation so that
(8)                               gj = Xjp + zjbz + ej if j ∈ A1

and

(9)                              gj = Xjp' + zjb'z + ej if j ∈ A2.

This last equation can be rewritten as

(10)             gj = Xjp + zjbz + Xj (p' – p) + zj (b'z - bz) + ej if j ∈ A2.

Therefore, the two equations can be combined into a single growth regression
of the form:

(11) gj = Xjp + zjbz + Xjdj,A2 (p' – p) + zjdj,A2 (b'z – bz) + ej, if j ∈ A1 ∪ A2
where dj,A2 = 1 if j ∈ A2, 0 otherwise. The additional regressors Xjdj,A2 and zjdj,A2
therefore produce a common regression for all observations.18 Of course, this
type of procedure is often done in empirical work; our purpose in this develop-
ment here is to emphasize how heterogeneity uncertainty may be explicitly mod-
eled in terms of variable inclusion. Notice that it is straightforward to generalize
this procedure to multiple groups of partially exchangeable countries. This pro-
cedure is not completely general in that it restricts the sort of possible parameter
heterogeneity allowed; for example, each country is not allowed a separate set
of coefficients. To allow for this more general type of heterogeneity would re-
quire moving to an alternative structure, such as a hierarchical linear model (see
Schervish 1995, ch. 8); we plan to pursue this in subsequent work.
   18. When heterogeneity uncertainty is introduced, the variable z will be associated with different
parameters for different countries. For ease of exposition we let bz refer to the relevant parameter for
the country i that is of interest.
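
The equivalence between equations 8 and 9 and the pooled regression in equation 11 can be checked numerically. The following is a minimal sketch on simulated data; the sample size, coefficient values, and variable names (`X`, `z`, `d`, `p2`, `b_z2`) are assumptions made for illustration and do not come from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # regressors X_j (constant included)
z = rng.normal(size=n)                                  # scalar regressor z_j
d = (np.arange(n) >= n // 2).astype(float)              # d_{j,A2}: 1 if j is in A2, else 0

# Hypothetical coefficients for the two subsets
p, b_z = np.array([1.0, 0.5]), 2.0      # subset A1
p2, b_z2 = np.array([0.0, -0.5]), 3.0   # subset A2
e = 0.1 * rng.normal(size=n)
g = np.where(d == 1, X @ p2 + z * b_z2, X @ p + z * b_z) + e

# Pooled regression of equation 11 with regressors [X, z, X*d, z*d]
W = np.column_stack([X, z, X * d[:, None], z * d])
coef, *_ = np.linalg.lstsq(W, g, rcond=None)

def ols(mask):
    """OLS of g on [X, z] using only the observations selected by mask."""
    A = np.column_stack([X[mask], z[mask]])
    return np.linalg.lstsq(A, g[mask], rcond=None)[0]

c1, c2 = ols(d == 0), ols(d == 1)  # equations 8 and 9 estimated separately

print("A1 coefficients from pooled fit:", coef[:3])
print("A2 minus A1 from pooled fit:   ", coef[3:])
print("A2 minus A1 from separate fits:", c2 - c1)
```

In this sketch the first three pooled coefficients reproduce the A1 estimates and the last three reproduce the difference between the A2 and A1 estimates, which is why a question about group membership can be recast as a question about whether the interaction regressors belong in the model.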

                                   Posterior Probabilities
Once a researcher has formulated a space of possible models, it is relatively
straightforward to calculate posterior probabilities that do not rely on the as-
sumption that one model is true. In the presence of model uncertainty the calcu-
lation of m(bz | D) requires integrating out the dependence of the probability
measure m(bz | D,Mm) on the model Mm. By Bayes’s rule, the posterior density of
a given coefficient conditional only on the observed data is
(12)                       m(b_z \mid D) = \sum_m m(b_z \mid D, M_m)\, m(M_m \mid D),

which can be rewritten as
(13)                   m(b_z \mid D) \propto \sum_m m(b_z \mid D, M_m)\, m(D \mid M_m)\, m(M_m),

where \propto means “is proportional to,” m(D | Mm) is the likelihood of the data given
model Mm, and m(Mm) is the prior probability of model Mm. This formulation
gives a way of eliminating the conditioning of the posterior density of a given
parameter on a particular model choice.
   Calculations of this type originally appeared in Leamer (1978) and are reported
in Draper (1995). Leamer (1978, p. 118) gives the following derivations of the
conditional mean and variance of bz given the data D:
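(14)                       E(b_z \mid D) = \sum_m m(M_m \mid D)\, E(b_z \mid D, M_m)

and

(15)   \mathrm{var}(b_z \mid D) = E(b_z^2 \mid D) - (E(b_z \mid D))^2
         = \sum_m m(M_m \mid D) \left( \mathrm{var}(b_z \mid M_m, D) + (E(b_z \mid D, M_m))^2 \right) - (E(b_z \mid D))^2
         = \sum_m m(M_m \mid D)\, \mathrm{var}(b_z \mid M_m, D) + \sum_m m(M_m \mid D) \left( E(b_z \mid D, M_m) - E(b_z \mid D) \right)^2.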

As discussed in Leamer (1978) and Draper (1995), the overall variance of the
parameter estimate bz depends on the variance of the within-model estimates
(the first term in equation 15) and the variance of the estimates across models
(the second term in equation 15).
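
As a small numerical illustration of equations 13 to 15, the sketch below combines per-model posterior means and variances into a model-averaged mean and variance. The function name `model_average` and the three-model inputs are hypothetical; the likelihood values stand in for m(D | Mm) and are not taken from any growth data set.

```python
import numpy as np

def model_average(post_means, post_vars, likelihoods, priors=None):
    """Combine per-model posterior moments of b_z into model-averaged moments.

    Returns the posterior model probabilities, the averaged mean (equation 14),
    the total variance (equation 15), and its within- and between-model parts.
    """
    post_means = np.asarray(post_means, dtype=float)
    post_vars = np.asarray(post_vars, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)
    if priors is None:  # assumed default: equal prior model probabilities
        priors = np.full(len(likelihoods), 1.0 / len(likelihoods))
    w = likelihoods * np.asarray(priors, dtype=float)
    w = w / w.sum()                                  # m(M_m | D), from equation 13
    mean = np.sum(w * post_means)                    # equation 14
    within = np.sum(w * post_vars)                   # first term of equation 15
    between = np.sum(w * (post_means - mean) ** 2)   # second term of equation 15
    return w, mean, within + between, within, between

# Hypothetical inputs for three candidate models
w, mean, total_var, within, between = model_average(
    post_means=[0.5, 0.1, -0.2],
    post_vars=[0.04, 0.09, 0.01],
    likelihoods=[2.0, 1.0, 1.0],
)
print(w, mean, total_var, within, between)
```

With these inputs the between-model term is larger than the within-model term, so a standard error reported from any single model would understate the overall uncertainty about the coefficient.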
   Equation 12 and the related expressions are all examples of Bayesian model
averages. The methodology surrounding Bayesian model averaging is specifically
developed for linear models with uncertainty about variable inclusion in Raftery,
Madigan, and Hoeting (1997).19 Doppelhofer, Miller, and Sala-i-Martin (2000),
focusing on theory uncertainty only, compute a number of measures of variable
robustness based on the application of this formula to growth regressions and
conclude that initial income is the “most robust” regressor. Fernandez, Ley, and
Steel (1999) also employ Bayesian model averaging for theory uncertainty, fo-
cusing on the explicit computation of posterior coefficient distributions. Our own
development should be read as an endorsement and extension of the analyses in
these articles. Our formulation differs in two respects from previous work. First,
we treat heterogeneity uncertainty as well as theory uncertainty as part of overall
model uncertainty. Draper and others (1993) provide a general overview
of the importance of accounting for heterogeneity uncertainty in constructing
credible empirical exercises. As our discussion illustrates, heterogeneity uncertainty
can be treated as a question of variable inclusion, so the ideas in
Doppelhofer, Miller, and Sala-i-Martin (2000) and Fernandez, Ley, and Steel
(1999) can be extended to this domain in a straightforward fashion. Second, we
develop an explicit decision-theoretic approach to interpreting growth regressions.
As far as we are aware, this analysis is new.

   19. The survey by Hoeting and others (1999) provides a nice introduction to model averaging
techniques. See also Wasserman (2000).
                                              Outliers
One important concern in the empirical growth literature has revolved around
the role of outliers in determining various empirical claims. A famous example
is the role of the Botswana observation in determining the estimated magnitude
of social returns to equipment investment (DeLong and Summers 1991, 1994;
and Auerbach, Hassett, and Oliner 1994).20
    Outliers can be dealt with in a straightforward fashion. There are three strat-
egies one can pursue. First, one can always employ a within-model estimator
that is designed to be robust to outliers. As Temple (2000) points out, one can
employ a trimmed least squares estimator (one that drops or downweights
observations whose associated OLS residuals are large) in estimating each model’s
parameters and still employ whatever posterior analysis one wishes. Second, one
can explicitly allow the density for model errors to accommodate outliers. For
example, one can model errors as drawn from a mixture distribution. Third, and
most promising in our view, one can employ a Bayesian bagging procedure due
to Clyde and Lee (2000). Bagging (an abbreviation for bootstrap aggregating)
was introduced by Breiman (1996) to improve the performance
of what he called “unstable” prediction and modeling methods. A
method is “unstable” when small changes in the data set lead to large changes in
the method’s output. Intuitively, the Clyde and Lee procedure constructs boot-
strap data sets from the empirical distribution function of a data set, computes
a model average for each sample, and then averages these results. (See their ar-
ticle for details.) Clyde and Lee provide reasons to think that this modification
of model averaging will be robust to outliers.
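
The resample, average, and aggregate steps described above can be sketched as follows. This is only an illustration of the general idea and is not the Clyde and Lee procedure itself: it assumes a plain nonparametric bootstrap, takes the models to be all subsets of a small set of candidate regressors, and approximates posterior model weights by exp(-BIC/2). The function names and the simulated data are hypothetical.

```python
import itertools
import numpy as np

def bic_model_average(g, z, X):
    """BIC-weighted average of the OLS coefficient on z over all subsets of X's columns."""
    n, k = X.shape
    coefs, bics = [], []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            W = np.column_stack([np.ones(n), z, X[:, list(subset)]])
            beta, *_ = np.linalg.lstsq(W, g, rcond=None)
            rss = np.sum((g - W @ beta) ** 2)
            bics.append(n * np.log(rss / n) + W.shape[1] * np.log(n))
            coefs.append(beta[1])  # coefficient on z
    bics = np.array(bics)
    w = np.exp(-0.5 * (bics - bics.min()))  # approximate posterior model weights
    w /= w.sum()
    return float(np.sum(w * np.array(coefs)))

def bagged_model_average(g, z, X, n_boot=200, seed=0):
    """Average the model-averaged coefficient over bootstrap resamples of the rows."""
    rng = np.random.default_rng(seed)
    n = len(g)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        draws.append(bic_model_average(g[idx], z[idx], X[idx]))
    return float(np.mean(draws))

# Simulated example with a true coefficient of 2.0 on z
rng = np.random.default_rng(1)
n = 80
z = rng.normal(size=n)
X = rng.normal(size=(n, 2))
g = 1.0 + 2.0 * z + 0.5 * X[:, 0] + 0.5 * rng.normal(size=n)
estimate = bagged_model_average(g, z, X, n_boot=50)
print(estimate)
```

Because any single observation is absent from a sizable share of the bootstrap samples, its influence on the aggregated estimate is diluted, which is the intuition behind the robustness claim.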
    That said, the ex post analysis of outliers, as was carried out in the Botswana
case we described, is often problematic; as Leamer (1978, p. 265) remarks, the
mechanical and typically ad hoc dropping of outliers both leads to invalid sta-
tistical conclusions and ignores valuable information.

    20. The role of outliers in growth regressions has been somewhat overstated; for example, the DeLong
and Summers results are far more robust to the inclusion or exclusion of Botswana than is often
asserted, as a careful reading of DeLong and Summers (1991, 1994) and Auerbach, Hassett, and Oliner
(1994) clearly reveals. Temple (1998) is a more persuasive example of the importance of dealing with
outliers.

                     Priors on the Space of Possible Models
An important issue in the implementation of the model averaging approach that
we describe is the choice of the prior distribution on the space of models. For
the problem of variable inclusion, this is typically handled by assuming that all
2^k possible models (where k is the number of regressors that may be placed in a
given model) have equal probability; Fernandez, Ley, and Steel (1999) follow
this procedure in their analysis. The procedure in essence assumes that the prior
probability that a given regressor is in a model is 1/2. Doppelhofer, Miller, and
Sala-i-Martin (2000) make the alternative assumption that for a regression whose
expected number of included regressors is k̄, the probability of inclusion of a
given regressor is k̄/k. They make this assumption to avoid “a very strong prior
belief that the number of included variables should be large” (2000, pp. 15–16).
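
The two prior specifications just described differ only in the inclusion probability assigned to each regressor. The sketch below, which assumes independent inclusion with a common probability `theta` (a name introduced here for illustration), shows that a probability of 1/2 gives every model the same prior weight, while setting the probability to the expected model size divided by k delivers that expected size.

```python
from math import comb, isclose

def model_prior(model_size, k, theta=0.5):
    """Prior probability of one specific model containing model_size of k regressors,
    when each regressor is included independently with probability theta."""
    return theta ** model_size * (1 - theta) ** (k - model_size)

def expected_size(k, theta):
    """Prior expected number of included regressors."""
    return sum(s * comb(k, s) * model_prior(s, k, theta) for s in range(k + 1))

k = 10       # hypothetical number of candidate regressors
k_bar = 3    # hypothetical expected model size
print(model_prior(3, k), model_prior(7, k))   # equal under theta = 1/2
print(expected_size(k, 0.5))                  # approximately k / 2
print(expected_size(k, k_bar / k))            # approximately k_bar
```

Under either choice the inclusion of one regressor is independent of the inclusion of any other, which is the feature of these priors that the following paragraphs question.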
   These alternative approaches to setting model priors are not very appealing
from the perspective of economic theory. Clearly, the addition of a given regres-
sor to the set of possible regressors should affect the probabilities with which
other variables are included. It is unclear, for example, why the effect of ethnicity
on growth should be independent of the effect of democracy, as it can easily be
imagined that one will affect growth only if the other does as well. The conven-
tional approaches to modeling the space of priors ignore this fact.
   This problem is closely associated with a standard criticism of the “independence
of irrelevant alternatives” assumption in choice theory, originally due to
Debreu (1960) and later instantiated in the choice literature as the “red bus/blue
bus problem” (see Ben-Akiva and Lerman 1985, sec. 3.7). In discrete choice
theory independence of irrelevant alternatives means that the ratio of choice
probabilities between any two alternatives should be unaffected by the presence
of a third. As pointed out by Debreu, this assumption is untenable if the third
choice is a close substitute for one of the other two. For the analysis of growth
regressions, the priors we have discussed suffer from a similar problem, although
the reasons are more complicated. As noted above, the likelihood that one growth
theory matters may covary positively with whether another one matters. Fur-
ther, because the variables employed to capture growth theories are often crude
proxies for underlying theories, their inclusion probabilities could covary posi-
tively, as each helps measure some common growth determinants. For example,
contra Doppelhofer, Miller, and Sala-i-Martin (2000, n. 15), the likelihood that
political assassinations predict growth differences could be positively associated
with the likelihood that revolutions predict growth differences, as each helps
instrument the unobservable variable “political instability.”
   We have no advice to offer on how to deal with this problem, because its reso-
lution will depend on one’s priors on the space of underlying growth models, as
determined by the interconnections between particular growth theories. In our
view, it makes more sense at this stage of development to treat the prior distri-
bution over models as a benchmark for reporting posterior statistics. (A number
of Bayesians have developed a similar view of priors; see, for example, the discussion
of “robust” priors in Berger 1987, p. 111.) Because the complexity of
the growth process speaks to the strong likelihood that a large number of growth
factors substantively matter, the uniform prior of Fernandez, Ley, and Steel (1999)
makes the most sense at this stage in providing a benchmark.
   That said, there is nothing theoretically compelling about the assumption that
the inclusion probability of each regressor is 1/2. We therefore believe that it
might make sense in future work to report values for some benchmark alterna-
tive probabilities, in order to help evaluate the robustness of results. By choos-
ing inclusion probabilities lower than 1/2, it is possible to incorporate the spirit
of the Doppelhofer, Miller, and Sala-i-Martin concerns without having to form
prior beliefs on the expected number of regressors in a model, which seems ex-
tremely problematic.

        VI. Toward a Policy-Relevant Growth Econometrics

The framework developed in the previous section provides a general way of
describing model uncertainty in growth regressions. It does not, however, pro-
vide any guidance on how to determine what variables should be included in a
regression, or on when to regard the sign or magnitude of a regression coeffi-
cient as robust. The reason is that the posterior densities embodied in equations
12 to 15 are nothing more than data summaries. As such, they can inform policy
analysis only to the extent that they are integrated with a specific formulation of
the decision problem of a policymaker. Hence it is necessary to develop an ex-
plicit decision-theoretic basis for assessing growth data. The decision-theoretic
framework we describe explicitly incorporates the various forms of model un-
certainty associated with possible violations of exchangeability, as discussed in
the previous section. In this section we discuss the use of growth regressions to
inform empirical analysis when one of the growth controls is under the control
of a policymaker. Many of the purported policy variables included in growth
regressions—for example, indices of political stability—are not necessarily tightly
linked to the variables over which a policymaker has control. The framework
we describe can be generalized to incorporate a more complicated relationship
between growth determinants and policy than the one we analyze here.
   The decision-theoretic perspective involves moving away from a specific con-
cern with a particular hypothesis to an evaluation of the implications of a given
set of data for a particular course of action. Kadane and Dickey (1980, p. 247)
argue

  The important question in practice is not whether a true effect is zero, for
  it is already known not to be exactly zero, but rather, How large is the effect?
  But then this question is only relevant in terms of How large is important?
  This question in turn depends on the use to which the inference will be put,
  namely, the utility function of the concerned scientist. Approaches which
  attempt to explain model specification from the viewpoint of the inappropriate
   question, Is it true that . . . ? have a common thread in that they all
   proceed without reference to the utility function of the scientist. And there-
   fore, from the decision theory point of view, they all impose normative
   conditions on the utility function which are seldom explicit and often far
   from the case.
Substituting policymaker for scientist in this quotation makes it clear why policy-
relevant growth econometrics needs to explicitly integrate policy objectives and
empirical practice. Our approach is well summarized by Kass and Raftery (1994,
p. 784): “The decision making problem is solved by maximizing the posterior
expected utility of each course of action considered. The latter is equal to a
weighted average of the posterior expected utilities conditional on each of the
models, with the weights equal to the posterior model probabilities.” In other
words, we argue that policy-relevant econometrics needs explicitly to identify
the objectives of the policymaker and then calculate the expected consequences
of a policy change.
                               Policy Assessment: Basic Ideas
The basic posterior coefficient density described by equation 12 and the associ-
ated first and second moments described by equations 14 and 15 represent data
summaries and as such have no implications for either inference or policy as-
sessment. The goal of a policy analysis is not to construct such summaries but to
assess the consequences of changes in a policy. Similarly, such data summaries
do not imply the validity of particular rules for data evaluation or inference. For
example, the assessment of whether regressors are robust, such as is done in extreme
bounds analysis or the comparison of models using Bayes factors,21 may not be
appropriate for certain policy exercises. Put differently, decisions on whether to
treat regressors as robust and the like should, for the purposes of policy analy-
sis, be derived from the policymaker’s assessment of the expected payoffs asso-
ciated with alternative policies.
   In this section we explore policy assessment when model uncertainty has been
explicitly accounted for. The purpose of this exercise is twofold. First, it cap-
tures what we believe is the appropriate way for policymakers to draw infer-
ences from data. Second, it shows that various rules for the assessment of re-
gressor fragility, such as extreme bounds analysis, will arise in such exercises. A
critical feature of this approach to model assessment is that it illustrates that the
evaluation of regressor robustness can be derived from particular aspects of the
policymaker’s objective function.
   For expositional purposes we initially suppose that the goal of an empirical
exercise is to evaluate the effect of a change dzi in some scalar variable that is
under the control of a policymaker and believed to have some effect on growth.22
Therefore, the decisionmaker’s set of actions A is {0,dzi}. The decision rule is
based on a vector of observable data D. This means that a decisionmaker chooses
a rule f(·) that maps D to A so that
(16)                           f(D) = dzi if D ∈ D1
                               f(D) = 0 otherwise.
D1 is therefore the acceptance region for the policy change. We assume that the
“true” linear growth model is a causal relationship that will allow evaluation of
the effect of this change.

   21. For any two models Mm and Mm', the Bayes factor Bm,m' is defined as m(D | Mm) / m(D | Mm').
Kass and Raftery (1994) provide an extensive overview of the use of Bayes factors in model evaluation.
   22. Without loss of generality, we generally assume that dzi > 0.
   Because we restrict ourselves to linear models, the analysis of the policy deci-
sion is particularly straightforward, as m(bz | D) will describe the posterior distri-
bution of the effect of a marginal change in z on growth in a given country. A
marginal policy intervention in country i can be evaluated as follows. Let zi de-
note the level of a policy instrument in country i. This instrument appears as one
of the regressors in the linear model that describes cross-country growth. Sup-
pose one has the option of either keeping the policy instrument at its current
value or changing it by a fixed amount dzi. Let gi denote the growth rate in the
country in the absence of the policy change, and gi + bzdzi the growth rate with
the change. Finally, let V(gi,Oi) denote the utility value of the growth rate to the
policymaker. Oi is a placeholder vector that contains any factors relating to
country i that affect the policymaker’s utility.
   An expected utility assessment of the policy change can be based on the
comparison
(17)                E(V(gi + bzdzi, Oi) | D) – E(V(gi, Oi) | D).
Calculations of the expected utility differential in equation 17 implicitly contain
all information relevant to a policy assessment. From the perspective of policy
evaluation, the various rules that have been proposed for the assessment of re-
gressor robustness should be an implication of this calculation. Notice that this
calculation requires explicitly accounting for model uncertainty, because the
conditioning is always done solely with respect to the data.
             Policy Assessment under Alternative Utility Functions
In this section we consider the implications of some alternative utility functions
for the analysis of growth regressions. Our goal is to show how particular utility
functions will lead a policymaker to decide whether or not to implement a policy
on the basis of aspects of the posterior distribution of bz. We do not claim that
the utility functions we examine are particularly compelling. We have chosen
them to illustrate what sort of utility functions can justify some of the standard
ways of interpreting growth regressions.

Risk neutrality. Suppose that V is linear and increasing in the level of growth,
that is,
(18)                             V(gi,Oi) = a0 + a1gi, a1 > 0.
For this policymaker the relevant statistic is the posterior mean of the regressor
coefficient. In this case it is straightforward to see that the policy change is justi-
fied if the expected value of the change in the growth rate is positive, that is,

(19)                            \sum_m m(M_m \mid D)\, E(b_z \mid D, M_m) > 0.

When the prior model probabilities are equal, this is equivalent to the condition

(20)                            \sum_m m(D \mid M_m)\, E(b_z \mid D, M_m) > 0

so the likelihoods m(D | Mm) determine the relative model weights.

Mean/variance utility over changes in the growth rate. Suppose that a
policymaker has preferences that relate solely to changes in the growth rate, as
opposed to its level. The idea here is that a policymaker assesses a policy relative
to the baseline gi. Operationally, we assume that one chooses the elements of Oi
and the functional form of V(· , ·) so that
(21)   E(V(g_i + b_z\,dz_i, O_i) \mid D) - E(V(g_i, O_i) \mid D) =
         a_0 E(b_z\,dz_i \mid D) + a_1 \mathrm{var}(b_z\,dz_i \mid D)^{1/2}, \quad a_0 > 0,\ a_1 < 0.
When |a0/a1| =1/2, this utility specification implies that the policymaker will act
only if the t-statistic (the posterior expected value of bz divided by its posterior
standard deviation) is greater than 2. Hence this specification, at least qualita-
tively, corresponds to the standard econometric practice of ignoring regressors
whose associated t-statistics are less than 2.
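
To see where the threshold of 2 comes from, the condition that the utility differential in equation 21 be positive can be rearranged as follows. The steps assume dz_i > 0 and a strictly positive posterior standard deviation; the shorthand sd(·) for the posterior standard deviation is introduced here only for convenience.

```latex
\begin{align*}
a_0 E(b_z\,dz_i \mid D) + a_1 \operatorname{var}(b_z\,dz_i \mid D)^{1/2} > 0
&\iff a_0\,dz_i\,E(b_z \mid D) + a_1\,dz_i\,\operatorname{sd}(b_z \mid D) > 0 \\
&\iff \frac{E(b_z \mid D)}{\operatorname{sd}(b_z \mid D)} > -\frac{a_1}{a_0}
      = \left|\frac{a_1}{a_0}\right| = 2
\quad \text{when } \left|\frac{a_0}{a_1}\right| = \tfrac{1}{2}.
\end{align*}
```

The step size dz_i cancels because both the mean and the standard deviation scale linearly in it, so under this specification the decision does not depend on the magnitude of the policy change.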
   From a decision-theoretic perspective, the conventional practice of ignoring
“statistically insignificant” coefficients (by which we mean coefficients whose
posterior expected values are less than twice their posterior standard errors)
can be justified only in very special cases. First, it is necessary to assume that the
form of risk aversion of the policymaker applies to the standard deviation rather
than to the variance of the change in growth. Otherwise, the desirability of the
policy will depend on the magnitude of dzi. For example, if the utility function is
(22)                   E(V(gi + bzdzi,Oi) | D) – E(V(gi, Oi) | D) =
                     a0E(bzdzi | D) + a1var(bzdzi | D), a0 > 0, a1 < 0
with |a0/a1| = 1/2, there will be a threshold level T such that for all 0 < dzi ≤ T a
policy change increases the policymaker’s utility.23 Therefore, the rule of ignor-
ing regressors with t-statistics less than 2 presupposes a very specific assump-
tion about how risk affects the policymaker’s utility. Second, if equation 22 is
the correct utility function, the policymaker may still choose to act with the fixed
dzi level we started with under (conventionally defined) statistical insignificance
or, alternatively, may decline to act when the coefficient is statistically significant.
These possibilities can be generated through appropriate choices of |a0/a1|.

    23. This is an example of the famous result of Pratt that one will always accept a small amount of
a fair bet. We plan to address the question of the optimal choice of dzi in future work.
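
The dependence on the size of the policy change under equation 22 can be made explicit with a short derivation. It assumes dz_i > 0, E(b_z | D) > 0, and var(b_z | D) > 0; the symbol \bar{T} for the boundary value is introduced here for illustration and is not the article's notation.

```latex
\begin{align*}
a_0 E(b_z\,dz_i \mid D) + a_1 \operatorname{var}(b_z\,dz_i \mid D) > 0
&\iff a_0\,dz_i\,E(b_z \mid D) + a_1\,dz_i^2\,\operatorname{var}(b_z \mid D) > 0 \\
&\iff dz_i < \left|\frac{a_0}{a_1}\right| \frac{E(b_z \mid D)}{\operatorname{var}(b_z \mid D)} \equiv \bar{T}.
\end{align*}
```

With |a_0/a_1| = 1/2 the boundary is \bar{T} = E(b_z | D) / (2 var(b_z | D)), so any threshold T below \bar{T} satisfies the statement in the text. Because the variance term scales with the square of dz_i while the mean scales linearly, a sufficiently small change is always worthwhile when the posterior mean is positive, whatever the t-statistic.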

Knightian uncertainty and maximin preferences. In the examples we have
studied thus far we have allowed all uncertainty about the correct model Mm to
be reflected in the posterior model probabilities m(Mm | D). An alternative ap-
proach to model uncertainty, one in the tradition of Knightian uncertainty, as-
sumes that an additional layer of uncertainty exists in the environment under
study that may be interpreted as a distinct type of risk, sometimes called ambi-
guity aversion, as will be seen below.
   As before, let M denote the universe of possible growth models. A risk sensi-
tive utility function for the policymaker can be defined as
(23)   (1 - e)\, E(V(g_i, O_i) \mid D) + e \left( \inf_{M_m \in M} E(V(g_i, O_i) \mid D, M_m) \right).
In this equation e denotes the degree of ambiguity.
   This equation is motivated by recent efforts to reconceptualize utility theory
in light of results such as the classic Ellsberg paradoxes. For example, if experi-
mental subjects are given a choice between (1) receiving $1 if they draw a red
ball at random from an urn that they know contains 50 red balls and 50 black
balls and (2) receiving $1 if they draw a red ball when the only information avail-
able is that the urn contains 100 red and black balls, the subjects typically choose
the first, “unambiguous” urn (Camerer 1995, p. 646). Clearly, if subjects were
Bayesians who placed a flat prior on the distribution of the balls in case 2, they
would be indifferent between the two options.24
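
The indifference claim for a Bayesian with a flat prior can be verified with exact arithmetic. The sketch below assumes the flat prior is placed on the number of red balls in the second urn, taking each value from 0 to 100 as equally likely; that reading of "flat prior" is an assumption made for the example.

```python
from fractions import Fraction

n_balls = 100

# Urn 1: composition known to be 50 red and 50 black
p_red_known = Fraction(50, n_balls)

# Urn 2: flat prior over the number of red balls r = 0, 1, ..., 100
prior = Fraction(1, n_balls + 1)
p_red_ambiguous = sum(prior * Fraction(r, n_balls) for r in range(n_balls + 1))

print(p_red_known, p_red_ambiguous)
```

Both probabilities equal 1/2, so an expected utility maximizer with this prior has no reason to prefer the first urn; a systematic preference for it is what the ambiguity aversion literature sets out to represent.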
   Experimental evidence of ambiguity aversion has led researchers—including
Anderson, Hansen, and Sargent (1999); Epstein and Wang (1994); and Gilboa
and Schmeidler (1989)—to consider formal representations of preferences that
exhibit ambiguity aversion. One popular representation, studied in Epstein and
Wang (1994), replaces expected utility calculations of the form ∫u(ω)dP(ω) with
inf_{P ∈ 𝒫} ∫u(ω)dP(ω), where 𝒫 is a space of possible probability measures. When
this space contains a single element, this second expression reduces to the first,
which is the standard expected utility formulation. A variant of this formulation
is to assume that 𝒫 consists of a set of mixture distributions (1 – e)P0 + eP1,
where P0 is a baseline probability measure that a policymaker believes to be true,
P1 is the least favorable of all possible probability measures for the policymaker,
and e represents the strength of the possibility that this measure applies. When
the universe of alternative processes for growth is the space of linear models that
we have described, one can replace P with Mm and 𝒫 with M and obtain the second
part of equation 23. In using this specification, we do not claim that it is the
only sensible way to model ambiguity aversion by policymakers. We introduce
it to illustrate how recent developments in decision theory may be linked to econometric
practice.

   24. See Camerer (1995) for additional examples of ambiguity aversion as well as a survey of the
implications of different results in the experimental economics literature for utility theory.
    We can explore the effects of this additional uncertainty on our analysis by
considering the two specifications of V studied above. First, assume that V is
linear and increasing, while equation 23 characterizes the ambiguity aversion
we have described. In this case the policy change dz is justified if
(24)              $(1 - e)\,E(b_z \mid D) + e\,\inf_{M_m \in M} E(b_z \mid D, M_m) > 0.$
   When e = 1, the policy action will be taken only when E(bz | D,Mm) > 0 for all
Mm ∈ M. This has an interesting link to OLS coefficients for different models in M.
In the leading case described in the technical appendix, the posterior expectation
E(bz | D,Mm) equals $\hat{b}_{z,m}$, the OLS coefficient associated with the regressor z for
model Mm. If e = 1, this utility function would then mean that a policymaker
will choose to implement dzi if the OLS coefficient estimate of bz is positive for
every model in M.
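
The rule in equation 24 is simple to evaluate once per-model posterior means and posterior model probabilities are available. The following Python sketch is illustrative only: the function name, argument names, and numbers are hypothetical and are not taken from the article, and E(bz | D) is assumed to be the probability-weighted average of the per-model posterior means.

```python
import numpy as np

def ambiguity_averse_gain(post_means, model_probs, e):
    """Left-hand side of equation 24.

    post_means  : E(bz | D, Mm) for each model Mm (the OLS coefficient
                  in the leading case)
    model_probs : posterior model probabilities, summing to 1
    e           : weight on the least favorable model, 0 <= e <= 1
    """
    post_means = np.asarray(post_means, dtype=float)
    model_probs = np.asarray(model_probs, dtype=float)
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must lie in [0, 1]")
    averaged = float(model_probs @ post_means)  # E(bz | D), assumed mixture mean
    worst = float(post_means.min())             # inf over Mm of E(bz | D, Mm)
    return (1.0 - e) * averaged + e * worst

# Hypothetical inputs: three models, one with a negative coefficient.
b = [0.04, 0.02, -0.01]
p = [0.5, 0.3, 0.2]
for e in (0.0, 0.5, 1.0):
    gain = ambiguity_averse_gain(b, p, e)
    print(f"e = {e:.1f}: gain = {gain:+.4f}, act = {gain > 0}")
```

With these made-up numbers the policymaker acts for e = 0 and e = 0.5 but not for e = 1, because a single model with a negative coefficient is then decisive.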
   Alternatively, assume that the policymaker is risk-averse in the sense that
equation 21 describes his utility function. In this case the policy change should
be implemented if
(25)    $(1 - e)\left(a_0 E(b_z \mid D) + a_1 \mathrm{var}(b_z \mid D)^{1/2}\right) + e\,\inf_{M_m \in M}\left(a_0 E(b_z \mid D, M_m) + a_1 \mathrm{var}(b_z \mid D, M_m)^{1/2}\right) > 0.$

Again, this rule has an interesting link to OLS parameter estimates. If e = 1 and
|a0/a1| = 1/2, then for the leading case in the technical appendix, the policymaker
will not act unless the OLS regression coefficient $\hat{b}_{z,m}$ is positive and statistically
significant (in the sense that the t-statistic is at least 2) for each model in M. (Here,
we rely on the additional fact that for the leading case, var(bz | D,Mm) equals the
OLS variance of $\hat{b}_{z,m}$.)
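
A short Python sketch may help make the link between equation 25 and t-statistics concrete. It rests on assumptions that are not stated in this passage: a1 is taken to be negative so that posterior dispersion is penalized, and var(bz | D) is computed with the usual mixture-variance formula. All names and numbers are hypothetical.

```python
import numpy as np

def risk_averse_gain(post_means, post_vars, model_probs, e, a0=1.0, a1=-2.0):
    """Left-hand side of equation 25 under mean/standard-deviation utility.

    With a0 = 1 and a1 = -2 (so |a0/a1| = 1/2), the per-model term
    a0*b + a1*sd is positive exactly when the t-statistic b/sd exceeds 2.
    """
    b = np.asarray(post_means, dtype=float)
    v = np.asarray(post_vars, dtype=float)
    p = np.asarray(model_probs, dtype=float)
    mean = float(p @ b)
    # Mixture variance (assumed): within-model variance plus between-model spread.
    var = max(float(p @ (v + b**2) - mean**2), 0.0)
    averaged_term = a0 * mean + a1 * np.sqrt(var)
    worst_term = float(np.min(a0 * b + a1 * np.sqrt(v)))
    return (1.0 - e) * averaged_term + e * worst_term

# Hypothetical coefficients and standard errors for two models.
coefs = np.array([0.03, 0.05])
for ses in (np.array([0.01, 0.02]), np.array([0.01, 0.03])):
    gain = risk_averse_gain(coefs, ses**2, [0.5, 0.5], e=1.0)
    print("t-statistics:", np.round(coefs / ses, 2), "act:", gain > 0)
```

In the first case both t-statistics exceed 2 and the rule recommends acting; in the second case one t-statistic falls below 2 and the rule does not.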
   The policy rules that hold for e = 1 are closely related to the recommenda-
tions made by Leamer for assessing coefficient fragility through extreme bounds
analysis (see Leamer 1983 and Leamer and Leonard 1983). In extreme bounds
analysis, recall that when a regressor “flips signs” across specifications, this is
argued to imply that the regressor is fragile. From the perspective of policy rec-
ommendations, we interpret this notion of fragility to mean that no policy change
should be made when there is a model of the world under which the policy change
can be expected to make things worse off. This suggests that extreme bounds
analysis is based on a maximin assumption of some type. Our derivations show
that this intuition can be formalized.
   This derivation of extreme bounds analysis appears to complement a number
of the objections raised against it by Granger and Uhlig (1990) and McAleer,
Pagan, and Volker (1985). Both these articles argue that extreme bounds analy-
sis can lead to spurious rejections of regressors as a result of changes in sign in-
duced by regressions that are, by standard tests, misspecified. In our view these
criticisms need to be developed from the perspective of the objectives of the
empirical exercise. Put differently, the salience of these critiques of extreme
bounds analysis requires that one reject the utility functions we have described
as supporting extreme bounds analysis.
   Further, we believe that our derivations provide an appropriate way of modi-
fying extreme bounds analysis—through the use of utility functions, such as
equation 23 for 0 < e < 1. For such cases the relative goodness of fit of different
models will be relevant to the empirical exercise. As is well known (see Wasserman
2000, p. 94 for a nice exposition), when Bayesian model selection between two
models is based on posterior odds ratios and the prior odds on the models are
equal, the posterior odds ratios will equal the ratio of their likelihoods, that is,
the posterior odds will reflect the relative likelihoods of the data under the
alternative models. Further, as the amount of data becomes large enough, for this
special case of equal prior odds, the model with the minimum Kullback-Leibler
Information Criterion (KLIC) distance to the “truth” will be revealed. If the set
of models under scrutiny includes the true model, the true model will be revealed
in large samples.25 Thus in our context, under our assumption of equal prior
odds across models, we may expect the data in large samples to ultimately place
greater weight on models whose KLIC distance to the true model is smaller.26
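
The equal-prior-odds case can be illustrated numerically. The Python sketch below converts log likelihoods into posterior model probabilities; the function name and the log-likelihood values are hypothetical, and the inputs are assumed to be marginal (integrated) log likelihoods of the data under each model.

```python
import numpy as np

def posterior_model_probs(log_likelihoods, prior_probs=None):
    """Posterior model probabilities from per-model log likelihoods."""
    ll = np.asarray(log_likelihoods, dtype=float)
    if prior_probs is None:
        prior = np.full(ll.shape, 1.0 / ll.size)  # equal prior odds
    else:
        prior = np.asarray(prior_probs, dtype=float)
    log_post = ll + np.log(prior)
    log_post -= log_post.max()  # subtract the maximum for numerical stability
    weights = np.exp(log_post)
    return weights / weights.sum()

# Hypothetical log likelihoods for two models.
probs = posterior_model_probs([-100.0, -102.0])
print("posterior probabilities:", probs)
print("posterior odds:", probs[0] / probs[1])
print("likelihood ratio:", np.exp(-100.0 - (-102.0)))
```

Under equal priors the posterior odds printed on the second line coincide with the likelihood ratio printed on the third.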
   By choosing 0 < e < 1 for policymaker utility functions such as equations 24
and 25, one can retain the ambiguity aversion that justifies (in a limiting case)
extreme bounds analysis. In particular, one can reflect a policymaker’s desire to
avoid harm when he faces scientific ambiguity, but at the same time prevent him
from being so ambiguity-averse that he fails to take welfare-enhancing actions
that are supported by relatively good posterior odds under available scientific
evidence (especially when samples are large enough to contain some policy-
relevant predictive information). Notice that our treatment avoids the criticism
of Bayes factors in Kadane and Dickey (1980) that the weights do not account
for the purpose of the empirical exercise.

Alternative utility functions. In the previous section we assumed that the
policymaker cares only about the level of or change in growth induced by a policy
change. It is of course possible to imagine other plausible utility functions for a
policymaker. One possibility is to assume that a policymaker evaluates a policy change
on the basis of changes in the expected value of growth within a regime;
formally, one assumes that there exists a function y such that
(26)                       $y\left(E(g_i + b_z\,dz_i \mid D, M_m) - E(g_i \mid D, M_m)\right)$



    25. See White (1994) for a discussion of measures of closeness based on KLIC and how various
estimators achieve minimum KLIC distance to the true model in large samples.
    26. There are some subtleties involved in making the argument we have sketched precise. For ex-
ample, regularity conditions need to be assumed to justify assertions about the relationship between
KLIC distance minimization and quasi-maximum likelihood estimation. Furthermore, as Fernandez, Ley,
and Steel (2001) point out, the form of priors for parameters within models raises thorny issues. Never-
theless, we believe that this heuristic argument is useful.

measures the utility for a policy change conditional on a particular growth model.
Again assuming the leading case where E(bz | D,Mm) equals the OLS coefficient
$\hat{b}_{z,m}$, linearity of the expected growth process implies that the expected utility
from dzi, once one accounts for model uncertainty, is

(27)       $E\left(y(E(b_z\,dz_i \mid D, M_m)) \mid D\right) = \sum_m m(M_m \mid D)\, y\left(E(b_z\,dz_i \mid D, M_m)\right) = \sum_m m(M_m \mid D)\, y(\hat{b}_{z,m}\,dz_i).$

When y(·) is linear, this reduces to the risk-neutral case discussed earlier. How-
ever, alternative utility functions can produce very different decision rules. For
example, suppose that either y(c) = – ∞ if c < 0, y(·) bounded otherwise, or
y(c) = – ∞ if c > 0, y(·) bounded otherwise. One will then have the implied
decision rule that a single sign change in the OLS coefficient estimate $\hat{b}_{z,m}$ as one
moves across models is sufficient to imply that the policymaker should not act
to either increase or decrease zi by dz. This type of utility function induces be-
havior mimicking that found under Knightian uncertainty.
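
The sign-change rule implied by equation 27 can be checked with a few lines of Python. This is a sketch under stated assumptions: the utility function `y_no_harm` is one hypothetical member of the class described above (unbounded penalty for c < 0, bounded otherwise), not acting is assumed to yield y(0), and every model is assumed to have strictly positive posterior probability.

```python
import math

def expected_utility(coefs, model_probs, dz, y):
    """Right-hand side of equation 27: sum over models of m(Mm | D) * y(b_m * dz)."""
    return sum(p * y(b * dz) for b, p in zip(coefs, model_probs))

def y_no_harm(c):
    """Hypothetical utility: -infinity for any growth reduction, bounded otherwise."""
    return -math.inf if c < 0 else math.tanh(c)

def should_act(coefs, model_probs, dz, y):
    """Act only if expected utility exceeds the utility of no change, y(0)."""
    return expected_utility(coefs, model_probs, dz, y) > y(0.0)

# Hypothetical per-model OLS coefficients and posterior model probabilities.
same_sign = ([0.02, 0.01], [0.6, 0.4])
sign_flip = ([0.02, -0.01], [0.9, 0.1])

print(should_act(*same_sign, dz=1.0, y=y_no_harm))   # coefficients agree in sign
print(should_act(*sign_flip, dz=1.0, y=y_no_harm))   # increase ruled out
print(should_act(*sign_flip, dz=-1.0, y=y_no_harm))  # decrease ruled out as well
```

Even a model with posterior probability 0.1 blocks both an increase and a decrease in the instrument once its coefficient has the opposite sign.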
   At first glance this might appear to be an unreasonable utility function for a
policymaker. This conclusion is at least partially incorrect. Suppose that each
state of the world is indexed by the growth process that is “true” under it. The
utility of the policymaker will then depend on both the growth rate that is ex-
pected to prevail and the state of the world under which it transpires. For ex-
ample, suppose that there is a model of the world in which the expected effect of
democracy on growth is negative. Such a model could be one whose features
imply that a policymaker is particularly wary of reducing growth by changing a
given policy instrument. For example, if there is a (positive probability) model
of the world in which democracy is especially fragile and may not survive a growth
reduction, a policymaker might be especially wary of the policy change for fear
this would prove to be the correct model of the world.
   This type of argument can be formalized by considering model-dependent
utility specifications. Suppose that conditional on model Mm, the utility from a
policy change is equal to
(28)                 $U(E(b_z \mid D, M_m)\,dz_i, M_m) = U(\hat{b}_{z,m}\,dz_i, M_m)$
so that the posterior expected utility of the policy change is
(29)            $E\left(U(\hat{b}_{z,m}\,dz_i, M_m) \mid D\right) = \sum_m m(M_m \mid D)\, U(\hat{b}_{z,m}\,dz_i, M_m).$

Manipulating U(·,·), one can produce (under the leading case) a result that is
consistent with refusing to act whenever the posterior mean $\hat{b}_{z,m}$ is negative for
at least one Mm, thereby producing extreme bounds–like behavior in the sense
that one would not choose dzi > 0, even though for all other models $\hat{b}_{z,m}$ is positive.
                       Policy Analysis and Exchangeability
A decision-theoretic approach of the type we have advocated makes clear the
importance of a growth model being rich enough for a researcher to plausibly
regard the observations as F-conditionally exchangeable. Suppose that a re-
searcher is using data from I countries to provide a recommendation for the
optimal choice of zi subject to some constraint set Zi for country i. In other words,
a researcher is attempting to solve the problem
(30)                                 $\max_{z_i \in Z_i} E(V(g_i, z_i))$
where information in computing this expression is taken from the regression
described by equation 1.
   What information in equation 1 is relevant to this calculation? The answer
depends on the shape of V. Suppose that V is linear in growth rates, as in equa-
tion 18 above. The only information needed about the growth process as de-
scribed by equation 1 is the posterior expected value of bz. In our leading case,
the OLS coefficient in a growth regression will be sufficient for policy analysis as
long as all countries are described by a common linear model. Growth rates need
not be partially exchangeable, because partial exchangeability requires symme-
try with respect to all moments of the growth process. Similarly, suppose that V
is quadratic. In this case one will need only the second moments from the poste-
rior densities generated by equation 1 to apply to country i; partial exchange-
ability is still not necessary.
   However, if V is arbitrary, one will need to employ equation 1 to obtain in-
formation on the full conditional distribution F(ei | Xi,Zi). To reveal this type of
statistic from cross-country data, one will require full F-conditional exchange-
ability of the type we have discussed.

                             VII. An Empirical Example

In this section we reconsider an important growth study, Easterly and Levine
(1997), which examines the role of ethnic conflict in growth.27 We chose to re-
examine this study for three reasons. The study is widely regarded as quite im-
portant in the growth literature. It has important implications for policy and the
sorts of advice and advocacy an international organization would engage in. And
the authors of the study have done an admirable job of making their data and
programs publicly available.
   Easterly and Levine’s analysis is designed to explain why in standard cross-
country growth regressions the performance of Sub-Saharan Africa28 is so much
worse than that of the rest of the world. Rather than remain content with mod-
eling this phenomenon as a fixed effect (a dummy variable) for these countries,
Easterly and Levine argue that a major cause of the poor growth performance is
the presence of ethnic conflict in these countries. They construct a measure of
ethnic diversity to proxy for this conflict. This variable is substantially larger for


   27. We thank Duncan Thomas for suggesting to us that the findings in Easterly and Levine (1997)
warrant reexamination.
   28. See the data appendix for the list of the countries in Sub-Saharan Africa.

Sub-Saharan Africa than for the rest of the world. Inclusion of the variable in a
cross-country growth regression reduces the size of the African fixed effect and
is itself statistically significant. Easterly and Levine (1997, p. 1241) conclude that
“the results lend support to the theories that interest group polarization leads to
rent-seeking behavior and reduces the consensus for public goods, creating
long-run growth tragedies.”
    Our reexamination of this study has an explicitly narrow focus. In our view it
is important to see whether and how the influence of ethnolinguistic heteroge-
neity on growth depends on what other variables are included in the regression.
Further, a natural alternative to the claim that the African growth experience is
different because of an omitted variable, ethnolinguistic heterogeneity, is that
other growth determinants influence Africa differently than they do the rest of
the world. Put differently, parameter heterogeneity is a natural alternative ex-
planation.
    We therefore conduct the following analysis to account for the effect of model
uncertainty on Easterly and Levine’s results. A data appendix describes the vari-
ables we employ; these are identical to those used in Easterly and Levine (1997).
The data are based on decade-long average observations for the 1960s, 1970s,
and 1980s, except where indicated in the appendix. We focus on a reexamina-
tion of Easterly and Levine’s equation 3, table IV, which by conventional mea-
sures (such as the statistical significance of all included variables) is arguably their
strongest regression in support of the role of ethnic diversity in growth. Our results
using this regression are reported in column 1 of our table 1. The key variable of
interest is ELF60, a measure of ethnic diversity in each country in 1960.
    We explore the role of model uncertainty in two ways. We first consider the
impact of theory uncertainty on inferences about the determinants of growth.
We do this by constructing a universe of models that consists of all possible com-
binations of the variables in Easterly and Levine’s baseline regression. This ex-
ercise should be interpreted as a robustness check for Easterly and Levine’s re-
sults. To perform this exercise, we employ an approximation algorithm whereby
posterior model probabilities are replaced with their maximum likelihood esti-
mates. We perform the subsequent calculation of the posterior mean and stan-
dard deviation of each regression coefficient using formulas 14 and 15.29
    Our results incorporating theory uncertainty are reported in column 2 of table
1. Interestingly, we find that the evidence of a role for ethnic diversity in the
growth process is slightly strengthened through the model averaging technique.
Specifically, the posterior mean of ELF60 is –0.02 under model averaging, com-
pared with the –0.017 estimate reported by Easterly and Levine. Our primary
conclusion from this exercise is that Easterly and Levine’s main result is robust
to theory uncertainty as we have characterized it.

    29. See the computational appendix for details on the calculation of these quantities. Ethnolinguistic
heterogeneity is not, of course, directly subject to a policymaker’s control, so we do not explore the
issues raised in section VI. The policy importance of the variable stems from the implications of its
importance to questions of institutional design.

Table 1. OLS and Bayesian Model Averaging Coefficient Estimates and
Standard Errors Using Data from Easterly and Levine (1997)

Variable                           [1]         [2]         [3]         [4]         [5]         [6]

Intercept term                       —           —           —           —      0.4013      0.1382
                                     —           —           —           —    (0.3985)    (0.0336)
Dummy for Sub-Saharan          –0.0113     –0.0031      0.9558      0.0761           —           —
  Africa                      (0.0048)    (0.0053)    (0.3704)    (0.0302)           —           —
Dummy for Latin America        –0.0191     –0.0197     –0.0197     –0.0184           —           —
  and the Caribbean           (0.0036)    (0.0042)    (0.0035)    (0.0037)           —           —
Dummy for 1960s                –0.2657     –0.2200     –0.3643     –0.0028           —           —
                              (0.0998)    (0.1765)    (0.1328)    (0.0326)           —           —
Dummy for 1970s                –0.2609     –0.2154     –0.3520      0.0009      0.0080      0.0050
                              (0.0997)    (0.1745)    (0.1332)    (0.0325)    (0.0134)    (0.0079)
Dummy for 1980s                –0.2761     –0.2298     –0.3650     –0.0143     –0.0038     –0.0024
                              (0.0996)    (0.1751)    (0.1336)    (0.0325)    (0.0132)    (0.0058)
Log of initial income           0.0870      0.0756     –0.1090      0.0218     –0.0696     –0.0004
                              (0.0254)    (0.0444)    (0.0986)    (0.0088)    (0.1171)    (0.0027)
Log of initial income          –0.0063     –0.0056      0.0070     –0.0022      0.0044     –0.0000
  squared                     (0.0016)    (0.0029)    (0.0067)    (0.0006)    (0.0088)    (0.0002)
Log of schooling                0.0117      0.0130     –0.0220      0.0130     –0.0131     –0.0017
                              (0.0042)    (0.0056)    (0.0216)    (0.0045)    (0.0194)    (0.0077)
Assassinations                –12.8169     –3.3629   –377.3810    –30.6120   –306.4870   –343.4434
                              (9.2709)    (7.8137)  (165.5661)   (86.9027)  (158.4484)  (181.6948)
Financial depth                 0.0162      0.0111      0.1010      0.0129      0.0774      0.0104
                              (0.0058)    (0.0083)    (0.0497)    (0.0075)    (0.0483)    (0.0278)
Black market premium           –0.0188     –0.0219     –0.0130     –0.0207     –0.0171     –0.0039
                              (0.0045)    (0.0053)    (0.0098)    (0.0043)    (0.0107)    (0.0081)
Fiscal surplus/GDP              0.1210      0.1717      0.1200      0.1382      0.1654      0.0948
                              (0.0314)    (0.0411)    (0.0874)    (0.0357)    (0.0986)    (0.1071)
Ethnic diversity (ELF60)       –0.0169     –0.0222     –0.2020     –0.1437     –0.1516     –0.1595
                              (0.0060)    (0.0066)    (0.0376)    (0.0279)    (0.0353)    (0.0327)

   [1] Ordinary least squares estimates for model “ALL”.
   [2] Bayesian model averaging estimates for model “ALL”.
   [3] Ordinary least squares estimates for model “ALL + ALL*I(AFRICA)”; composite coefficient
estimates and standard errors reported. AFRICA, LATINCA, and DUM60 dropped from AFRICA-specific
set of regressors.
   [4] Bayesian model averaging estimates for model “ALL + ALL*I(AFRICA)”; composite coefficient
estimates and standard errors reported. AFRICA, LATINCA, and DUM60 dropped from AFRICA-specific
set of regressors.
   [5] Ordinary least squares on AFRICA subsample.
   [6] Bayesian model averaging on AFRICA subsample.
   Note: Standard errors are in parentheses.


   As we have emphasized, theory uncertainty is not the only form of model
uncertainty that needs to be accounted for in cross-country analysis. We
therefore next incorporate heterogeneity uncertainty. Following equation 11, we do
this by constructing for each regressor xi in the baseline regressors a corresponding
variable xi dj,A, where dj,A = 1 if country j is in Sub-Saharan Africa and 0
otherwise. This allows for the possibility that the Sub-Saharan African countries have
different growth parameters than the rest of the world. Column 3 in table 1
reports the OLS values and standard errors of the regressor coefficients for the
African countries; column 4 reports the same statistics when model averaging
is done over the augmented variable set. Column 5 reports OLS estimates of
the growth regression coefficients and standard errors when the African
subsample is analyzed in isolation; column 6 reports the corresponding model
average results.
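
The construction of region-specific regressors can be sketched in Python as follows. The data are synthetic and every name is hypothetical. Unlike the specification behind columns 3 and 4 of table 1, which drops some regressors from the Africa-specific set, this sketch interacts every regressor with the regional dummy, so the composite coefficients coincide with OLS on the subsample.

```python
import numpy as np

def augment_with_interactions(X, region_dummy):
    """Return [X, X * d], where d = 1 for countries in the region and 0 otherwise."""
    X = np.asarray(X, dtype=float)
    d = np.asarray(region_dummy, dtype=float).reshape(-1, 1)
    return np.hstack([X, X * d])

# Synthetic data: an intercept and one regressor, with different
# parameters inside and outside the region.
rng = np.random.default_rng(0)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
africa = (np.arange(n) < 60).astype(float)
beta_world = np.array([0.5, -0.02])
beta_africa = np.array([0.2, -0.15])
g = np.where(africa == 1, X @ beta_africa, X @ beta_world)
g = g + 0.01 * rng.normal(size=n)

Z = augment_with_interactions(X, africa)
coef, *_ = np.linalg.lstsq(Z, g, rcond=None)
base, shift = coef[:k], coef[k:]
composite_africa = base + shift  # composite coefficient for the region

# OLS on the regional subsample alone, for comparison.
sub = africa == 1
coef_sub, *_ = np.linalg.lstsq(X[sub], g[sub], rcond=None)

print("rest of world:", np.round(base, 4))
print("region (composite):", np.round(composite_africa, 4))
print("region (subsample OLS):", np.round(coef_sub, 4))
```

The composite coefficient for the region is the sum of the baseline coefficient and the interaction coefficient, which is how the composite estimates in the table notes are assumed to be formed here.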
   Our explorations of the role of heterogeneity uncertainty provide a rather
different picture of the role of ethnicity in African growth than of its role in the
rest of the world. The coefficient estimates for Africa are about 7–10 times greater
than the corresponding estimates for the world.30 This result is extremely strik-
ing and makes clear that the operation of ethnic heterogeneity on growth is dif-
ferent in Africa, not just the levels of ethnic heterogeneity. Further, a compari-
son of the other regressor coefficients for Africa with those of the rest of the world
makes clear that the growth observations for African countries should not be
treated as partially exchangeable with the growth rates of the rest of the world.
   These results in no way diminish the importance of Easterly and Levine’s find-
ings. In fact, our exercises show that their basic claims are robust to a limited
variable uncertainty exercise. Our finding of parameter heterogeneity with re-
spect to ethnolinguistic heterogeneity suggests a direction along which to extend
their research. Our results illustrate how additional insights can be obtained by
explicitly controlling for model uncertainty.
   Finally, we again note that this reexamination is quite narrow. A full-scale
study should at a minimum include explicit calculations and presentation of the
predictive distribution of the effects of the policy change on growth. Fernandez,
Ley, and Steel (1999) provide a good illustration of how to present results of
this type. More generally, the reporting of results should always include the in-
formation necessary to calculate the posterior expected utility changes of the
policymaker. Our own reporting is useful for mean/variance utility functions,
but not for the others we have discussed. In addition, we have not allowed for
parameter heterogeneity for countries outside Sub-Saharan Africa; doing so is a
natural extension of this exercise. The results we report should be treated as
suggestive, in this sense, of more elaborate examinations of the role of ethnic
heterogeneity in the growth process.

                                      VIII. Conclusions

This paper has had two basic aims. First, we attempted to delineate the major
criticisms of cross-country growth regressions and to show how to interpret two

    30. Similar results are obtained when one compares Sub-Saharan Africa with the rest of the world.
When the Sub-Saharan African countries are dropped from the data set, the OLS estimate for the ELF60
regressor is –0.0115 with an associated standard error of 0.006. The associated values when model
averaging is done across different regressor combinations (to check for robustness to theory uncertainty)
are –0.013 and 0.009. By conventional levels, one would conclude that ethnicity is marginally statisti-
cally significant outside Sub-Saharan Africa.

of these criticisms, theory uncertainty and parameter uncertainty, as violations
of a particular assumption—F-conditional exchangeability—in the residual com-
ponents of growth models. Second, we outlined a framework for conducting and
interpreting growth regressions. For conducting regressions, we advocated an
explicit modeling of theory and heterogeneity uncertainty and the use of model
averaging to condition out strong assumptions. For interpreting regressions, we
argued that the policy objectives associated with a given exercise must be made
explicit in the analysis. We outlined a decision-theoretic approach to growth
regressions and explored its relationship to conventional approaches to assess-
ing model robustness. Finally, in an empirical application we showed how at-
tention to model uncertainty can provide new insights into the relationship be-
tween ethnicity and growth.
   To amplify some earlier remarks, we do not believe that there is a single privi-
leged way to conduct statistical or, for that matter, empirical analysis in the social
sciences. Persuasive empirical work always requires judgments and assumptions
that cannot be falsified or confirmed within the statistical procedure being em-
ployed.31 Indeed, this is the reason that we have not included a treatment of how
to provide more robust arguments in favor of causality in this article. What we
hope is that this article has provided some initial steps toward the development
of a language in which policy-relevant empirical growth research may be better
expressed.

                             Computational Appendix

All model averaging calculations were done using the program bicreg, which was
written in SPLUS by Adrian Raftery and is available at
www.research.att.com/~volinsky/bma.html. Given the large number of possible models, this program,
as is standard in the model averaging literature, uses a search algorithm that
explores only a subset of the model space; the key feature of the design of the
algorithm is that it ensures that the search proceeds along directions such that it
is likely to cover models that are relatively strongly supported by the data. We
follow the procedure suggested by Madigan and Raftery (1994); see Raftery,
Madigan, and Hoeting (1997) and Hoeting and others (1999) for additional
discussion. Though the reader should see those papers for a full description of
the search algorithm, Hoeting and others (1999, p. 385) provide a nice intuitive
description:
  First, when the algorithm compares two nested models and decisively
  rejects the simpler model, then all submodels of the simpler model are rejected.
  The second idea, “Occam’s window,” concerns the interpretation of the
  ratio of posterior model probabilities pr(M0 | D)/pr(M1 | D). Here M0 is
  “smaller” than M1. . . . If there is evidence for M0 then M1 is rejected, but
  rejecting M0 requires strong evidence for the larger model M1.
  31. See Draper and others (1993) and Mallows (1998) for valuable discussions of such issues.

   In implementing the model averaging procedure, the algorithm we employ
uses an approximation due to Raftery (1995). The idea is that, for a large
enough number of observations, the posterior distribution of the coefficients
will be concentrated near the maximum likelihood estimate, so one can use
the maximum likelihood estimates and avoid the need to specify a particular
prior. We refer the reader to Raftery (1994) as well as to Tierney and Kadane
(1986) for technical details. While some evidence exists that this approximation
works well in practice, more research is needed on the specification of priors for
model averaging; an important recent contribution is Fernandez, Ley, and Steel
(2001).
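
To fix ideas, the following Python sketch shows one way a likelihood-based approximation to model averaging can be computed. It is not the bicreg code. It assumes that posterior model probabilities are approximated by weights proportional to exp(–BIC/2) under equal prior model probabilities, that a coefficient is treated as exactly zero in models that exclude it, and that the model space is small enough to enumerate fully rather than searched with Occam's window. The data and all names are hypothetical.

```python
import itertools
import numpy as np

def bic_model_average(X, y, names):
    """Average OLS fits over every subset of regressors (intercept always kept)."""
    n, k = X.shape
    fits = []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
            resid = y - Z @ coef
            rss = float(resid @ resid)
            p = Z.shape[1]
            cov = (rss / (n - p)) * np.linalg.inv(Z.T @ Z)  # OLS covariance
            bic = n * np.log(rss / n) + p * np.log(n)       # Gaussian BIC
            b, v = np.zeros(k), np.zeros(k)
            for pos, j in enumerate(subset, start=1):       # position 0 is the intercept
                b[j], v[j] = coef[pos], cov[pos, pos]
            fits.append((bic, b, v))
    bics = np.array([f[0] for f in fits])
    w = np.exp(-0.5 * (bics - bics.min()))
    w /= w.sum()                                            # approximate model probabilities
    B = np.array([f[1] for f in fits])
    V = np.array([f[2] for f in fits])
    mean = w @ B
    var = np.maximum(w @ (V + B**2) - mean**2, 0.0)         # mixture variance
    return dict(zip(names, zip(mean, np.sqrt(var)))), w

# Synthetic example: x1 matters, x2 does not.
rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * rng.normal(size=n)
summary, weights = bic_model_average(X, y, ["x1", "x2"])
for name, (m, s) in summary.items():
    print(f"{name}: posterior mean {m:.3f}, posterior sd {s:.3f}")
```

Full enumeration is feasible only for a handful of regressors, since k candidate regressors generate 2^k models; this is why a search over a subset of the model space is used in practice.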

                   Data Appendix: Variable Definitions

All data are the same as those used in Easterly and Levine (1997).

  • AFRICA: Dummy variable for Sub-Saharan African countries, as defined by
      the World Bank. These countries are Angola, Benin, Botswana, Burkina
      Faso, Burundi, Cameroon, Cape Verde, the Central African Republic,
      Chad, Comoros, Democratic Republic of Congo, Republic of Congo, Côte
      d’Ivoire, Djibouti, Equatorial Guinea, Ethiopia, Gabon, The Gambia,
      Ghana, Guinea, Guinea-Bissau, Kenya, Lesotho, Liberia, Madagascar,
      Malawi, Mali, Mauritania, Mauritius, Mozambique, Namibia, Niger,
      Nigeria, Rwanda, São Tomé and Principe, Senegal, Seychelles, Sierra
      Leone, Somalia, South Africa, Sudan, Swaziland, Tanzania, Togo, Uganda,
      Zambia, and Zimbabwe.
  • ASSASS: Number of assassinations per 1,000 population.
  • BLCK: Black market premium, defined as log of 1 + decade average of black
      market premium.
  • DUM60: Dummy variable for 1960s.
  • DUM70: Dummy variable for 1970s.
  • DUM80: Dummy variable for 1980s.
  • ELF60: A measure of ethnic diversity, equalling an index of ethnolinguistic
      fractionalization in 1960. This variable measures the probability that two
      randomly selected individuals from a given country will not belong to the
      same ethnolinguistic group.
  • GYP: Growth rate of real per capita GDP.
  • LATINCA: Dummy variable for countries in Latin America and the
      Caribbean.
  • LLY: Financial depth, measured as the ratio of liquid liabilities of the
      financial system to GDP, decade average. Liquid liabilities consist of currency
      held outside the banking system plus demand and interest-bearing
      liabilities of banks and nonbank financial intermediaries.
  • LRGDP: Log of real per capita GDP measured at the start of each decade.
  • LRGDPSQ: Square of LRGDP.
  • LSCHOOL: Log of 1 + average years of school attainment, quinquennial
      values (1960–65, 1970–75, 1980–85).
  • SURP: Fiscal surplus/GDP: Decade average of ratio of central government
      surplus to GDP, both in local currency, local prices.


                                 Technical Appendix

                       1. De Finetti’s Representation Theorem
De Finetti’s theorem establishes that the symmetry inherent in the concept of
the exchangeability of errors leads to a representation of the joint distribution
of the errors in terms of an integral of the joint product of identical marginal
distributions against some conditional distribution function. The theorem is as
follows.

If hi is an infinite exchangeable sequence with associated probability measure P,
there exists a probability measure Q over F, the space of all distribution
functions on R, such that the joint distribution function F(hi–j, . . . , hi, . . . , hi+k) for any
finite collection hi–j, . . . , hi, . . . , hi+k may be written as
(A-1)                $F(h_{i-j}, \ldots, h_i, \ldots, h_{i+k}) = \int \prod_{r=-j}^{k} F(h_{i+r})\, dQ(F).$
See Bernardo and Smith (1994, p. 177), for this formulation of de Finetti’s theo-
rem as well as a proof.
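
A small simulation can illustrate the mixture structure in equation A-1. The Python sketch below uses a hypothetical construction that is not taken from the article: each sequence shares one latent normal draw and is independent and identically distributed given that draw. The elements are then exchangeable but not independent.

```python
import numpy as np

def draw_exchangeable(n_sequences, length, rng):
    """Draw sequences that are i.i.d. conditional on a latent draw theta.

    Marginally, each element is N(0, 2) and any two elements of the same
    sequence have correlation 0.5, regardless of their positions.
    """
    theta = rng.normal(size=(n_sequences, 1))               # latent draw, plays the role of Q
    return theta + rng.normal(size=(n_sequences, length))   # i.i.d. given theta

rng = np.random.default_rng(0)
h = draw_exchangeable(50_000, 3, rng)

print("variances by position:", np.round(h.var(axis=0), 3))
print("correlation matrix:")
print(np.round(np.corrcoef(h, rowvar=False), 3))
```

The sample variances are the same at every position and every pair of positions shows the same positive correlation, which is the symmetry that exchangeability requires and that independence would rule out.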
        2. Some Relations between OLS Estimates and Bayesian Posteriors
For the linear model
(A-2)                             $g_i = S_i z + e_i, \quad i = 1, \ldots, I$
suppose that (1) conditional on S1 . . . SI, the ei's are independent and identically
distributed and jointly normal; the marginal distribution of the typical element
is $N(0, s_e^2)$, (2) $s_e^2$ is known, and (3) prior information on z is characterized by
the noninformative (improper) prior
(A-3)                                       $m(z) \propto c$
where c is a constant. Denote the OLS estimate (as well as the classical maximum
likelihood estimate) of z as $\hat{z}$, and denote the data matrix of regressors in
equation A-2 as S.
   As shown for example in Box and Tiao (1973, p. 115), the posterior density
of the parameter vector z given the available data D, m(z | D,M), is, under our
assumptions, multivariate normal. Specifically,
(A-4)                           $m(z \mid D) \sim N\left(\hat{z}, (S'S)^{-1} s_e^2\right)$

The posterior density of any particular coefficient can of course be calculated
from this vector density. Under the assumptions justifying A-4, the posterior
mean and variance of z therefore correspond to the standard OLS estimates of
the parameter vector and its associated covariance matrix.
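
The correspondence in A-4 can be checked numerically. The sketch below is a minimal illustration for the one-regressor case, assuming simulated data and an arbitrary known error variance: it evaluates the flat-prior posterior of z on a grid (the posterior is then proportional to the normal likelihood) and compares its mean and variance with the OLS estimate and the OLS variance. The data-generating values, grid limits, and variable names are assumptions chosen for the example.

```python
import math
import random

rng = random.Random(1)
s2_e = 0.25    # known error variance s_e^2 (assumed value)
z_true = 0.8   # coefficient used to simulate the data (assumed value)
S = [rng.uniform(-2.0, 2.0) for _ in range(50)]
g = [z_true * s + rng.gauss(0.0, math.sqrt(s2_e)) for s in S]

# OLS estimate of z and its variance, (S'S)^{-1} s_e^2, in the scalar case.
StS = sum(s * s for s in S)
z_hat = sum(s * y for s, y in zip(S, g)) / StS
ols_var = s2_e / StS


def log_posterior(z):
    """Log posterior of z up to a constant: flat prior times normal likelihood."""
    ssr = sum((y - s * z) ** 2 for s, y in zip(S, g))
    return -ssr / (2.0 * s2_e)


# Evaluate the posterior on a fine grid over [-1, 3] and compute its moments.
grid = [-1.0 + 0.0005 * n for n in range(8001)]
log_p = [log_posterior(z) for z in grid]
peak = max(log_p)
weights = [math.exp(lp - peak) for lp in log_p]   # rescaled by the peak value
total = sum(weights)
post_mean = sum(z * w for z, w in zip(grid, weights)) / total
post_var = sum((z - post_mean) ** 2 * w for z, w in zip(grid, weights)) / total

print(f"OLS estimate {z_hat:.6f}  posterior mean {post_mean:.6f}")
print(f"OLS variance {ols_var:.6f}  posterior variance {post_var:.6f}")
```

The grid computation never uses the closed form in A-4, so agreement between the two pairs of numbers is a check on the result rather than a restatement of it.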
   When $s_e^2$ is unknown, the posterior density of $z$ can also be characterized and
related to OLS estimates. Formally, if $s_e^2$ is unknown and has a noninformative
prior

(A-5)    $m(s_e^2) \propto s_e^{-2}$,

then it can be shown (Box and Tiao 1973, p. 117) that
(A-6)    $m(z \mid D, s_e^2) \sim N(\hat{z}, (S'S)^{-1} s_e^2)$.

For reasonably large samples, $s_e^2$ can be replaced with the OLS estimate $\hat{s}_e^2$ so that,
approximately,
(A-7)    $m(z \mid D) \sim N(\hat{z}, (S'S)^{-1} \hat{s}_e^2)$

and again the posterior mean and variance of $z$ may be equated with the corresponding
OLS estimates. We refer to this as the “leading case” in the text.
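
How close the plug-in approximation in A-7 is can be illustrated with a small numerical sketch. It relies on a standard result that is not spelled out in the text: under the prior in A-5, integrating the error variance out of the joint posterior leaves a marginal posterior for z proportional to the sum of squared residuals raised to the power -I/2. The one-regressor setup, the simulated data, the sample size of 200, and all names below are assumptions made for the example.

```python
import math
import random

rng = random.Random(2)
n_obs = 200
S = [rng.uniform(-2.0, 2.0) for _ in range(n_obs)]
g = [0.8 * s + rng.gauss(0.0, 0.5) for s in S]   # simulated data (assumed values)

# OLS quantities for the one-regressor model.
StS = sum(s * s for s in S)
z_hat = sum(s * y for s, y in zip(S, g)) / StS
ssr_hat = sum((y - s * z_hat) ** 2 for s, y in zip(S, g))
s2_hat = ssr_hat / (n_obs - 1)   # OLS estimate of the error variance
plug_in_var = s2_hat / StS       # posterior variance implied by (A-7)


def log_marginal(z):
    """Log marginal posterior of z, up to a constant, with s_e^2 integrated out."""
    ssr = sum((y - s * z) ** 2 for s, y in zip(S, g))
    return -0.5 * n_obs * math.log(ssr)


# Moments of the marginal posterior, computed on a fine grid over [-1, 3].
grid = [-1.0 + 0.0005 * n for n in range(8001)]
log_p = [log_marginal(z) for z in grid]
peak = max(log_p)
weights = [math.exp(lp - peak) for lp in log_p]
total = sum(weights)
post_mean = sum(z * w for z, w in zip(grid, weights)) / total
exact_var = sum((z - post_mean) ** 2 * w for z, w in zip(grid, weights)) / total

ratio = exact_var / plug_in_var
print(f"posterior mean {post_mean:.6f}  OLS estimate {z_hat:.6f}")
print(f"exact variance / plug-in variance = {ratio:.4f}")
```

In this scalar setting the marginal posterior is a Student t density with I - 1 degrees of freedom, so the variance ratio equals (I - 1)/(I - 3) and approaches one as the sample grows, which is the sense in which A-7 holds approximately.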
  In our evaluation of growth models, we have emphasized the role of F-exchangeable,
as opposed to independent and identically distributed, errors.
De Finetti’s theorem provides a link between exchangeability and independence
and so motivates our use of this leading case.

                                    References

The word “processed” describes informally reproduced works that may not be commonly
available through library systems.
Aldous, D. 1983. “Exchangeability and Related Topics.” In École d’Été de Probabilités
   de Saint Flour XIII. Lecture Notes in Mathematics Series, no. 1117. New York:
   Springer-Verlag.
Anderson, E., L. Hansen, and T. Sargent. 1999. “Risk and Robustness in Equilibrium.”
   Department of Economics, Stanford University. Processed.
Arnold, S. 1979. “Linear Models with Exchangeably Distributed Errors.” Journal of the
   American Statistical Association 74:194–99.
Auerbach, A., K. Hassett, and S. Oliner. 1994. “Reassessing the Social Returns to
   Equipment Investment.” Quarterly Journal of Economics 109:789–802.
Azariadis, C., and A. Drazen. 1990. “Threshold Externalities in Economic Development.”
   Quarterly Journal of Economics 105:501–26.
Barro, R. 1991. “Economic Growth in a Cross-Section of Countries.” Quarterly Journal
   of Economics 106:407–43.
———. 1996. “Democracy and Growth.” Journal of Economic Growth 1:1–27.
Ben-Akiva, M., and S. Lerman. 1985. Discrete Choice Analysis: Theory and Application
   to Travel Demand. Cambridge, Mass.: MIT Press.
Berger, J. 1987. Statistical Decision Theory and Bayesian Analysis. New York:
   Springer-Verlag.
Bernard, A., and S. Durlauf. 1996. “Interpreting Tests of the Convergence Hypothesis.”
   Journal of Econometrics 71:161–72.
Bernardo, J., and A. Smith. 1994. Bayesian Theory. New York: John Wiley and Sons.
Box, G., and G. Tiao. 1973. Bayesian Inference in Statistical Analysis. New York: John
   Wiley and Sons. Reprinted 2000.
Breiman, L. 1996. “Bagging Predictors.” Machine Learning 26:123–40.
Camerer, C. 1995. “Individual Decision Making.” In J. Kagel and A. Roth, eds.,
   Handbook of Experimental Economics. Princeton, N.J.: Princeton University Press.
Canova, F. 1999. “Testing for Convergence Clubs in Income Per-Capita: A Predictive
   Density Approach.” Department of Economics, University of Pompeu Fabra, Spain.
   Processed.
Clark, G. 1987. “Why Isn’t the Whole World Developed? Lessons from the Cotton Mills.”
   Journal of Economic History 47:141–73.
Clyde, M., and H. Lee. 2000. “Bagging and the Bayesian Bootstrap.” Duke University,
   Department of Statistics, Durham, N.C. Processed.
Debreu, G. 1960. “Review of R. D. Luce, Individual Choice Behavior: A Theoretical
   Analysis.” American Economic Review 50:186–88.
DeLong, J. B., and L. Summers. 1991. “Equipment Investment and Economic Growth.”
   Quarterly Journal of Economics 106:445–502.
———. 1994. “Equipment Investment and Economic Growth: Reply.” Quarterly Journal
   of Economics 109:803–7.
Desdoigts, A. 1999. “Patterns of Economic Development and the Formation of Clubs.”
   Journal of Economic Growth 4:305–30.
Doppelhofer, G., R. Miller, and X. Sala-i-Martin. 2000. “Determinants of Long-Term
   Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach.” Working
   Paper no. 7750, National Bureau of Economic Research, Cambridge, Mass.
Draper, D. 1987. “Comment: On the Exchangeability Judgments in Predictive Modeling
   and the Role of Data in Statistical Research.” Statistical Science 2:454–61.
———. 1995. “Assessment and Propagation of Model Uncertainty.” Journal of the Royal
   Statistical Society, Series B 57:45–70.
———. 1997. “On the Relationship between Model Uncertainty and Inferential/Predictive
   Uncertainty.” School of Mathematical Sciences, University of Bath. Processed.
Draper, D., J. Hodges, C. Mallows, and D. Pregibon. 1993. “Exchangeability and Data
   Analysis.” Journal of the Royal Statistical Society, Series A 156:9–28.
Durlauf, S. 2000. “Econometric Analysis and the Study of Economic Growth: A Skeptical
   Perspective.” In R. Backhouse and A. Salanti, eds., Macroeconomics and the Real
   World. Oxford: Oxford University Press.
Durlauf, S., and P. Johnson. 1995. “Multiple Regimes and Cross-Country Growth
   Behavior.” Journal of Applied Econometrics 10:365–84.
Durlauf, S., and D. Quah. 1999. “The New Empirics of Economic Growth.” In J. Taylor
   and M. Woodford, eds., Handbook of Macroeconomics. Amsterdam: North Holland.
Durlauf, S., A. Kourtellos, and A. Minkin. 2000. “The Local Solow Growth Model.”
   Forthcoming in the European Economic Review, Papers and Proceedings.
Easterly, W., and R. Levine. 1997. “Africa’s Growth Tragedy: Policies and Ethnic
   Divisions.” Quarterly Journal of Economics 112:1203–50.
Epstein, L., and T. Wang. 1994. “Intertemporal Asset Pricing Behavior under Knightian
   Uncertainty.” Econometrica 62:283–322.
Evans, P. 1998. “Using Panel Data to Evaluate Growth Theories.” International Economic
   Review 39:295–306.
Fernandez, C., E. Ley, and M. Steel. 1999. “Model Uncertainty in Cross-Country Growth
   Regressions.” Department of Economics, University of Edinburgh (also forthcoming,
   Journal of Applied Econometrics).
———. 2001. “Benchmark Priors for Bayesian Model Averaging.” Journal of Econometrics
   100:381–427.
Frankel, J., and D. Romer. 1996. “Trade and Growth: An Empirical Investigation.”
   Working Paper no. 5476, National Bureau of Economic Research, Cambridge, Mass.
Freedman, D. 1991. “Statistical Models and Shoe Leather.” In P. Marsden, ed.,
   Sociological Methodology 1991. Cambridge: Basil Blackwell.
———. 1997. “From Association to Causation via Regression.” In V. McKim and
   S. Turner, eds., Causality in Crisis. South Bend, Ind.: University of Notre Dame
   Press.
Galor, O. 1996. “Convergence? Inferences from Theoretical Models.” Economic Journal
   106:1056–69.
Gilboa, I., and D. Schmeidler. 1989. “Maximin Expected Utility with Nonunique
   Prior.” Journal of Mathematical Economics 18:141–53.
Goldberger, A. 1991. A Course in Econometrics. Cambridge, Mass.: Harvard University
   Press.
Granger, C., and H. Uhlig. 1990. “Reasonable Extreme-Bounds Analysis.” Journal of
   Econometrics 44:159–70.
Heckman, J. 2000. “Causal Parameters and Policy Analysis in Economics: A Twentieth
   Century Retrospective.” Quarterly Journal of Economics 115:45–97.
Hoeting, J., D. Madigan, A. Raftery, and C. Volinsky. 1999. “Bayesian Model Averaging:
   A Tutorial.” Statistical Science 14:382–401.
Islam, N. 1995. “Growth Empirics: A Panel Data Approach.” Quarterly Journal of
   Economics 110:1127–70.
Ivanoff, B., and N. Weber. 1996. “Some Characterizations of Partial Exchangeability.”
   Journal of the Australian Mathematical Society, Series A 61:345–59.
Kadane, J., and J. Dickey. 1980. “Bayesian Decision Theory and the Simplification of
   Models.” In J. Kmenta and J. Ramsey, eds., Evaluation of Econometric Models. New
   York: Academic Press.
Kallenberg, O. 1982. “Characterizations and Embedding Properties in Exchangeability.”
   Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 60:249–81.
Kass, R., and A. Raftery. 1994. “Bayes Factors.” Journal of the American Statistical
   Association 90:773–95.
Kourtellos, A. 2000. “A Projection Pursuit Approach to Cross-Country Growth Data.”
   Department of Economics, University of Wisconsin. Processed.
Landes, D. 2000. “Culture Makes Almost All the Difference.” In L. Harrison and
   S. Huntington, eds., Culture Matters. New York: Basic Books.
Leamer, E. 1978. Specification Searches. New York: John Wiley.
———. 1983. “Let’s Take the Con Out of Econometrics.” American Economic Review
   73:31–43.
Leamer, E., and H. Leonard. 1983. “Reporting the Fragility of Regression Estimates.”
   Review of Economics and Statistics 65:306–17.
Lee, K., M. H. Pesaran, and R. Smith. 1997. “Growth and Convergence in a Multi-Country
   Stochastic Solow Model.” Journal of Applied Econometrics 12:357–92.
Levine, R., and D. Renelt. 1992. “A Sensitivity Analysis of Cross-Country Growth
   Regressions.” American Economic Review 82:942–63.
Lucas, R. 1988. “On the Mechanics of Economic Development.” Journal of Monetary
   Economics 22:3–42.
Madigan, D., and A. Raftery. 1994. “Model Selection and Accounting for Model
   Uncertainty in Graphical Models Using Occam’s Window.” Journal of the American
   Statistical Association 89:1535–46.
Mallows, C. 1998. “The Zeroth Problem.” American Statistician 52:1–9.
Mankiw, N. G., D. Romer, and D. Weil. 1992. “A Contribution to the Empirics of
   Economic Growth.” Quarterly Journal of Economics 107:407–37.
McAleer, M., A. Pagan, and P. Volker. 1985. “What Will Take the Con Out of
   Econometrics?” American Economic Review 75:293–307.
Pack, H. 1994. “Endogenous Growth Theory: Intellectual Appeal and Empirical
   Shortcomings.” Journal of Economic Perspectives 8:55–72.
Pesaran, M. H., and R. Smith. 1995. “Estimating Long-Run Relationships from Dynamic
   Heterogeneous Panels.” Journal of Econometrics 68:79–113.
Prescott, E. 1998. “Needed: A Theory of Total Factor Productivity.” International
   Economic Review 39:525–52.
Pritchett, L. 2000. “Patterns of Economic Growth: Hills, Plateaus, Mountains, and
   Plains.” World Bank Economic Review 14:221–50.
Quah, D. 1996a. “Convergence Empirics across Economies with Some Capital Mobility.”
   Journal of Economic Growth 1:95–124.
———. 1996b. “Empirics for Growth and Economic Convergence.” European Economic
   Review 40:1353–75.
———. 1997. “Empirics for Growth and Distribution: Polarization, Stratification, and
   Convergence Clubs.” Journal of Economic Growth 2:27–59.
Raftery, A. 1995. “Bayesian Model Selection in Social Research.” In P. Marsden, ed.,
   Sociological Methodology 1995. Cambridge: Blackwell.
Raftery, A., D. Madigan, and J. Hoeting. 1997. “Bayesian Model Averaging for Linear
   Regression Models.” Journal of the American Statistical Association 92:179–91.
Raiffa, H., and R. Schlaifer. 1961. Applied Statistical Decision Theory. New York: John
   Wiley.
Romer, P. 1986. “Increasing Returns and Long Run Growth.” Journal of Political
   Economy 94:1002–37.
———. 1990. “Endogenous Technical Change.” Journal of Political Economy
   98:S71–S102.
Sala-i-Martin, X. 1997. “I Just Ran Two Million Regressions.” American Economic
   Review, Papers and Proceedings 87:178–83.
Schervish, M. 1995. Theory of Statistics. New York: Springer-Verlag.
Schultz, T. P. 1998. “Inequality in the Distribution of Personal Income in the World:
   How It Is Changing and Why.” Journal of Population Economics 11:307–44.
———. 1999. “Health and Schooling Investments in Africa.” Journal of Economic
   Perspectives 13:67–88.
Sims, C. 1980. “Macroeconomics and Reality.” Econometrica 48:1–48.
Temple, J. 1998. “Robustness Tests of the Augmented Solow Growth Model.” Journal
   of Applied Econometrics 13:361–75.
———. 2000. “Growth Regressions and What the Textbooks Don’t Tell You.” Bulletin
   of Economic Research 52:181–205.
Tierney, L., and J. Kadane. 1986. “Accurate Approximations for Posterior Moments and
   Marginal Densities.” Journal of the American Statistical Association 81:82–6.
Wasserman, L. 2000. “Bayesian Model Selection and Model Averaging.” Journal of
   Mathematical Psychology 44:97–102.
White, H. 1994. Estimation, Inference and Specification Analysis. Cambridge: Cambridge
   University Press.
Wolcott, S., and G. Clark. 1999. “Why Nations Fail: Managerial Decisions and
   Performance in Indian Cotton Textiles, 1890–1938.” Journal of Economic History
   59:397–423.