On Measuring Aggregate "Social Efficiency"


                                              Martin Ravallion*

                             World Bank, 1818 H Street NW, Washington DC




        Abstract: Cross-country comparisons of social indicators controlling for income

        and/or social spending have been widely used to measure and explain "social

        efficiency," analogously to "technical efficiency" in production. The paper argues

        that these methods are clouded in ambiguities about what exactly is being

        measured. Standard methods of measuring technical efficiency require

        assumptions that seem unlikely to hold for social indicators. In the context of a

        simple parametric model of life expectancy, conditions are identified under which

        there will be a systematic pattern of bias in estimates of efficient health spending.



Keywords: Social indicators; human development; poverty; social spending; efficiency frontiers.

                                              JEL: D61, I12, O57




World Bank Policy Research Working Paper 3166, November 2003

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange
of ideas about development issues. An objective of the series is to get the findings out quickly, even if the
presentations are less than fully polished. The papers carry the names of the authors and should be cited
accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors.
They do not necessarily represent the view of the World Bank, its Executive Directors, or the countries they
represent. Policy Research Working Papers are available online at http://econ.worldbank.org.




*       Address for correspondence: mravallion@worldbank.org. For comments the author is grateful to
Angus Deaton, Jed Friedman, Aart Kraay, Erwin Tiongson and Adam Wagstaff.

1. Introduction
        Invariably many of the things relevant to assessing a country's performance in promoting

human development and reducing poverty are not directly observed by stakeholders. For

example, governmental efforts in delivering social services to those in most need are not readily

observable by the people who finance the spending and/or vote for those responsible. Efforts to

make governments more accountable, and make development assistance more performance

driven, beg for reliable methods of assessing latent aspects of country performance. This would

allow aid donors and domestic taxpayers to determine how much social outcomes might be

improved by better use of an economy's existing resources.

        This paper provides a critical overview of the most common approach found in the

literature. By this approach, one attempts to infer the "social efficiency" of an economy from the

measured deviations of an observed social indicator -- such as average life expectancy, the

infant mortality rate or the literacy rate -- from an efficiency frontier, typically identified from

the residuals of a regression of that indicator on control variables such as mean income and

public spending on social services. The econometric tools used have largely been borrowed from

the literature on measuring technical efficiency in production.

        What can these methods tell us about latent aspects of country performance in improving

social outcomes? Naturally there are (potentially important) concerns about the quality of the

data. However, the present paper will put such concerns aside; before any method is taken to

bigger and better data sets, or even implemented on existing imperfect data sets, we should look

closely at its theoretical foundations. Nor does the paper present any new empirical findings.

Instead, the sole aim is to critically assess whether existing methods can be expected to reliably

enhance public knowledge about an economy's efficiency in achieving agreed social goals.





        The following section describes examples of these methods found in the literature.

Section 3 then points to a number of concerns about the conceptual foundations and empirical

reliability of these methods. Section 4 elaborates on the specific sources of bias in estimates of

social efficiency in the context of a simple expository model. Section 5 concludes.


                                2. Approaches in the literature

        In comparing social indicators across countries it has become common to control for

income differences. In an early and influential example, Sen (1981) looked at the deviations of

actual log life expectancies from their predicted values obtained by regressing on log income per

capita. These residuals suggested that Sri Lanka was the best performer amongst developing

countries. Sen showed that the predicted national income corresponding to Sri Lanka's (high)

life expectancy was about 20 times higher than the country's actual income. This excellent

conditional performance was attributed to Sri Lanka's high level of social spending over many

decades. By interpretation, Sri Lanka was deemed to be a good performer in human

development because a relatively large share of its economic output was devoted to activities that

are good for health.

        Such conditional comparisons of social indicators and poverty measures across countries

have since become common in both the academic literature and policy discussions. They have

taken the form of either the "horizontal" comparison made by Sen (in which the difference in

performance is measured in the units of the horizontal axis) or the straight "vertical" residuals (in

units of the social indicator). One can now find many examples in studies of developing country

performance in human development. For example, the WHO's World Health Report for 1999

provides health "performance measures" over time by country, based on the residuals from

regressing health aggregates on the log of GDP per capita, its squared value and a trend (WHO,




1999, Annex Table 6). The residuals are taken to reveal public-sector performance, notably

through the expansion and dissemination of knowledge about health care. It is claimed that some

countries have performed considerably better than others, as assessed by this method. Other

international agencies have also used residual comparisons; examples can be found in both the

World Bank's World Development Reports and the UNDP's Human Development Reports (see,

for example, World Bank, 1993, and UNDP, 1996). Other examples of residual comparisons of

country performance include Kakwani (1993), Wang et al., (1999) and Moore et al., (1999).
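        The link between the "vertical" and "horizontal" comparisons can be written out. As a sketch only, assume a fitted log-linear regression of life expectancy L on income per capita Y, with estimated intercept a and slope b > 0 (these symbols are introduced here for illustration and are not taken from the studies cited), and assume the horizontal measure is obtained by inverting that same fitted line:

```latex
\begin{align*}
e_i &= \ln L_i - (\hat a + \hat b \ln Y_i) && \text{(vertical residual)} \\
\ln Y_i^{*} &= (\ln L_i - \hat a)/\hat b && \text{(income at which $L_i$ would be the predicted value)} \\
Y_i^{*}/Y_i &= \exp(e_i/\hat b) && \text{(horizontal measure)}
\end{align*}
```

Under these assumptions the two measures rank countries identically, and a small slope magnifies the horizontal gap: a ratio of 20 corresponds to a vertical residual of only about 3.0 times the estimated slope, since ln 20 is roughly 3.0.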

        Taking the idea a step further, a number of papers and reports have tried to assess country

performance relative to an "efficiency frontier" based on the best performing countries in terms

of social indicators or poverty measures conditional on the measured covariates. The latter vary

between applications, but typically include one or both of mean income and social spending per

capita and possibly other controls. There are numerous examples in the literature, and the

following discussion only focuses on four illustrative cases:1

    · In its World Health Report 2000, the WHO gives country rankings of the efficiency of

        national health care spending in raising DALE ("disability-adjusted life expectancy").2

        These rankings were based on efficiency frontiers calibrated to regressions of life

        expectancy on health expenditure (viewed as the "input") and schooling attainments (to

        proxy for "non-health system determinants of health") (Evans et al., 2000).

    · In research done in the IMF, Gupta et al., (1997) and Gupta and Verhoeven (2001)

        assessed the efficiency of government spending on health and education across countries

        and over time against efficiency frontiers calibrated to data on social indicators for health


1       Also see Fakin and de Crombrugghe (1997), Wang et al., (1999), Moore et al., (1999), Clements
(2002), Afonso et al., (2003), Afonso and St. Aubyn (2003) and Hollingsworth and Wildman (2003).
2       For a discussion of DALEs see Anand and Hanson (1997).




       and education. They found that countries in Africa are less efficient on average than

       elsewhere, though Africa's efficiency appears to have improved over time.

    · A similar idea has also been used to assess the efficiency of public spending on social

       services in reducing poverty. Gouyette and Pestieau (1999) regressed measures of

       poverty and inequality on levels of social spending across OECD countries and used the

       residuals to construct an efficiency frontier, identifying Belgium as the benchmark

       country, with lowest poverty given its social spending.3

    · In research at the World Bank, Jayasuriya and Wodon (2003) derived measures of the

       efficiency of countries (and provincial governments within countries) in attaining the

       Millennium Development Goals. Their frontier was based on regressions of social

       indicators on mean income and social spending. On this basis, they argued that

       substantial progress is possible through more efficient use of existing resources.

       The methods of fitting an efficiency frontier found in this literature have all been

borrowed from the literature on the measurement of technical inefficiency in production. In

estimating production functions, one can allow the possibility that there is technical inefficiency

such that actual output is less than the maximum output obtainable at given inputs. Various

methods have been used to determine the efficiency frontier. Sometimes it is fitted non-

parametrically to the observations with the best measured performance at each input level. Thus

the frontier is the upper boundary of the smallest set encompassing the data points on outputs and

inputs. An example of this approach is the "free disposal hull" (FDH) method used (in the

context of measuring social efficiency) by Fakin and de Crombrugghe (1997) and Gupta and

Verhoeven (2001). In other applications, a parametric model of the social indicator as a function


3      For further discussion of the results of Gouyette and Pestieau (1999) see Ravallion (2001).




of postulated covariates has been used to identify the frontier for social indicators. The examples

in Gouyette and Pestieau (1999), WHO (2000) and Evans et al., (2000) use variations on one of

the oldest methods used in production analysis to estimate the extent of technical inefficiency,

sometimes called the Corrected Ordinary Least Squares (COLS) (Kumbhakar and Lovell, 2000).

The production function is first estimated by regressing output on a vector of inputs, and then the

intercept is shifted upwards such that the production frontier bounds the data from above, i.e., by

finding the largest (positive) residual.4 With panel data one can implement a variation on this

method whereby the frontier is anchored to the country with the maximum intercept in a

country fixed effects model estimated on panel data (following the method proposed by Schmidt

and Sickles, 1984, for measuring technical efficiency in production using panel data). This is the

method used by Evans et al., (2000) to quantify their frontier in assessing the comparative

efficiency of national health systems.
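        The COLS step described above can be sketched in a few lines. This is a minimal illustration with a single regressor and simulated data; the variable names, parameter values and the one-sided exponential gap are assumptions made for the example, not features of any study cited here.

```python
import random

def ols_line(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def cols_frontier(x, y):
    """Corrected OLS: fit by OLS, then raise the intercept by the largest residual."""
    a, b = ols_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    shift = max(resid)
    # Distance below the shifted line; zero for the benchmark observation.
    shortfall = [shift - r for r in resid]
    return a + shift, b, shortfall

# Hypothetical data: log outcome rises with log spending, minus a one-sided gap.
rng = random.Random(0)
log_spend = [rng.uniform(3.0, 8.0) for _ in range(50)]
log_outcome = [3.5 + 0.1 * s - rng.expovariate(20.0) for s in log_spend]
frontier_intercept, slope, shortfall = cols_frontier(log_spend, log_outcome)
```

Note that the slope is left at its OLS value and the whole frontier is pinned to one observation, so a single outlier moves every country's measured shortfall.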

        An alternative to COLS-type methods is the stochastic frontier (SF) production function

whereby the error term in the regression for output includes a one-sided component (representing

inefficiency) as well as a regular zero mean error (Aigner et al., 1977; Meeusen and van den

Broeck, 1977). The SF method has the advantage over COLS and the main competing non-

parametric methods that it allows for random deviations from the frontier (in both directions),

such as due to measurement errors or shocks. Thus, not all of the data need be in the production

set. In estimating the model's parameters it is assumed that both error terms are independent and

identically distributed, and it is usually assumed that the zero-mean component is normally

distributed while the inefficiency component is half-normal. This is the method for measuring

efficiency in reaching human development goals used by Jayasuriya and Wodon (2003).
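        For reference, the normal/half-normal specification just described can be written as follows. The notation is the conventional one from the production-frontier literature and is assumed here rather than taken from the applications cited:

```latex
\begin{align*}
y_i &= x_i'\beta + v_i - u_i, \qquad v_i \sim N(0,\sigma_v^2), \quad u_i \sim \lvert N(0,\sigma_u^2)\rvert \\
f(\varepsilon_i) &= \frac{2}{\sigma}\,\phi\!\left(\frac{\varepsilon_i}{\sigma}\right)
                    \Phi\!\left(-\frac{\lambda\varepsilon_i}{\sigma}\right),
  \qquad \varepsilon_i = v_i - u_i,\ \ \sigma^2 = \sigma_u^2 + \sigma_v^2,\ \ \lambda = \sigma_u/\sigma_v \\
E[u_i] &= \sigma_u\sqrt{2/\pi}
\end{align*}
```

Here v is the two-sided noise, u is the one-sided inefficiency term, and f is the density of the composed error used in maximum likelihood estimation. In this formulation inefficiency is separated from noise only through the asymmetry of the composed error, which is why the distributional assumptions carry so much weight.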


4       This method appears to have been first proposed by Winsten (1957) in his comments on Farrell
(1957). Subsequent variations on this method are discussed in Kumbhakar and Lovell (2000).



       The literature has not stopped at measuring social efficiency, but has tried to explain the

revealed differences across countries or provinces. Differences in social policies are one

possible explanation. This was how Sen (1981) explained why Sri Lanka appeared to be an

outstanding performer in human development given its income, although this explanation was

the subject of subsequent debate (Bhalla and Glewwe, 1986; Sen, 1988; Anand and Ravallion,

1993; Aturupane et al., 1994). Of course, differences in social policies are not the only

explanation that can be offered. Another source of heterogeneity in social outcomes at given

mean income is cross-country differences in the distribution of income. It has been argued that

aggregate health indicators such as life expectancy and the infant mortality rate depend far more

on incomes of the poor than the nonpoor (Bidani and Ravallion, 1997). Then the incidence of

income poverty also matters to human development, independently of social policies.

       There have also been attempts to explain the measured differences in "social efficiency"

using a second-stage regression as a means of sorting out which of the possible sources of these

differences matters most. This entails regressing the efficiency measure derived from the first

stage on other variables. For example, Moore et al., (2000) measure "efficiency in converting

national material resources into human development" by the residuals from a regression of a

human development index on income. They then regress this efficiency measure on a set of

explanatory variables, including a measure of the quality of government institutions. Similarly,

Jayasuriya and Wodon (2003) retrieve a measure of inefficiency in attaining human development

goals (using the SF specification for production functions) and then regress this measure on

indicators of the quality of government and urbanization. They find evidence of an inverted-U

relationship with urbanization (and governance, though less markedly), whereby social





efficiency first increases as developing countries urbanize but then starts to decline at sufficiently

high levels of urbanization.


                  3. Pitfalls in measuring and explaining social efficiency

        It would be an impressive achievement -- indeed a remarkable one -- to extract credible

measures of latent inefficiencies in attaining social objectives from the type of aggregate

country-level data used in this literature. But has that really been achieved?

        A problem in assessing these methods is that the theoretical foundation for the empirical

models of social indicators has never been clear. We are told that they should be interpreted as

empirical "production functions." A standard assumption in the economic theory of production

is free disposability, meaning that if the point (x, y), for an output y and inputs x, is in the

producer's production set then so too is any point (x', y') such that x' ≥ x and y' ≤ y. As noted

in the last section, the assumption of free disposability has been invoked explicitly in some

studies of social efficiency and is implicit in other studies. This may be a defensible assumption

for a production process (though it can certainly be questioned in that context). But how can we

interpret the application of this assumption to (say) life expectancy as the "output" and public

spending on health as the "input"? There are (thankfully) very few governments in the world

that can freely dispose of their citizens such that if the country initially has a life expectancy of

(say) 60 years, and health spending of (say) $100 per person per year, it is equally feasible for it

to have a life expectancy of 40 at the same or greater spending.

        The applicability of production theory to measuring social efficiency is questionable.

Social indicators do not stem from anything one could reasonably think of as a production

function representing a well-defined technology operated by an individual producer with well-

defined physical inputs. While there are production functions under the surface somewhere,



there is clearly a lot more going on in determining the aggregate relationship between measured

social outcomes and social spending and/or national income.

        Without specifying a complete model it is hard to assess the specification choices made

in this literature. But there is already enough to make one skeptical. The accounting of

"outputs" is worryingly incomplete, such as by focusing on health only, or just one aspect of

health. This raises the concern that public spending that is deemed to be inefficient with respect

to the partial social indicator may be of value with respect to some omitted indicator. And even

if one accepts that life expectancy (say) is an adequate proxy for the "outputs" of the "health

system," the accounting of "inputs" is rarely convincing. For example, the practice in the

literature of using public spending on health (say) as a measure of the inputs to health production

is hard to defend on theoretical grounds; if anything, one would be more tempted to interpret

these regressions as some kind of inverse cost function, in which the cost of an entire bundle of

inputs depends on the output level. However, the input prices would then have to be included for

a correctly specified empirical model. The omission of these (country-specific) prices means that

what is being called "inefficiency" may be nothing more than how public health authorities

respond to the input prices they face, including wages.

        Similarly, why should one not also control for other types of public spending,

recognizing that there is rarely a clear one-to-one mapping between types of spending and

measured social indicators? Public spending on health care is not just about raising life

expectancy (say), but is also about improving the quality of people's lives. And public spending

on (say) education or infrastructure may well matter greatly to health outcomes.5 Similarly, why



5       For example, Jalan and Ravallion (2003) find that access to water infrastructure improves child
health outcomes in rural India, and that the extent of this effect depends on maternal education and
household incomes.




not also control for factors such as the distribution of income, or the administrative capabilities of the

government? What is being identified as "social inefficiency" in this literature could well stem

entirely from omitted interdependencies between types of spending and other country

circumstances, combined with a partial accounting of social outcomes.

        Nor is it clear that the residuals used to assess efficiency in many of these methods are

based on reliable measures of the expected values of the social outcomes conditional on

spending, income, or whatever else one chooses to control for. In the context of parametric

methods, misspecification of the functional form is known to be a concern in measuring

technical efficiency in production. For example, Giannakas et al., (2003) provide Monte Carlo

simulations indicating the potential for sizeable bias in the mean efficiency measures from SF

methods of estimating production functions due to misspecification of the functional form of the

production frontier. They demonstrate that the method can suggest quite high levels of

inefficiency (10-30% of output) for fully efficient producers.

        Non-parametric methods of setting the frontier can avoid such problems, though they

introduce new concerns. One must assume a continuous frontier, which must be interpolated

from the discrete data points. Sensitivity to outliers, or to "holes" in the support provided by

data, can be expected. Recent advances in nonparametric frontier estimation offer the promise of

results that are more robust to outliers and noise, by nonparametrically smoothing the frontier,

allowing some data points to be outside the production set (Cazals et al., 2002).

        Naturally, the more data one has, the more believable these nonparametric methods

become. Micro applications in production analysis often use samples of many thousands of

producers. However, in the applications to measuring social efficiency at country-level, one is

fitting a continuous frontier to at most 200 data points; indeed the application by Gupta and





Verhoeven (2001) of the FDH method to measuring social efficiency uses data for 37 countries.

Simulations by Park et al., (2000) suggest considerable imprecision in FDH estimates of

efficiency with sample sizes of 100 or less, even when one is allowing for just a few inputs; the

imprecision naturally rises with the number of inputs and falls with the number of data points.

Asymptotic results for drawing statistical inferences about estimates of efficiency using the FDH

method are now available in the literature (Park et al., 2000), though there do not appear to have

been any applications to the measurement of social efficiency.
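        To make the FDH idea concrete, the following is a minimal sketch of an output-oriented FDH score with one input and one output. The function name and the five data points are hypothetical; with several inputs, the comparison set for a unit would be those units using no more of every input.

```python
def fdh_output_scores(inputs, outputs):
    """Output-oriented FDH score: own output relative to the best output
    achieved by any unit using no more of the input. Outputs must be positive."""
    scores = []
    for xi, yi in zip(inputs, outputs):
        # Each unit is in its own comparison set, so the set is never empty.
        best = max(yj for xj, yj in zip(inputs, outputs) if xj <= xi)
        scores.append(yi / best)
    return scores

# Hypothetical spending per person and life expectancy for five countries.
spending = [50.0, 100.0, 100.0, 200.0, 400.0]
life_exp = [55.0, 60.0, 66.0, 64.0, 70.0]
scores = fdh_output_scores(spending, life_exp)
```

In this toy data set three of the five units lie on the frontier, including the lowest spender, which is efficient by default because nothing is available to compare it with. That is one way the sensitivity to small samples and to holes in the data shows up.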

        There are also concerns about the functional forms used in these methods. Given the

bounded nature of most social indicators, the linear and even linear-in-logs specifications

favored in most empirical work using parametric social indicator regressions cannot possibly be

right, at least globally.6 The regression residuals will then be some higher-order (nonlinear)

function of income and so lose their interpretation as the deviations of actual social outcomes

from their expected values conditional on income.

        Further concerns arise about whether conditional cross-country comparisons should be

based on the levels of the social indicators and the control variables (as in most of the literature)

or their changes over time (as advocated by Bhalla and Glewwe, 1986, and Aturupane et al.,

1994). Taking differences over time (or deviations from time means) has the usual advantage

that country fixed effects correlated with the regressors (that would otherwise bias the results)

can be swept away. However, it also raises well-known concerns that the measured changes may

not properly capture the effects of interest. One way this can happen is that the changes over

time are measured with far greater error than the levels, so that the signal-to-noise ratio

deteriorates substantially, with a corresponding increase in bias due to measurement error in the



6       Better transformations have been proposed by Anand and Ravallion (1993) and Kakwani (1993).




regressors and higher standard errors. Another way that the "change-on-change" regression of

social indicators on income (and possibly other controls) may miss the effects of interest is that

the time period is too short (even if there is no measurement error within the chosen period). For

example, in response to the claim by Bhalla and Glewwe (1986) that Sri Lanka's improvement in

social indicators was not unusually good relative to its gain in income, Sen (1988) argued that

effects of interest largely predated the time period used by Bhalla and Glewwe.7
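        The deterioration in the signal-to-noise ratio under differencing can be illustrated by simulation. The sketch below assumes classical measurement error in the regressor, a true slope of one, and a regressor that changes little between two periods; all numbers are hypothetical. Under these assumptions the probability limit of the slope is 1/1.09 (about 0.92) in levels but 0.04/0.22 (about 0.18) in changes.

```python
import random

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(1)
n, beta = 5000, 1.0
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]        # true regressor, period 1
x2 = [a + rng.gauss(0.0, 0.2) for a in x1]          # small true change by period 2
y1 = [beta * a + rng.gauss(0.0, 0.1) for a in x1]   # outcomes in each period
y2 = [beta * a + rng.gauss(0.0, 0.1) for a in x2]
m1 = [a + rng.gauss(0.0, 0.3) for a in x1]          # regressor observed with error
m2 = [a + rng.gauss(0.0, 0.3) for a in x2]

slope_levels = ols_slope(m1, y1)
slope_changes = ols_slope([b - a for a, b in zip(m1, m2)],
                          [b - a for a, b in zip(y1, y2)])
print(round(slope_levels, 2), round(slope_changes, 2))
```

The example leaves out country fixed effects, so it isolates the cost of differencing without showing its benefit.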

        Even putting all these problems to one side, there remain important concerns about the

validity of the assumptions made about the error term in current practices for measuring social

efficiency by parametric methods using social indicator regressions. One concern is about the

distributional assumptions made in the SF method. Any non-normality (especially skewness) in

the zero-mean random error component will be incorrectly attributed to inefficiency (as pointed

out by Skinner, 1994). And even if normality is deemed to be an acceptable assumption for the

zero-mean error component, this is far from clear for the inefficiency component, and it is known

that the results obtained can be quite sensitive to relaxing this distributional assumption.8

        Some of these concerns are known from the literature on measuring technical efficiency

in production. However, there is a further problem that is intrinsic to assessments of social

efficiency by parametric methods but appears to have been entirely ignored in that literature.9

This relates to the validity of a key assumption in all versions of these methods found in practice,

namely the assumption that the error component deemed to reflect "inefficiency" is uncorrelated



7       Time series data for Sri Lanka are consistent with Sen's conclusion that a significant and
quantitatively important role was played by Sri Lanka's social spending in reducing infant mortality at
given average income (Anand and Ravallion, 1993).
8       See, for example, Baccouche and Kouki (2003). Also see the discussion in Greene (1999).
9       For example, although the WHO's health efficiency estimates using the frontier method have
attracted a good deal of criticism, the WHO's (2001, Chapter 11) survey of those criticisms does not
mention the following concern.




with the observed control variables. This assumption will be more defensible in some

applications than others. When estimating a production function one might be willing to treat the

extent of technical inefficiency as being uncorrelated with factor inputs. This is justified under

the assumption that the inefficiency is unknown to the producer, and so could not affect input

choices. This is a special case of the longstanding argument for using Ordinary Least Squares

(OLS) to estimate a production function under the assumption that the production error term is

unknown to producers ex ante, at the time input choices were made.10 That assumption is

questionable and there is a literature that has attempted to relax it, such as by using longitudinal

(panel) data following Mundlak (1963).11 However, one can at least point to a story as to why

production inputs might be safely treated as exogenous to technical inefficiency, and this has

been the maintained assumption in the literature on frontier production functions.12

        When modeling social outcomes there must, however, be a reasonable presumption that

the error component intended to capture "inefficiency" is correlated with the regressors, both in

cross-sectional data and over time. The inefficiency in attaining desired social outcomes

presumably stems from social or economic activities that are unproductive from the point of

view of those outcomes. Examples include public spending policies that do little or nothing to

improve social outcomes and income gains to the rich, which arguably do little to improve

attainments in basic health and education or to reduce poverty. Only under rather special

conditions will these inefficient components be uncorrelated with total income or public

spending. As an economy grows one expects at least some of the income gains to be inefficient



10      For an interesting recent discussion of how stochastic terms arise in estimating agricultural
production functions see Pope and Just (2003).
11      See Olley and Pakes (1996) for a more general treatment, allowing for endogenous exit of
producers as well as endogenous input choice.
12      The assumption is partially relaxed in the method of estimating production frontiers using panel
data proposed by Schmidt and Sickles (1984).



from the point of view of certain social goals. Similarly, at least some of an increment to total

social spending can be expected to be inefficient. Thus a positive correlation between the level of

inefficient income or spending and the totals can be postulated. Standard econometric methods

for estimating parametric production frontiers are not then valid for the problem of assessing

social efficiency and it is not clear what meaning can be given to the resulting measures.
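        A small simulation shows how this correlation arises and what it does to the estimated frontier. The set-up is a deliberately simple assumption made for illustration and is not the model developed later in the paper: each country wastes a random share of its spending, the share is independent of total spending, and only the remainder raises the outcome with a true coefficient of one.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

rng = random.Random(3)
n, beta = 20000, 1.0
spending = [rng.uniform(1.0, 10.0) for _ in range(n)]
waste_share = [rng.uniform(0.0, 0.6) for _ in range(n)]   # independent of spending
wasted = [w * s for w, s in zip(waste_share, spending)]   # inefficient spending
outcome = [beta * (s - w) + rng.gauss(0.0, 0.5)
           for s, w in zip(spending, wasted)]

# Regressing the outcome on total spending treats the wasted part as an error term.
ols_slope = cov(spending, outcome) / cov(spending, spending)
corr_waste_total = cov(spending, wasted) / math.sqrt(
    cov(spending, spending) * cov(wasted, wasted))
print(round(ols_slope, 2), round(corr_waste_total, 2))
```

Even though the waste share is unrelated to total spending, the amount wasted rises with the total, so the error term is correlated with the regressor. The estimated slope then converges to the true coefficient times one minus the mean waste share (0.7 here), not to the true coefficient of one.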

         There are a number of further specification issues in the subset of the literature that has

also tried to explain measured differences in social efficiency. Without greater conceptual clarity

about what "social efficiency" means in this context, it is difficult to assess the empirical

specifications used in this strand of the literature. When estimating a production function it is

reasonably clear what variables qualify for the first stage regression -- they should be the factor

inputs to production.13 However, when the same tools are applied to human development it is

unclear which variables should be in the first stage (used to measure inefficiencies) and which

should be in the second (used to explain inefficiencies). Why, for example, does urbanization

only matter to the extent of "social inefficiency" (as in Jayasuriya and Wodon, 2003)?

Urbanization influences the costs of public service provision, which would presumably matter to

the efficient level of outcomes too. And why is it deemed "inefficient" for an economy with

weak administrative capabilities to devote fewer resources to activities that are intensive in those

capabilities? In some papers social spending appears in the second stage (as in Sen's, 1981,

explanation of differences in performance at given incomes, though Sen did not have a second

stage regression as such) and sometimes in the first stage (as in Jayasuriya and Wodon, 2003).




13       Although the ambiguity about specification choices also arises in the production function
literature, since the variables used in the second stage to explain measured technical efficiency could
equally well also qualify as shift parameters in the production frontier. For further discussion see
Kumbhakar and Lovell (2000, Chapter 7).



        Misspecifications in the first stage will clearly also contaminate the second stage. For

example, in their first stage regression, Jayasuriya and Wodon (2003) assume a linear

relationship between their social indicators and income. In the second stage, they then find an

inverted-U relationship between their efficiency measure and urbanization. However, this could

simply reflect a first-stage misspecification, given that the relationship between social indicators

and income cannot possibly be linear but is very likely to be concave (given the bounded nature

of the social indicators). Even if there is no real effect (linear or otherwise) of urbanization on

social efficiency, the efficiency measure will be found to have an inverted-U relationship with

urbanization, but only because urbanization acts as a proxy for mean income.
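        The mechanism can be illustrated with a small simulation. This is a minimal sketch under assumed, hypothetical data: the functional form, sample size, noise levels and variable names below are illustrative choices and are not taken from Jayasuriya and Wodon (2003). The indicator is bounded and concave in income, urbanization has no effect on it at all, and the first stage is (wrongly) linear in income.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data-generating process (an assumption for illustration only):
# the indicator is bounded and concave in income; urbanization has NO causal
# effect on it and is merely correlated with income.
income = rng.uniform(0.0, 3.0, n)
indicator = 1.0 - np.exp(-income) + rng.normal(0.0, 0.02, n)
urban = np.clip(income / 3.0 + rng.normal(0.0, 0.05, n), 0.0, 1.0)

# First stage, misspecified as linear in income.
slope, intercept = np.polyfit(income, indicator, 1)
residual = indicator - (intercept + slope * income)

# COLS-style "efficiency": distance below the largest residual (0 = frontier).
efficiency = residual - residual.max()

# Second stage: quadratic in urbanization.
quad_coef, lin_coef, const = np.polyfit(urban, efficiency, 2)
print(f"coefficient on urbanization squared: {quad_coef:.3f}")
```

A negative coefficient on the squared term here reflects only the curvature left in the first-stage residuals, since urbanization was given no role in generating the indicator.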

        To give another example, it is plausible that differences in the distribution of income will

matter to social outcomes at given mean income. Yet, I do not know of any study of social

efficiency that has controlled for income distribution. The second-stage covariates of measured

social efficiency could then be solely picking up covariates of inequality.

        The general point here is that it is unclear what the second stage regression could ever

meaningfully tell us about the determinants of "social efficiency" when the measure of the latter

is biased and inconsistent because the relevant error component is not in fact orthogonal to the

regressors at the first stage. All one might be picking up in the second stage are correlations with

the biases passed on from the first stage.

        When monitoring country performance over time, a further concern arises from the fact

that there are two distinct sources of the measured changes in social efficiency. Firstly, there

may be an unconditional change in the social indicator and secondly there may be a change in

the conditioning variable(s). Either could account for the measured improvement or worsening

over time in social efficiency. For example, a country that appears to be performing poorly now





in terms of its average health attainments given its income could become a star performer in the

future simply by (suffering) negative growth, without any gain in actual health attainments. This

is a nagging concern about the assessments of progress in human development using these

methods.
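        A numerical illustration may help. The fitted line and the country figures below are purely hypothetical; the point is only that conditional performance is measured as the gap between the observed indicator and the value predicted at the country's income.

```python
# Hypothetical fitted cross-country line: life expectancy = 50 + 5 * log-income.
ALPHA_HAT, BETA_HAT = 50.0, 5.0

def conditional_performance(life_expectancy, log_income):
    """Observed life expectancy minus the value predicted at this income."""
    return life_expectancy - (ALPHA_HAT + BETA_HAT * log_income)

# Same health attainment in both periods; only income changes (it falls).
before = conditional_performance(68.0, 4.0)
after = conditional_performance(68.0, 3.5)
print(before, after)
```

With life expectancy unchanged at 68 years, the measured performance moves from below the line to above it solely because income fell.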


                      4. Sources of bias in a simple expository model

        The literature has tended to justify these methods on casual, intuitive grounds, without

rigorously defining the theoretical concept one is trying to measure or the conditions under

which the methods used will give reliable results. Established methods from the analysis of

production functions have been applied to the problem of explaining human development or

poverty without due consideration as to whether the methods are appropriate.

        This section will try to throw further light on the specific sources of bias in assessments

of social efficiency using these methods. Attention is confined here to parametric frontier

methods, notably the COLS method, though I will note some implications for the SF method

when applied to measuring social efficiency. The analysis will focus on the case in which one is

assessing the efficiency of health spending in raising life expectancy, though equally well one

might be using this method to assess the efficiency of other types of public spending or, indeed,

the efficiency of the economy as a whole (in which case the control variable is national income)

with respect to one or more social indicators or aggregate welfare measures. By using only one

"input" it will be possible to demonstrate the key points with nothing more than some simple

algebra. However, the main messages also apply to versions of this method that add more

control variables.

        One must first be more precise about what we mean by "social inefficiency." The

definition that appears to be closest to that underlying much of the applied work reviewed in the




previous sections is that social inefficiency refers to that share of spending that is devoted to

things that do nothing for a specific social outcome. So total health spending H has two

components, one which raises life expectancy and one which does not:

                $$H_i = H_i^E + H_i^I \qquad (i = 1, \ldots, n) \tag{1}$$

where $H_i^E \ge 0$ is the efficient component, that is, health promoting, while $H_i^I \ge 0$ is the
inefficient component. By definition, life expectancy (denoted $L_i$, which may be some
appropriate nonlinear function of actual life expectancy) depends on the efficient component:

                $$L_i = \alpha + \beta H_i^E + \varepsilon_i \tag{2}$$

where $\alpha$ and $\beta$ are parameters and $\varepsilon_i$ is a zero-mean i.i.d. error term. However, equation (2) is
not estimable since the efficient component is unobserved. The model linking the observed total
spending on health to life expectancy is:

                $$L_i = \alpha + \beta H_i + \mu_i \tag{3}$$

where $\mu_i = -\beta H_i^I + \varepsilon_i$. It is readily verified that the OLS regression coefficient of $L$ on $H$ is
$\hat{\beta} = \beta\hat{\gamma}$, where $\hat{\gamma} = \mathrm{cov}(H, H^E)/\mathrm{var}(H)$ is the OLS regression coefficient of $H^E$ on $H$. Since
the estimates of social efficiency using the COLS method are based on the OLS regression in (3),
and this is biased for all $\hat{\gamma} \neq 1$, all estimates of social efficiency at the country level will also be
biased.

        What can we say about the direction of bias? Two seemingly natural assumptions are:

        Assumption 1: Higher levels of efficient health spending raise life expectancy ($\beta > 0$).

        Assumption 2: Both the efficient and inefficient components of spending tend to rise

        with total spending ($0 < \hat{\gamma} < 1$).





Under these assumptions, the OLS estimate of the regression coefficient of life expectancy on

health spending will be positive but will underestimate the true impact of efficient health

spending on life expectancy.
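        The attenuation can be checked numerically. The following is a minimal sketch under assumed values: the parameters, the spending distribution and the rule splitting spending into efficient and inefficient parts are all hypothetical. It compares the OLS slope of life expectancy on total spending with the product of the true slope and the OLS coefficient of efficient spending on total spending.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha, beta = 40.0, 2.0          # hypothetical true parameters of equation (2)

# Assumption 2 holds by construction: a random share of each country's total
# spending is efficient, so both components rise with total spending.
total = rng.uniform(1.0, 10.0, n)
share_efficient = rng.uniform(0.3, 0.9, n)
h_eff = share_efficient * total
h_ineff = total - h_eff
life = alpha + beta * h_eff + rng.normal(0.0, 1.0, n)

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

beta_hat = ols_slope(total, life)     # slope of regression (3)
gamma_hat = ols_slope(total, h_eff)   # coefficient of efficient on total spending

print(f"true beta = {beta}, OLS beta_hat = {beta_hat:.3f}")
print(f"beta * gamma_hat = {beta * gamma_hat:.3f}")
```

In this design the efficient share averages 0.6 and is independent of total spending, so the estimated slope should settle near 60 percent of the true slope, up to sampling error.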

        The same source of bias is found in the SF method, in which one allows explicitly for a

one-sided error component representing "inefficiency." To see why, rewrite equation (3) as:

                $$L_i = [\alpha - \beta E(H_i^I)] + \beta H_i + [\mu_i + \beta E(H_i^I)] \tag{4}$$

Given that $\varepsilon_i$ is a zero-mean error term, the transformed error term in (4) now has zero mean.
So one is tempted to estimate $\beta$ by OLS. One can then retrieve estimates of the other
parameters ($\alpha$ and the variances of $H_i^I$ and $\varepsilon_i$) by invoking the distributional assumptions in
the SF model (i.e., that $\varepsilon_i$ is normally distributed and $H_i^I$ is positive half-normal).14 Thus we
appear to have everything needed to measure social efficiency at country level.

        However, all this breaks down as soon as one recognizes that applying OLS to equation
(4) does not give a consistent estimate of $\beta$ for the reason discussed above, namely that
$\mathrm{cov}(H_i, \mu_i) \neq 0$ in general. Again, the bias in estimating the slope of the frontier is passed on to
the estimates of social efficiency.

        What are the implications for estimates of the social efficiency of spending? Recall that

by the COLS method one estimates the level of socially efficient spending in each country by

shifting the intercept of the social indicator regression until it passes through the data point for

the country with the largest (negative) residual. The equation of this frontier is thus

$L_i = \hat{\alpha} + \hat{\beta}\hat{H}_i^E + \hat{\mu}^*$, where $\hat{\alpha}$ and $\hat{\beta}$ are the OLS estimates of the parameters of (3) and
$\hat{\mu}^* = \max(\hat{\mu}_1, \ldots, \hat{\mu}_n)$, where $\hat{\mu}_i = L_i - \hat{\alpha} - \hat{\beta}H_i$. An estimate of the health-promoting spending for
country $i$ ($\hat{H}_i^E$) is then obtained by inverting the equation for the frontier, giving:

                $$\hat{H}_i^E = \frac{L_i - \hat{\alpha} - \hat{\mu}^*}{\hat{\beta}} = \hat{H}^* + \frac{L_i - \hat{L}^*}{\hat{\beta}} \tag{5}$$

(noting that $\hat{L}^* = \hat{\alpha} + \hat{\beta}\hat{H}^* + \hat{\mu}^*$, where $(\hat{H}^*, \hat{L}^*)$ is the data point for the country with the largest
OLS residual). Notice, however, that this is nothing more than a fixed linear
transformation of the observed life expectancy for country $i$. If the aim is simply to rank
countries by their socially efficient spending then this method is not telling us anything more
about any specific country than we already know from the observed (unconditional) life
expectancy.

14      For the present purpose, this two-step method-of-moments procedure is equivalent to using
maximum likelihood in one step; for further discussion of the two-step method see Kumbhakar and
Lovell (2000, Chapter 3).
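        The ranking point is easy to confirm numerically. The sketch below uses made-up data (the sample size, coefficients and noise level are hypothetical) and applies the COLS inversion described above: fit OLS, shift the intercept by the largest residual, and invert the frontier for each country.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
spending = rng.uniform(1.0, 10.0, n)                     # hypothetical H_i
life = 45.0 + 2.5 * spending + rng.normal(0.0, 4.0, n)   # hypothetical L_i

# OLS of life expectancy on total spending, then the COLS intercept shift.
beta_hat, alpha_hat = np.polyfit(spending, life, 1)
resid = life - alpha_hat - beta_hat * spending
mu_star_hat = resid.max()
benchmark = int(resid.argmax())

# Invert the frontier to get estimated efficient spending for each country.
h_eff_hat = (life - alpha_hat - mu_star_hat) / beta_hat

# Country orderings by estimated efficient spending and by raw life expectancy.
order_by_estimate = np.argsort(h_eff_hat)
order_by_life = np.argsort(life)
print(np.array_equal(order_by_estimate, order_by_life))
```

Because the estimate is an increasing linear function of life expectancy whenever the estimated slope is positive, the two orderings coincide; the benchmark country is also assigned efficient spending equal to its total spending.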

       How does equation (5) compare to the true social efficiency of spending in country i?

Inverting equation (2) we find that:


                $$H_i^E = H^* + \frac{L_i - L^* + \varepsilon^* - \varepsilon_i}{\beta} \tag{6}$$

where $(H^*, L^*)$ is the data point for the true benchmark country, for which the error term is
$\mu^* = \varepsilon^* = L^* - \alpha - \beta H^* = \max(\mu_1, \ldots, \mu_n)$, where $\mu_i = L_i - \alpha - \beta H_i$. Subtracting equation (6)
from (5) and rearranging terms, it is readily verified that:

                $$\hat{H}_i^E - H_i^E = \frac{1}{\beta}\left[(\mu^* - \hat{\mu}_*^*) + (L_i - \hat{L}^*)\left(\frac{1}{\hat{\gamma}} - 1\right) + (\varepsilon_i - \varepsilon^*)\right] \tag{7}$$

where $\hat{\mu}_*^* \equiv \hat{L}^* - \alpha - \beta\hat{H}^*$. The first term in square brackets on the RHS of equation (7),
$\mu^* - \hat{\mu}_*^* = L^* - \hat{L}^* - \beta(H^* - \hat{H}^*)$, measures the extent to which the true residual of the
benchmark country is underestimated when evaluated at the true parameters. It can be readily
verified that if the data form a convex set then $\mu^* \ge \hat{\mu}_*^*$ (with the inequality strict if the set is
strictly convex), so this term would be a source of upward bias to estimates of the efficient level

of spending. One might be happy to make such a convexity assumption if this was a production

set. But that is not the case. The discreteness of the data alone will generate non-convexities.

More generally, there appears to be little one can say on a priori grounds about the sign of this

first term, as its value will be data-specific.15 I will set this term to zero under the following

(admittedly ad hoc) assumption:

         Assumption 3: The benchmark country is a sufficient outlier that it remains the

         country with highest conditional life expectancy after correcting for the bias in the slope

         of the efficiency frontier, i.e., $\mu^* = \hat{\mu}_*^*$.



         The second term in square brackets in (7) arises from the bias in estimating the slope of

the frontier, as already discussed. This term will impart a downward (upward) bias to estimates

of social efficiency for all countries with life expectancy less than (greater than) that for the

country with the highest conditional life expectancy.

         The third term in square brackets, $(\varepsilon_i - \varepsilon^*)$, reflects country-specific heterogeneity.

Even at mean points ($E(\varepsilon) = 0$), unobserved variables that influence life expectancy in the

benchmark country at given health spending will lead the method to misidentify the vertical

location of the frontier. (Under the distributional assumptions about the error terms in the SF

method, this effect will vanish in expectation. In panel data versions of the COLS method, only

the time invariant component of this heterogeneity term matters.)
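         The algebra behind equation (7) can be spelled out in three steps. The sketch below takes equations (5) and (6) as given, writes $\hat{\gamma}$ for the OLS coefficient of $H^E$ on $H$ (so that $\hat{\beta} = \beta\hat{\gamma}$), and writes $\hat{\mu}_*^* \equiv \hat{L}^* - \alpha - \beta\hat{H}^*$ for the residual of the estimated benchmark country evaluated at the true parameters; these symbol choices are assumptions of the sketch.

$$\begin{aligned}
\hat{H}_i^E - H_i^E
&= \hat{H}^* - H^* + \frac{L_i - \hat{L}^*}{\beta\hat{\gamma}} - \frac{L_i - L^* + \varepsilon^* - \varepsilon_i}{\beta} \\
&= \frac{1}{\beta}\left[\beta(\hat{H}^* - H^*) + (L^* - \hat{L}^*) + (L_i - \hat{L}^*)\left(\frac{1}{\hat{\gamma}} - 1\right) + (\varepsilon_i - \varepsilon^*)\right] \\
&= \frac{1}{\beta}\left[(\mu^* - \hat{\mu}_*^*) + (L_i - \hat{L}^*)\left(\frac{1}{\hat{\gamma}} - 1\right) + (\varepsilon_i - \varepsilon^*)\right]
\end{aligned}$$

The second line uses $(L_i - \hat{L}^*)/\hat{\gamma} - (L_i - L^*) = (L_i - \hat{L}^*)(1/\hat{\gamma} - 1) + (L^* - \hat{L}^*)$, and the third uses $\mu^* - \hat{\mu}_*^* = L^* - \hat{L}^* - \beta(H^* - \hat{H}^*)$.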

15       Given that $\beta$ is underestimated, it must be the case that $L^* < \hat{L}^*$ and that $H^* < \hat{H}^*$; however,
this is not sufficient for determining the sign of $\mu^* - \hat{\mu}_*^*$ given that $\beta > 0$.

         We have seen that the direction of bias is hard to predict at the level of individual
countries. What can we say about average social efficiency? Under Assumptions 1-3, it is
readily verified that the asymptotic bias in the estimate of mean socially efficient spending is
given by:

                 $$\text{Bias} \equiv \mathrm{plim}\left(\sum \hat{H}_i^E / n\right) - E(H^E) = \frac{1}{\beta}\left[(\bar{L} - L^*)\left(\frac{1}{\hat{\gamma}} - 1\right) - \varepsilon^*\right] \tag{8}$$


Two sources of asymptotic bias are now evident in the square brackets on the RHS of (8). The

first source is the bias in the slope of the frontier arising from the correlation between inefficient

spending and total spending while the second is the bias in the height of the frontier arising from

latent heterogeneity in the benchmark country.

        The history of social policy in the frontier country is clearly a potentially important

source of latent heterogeneity in conditional social outcomes. If the country with the best

conditional performance got to that point by a long history of favorable policies (and not just for

health) then life expectancy will tend to be higher than one would expect given current health

spending. From equation (8) we can see that if the benchmark country has above average

life expectancy ($\bar{L} < L^*$) as well as favorable latent conditions for life expectancy ($\varepsilon^* > 0$) then


mean social efficiency will be underestimated.
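        The sign of the bias in equation (8) can be explored with a few lines of arithmetic. This is a minimal sketch: the function name and every number passed to it are hypothetical, chosen only to show how the two sources of bias can reinforce or offset each other.

```python
def mean_efficiency_bias(beta, gamma_hat, mean_life, frontier_life, frontier_eps):
    """Asymptotic bias in mean efficient spending, following equation (8).

    beta          -- true effect of efficient spending on life expectancy
    gamma_hat     -- OLS coefficient of efficient spending on total spending
    mean_life     -- mean life expectancy across countries
    frontier_life -- life expectancy of the benchmark (frontier) country
    frontier_eps  -- latent heterogeneity term of the benchmark country
    """
    slope_term = (mean_life - frontier_life) * (1.0 / gamma_hat - 1.0)
    return (slope_term - frontier_eps) / beta

# Benchmark with above-average life expectancy and favorable latent conditions:
# both terms are negative, so mean efficiency is underestimated.
case_a = mean_efficiency_bias(2.0, 0.6, 65.0, 80.0, 3.0)

# Benchmark with below-average life expectancy: the slope term turns positive
# and can outweigh the heterogeneity term, giving overestimation.
case_b = mean_efficiency_bias(2.0, 0.6, 65.0, 60.0, 1.0)

print(case_a, case_b)
```

The second case illustrates why overestimation of mean social efficiency cannot be ruled out once the benchmark country is not an above-average performer.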

        Under certain conditions it is possible to use an instrumental variables estimator to

correct the bias in the original social indicator regression. The practical challenge would be to

find an instrumental variable (IV) that is correlated with the socially efficient component of

spending but uncorrelated with the inefficient component. It is far from obvious that such an IV

exists, though this route may merit further research, such as by using lagged observations over

time as IVs in a panel-data structure. Notice, however, that it cannot be presumed that simply

correcting for the bias in the regression coefficient on spending will reduce the overall bias in the





estimate of mean social efficiency. That would only hold if the two sources of bias discussed

above work in the same direction.


                                          5. Conclusions

        A strand of the literature on human development and poverty has applied existing tools

for measuring technical efficiency in production to the problem of measuring aggregate "social

efficiency." The paper has questioned whether these methods can deliver credible results.

        There is a nagging concern that what is being called "inefficiency" in this literature may

reflect nothing more than how arbitrarily omitted differences in country circumstances -- such

as differences in the prices faced, or other relevant types of public spending, or administrative

capabilities -- influence partially measured social outcomes. The set of feasible combinations of

social outcomes and levels of income and social spending in any economy is almost certainly

riddled with non-convexities ("holes") arising from real constraints on what governments can

and cannot do. Without specifying which of those constraints is deemed to be binding in

assessing "social efficiency" and which is not, it is difficult to make sense of the calculations.

        Even if one accepts the free disposability assumption for social outcomes, there are some

poorly resolved specification issues for these "social indicator production functions." In contrast

to production analysis (for which it is reasonably clear what constitutes a production input) it is

not clear what should be a control variable in measuring social efficiency and what should be

used to explain inefficiency. This throws doubt on both the measures obtained, and the

explanations that have been given in the literature for measured differences in efficiency across

countries.

        However, even putting these concerns aside, there are other reasons to question the

reliability of these estimates. The main reason for "inefficiency" in attaining desired social




outcomes in human development and poverty reduction is presumably that there are public and

private activities that do not promote those goals. Obvious candidates include inefficient social

policies (that do not reach those in most need due to insufficient outlays or design deficiencies)

and persistently high levels of income inequality (whereby a large share of the aggregate income

differences between countries arguably does little for human development or poverty reduction).

These sources of "social inefficiency" are typically buried in the regression error term for the

social indicator. Even with unbiased estimates of these error terms, disentangling the

inefficiency from other factors (including measurement errors) is clearly problematic. However,

the least-squares regression parameters and (hence) the residuals can be expected to be biased in

general. This arises when total social spending (or national income) is correlated with its own

components, including both the efficient and inefficient activities from the point of view of

social outcomes. This source of bias has been routinely ignored in the literature on social

efficiency.

        We have seen that a systematic pattern of downward bias in measures of aggregate social

efficiency can be expected under certain conditions that are not obviously implausible. In

particular, mean social efficiency will be underestimated by standard parametric frontier methods

as long as both the efficient and inefficient components rise with total spending, the best

conditional performer is a sufficient "outlier" in the data, and the frontier country or countries

tend to have above average (unconditional) performance -- possibly reflecting favorable latent

conditions for human development stemming from a past history of good social policies and/or

low inequality. However, overestimation of social efficiency in certain countries, and even at the

mean, cannot be easily ruled out.





        These observations point to serious limitations of past attempts to measure and explain

social efficiency using cross-country comparisons of observed aggregate data. It is not clear

what can be inferred about average efficiency using these methods, even in large samples. And it

is problematic indeed to use this type of method to assess and monitor the performance of any

specific country, or to explain cross-country differences in performance.

        Some of the concerns raised in this paper relate solely to the parametric methods based

on social indicator regressions. Nonparametric methods can avoid some of these problems and

this may be a more promising route, though the "curse of dimensionality" comes to the fore in

applications on cross-country data sets.

        However, greater conceptual clarity about the definition and origin of "social

inefficiency" is needed before applied work on this topic borrows even more sophisticated tools

from production analysis. Whether some other approach to measuring "social efficiency" can

yield more credible results remains an open question. There does appear to be scope for less

inferentially ambitious approaches based on information pertaining more directly to sources of

low performance in reaching agreed social goals in specific settings, including for specific public

programs.





                                           References

Aigner, D., C. Lovell and P. Schmidt (1977). Formulation and estimation of stochastic

       frontier production function models, Journal of Econometrics, 6, 21-37.

Afonso, Antonio and Miguel St. Aubyn (2003). Non-Parametric Approaches to Education and

       Health Expenditure Efficiency in the OECD, mimeo, Technical University of Lisbon,

       Lisbon, Portugal.

Afonso, Antonio, Ludger Schuknecht and Vito Tanzi (2003). Public sector efficiency: An

       international comparison, European Central Bank Working Paper No. 242, European

       Central Bank, Frankfurt, Germany.

Anand, Sudhir and K. Hanson (1997). Disability-adjusted life years: A critical review. Journal of

       Health Economics 16, 685-702.

Anand, Sudhir and Martin Ravallion (1993). Human development in poor countries: On

       the role of private incomes and public services, Journal of Economic

       Perspectives, 7, 133-150.

Aturupane, Harsha, Paul Glewwe and Paul Isenman (1994). Poverty, human development

       and growth: An emerging consensus? American Economic Review, Papers and

       Proceedings, 84(2), 244-249.

Bhalla, Surjit S., and Paul Glewwe (1986). Growth and equity in developing countries: A

       reinterpretation of the Sri Lankan experience, World Bank Economic Review, 1, 35-63.

Bidani, Benu and Martin Ravallion (1997). Decomposing social indicators using

       distributional data, Journal of Econometrics, 77(1), 125-140.

Baccouche, Rafik and Mokhtar Kouki (2003). Stochastic production frontier and technical

       inefficiency: A sensitivity analysis, Econometric Reviews, 22(1), 79-91.





Cazals, Catherine, Jean-Pierre Florens and Leopold Simar (2002). Nonparametric frontier

        estimation: A robust approach, Journal of Econometrics 106, 1-25.

Clements, Benedict (2002). How efficient is education spending in Europe? European Review of

        Economics and Finance 1(1), 3-26.

Evans, David B., Ajay Tandon, Christopher J.L. Murray and Jeremy A. Lauer (2000). The

        comparative efficiency of national health systems in producing health: An analysis of 191

        countries. GPE Discussion Paper 29, World Health Organization, Geneva.

Fakin, Barbara and Alain de Crombrugghe (1997). Fiscal adjustment in transition economies:

        Social transfers and the efficiency of public spending. Policy Research Working Paper

        1803, World Bank, Washington DC. http://econ.worldbank.org/resource.php?type=5

Farrell, M.J. (1957). The measurement of productive efficiency, Journal of the Royal

        Statistical Society A, 120(3), 253-281.

Giannakas, Konstantinos, Kien C. Tran and Vangelis Tzouvelekas (2003). Predicting

        technical efficiency in stochastic production frontier models in the presence of

        misspecification: A Monte-Carlo analysis. Applied Economics 35, 153-161.

Gouyette, Claudine and Pierre Pestieau (1999). Efficiency of the welfare state, Kyklos,

        52, 537-553.

Greene, W.H. (1999). Frontier production functions, in Handbook of Applied

        Econometrics, edited by M.H. Pesaran and P. Schmidt, Oxford: Blackwell

        Publishers.

Gupta, Sanjeev, K. Honjo and Martijn Verhoeven (1997). The efficiency of government

        expenditure: Experience from Africa, Working Paper 97/153, International Monetary

        Fund. http://www.imf.org/external/pubs/cat/longres.cfm?sk=2409.0





Gupta, Sanjeev and Martijn Verhoeven (2001). The efficiency of government

        expenditure: Experience from Africa, Journal of Policy Modeling 23, 433-467.

Hollingsworth, Bruce and John Wildman (2003). The efficiency of health production: Re-

        estimating the WHO panel data using parametric and non-parametric approaches to

        provide additional information, Health Economics 12, 493-504.

Jalan, Jyotsna and Martin Ravallion (2003). Does piped water reduce diarrhea for children in

        Rural India? Journal of Econometrics 112, 153-173.

Jamison, D.T., J. Wang, K. Hill and J.L. Londono (1996). Income, mortality and fertility

        in Latin America: Country level performance, 1960-90, Revista-de-Analisis-Economico,

        11(2), 219-61.

Jayasuriya, Ruwan and Quentin Wodon (2003). Efficiency in Reaching the Millennium

        Development Goals, World Bank, Washington DC.

        http://publications.worldbank.org/ecommerce/catalog/product?item_id=2435559

Kakwani, Nanak (1993). Performance in living standards: An international comparison,

        Journal of Development Economics, 41, 307-336.

Kumbhakar, Subal C., and C.A. Knox Lovell (2000). Stochastic Frontier Analysis,

        Cambridge: Cambridge University Press.

Moore, Mick, Jennifer Leavy, Peter Houtzager and Howard White (2000). Polity

        Qualities: How Governance Affects Poverty, Working Paper 99, Institute of Development

        Studies, University of Sussex. http://www.ids.ac.uk/ids/bookshop/wp/wp99.pdf

Meeusen, W., and J. van den Broeck (1977). Efficiency estimation from Cobb-Douglas

        production functions with composed error. International Economic Review 18(2),

        435-444.





Mundlak, Y. (1963). Estimation of production and behavioral functions from a combination of

       cross-section and time series data, in C. Christ et al. (eds) Measurement in Economics:

       Studies in Mathematical Economics and Econometrics in Memory of Yehuda Grunfeld,

       Stanford: Stanford University Press.

Olley, G. Steven and Ariel Pakes (1996). The dynamics of productivity in the

       telecommunications equipment industry, Econometrica 64(6), 1263-1297.

Park, B.U., L. Simar and Ch. Weiner (2000). The FDH estimator for productivity efficiency

       scores, Econometric Theory 16, 855-877.

Pope, Rulon D., and Richard E. Just (2003). Distinguishing errors in measurement from

       errors in optimization, American Journal of Agricultural Economics, 85(2),

       348-358.

Ravallion, Martin (2001). On assessing the efficiency of the welfare state: A comment,

       Kyklos 54(1), 115-123.

Schmidt, Peter and R.C. Sickles, (1984). Production frontiers and panel data, Journal of

       Business and Economic Statistics, 2(4), 367-374.

Sen, Amartya K., (1981). Public action and the quality of life in developing countries,

       Oxford Bulletin of Economics and Statistics, 43, 287-319.

______________, (1988). Sri Lanka's achievements: When and how?, in Srinivasan,

       T.N., and P.K. Bardhan (eds) Rural Poverty in South Asia, New York: Columbia

       University Press, 549-56.

Skinner, Jonathan, (1994). What do stochastic frontier cost functions tell us about inefficiency?

       Journal of Health Economics 13, 323-328.

United Nations Development Programme (UNDP) (1996). Human Development Report.





       New York: Oxford University Press.

Wang, Jia, Dean T. Jamison, Eduard Bos, Alexander Preker, John Peabody (1999).

       Measuring Country Performance on Health: Selected Indicators for 115

       Countries, Health, Nutrition and Population Series, World Bank.

Winsten, C.B. (1957). Discussion of Mr. Farrell's paper, Journal of the Royal

       Statistical Society A, 120(3), 282-284.

World Bank (1993). World Development Report: Investing in Health, New York: Oxford

       University Press.

_________ (2003). World Development Indicators, World Bank, Washington DC.

World Health Organization (1999). The World Health Report: Making a Difference.

       Geneva: World Health Organization.

_______________________ (2000). The World Health Report: Health Systems. Improving

       Performance. Geneva: World Health Organization.

_______________________ (2001). Report of the Scientific Peer Review Group on Health Systems

       Performance Assessment, Geneva: World Health Organization.

       http://www.who.int/health-systems-performance/sprg/report_of_sprg_on_hspa.htm



