WPS4216

Underlying Dimensions of Knowledge Assessment: Factor Analysis of the Knowledge Assessment Methodology Data

Derek H. C. Chen* The World Bank
Kishore Gawande** Texas A&M University

The Knowledge Assessment Methodology (KAM) database measures variables that may be used to provide an assessment of countries' readiness for the knowledge economy, and has many policy uses. Formal analysis employing KAM data confronts the problem of which variables to choose and why. Rather than make these decisions in an ad hoc manner, we recommend factor-analytic methods to distill the information contained in the many KAM variables into a smaller set of "factors". The main objective of the paper is to quantify the factors for each country, and to do so in a way that allows comparisons of the factor scores over time. We investigate both principal components and true factor-analytic methods, and emphasize simple structures, which not only give the factors a clear political-economic meaning but also allow comparisons over time.

World Bank Policy Research Working Paper 4216, April 2007

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the view of the World Bank, its Executive Directors, or the countries they represent. Policy Research Working Papers are available online at http://econ.worldbank.org.

*Economist, Knowledge for Development Program, Human Development Department, World Bank Institute. **Professor, Bush School of Government and Public Service.

1. Introduction

To help countries make the transition to the knowledge economy, the Knowledge Assessment Methodology (KAM) was developed at the World Bank (Chen and Dahlman, 2004, 2005). It is designed to provide an assessment of countries' readiness for the knowledge economy, and identifies sectors or areas in which policymakers should focus their attention and make future investments. The KAM is widely used both within and outside the World Bank, and frequently facilitates engagements and policy discussions with government officials from client countries. This rich database is also potentially useful for research by political economists and political scientists. The KAM database includes variables such as tariff and non-tariff barriers, regulatory quality, rule of law, adult literacy rate, secondary enrollment, tertiary enrollment, researchers in R&D, patent applications granted by the USPTO, scientific and technical journal articles, telephones, computers, and internet users (fn 1). They are constructed for over 120 countries, and are available at different points in time. Any formal analysis employing KAM data must confront the problem of which variables to choose and why. Rather than make these decisions in an ad hoc manner, we recommend "reducing" the set of KAM variables to a smaller set of variables without losing the information contained in the full set of variables.
Factor-analytic methods are concerned with precisely this problem - reducing the data in a way that parsimoniously represents essentially the same information contained in the many variables. The parsimonious set of variables is the set of "factors" to which the data in the large number of variables are reduced.

Footnote 1: Source: The Knowledge Assessment Methodology (KAM) website (www.worldbank.org/kam).

Our main objective in undertaking the factor analysis is to quantify the factors for each country, that is, to compute "factor scores" on each factor. Importantly, we wish to accomplish this in a way that allows comparisons of the factor scores over time. To this end, the paper takes up several issues in the factor analysis of the KAM data in detail. The first is whether the KAM data should be factor-analyzed and which factor-analytic method may be most appropriate; the second is determining the optimal dimensionality of the data, that is, the number of factors to which the data may be adequately reduced; the third, and perhaps most important, is giving clear meaning to the factors. Each of these issues is treated exhaustively in the paper.

If subsets of variables are correlated, then, depending on the extent of the correlation, factor analysis is worth doing. A formal test shows that the KAM data are not just amenable to factor analysis but that they greatly benefit from it. There are enough intercorrelations among the variables that the real information in the data can be distilled down to a smaller number of dimensions. What is the optimal dimensionality to which the information contained in the variables can be reduced? The answer differs depending on the factor-analytic method that is chosen. For example, in principal components analysis it is determined by the number of principal components required to explain, say, 95% of the total variance in the data. In "true" factor analysis (which we estimate using maximum likelihood), a formal chi-squared test or information criteria that measure fit in terms of explained intercorrelations, not just variance, are used to determine the optimal dimensionality.

The most important contribution of the paper is that it gives political-economic meaning to the dimensions, whether they go by the name of "factors" or "principal components". Ultimately, we hope to make factor analysis a useful policy tool to indicate warning signals about the health of countries. The tools we will use to give political-economic meaning to the factors are the "factor loadings". Intuitively, these are the coefficients of the regression of each variable on the factors. Thus, if one variable has a very high coefficient on one factor but not on any of the others, we say that the variable loads heavily on that factor. If the data, or the information in the variables, can be reduced to a smaller set of factors, then what we should find is that some variables load heavily on the same factor and other variables load heavily on other factors. That is, the structure of factor loadings should be "simple". One definition of a simple structure of loadings is as follows: a structure in which any single variable loads on only one factor and minimally on the others, and in which more than one variable loads on each factor. We will spend considerable effort on producing simple structures, because simple structures make the political-economic content of the factors unambiguous and clear. We will also test for the adequacy of our simple structures.
Obviously, the preceding discussion of factor loadings as regression coefficients is meant only to fix ideas because, unlike in regression analysis, the factors themselves are unknown. In other words, the factor scores, or the values that the factors take, are not known, and no regression in the usual sense can be estimated. Section 2 provides the theory behind how factor scores and factor loadings are computed simultaneously within a factor-analytic framework.

The paper proceeds as follows. In Section 2, we outline the generic factor model. In this section, two fundamentally different methods of factor analysis, principal components analysis and true factor analysis, are explained in detail. A special case of true factor analysis, the error components method, is also discussed here. Section 3 discusses the data and sources. The analysis is carried out on 12 variables measured across 120 countries. The data are from two time periods, 1995 and a more recent vintage around 2003. The same section discusses how we impute missing data in order to cover the sample of 120 countries, not all of which have complete data on all 12 variables. We also point to a data pitfall that should be avoided before doing the factor analysis. Section 4 contains the empirical results and the main contribution of the paper. We analyze principal components separately from the true factor analysis results. There are three main components to this section. The first is the use of factor loadings in order to name the factors. We show that with the KAM data we are able to achieve a fairly simple structure. The second is a set of formal tests for the dimensionality. The third is another methodological pitfall, whose resolution confronts us with a trade-off. We indicate how and why we choose to resolve it in the manner we do. The choice follows from our overriding objective of computing factor scores as precisely as possible. Section 5 discusses the output from this factor analysis. We use graphs to show how countries have changed their rankings on the underlying dimensions over this ten-year period. Section 6 concludes.

2. Factor Analysis Models

The notation and material in this section are borrowed from Reyment and Joreskog (1993, Sections 2 and 4). The general factor analysis model is

X_{N×p} = F_{N×k} A_{k×p} + E_{N×p},   (1)

where X is the data matrix of p variables, F is the matrix of k < p factors, and N is the sample size. The k × p "factor loadings" matrix A is used to linearly sum the factors to predict each column of X. What cannot be predicted is collected in the error matrix E. In the context of the KAM data, each column of X is a measure (i.e. variable) containing "scores" for a set of N countries. There are p such measures on which country scores have been compiled (fn 2). The individual components of F are the "scores" for common factors, since they are common to several different measures. The KAM measures are thus predicted as linear combinations of the factors. The coefficients of the factors, called the factor loadings, are the elements of A. For example, consider the ith measure (variable) x_i. It can be written as a regression model

x_i = a_{i1} f_1 + a_{i2} f_2 + ... + a_{ik} f_k + e_i,   (2)

where f_1, ..., f_k are the "exogenous" factors, and the coefficients a_{i1}, ..., a_{ik} are the "loadings" contained in the ith column of A.
While e_i is given the interpretation of a regression residual, in fact it is made up of the measurement error in the measure x_i plus a "specific" factor that x_i does not share in common with other measures. Thus, each of the p variables x_i, i = 1, ..., p, can be written as a regression model with the factors acting as the common "exogenous" variables weighted by the coefficients a_{i1}, ..., a_{ik}, and where e_i is the regression residual. Writing the model in this form makes it clear that factor analysis is a method of data reduction. The method seeks to parsimoniously represent in a small set of variables (f_1, ..., f_k) essentially the same information contained in a much larger set of variables (x_1, ..., x_p). We will reduce the KAM data variables to their essential factors using two different factor-analytic methods.

Footnote 2: p need not be fixed. Factor analysis of the KAM variables may be performed separately on subsets of the KAM variables. For example, each of the four pillars of the KAM data - (i) economic and institutional regime data, (ii) education and skills data, (iii) infrastructure data, and (iv) innovation potential data - may be distilled down to one or two factors.

The difference between model (1) and ordinary regression models is that the factors and the coefficients are both unknown. That is, neither F nor A is known; both must be estimated. There is a fundamental indeterminacy in the model. If we (linearly) transform F and A, respectively, as F* = F C^{-1} and A* = C A, then (1) is equivalently written as

X_{N×p} = F*_{N×k} A*_{k×p} + E_{N×p}.   (3)

By observing X alone we cannot distinguish between these two models. This should be familiar from econometric textbook discussions of identification (e.g. Greene, 2004). Devising "simple" structures, in which as many factor loadings as possible are zeros, facilitates identification and interpretation of the factors. We will explore simple structures in detail. We now formally discuss the two popular methods of factor analysis that we will use: the Principal Components (PC) method and the pure factor analysis model, which we estimate by maximum likelihood (ML).

Fixed versus Random Factors

A distinction is made between models that presume the factor matrix F in (1) to be fixed, and models that presume F to be random. The random factors model is appropriate when we want to extend our inferences to different samples (say, of individuals), while the non-random factors model is appropriate when the specific observations (here countries), and not just the model structure, are of interest. The KAM data pertain to specific countries, and are exhaustive across countries, which makes a compelling case for the use of fixed-factor models. However, if inferences from the factor analysis were to be applied to countries not in the sample, or to the same countries in a future period, then it is advisable to use random-factor models. The likelihood function for (identified) models with random F is well defined (see e.g. Anderson, 1984, p. 552). Estimation of models with non-random F proceeds on least squares criteria (for which, unlike the random factors case, no distributional assumptions need be made unless statistical testing is to be done).

2.1 Principal Components Analysis of the Fixed Factors Model (fn 3)

In Stata, estimation of the principal components model proceeds as in a fixed factor model. Let Y be the mean-removed data matrix, scaled so that the matrix S = Y'Y is the data covariance matrix.
Consider the (non-random factor) model for Y:

Y_{N×p} = F_{N×k} A_{k×p} + E_{N×p},   (4)

where A is the factor loadings matrix and F is the matrix containing the factor scores. Using least squares to fit the (fixed data) model implies estimating F and A (for a given k; see Section 4 on determining k) in order to minimize the sum of the squares of the residual matrix

E = Y - FA.   (5)

The singular value decomposition (SVD) theorem indicates that the solution based on the largest k singular values \gamma_1, \gamma_2, ..., \gamma_k is given by

\hat{F}\hat{A} = \gamma_1 v_1 u_1' + \gamma_2 v_2 u_2' + ... + \gamma_k v_k u_k',   (6)

where u_j is a (p × 1) vector, v_j is an (N × 1) vector, and \gamma_j^2 is the jth eigenvalue of the data covariance matrix S. Define the matrices V_k = [v_1, v_2, ..., v_k], U_k = [u_1, u_2, ..., u_k] and \Gamma_k = diag[\gamma_1, \gamma_2, ..., \gamma_k]. Their dimensionalities are V_k: (N × k), U_k: (p × k), \Gamma_k: (k × k). Then the solution is

\hat{F}\hat{A} = V_k \Gamma_k U_k'.   (7)

Note that there is not a unique solution for F and A individually. Our solution will be in the direction of "simple" structures for \hat{A}. Consider the following solution:

\hat{F} = V_k,   \hat{A} = \Gamma_k U_k'.   (8)

Then the factor scores for the k factors, \hat{F}, are also in standardized form with covariance equal to the identity matrix. That is, they are pairwise uncorrelated. If E is small, so that Y is approximated by \hat{F}\hat{A}, then the data covariance is approximately

S = Y'Y ≈ \hat{A}'\hat{F}'\hat{F}\hat{A} = \hat{A}'\hat{A}.   (9)

The "pca" routine in Stata calculates principal components in the following steps:
1. Compute the covariance matrix S (fn 4).
2. Compute the k eigenvectors corresponding to the largest eigenvalues of S. Arrange the eigenvectors in the p × k matrix U_k.
3. Estimate the factor loadings as \hat{A} = U_k.
4. Estimate the factor scores as \hat{F} = Z\hat{A}, where Z is the data matrix with the p variables standardized to have zero mean and unit variance.

Thus, the factor loading matrix is the set of eigenvectors corresponding to the largest k eigenvalues. This is also the factor scoring matrix. Note that Stata computes factor scores using the standardized variables Z, not Y. In this solution the factors have different variances and are not comparable (their "units" are different). They must be scaled by \Gamma_k^{-1/2} to be comparable (and have unit variance).

Footnote 3: The principal components (PC) method is applicable to both fixed and random factors models (Reyment and Joreskog, 1993). We focus on PC as applied to fixed factors since Stata estimates PC for the fixed factors model. We indicate how to estimate the PC model for random factors in fn 5.

Footnote 4: In our analysis we use the Stata default of analyzing the data correlation matrix, which produces quantitatively somewhat different loadings and scores from the analysis of the variance matrix - known as the "scaling" problem of PC analysis - but qualitatively the results are close.

2.2 True Factor Analysis of Intercorrelations (using Maximum Likelihood)

True factor analysis is based on the random factors model. While the model in the random factors case is the same as (1), the population covariance matrix is

\Sigma = A'A + \Psi   (10)

if the factors are uncorrelated, and

\Sigma = A'\Phi A + \Psi   (11)

if the factors are correlated, where \Phi is the covariance matrix of the factor scores. In (10) and (11) \Psi is the true error covariance matrix (fn 5). In order to estimate the parameters of the model, we proceed by analyzing data that are mean-removed, so that the data covariance is S = X'X. We make the following assumptions about the true covariances:

(1/N) X'X → \Sigma,   (1/N) F'F → \Phi,   (1/N) F'E → 0,   (1/N) E'E → \Psi,   (12)

that is, finite second moments and orthogonality of the error and factor score matrices.
We will assume that the error covariance \Psi is diagonal, that is, measurement (and other) errors are uncorrelated across different variables. This diagonal error covariance is constant across observations (a "homoskedastic" covariance). The factors may be correlated, that is, \Phi is permitted to be non-diagonal (if the factors are uncorrelated - more on this below - then \Phi = I). Therefore, the population covariance is a function of the model parameters A, \Phi and \Psi:

\Sigma = A'\Phi A + \Psi.   (13)

Footnote 5: PC analysis of the random factors model is also possible, but requires the assumption that \Psi is small (that is, that E in (1) is small). The unweighted least-squares (ULS) criterion fits the factor model so that the sum of squares of the elements of S - A'A (presuming the factors are uncorrelated) is minimized. The PC solution to this problem may be computed in the following steps: 1. Compute the covariance matrix S. 2. Compute the k largest eigenvalues and arrange them in a diagonal matrix \Gamma_k. 3. Compute the corresponding k eigenvectors U_k of S and compute \hat{A} = U_k \Gamma_k^{1/2}; each eigenvector is now scaled so that its length equals the corresponding eigenvalue. 4. Compute the factor scores as \hat{F} = Y\hat{A}\Gamma_k^{-1}. While this solution is different from the fixed factor solution, it is applicable to the fixed factor case with the ULS criterion applied to the error matrix E in (4) and (5).

In PC analysis of the random factors model (see fn 5), the factors are determined so that they account for the maximum variance of all the observed variables. Thus, the emphasis in PC analysis is on eigenvalues, because the sum of all the eigenvalues is the total variance in all the variables. In true factor analysis, the factors are determined so that they best account for the intercorrelations of the variables. In true factor analysis the errors are presumed to be uncorrelated with each other, so that \Psi is diagonal (in PC analysis \Psi is simply assumed to be small, in the sense that \Sigma ≈ A'A; the rank of A'A, and therefore of \Sigma, is approximately k). In true factor analysis, \Psi in (13) has diagonal elements only, so that the off-diagonal elements of \Sigma are exactly equal to the off-diagonal elements of A'\Phi A, and the parameters are estimated to make the off-diagonal elements of the data correlation matrix as close as possible to the off-diagonal elements of A'\Phi A. The diagonal elements of \Sigma are equal to the sum of the diagonal elements of A'\Phi A (the "communalities" of the variables) and those of \Psi (the "uniquenesses" of the variables). The off-diagonal elements assume greater importance in true factor analysis than in PC analysis (where they are assumed away).

ML estimation of A and \Psi is based on the assumption that the error vector for observation i, E_i, is multivariate normal with mean 0 and covariance \Psi. The fit function for the multivariate data is

ln|\Sigma| + tr(S\Sigma^{-1}) - ln|S| - p,   (14)

which is minimized over the parameters A and \Psi (equivalently, the likelihood is maximized). The MLEs have well-defined limiting distributions, which are used for testing (fn 6).

Computing Factor Scores and Standard Errors

In order to estimate factor scores from the ML method, consider a single observation on the factor model:

x_{p×1} = A'_{p×k} f_{k×1} + e_{p×1},   (15)

where the lower-case letters denote the vector counterparts of the matrices in (1), written as column vectors. We proceed as described in Anderson (1971, p. 575). The data vector x and the factor score vector f have a joint normal distribution with mean (0', 0')' and covariance matrix

cov(x, f) = [ \Psi + A'\Phi A   A'\Phi ;  \Phi A   \Phi ],

where the first block row corresponds to x and the second to f. The factor scores are computed by the regression of f on x.
In terms of the population parameters, this regression is

E(f | x) = \Phi A (\Psi + A'\Phi A)^{-1} x.   (16)

Using the conditional variance formula, the covariance of the regression is

cov(f | x) = \Phi - \Phi A (\Psi + A'\Phi A)^{-1} A'\Phi.   (17)

Replacing the parameters by their ML estimates yields an estimate of the (conditional) covariance, which may be used to test hypotheses about an observation's scores on the different factors. The square roots of the diagonal elements of the (estimated) covariance are the standard errors of the estimated k-vector of factor scores for that observation. These standard errors are constant across observations. Dividing a factor score by the corresponding standard error produces a t-statistic for testing the statistical significance of individual factor scores.

Footnote 6: In Stata, the "factor" command is used together with the "ml" option in order to estimate the parameters of the factor model.

To take a simple example, suppose the data are aggregated into a single factor, k = 1. Then the matrix \Phi collapses to unity, and the estimator for the (scalar) factor score is

E(f | x) = A (\Psi + A'A)^{-1} x,   (18)

and its (scalar) variance is

cov(f | x) = 1 - A (\Psi + A'A)^{-1} A'.   (19)

For this single-factor case, denoting the ML parameter estimates with "hats", the factor score (for the single observation) is computed as the conditional mean

E(f | x) = \hat{A} (\hat{\Psi} + \hat{A}'\hat{A})^{-1} x,   (20)

and its standard error is

se(f | x) = [ 1 - \hat{A} (\hat{\Psi} + \hat{A}'\hat{A})^{-1} \hat{A}' ]^{0.5}.   (21)

2.3 Error Components Method

The error-components (EC) approach used by Kaufmann et al. (2005) to measure governance across several countries is a random-factors approach based on econometric methods developed for latent data models (see e.g. Goldberger (1972) and the MIMIC models of Joreskog (1967) and Joreskog and Sorbom (1979)). The Kaufmann et al. approach is to fix the number of variables that map into a factor and then estimate scores for the factor as conditional means, conditional on parameters estimated by maximum likelihood. Thus, one major difference from the PC and ML methods of factor analysis described above is that the number of variables that map into a factor is prespecified (fn 7). That is, the number of variables p (and which ones they are) is treated as prior information. The computation of EC factor scores proceeds in two steps. First, the model parameters are estimated by maximum likelihood. Next, they are used to compute the scores as conditional means. The method also produces conditional variances, which may be used to construct confidence intervals for the factor scores or for testing. They (implicitly) consider the following factor model:

X_{N×p} = F_{N×1} \beta'_{1×p} + E_{N×p}.   (22)

This corresponds to (1) except that the factor loading vector \beta takes the place of the factor loading matrix A in (1). Whereas in (1) p variables mapped into k factors, here p variables map into a single factor. Note that, while we have chosen to use the same notation to indicate matrix dimensions, the number of variables p may be chosen to be a specific set of variables, and not the entire data matrix at hand (as it was in the case of the ML and PC methods, in which the number of factors k is determined by the data). Since in the EC method k = 1, the p variables may be chosen to be a "homogeneous" subset of the variables designed for mapping into that factor. The EC likelihood function is as follows. Let \alpha, \beta, and \psi be (p × 1) parameter vectors.
As before, \Psi is defined to be the diagonal (p × p) error covariance matrix, with diagonal elements \psi_1, ..., \psi_p. Let the (p × p) matrix \Sigma = \beta\beta' + \Psi. Then the likelihood function for the data is

L = -0.5 [ N ln|\Sigma| + \sum_{j=1}^{N} (x_j - \alpha') \Sigma^{-1} (x_j - \alpha')' ].   (23)

In (23) the parameter \alpha is simply the vector of the means of the p variables in X. For observation j the 1 × p data vector is denoted x_j. Denoting the ML parameter estimates with "hats", the factor score for observation j is computed as the conditional mean (conditional on x_j)

\hat{F}_j = \hat{\beta}' \hat{\Sigma}^{-1} (x_j - \hat{\alpha}')',   (24)

and the standard error of this estimate is computed as

se_j = [ 1 - \hat{\beta}' \hat{\Sigma}^{-1} \hat{\beta} ]^{0.5}.   (25)

This is exactly the same as (21), with A = \beta' (so that \Psi + A'A = \Psi + \beta\beta' = \Sigma). Where the EC method differs from the (random) factor method estimated by ML is in the specification of the likelihood functions. Whereas in the EC method the data likelihood is maximized over the parameters, in the ML factor method the likelihood of the intercorrelations in the data is maximized over the parameters. In this sense, the EC method is still a variance method (driven by a squared-error loss objective), while the ML factor method pays attention to the intercorrelations among the variables.

Footnote 7: For example, Chen and Dahlman (2005) partition the KAM variables into four "pillars": Economic Incentive and Institutional Regime, Education and Human Resources, Innovation System, and Information Infrastructure. In the Chen-Dahlman scheme, since tariff and non-tariff barriers, regulatory quality, and rule of law represent the Economic Incentive and Institutional Regime pillar, p = 3 for this factor.

3. Data

The Knowledge Assessment Methodology (KAM) database consists of more than 80 structural and qualitative variables that measure how countries perform as "knowledge economies". We will use the subset of 12 variables that are used by the KAM method to compute each country's "basic scorecard". They are: tariff and non-tariff barriers, regulatory quality, rule of law, adult literacy rate (% age 15 and above), secondary enrollment, tertiary enrollment, researchers in R&D, patent applications granted by the USPTO, scientific and technical journal articles, telephones (mainlines + mobile phones), computers, and internet users. The KAM website (see fn 1) indicates the variety of sources from which the data are drawn. In addition to these unscaled data, we will also perform factor analysis on these variables with a subset of them scaled so that country size does not influence the analysis. The scaled set of variables is: tariff and non-tariff barriers, regulatory quality, rule of law, adult literacy rate (% age 15 and above), secondary enrollment, tertiary enrollment, researchers in R&D (per million population), patent applications granted by the USPTO (per million population), scientific and technical journal articles (per million population), telephones per 1,000 persons (telephone mainlines + mobile phones), computers per 1,000 persons, and internet users per 10,000 persons. Data on the 12 variables are available at two points in time, one measured in 1995 and another during a more recent period, between 2002 and 2004. We will use the term "2002" to indicate the recent data. Table 1 describes the variables and reports descriptive statistics for the 12 variable-pairs.

3.1 Missing Data Imputation

The factor analysis restricts the sample to observations with complete data on all included variables.
Hence, a crucial pre-estimation step is to impute missing data in order to have as broad a coverage of countries as possible. The imputations are carried out using a simple regression of the variable with missing data on a conceptually closely related independent variable. For example, (unscaled) research95 has data for only 86 countries. However, the closely related research03 has data for 95 countries. Therefore, nine observations can be additionally imputed by regressing research95 on research03. The first column of Table 2 shows the results of this regression. The R-squared of 0.91 indicates a good fit for the imputation. Having filled these nine data points, we now have data for 95 countries for research95. That is still not enough. The next closely related variable is technical journal output in 1995 (techjour95). The second column of Table 2 indicates that this regression has a reasonably good fit, with an R-squared of 0.70. The variable techjour95 is statistically significant at 1%. Therefore, the two-step regression process makes data on research95 available for 120 countries. A similar two-step regression process is used to impute missing research03 data via the regressions shown in columns 3 and 4 of Table 2. The last three columns in Table 2 impute data for computer95, computer04 and tariffs and NTBs for '95 (tntb95) using, respectively, GDP per capita for the two computer variables and tntb05 as regressors (fn 8). After completing the imputations we have data for 120 of the 128 countries. Unavailability of data on the regressors prevents imputing missing data for the remaining 8 countries. The factor analysis is based on the sample of these 120 countries. The authors' working paper provides details on the countries for which variables are imputed and the imputed values.

Footnote 8: For imputing missing country values we use not only available data across countries but also data for the ten regions, including the world. There is additional information in these aggregated regions which can be brought to bear on the imputations.

3.2 Is It Worth Doing Factor Analysis on the KAM Data?

The main objective of the factor analysis is to understand whether countries have advanced their positions over the 10-year period in terms of (i) the absolute measures of the factors, and (ii) their factor score ranks vis-a-vis other countries. Before proceeding with factor analysis and the computation of factor scores, it is important to understand whether and how much we can gain from undertaking a factor analysis. The cross-country data on the 12 variables have considerable correlations among them. However, if the correlations are driven by common underlying factors, then the factors become the main objects of interest for us. If two variables share a common factor with other variables, their partial correlation, controlling for all remaining variables, will be small. The Kaiser-Meyer-Olkin (KMO) statistic, based on this idea, computes the ratio of (i) the sum of squared correlations of each variable in the analysis with every other variable to (ii) the same sum plus the sum of squared partial correlations of each variable with every other variable, controlling for all remaining variables. Large values of this "overall" KMO measure indicate that the partial correlations are small, that is, that common underlying factors are responsible for the correlations among the variables. A large value of the KMO measure therefore indicates considerable gains from undertaking a factor analysis.
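To fix ideas, the following sketch illustrates how an overall KMO measure of this kind can be computed. It is an illustrative Python fragment with simulated stand-in data, not the code or software used for the results in this paper; the partial correlations are obtained here from the inverse of the correlation matrix.

```python
import numpy as np

def kmo_overall(X):
    """Overall Kaiser-Meyer-Olkin measure for the columns of X (N x p).

    Sketch only: partial correlations (controlling for all other
    variables) are derived from the inverse of the correlation matrix.
    """
    R = np.corrcoef(X, rowvar=False)           # simple correlations
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    P = -R_inv / np.outer(d, d)                # partial correlations
    np.fill_diagonal(P, 0.0)
    np.fill_diagonal(R, 0.0)                   # keep off-diagonals only
    r2 = (R ** 2).sum()
    p2 = (P ** 2).sum()
    return r2 / (r2 + p2)

# Example with simulated data standing in for the KAM variables:
rng = np.random.default_rng(0)
common = rng.normal(size=(120, 1))             # one shared underlying factor
X = common + 0.5 * rng.normal(size=(120, 12))  # 12 noisy measures of it
print(round(kmo_overall(X), 3))                # values near 1 favor factoring
```

When a single common factor drives all the measures, as in this simulated example, the partial correlations are small relative to the simple correlations and the KMO measure is close to one.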
Table 3 indicates that the overall KMO measure is 0.875, which provides solid support for proceeding with factor analysis of the KAM data. Further, the KMO statistic for each variable individually indicates that their high correlations are driven by underlying factors (fn 9).

Footnote 9: The variables in Table 3 are an amalgam of each variable-pair over the two years; see below.

3.3 Avoiding a Pitfall

Performing factor analyses separately on the 1995 variables and the 2002 variables is a pitfall one should avoid if the purpose of the factor analysis is to compare factor scores across the two periods. Separate analyses produce factor scores (that is, the quantity of a factor contained in each country) that are not strictly comparable. For example, in PC analysis separate analyses produce factors with different variances, so that their magnitudes are not comparable (that is, their "units" are different). In order to solve this problem, we proceed as follows. First, we combine the 1995 and 2002 variables into one set of 12 variables. To be consistent, the same factor-analytic method is used to combine each pair of variables as is used in the factor analysis of the full set of 12 variables. For example, Computer95 and Computer04 are factor-analytically combined into one computer variable using either maximum likelihood (ML) or principal components (PC). Second, we proceed with the factor analysis of this set of 12 amalgamated variables. Third, we use the common estimate of the "scoring coefficients" matrix (see below) produced by the factor analysis, but apply that matrix separately to the 1995 and 2002 variables in order to compute separate sets of factor scores, one for 1995 and one for 2002. These scores are used to analyze changes in the factors over the two periods. In this paper the scores are used ordinally to rank countries. However, making simple adjustments to the mean and standard deviation allows cardinal comparisons as well, for example in regression analyses.

In the following section we report and analyze the results from the Principal Components (PC) method and the Maximum Likelihood (ML) method. The discussion is from the ground up and provides details about (i) why the specific number of factors is chosen in each method, (ii) why we choose "simple" factor loading structures for our analyses, (iii) why we choose the specific method for obtaining the simple structure, and (iv) an approximation that makes the structure especially simple and is essential to achieving our objective of comparing factor scores across years (plus a chi-squared test of whether the approximation is statistically accurate).

4. Empirical Results

4.1 Principal Components Analysis

The first step in factor analysis is choosing k, the number of factors that will fit the data "adequately". In PC analysis, an oft-used criterion is to set k to be no less than the number required to explain 95% or more of the total variance in the data (fn 10). Table 4.1 shows that six factors are required to explain at least 95% of the variance in the 12 KAM variables. Thus, we choose k = 6. Even though all six factors are required to account for 95% of the variance in the data, the first factor accounts for the lion's share of the data variance. Table 4.1 indicates that the first principal component accounts for 60.2% of the total variance. We might expect that this component will also have the maximum number of large loadings among all principal components. Table 4.2 reports the loadings with k = 6.
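As a concrete illustration of the extraction and k-selection steps just described, the following sketch selects the smallest k that explains at least 95% of total variance and computes unrotated loadings and scores as in steps 1-4 of Section 2.1. It is illustrative Python applied to a generic standardized data matrix; the array names are hypothetical and this is not the Stata routine used for the results reported here.

```python
import numpy as np

def pca_extract(Z, var_target=0.95):
    """Eigen-decomposition of the correlation matrix of Z (N x p).

    Returns (k, loadings, scores): k is the smallest number of components
    whose eigenvalues explain at least var_target of total variance,
    loadings are the corresponding eigenvectors (p x k), and scores are
    Z times the loadings, mirroring steps 1-4 of Section 2.1.
    """
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    share = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(share, var_target) + 1)
    A_hat = eigvecs[:, :k]                      # loadings = eigenvectors U_k
    F_hat = Z @ A_hat                           # scores F = Z A
    return k, A_hat, F_hat

# Usage with simulated stand-in data (120 countries, 12 variables):
rng = np.random.default_rng(1)
Z = rng.normal(size=(120, 12))
Z = (Z - Z.mean(0)) / Z.std(0)                  # standardize the variables
k, A_hat, F_hat = pca_extract(Z)
print(k, A_hat.shape, F_hat.shape)
```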
As expected, a majority of the variables load heavily on this first component. Not only does this make it a catch-all factor, but since the variables do not load on the remaining factors, those factors have little political-economic content. For this reason, factor analysts have sought to design factors with "simple" structures so that all factors have meaningful content.

Footnote 10: While this criterion is the most popular, other criteria have also been used. They are: (i) The size of individual factor loadings: the squared factor loadings (for orthogonal factors) indicate the variance of a variable accounted for by a particular factor, and factors not contributing much may be dropped; if parsimony is a driving concern, the rule of thumb proposed by Reyment and Joreskog (1993) - that there should be at least three significant loadings on each factor - may be used. (ii) The variance explained by a factor: the sum of squared loadings for a given factor represents the information content of the factor, and the ratio of this sum of squares to the trace of the correlation matrix is the proportion of total information residing in the factor; a cutoff value can then be used to determine how many factors to retain. (iii) Significant residuals: a residual correlation matrix may be calculated after each factor has been extracted, and k is determined at the point when the residual matrix consists of correlations solely due to random error; the standard error of the residual correlations (estimated roughly as 1/sqrt(N - 1)) can be used to determine whether the correlations are significantly greater than zero.

Simple Structure: Orthogonal and Oblique Rotations

The criteria advanced by Thurstone (1947) have been influential in producing computationally feasible methods that deliver simple structures:
· There should be at least one zero in each row of the factor loadings matrix.
· There should be many (at least k) zeros in each column of the factor matrix.
· For every pair of factors, only a few variables should have sizable loadings on both; some variables should load heavily on one and not at all on the other; and several variables should have near-zero loadings on both factors.

Two classes of methods have evolved that produce simple structures. The first class of methods, orthogonal rotations, maintains the uncorrelatedness of the factors, while the second class, oblique rotations, seeks simple structures with correlated factors. Since the latter relax the constraint of orthogonality of the factors, they are capable of producing even simpler structures than orthogonal rotations. Technically, rotations work as follows. In the Stata code, a k × k rotation matrix T rotates the factor loadings in Step 4 (see Section 2.1) so that the rotated factor loadings matrix, denoted \hat{A}_R, is given by

\hat{A}_R = \hat{A} T = U_k T.   (26)

If T is an orthogonal transformation matrix, the rotation preserves the orthogonality of the factor score matrix F. Otherwise the rotation is oblique, that is, the factors are correlated. Table 4.3 displays the oblique rotation matrix T that produces the simple structure that we use to proceed with our analysis and computations. In PC analysis, even after rotation the total variance explained by the factors is still the same (95.30%), but the portion accounted for by each factor is now different. As Table 4.4 shows, rotation distributes the explained variance more evenly across factors than the unrotated solution. This is the point of the simple structure: to identify a factor associated with only a few variables.
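For readers who wish to see the mechanics of an orthogonal rotation, the sketch below implements a basic varimax criterion directly in Python. It is a minimal illustration of equation (26) under the assumption that an unrotated loadings matrix is available; it is not the rotation routine used to produce Table 4.3, and it omits refinements such as Kaiser normalization.

```python
import numpy as np

def varimax(A, max_iter=100, tol=1e-8):
    """Orthogonal (varimax) rotation of a p x k loadings matrix A.

    Returns the rotated loadings A_R and the k x k rotation matrix T,
    so that A_R = A @ T, as in equation (26).
    """
    p, k = A.shape
    T = np.eye(k)
    prev = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient of the varimax criterion (Kaiser normalization omitted)
        G = A.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        crit = s.sum()
        if crit - prev < tol:
            break
        prev = crit
    return A @ T, T

# Usage on the unrotated loadings from the previous sketch:
# A_rot, T = varimax(A_hat)
```

An oblique criterion such as oblimin replaces the orthogonality constraint on T with a looser normalization, which is why it can deliver a structure at least as simple as the varimax solution.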
A graphical analysis makes the connection between rotation and simple structure clear. Figure A1 (see appendix) plots the unrotated factor loadings for k = 6 factors. A row of the factor loadings matrix, indicating how the corresponding variable loads on each of the 6 factors, is depicted as a point in the 6-dimensional space of factors. The projections of these points onto the 15 possible two-dimensional subspaces are displayed in the panels of Figure A1. Consider the top row of five graphs, in which the y-axis measures the loadings of the 12 variables on the first principal component (C1). While the C1 vs. C2 graph shows some evidence of clustering, the structure of the loadings is not very simple as we move across the row of graphs. Now compare the same row in Figure A1 with the first row in Figure A2, in which the axes have been rotated orthogonally (that is, rotating the axes while keeping the origin at the same point and maintaining the angle between the axes at 90 degrees) in order to achieve a simpler structure. This type of rotation is known as the "varimax" rotation. There is a clear separation of the loadings into two y-clusters: one set of variables (patapp, techjour, tel) projects into high y-values, that is, high C1 loadings, and the other into low y-values. This type of simple structure is in evidence not only for loadings on C1 but also on C4 (tertiary and secondary enrollment), C5 (adult literacy), and C6 (tariffs & ntbs). While the C2 and C3 rows indicate the presence of clusters with high loadings (regulation quality and law load on C2, and computers and net users load on C3), the structure of loadings on these two components is not as simple as Thurstone's ideal. Regardless, the varimax rotation has made the structure of loadings much simpler.

Can an oblique rotation that relaxes the constraint of uncorrelatedness of the principal components (i.e. the 90-degree angle between the axes) achieve an even simpler structure? Figure A3 depicts the result of an oblique rotation (known as the "Oblimin" rotation) (fn 11). There is no visible difference between the orthogonal rotation results in Figure A2 and the oblique rotation loadings in Figure A3; a comparison of the two figures shows that the difference in the loadings is almost negligible. In other words, in principal components analysis the orthogonal rotation considerably simplifies the structure of loadings, and the oblique rotation reproduces it but does not simplify it further. However, in the true factor analysis below (using ML) an oblique rotation produces a significant improvement over the orthogonal rotation. We therefore adopt the oblique rotation results for computing factor scores.

Footnote 11: Although in the figure the axes appear perpendicular to each other, they are not; that is done merely for convenience. Correlation between two components implies that the angle is less than 90 degrees. What is important to us, however, is the projection of the points onto the axes.

Outside of economics and political science, in psychometrics for example, researchers conducting exploratory factor analysis have generally assumed orthogonal factors (fn 12). In economics and political science, however, there is every reason to believe that factors should be correlated. Multiple regression is prevalent in economics and political science precisely because non-experimental data are correlated. In order to satisfy the ceteris paribus assumption, considerable care is taken to include appropriate control variables.
We should embrace the idea that political-economic data are correlated when such data are determined in general equilibrium. It is therefore almost impossible for the data to be orthogonal. Within a data class there may be strong interdependencies, while across data classes these interdependencies may be weak. In that case, the assumption of partial equilibrium for each data class may be justified. In Chen and Dahlman (2004) this assumption leads the authors to think of their data classes as "pillars". Here, we let the data decide how to form groups. There are two related messages here. The first is that the main objective of the factor analysis is to identify the underlying dimensions that the observed data purport to measure. Second, and related to this objective, there is no theoretical reason why the underlying factors should be uncorrelated. The underlying dimensions are determined by the same general-equilibrium mechanism that generates the measures of these factors (i.e. the variables). Theoretically, factors should be correlated (fn 13).

Footnote 12: Traditionally, a clear distinction has been made between confirmatory factor analysis (CFA) and exploratory factor analysis (EFA). In CFA, if theory suggests two factors are correlated, then an oblique rotation is justified. In EFA, there is neither a theoretical basis for knowing how many factors there are nor for knowing whether they are correlated. In economics and political science, we argue, CFA will generally indicate correlated factors due to the interdependencies of the general equilibrium under which the data are generated.

Footnote 13: It is reasonable that correlations among factors should be weaker than correlations among the variables measuring any single factor (else the two factors should be combined into one).

Political-Economic Dimensions of the Data: Naming the Factors

What names are appropriate for the principal components? Table 4.5 shows that the variables researchers, technical journals, and patent applications load heavily on the first principal component (RC1). Therefore, this component is named the Innovation Potential factor, since the ability of an economy to innovate is appropriately measured by these important inputs. Since law and the quality of regulations load heavily on RC2, we call it the Law and Regulation factor. RC3 is named the ICT factor since computers, net users and telephone lines load heavily on it. RC4 is the Education factor since secondary and tertiary enrollments load heavily on it. RC5 is named the Literacy factor after the single variable, adult literacy. The final factor, RC6, is the Openness factor because tariffs and NTBs load almost entirely on it. Of note is the fact that the unexplained variance (last column), after accounting for the six principal components, is quite small for every variable. This indicates that the factor model with six components fits the data well at the individual variable level (thereby also satisfying criterion (ii) in fn 10).

Computing Factor Scores

The scores on any factor indicate how much of the factor is "contained" in a particular country. We use the oblique-rotated factor loadings as the basis for our factor score computations. As indicated in (26), the unrotated principal components (eigenvectors) \hat{A} are transformed into the rotated components \hat{A}_R as \hat{A}_R = \hat{A}T = U_k T (Table 4.3 displays the matrix T). In order to estimate the factor scores, we use the direct method (as distinct from the regression method; see Reyment and Joreskog, pp. 223-225) (fn 14).

Footnote 14: This is different from the method used to compute the scoring matrix in the ML method below.

In this method, the
In order to estimate the factor scores, we use the direct method (as different from the regression method, see Reyment and Joreskog, pp 223-225).14 In this method, the 13It is reasonable that correlations among factors should be weaker than correlations among variables measuring any single factor (else the two factor should be combined into one). 14This is different from the method used to compute the scoring matrix in the ML method below. 23 factor scores F are computed as ^ F = Z[(ARAR)- AR] , ^ ^ ^ 1 ^ (27) where F is the (n × k) matrix containing factor scores on each factor for the 120 countries, ^ and Z is the (n × p) matrix containing the standardized data variables. The (p × k) scor- ^ ing coefficient matrix in Table 4.6 (produced by Stata) is the transpose of the coefficients (ARAR)- AR. However, before computing factor scores using (16), an important step is ^ ^ 1 ^ required to avoid another pitfall. Avoiding a Pitfall (and trade-offs involved) The scoring coefficients in Table 4.6 show that a few coefficients, indicated in bold, should dominate the measurement of the factor scores. In practice, however, other elements of the matrix can and do influence the computation of the factor scores, with unexpected consequences. Consider the first factor, the ICT factor, in Table 4.6. It consists of three large positive scoring coefficients (computers, internet users and telephones) and nine small coefficients, some of which are negative. These negative coefficients can actually produce contrarian factor scores. Take Angola, for example. Applying the direct method in (16) pro- duces an ICT factor score for Angola that ranks it 70th among the 120 countries. However, when the countries in the sample are ranked individually according to the three variables that measure ICT, Angola ranks near the bottom of the list, below 115th, in all three rank- ings. The reason why its rank on the ICT factor score is much higher than its rank on any of the three variables is because some negative coefficients multiply into (large) negative values of the corresponding standardized variables to create positive numbers. The point is that, although the small coefficients appear innocuous, using them in a formulaic manner can lead to mismeasuring factor scores, sometimes quite poorly. Because accurately measuring the factors scores is critically important, we take care to 24 produce the simplest structure possible.15 The example above indicates that despite those efforts, the structure is still not as simple as Thurstone's ideal. Had that ideal been achieved by the oblique rotation, it would also have produced accurate factor scores. In order to overcome the pitfall, exemplified by the Angola case, we propose to keep only the leading scoring coefficients in each column while computing factor scores, and to set the remaining coefficients to zero. This scoring matrix with the embedded zeros is presented in Table 4.7. For example, in the first column we retain the first three elements of the scoring coefficient matrix in Table 4.6 that correspond to the main loadings on this factor. The remaining elements are set to zero. This approximation may be formally tested using Anderson's eigenvector test (Reyment and Joreskog, 1993, p. 101). 
In order to test whether a specific vector b is equal to the eigenvector ai associated with the eigenvalue i of a matrix S (ai is the ith principal component of the data correlation matrix), the Anderson test statistic is: 2eig = (N - 1) ib S- b + 1 1 (28) i b Sb - 2 The statistic is distributed as 2 with p - 1 degrees of freedom, where p is the number of elements of the eigenvector (here p = 12). Inserting the eigenvector a in place of b in (17) results in a value of 2eig = 0. We will use this property to adapt (17) to test for the rotated factor loadings AR which was created by transforming A using the rotation matrix T , AR = AT. Even though the columns of AR are not eigenvectors, since the columns of ART- are the original eigenvectors (17) can equivalently be written as a test statistic for 1 the equality of the rotated vector corresponding to eigenvector ai, aRi, with a specific vector bR as: 2Reig = (N - 1) iT- bRS- bRT- + 1 1 1 1 1 1 i i (29) i T- bRSbRT- - 2 , i i 15The ordinal measure is a unit-free standardized factor score, relative to the median country whose score is zero. 25 where T- is the column of the T- that corresponds to eigenvector being tested for equality 1 1 i with the specific vector bR. We use (18) to test for the equality of each component (column) of the rotated factor loading matrix AR ­ the simplest possible principal component structure ^ given the data ­ with the corresponding column of the rotated factor loading matrix with embedded zeros AR ­ our "ideal" Thurstonian structure given the data.16 The statistic ^ 0 re-rotates the components back into the original unrotated eigenvector space and compares whether the simpler structure can map back closely to the unrotated loadings (which is the basis for the test statistic). Thus, the computed statistics correspond to the order of the unrotated eigenvectors. For the six columns we get the calculated chi-squared statistics, with 11 degrees of freedom, to be: 304.8, 54.6, 2.17, 35.0, 28.8, and 7.60. The critical value of 24.7 rejects equality of four of the six principal components (i.e. columns of AR and AR ). ^ ^ 0 The trade-off that this result forces is between (a) accepting the results of the test and using the full scoring matrix to compute (sometimes unreliably) the factor scores, and (b) to pro- ceed using a scoring matrix with zeros replacing the elements for which the corresponding loadings are small. The latter option is the one we choose for two reasons. First, we believe that while imperfect, as shown by the formal test procedure, it is a good approximation because the simple structure has delivered a clear picture of the variables that are strong measures of each factor. They come close to approximating the Thurstone ideal, and replac- ing the small loadings with zeros accomplishes that ideal. Second, and more important, is the overwhelming need to have consistent estimates of factor scores. As the Angola example drives home, the scores on a factor must be consistent with the underlying rankings of those variables that overwhelmingly determine the characteristics of the factors. For these reasons, we proceed with the use of the zero-embedded scoring coefficient matrix 16This method may not be used with the ML method below because the ML method's loadings matrix is not the matrix of eigenvectors, so there is no correspondence between the loadings matrix and the eigenvalues. 26 in Table 4.7 to determine the scores on the six factors. 
The same matrix is used to compute the 1995 scores and the 2002 scores on the six factors. so that the scores across the two periods may be compared. These factor scores will be used to depict two important features of the sample. First, the scores allow us to rank each country according to the values of each factor, for any specific period. Second, the scores from the two periods indicate how a country's relative position on a factor has changed over that time. Before performing these comparisons, we undertake a different kind of factor analysis, estimated by maximum likelihood. We will then be able to draw on a richer and robust set of results when we inspect country rankings and how they have changed. 4.2 True Factor Analysis with Maximum Likelihood The salience of many of the issues discussed while analyzing the PC results ­ rotation, simple structures, correlation of factors ­ are relevant for true factor analysis as well. Here too, our objective is to achieve the simplest structure, which for the ML estimates requires an oblique factor rotation. As was the case with PC analysis, in order to avoid inconsistent estimates and rankings in the true factor analysis, we set the factor scoring coefficients corresponding to small factor loadings to zero. The important difference from the PC analysis is that ML favors fewer factors. The communalities are reasonably high as indicated by the fairly low (below 0.35) uniqueness in the variables. With the focus now on intercorrelations rather than variances (see (14)), the appropriate measure of fit used to assess how many factors best fit the data, is no longer the amount of total variance explained by the factors as in PC analysis. Three measures are appropriate here, a chi-squared measure of fit, denotes 2fit, and two information-based criteria - the Bayes information criterion (BIC), and Akaike's information criterion (AIC). The first column Table 5.1 indicates the number of factors. The next five columns are related to the chi-squared fit 27 statistic corresponding to the number of factors in the first column. 2fit is distributed with 0.5[(p - k)2 - (p + k)] degrees of freedom. It is used to test the hypothesis that k or less factors are required to rationalize the data. At the 1% level of significance the calculated statistic rejects k = 1 and k = 2 but fails to reject k = 3. The smallest k is therefore three according to this measure of fit. Another use of this statistic is to see if the difference in the statistic with every increase in k is "statistically significant". Thus, going from k = 5 to k = 6 is the first increase (starting from k = 1) for which the change in the statistic is not significant. According to this variant, k = 5. The two information criteria reward parsimony and penalize over-parameterization, with the BIC penalizing over-parameterization more strictly. The smaller the BIC and AIC, the more preferred the model. The BIC chooses k = 4 while the AIC chooses k = 5. Thus, the statistical tests conclude that we should focus our attention on no more than five and no less than four factors. We estimated the model with both four and five factors. Upon examining the simplest loading structure we found the four factor model to have cleaner political- economic content since the fifth factor is not distinct in the sense of clearly generating even one of the variables. That is, it consists of many small undistinguished loadings that are collectively significant but not individually so. Thus, we proceed with k = 4. 
Table 5.2 reports the oblique-rotated factor loading matrix with four factors (the rotation matrix T is reported in Table 5.3). The oblique rotation improves upon the orthogonal (varimax) rotation and produces a simple structure. The first factor is named the ICT factor because the variables computers, internet users and telephones load heavily on this factor. Further, these three variables do not load heavily on any of the remaining factors, thus satisfying an important simple-structure requirement. The second factor is named the Law, Regulation and Openness Factor because the three variables law, quality of regulation, and tariff and non-tariff barriers load heavily on this factor. These variables also do not load heavily on the other factors. Factor 3 is named the Literacy and Education Factor because the three variables adult literacy, secondary enrollment and tertiary enrollment load heavily on this factor (and not on any other). Finally, factor four is named the Innovations Factor because the number of patent applications, the number of researchers, and the number of articles in technical journals load heavily on this factor. Thus, Table 5.2 indicates a clear and simple structure of factors. These four factors define the underlying dimensions in the data, which are measured by the observed variables. That is, computers, internet users and telephones are essentially different measures of the ICT dimension, and adult literacy, secondary enrollment and tertiary enrollment are different measures of the Literacy and Education dimension.

An attractive feature of the four factors is that they account for the communalities in the variables quite well. The residual variances are small, as indicated by the last column of Table 5.2. None of the variables has a large measure of "uniqueness". If one of the variables did, it would mean that the error variance from a regression of that variable on the factors would be large. As a rule of thumb, a uniqueness measure greater than 0.50 for a variable would indicate the presence of a unique factor, uncorrelated with the four common factors. Fortunately, the four factors rationalize our data well. Finally, just as for the PC analysis, in order to compute factor scores we use the Thurstonian scoring coefficient matrix in Table 5.5, obtained by replacing the undistinguished elements of the full scoring coefficient matrix in Table 5.4 with zero and retaining the significant loadings in each column.

4.3 Weighted Data

In addition to the data set analyzed above, it is instructive to analyze a data set in which variables that increase with the size of the country are scaled down. Thus, we also factor-analyze a "weighted" data set which differs from the "unweighted" data (analyzed thus far) with regard to three variables: patent applications, researchers in R&D, and scientific and technical journal articles. In the "weighted" data these three variables are scaled by population, while the remaining nine variables are exactly the same as in the "unweighted" data. The scaling of these three variables does influence the optimal number of principal components required to rationalize the data. For brevity, we refer the reader to Chen and Gawande (2006) for details such as the factor loading matrices for the "weighted" data. The methods for estimating those matrices and then using them to estimate the factor scores are exactly the same as described in Section 3.
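To make the common-scoring logic of Sections 3.3, 4.1 and 4.2 concrete, the sketch below zeroes the small scoring coefficients and applies the single resulting scoring matrix to the standardized 1995 and 2002 variables. The array names, the index lists, and the use of a common mean and standard deviation for standardizing both periods are assumptions made for illustration; this is not the paper's code.

```python
import numpy as np

def zero_embed(scoring, keep):
    """Keep only the leading coefficients in each factor's scoring column.

    `scoring` is p x k; `keep` lists, for each factor, the row indices of
    the variables that load heavily on it (for example, computers, internet
    users and telephones for the ICT factor). All other entries are zeroed,
    in the spirit of Tables 4.7 and 5.5.
    """
    S0 = np.zeros_like(scoring)
    for j, rows in enumerate(keep):
        S0[rows, j] = scoring[rows, j]
    return S0

def period_scores(X, mean, sd, scoring0):
    """Standardize one period's data with a common mean/sd (an assumption
    here), then apply the common zero-embedded scoring matrix."""
    Z = (X - mean) / sd
    return Z @ scoring0

# Hypothetical usage: `scoring` from the amalgamated-variable analysis,
# `keep` from the retained entries of the scoring table, X95 and X02 the
# raw 1995 and 2002 data matrices.
# scoring0 = zero_embed(scoring, keep)
# F95 = period_scores(X95, mean, sd, scoring0)
# F02 = period_scores(X02, mean, sd, scoring0)
```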
The main differences between the two data sets are that, in the "weighted" data, the optimal number of ML factors (according to the Bayes information criterion) is three, one fewer than for the "unweighted" data, while the optimal number of principal components is seven, one more than for the "unweighted" data. In the graphical analysis of the factor scores and rankings below, we differentiate the findings from the "unweighted" and "weighted" data sets.

5. Analysis of the Factor Output

The main objective of the factor score computations is to use them to describe how countries rank on the basis of these factors, and how those rankings have changed over the two periods. The authors' working paper contains a more complete analysis for 20 underdeveloped, developing, emerging, oil-rich and industrialized economies. Here we discuss these results for five countries.

Figure 1, for Albania, has four panels. The panel on the top left depicts Albania's rank vis-a-vis the other 120 countries in the sample on each of the six principal components. The spider chart on the top right depicts Albania's rank on the four ML factors. The bottom-row panels contain the weighted-data counterparts to the top row; there are seven principal components and three ML factors in these data. The green line inside each spider chart shows how Albania ranked on each principal component or ML factor in 1995. The red line shows Albania's ranking in the most recent period, around 2002. If, along any factor axis, the red line is closer to the center than the green line, then Albania's position relative to other countries in the sample on that factor has worsened over the decade. This unpleasant and surprising finding applies to the Literacy factor, the ICT factor, and the Education factor. Albania's ranking on the Literacy factor dropped from near the top 25th percentile to the bottom 35th percentile over this decade. Similar deteriorations are in evidence for the ICT factor and the Education factor. Whether this decline in rankings implies that Albania degraded in absolute terms on the factor score, or whether it improved but at a far slower pace than other countries, is not obvious from the graphs. However, since we have used a common factor scoring matrix for computing factor scores for the two periods, the scores can be put to use in cardinal comparisons as well. The ML factors, although fewer in number, convey the same difficult message about the change in Albania's ranking on the ICT factor and the Literacy & Education factor.
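As an illustration of how such panels can be read, the following is a minimal sketch, not the paper's plotting code, of a spider chart of the kind shown in Figure 1. The factor names, the rank values, and the use of matplotlib are illustrative assumptions; ranks are expressed as percentiles so that a line closer to the center means a worse relative position, matching the convention described above.

```python
import numpy as np
import matplotlib.pyplot as plt

factors = ["ICT", "Law/Reg/Openness", "Literacy & Education", "Innovation"]
rank_1995 = [40, 55, 72, 30]       # illustrative percentile ranks only
rank_2002 = [35, 50, 45, 33]

angles = np.linspace(0, 2 * np.pi, len(factors), endpoint=False).tolist()
angles += angles[:1]                # close the polygon

fig, ax = plt.subplots(subplot_kw={"polar": True})
for ranks, color, label in [(rank_1995, "green", "1995"), (rank_2002, "red", "2002")]:
    vals = list(ranks) + [ranks[0]]
    ax.plot(angles, vals, color=color, label=label)
    ax.fill(angles, vals, color=color, alpha=0.1)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(factors)
ax.set_ylim(0, 100)                 # closer to the center = lower percentile rank
ax.legend(loc="upper right")
plt.show()
```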
Angola ranks towards the bottom of the list of 120 countries in almost all dimensions, whether measured by principal components or maximum likelihood. It ranks abysmally in literacy, law, education, and innovation potential. The "unweighted" data may stack the odds against small countries like Angola, since the variables patent applications, number of researchers, and technical journal output are unscaled by population. The "weighted" data do indicate hope for Angola. Its rank in terms of its (scaled) patent applications is closer to the median. Its ranking on internet users and (scaled) number of researchers has also increased over the ten-year period, indicating that the country is taking steps to keep up with the technological changes in the world.

One reason for separately analyzing the "weighted" and "unweighted" data sets is the belief that there is a scale effect in the sheer numbers. That is, there may be threshold effects in innovation potential based on the stock of intellectual and R&D capital, as measured by technical journal output, the number of patent applications, and the number of researchers. This is the sense in which the "unweighted" data differ from the "weighted" data. In addition to the obvious examples of the US, Japan and Western European countries, India and China have also demonstrated such threshold effects. On the other hand, scaling these variables by population indicates the extent to which the full technical potential of the population is being tapped. High levels of these scaled measures are also indicators of innovation potential, as countries like Finland and Iceland have demonstrated in the last decade. So while there is no compelling reason that sheer numbers should be more or less important than the proportion of the population involved in technical pursuits, it is clear that both contribute to the potential to innovate.

Argentina has, as one might expect after a major currency and banking crisis, degraded along many dimensions. In the "unweighted" data, it has fallen to the bottom quartile on the law dimension, as well as in openness. Rising inequality due to the recession is probably responsible for the degradation on the law dimension. The devaluation was probably not enough to make its exports competitive and therefore, while the rest of the world has cut back on trade barriers, Argentina has maintained or increased them. The four-dimensional ML factors show a stark picture on the law and openness dimensions. Surprisingly, Argentina has not lost its ranking in the other three dimensions. Its literacy ranking has actually increased, on innovation potential it has kept pace, and on the ICT dimension it has maintained its position. The "weighted" data reiterate the same messages as the "unweighted" data.

Brazil has made gains and presents a contrasting picture to Argentina on at least the law dimension. While its high income inequality is probably responsible for placing Brazil in the bottom half of the sample on the law and regulation dimension, the country has improved on this dimension during the last decade. In the four-dimensional ML graph, the red line contains the green line, indicating that over this ten-year period Brazil has improved its ranking on each dimension. The principal components show that its rank on the openness dimension has fallen, which probably has to do with Mercosur (Argentina and Brazil shared similar rankings on openness in 1995), or is a result of keeping trade barriers at fixed levels while the rest of the world has liberalized. The ML graph indicates impressive gains in literacy and education in Brazil. It is probably a good bet that this trend will also lead to an increase in Brazil's ranking on the law dimension in future years (recall that the factors are correlated). The "weighted" data paint a similar picture.
China, being a populous country, will obviously show different rankings on the "weighted" versus the "unweighted" dimensions. We should be cautious about interpreting the meaning of the innovation-potential factors in the "unweighted" versus the "weighted" data. In the unscaled data China ranks high on the innovation-potential list because of the sheer strength of its size. The "weighted" data present quite a contrast along the dimension measured by researchers and technical journal articles. In other words, while China has a critical mass in innovation potential (which may be the reason it attracts foreign direct investment), it still has a long way to go in achieving its full potential on innovation as measured by the scaled data. If it produced patents, researchers and technical journal articles at the same per capita rate as the more advanced countries, China would probably be an OECD country. Such trends are already in evidence. Along each of these dimensions in the "weighted" data, China is already at the median of the sample and has made strides to move ahead, especially in patent applications. On other dimensions, literacy has not improved greatly; however, China's ranking on the ICT factor has leaped from the bottom quartile to close to the median of the sample.

6. Conclusion

We factor-analyze the Knowledge Assessment Methodology (KAM) data. The KAM database was developed at the World Bank to assess countries' readiness for the knowledge economy, and it potentially draws the attention of policymakers to specific areas deserving of attention and future investment. We factor-analyze the KAM data in order to reduce its many variables to their essential dimensions or factors. Our main objective in undertaking the factor analysis is to quantify the factors for each country, that is, to compute factor scores on each factor. To this end, the paper treats three issues in the factor analysis of the KAM data in detail: whether the KAM data should be factor-analyzed, the optimal dimensionality of the data, and giving political-economic meaning to the factors.

We find that the KAM data are not just amenable to factor analysis but benefit greatly from it. There are enough intercorrelations among the variables that the real information in the data can be distilled down to a smaller number of dimensions. We use two factor-analytic methods: Principal Components (PC) analysis and "true" factor analysis, which we estimate using maximum likelihood (ML). While PC analysis focuses on explaining the variance in the data, the ML method seeks to explain the intercorrelations in the data. We should therefore expect the two methods to produce different results. While the results do differ (PC analysis requires many more dimensions to rationalize the data than ML analysis), there are common themes. A contribution of the paper is identifying the political-economic dimensions in the KAM data and measuring them for (ordinal) comparisons over time. We embrace the idea of a simple structure of the dimensions and allow these dimensions to be correlated with each other. The output from the factor analysis is used to graphically analyze how countries have changed their rankings on the underlying dimensions over the 1995-2002 period.

References

Anderson, T. W., 1984. An Introduction to Multivariate Statistical Analysis. New York, NY: Wiley.

Bohara, A. K., A. I. Camargo, T. Grijalva, and K. Gawande, 2005. "Fundamental Dimensions Underlying the Regulation of U.S. Trade." Journal of International Economics 65(1): 93-125.

Bollen, K. A., 1989. Structural Equations with Latent Variables. New York, NY: Wiley.

Chen, D. H. C., and C. J. Dahlman, 2005. "The Knowledge Economy, the KAM Methodology, and World Bank Operations." Manuscript.

Chen, D. H. C., and C. J. Dahlman, 2004. "Knowledge and Development: A Cross-Section Approach." World Bank Policy Research Working Paper #3366.

Goldberger, A., 1972. "Maximum Likelihood Estimation of Regressions Containing Unobservable Independent Variables." International Economic Review 13: 1-15.
Joreskog, K. G., and D. Sorbom, 1996. LISREL 8: User's Reference Guide. Chicago, IL: Scientific Software International, Inc.

Joreskog, K. G., and D. Sorbom, 1979. Advances in Factor Analysis and Structural Equation Models. Cambridge, MA: Abt Books.

Joreskog, K. G., 1967. "A General Approach to Confirmatory Maximum Likelihood Factor Analysis." Psychometrika 34: 183-202.

Kaufmann, D., A. Kraay, and M. Mastruzzi, 1999. "Aggregating Governance Indicators." World Bank Policy Research Working Paper #2195.

Kaufmann, D., A. Kraay, and M. Mastruzzi, 2004. "Governance Matters III: Governance Indicators for 1996, 1998, 2000, and 2002." World Bank Economic Review 18: 253-287.

Lawley, D. N., and A. E. Maxwell, 1971. Factor Analysis as a Statistical Method. New York, NY: American Elsevier.

Reyment, R., and K. G. Joreskog, 1993. Applied Factor Analysis in the Natural Sciences. Cambridge, UK: Cambridge University Press.

Rubin, D. B., and D. T. Thayer, 1982. "EM Algorithms for ML Factor Analysis." Psychometrika 47(1), March 1982.

Theil, H., 1971. Principles of Econometrics. New York, NY: John Wiley.