WPS6480
Policy Research Working Paper 6480
PPML Estimation of Dynamic Discrete
Choice Models with Aggregate Shocks
Erhan Artuc
The World Bank
Development Research Group
Trade and International Integration Team
June 2013
This paper is a product of the Trade and Integration Team, Development Research Group. It is part of a larger effort by
the World Bank to provide open access to its research and make a contribution to development policy discussions around
the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be
contacted at eartuc@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
PPML Estimation of Dynamic Discrete Choice Models
with Aggregate Shocks
Erhan Artuc*
May, 2013
Abstract
This paper introduces a computationally efficient method for estimating structural
parameters of dynamic discrete choice models with large choice sets. The method is
based on Poisson pseudo maximum likelihood (PPML) regression, which is widely used
in the international trade and migration literature to estimate the gravity equation.
Unlike most of the existing methods in the literature, it does not require strong parametric
assumptions on agents' expectations, thus it can accommodate macroeconomic
and policy shocks. The regression requires count data as opposed to choice probabilities;
therefore it can handle sparse decision transition matrices caused by small sample
sizes. As an example application, the paper estimates sectoral worker mobility in the
United States.
Keywords: Poisson Pseudo Maximum Likelihood, Labor Mobility, Migration, Discrete
Choice Models, Gravity Equation.
JEL Codes: C25, F22, J61, J62, F16.
* The views in this paper are the author's and not those of the World Bank Group or any other institution.
Artuc: World Bank, Trade and International Integration Unit, Development Economics Research Group
(Economic Policy), 1818 H Street, NW Washington DC, 20433 USA; eartuc@worldbank.org. I thank Jim
Anderson, Chad Bown, Irene Brambilla, David Kaplan, Hiau Looi Kee, John Kennan, John McLaren, Caglar
Ozden, Daniel Lederman, Guido Porto, Ray Robertson, Diego Rojas and Yoto Yotov for their comments.
All errors and omissions are mine.
Poisson pseudo maximum likelihood regression (henceforth PPML) has become very popular in the
international trade and migration literature within the context of the gravity equation.
It was introduced by Gourieroux, Monfort and Trognon (1984). More recently, Santos
Silva and Tenreyro (2006) showed that it is a simple but powerful method for estimating
bilateral resistance parameters of the gravity equation. After these two seminal papers, it has
become one of the standard tools in the international economics literature, widely used to
explain trade and, more recently, migration flows¹. This paper extends this popular method
further and shows how it can be used to estimate structural dynamic discrete choice models
by adding a linear reduced form regression step. This novel method can handle models
with large choice sets, heterogeneity, and aggregate shocks. Our approach is an intuitive
combination of well known and widely used methods; it therefore imposes little set-up cost
on the econometrician and can utilize standard statistical software.
The method has two steps: First, we run PPML regression using discrete choice data,
similar to the gravity equation estimation, to estimate expected values that appear in the
Bellman equations. Second, we construct a linear regression equation by plugging the estimated
expected values into the Bellman equation that characterizes the dynamic decision
making process of agents. In the second step, we estimate distributional and utility flow
parameters of the discrete choice model. Since we estimate expected values rather than calculating
value functions by iteration or backward solution, expectations of agents are fully
accounted for even when they are not quantifiable by the econometrician. Both regressions
are based on orthogonality conditions, rather than maximum likelihood, therefore the distributions
of payoff streams or aggregate shocks are not required for the estimation. The
estimated system does not need to be at the steady state. In fact, the steady state may not
even exist in the presence of macroeconomic and policy shocks. The orthogonality conditions
we use have analytical derivatives, therefore the method we use is much faster than maximum
likelihood based methods and it allows us to estimate a large number of parameters.
¹ Two recent examples from the migration literature that use the gravity equation are Beine, Docquier and
Ozden (2011) and Grogger and Hanson (2011). In the trade literature, Olivero and Yotov (2011) generalize
the gravity equation in a dynamic framework but estimate trade flows at the steady state without considering a
discrete choice specification.
Accounting for aggregate shocks is an important challenge for the estimation of dynamic
discrete choice models. In the literature, the most common methods are based on maximum
likelihood estimation (henceforth ML) using backwards solution or conditional choice probabilities.
ML estimation requires strong distributional assumptions on aggregate shocks, thus
on workers' expectations about payoffs. In the literature, the most common assumption is
the absence of aggregate shocks, because it is very difficult to rigorously model transmission
of aggregate shocks into the payoff streams and workers' expectations². In contrast to ML
estimation, our method does not require distributional assumptions on aggregate shocks or
workers' expectations, except rationality. Therefore, using this novel method, international
and internal migration, sectoral labor mobility, occupational mobility, and other dynamic
discrete choice models can be estimated efficiently in the presence of macroeconomic and
policy shocks.
The two most recent papers that address similar discrete choice problems are Anderson
(2011) and Artuc, Chaudhuri and McLaren (2010). Anderson (2011) shows how the gravity
equation can be considered as an equilibrium condition for discrete choice problems, and that it
can be estimated with PPML regression. After the estimation step, he solves for the structural
push and pull parameters from the PPML regression coefficients along with multilateral
resistance parameters. Technically, our first step regression is similar to a gravity equation,
as in Anderson (2011). Unlike him, we interpret PPML regression fixed effects as
expected values in the Bellman equation that gives the optimality condition for the underlying
discrete choice model. We estimate parameters of the Bellman equation rather than
the gravitational push and pull parameters.
² For example, after the recent housing crisis of 2007, the demand for unskilled labor in the construction
sector decreased significantly. This negative shock potentially affected expected payoffs of workers in the
construction sector and welfare of low-skill immigrants in general. Consequently, the construction sector
shrank significantly and some states such as Alabama and Arizona passed strict anti-immigration laws. Any
migration or sectoral mobility model that is estimated via ML has to contain distributional assumptions
regarding such policy and macroeconomic shocks that are difficult to quantify.
Artuc, Chaudhuri and McLaren (2010) derive an equilibrium condition for workers' sectoral
choice, that is in essence an Euler equation. Similar to ours, their method also allows
aggregate shocks. They impute workers' values from observed gross flows. However, their
expected value imputation method does not allow sparsity in the transition matrices of flows
from one decision to another. This limits the number of choices, and also creates problems
in incorporating heterogeneity. Furthermore, it increases the standard errors, preventing
estimation of detailed versions of their model. Our method is free from these limitations.
The idea of imputing values from conditional choice probabilities (henceforth CCP) was
first introduced by Hotz and Miller (1993), a milestone for discrete choice models due to their
inversion theorem. They consider stationary problems and estimate choice probabilities using
a non-parametric method. Aguirregabiria and Mira (2002) introduced a new algorithm,
"Nested Pseudo Maximum Likelihood" (henceforth NPM) that combines a CCP with an
iterative step to improve efficiency of CCP in small samples³. More recently, in their seminal
paper, Arcidiacono and Miller (2011) combine CCP with the expectation-maximization
(henceforth EM) algorithm to allow certain non-stationary processes and unobserved heterogeneity.
Similar to Arcidiacono and Miller (2011), our estimation procedure can use an
EM loop to account for unobserved heterogeneity. In this paper we argue that PPML is a
convenient and efficient alternative to CCP for problems with large choice sets or a small
sample. However, it is not a substitute for EM or NPM, and it can be used in combination
with EM and NPM instead of non-parametric CCP estimation.
Our procedure has two major differences with the other non-iterative dynamic programming
methods in the literature: First, we use Poisson regression rather than the Hotz and
Miller (1993) inversion equation to estimate expected values. Second, our method does not
rely on maximum likelihood estimation but orthogonality conditions. Therefore, we do not
need distributional assumptions for aggregate shocks. Thanks to the linearity of the estimating
equations, it is well suited for problems with a large number of choices and structural
parameters. Our method is computationally efficient and can utilize standard statistical
software, widely used to estimate gravity equations in the context of migration and trade
flows⁴. Similar to the other non-iterative solution methods, the state space has to be small
compared to the backwards solution methods.
³ It is essentially an intuitive combination of Hotz and Miller (1993) and Rust (1987). See Aguirregabiria
and Mira (2010) for an extensive survey of the literature.
In the next section, we present a representative discrete choice model that can be estimated
with our method. In the following sections, we summarize our estimation strategy,
provide an example application, and present simulation results.
1 Model
Consider an economy with $L$ infinitely-lived agents and $N$ sectors, where each agent is in
a discrete state $s \in S$. Sectors can be industries, occupations, cities, countries, or any
combination of such choices, while the state could be the type of agent such as education
level, gender, age or other individual characteristics. It is also possible to consider economic
policies as a part of the state space, such as trade policy, migration policy, or education policy.
We can also incorporate unobserved types, which are omitted from this section for the sake
of clarity.
A type $s$ agent chooses a sector $i \in \{1, 2, 3, \ldots, N\}$ at the end of period $t-1$, and receives
instantaneous utility $u_t^{i,s}$ at time $t$ defined as
$$u_t^{i,s} = w_t^{i,s} + \eta^{i,s}, \tag{1}$$
where $w_t^{i,s}$ is the observed sector specific random payoff common to all type $s$ agents working
in sector $i$ with finite moments, and $\eta^{i,s}$ is the unobserved sector specific iid utility shock
also common to all type $s$ agents. Hence, the state of each agent can be summarized with
the pair $(i, s)$ where $s$ is the type and $i$ is the current sector.
⁴ PPML estimation is based on an orthogonality condition, which has an analytical derivative. This
computational convenience makes the estimation process much faster than alternatives. For example, it
converges within minutes even with hundreds of choices and many structural parameters.
We assume that only $w_t^{i,s}$ is observed by the econometrician; $\eta^{i,s}$ is known by agents,
but not by the econometrician. All agents are risk neutral, have rational expectations and
a common discount factor $\beta < 1$. The expected future payoff streams can change over time,
$E_{t+1} w_{t+n}^{i,s} \neq E_t w_{t+n}^{i,s}$ for $n \geq 1$. The present discounted choice-specific utility of agent $l$ is
equal to
$$U_t^{i,s,l} = w_t^{i,s} + \eta^{i,s} + \max_j \left\{ \beta E_t V_{t+1}^{j,s} - C_t^{ij,s} - \varepsilon_t^{j,l} \right\}, \tag{2}$$
where $C_t^{ij,s} + \varepsilon_t^{j,l}$ is the cost of choosing sector $j$ for a type $s$ agent $l$ who is currently in sector
$i$. The "moving cost" has two components: a deterministic part, $C_t^{ij,s}$, common to all type
$s$ agents, and a random part, $\varepsilon_t^{j,l}$, specific to agent $l$. All type $s$ agents are identical except
for their individual moving cost shock $\varepsilon_t^{j,l}$. We assume that $C_t^{ii,s} = 0$, which means the fixed
component of moving cost is zero for stayers.
The timing of events is as follows: 1. Agents learn the value of $w_t^{i,s}$ once they receive it.
2. Then, at the end of time $t$, they learn the random component of the "moving cost," $\varepsilon_t^{j,l}$, for
every $j = 1, \ldots, N$, and choose the next period sector (based on the expected stream of future
payoffs and moving costs). 3. Agents pay the moving cost, $C_t^{ij,s} + \varepsilon_t^{j,l}$, where $j$ is the chosen
sector. 4. Period $t+1$ starts, and the cycle repeats itself.
After taking the expectation of (2) with respect to agent specific shocks, the choice specific
value function can be expressed as
$$V_t^{i,s,l} = w_t^{i,s} + \eta^{i,s} + E_t \max_j \left\{ \beta \sum_{s' \in S} \pi(s,s')\, V_{t+1}^{j,s',l} - C_t^{ij,s} - \varepsilon_t^{j,l} \right\}, \tag{3}$$
where $\pi(s,s')$ is the probability of switching from type $s$ to type $s'$. We assume that
$\pi(s,s')$ is exogenous⁵. Henceforth, we drop the agent superscript $l$ for notational convenience.
We can rearrange the value function as
$$V_t^{i,s} = w_t^{i,s} + \eta^{i,s} + \beta \tilde{V}_{t+1}^{i,s} + E_t \max_j \left\{ \varepsilon_t^{j} + \bar{\varepsilon}_t^{ij,s} \right\},$$
where
$$\bar{\varepsilon}_t^{ij,s} = \left[ \beta \tilde{V}_{t+1}^{j,s} - \beta \tilde{V}_{t+1}^{i,s} \right] - C_t^{ij,s},$$
and
$$\tilde{V}_{t+1}^{i,s} = \sum_{s' \in S} \pi(s,s')\, E_t V_{t+1}^{i,s'}. \tag{4}$$
⁵ It is possible to endogenize this transition matrix, but this is outside the scope of this paper.
Then, the choice specific values can be written as
$$V_t^{i,s} = w_t^{i,s} + \eta^{i,s} + \beta \tilde{V}_{t+1}^{i,s} + \Omega_t^{i,s}. \tag{5}$$
The option value $\Omega_t^{i,s}$ is equal to
$$\Omega_t^{i,s} = \sum_{j=1}^{N} \int_{-\infty}^{\infty} \left( \varepsilon^{j} + \bar{\varepsilon}_t^{ij,s} \right) f(\varepsilon^{j}) \prod_{k \neq j} F\!\left( \varepsilon^{j} + \bar{\varepsilon}_t^{ij,s} - \bar{\varepsilon}_t^{ik,s} \right) d\varepsilon^{j},$$
where $F(\varepsilon)$ is the cumulative distribution function and $f(\varepsilon)$ is the probability density
function of the moving cost shocks. The option value, $\Omega_t^{i,s}$, is the extra utility generated by
being able to change sectors. As the moving cost $C_t^{ij,s}$ increases, the option value decreases, and
it diminishes to zero when the moving cost goes to infinity. The option value function is
crucial for the implementation of the estimation process since it can be solved analytically under
certain distributional assumptions.
Assume that $\varepsilon_t^{i}$ is distributed iid extreme value type I with location parameter $-\nu\gamma$, scale
parameter $\nu$, and cdf $F(\varepsilon) = \exp(-\exp(-\varepsilon/\nu - \gamma))$, where $E(\varepsilon) = 0$, $Var(\varepsilon) = \pi^2 \nu^2 / 6$
and $\gamma$ is Euler's constant.
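The moments stated above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the value of `nu`, the seed and the sample size are arbitrary illustrations and are not taken from the paper.

```python
import numpy as np

nu = 0.8                      # hypothetical scale parameter, for illustration only
gamma = np.euler_gamma        # Euler's constant

def cdf(eps, nu):
    """F(eps) = exp(-exp(-eps/nu - gamma)), the cdf assumed in the text."""
    return np.exp(-np.exp(-eps / nu - gamma))

# NumPy's Gumbel is the type I extreme value distribution with cdf
# exp(-exp(-(x - loc)/scale)); loc = -nu*gamma reproduces F above.
rng = np.random.default_rng(0)
draws = rng.gumbel(loc=-nu * gamma, scale=nu, size=200_000)

sample_mean = draws.mean()                 # should be close to 0
sample_var = draws.var()                   # should be close to pi^2 nu^2 / 6
theory_var = np.pi**2 * nu**2 / 6
```

Shifting the location by $-\nu\gamma$ is what makes the shock mean zero, so the deterministic cost $C_t^{ij,s}$ carries the whole average moving cost.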
Let $m_t^{ij,s}$ denote the fraction of type $s$ agents who switch from sector $i$ to
sector $j$. This can be interpreted as gross flows from $i$ to $j$, or the probability of choosing
$j$ conditional on $(i, s)$. The total number of agents moving from $i$ to $j$ is equal to $y_t^{ij,s} =
L_t^{i,s} m_t^{ij,s}$, where $L_t^{i,s}$ is the number of type $s$ agents who are in $i$ at time $t$. $y_t^{ij,s}$ can be
interpreted as the number of people migrating from one city to another, changing occupation,
changing industry, etc.
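Thanks to the extreme value distribution and McFadden (1973), the gross flow $m_t^{ij,s}$ is
equal to
$$m_t^{ij,s} = \frac{\exp\left[ \left( \beta \tilde{V}_{t+1}^{j,s} - \beta \tilde{V}_{t+1}^{i,s} - C_t^{ij,s} \right) \frac{1}{\nu} \right]}{\sum_{k=1}^{N} \exp\left[ \left( \beta \tilde{V}_{t+1}^{k,s} - \beta \tilde{V}_{t+1}^{i,s} - C_t^{ik,s} \right) \frac{1}{\nu} \right]}, \tag{6}$$
and we can show that the option value⁶ is equal to
$$\Omega_t^{i,s} = \nu \log \sum_{k=1}^{N} \exp\left[ \left( \beta \tilde{V}_{t+1}^{k,s} - \beta \tilde{V}_{t+1}^{i,s} - C_t^{ik,s} \right) \frac{1}{\nu} \right]. \tag{7}$$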
Note that we could use an expression similar to (7) to construct a CCP representation of
the Bellman equation, because $\Omega_t^{i,s} = -\nu \log m_t^{ii}$ and $m_t^{ij}$ is a conditional choice probability;
see Appendix C for the details. We do not use the CCP method or the Hotz-Miller inversion
equation in this paper. In fact, unlike Hotz and Miller (1993) or Artuc, Chaudhuri and McLaren
(2010), we never take the logarithm of probabilities in the estimation algorithm. Unlike
the CCP representation, we estimate expected values directly from count data, which makes
our method more convenient when estimation of probabilities or evaluation of the likelihood
function is difficult. This is usually the case when the number of choices and structural
parameters is large.
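Equations (6) and (7), and the identity $\Omega_t^{i,s} = -\nu \log m_t^{ii}$, can be illustrated for a single type and a single period with a short sketch. The function name `flows_and_option_value` and all numerical values below are hypothetical and chosen only for illustration.

```python
import numpy as np

def flows_and_option_value(V_next, C, beta, nu):
    """V_next: (N,) expected values for t+1; C: (N, N) moving costs, zero diagonal.

    Returns the gross flow matrix m (equation 6) and option values (equation 7).
    """
    # eps_bar[i, j] = beta*V_next[j] - beta*V_next[i] - C[i, j]
    eps_bar = beta * (V_next[None, :] - V_next[:, None]) - C
    z = eps_bar / nu
    # log-sum-exp over destinations, shifted by the row maximum for stability
    zmax = z.max(axis=1, keepdims=True)
    log_sum = zmax[:, 0] + np.log(np.exp(z - zmax).sum(axis=1))
    m = np.exp(z - log_sum[:, None])
    omega = nu * log_sum
    return m, omega

V_next = np.array([10.0, 10.5, 9.8])      # hypothetical expected values
C = 4.5 * (1.0 - np.eye(3))               # hypothetical common moving cost
nu = 1.0
m, omega = flows_and_option_value(V_next, C, beta=0.97, nu=nu)
stay_identity = -nu * np.log(np.diag(m))  # equals omega, since eps_bar[i, i] = 0
```

Each row of `m` sums to one, and the stayer probability on the diagonal pins down the option value, which is why a CCP representation would need logarithms of probabilities and therefore nonzero cells.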
In the next section, we describe the estimation procedure of the generic model we present
here. (5), (6) and (7) play key roles in the estimation procedure.
2 Estimation
Our method has two stages: First, the Poisson regression stage, where we estimate expected
values associated with each choice for every time period. Second, the Bellman equation
stage, where we plug estimated expected values into a Bellman equation to construct a
linear regression and retrieve structural parameters of the model.
⁶ See Appendix B for derivation of the equations.
Stage 1: PPML Regression
In this stage, our goal is to estimate expected values $\tilde{V}_t^{i,s}$ and bilateral resistance parameters
$C_t^{ij,s}$. We construct a simple expression for flows between options, similar to the gravity
equation, which is essentially a Poisson pseudo maximum likelihood regression available in
many different types of statistical software.
The Stage 1 regression equation is
$$y_t^{ij,s} = \exp\left[ \Lambda_t^{j,s} + \Gamma_t^{i,s} + \Psi_t^{ij,s} \right] + \xi_t^{1,ij,s}, \tag{8}$$
where $y_t^{ij,s}$ is the total number of agents with state $(i, s)$ who choose $j$ (hence $y_t^{ij,s} = L_t^{i,s} m_t^{ij,s}$),
$\Lambda_t^{j,s}$ is the destination fixed effect, $\Gamma_t^{i,s}$ is the origin fixed effect, and $\Psi_t^{ij,s}$ is the bilateral
resistance term. The equation above can be interpreted as a Poisson pseudo-maximum
likelihood regression⁷.
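A regression of the form (8) can be sketched for one period and one type with a hand-written PPML routine. This is a minimal illustration, not the paper's implementation: it uses iteratively reweighted least squares, a single resistance coefficient on a mover dummy (as in the example application later in the paper), and noiseless expected flows so the true coefficients are recovered exactly. The names `build_design` and `ppml` and all parameter values are hypothetical.

```python
import numpy as np

def build_design(N):
    """One row per (origin i, destination j) pair, in row-major order."""
    rows = []
    for i in range(N):
        for j in range(N):
            origin = np.eye(N)[i]        # origin fixed effects Gamma^i
            dest = np.eye(N)[j][1:]      # destination effects Lambda^j, first one dropped
            mover = [float(i != j)]      # multiplies the resistance coefficient Psi
            rows.append(np.concatenate([origin, dest, mover]))
    return np.array(rows)

def ppml(X, y, tol=1e-10, max_iter=200):
    """Poisson pseudo-ML via iteratively reweighted least squares."""
    mu = (y + y.mean()) / 2.0            # strictly positive starting values
    eta = np.log(mu)
    coef = np.zeros(X.shape[1])
    for _ in range(max_iter):
        z = eta + (y - mu) / mu          # working response
        XtW = X.T * mu                   # weights equal the fitted means
        new_coef = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ new_coef
        mu = np.exp(eta)
        converged = np.max(np.abs(new_coef - coef)) < tol
        coef = new_coef
        if converged:
            break
    return coef

N = 4
true_coef = np.concatenate([
    [6.9, 7.2, 6.5, 7.0],    # hypothetical origin effects
    [0.3, -0.2, 0.1],        # hypothetical destination effects, sectors 2..N
    [-4.5],                  # hypothetical Psi = -c/nu
])
X = build_design(N)
y = np.exp(X @ true_coef)    # noiseless expected flows; real data would be counts
est = ppml(X, y)
```

Because the estimator works on the counts $y$ directly, cells with zero flows pose no difficulty; in practice one would typically use a packaged Poisson regression routine with robust standard errors instead of this hand-written loop.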
Derivation of the Stage 1 regression equation:
If we multiply (6) by $L_t^{i,s}$, we get
$$y_t^{ij,s} = \exp\left[ \frac{\beta}{\nu} \tilde{V}_{t+1}^{j,s} - \frac{\beta}{\nu} \tilde{V}_{t+1}^{i,s} + \log L_t^{i,s} - \frac{1}{\nu} \Omega_t^{i,s} - \frac{1}{\nu} C_t^{ij,s} \right],$$
then we can arrange the terms as $i$-specific terms, $j$-specific terms and bilateral terms.
(Note that we need to drop either the destination or the origin fixed effect for one choice; otherwise the
regression matrix becomes singular. Assume that we drop the destination fixed effect for
choice 1.)
⁷ See Gourieroux, Monfort and Trognon (1984) and Cameron and Trivedi (1998) for properties of the
PPML regression.
Then the $j$-specific term or the destination fixed effect $\Lambda_t^{j,s}$ is equal to
$$\Lambda_t^{j,s} = \frac{\beta}{\nu} \tilde{V}_{t+1}^{j,s} - \frac{\beta}{\nu} \tilde{V}_{t+1}^{1,s},$$
the $i$-specific term or the origin fixed effect $\Gamma_t^{i,s}$ is equal to
$$\Gamma_t^{i,s} = -\frac{\beta}{\nu} \tilde{V}_{t+1}^{i,s} - \frac{1}{\nu} \Omega_t^{i,s} + \log(L_t^{i,s}) + \frac{\beta}{\nu} \tilde{V}_{t+1}^{1,s},$$
and the bilateral resistance term $\Psi_t^{ij,s}$ is equal to
$$\Psi_t^{ij,s} = -\frac{1}{\nu} C_t^{ij,s}.$$
Note that the option value term $\Omega_t^{i,s}$ can be expressed as
$$\frac{1}{\nu} \Omega_t^{i,s} = -\Lambda_t^{i,s} - \Gamma_t^{i,s} + \log(L_t^{i,s}). \tag{9}$$
Stage 2: Bellman Equation
In Stage 1, we have estimated the expected values, $\Lambda_t^{j,s}$, and moving cost parameters,
$\Psi_t^{ij,s}$. The next step is to estimate other parameters, including $1/\nu$. In Stage 2, the goal is to
construct the Bellman equation using the estimated parameters from Stage 1 and estimate
the remaining parameters.
The Stage 2 regression equation is
$$\phi_t^{i,s} = \zeta_t^{s} + \bar{\eta}^{i,s} + \frac{\beta}{\nu} \bar{w}_{t+1}^{i,s} + \xi_t^{i,s}, \tag{10}$$
where $\zeta_t^{s}$ is the time dummy specific to type $s$, $\bar{\eta}^{i,s}$ is the sector dummy specific to $s$,
$\bar{w}_{t+1}^{i,s}$ is the expected wage constructed using (12), $\xi_t^{i,s}$ is the regression residual and finally
$\phi_t^{i,s}$ is the dependent variable constructed from Stage 1 estimates using the equation
$$\phi_t^{i,s} = \Lambda_t^{i,s} + \sum_{s' \in S} \pi(s,s')\, \beta \left[ \Gamma_{t+1}^{i,s'} - \log(L_{t+1}^{i,s'}) \right]. \tag{11}$$
The expected wages in (10) are equal to
$$\bar{w}_{t+1}^{i,s} = \sum_{s' \in S} \pi(s,s')\, w_{t+1}^{i,s'}. \tag{12}$$
It is possible to use the Generalized Method of Moments or the Instrumental Variables method
for the regression⁸.
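Derivation of the Stage 2 regression equation:
After leading (5) by one period, multiplying it by $\beta/\nu$, aggregating it over possible states and moving all terms
to the left hand side, we get
$$E_t \left\{ \frac{\beta}{\nu} \tilde{V}_{t+1}^{i,s} - \frac{\beta}{\nu} \sum_{s' \in S} \pi(s,s') \left[ w_{t+1}^{i,s'} + \eta^{i,s'} + \beta \tilde{V}_{t+2}^{i,s'} + \Omega_{t+1}^{i,s'} \right] \right\} = 0.$$
Then, substituting the definitions of $\Lambda_t^{i,s}$ and $\Gamma_{t+1}^{i,s'}$ from Stage 1,
$$E_t \left\{ \Lambda_t^{i,s} + \frac{\beta}{\nu} \tilde{V}_{t+1}^{1,s} - \frac{\beta}{\nu} \bar{w}_{t+1}^{i,s} - \bar{\eta}^{i,s} - \sum_{s' \in S} \pi(s,s') \left[ -\beta \Gamma_{t+1}^{i,s'} + \beta \log(L_{t+1}^{i,s'}) + \frac{\beta^2}{\nu} \tilde{V}_{t+2}^{1,s'} \right] \right\} = 0, \tag{13}$$
where $\bar{w}_{t+1}^{i,s}$ is defined in (12) and
$$\bar{\eta}^{i,s} = \frac{\beta}{\nu} \sum_{s' \in S} \pi(s,s')\, \eta^{i,s'}.$$
We define
$$\zeta_t^{s} = -\frac{\beta}{\nu} \tilde{V}_{t+1}^{1,s} + \sum_{s' \in S} \pi(s,s')\, \frac{\beta^2}{\nu} \tilde{V}_{t+2}^{1,s'},$$
then we can re-arrange (13) and write it as
$$E_t \left\{ \phi_t^{i,s} - \zeta_t^{s} - \bar{\eta}^{i,s} - \frac{\beta}{\nu} \bar{w}_{t+1}^{i,s} \right\} = 0.$$
⁸ For a detailed analysis of identification problems in discrete dynamic models, see Magnac and Thesmar
(2002).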
Alternative specifications:
We focus on models that can be estimated using repeated cross-section data with retrospective
questions, such as household labor force surveys, which are available for many countries.
An example from the US is the March supplement of the Current Population Survey⁹. However,
if longitudinal data are available, it is possible to consider unobserved heterogeneity in the
model. Arcidiacono and Miller (2011) show how an EM loop can be incorporated in CCP to
estimate unobserved heterogeneity. Their intuition can also be applied to PPML regression.
Appendix D illustrates how it is possible to use an EM loop within PPML regression when
panel data are available.
Another alternative modeling approach is to use wage shocks rather than moving cost
shocks in agents' utility function. In Appendix A, we provide an equation that can be used
instead of (10) in case of such wage shocks.
In the next section we present an example to illustrate a practical application of the
method.
⁹ Other countries with such data, that we are aware of so far, are Indonesia, Mexico and Turkey.
3 Example Application: Sectoral Mobility in the US
In this section we present an application of the estimation method. First, we estimate
a disaggregated variant of Artuc, Chaudhuri and McLaren (2010) using the exact same
data, which is the Current Population Survey from years 1975 to 2001 (henceforth CPS).
Recently, Kaplan, Lederman and Robertson (2013), Artuc and McLaren (2012) and Artuc,
Bet, Brambilla and Porto (2013) used the estimation method we introduce herein.
Model
To elaborate on the generic model we presented in the previous section, consider that
sectors are industries in which workers choose to work in each time period. For each choice,
workers receive a payoff $w_t^{i}$ and an idiosyncratic utility $\eta^{i}$ common to all workers in sector $i$.
Assume that $\eta^{1} = 0$ for normalization. For simplicity, we consider one type of worker, hence
drop the state superscript $s$. We allow the deterministic moving cost to change over time
such that $C_t^{ij} = c_t$ if $i \neq j$ and $C_t^{ij} = 0$ if $i = j$.
We use two regressions to estimate structural parameters of the model. First, the Poisson
regression equation is
$$y_t^{ij} = \exp\left[ \Gamma_t^{i} + \Lambda_t^{j} + \Psi_t \mathbf{1}_{i \neq j} \right] + e_t^{ij}, \tag{14}$$
where the regression coefficient $\Psi_t = -c_t/\nu$, the indicator function $\mathbf{1}_{i \neq j}$ is equal to
one when $i \neq j$ and zero otherwise, and $e_t^{ij}$ is the residual. In many cases, the discount factor
cannot be identified, therefore we assume that it is equal to $\beta = 0.97$, and known by the
econometrician.
Second, the regression equation based on the Bellman equation is
$$\phi_t^{i} = \zeta_t + \eta^{i} + \frac{\beta}{\nu} w_{t+1}^{i} + \xi_t^{i}, \tag{15}$$
where $\xi_t^{i}$ is the residual, $\phi_t^{i} = \Lambda_t^{i} + \beta \left[ \Gamma_{t+1}^{i} - \log(L_{t+1}^{i}) \right]$ is the dependent variable, $\eta^{i}$ is a sector
dummy, and $\zeta_t$ is a time dummy. We set $\eta^{1} = 0$ for the first sector.
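The second stage can be sketched as a linear regression with time and sector dummies. The sketch below assumes the dependent variable is built as $\phi_t^{i} = \Lambda_t^{i} + \beta[\Gamma_{t+1}^{i} - \log(L_{t+1}^{i})]$ and uses ordinary least squares in place of the IV regression used in the paper, purely to keep the example short. The function names `build_phi` and `stage2_ols` and the synthetic data are hypothetical.

```python
import numpy as np

def build_phi(Lam, Gam, L, beta):
    """Lam, Gam, L: (T+1, N) arrays of Stage 1 fixed effects and labor stocks."""
    return Lam[:-1] + beta * (Gam[1:] - np.log(L[1:]))

def stage2_ols(phi, w_next):
    """Regress phi[t, i] on time dummies, sector dummies (first dropped) and w[t+1, i]."""
    T, N = phi.shape
    t_idx, i_idx = np.divmod(np.arange(T * N), N)   # matches row-major ravel()
    time_d = np.eye(T)[t_idx]
    sector_d = np.eye(N)[i_idx][:, 1:]              # eta^1 = 0 normalization
    X = np.column_stack([time_d, sector_d, w_next.ravel()])
    coef, *_ = np.linalg.lstsq(X, phi.ravel(), rcond=None)
    return {"zeta": coef[:T], "eta": coef[T:T + N - 1], "beta_over_nu": coef[-1]}

# Synthetic check with no residual: the regression should recover the inputs.
rng = np.random.default_rng(1)
T, N, beta = 10, 4, 0.97
zeta = rng.normal(size=T)                  # hypothetical time effects
eta = np.array([0.0, 0.4, -0.3, 0.2])      # hypothetical sector effects, eta^1 = 0
slope = beta / 1.0                         # beta/nu with nu = 1
w_next = rng.normal(loc=5.0, scale=0.5, size=(T, N))
phi = zeta[:, None] + eta[None, :] + slope * w_next
est = stage2_ols(phi, w_next)
inv_nu = est["beta_over_nu"] / beta        # implied 1/nu
```

With $1/\nu$ in hand, the moving cost follows from the first stage as $c_t = -\Psi_t \nu$. With real data the wage regressor is correlated with the residual, which is why the paper instruments it rather than relying on OLS.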
In the "Alternative Specification" we use a different set of sector dummies: we allow the
sector specific fixed utility to have linear time trends. This specification potentially captures
changes in employment opportunities in different sectors over time. Modifying the
assumption on fixed utility only affects the second stage regression.
The second stage regression for the "Alternative Specification" is
$$\phi_t^{i} = \zeta_t + \eta_1^{i} + \eta_2^{i} t + \frac{\beta}{\nu} w_{t+1}^{i} + \xi_t^{i}, \tag{16}$$
where $\eta_1^{i}$ is the intercept and $\eta_2^{i}$ is the trend. We set $\eta_1^{1} = 0$ and $\eta_2^{1} = 0$ for the first sector.
Data
Artuc, Chaudhuri and McLaren (2010) use CPS data for males between 25 to 64 years
old, from the year 1976 to 2001; we use the same data and sample section procedure. CPS is
a repeated cross-section, and its March supplement provides retrospective industry questions
regarding workersâ€™ industry in the previous year, along with their current industry. These
retrospective questions allow us to construct number of workers moving from industry i to j ,
ij
denoted as yt . In addition to ï¬‚ow data, we use average wage data for each industry. Artuc,
Chaudhuri and McLaren (2010) aggregate industries to 6 major sectors10 . Diï¬€erent from
them, we aggregate industries to 16 sectors. The sectors are: 1. Agriculture, 2. Mining, 3.
Construction, 4. Non-durable manufacturing, 5. Durable manufacturing, 6. Transportation,
7. Communications, 8. Utilities, 9. Wholesale trade, 10. Retail trade, 11. Finance, 12.
Business, 13. Personal services, 14. Entertainment, 15. Professional, and 16. Public.
In addition to increasing the number of choices, we consider sector specific iid utility
shocks, $\eta^{i}$, and let the deterministic part of moving cost, $c_t$, change over time. These two
changes improve their theoretical model significantly because some sectors may be
preferred by workers for non-pecuniary reasons and the moving costs may change over the
twenty-six year sample. These possibilities are now addressed in the model. We use PPML
regression in the first step and IV regression in the second step. Everything else is exactly
the same as their basic model and benchmark regression, including the choice of instruments.
¹⁰ They were not able to disaggregate sectors further because their method did not allow zero cells in
the transition matrix. They were able to estimate at most seven structural parameters.
Results
We estimate the distributional parameter $1/\nu$, 15 parameters for $\eta^{i}$, and 26 parameters for
$c_t$, thus 42 structural parameters in total. In the first stage, we estimate $c_t/\nu$, and destination
and origin fixed effects using equation (14). Then, we construct the second stage regression
equation (16) using the destination and origin fixed effects from the first stage regression. In
the second stage, we estimate the remaining structural parameters, $\eta_t^{i}/\nu$ and $1/\nu$. We use a
one year lag for the second stage IV regression.
Table 1 shows the estimation results for the basic specification. We present robust standard
errors in the first stage regression. In the first step, all coefficients are significant at the 1
percent level. We find that $C_t/\nu$ ranges between 4.49 and 4.88, with an average of 4.67.
In the second step, $1/\nu$ is estimated as 0.96 and is significant at the 1 percent level, and 9
out of 15 unobserved utility coefficients are significant at the 1 percent level and 5 coefficients
are significant at the 5 percent level.
Table 2 shows the estimation results for the alternative specification. The first stage
regression for the alternative specification is identical to the basic specification, thus the $C_t/\nu$
estimates are exactly the same. We find that $1/\nu$ is estimated as 3.67, which is much larger
than the basic specification estimate. (Note that a larger $1/\nu$ means smaller $\nu$ and $C$.)
In the following section, we simulate data for steady state and transition under policy
shocks and re-estimate the model with simulated data to illustrate the performance of our
estimation method relative to other methods in the literature.
4 Monte Carlo Simulations
Running counter-factual policy simulations is usually the main motivation for structural
estimation. Reduced form equations are subject to the Lucas critique and cannot be used
in policy simulations¹¹. Although we use estimators which are traditionally reduced form,
each coefficient in the regression equations corresponds to a structural parameter. Using the
structural parameters, it is possible to simulate the model presented in the previous sections
under different policy scenarios. However, in this paper, we are not interested in particular
effects of policies per se: Our goal is to show the performance of this new estimation method
using simulated data. We expose the system to policy shocks and illustrate robustness of the
estimation method under non-stationary conditions. In a sense, we create aggregate shocks
artificially.
For an illustration, we consider an open economy model with trade shocks, exogenous
prices and endogenous wages. To simulate trade shocks, we need to define equilibrium real
wages as functions of labor supply and prices. Assume that sectors are perfectly competitive
with simple Cobb-Douglas production functions. We assume that workers are paid their real
marginal products. Then, the following real wage equation closes the model
$$w_t^{i} = (p_t^{i} / P_t)\, a_i \tilde{A}^{i} (L_t^{i})^{a_i - 1}, \tag{17}$$
where $p_t^{i}$ is the exogenous price of sector $i$ output, $P_t$ is the consumer price index $P_t =
\prod_i (p_t^{i})^{b_i}$ with basket shares $b_i$, and $\tilde{A}^{i}$ is a constant that is calibrated from the data.
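Equation (17) is straightforward to evaluate. The sketch below computes real wages before and after a price shock to one sector; the function name `real_wages`, the three-sector setup and every parameter value are hypothetical and are not the paper's calibration.

```python
import numpy as np

def real_wages(p, L, a, b, A):
    """Equation (17): w^i = (p^i / P) * a_i * A^i * (L^i)^(a_i - 1)."""
    P = np.prod(p ** b)                      # consumer price index
    return (p / P) * a * A * L ** (a - 1.0)

a = np.array([0.60, 0.70, 0.65])             # hypothetical labor shares
b = np.array([0.20, 0.50, 0.30])             # hypothetical basket shares, sum to one
A = np.array([2.0, 1.5, 1.8])                # hypothetical productivity constants
L = np.array([5000.0, 8000.0, 7000.0])       # hypothetical labor allocations

p0 = np.ones(3)                              # prices normalized to one
w0 = real_wages(p0, L, a, b, A)

p1 = p0.copy()
p1[1] *= 0.8                                 # 20 percent price drop in the second sector
w1 = real_wages(p1, L, a, b, A)
```

Holding labor allocations fixed, the shocked sector's real wage falls while the other sectors' real wages rise through the lower price index; in the simulations the allocations then adjust over time as workers move.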
We calculate Cobb-Douglas labor shares and consumer basket shares from Bureau of
Ëœi to match average wages in given sectors.
Economic Analysis data. Then, we calibrate A
The calibration exercise is similar to Artuc, Chaudhuri and McLaren (2010). The production
function and consumer price index parameters are reported in Table 3 along with wages and
labor allocations. We normalize all prices to one at steady state, $p_t^i = 1$ for $i = 1, \ldots, 16$, and
fix the deterministic part of the moving cost to be constant^12 over time, $c_t = 4.5$. We assume
$\nu = 1$ and assign arbitrary values to $\eta^i$.

^11 As an example, consider a policy experiment of reducing the moving costs, $C_t$, by 50 percent. Assume
that we would like to know the effect of this change on workers' mobility decisions. We cannot simply change
the resistance coefficient $\Psi_t^{ij}$ and keep the other coefficients as they were, because after a change in the moving
cost the values would also change, and thus the $\Gamma_t^i$ and $\Lambda_t^i$ parameters would change as well. So, it is impossible
to use the reduced form parameters $\Gamma_t^i$ and $\Lambda_t^i$ for simulations. Because of the Lucas critique, one has to know the
underlying structural parameters.
^12 We assume that moving costs are constant over time for cosmetic reasons, so that the results are easy
to read.

For the simulations, we use a multiple shooting algorithm similar to Lipton et al. (1982),
but one can use other shooting methods instead.
We consider four simulation exercises:
In Simulation I, we simulate the model around steady state^13. Then, we estimate the
model using 26 years of simulated data.
In Simulation II, we reduce manufacturing prices by 20 percent as a surprise one-time
shock, which implies a tariff reduction in the protected manufacturing industries (sectors 4
and 5). After this one-time shock, we let the system reach the new steady state over time. Then,
we estimate the model using simulated data during this transitory period.
In Simulation III, we increase the number of years from 26 to 100 to show the asymptotic
properties of the estimation method.
In Simulation IV, we decrease the number of choices from 16 to 8 to show the impact of
having a smaller number of observations because of a smaller number of choices.
Then, we repeat all four simulation exercises 300 times. All simulations are conducted
with L = 20,000 agents, which is approximately equal to the sample size of the March CPS that
is used for the estimation in the previous section.
Table 4 presents the Monte Carlo simulation results. The column labeled "Sim I" shows
that the estimates are reasonably close to the true values and are expected to be unbiased.
The column "Sim II" shows that using data contaminated with a non-stationary trade
policy shock does not affect the performance of the method. Note that we did not specify
the nature of the aggregate shock in the estimation procedure. The method introduced
herein does not require strong distributional assumptions about the aggregate shocks. The CCP
method and other maximum likelihood based methods require the aggregate shocks to be
fully specified and to be stationary.

^13 Note that wages show some fluctuations over time; for that reason we added an iid normal shock to
equilibrium wages with standard deviation equal to 0.05, approximately equal to the standard error of
average wages in the data. We also added an unexpected surprise shock to the wages with a standard
deviation equal to 0.05.
The column "Sim III" presents the results for the longer time series with 100 years. It
suggests that the method has plausible asymptotic properties, i.e., standard errors decrease
as we increase the length of the time series, and the estimates converge to the true parameter
values.
Finally, column "Sim IV" shows that as the number of choices decreases, the standard
errors increase in both stages.
In the following tables, 5 and 6, we compare the results of PPML and CCP based estimation
strategies. We use "Simulation I" data with 20,000 agents, then we repeat the exercise with
2,000 and 4,000 agents to demonstrate the small sample properties of the estimators.
Table 5 shows estimated values of $\Lambda^i$, with standard errors in parentheses, using PPML and
CCP methods with 2,000 agents, with 4,000 agents, with 20,000 agents and with infinitely
many agents^14. We use equal weights in the non-parametric stage of CCP estimation. The
last two columns show that both PPML and CCP methods converge to the true values,
therefore they are asymptotically equivalent. With a finite number of agents, PPML estimates
are closer to the true values, especially when the sample size is small. However, we do
not argue that PPML is more efficient than CCP, because the non-parametric stage of
CCP estimation could be conducted with different weights and we cannot try all possible
weighting vectors. When the number of choices is large, PPML is more convenient than
non-parametric CCP estimation since it does not rely on taking logarithms of probabilities
that can be very close to zero.
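To give a sense of how often empirical switching probabilities are exactly zero, the sketch below draws origin-destination counts from a simple logit with 16 sectors and $C/\nu = 4.5$ and reports the share of empty cells. It is only an illustration: the function names, the equal split of agents across sectors, and the absence of wage or value differences are all assumptions, not the simulation design used for the tables.

```python
import numpy as np


def simulate_flow_counts(n_agents, n_sectors=16, cost_over_nu=4.5, seed=0):
    """Draw a matrix of origin-destination counts from a simple logit."""
    rng = np.random.default_rng(seed)
    per_sector = n_agents // n_sectors  # assumed: equal-sized sectors
    # Staying has index 0; every move has index -C/nu (values are ignored here).
    index = -cost_over_nu * (1.0 - np.eye(n_sectors))
    probs = np.exp(index)
    probs /= probs.sum(axis=1, keepdims=True)
    return np.vstack([rng.multinomial(per_sector, p) for p in probs])


def zero_cell_share(counts):
    """Share of origin-destination cells with no observed switchers."""
    return float(np.mean(counts == 0))


small = zero_cell_share(simulate_flow_counts(2_000))
large = zero_cell_share(simulate_flow_counts(20_000))
```

Cells with a zero count have an undefined log probability, so a method that takes logarithms of empirical shares must drop or adjust them, whereas a count regression can keep the zero as an observation.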
Table 6 presents the estimation results (and standard errors in parentheses) for $C/\nu$ and
$1/\nu$ using different estimation methods. When we use a maximum likelihood based method,
we assume that the econometrician knows the distribution of aggregate shocks to wages.
We also assume that the econometrician knows the true values of the $\eta$'s, because otherwise
repeating the ML estimation procedure 300 times takes an unreasonably long time.

^14 The econometrician can observe the true switching probabilities $m_t^{ij}$ when there are infinitely many agents.
But there may still be uncertainty due to the aggregate shocks to $w_t^i$.
PPML1 is the method described in the previous section, which is the two stage procedure
with PPML estimation in the first stage and linear regression in the second stage. The PPML2
method uses PPML to impute expected values within a maximum likelihood estimation algorithm.
It is similar to CCP, but rather than imputing expected values non-parametrically we
use PPML. The ACM method is the estimator used in Artuc, Chaudhuri and McLaren (2010)^15.
The CCP method is the conditional choice probability method that uses maximum likelihood
and non-parametric estimation of expected values with the Hotz-Miller inversion equation. In
the CCP and PPML2 methods, we assume that the econometrician knows the distribution
of aggregate shocks, since it is needed for the maximum likelihood estimation. Table 6 shows
that all four methods perform well with a large sample. However, the CCP method did not
converge when the sample size was small (L = 2,000). PPML based methods seem to perform
better when the sample size is small; PPML1 also has an important advantage over ML-based
methods since it does not require distributional assumptions about the aggregate shocks.
In the next Monte Carlo exercise, we shut down the aggregate shocks to wages. Without
aggregate shocks, it is straightforward to use iterative methods pioneered by Rust (1987).
Iterative estimation methods are out of the scope of this paper, but the "Nested Pseudo-Maximum
Likelihood" (NPM) method introduced by Aguirregabiria and Mira (2002) is relevant and
important. They showed that it is possible to use an iterative step to improve the performance of
CCP with small samples. Without aggregate shocks, we are able to compare the performance
of NPM with the PPML methods.
Table 7 presents results of Monte Carlo simulations with PPML1, PPML2, ACM, CCP
and NPM (standard errors are in parentheses). With all five methods, estimates converge
to the true parameter values as the sample size increases. NPM indeed improves the CCP
results for small samples; however, it is very difficult to implement when there are aggregate
shocks. CCP cannot be used as a starting point for the NPM algorithm when the sample
size is very small (when we simulated 2,000 agents the CCP method did not converge to
finite numbers). Naturally, it is possible to use PPML estimates as a starting point for the
NPM algorithm. The row labeled "PPML-NPM" shows estimates of a variation of the NPM
method that uses PPML as a starting point rather than CCP. The extra NPM loop after
the PPML regression reduces the standard errors, but it is difficult to implement when there
are aggregate shocks.

^15 Unlike Artuc, Chaudhuri and McLaren (2010), we have zero cells in the transition matrix, where $m_t^{ij} = 0$
for some i, j, t, because we have 16 choices rather than 6. Different from them, we drop the observation when
$m_t^{ij} = 0$, which makes the ACM estimator biased for this particular exercise.
5 Conclusion
We present a novel and computationally efficient method for estimating dynamic discrete
choice models with heterogeneity and time-varying resistance (i.e., moving cost) parameters.
The method performs well with a large number of choices, sparse decision transition matrices
(caused by small sample size) and aggregate shocks. All expectations of agents are
fully accounted for in the first step regression, which allows us to be agnostic about agents'
expectations and the distribution of aggregate shocks. Therefore the method can be used for
estimation out of steady state. Potential applications are migration, sectoral and occupational
labor mobility models with a large number of discrete choices, macroeconomic shocks
and limited heterogeneity.
Appendix A: An Alternative Model with Wage Shocks
The moving cost shock $\varepsilon_t^i$, in essence, is a utility shock. However, it is common in the
labor economics literature to consider wage shocks, rather than utility shocks, as the main
driving force behind labor mobility. Consider an alternative specification where $\varepsilon_{t-1}^i$ is a
wage shock that is revealed at the end of time $t-1$ but affects the observed wage at time $t$,
rather than a utility shock. Assume that the econometrician observes $\bar{w}_t^i = w_t^i + \varepsilon_{t-1}^i$, but
not $w_t^i$. Then (10) cannot be used as the basis for the regression, since observed wages are
self-selected and $\bar{w}_t^i$ is a biased measure of the true underlying sectoral wage $w_t^i$.
With wage shocks, there are not any changes in the PPML regression step, but the second
step has to be modified. The expected wage conditional on being in sector $i$ is equal to
$$E_t\left[ w_t^i + \varepsilon_{t-1}^i \mid i \right] = w_t^i - \nu \sum_{j=1}^{n} \tilde{m}_t^{ji} \log m_t^{ji},$$
where $\tilde{m}_t^{ji}$ is the ratio of agents who switch from $j$ to $i$ conditional on being in sector $i$ in
period $t$,
$$\tilde{m}_t^{ji,s} = \frac{L_t^{j,s} m_t^{ji,s}}{\sum_{k=1}^{n} L_t^{k,s} m_t^{ki,s}}.$$
Define
$$\mu_t^{i,s} = -\sum_{j=1}^{n} \tilde{m}_t^{ji,s} \log m_t^{ji,s},$$
then the wages in the second stage regression equation should be corrected using $\mu_t^{i,s}$.
Derivation of the equations is provided in Appendix B.3.
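The correction term can be computed directly from labor allocations and the flow matrix. The sketch below is a minimal illustration under one stated assumption: following the derivation in Appendix B.3, the weights are the origin shares of workers currently in a sector, while the logarithm is taken of the corresponding flow probability. The function names and the array layout (`flows[j, i]` holds the flow from origin `j` to destination `i`) are hypothetical.

```python
import numpy as np


def origin_shares(labor, flows):
    """Share of workers in destination i who came from origin j (columns sum to one)."""
    inflow = labor[:, None] * flows  # number of workers moving from j to i
    return inflow / inflow.sum(axis=0, keepdims=True)


def selection_term(labor, flows):
    """Correction term for each destination sector.

    Assumption: weights are origin shares, logs are of flow probabilities.
    Cells with zero flow carry zero weight, so their log is never used.
    """
    shares = origin_shares(labor, flows)
    safe_flows = np.where(flows > 0, flows, 1.0)  # log(1) = 0 for empty cells
    return -(shares * np.log(safe_flows)).sum(axis=0)


# Illustrative two-sector example (assumed numbers).
labor = np.array([1.0, 1.0])
flows = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
mu = selection_term(labor, flows)
```

Under this reading the expected observed wage in a sector exceeds the underlying wage by $\nu$ times the term returned by `selection_term`, so the second-stage wage would be adjusted by subtracting that product.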
Appendix B: Derivation of Key Equations
As noted in the main text, the cdf for the extreme value type I distribution with location
parameter $-\nu\gamma$ and scale parameter $\nu$ is:
$$F(\varepsilon) = \exp(-\exp(-\varepsilon/\nu - \gamma)),$$
where $E(\varepsilon) = 0$, $Var(\varepsilon) = \pi^2 \nu^2 / 6$ and $\gamma$ is Euler's constant ($\gamma \cong 0.577$). Then, the pdf is:
$$f(\varepsilon) = (1/\nu) \exp(-\varepsilon/\nu - \gamma - \exp(-\varepsilon/\nu - \gamma)).$$
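The stated moments can be checked numerically. The following sketch draws from a Gumbel distribution with location $-\nu\gamma$ and scale $\nu$ using NumPy and compares the sample mean, variance and cdf with the formulas above. The value of $\nu$, the sample size and the helper names are arbitrary choices made for illustration.

```python
import numpy as np


def draw_shocks(nu, size, seed=0):
    """Draw extreme value type I shocks with mean zero and scale nu."""
    rng = np.random.default_rng(seed)
    return rng.gumbel(loc=-nu * np.euler_gamma, scale=nu, size=size)


def cdf(eps, nu):
    """Closed-form cdf F(eps) = exp(-exp(-eps/nu - gamma))."""
    return np.exp(-np.exp(-eps / nu - np.euler_gamma))


nu = 1.5  # assumed value, for illustration only
shocks = draw_shocks(nu, size=400_000)
sample_mean = shocks.mean()                 # should be close to 0
sample_var = shocks.var()                   # should be close to pi^2 nu^2 / 6
theory_var = np.pi ** 2 * nu ** 2 / 6.0
share_below_zero = np.mean(shocks <= 0.0)   # should be close to cdf(0, nu)
```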
B.1 Gross Flow Function
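We drop the state superscript $s$ and time subscript $t$ for notational convenience.
Define
$$\bar{\varepsilon}^{ij} = [\beta V^j - \beta V^i] - C^{ij}.$$
The gross flow function, $m^{ij}$, is equal to the probability that a given sector $i$ worker will
switch to sector $j$, that is, the probability that a sector $i$ worker has higher utility in sector
$j$ in the next period. This probability is
$$m^{ij} = \Pr\left[ \bar{\varepsilon}^{ij} + \varepsilon^j \geq \bar{\varepsilon}^{ik} + \varepsilon^k \text{ for } k = 1, \ldots, n \right],$$
which can be written as
$$m^{ij} = \int_{-\infty}^{\infty} f(\varepsilon^j) \prod_{k \neq j} F(\varepsilon^j + \bar{\varepsilon}^{ij} - \bar{\varepsilon}^{ik})\, d\varepsilon^j.$$
Thanks to the extreme value distribution and McFadden (1973), the gross flow $m_t^{ij,s}$ can
be written as
$$m_t^{ij} = \frac{\exp\left( \left[ \beta V_{t+1}^j - \beta V_{t+1}^i - C_t^{ij} \right] \frac{1}{\nu} \right)}{\sum_{k=1}^{N} \exp\left( \left[ \beta V_{t+1}^k - \beta V_{t+1}^i - C_t^{ik} \right] \frac{1}{\nu} \right)},$$
which is equal to
$$m_t^{ij} = \frac{\exp\left( \bar{\varepsilon}_t^{ij} / \nu \right)}{\sum_{k=1}^{N} \exp\left( \bar{\varepsilon}_t^{ik} / \nu \right)}.$$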
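The closed form can be compared with a brute-force simulation of the choice problem. In the sketch below, `gross_flows` evaluates the logit formula and `simulated_flows` draws the idiosyncratic shocks and records which destination gives the highest payoff. The function names and the example values of the deterministic terms are assumptions chosen only for illustration.

```python
import numpy as np


def gross_flows(eps_bar, nu):
    """Logit flow probabilities exp(eps_bar/nu) / sum_k exp(eps_bar_k/nu)."""
    z = eps_bar / nu
    z = z - z.max()  # subtract the maximum for numerical stability
    weights = np.exp(z)
    return weights / weights.sum()


def simulated_flows(eps_bar, nu, draws, seed=0):
    """Frequency with which each destination maximizes eps_bar + shock."""
    rng = np.random.default_rng(seed)
    shocks = rng.gumbel(loc=-nu * np.euler_gamma, scale=nu,
                        size=(draws, eps_bar.size))
    choices = np.argmax(eps_bar + shocks, axis=1)
    return np.bincount(choices, minlength=eps_bar.size) / draws


# Illustrative deterministic terms for one origin sector (assumed numbers).
eps_bar = np.array([0.0, -1.0, -0.5, -2.0])
closed_form = gross_flows(eps_bar, nu=1.0)
frequencies = simulated_flows(eps_bar, nu=1.0, draws=200_000)
```

The two vectors agree up to simulation noise, which shrinks as the number of draws grows.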
B.2 Option Values
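We follow the steps in Artuc, Chaudhuri and McLaren (2010). Define, for convenience:
$$x = \varepsilon^j/\nu + \gamma \quad \text{and} \quad z = \log\left( \frac{\sum_{k=1}^{n} \exp(z^k/\nu)}{\exp(z^j/\nu)} \right),$$
with $z^k \equiv \bar{\varepsilon}^{ik}$.
Now, define:
$$\begin{aligned}
\Phi^{ij} &\equiv \int_{-\infty}^{\infty} \varepsilon^j f(\varepsilon^j) \prod_{k \neq j} F(\varepsilon^j + \bar{\varepsilon}^{ij} - \bar{\varepsilon}^{ik})\, d\varepsilon^j \\
&= \frac{1}{\nu} \int \varepsilon^j \exp(-\varepsilon^j/\nu - \gamma - \exp(-\varepsilon^j/\nu - \gamma)) \prod_{k \neq j} \exp(-\exp(-[\varepsilon^j + \bar{\varepsilon}^{ij} - \bar{\varepsilon}^{ik}]/\nu - \gamma))\, d\varepsilon^j .
\end{aligned}$$
Then,
$$\begin{aligned}
\Phi^{ij} &= \frac{1}{\nu} \int \varepsilon^j \exp(-\varepsilon^j/\nu - \gamma - \exp(-\varepsilon^j/\nu - \gamma)) \exp\Big(-\sum_{k \neq j} \exp(-[\varepsilon^j + \bar{\varepsilon}^{ij} - \bar{\varepsilon}^{ik}]/\nu - \gamma)\Big)\, d\varepsilon^j \\
&= \frac{1}{\nu} \int \varepsilon^j \exp(-\varepsilon^j/\nu - \gamma) \exp\Big(-\sum_{k=1}^{n} \exp(-[\varepsilon^j + \bar{\varepsilon}^{ij} - \bar{\varepsilon}^{ik}]/\nu - \gamma)\Big)\, d\varepsilon^j \\
&= \frac{1}{\nu} \int \varepsilon^j \exp\Big( (-\varepsilon^j/\nu - \gamma) - \sum_{k=1}^{n} \exp(-[\varepsilon^j + \bar{\varepsilon}^{ij} - \bar{\varepsilon}^{ik}]/\nu - \gamma) \Big)\, d\varepsilon^j \\
&= \frac{1}{\nu} \int \varepsilon^j \exp\Big( (-\varepsilon^j/\nu - \gamma) - \exp(-\varepsilon^j/\nu - \gamma) \sum_{k=1}^{n} \exp(-[z^j - z^k]/\nu) \Big)\, d\varepsilon^j \\
&= \frac{1}{\nu} \int \varepsilon^j \exp\Big( (-\varepsilon^j/\nu - \gamma) - \exp(-\varepsilon^j/\nu - \gamma) \sum_{k=1}^{n} \exp(z^k/\nu) / \exp(z^j/\nu) \Big)\, d\varepsilon^j .
\end{aligned}$$
Note that $x = \varepsilon^j/\nu + \gamma$ and $z = \log\left( \sum_{k=1}^{n} \exp(z^k/\nu) / \exp(z^j/\nu) \right)$. Then,
$$\begin{aligned}
\Phi^{ij} &= \int \varepsilon^j \exp(-x - \exp(-(x - z)))\, dx \\
&= \nu \int (x - \gamma) \exp(-x - \exp(-(x - z)))\, dx \\
&= (-\nu\gamma) \exp(-z) + \nu \int x \exp(-x - \exp(-(x - z)))\, dx \\
&= (-\nu\gamma) \exp(-z) + \nu \exp(-z) \int x \exp(-x + z - \exp(-(x - z)))\, dx .
\end{aligned}$$
We know that $\exp(-z) = m^{ij}$ from McFadden (1973). Substituting this in:
$$\begin{aligned}
\Phi^{ij} &= (-\nu\gamma) m^{ij} + \nu m^{ij} \int x \exp(-x + z - \exp(-(x - z)))\, dx \\
&= (-\nu\gamma) m^{ij} + \nu m^{ij} \int x \exp(-x + z - \exp(-(x - z)))\, dx \\
&\quad + \nu m^{ij} \int z \exp(-x + z - \exp(-(x - z)))\, dx \\
&\quad - \nu m^{ij} \int z \exp(-x + z - \exp(-(x - z)))\, dx .
\end{aligned}$$
Then we set $y = x - z$, thus
$$\begin{aligned}
\Phi^{ij} &= (-\nu\gamma) m^{ij} + \nu m^{ij} \int (x - z) \exp(-x + z - \exp(-(x - z)))\, dx \\
&\quad + \nu m^{ij} \int z \exp(-x + z - \exp(-(x - z)))\, dx \\
&= (-\nu\gamma) m^{ij} + \nu m^{ij} \int y \exp(-y - \exp(-y))\, dy + \nu z m^{ij} \int \exp(-y - \exp(-y))\, dy \\
&= (-\nu\gamma) m^{ij} + \nu m^{ij} \int y \exp(-y - \exp(-y))\, dy + \nu z m^{ij} .
\end{aligned}$$
Noting that $\int y \exp(-y - \exp(-y))\, dy = \gamma$ (Euler's constant), we can simplify:
$$\begin{aligned}
\Phi^{ij} &= (-\nu\gamma) m^{ij} + \nu z m^{ij} + \nu\gamma m^{ij} \\
&= -\nu \log(m^{ij})\, m^{ij} .
\end{aligned}$$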
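Then we can add this across possible destinations $j$. Note that the utility of a worker in
$i$ is equal to:
$$\begin{aligned}
V_t^i &= u_t^i + \sum_{j=1}^{n} \left[ \Phi_t^{ij} - m_t^{ij} C_t^{ij} + \beta m_t^{ij} V_{t+1}^j \right] \\
&= u_t^i + \sum_{j=1}^{n} m_t^{ij} \left[ -\nu \log(m_t^{ij}) - C_t^{ij} + \beta V_{t+1}^j \right] \\
&= u_t^i + \sum_{j=1}^{n} m_t^{ij} \left[ -\nu \log(m_t^{ij}) - C_t^{ij} + \beta (V_{t+1}^j - V_{t+1}^i) \right] + \beta V_{t+1}^i \\
&= u_t^i + \sum_{j=1}^{n} m_t^{ij} \left[ \bar{\varepsilon}_t^{ij} - \nu \log(m_t^{ij}) \right] + \beta V_{t+1}^i .
\end{aligned}$$
Now, recall from above that $\log(m^{ij}) = \bar{\varepsilon}_t^{ij}/\nu - \log\left( \sum_{k=1}^{n} \exp(\bar{\varepsilon}^{ik}/\nu) \right)$. This yields:
$$\begin{aligned}
V_t^i &= u_t^i + \sum_{j=1}^{n} m_t^{ij}\, \nu \log\Big( \sum_{k=1}^{n} \exp(\bar{\varepsilon}^{ik}/\nu) \Big) + \beta V_{t+1}^i \\
&= u_t^i + \nu \log\Big( \sum_{k=1}^{n} \exp(\bar{\varepsilon}^{ik}/\nu) \Big) + \beta V_{t+1}^i .
\end{aligned}$$
This implies that the option value $\Omega^i$ can be written as
$$\Omega^i = -\nu \log m^{ii} .$$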
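A quick numerical check of the last two steps is sketched below. It assumes that staying in the current sector is costless, so that the deterministic term for the own sector is zero; this assumption, the function names and the example numbers are not taken from the derivation above and are labeled here as illustrative only.

```python
import numpy as np


def log_sum_exp(values):
    """Numerically stable log of the sum of exponentials."""
    peak = np.max(values)
    return peak + np.log(np.sum(np.exp(values - peak)))


def option_value_pieces(eps_bar, nu, own):
    """Return three expressions for the option value of a worker in sector `own`."""
    m = np.exp(eps_bar / nu - log_sum_exp(eps_bar / nu))  # flow probabilities
    inclusive = nu * log_sum_exp(eps_bar / nu)            # nu * log-sum-exp form
    weighted = np.sum(m * (eps_bar - nu * np.log(m)))     # sum_j m (eps_bar - nu log m)
    from_stayers = -nu * np.log(m[own])                   # -nu log m^{ii}
    return inclusive, weighted, from_stayers


# Assumed example: own sector is index 0 with eps_bar = 0 (costless staying).
eps_bar = np.array([0.0, -4.2, -4.7, -4.5])
inclusive, weighted, from_stayers = option_value_pieces(eps_bar, nu=1.0, own=0)
```

All three numbers coincide, which is why the share of stayers alone is enough to recover the option value in this setting.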
B.3 Wage Shocks
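Assume that $d_t$ denotes the agent's choice at time $t$. The expected $\varepsilon^j$ conditional on a sector $i$
agent choosing sector $j$ is equal to
$$\begin{aligned}
E\left[ \varepsilon_t^j \mid d_t = i, d_{t+1} = j \right] &= \frac{\int_{-\infty}^{\infty} \varepsilon^j f(\varepsilon^j) \prod_{k \neq j} F(\varepsilon^j + \bar{\varepsilon}_t^{ij} - \bar{\varepsilon}_t^{ik})\, d\varepsilon^j}{\int_{-\infty}^{\infty} f(\varepsilon^j) \prod_{k \neq j} F(\varepsilon^j + \bar{\varepsilon}_t^{ij} - \bar{\varepsilon}_t^{ik})\, d\varepsilon^j} \\
&= \frac{\Phi_t^{ij}}{m_t^{ij}} \\
&= -\nu \log m_t^{ij} .
\end{aligned}$$
Adding this across possible origins, we find
$$E_t\left[ \varepsilon_{t-1}^j \mid d_t = j \right] = -\nu \sum_{i=1}^{n} \tilde{m}_t^{ij} \log m_t^{ij},$$
where $\tilde{m}_t^{ij}$ is the probability that a sector $j$ agent originated from sector $i$,
$$\tilde{m}_t^{ij} = \frac{L_t^{i,s} m_t^{ij}}{\sum_{k=1}^{n} L_t^{k} m_t^{kj}} .$$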
Appendix C: CCP Representation of the Model
Following the steps in Appendix B.2 and using (3), the Bellman equation can be written
as
$$\begin{aligned}
V_t^{i,s} &= u_t^{i,s} + \max_j \left\{ \beta V_{t+1}^{j,s} - C_t^{ij,s} - \varepsilon_t^j \right\} \\
&= u_t^{i,s} + \sum_j m_t^{ij} \left[ \beta V_{t+1}^{j,s} - C_t^{ij,s} + \frac{\Phi_t^{ij,s}}{m_t^{ij,s}} \right] \\
&= u_t^{i,s} + \sum_j m_t^{ij} \left[ -\nu \log(m_t^{ij}) - C_t^{ij} + \beta (V_{t+1}^{j,s} - V_{t+1}^{k,s}) \right] + \beta V_{t+1}^{k,s} \\
&= u_t^{i,s} + \sum_j m_t^{ij} \Big[ C_t^{ij} - \beta (V_{t+1}^{j,s} - V_{t+1}^{i,s}) + \nu \log \sum_{n=1}^{N} \exp\left( \bar{\varepsilon}_t^{in}/\nu \right) - C_t^{ij} + \beta (V_{t+1}^{j,s} - V_{t+1}^{k,s}) \Big] + \beta V_{t+1}^{k,s} \\
&= u_t^{i,s} + \sum_j m_t^{ij} \Big[ \beta V_{t+1}^{i} + \nu \log \sum_{n=1}^{N} \exp\left( \bar{\varepsilon}_t^{in}/\nu \right) - \beta V_{t+1}^{k,s} \Big] + \beta V_{t+1}^{k,s} \\
&= u_t^{i,s} + \sum_j m_t^{ij} \Big[ \beta V_{t+1}^{i} + \nu \log \sum_{n=1}^{N} \exp\left( \bar{\varepsilon}_t^{in}/\nu \right) - \beta V_{t+1}^{k,s} + C_t^{ik,s} \Big] + \beta V_{t+1}^{k,s} - C_t^{ik,s} \\
&= u_t^{i,s} + \beta V_{t+1}^{k,s} - C_t^{ik,s} - \nu \log m_t^{ik,s} .
\end{aligned}$$
Then
$$\begin{aligned}
V_t^{j,s} - V_t^{i,s} &= u_t^{j,s} + \beta V_{t+1}^{k,s} - C_t^{jk,s} - \nu \log m_t^{jk,s} - u_t^{i,s} - \beta V_{t+1}^{k,s} + C_t^{ik,s} + \nu \log m_t^{ik,s} \\
&= (u_t^{j,s} - u_t^{i,s}) - C_t^{jk,s} - \nu \log m_t^{jk,s} + C_t^{ik,s} + \nu \log m_t^{ik,s} .
\end{aligned}$$
Since $k$ can be any sector, we can add the expression over all possible sectors to increase
precision
$$V_t^{j,s} - V_t^{i,s} = (u_t^{j,s} - u_t^{i,s}) + \sum_{k=1}^{N} x_t^{ij,k} \left[ -C_t^{jk,s} - \nu \log m_t^{jk,s} + C_t^{ik,s} + \nu \log m_t^{ik,s} \right], \qquad (18)$$
where $x_t^{ij,k}$ is an arbitrary weighting vector such that $\sum_k x_t^{ij,k} = 1$.
Then (18) is the CCP representation of the model. Note that $u_t^{i,s} = w_t^{i,s} + \eta^{i,s}$. We
need guessed values of $\eta^{i,s}$ for the CCP representation of the model. Therefore, we need to
estimate all parameters at once if we use CCP and maximum likelihood, which makes CCP
computationally demanding when the number of choices and structural parameters is large.
It may also be difficult to use CCP in certain cases when many of the observed conditional
choice probabilities are close to zero.
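The last line of the Bellman derivation says that the value of being in a sector can be recovered from any single destination $k$. The sketch below checks this numerically for one origin sector. It takes next-period values and moving costs as given inputs, and the function name and example numbers are assumptions made for illustration.

```python
import numpy as np


def value_from_any_destination(u_i, next_values, costs_i, beta, nu):
    """Compare the log-sum-exp value with the single-destination representation.

    Returns the direct value and, for every destination k, the expression
    u + beta * V_{t+1}^k - C^{ik} - nu * log m^{ik}.
    """
    v = (beta * next_values - costs_i) / nu
    peak = v.max()
    log_denominator = peak + np.log(np.exp(v - peak).sum())
    log_m = v - log_denominator                 # log flow probabilities
    direct = u_i + nu * log_denominator         # closed-form value
    via_k = u_i + beta * next_values - costs_i - nu * log_m
    return direct, via_k


# Assumed example with four sectors; the worker is in sector 0 (zero cost of staying).
direct, via_k = value_from_any_destination(
    u_i=1.0,
    next_values=np.array([10.0, 10.4, 9.8, 10.1]),
    costs_i=np.array([0.0, 4.5, 4.5, 4.5]),
    beta=0.97,
    nu=1.0,
)
```

Every entry of `via_k` equals `direct`, which is what allows the weights in (18) to be chosen freely as long as they sum to one.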
Appendix D: EM loop within PPML regression
It is possible to incorporate an Expectation-Maximization algorithm into our estimation
procedure in the first step. For notational convenience we consider the case where agents'
current and last two sectors are observed in the data; it is straightforward to generalize this
procedure for panels with longer time dimensions.
Assume that we observe each agent's decision at time $t$ and $t+1$; let us denote the agent's
location at time $t$ with $i$, time $t+1$ with $j$, and time $t+2$ with $k$. The two period flow at
time $t$ is denoted with $m_t^{ijk,s}$; the number of workers who chose $i$, $j$, and $k$ consecutively is
equal to $y_t^{ijk,s} = L_t^i m_t^{ijk,s}$. Each agent has an unobserved discrete type $\sigma \in \tilde{S}$; the observed
states (or types) are still denoted with $s \in S$. We are interested in finding the ratio of type
$\sigma$ workers in the observed flow $m_t^{ijk,s}$; let us denote this probability with $\zeta_t^{ijk,s,\sigma}$. Then the
number of workers with state $(i, s, \sigma)$ who choose $j$ then $k$ starting at time $t$ is equal to
$\zeta_t^{ijk,s,\sigma} y_t^{ijk,s}$.
Note that
$$\zeta_t^{ijk,s,\sigma} m_t^{ijk,s} = \frac{\exp\left( \left[ \beta V_{t+1}^{j,s,\sigma} - \beta V_{t+1}^{i,s,\sigma} - C_t^{ij,s,\sigma} \right] \frac{1}{\nu} \right)}{\sum_{n=1}^{N} \exp\left( \left[ \beta V_{t+1}^{n,s,\sigma} - \beta V_{t+1}^{i,s,\sigma} - C_t^{in,s,\sigma} \right] \frac{1}{\nu} \right)} \cdot \frac{\exp\left( \left[ \beta V_{t+2}^{k,s,\sigma} - \beta V_{t+2}^{j,s,\sigma} - C_{t+1}^{jk,s,\sigma} \right] \frac{1}{\nu} \right)}{\sum_{n=1}^{N} \exp\left( \left[ \beta V_{t+2}^{n,s,\sigma} - \beta V_{t+2}^{j,s,\sigma} - C_{t+1}^{jn,s,\sigma} \right] \frac{1}{\nu} \right)},$$
which can be represented in log-linear format to construct a Poisson regression as we show
in the first step.
Assume that we have an initial guess for $Z_t^{ijk,s,\sigma}$ for time $t$. Let us denote this initial
guess with $Z_t^{ijk,s,\sigma,(1)}$. Using this initial guess, we can run the Poisson regression in the first
step
$$\log\left( Z_t^{ijk,s,\sigma,(1)} y_t^{ijk,s} \right) = \Delta_t^{i,s,\sigma,(1)} + \Gamma_t^{j,s,\sigma,(1)} + \Lambda_t^{k,s,\sigma,(1)} + \Psi_t^{ijk,s,\sigma,(1)} X_t^{ijk,s} + e_t^{ijk,s,\sigma} .$$
Since the regression is log-linear, the fitted flows are the exponentials of the linear index, and the updated probability will be
$$Z_t^{ijk,s,\sigma,(2)} = \frac{\exp\left( \Delta_t^{i,s,\sigma,(1)} + \Gamma_t^{j,s,\sigma,(1)} + \Lambda_t^{k,s,\sigma,(1)} + \Psi_t^{ijk,s,\sigma,(1)} X_t^{ijk,s} \right)}{\sum_{\sigma' \in \tilde{S}} \exp\left( \Delta_t^{i,s,\sigma',(1)} + \Gamma_t^{j,s,\sigma',(1)} + \Lambda_t^{k,s,\sigma',(1)} + \Psi_t^{ijk,s,\sigma',(1)} X_t^{ijk,s} \right)} .$$
Hence the guess at step $\tau + 1$ will be
$$Z_t^{ijk,s,\sigma,(\tau+1)} = \frac{\exp\left( \Delta_t^{i,s,\sigma,(\tau)} + \Gamma_t^{j,s,\sigma,(\tau)} + \Lambda_t^{k,s,\sigma,(\tau)} + \Psi_t^{ijk,s,\sigma,(\tau)} X_t^{ijk,s} \right)}{\sum_{\sigma' \in \tilde{S}} \exp\left( \Delta_t^{i,s,\sigma',(\tau)} + \Gamma_t^{j,s,\sigma',(\tau)} + \Lambda_t^{k,s,\sigma',(\tau)} + \Psi_t^{ijk,s,\sigma',(\tau)} X_t^{ijk,s} \right)} .$$
Our simulations confirm that $Z_t^{ijk,s,\sigma,(\tau+1)}$ converges to the true $Z_t^{ijk,s,\sigma}$ and we obtain consistent
estimates for the destination and origin fixed effects^16.

^16 Results are available upon request.
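The updating step of the EM loop can be sketched as a normalization of fitted flows across unobserved types. The code below assumes that the fitted value of a cell is the exponential of its linear index, as in a standard Poisson regression with a log link; the function name and the array layout (one row per observed cell, one column per type) are hypothetical.

```python
import numpy as np


def update_type_shares(linear_index):
    """Updated type probabilities for each observed cell.

    linear_index[c, s] is the fitted linear index of cell c for type s
    (fixed effects plus covariate terms). Assumes fitted flows are
    exp(linear_index), so shares are these fitted flows normalized over types.
    """
    shifted = linear_index - linear_index.max(axis=1, keepdims=True)
    fitted = np.exp(shifted)  # rescaled fitted flows; the scale cancels in the ratio
    return fitted / fitted.sum(axis=1, keepdims=True)


# Illustrative input: two cells, two types (assumed numbers).
index = np.array([[0.0, 0.0],
                  [1.0, 1.0 + np.log(3.0)]])
shares = update_type_shares(index)
```

In a full loop, these shares would multiply the observed counts to form the dependent variable of the next Poisson regression, and the two steps would alternate until the shares stop changing.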
References
[1] Anderson, James (2010). "The Gravity Model," The Annual Review of Economics, 3(1).
[2] Aguirregabiria, Victor and Pedro Mira (2002). "Swapping the Nested Fixed Point Algorithm:
A Class of Estimators for Discrete Markov Decision Models," Econometrica, 70(4).
[3] Aguirregabiria, Victor and Pedro Mira (2010). "Dynamic Discrete Choice Models: A
Survey," Journal of Econometrics, 156.
[4] Arcidiacono, Peter, and Robert Miller (2011). "CCP Estimation of Dynamic Discrete
Choice Models with Unobserved Heterogeneity," forthcoming in Econometrica.
[5] Artuc, Erhan, Shubham Chaudhuri and John McLaren (2010). "Trade Shocks and Labor
Adjustment: A Structural Empirical Approach," American Economic Review, 100(3).
[6] Artuc, Erhan, and John McLaren (2012). "Trade Policy and Wage Inequality: A Structural
Analysis with Occupational and Sectoral Mobility," NBER Working Paper: 18503.
[7] Artuc, Erhan, German Bet, Irene Brambilla and Guido Porto (2013). "Trade Shocks,
Firm Level Investment Inaction and Labor Market Responses," Mimeo: World Bank.
[8] Beine, Michel, Frederic Docquier and Caglar Ozden (2009). "Diasporas," Journal of
Development Economics, 95(1).
[9] Cameron, Colin and Pravin Trivedi (1998). "Regression Analysis of Count Data,"
Cambridge University Press, Cambridge.
[10] Gourieroux, C., A. Monfort, and A. Trognon (1984). "Pseudo maximum likelihood
methods: Applications to Poisson models," Econometrica, 52(3).
[11] Grogger, Jeffrey and Gordon Hanson (2011). "Income Maximisation and the Selection
and Sorting of International Migrants," Journal of Development Economics, 95(1).
[12] Hotz, V. Joseph, and Robert Miller (1993). "Conditional Choice Probabilities and the
Estimation of Dynamic Models," Review of Economic Studies, 60(3).
[13] Kaplan, D.S., D. Lederman and R. Robertson (2013). "Worker-Level Adjustment Costs
in a Developing Country: Evidence from Mexico," Mimeo: World Bank.
[14] Lipton, D., J. Poterba, J. Sachs, L. Summers (1982). "Multiple shooting in rational
expectation models," Econometrica, 50.
[15] Magnac, Thierry, and David Thesmar (2002). "Identifying Dynamic Discrete Decision
Processes," Econometrica, 70(2).
[16] McFadden, Daniel (1973). "Conditional Logit Analysis of Qualitative Choice Behavior,"
in P. Zarembka (ed.) Frontiers in Econometrics, New York, Academic Press.
[17] Olivero, Maria and Yoto Yotov (2011). "Dynamic Gravity: Theory and Empirical
Implications," Canadian Journal of Economics, forthcoming.
[18] Rust, John (1987). "Optimal Replacement of GMC Bus Engines: An Empirical Model
of Harold Zurcher," Econometrica, 55(5).
[19] Santos Silva, Joao, and Silvana Tenreyro (2006). "The log of gravity," Review of
Economics and Statistics, 88(4).
Table 1: Regression Results (Basic Specification)
Moving Cost
Estim SE
Mean $C_t/\nu$ 4.671 ** (0.055)
Max $C_t/\nu$ 4.884 ** (0.059)
Min $C_t/\nu$ 4.488 ** (0.054)
$1/\nu$ 0.959 ** (0.255)
$\eta/\nu$ (Utility)
Sector Estim SE
1 0.000 -
2 -0.595 ** (0.162)
3 -0.156 * (0.094)
4 -0.231 * (0.125)
5 -0.224 * (0.114)
6 -0.258 ** (0.113)
7 -0.601 ** (0.166)
8 -0.420 ** (0.131)
9 -0.297 ** (0.122)
10 -0.063 (0.070)
11 -0.486 ** (0.169)
12 -0.224 * (0.101)
13 -0.140 ** (0.052)
14 -0.385 ** (0.080)
15 -0.285 * (0.133)
16 -0.295 ** (0.128)
* significant at 5% level.
** significant at 1% level.
Table 2: Regression Results (Alternative Specification)
Moving Cost
Estim SE
Mean $C_t/\nu$ 4.671 ** (0.055)
Max $C_t/\nu$ 4.884 ** (0.059)
Min $C_t/\nu$ 4.488 ** (0.054)
$1/\nu$ 3.672 ** (0.667)
$\eta/\nu$ (Utility)
Intercept Trend
Sector Estim SE Estim SE
1 0.000 - 0.000 -
2 -2.378 ** (0.455) 0.009 (0.008)
3 -1.422 ** (0.314) 0.027 ** (0.009)
4 -1.528 ** (0.340) 0.003 (0.008)
5 -1.509 ** (0.330) 0.011 (0.008)
6 -1.690 ** (0.359) 0.023 ** (0.009)
7 -2.257 ** (0.425) -0.004 (0.008)
8 -1.565 ** (0.313) -0.014 * (0.008)
9 -1.712 ** (0.362) 0.014 * (0.008)
10 -0.857 ** (0.224) 0.014 * (0.008)
11 -2.057 ** (0.407) -0.013 (0.008)
12 -1.234 ** (0.269) 0.001 (0.007)
13 -0.727 ** (0.166) 0.017 * (0.008)
14 -1.351 ** (0.239) 0.018 ** (0.008)
15 -1.497 ** (0.329) -0.010 (0.008)
16 -1.505 ** (0.331) -0.006 (0.007)
* significant at 5% level.
** significant at 1% level.
Table 3: Descriptive Statistics and Simulation Parameters
Descriptive Statistics Simulation Parameters
Labor Allocation Wage Labor Share Constant CPI Share
Sector Mean SE Mean SE $a$ $\tilde{A}$ $b$
1 0.02 (0.00) 0.58 (0.03) 0.30 0.14 0.07
2 0.02 (0.00) 1.19 (0.05) 0.30 0.23 0.00
3 0.09 (0.01) 0.92 (0.06) 0.85 0.75 0.30
4 0.17 (0.02) 1.04 (0.03) 0.57 0.86 0.20
5 0.10 (0.01) 1.00 (0.03) 0.57 0.64 0.10
6 0.06 (0.00) 1.00 (0.05) 0.49 0.50 0.06
7 0.02 (0.00) 1.21 (0.06) 0.42 0.28 0.03
8 0.03 (0.00) 1.07 (0.05) 0.49 0.34 0.04
9 0.06 (0.00) 1.04 (0.04) 0.58 0.53 0.00
10 0.11 (0.01) 0.81 (0.05) 0.58 0.54 0.00
11 0.05 (0.00) 1.22 (0.09) 0.22 0.54 0.01
12 0.05 (0.01) 0.95 (0.05) 0.68 0.53 0.05
13 0.01 (0.00) 0.71 (0.04) 0.61 0.22 0.03
14 0.01 (0.00) 0.85 (0.05) 0.60 0.22 0.06
15 0.14 (0.01) 1.08 (0.05) 0.68 0.84 0.06
16 0.07 (0.01) 1.06 (0.03) 0.82 0.81 0.00
Table 4: Simulation Results: Estimation with PPML
Moving Cost ($C/\nu$ and $1/\nu$)
Sim I Sim II Sim III Sim IV
Actual Estim SE Estim SE Estim SE Estim SE
Mean $C_t/\nu$ 4.500 4.503 (0.023) 4.503 (0.022) 4.503 (0.022) 4.504 (0.037)
Max $C_t/\nu$ 4.500 4.507 (0.024) 4.507 (0.022) 4.507 (0.022) 4.509 (0.035)
Min $C_t/\nu$ 4.500 4.500 (0.021) 4.497 (0.021) 4.497 (0.021) 4.498 (0.036)
$1/\nu$ 1.000 0.995 (0.119) 0.993 (0.109) 0.999 (0.049) 1.010 (0.186)
Fixed Utility ($\eta^i/\nu$)
Sim I Sim II Sim III Sim IV
Sector Actual Estim SE Estim SE Estim SE Estim SE
2 0.100 0.103 (0.017) 0.101 (0.016) 0.101 (0.007) 0.101 (0.020)
3 0.150 0.153 (0.020) 0.153 (0.017) 0.151 (0.008) 0.151 (0.024)
4 0.200 0.202 (0.019) 0.203 (0.017) 0.201 (0.007) 0.201 (0.025)
5 0.250 0.252 (0.016) 0.252 (0.016) 0.250 (0.007) 0.252 (0.019)
6 0.300 0.301 (0.020) 0.301 (0.017) 0.301 (0.008) 0.303 (0.023)
7 0.350 0.352 (0.027) 0.350 (0.026) 0.351 (0.011) 0.352 (0.039)
8 0.400 0.400 (0.029) 0.399 (0.027) 0.400 (0.012) 0.403 (0.040)
9 0.000 0.002 (0.027) 0.002 (0.025) 0.000 (0.013) - -
10 -0.100 -0.098 (0.036) -0.097 (0.034) -0.100 (0.015) - -
11 -0.150 -0.148 (0.040) -0.147 (0.039) -0.149 (0.018) - -
12 -0.200 -0.197 (0.040) -0.196 (0.038) -0.199 (0.017) - -
13 -0.250 -0.248 (0.022) -0.249 (0.022) -0.251 (0.009) - -
14 -0.300 -0.298 (0.024) -0.301 (0.023) -0.302 (0.012) - -
15 -0.350 -0.346 (0.064) -0.345 (0.064) -0.350 (0.029) - -
16 -0.400 -0.396 (0.059) -0.395 (0.055) -0.399 (0.025) - -
Table 5: Simulation Results: Imputing Values with PPML and CCP
                              L = 4,000           L = 20,000          L → ∞
                  Actual     PPML     CCP        PPML     CCP        PPML     CCP
$\Lambda^{2}$      0.395     0.391    0.578      0.400    0.421      0.395    0.395
                  (0.142)   (0.305)  (0.327)    (0.185)  (0.244)    (0.142)  (0.142)
$\Lambda^{3}$      1.025     1.030    1.378      1.026    1.020      1.025    1.025
                  (0.145)   (0.298)  (0.345)    (0.178)  (0.230)    (0.145)  (0.145)
$\Lambda^{4}$      1.518     1.511    1.919      1.519    1.501      1.518    1.518
                  (0.179)   (0.295)  (0.339)    (0.198)  (0.249)    (0.179)  (0.179)
$\Lambda^{5}$      1.248     1.262    1.642      1.251    1.252      1.248    1.248
                  (0.138)   (0.275)  (0.323)    (0.166)  (0.219)    (0.138)  (0.138)
$\Lambda^{6}$      1.106     1.114    1.473      1.112    1.122      1.106    1.106
                  (0.168)   (0.276)  (0.313)    (0.194)  (0.235)    (0.168)  (0.168)
$\Lambda^{7}$      0.716     0.730    1.009      0.713    0.730      0.716    0.716
                  (0.153)   (0.295)  (0.318)    (0.173)  (0.224)    (0.153)  (0.153)
$\Lambda^{8}$      0.895     0.908    1.218      0.903    0.908      0.895    0.895
                  (0.153)   (0.271)  (0.298)    (0.182)  (0.234)    (0.153)  (0.153)
$\Lambda^{9}$      0.797     0.823    1.143      0.785    0.794      0.797    0.797
                  (0.155)   (0.273)  (0.322)    (0.176)  (0.223)    (0.155)  (0.155)
$\Lambda^{10}$     0.703     0.711    0.987      0.702    0.707      0.703    0.703
                  (0.141)   (0.282)  (0.334)    (0.176)  (0.229)    (0.141)  (0.141)
$\Lambda^{11}$     0.736     0.745    1.028      0.738    0.757      0.736    0.736
                  (0.146)   (0.285)  (0.343)    (0.179)  (0.242)    (0.146)  (0.146)
$\Lambda^{12}$     0.368     0.395    0.562      0.378    0.400      0.368    0.368
                  (0.136)   (0.270)  (0.319)    (0.183)  (0.249)    (0.136)  (0.136)
$\Lambda^{13}$    -0.405    -0.416   -0.651     -0.407   -0.431     -0.405   -0.405
                  (0.127)   (0.344)  (0.447)    (0.174)  (0.259)    (0.127)  (0.127)
$\Lambda^{14}$    -0.433    -0.452   -0.685     -0.425   -0.424     -0.433   -0.433
                  (0.122)   (0.343)  (0.468)    (0.190)  (0.252)    (0.122)  (0.122)
$\Lambda^{15}$     0.840     0.837    1.137      0.840    0.846      0.840    0.840
                  (0.156)   (0.292)  (0.341)    (0.182)  (0.245)    (0.156)  (0.156)
$\Lambda^{16}$     0.273     0.266    0.390      0.281    0.285      0.273    0.273
                  (0.147)   (0.301)  (0.363)    (0.194)  (0.242)    (0.147)  (0.147)
Table 6: Simulation Results: Comparing Different Methods (with Aggregate Shocks)
Sample Size             Method     $C/\nu$             $1/\nu$
-                       Actual     4.500     -         1.000     -
L = 2,000, T = 25       PPML1      4.530    (0.015)    1.001    (0.024)
                        PPML2      4.515    (0.016)    1.006    (0.027)
                        ACM        4.217    (0.250)    0.908    (0.084)
L = 4,000, T = 25       PPML1      4.515    (0.010)    1.000    (0.020)
                        PPML2      4.506    (0.010)    1.003    (0.021)
                        ACM        4.429    (0.179)    0.958    (0.060)
                        CCP        4.517    (0.011)    1.083    (0.038)
L = 20,000, T = 25      PPML1      4.503    (0.005)    0.999    (0.014)
                        PPML2      4.500    (0.005)    1.003    (0.016)
                        ACM        4.560    (0.074)    1.001    (0.032)
                        CCP        4.506    (0.005)    0.998    (0.018)
L → ∞, T = 25           PPML1      4.500    (0.000)    0.999    (0.012)
                        PPML2      4.498    (0.001)    1.003    (0.014)
                        ACM        4.500    (0.000)    0.999    (0.019)
                        CCP        4.498    (0.001)    1.003    (0.014)
Table 7: Simulation Results: Comparing Different Methods (without Aggregate Shocks)
Sample Size             Method      $C/\nu$             $1/\nu$
-                       Actual      4.500     -         1.000     -
L = 2,000, T = 25       PPML1       4.530    (0.015)    0.999    (0.023)
                        PPML2       4.515    (0.015)    1.007    (0.023)
                        ACM         4.248    (0.269)    0.912    (0.081)
                        PPML-NPM    4.501    (0.014)    1.013    (0.016)
L = 4,000, T = 25       PPML1       4.515    (0.010)    0.999    (0.016)
                        PPML2       4.507    (0.010)    1.003    (0.016)
                        ACM         4.429    (0.177)    0.961    (0.056)
                        CCP         4.520    (0.011)    1.080    (0.031)
                        NPM         4.495    (0.010)    1.006    (0.015)
L = 20,000, T = 25      PPML1       4.503    (0.005)    1.000    (0.007)
                        PPML2       4.500    (0.005)    1.001    (0.007)
                        ACM         4.559    (0.073)    1.006    (0.025)
                        CCP         4.505    (0.005)    0.996    (0.012)
                        NPM         4.499    (0.005)    1.015    (0.016)
L → ∞, T = 25           PPML1       4.500    (0.000)    1.000    (0.000)
                        PPML2       4.500    (0.000)    1.000    (0.000)
                        ACM         4.500    (0.000)    1.000    (0.000)
                        CCP         4.500    (0.000)    1.000    (0.000)
                        NPM         4.500    (0.000)    1.000    (0.000)