THE MICROECONOMICS OF INCOME DISTRIBUTION DYNAMICS IN EAST ASIA AND LATIN AMERICA

François Bourguignon, Francisco H. G. Ferreira, and Nora Lustig, Editors

A copublication of the World Bank and Oxford University Press

© 2005 The International Bank for Reconstruction and Development / The World Bank
1818 H Street, NW
Washington, DC 20433
Telephone: 202-473-1000
Internet: www.worldbank.org
E-mail: feedback@worldbank.org

All rights reserved.

First printing September 2004

Oxford University Press
198 Madison Avenue
New York, NY 10016

The findings, interpretations, and conclusions expressed herein are those of the author(s) and do not necessarily reflect the views of the Board of Executive Directors of the World Bank or the governments they represent.

The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of the World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries.

Rights and Permissions

The material in this work is copyrighted. Copying and/or transmitting portions or all of this work without permission may be a violation of applicable law. The World Bank encourages dissemination of its work and will normally grant permission promptly.

For permission to photocopy or reprint any part of this work, please send a request with complete information to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA, telephone 978-750-8400, fax 978-750-4470, www.copyright.com.
All other queries on rights and licenses, including subsidiary rights, should be addressed to the Office of the Publisher, World Bank, 1818 H Street, NW, Washington, DC 20433, USA, fax 202-522-2422, e-mail pubrights@worldbank.org.

ISBN 0-8213-5861-8

Cataloging-in-Publication Data has been applied for.

Contents

Preface
Acknowledgments
Contributors
Abbreviations and Acronyms
1 Introduction
  François Bourguignon, Francisco H. G. Ferreira, and Nora Lustig
2 Decomposing Changes in the Distribution of Household Incomes: Methodological Aspects
  François Bourguignon and Francisco H. G. Ferreira
3 Characterization of Inequality Changes through Microeconometric Decompositions: The Case of Greater Buenos Aires
  Leonardo Gasparini, Mariana Marchionni, and Walter Sosa Escudero
4 The Slippery Slope: Explaining the Increase in Extreme Poverty in Urban Brazil, 1976–96
  Francisco H. G. Ferreira and Ricardo Paes de Barros
5 The Reversal of Inequality Trends in Colombia, 1978–95: A Combination of Persistent and Fluctuating Forces
  Carlos Eduardo Vélez, José Leibovich, Adriana Kugler, César Bouillón, and Jairo Núñez
6 The Evolution of Income Distribution during Indonesia's Fast Growth, 1980–96
  Vivi Alatas and François Bourguignon
7 The Microeconomics of Changing Income Distribution in Malaysia
  Gary S. Fields and Sergei Soares
8 Can Education Explain Changes in Income Inequality in Mexico?
  Arianna Legovini, César Bouillón, and Nora Lustig
9 Distribution, Development, and Education in Taiwan, China, 1979–94
  François Bourguignon, Martin Fournier, and Marc Gurgand
10 A Synthesis of the Results
  François Bourguignon, Francisco H. G. Ferreira, and Nora Lustig
Index

Figures

3.1 Gini Coefficient of Equivalent Household Income Distribution in Greater Buenos Aires, 1985–98
3.2 Hourly Earnings–Education Profiles for Men (Heads of Household and Other Family Members), Age 40
3.3 Hourly Earnings–Education Profiles for Women (Spouses), Age 40
3.4 Weekly Hours of Work by Educational Level for Men (Heads of Household), Age 40
4.1 Macroeconomic Instability in Brazil: Inflation
4.2 Macroeconomic Instability in Brazil: Per Capita GDP
4.3 Truncated Pen Parades, 1976–96
4.4 Plotted Quadratic Returns to Education (Wage Earners)
4.5 Plotted Quadratic Returns to Experience (Wage Earners)
4.6 Combined Price Effects by Sector
4.7 Price Effects Separately and for Both Sectors Combined
4.8 Occupational-Choice Effects
4.9 The Labor Market: Combining Price and Occupational-Choice Effects
4.10 Demographic Effects
4.11 Shift in the Distribution of Education, 1976–96
4.12 Education Endowment and Demographic Effects
4.13 A Complete Decomposition
5.1 Average Household Size by Income Decile in Urban Colombia, Selected Years
5.2 Change in Income from Changes of Returns to Education, Relative to Workers Who Have Completed Secondary Education: Male and Female Wage Earners in Urban Colombia, Selected Periods
5.3 Change in Income from Changes of Returns to Education, Relative to Workers Who Have Completed Secondary Education: Male and Female Self-Employed Workers in Urban Colombia, Selected Periods
5.4 Probability of Being Employed or a Wage Earner in Urban Colombia according to Various Individual or Household Characteristics, Various Groups of Household Members, Selected Years
5.5 Simulated Occupational-Choice and Participation Changes in Percentage Points by Percentile of Earnings for Urban Males and Females, 1978–88
5.6 Simulated Occupational-Choice and Participation Changes in Percentage Points by Percentile of Earnings for Urban Males and Females, 1988–95
5.7 Changes in Employment Rate by Income Percentile, Females in Urban Colombia, Selected Periods
6.1 Summary Decomposition of Changes in the Equivalized Household Distribution of Income
7.1 Changing Quantile Functions
7.2 Differences in Quantile Functions
7.3 Changing Lorenz Curves
7.4 Differences in Lorenz Curves
7.5 Household Quantile Curves: 1984 Baseline
7.6 Household Quantile Curves: 1989 Baseline
7.7 Quantile Curves: Simulated Values Minus 1984 Actual Values
7.8 Quantile Curves: Simulated Values Minus 1989 Actual Values
7.9 Lorenz Curves: Simulated Values Minus 1984 Actual Values
7.10 Lorenz Curves: Simulated Values Minus 1989 Actual Values
7.11 Household Quantile Curves: 1989 Baseline
7.12 Household Quantile Curves: 1997 Baseline
7.13 Quantile Curves: Simulated Values Minus 1989 Actual Values
7.14 Quantile Curves: Simulated Values Minus 1997 Actual Values
8.1 Observed Change in Individual Earnings by Percentile in Mexico, 1984–94
8.2 Change in Women's Labor-Force Participation by Education Level in Mexico, 1984–94
8.3 Returns to Education for Men by Location, Education Level, and Type of Employment in Mexico, 1984 and 1994
8.4 Effect of Labor Choices on Earnings by Percentile in Mexico, 1984–94
8.5 Effect of Educational Gains on Earnings by Percentile in Mexico, 1984–94
8.6 Effect of Changes in Returns to Education on Earnings by Percentile in Mexico, 1984–94
8.7 Effect of Urban-Rural Disparities on Earnings by Percentile in Mexico, 1984–94
9.1 Evolution of Income Inequality, 1979–94
9.2 Elasticity of Spouses' Occupational Choice with Respect to Head of Household's Earnings
9.3 1979–94 Variation in Individual Earnings Caused by the Price Effect, by Centiles of the 1979 Earnings Distribution
9.4 Simulated Entries into and Exits from the Wage Labor Force
9.5 Simulation of the 1994 Education Structure on the 1979 Population
9.6 1979–94 Variation in Household Income Caused by the Price Effect, by Centiles of the 1979 Distribution of Equivalized Household Income Per Capita (EHIP)
9.7 Entries into and Exits from the Labor Force: Overall Participation Effect
9.8 Effects of Imposing the 1994 Education Structure on the 1979 Population
9.9 Effects of Imposing the 1994 Children Structure on the 1979 Population: Relative Variation by Centile of the 1979 Distribution of Equivalized Household Income

Tables

1.1 Selected Indicators of Long-Run Structural Evolution
3.1 Distributions of Income in Greater Buenos Aires, Selected Years
3.2 Hourly Earnings by Educational Levels in Greater Buenos Aires, Selected Years
3.3 Log Hourly Earnings Equation Applied to Greater Buenos Aires, Selected Years
3.4 Hourly Earnings by Gender in Greater Buenos Aires, Selected Years
3.5 Weekly Hours of Work by Educational Levels in Greater Buenos Aires, Selected Years
3.6 Hours of Work Equation for Greater Buenos Aires, Selected Years
3.7 Labor Status by Role in the Household in Greater Buenos Aires, Selected Years
3.8 Composition of Sample by Educational Level in Greater Buenos Aires, Selected Years
3.9 Decompositions of the Change in the Gini Coefficient: Earnings and Equivalent Household Labor Income in Greater Buenos Aires, 1986–92
3.10 Decompositions of the Change in the Gini Coefficient: Earnings and Equivalent Household Labor Income in Greater Buenos Aires, 1992–98
3.11 Decompositions of the Change in the Gini Coefficient: Earnings and Equivalent Household Labor Income in Greater Buenos Aires, 1986–98
3.12 Decomposition of the Change in the Gini Coefficient: Average Results Changing the Base Year in Greater Buenos Aires, Selected Periods
4.1 General Economic Indicators for Brazil, Selected Years
4.2 Basic Distributional Statistics for Different Degrees of Household Economies of Scale
4.3 Stochastic Dominance Results
4.4 Educational and Labor-Force Participation Statistics, by Gender and Race
4.5 Equation 4.2: Wage Earnings Regression for Wage Employees
4.6 Equation 4.3: Total Earnings Regression for the Self-Employed
4.7 Simulated Poverty and Inequality for 1976, Using 1996 Coefficients
4A.1 Real GDP and GDP Per Capita in Brazil, 1976–1996
4A.2 PNAD Sample Sizes and Missing or Zero Income Proportions
4A.3 A Brazilian Spatial Price Index
4A.4 Brazilian Temporal Price Deflators, Selected Years
4A.5 Ratios of GDP Per Capita to PNAD Mean Household Incomes, 1976–96
4B.1 Evolution of Mean Income and Inequality: A Summary of the Literature
5.1 Decomposition of Total Inequality between Rural and Urban Areas, Selected Years
5.2 Labor-Market Indicators in Urban and Rural Areas, Selected Years
5.3 Changes in Sociodemographic Characteristics in Urban and Rural Areas, Selected Years
5.4 Earnings Equations of Wage and Self-Employed Male and Female Urban Workers, Selected Years
5.5 Earnings Equations of Wage and Self-Employed Male and Female Rural Workers, Selected Years
5.6 Marginal Effect of Selected Variables on Occupational Choice among Wage Earners, Self-Employed Workers, and Inactive Individuals for Urban Heads of Household, Spouses, and Other Household Members, and All Rural Workers, Selected Years
5.7 Decomposition Income Distribution Changes for Households and Individual Workers in Urban and Rural Colombia: Changes in the Gini Coefficient, Selected Periods
5.8 Mean Income: Effect of Change in the Constant of the Earnings Equation
5.9 Simulated Changes in Participation and Occupational Choice in Urban Colombia, Selected Periods
6.1 Evolution of Mean Household Income, 1980–96
6.2 Evolution of the Socioeconomic Structure of the Population, 1980–96
6.3 Evolution of the Personal Distribution of Income, 1980–96
6.4 Individual Wage Functions by Gender and Area, 1980–96
6.5 Household Profit Functions and Nonfarm Activities, 1980–96
6.6 Simulated Evolution of Typical Incomes: Price Effect
6.7 Decomposition of Changes in the Distribution of Individual Earnings
6.8 Decomposition of Changes in the Distribution of Household Income Per Capita
6.9 Mean and Dispersion of Household Incomes according to Some Characteristics of Heads of Households
6.10 Occupational-Choice Behavior, 1980–96
6.11 Simulated Changes in Occupational Choices, Whole Population
6.12 Simulated Changes in Occupational Choices, Rural and Urban Population
7.1 Location of Actual Distribution of Per Capita Household Income, 1984 and 1989, 1989 and 1997
7.2 Inequality of Actual Distribution of Per Capita Household Income, Selected Periods
7.3 Occupational-Position Equations for Male Heads of Household
7.4 Occupational-Position Equations for Female Heads of Household
7.5 Occupational-Position Equations for Male Family Members Who Are Not Heads of Household
7.6 Occupational-Position Equations for Female Family Members Who Are Not Heads of Household
7.7 Earnings Functions for Male Wage Earners
7.8 Earnings Functions for Female Wage Earners
7.9 Earnings Functions for Male Self-Employed Workers
7.10 Earnings Functions for Female Self-Employed Workers
7.11 Distribution of Per Capita Household Income, Substituting 1989 Values into 1984 Distribution
7.12 Distribution of Per Capita Household Income, Substituting 1984 Values into 1989 Distribution
7.13 Rising Educational Attainments in Malaysia, 1984–97
7.14 Actual and Simulated Inequality for Disaggregated Gender and Occupational-Position Groups
7.15 Distribution of Per Capita Household Income, Substituting 1997 Values into 1989 Distribution
7.16 Distribution of Per Capita Household Income, Substituting 1989 Values into 1997 Distribution
8.1 Inequality in Earnings and Household Income in Mexico, 1984 and 1994
8.2 Characteristics of the Labor Force in Mexico, 1984 and 1994
8.3 Selected Results from Earnings Equations for Mexico
8.4 Decomposition of Changes in Inequality in Earnings and Household Income in Mexico, 1984–94
8.5 Rural Effect in the Decomposition of Changes in Inequality in Earnings and Household Income in Mexico, 1984–94
9.1 Evolution of the Structure of the Population at Working Age, 1979–94
9.2 Wage Functions for Men, Corrected for Selection Bias, Selected Years
9.3 Wage Functions for Women, Corrected for Selection Bias, Selected Years
9.4 Decomposition of the Evolution of the Inequality of Individual Earnings, 1979–80 and 1993–94
9.5 Decomposition of the Evolution of the Inequality of Equivalized Household Incomes, 1979–80 and 1993–94
10.1 A Summary of the Decomposition Results
10.2 Interpreting the Decompositions: A Schematic Summary

Preface

The process of economic development is inherently about change. Change in where people live, in what they produce and in how they produce it, in how much education they get, in how long and in how well they live, in how many children they have, and so on. So much change, and the fact that at times it takes place at such surprising speed, must affect the way incomes and wealth are distributed, as well as the overall size of the pie.

While considerable efforts have been devoted to the understanding of economic growth, the economic analysis of the mechanisms through which growth and development affect the distribution of welfare has been rudimentary by comparison. Yet understanding development and the process of poverty reduction requires understanding not only how total income grows within a country but also how its distribution behaves over time.

Our knowledge of the dynamics of income distribution is presently limited, in part because of the informational inefficiency of the scalar inequality measures generally used to summarize distributions.
Single numbers can often hide as much as they show. But recent improvements in the availability of household survey data for developing countries, and in the capacity of computers to process them, mean that we should be able to do a better job comprehending the nature of changes in the income distribution that accompany the process of economic development. We hope that this book is a step in that direction.

By looking at the evolution of the entire distribution of income over reasonably long periods--10 to 20 years--and across a diverse set of societies--four in Latin America and three in East Asia--we have learned a great deal about a variety of development experiences, and how similar building blocks can combine in unique ways, to shape each specific historical case. But we have also learned about the similarities in some of those building blocks: the complex effect of educational expansion on income inequality, the remarkable role of increases in women's participation in the labor force, and the importance of reductions in family size, to name a few.

We have learned that the complexity of the interactions between these forces is so great that aggregate approaches to the relationship between growth and distribution are unlikely to be of much use for any particular country. We have also learned that some common patterns can be discerned and, with appropriate care and humility, understanding them might be helpful to policymakers seeking to enhance the power of development to reduce poverty and inequity.

We hope that readers might share some of the joy we found in uncovering the stories behind the distributional changes in each of the countries studied in this book.

François Bourguignon
Francisco H. G. Ferreira
Nora Lustig

Acknowledgments

This book started as a joint research project organized by the Inter-American Development Bank (IDB) and the World Bank, and we are grateful to the many people in both institutions who supported it throughout its five-year lifespan. We would like to thank particularly Michael Walton, who supported the birth of the project when he directed the Poverty Reduction Unit at the World Bank, as well as Carlos Jarque and Carlos Eduardo Vélez of the IDB, who supported the project's completion.

We are also very grateful to Martin Ravallion, who commented on various versions of the work, from research proposal to finished papers; to James Heckman, who acted as a discussant for three chapters at a session in the 2000 Meetings of the American Economic Association; to Ravi Kanbur, who provided very useful suggestions at an early stage of the research process; and to Tony Shorrocks, who gave us many insights into the nature of the decompositions we undertook. We are similarly indebted to a number of participants in seminars and workshops that took place at various meetings of the Econometric Society (in particular in Latin America and the Far East); of the European Economic Association (in Venice); of the Network on Inequality and Poverty of the IDB, World Bank, and LACEA (Latin American and Caribbean Economic Association); and at the Universities of Brasília, Maryland, and Michigan, The Catholic University of Rio de Janeiro, the European University Institute in Florence, and DELTA (Département et Laboratoire d'Economie Théorique et Appliquée) in Paris.

Our greatest debt, of course, is to the authors of the seven case studies, who really wrote the book. Their names and affiliations are listed separately in the coming pages, and we thank them profoundly for their commitment and endurance during the long process of producing this volume.
Finally, the book would not have been possible without the dedication, professionalism, and attention to detail of Janet Sasser and her team at the World Bank's Office of the Publisher.

Contributors

Vivi Alatas
  Economist in the East Asia and Pacific Region at the World Bank, Jakarta, Indonesia
César Bouillón
  Economist in the Poverty and Inequality Unit of the Inter-American Development Bank, Washington, D.C.
François Bourguignon
  Senior vice president and chief economist of the World Bank, Washington, D.C.
Walter Sosa Escudero
  Professor of econometrics at the Universidad de los Andes, Buenos Aires, Argentina, and at the Universidad Nacional de La Plata, Argentina; researcher at Centro de Estudios Distributivos, Laborales y Sociales (CEDLAS) at the Universidad Nacional de La Plata
Francisco H. G. Ferreira
  Senior economist in the Development Research Group at the World Bank, Washington, D.C.
Gary S. Fields
  Professor of labor economics at Cornell University, Ithaca, New York
Martin Fournier
  Researcher at the Centre d'Etudes Français sur la Chine Contemporaine (CEFC), Hong Kong, China, and associate professor at the Université d'Auvergne, Clermont-Ferrand, France
Leonardo Gasparini
  Director of CEDLAS, as well as professor of economics of income distribution and professor of labor economics at the Universidad Nacional de La Plata, Argentina
Marc Gurgand
  Researcher at the Département et Laboratoire d'Economie Théorique et Appliquée (DELTA) at the Centre National de la Recherche Scientifique (CNRS), Paris, France
Adriana Kugler
  Associate professor of economics at the Universitat Pompeu Fabra, Barcelona, Spain, and assistant professor of economics at the University of Houston, Texas
Arianna Legovini
  Senior monitoring and evaluation specialist in the Africa Region at the World Bank, Washington, D.C.
José Leibovich
  Assistant director of the Departamento Nacional de Planeación (Department of National Planning), Bogotá, Colombia
Nora Lustig
  President of the Universidad de Las Americas, Puebla, Mexico
Mariana Marchionni
  Professor of econometrics at the Universidad Nacional de La Plata, Argentina, and researcher at CEDLAS
Jairo Núñez
  Researcher at the Universidad de los Andes, Bogotá, Colombia
Ricardo Paes de Barros
  Researcher at the Instituto de Pesquisa Econômica Aplicada (IPEA), Rio de Janeiro, Brazil
Sergei Soares
  Senior education economist in the Latin America and Caribbean Region at the World Bank, Washington, D.C., and researcher at IPEA, Rio de Janeiro, Brazil
Carlos Eduardo Vélez
  Chief of the Poverty and Inequality Unit at the Inter-American Development Bank, Washington, D.C.

Abbreviations and Acronyms

CPI         Consumer price index
DANE        Departamento Nacional de Estadística (National Department of Statistics, Colombia)
DGBAS       Directorate-General of Budget, Accounting, and Statistics (Taiwan, China)
EH          Encuesta de Hogares (Household Survey, Colombia)
EHIP        Equivalized household income per capita
ENIGH       Encuesta Nacional de Ingresos y Gastos de los Hogares (Household Income and Expenditure Surveys, Mexico)
EPH         Encuesta Permanente de Hogares (Permanent Household Survey, Argentina)
GDP         Gross domestic product
IBGE        Instituto Brasileiro de Geografia e Estatística (Brazilian Geographical and Statistical Institute)
ICV-DIEESE  Índice do Custo de Vida–Departamento Intersindical de Estatística e Estudos Sócio-Econômicos (Cost of Living Index–Inter Trade Union Department of Statistics and Socioeconomic Studies, Brazil)
IGP-DI      Índice Geral de Preços–Disponibilidade Interna (General Price Index, Brazil)
INEGI       Instituto Nacional de Estadística, Geografía e Informática (National Institute of Statistics, Geography, and Informatics, Mexico)
INPC-R      Índice Nacional de Preços ao Consumidor–Real (National Consumer Price Index, Brazil)
MIDD        Microeconomics of Income Distribution Dynamics
OLS         Ordinary least squares
PNAD        Pesquisa Nacional por Amostra de Domicílios (National Household Survey, Brazil)
Progresa    Programa de Educación, Salud y Alimentación (Program for Education, Health, and Nutrition, Mexico)

1 Introduction
  François Bourguignon, Francisco H. G. Ferreira, and Nora Lustig

This book is about how the distribution of income changes during the process of economic development. By its very nature, the process of development is replete with structural change. The composition of economic activity changes over time, generally away from agriculture and toward industry and services. Relative prices of goods and factors of production change too, and their dynamics involve both long-term trends and short-term shocks and fluctuations. The sociodemographic characteristics of the population evolve, as average age rises and average family size falls. Patterns of economic behavior are not constant either: female labor-force participation rates increase, as do the ages at which children leave school and enter employment. Generations save, invest, and bequeath, and so holdings of both physical and human capital change. But although change is everywhere and although some patterns can be discerned across many societies, no two countries ever follow exactly the same development path. The combination, sequence, and timing of changes that are actually observed in any given country, at any given period, are always unique, always unprecedented.

Each one of these processes of structural change is likely to have powerful effects on the distribution of income. Social scientists in general--and economists in particular--have long been searching for some general rule about how development and income distribution dynamics are related.
Karl Marx (1887) concluded that, under the inherent logic of capital accumulation by a few and relentless competition in labor supply by many, social cleavages would grow increasingly deeper, until revolution changed things forever. Simon Kuznets (1955)--drawing on W. Arthur Lewis (1954)--believed that the migration of labor and capital from traditional, less productive sectors of the economy toward more modern and productive ones would result first in rising inequality, followed eventually by declining inequality. Jan Tinbergen (1975) argued that the crucial struggle in modern economies was that between the rival forces of (a) technological progress--ever raising the demand for (and the pay of) more educated workers--and (b) educational expansion--ever raising the supply of such workers. More recently, economists have developed models with multiple equilibria, each characterized by its own income distribution, with its own mean and its own level of inequality.1 These models show that different combinations of initial conditions--and of the historical processes that might follow them--could lead to diverse outcomes.

In this book, we do not suggest yet another grand theory of the dynamics of income distribution during the process of development. Instead, we propose and apply a methodology to decompose distributional change into its various driving forces, with the aim of enhancing our ability to understand the nature of income distribution dynamics.2 In fact, rather than searching for a unifying explanation, we explore the incredible diversity in the distributional experiences and outcomes across economies. Why do changes in inequality differ so markedly across economies that have similar rates of growth in gross domestic product (GDP) per capita, such as Colombia and Malaysia (see table 1.1)? Why do we observe rising inequality both in growing economies (Mexico) and in contracting ones (Argentina)? Why do educational expansions sometimes lead to greater equality (as in Brazil and Taiwan, China) and sometimes to greater inequality (as in Indonesia and Mexico)?

The microeconomic empirics reported in this volume suggest that this diversity in outcomes results from the various possibilities that arise from the interaction of a number of powerful underlying social and economic phenomena. We group these phenomena into three fundamental forces: (a) changes in the underlying distribution of assets and personal characteristics in the population (which includes its ethnic, racial, gender, and educational makeup); (b) changes in the returns to those assets and characteristics; and (c) changes in how people use those assets and characteristics, principally in the labor market.

Table 1.1 Selected Indicators of Long-Run Structural Evolution

Economy          Period    GDP per capita                   Gini coefficient  Gini coefficient
                           (purchasing power parity US$)    (initial year)    (terminal year)
Argentina        1986–98   6,506                            0.417             0.501
Brazil (urban)   1976–96   4,499                            0.595             0.591
Colombia         1978–95   2,520                            0.502             0.544
Indonesia        1980–96   1,430                            0.384             0.402
Malaysia         1984–97   5,548                            0.486             0.499
Mexico           1984–94   5,758                            0.491             0.549
Taiwan, China    1979–94   3,786                            0.271             0.290

[The table also reports, for each economy, annual growth rates of GDP per capita and of household survey mean income, and initial- and terminal-year values of average years of schooling, the urbanization rate, the participation of women in the labor force, and family size. Those rows and the table notes are not legible in this copy.]

At a general level, our approach to addressing these themes consists of simulating counterfactual distributions by changing how markets and households behave, one aspect at a time, and by observing the effect of each change on the distribution, while holding all other aspects constant. We construct a simple income generation model at the household level, which allows us to separate the observed changes in the distribution of income into the three key forces just described. The first force comprises the changes in the sociodemographic structure of the population, as characterized by area of residence, age, education, ownership of physical and financial assets, and household composition (collectively referred to as endowment effects, or population effects). The second force comes from changes in the returns to factors of production, including the various components of human capital, such as education and experience (price effects). The third force has to do with changes in the occupational structure of the population, in terms of wage work, self-employment, unemployment, and inactivity (occupational effects).

Of course, those causes of changes in the distribution of income are not independent of one another. For instance, a change in the sociodemographic structure of the population--such as higher education levels in some segments of the population--will probably generate a change in the structure of prices, wages, and self-employment incomes, which may in turn modify the way people choose among alternative occupations. Conversely, exogenous changes in returns to education (say, from skill-biased technological change) are likely to induce some response from households in terms of the desired level of education for their children.
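The counterfactual logic described above can be illustrated with a minimal sketch. It is not the model estimated in this volume, which is presented formally in chapter 2. It assumes a single log-earnings equation with years of schooling as the only characteristic, synthetic data, and hypothetical function names (gini, design, fit_log_earnings, counterfactual_earnings). Only a price effect is simulated: each person keeps his or her period-1 schooling and residual, while the returns are replaced with those estimated for period 2.

```python
import numpy as np


def gini(incomes):
    """Gini coefficient of a 1-D array of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n)


def design(schooling):
    """Regressor matrix: a constant and years of schooling."""
    s = np.asarray(schooling, dtype=float)
    return np.column_stack([np.ones_like(s), s])


def fit_log_earnings(schooling, earnings):
    """OLS of log earnings on schooling; returns coefficients and residuals."""
    X = design(schooling)
    log_y = np.log(np.asarray(earnings, dtype=float))
    beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    return beta, log_y - X @ beta


def counterfactual_earnings(schooling, resid, beta):
    """Earnings implied by given characteristics and residuals under `beta`."""
    return np.exp(design(schooling) @ beta + resid)


# Synthetic data (an assumption for illustration, not taken from any survey):
# schooling rises and the return to schooling increases between the periods.
rng = np.random.default_rng(0)
n = 5000
school_1 = rng.integers(0, 13, size=n).astype(float)
school_2 = rng.integers(0, 17, size=n).astype(float)
earn_1 = np.exp(1.0 + 0.08 * school_1 + rng.normal(0.0, 0.5, size=n))
earn_2 = np.exp(1.0 + 0.12 * school_2 + rng.normal(0.0, 0.5, size=n))

beta_1, resid_1 = fit_log_earnings(school_1, earn_1)
beta_2, _ = fit_log_earnings(school_2, earn_2)

# Price effect: period-1 people and residuals, period-2 returns.
earn_cf = counterfactual_earnings(school_1, resid_1, beta_2)
price_effect = gini(earn_cf) - gini(earn_1)
total_change = gini(earn_2) - gini(earn_1)

print(f"Gini, period 1:       {gini(earn_1):.3f}")
print(f"Gini, counterfactual: {gini(earn_cf):.3f}")
print(f"Gini, period 2:       {gini(earn_2):.3f}")
print(f"Price effect:         {price_effect:+.3f}")
print(f"Total change:         {total_change:+.3f}")
```

In this sketch, the part of total_change not accounted for by price_effect is what the remaining forces (here, the change in the distribution of schooling and in the residuals) would have to explain. Occupational effects are left out entirely, because the sketch has no occupational-choice equation.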
Like all of its relatives in the Oaxaca-Blinder class of decompositions, the technique discussed in this volume is not designed to model those general equilibrium effects. It simply separates out how much of a given change would not have been observed under a well-defined statistical counterfactual (for example, if returns to education had not changed), without making any statement about the economic foundations of that counterfactual (for example, the conditions under which no change in the returns to education would be consistent with the other observed changes, in an economic sense). Nevertheless, as we hope the case studies in chapters 3 through 9 will show, the insights gained from the statistical decomposition and some basic microeconomic intuition allow analysts to improve their understanding of the nature of changes in income distribution in a particular economy.

The microeconometric approach applied in this volume should be seen as complementary to the more prevalent macroeconometric (cross-country) studies of the relationship between growth and inequality (or the reverse). (See, for instance, Alesina and Rodrik 1994; Dollar and Kraay 2002; Forbes 2000.) Cross-country regressions can, if well specified and run on comparable data, tell us much about average relationships between measures of income dispersion and other indicators of economic performance (such as economic growth). However, for two reasons they should be complemented by more detailed country studies of the sort included in this volume.

First, one can argue that endogeneity and omitted variable biases inevitably plague most macroeconometric cross-country studies. Suppose, for instance, that inequality is on the left-hand side of a regression, and growth is treated as an explanatory variable.3 Various case studies in this volume suggest that changes in the distribution of years of schooling affect income inequality.
Standard growth and wealth dynamics theory suggests that such changes would also affect the rate of economic growth. Those changes cannot be adequately captured by the mean years of schooling alone. If they are not somehow included as explanatory variables (which they usually are not), then their correlation with growth would bias the estimated coefficient of mean schooling. Even if the changes were not correlated with growth (which is unlikely), their omission would increase the variance of the residuals, inflate standard errors, and compromise hypothesis testing.

Second, even if the average relationships identified by the cross-country studies were true, they might not be particularly relevant to individual countries whose specific circumstances (some of which may not be observed at the macro level) place them at some point other than that average. Although useful lessons can be learned from the average relationships estimated macroeconometrically, specific country analysis and policy recommendations should also be informed by more in-depth country studies.

The method proposed is applied to seven economies in this volume: three in East Asia and four in Latin America.⁴ The East Asian economies are Indonesia, Malaysia, and Taiwan (China). The Latin American ones are Argentina (Greater Buenos Aires), Brazil (urban), Colombia, and Mexico.⁵ Latin America and East Asia have had rather different experiences with trends in the distribution of income and with the pace of economic development (see table 1.1). For example, during 1980–2000, growth in GDP per capita was considerably higher in East Asia than in Latin America. Also, Latin America showed higher initial levels of income inequality and (with the exception of Brazil) sharper upward trends as well. In most economies, however, the average years of schooling, the share of urban population, and the participation of women in the labor force rose, while the average size of households fell.
Given the similar demographic and educational trends in practically all the economies, what explains the differences in the evolution of inequality? We hope that learning about the forces at work in the Asian and Latin American contexts will provide new insights for development analysts and policymakers.

The volume is organized as follows. In this introductory chapter, we first review the broad changes in structure observed in the economies under study. We then present a nonmathematical description of the methodology, placing it within the context of the literature. The formal presentation of the method is found in chapter 2. Chapters 3 to 9 contain the analyses for each of the seven economies. Chapter 10 presents a synthesis of the results and some concluding remarks.

Indicators of Structural Change in Seven Selected Economies

The magnitude of the structural changes that a society undergoes during the development process is well illustrated by the figures reported in table 1.1. The table lists changes in average education levels, in the urban-rural structure of the economy, in female labor-force participation, and in family sizes over intervals ranging from one to two decades, from the mid-1970s to the late 1990s. It also includes two measures of economic growth (in GDP per capita and in household survey mean income) and the Gini coefficient for household per capita income. Although the exact initial and final years vary, some general trends emerge. In all economies, the changes achieved on these four fronts in the span of 10 to 20 years were most impressive. The importance of the rural sector declined drastically everywhere, including Indonesia, where it was initially much larger than in the other economies in our sample. The educational level of the population also rose dramatically across all economies.
Educational attainment measured by average years of schooling rose by 50 percent in Colombia and by even more in Brazil, Indonesia (urban), and Taiwan (China). (In the last of these, educational attainment rose from an already high initial level of six years.) In the Greater Buenos Aires area of Argentina, in Malaysia, and in Mexico, the change was less dramatic. The participation rate of women in the labor force was largely unchanged in Malaysia and increased only slightly in Taiwan, China, but it rose substantially in Indonesia and in the Latin American countries. Average family sizes went down everywhere, falling by a full person or more in Brazil and Colombia.

In terms of economic growth, the disparity of experiences fits neatly into the expected continental lines. The three Asian economies grew so fast after the end of the 1970s that income per capita practically doubled during the 15 or so years under analysis. In the four Latin American countries, growth performance was disappointing. It was close to zero in Argentina and Brazil, positive but small in Mexico, and moderate in Colombia. Taiwan, China, was poorer than both Brazil and Mexico in 1980, but substantially richer in the mid-1990s.

All of those changes are likely to have had strong effects on the distribution of income, because many of them are known to be strongly income selective. Changes in female participation in the labor force or in fertility behavior are certainly not uniform across the population. Moreover, they directly affect per capita income in the households in which they take place. Likewise, per capita growth rates as high as 6 percent a year during 15-year periods are likely to be accompanied by changes in the structure of the economy that have repercussions on income distribution. Nevertheless, the net outcome in terms of the change in the Gini coefficient is far from uniform.
It ranges from a decline of 0.4 Gini points in (urban) Brazil to a rise of 8.4 Gini points in (the Greater Buenos Aires area of) Argentina. However, these changes are not perfectly comparable across the seven economies. For a start, the periods over which each economy was observed differ somewhat. So does the coverage of the survey, particularly for Argentina and Brazil. Nevertheless, it is probably safe to assert that, despite facing broadly similar trends in terms of demographics, education, urbanization, and female participation, the seven economies have experienced very different changes in inequality. How should this observation be interpreted? Can all the differences be attributed to differences in growth rates or in the sectoral composition of output? Did the distributional effects of structural changes tend to compensate one another more in Brazil and Malaysia than in Indonesia and Mexico? Or are the distributional effects of each structural change themselves of smaller size in the first two economies? How is the net result produced in each economy, and why does it differ so much between them? Are changes in the distribution of income associated with changes in the stock of education more important than changes in the returns to skills? Are educational factors more or less important than changes in occupational choices or fertility patterns? Those questions are taken up for each economy in chapters 3 through 9 and are summarized in chapter 10.

Decomposing Changes in Inequality: An Introduction

This study is certainly not the first one in which economists have tried to decompose changes in inequality in order to gain some insight into the processes that underlie them. Because the number of reliable data sets with the required time coverage before World War II was very small, it is probably fair to say that the first well-known empirical study of long-term income distribution dynamics was by Simon Kuznets (1955).
Since then, a good number of studies have looked at the determinants of changes in poverty and inequality. The literature is too large to be done justice here, and we do not propose to survey it comprehensively. However, it may be useful to distinguish between two broad approaches to distributional change that are present in the literature. We will refer to the first, which relies primarily on aggregated data, as the macroeconomic approach. By contrast, empirical studies relying on fully disaggregated data from household surveys fall under the microeconomic approach.

Macroeconomic approaches can be further classified into two groups. The first includes those that use standard regression analysis, relating aggregate poverty or inequality indices as dependent variables to a set of macroeconomic or structural (supposedly) independent variables. There are examples in which the variation occurs on a time series, as in Blejer and Guerrero (1990) and Ferreira and Litchfield (2001), and there are examples in which it occurs in a cross-section, as in Dollar and Kraay (2002), Ravallion (1997), and Ravallion and Chen (1997). These papers were, to a large extent, inspired by an earlier literature related to the empirical Kuznets curve (see, for example, Ahluwalia 1976), which also belongs in this group.

This approach has at least two serious shortcomings. First, concerns about the endogeneity of many right-hand-side variables that are included--as well as about biases arising from others that are not⁶--mean that the regressions can at best be interpreted as (very) reduced-form estimates of the relationship between summary measures of poverty and inequality and a few macroeconomic variables. Second, although single inequality and poverty indices are useful summary statistics, they are informationally restricted and often are not robust to changes in the assumptions underlying their construction (see Atkinson 1970).
The second group of approaches relies on computable general equilibrium models. Once again, there is a long lineage. Some important contributions include Adelman and Robinson (1978); Bourguignon, de Melo, and Suwa (1991); Decaluwé and others (1999); and Lysy and Taylor (1980). Computable general equilibrium models introduce more structure, but they are still essentially macroeconomic in nature and capture the distributional effect of only a limited number of variables, and then only on a limited number of classes or groups. They are also pure simulation models, which rely on rough calibration procedures rather than on time-series or detailed household-level data. These approaches do not capture the most interesting and revealing factors that explain the evolution of individual or household incomes and thus often appear inconclusive. This happens because the inherent diversity of individual situations and the complexity that characterizes the interaction of endowments, human behavior, and market conditions in determining individual incomes require a microeconomic focus.

Of course, in parallel with these macroeconomic strands of the literature on income distribution dynamics there is also an established microeconomic tradition. Its distinguishing feature is that whereas the macroeconomic work relies on aggregated data for countries or regions, the microeconomic work relies on household-level data. The most common microeconomic approach found in the literature is based on decompositions of changes in poverty or inequality measures by population subgroups.⁷ In the case of inequality, the change in some scalar measure is decomposed into what is due to changes in the relative mean income of various predetermined groups of individuals or households, what is due to changes in their population weights, and--residually--what is due to changes in the inequality within those groups.
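The subgroup logic just described can be illustrated with a minimal Python sketch for one decomposable index, the mean log deviation. The incomes, the two-group split, and the function names are hypothetical and are not taken from any of the case studies. The sketch shows only the static identity at one date (total inequality equals a within-group term plus a between-group term); a decomposition of changes would compare these terms, and the group weights, across two dates.

```python
import numpy as np

def mld(y):
    """Mean log deviation: average of log(mean income / income)."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.log(y.mean() / y)))

def mld_by_group(y, g):
    """Split the mean log deviation into within- and between-group parts."""
    y = np.asarray(y, dtype=float)
    g = np.asarray(g)
    mu = y.mean()
    within, between = 0.0, 0.0
    for k in np.unique(g):
        yk = y[g == k]
        share = yk.size / y.size                    # population weight of group k
        within += share * mld(yk)                   # inequality inside group k
        between += share * np.log(mu / yk.mean())   # group mean relative to overall mean
    return float(within), float(between)

# Hypothetical incomes and an illustrative two-group split.
incomes = np.array([1.0, 2.0, 3.0, 4.0, 10.0, 20.0])
groups = np.array(["rural", "rural", "rural", "urban", "urban", "urban"])

within, between = mld_by_group(incomes, groups)
total = mld(incomes)
print(f"total={total:.4f} within={within:.4f} between={between:.4f}")
```

The two components sum to the total by construction, which is what makes the index "decomposable"; the sketch says nothing about why group means or weights change.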
When groups are defined by some characteristic of the household or household head, such as location, age, or schooling, the method identifies the contribution of changes in those characteristics to changes in poverty or inequality. The decomposition of changes in the mean log deviation of earnings in the United Kingdom, by Mookherjee and Shorrocks (1982), is the best illustration of this type of work. The comparison of poverty profiles over time (Huppi and Ravallion 1996) or of poverty probit analyses (Psacharopoulos and others 1993) belongs to the same tradition.⁸

There are at least four principal limitations to these approaches. First, the analysis again relies on summary measures of inequality and poverty, rather than on the full distribution. Second, the decomposition of changes in inequality or poverty measures often leaves an unexplained residual of a nontrivial magnitude. Third, the decompositions do not easily allow for controls: it is impossible, for instance, to identify the partial share attributable to each factor in a joint decomposition of inequality changes by education, race, and gender subgroups. Finally, they shed no light on whether the contribution of a particular attribute to changes in overall inequality is due to changes in its distribution or due to changes in market returns to it. A large share for education, for instance, might be consistent with large shifts in the distribution of years of schooling, with changes in returns, or--indeed--with various combinations of the two.

An alternative approach, which seeks to address all four of these shortcomings in scalar decompositions, is the counterfactual simulation of entire distributions on the basis of the disaggregated information contained in the household survey data set. This approach was first applied by Almeida dos Reis and Paes de Barros (1991) for Brazil.
Juhn, Murphy, and Pierce (1993) use a technique of this kind to study the determinants of the increase in wage inequality in the United States during the 1970s and 1980s. Blau and Khan (1996) use this approach to compare wage distributions across 10 industrial countries. A semiparametric version of this approach is provided by DiNardo, Fortin, and Lemieux (1996) in a study of U.S. wage distribution between 1973 and 1992, which essentially relies on reweighting observations in kernel density estimates of continuous distributions of earnings so as to construct appropriate counterfactual distributions that shed light on the nature of the change in the actual distribution over time.⁹

As in the studies cited in the preceding paragraph, the method proposed and applied in this volume follows in the tradition established by Oaxaca (1973) and Blinder (1973). All of these approaches seek to shed light on what determines differences across income distributions by simulating counterfactual distributions that differ from an observed distribution in a controlled manner. Unlike Blau and Khan (1996); Juhn, Murphy, and Pierce (1993); or, indeed, any of the aforementioned studies, all of which were concerned with wage distributions, the analysis in this book seeks to understand the more complex dynamics of the distribution of welfare, proxied by the distribution of (per capita or equivalized) household income. The underlying determinants of this distribution are more complex. In addition to the quantities and prices of individual characteristics that determine earnings rates, household incomes depend also on participation and occupational choices, on demographic trends, and on nonlabor incomes.

As a result, the approach followed here generalizes the counterfactual simulation techniques from the single (earnings) equation model to a system of multiple (nonlinear) equations that is meant to represent mechanisms of household income generation.
This system comprises earnings equations, equations for potential household self-employment income, and occupational-choice models that describe how individuals at working age allocate their time between wage work, self-employment, and nonmarket time. In some cases, it also includes equations for determining educational levels and the number of children living in the household.

In each economy, the model is estimated entirely in reduced form, thus avoiding the insurmountable difficulties associated with joint estimation of the participation and earnings equations for each household member. We maintain some strong assumptions about the independence of residuals. Therefore, the estimation results are never interpreted as corresponding to a structural model and no causal inference is drawn. We interpret the parameter estimates generated by these equations only as descriptions of conditional distributions, about whose functional forms we maintain hypotheses. Yet, even in this limited capacity, these estimates help us gain useful insights into the nature of differences across distributions and about the underlying forces behind their evolution over time.

The most important methodological contribution of this book is to generalize the counterfactual simulation approach to distributional change from earnings to household income distributions. The approach thus applies to problems related to the distribution of total income, rather than only those related to the distribution of earnings. The method can shed light on the evolution of the entire distribution, rather than merely on the path of summary statistics.
And it can decompose any change in the incomes of a set of households into its fundamental sources: changes in the amounts of resources at their disposal (reflected in the population or endowments effects), changes in how the markets remunerate those resources (reflected in the price effects), and changes in the decisions made about how to use those resources (reflected in the occupational effects).

Within each such category, this approach also allows us to identify the contributions from specific endowments and prices. Thus, we can distinguish the effect of changes in returns to education from those of other "prices," such as the effect of experience or of the gender wage gap. Analogously, we are able to understand the effect of changes in the distribution of education separately from that of changes in demographics. We can then shed some light on how one affects the other, always in terms of understanding how the conditional distributions of those variables have evolved, rather than seeking to establish directions of causation. This is as far as our econometrics allows us to go. But it is farther than we have gone before.

The proposed methodology has some important advantages over others that have been used in the field. First, as we shall see, small changes in aggregate indices of inequality can hide strong countervailing forces. For example, a large reduction in dispersion in the distribution of years of education could be partially offset by the inequality-increasing effect of a rising skill premium. Substantial changes in spatial premiums (such as those evident from wage gaps between urban and rural areas) may be offset by migration and changes in labor-force participation (as in the Indonesian case). A rise in household income inequality arising from increases in the labor-force participation rates of educated women can be partly offset by "progressive" declines in family size (as in the case of Taiwan, China).
Methods that rely on decomposing a scalar measure of inequality will gloss over those dynamics. As we show in the subsequent chapters, the evolution of the distribution of income is the result of many different effects--some of them quite large--which may offset one another in whole or in part. Researchers and policymakers may find it useful to disentangle those effects, rather than to focus on a single dimension.

Finally, the approach used here has an additional advantage. Because it analyzes the entire distribution of income, one can assess how different factors affect different parts of the distribution. That assessment can shed light on how different groups (for example, the urban versus the rural poor) are affected by changes in the distribution of assets, changes in the returns to those assets, and changes in how individuals and households choose to use their assets. The next chapter contains a formal presentation of the approach used in this book, which we refer to as generalized Oaxaca-Blinder decompositions.

Notes

1. See, among others, Banerjee and Newman (1993), Galor and Zeira (1993), and Bénabou (2000). For good surveys, see Aghion, Caroli, and Garcia-Penalosa (1999) and Atkinson and Bourguignon (2000).

2. This volume is the result of a five-year multicountry research effort, known as the project on the Microeconomics of Income Distribution Dynamics (MIDD), which was sponsored by the Inter-American Development Bank and the World Bank.

3. A slightly modified version of the argument that follows could just as easily be made for the reverse specification (with inequality explaining growth) or, indeed, for the joint estimation of a two-equation model.

4. Data availability played a role in selecting economies from these two regions.
The proposed methodology requires the availability of at least two comparable household surveys, separated by an interval of at least one decade, so that medium- to long-run structural effects of economic development and of changes in the sociodemographic characteristics of the population on the distribution of income may be captured.

5. During the period in which this research project was conducted, a number of other excellent applications of the methodology have been produced. They include Altimir, Beccaria, and Rozada (2001) on Argentina; Bravo and others (2000) on Chile; Dercon (2001) on Ethiopia; Grimm (2002) on Côte d'Ivoire; and Ruprah (2000) on the República Bolivariana de Venezuela.

6. Sometimes only GDP is used as the explanatory variable, as in the Kuznets curve literature.

7. This approach draws on earlier, static, decomposition approaches suggested by Bourguignon (1979), Cowell (1980), and Shorrocks (1980).

8. A related approach decomposes changes in scalar poverty measures into a component attributable to growth in the mean and one attributable to changes in the Lorenz curve (a "redistribution component"; see Datt and Ravallion 1992).

9. An alternative semiparametric approach to the estimation of density functions, which relies on their close relationship to hazard functions, was proposed by Donald, Green, and Paarsch (2000).

References

Adelman, Irma, and Sherman Robinson. 1978. Income Distribution Policy: A Computable General Equilibrium Model of South Korea. San Francisco: Stanford University Press.

Aghion, Philippe, Eve Caroli, and Cecilia Garcia-Penalosa. 1999. "Inequality and Economic Growth: The Perspective of New Growth Theory." Journal of Economic Literature 37(4): 1615–60.

Ahluwalia, Montek. 1976. "Income Distribution and Development: Some Stylized Facts." American Economic Review 66(2): 128–35.

Alesina, Alberto, and Dani Rodrik. 1994. "Distributive Politics and Economic Growth." Quarterly Journal of Economics 109: 465–89.

Almeida dos Reis, José, and Ricardo Paes de Barros. 1991. "Wage Inequality and the Distribution of Education: A Study of the Evolution of Regional Differences in Inequality in Metropolitan Brazil." Journal of Development Economics 36: 117–43.

Altimir, Oscar, Luis Beccaria, and Martín González Rozada. 2001. "La Evolución de la Distribución del Ingreso Familiar en la Argentina: Un Análisis de Determinantes." Serie de Estudios en Finanzas Públicas 7. Maestría en Finanzas Públicas Provinciales y Municipales, Universidad Nacional de La Plata, La Plata, Argentina.

Atkinson, Anthony B. 1970. "On the Measurement of Inequality." Journal of Economic Theory 2: 244–63.

Atkinson, Anthony B., and François Bourguignon. 2000. "Income Distribution and Economics." In Anthony B. Atkinson and François Bourguignon, eds., Handbook of Income Distribution, Vol. 1. Amsterdam: North-Holland.

Banerjee, Abhijit V., and Andrew F. Newman. 1993. "Occupational Choice and the Process of Development." Journal of Political Economy 101(2): 274–98.

Bénabou, Roland. 2000. "Unequal Societies: Income Distribution and the Social Contract." American Economic Review 90(1): 96–129.

Blau, Francine, and Lawrence Khan. 1996. "International Differences in Male Wage Inequality: Institutions versus Market Forces." Journal of Political Economy 104(4): 791–837.

Blejer, Mario, and Isabel Guerrero. 1990. "The Impact of Macroeconomic Policies on Income Distribution: An Empirical Study of the Philippines." Review of Economics and Statistics 72(3): 414–23.

Blinder, Alan S. 1973. "Wage Discrimination: Reduced Form and Structural Estimates." Journal of Human Resources 8: 436–55.

Bourguignon, François. 1979. "Decomposable Income Inequality Measures." Econometrica 47: 901–20.

Bourguignon, François, Jaime de Melo, and Akiko Suwa. 1991. "Modeling the Effects of Adjustment Programs on Income Distribution." World Development 19(11): 1527–44.
Bravo, David, Dante Contreras, Tomás Rau, and Sergio Urzúa. 2000. "Income Distribution in Chile, 1990–1998: Learning from Microsimulations." Universidad de Chile, Santiago. Processed.

Cowell, Frank A. 1980. "On the Structure of Additive Inequality Measures." Review of Economic Studies 47: 521–31.

Datt, Gaurav, and Martin Ravallion. 1992. "Growth and Redistribution Components of Changes in Poverty Measures." Journal of Development Economics 38: 275–95.

Decaluwé, Bernard, André Patry, Luc Savard, and Erik Thorbecke. 1999. "Social Accounting Matrices and General Equilibrium Models in Income Distribution and Poverty Analysis." Cornell University, Ithaca, N.Y. Processed.

Dercon, Stefan. 2001. "Economic Reform, Growth and the Poor: Evidence from Rural Ethiopia." Center for the Study of African Economies, Oxford University, Oxford, U.K. Processed.

DiNardo, John, Nicole Fortin, and Thomas Lemieux. 1996. "Labor Market Institutions and the Distribution of Wages, 1973–1992: A Semiparametric Approach." Econometrica 64(5): 1001–44.

Dollar, David, and Aart Kraay. 2002. "Growth Is Good for the Poor." Journal of Economic Growth 7: 195–225.

Donald, Stephen, David Green, and Harry Paarsch. 2000. "Differences in Wage Distributions between Canada and the United States: An Application of a Flexible Estimator of Distribution Functions in the Presence of Covariates." Review of Economic Studies 67: 609–33.

Ferreira, Francisco H. G., and Julie A. Litchfield. 2001. "Education or Inflation?: The Micro and Macroeconomics of the Brazilian Income Distribution during 1981–1995." Cuadernos de Economía 38: 209–38.

Forbes, Kristin J. 2000. "A Reassessment of the Relationship between Inequality and Growth." American Economic Review 90(4): 869–87.

Galor, Oded, and Joseph Zeira. 1993. "Income Distribution and Macroeconomics." Review of Economic Studies 60: 35–52.

Grimm, Michael. 2002. "Macroeconomic Adjustment, Socio-Demographic Change, and the Evolution of Income Distribution in Côte d'Ivoire." World Institute for Development Economics Research, Helsinki. Processed.

Huppi, Monika, and Martin Ravallion. 1996. "The Sectoral Structure of Poverty during an Adjustment Period: Evidence for Indonesia in the Mid-1980s." World Development 19: 1653–78.

Juhn, Chinhui, Kevin Murphy, and Brooks Pierce. 1993. "Wage Inequality and the Rise in Returns to Skill." Journal of Political Economy 101(3): 410–42.

Kuznets, Simon. 1955. "Economic Growth and Income Inequality." American Economic Review 45(1): 1–28.

Lewis, W. Arthur. 1954. "Economic Development with Unlimited Supplies of Labour." Manchester School 22: 139–91.

Lysy, Frank, and Lance Taylor. 1980. "The General Equilibrium Model of Income Distribution." In Lance Taylor, Edmar Bacha, Eliana Cardoso, and Frank Lysy, eds., Models of Growth and Distribution for Brazil. Oxford, U.K.: Oxford University Press.

Marx, Karl. 1887. Capital: A Critical Analysis of Capitalist Production, Vol. 1. London: Sonnenschein. (Republished by St. Leonards, Australia: Allen & Unwin, 1938.)

Mookherjee, Dilip, and Anthony F. Shorrocks. 1982. "A Decomposition Analysis of the Trend in U.K. Income Inequality." Economic Journal 92: 886–902.

Oaxaca, Ronald. 1973. "Male-Female Wage Differentials in Urban Labor Markets." International Economic Review 14: 673–709.

Psacharopoulos, George, Samuel Morley, Ariel Fiszbein, Haeduck Lee, and William Wood. 1993. "La Pobreza y la Distribución de los Ingresos en América Latina, Historia del Decenio de 1980." Documento Técnico 351S. World Bank.

Ravallion, Martin. 1997. "Can High-Inequality Developing Countries Escape Absolute Poverty?" Economics Letters 56: 51–57.

Ravallion, Martin, and Shaohua Chen. 1997. "What Can New Survey Data Tell Us about Recent Changes in Distribution and Poverty?" World Bank Economic Review 11(2): 357–82.

Ruprah, Inder. 2000. "Digging a Hole: Income Inequality in Venezuela." Inter-American Development Bank, Washington, D.C. Processed.

Shorrocks, Anthony F. 1980. "The Class of Additively Decomposable Inequality Measures." Econometrica 48: 613–25.

Tinbergen, Jan. 1975. Income Differences: Recent Research. Oxford, U.K.: North-Holland.

2 Decomposing Changes in the Distribution of Household Incomes: Methodological Aspects

François Bourguignon and Francisco H. G. Ferreira

Many different forces are behind long-run changes in income distributions or, more generally, distributions of economic welfare, within a population. Some of those forces have to do with changes in the distribution of factor endowments and sociodemographic characteristics among economic agents, others with the returns these endowments command in the economy, and others still with modifications in agents' behavior such as labor supply, consumption patterns, or fertility choices. Of course, those forces are not independent of one another. In some cases, they tend to offset one another, whereas in others they could reinforce one another. They are also likely to be affected by exogenous economic shocks as well as by government policies and development strategies. For all of these reasons, it is generally difficult to precisely identify fundamental causes and mechanisms behind the dynamics of income distribution. Yet, extracting information about the nature and magnitude of those forces from observed distributional changes is crucial for an understanding of the development process and the scope of policy intervention in the distributional sphere.

This is a difficult analytical task, and it is tempting to rely on statistical decomposition techniques that are meant to more or less automatically identify the main causes for distributional changes. Such techniques have long been in use in the fields of income and consumption distribution analysis.
Largely for computational reasons, however, they have been limited to explaining differences in scalar summary measures of distributions, rather than in the full distributions. In other words, the techniques focused on some specific definition of aggregate social welfare (or inequality) rather than on the distribution of individual welfare. Among the best examples of these techniques are the well-known Oaxaca-Blinder decomposition of differences in mean incomes across population groups with different characteristics (Blinder 1973; Oaxaca 1973) and the variance-like decomposition property of the so-called decomposable summary inequality measures (Bourguignon 1979; Cowell 1980; Shorrocks 1980). In both cases, the underlying logic is that the aggregate mean income (or inequality measure) in a population is the result of the aggregation of various sociodemographic groups or income sources. Thus, changes in the overall mean or inequality measure can be explained by identifying changes in the means and inequality measures within those groups or income sources, and in their weights in the population or in total income.

These early decomposition techniques proved to be extremely useful in several circumstances, and they should still be used as a first step in explaining changes in distributions of some economic attributes. Indeed, the Oaxaca-Blinder approach is still often used to analyze wage discrimination across genders or union status. Likewise, decomposing inequality measures such as the Theil coefficient or the mean logarithmic deviation according to gender, education, or age groups may often be quite informative about the broad structure of inequality in a society. At the same time, there is both a growing need and an increasing computational capacity to work with the entire distribution, rather than merely with its first moment or a few inequality indices.
In particular, the focus on poverty reduction, which increasingly drives development policy, requires analysis of the shape of the distribution in the neighborhood of and below the poverty line. In terms of the Oaxaca-Blinder approach, the issue is to know not so much whether mean earnings are lower for women than for men because the former have less average education, as whether the differences are greater or smaller for the bottom part of the earnings distribution. Answering this kind of question requires handling the whole distribution, rather than summary measures.

Several techniques for decomposing distributional change, rather than merely changes in individual inequality or poverty measures, have been developed in the past decade or so, in part because of increasing computational capacity.

The technique used to analyze long-run distributional changes in this book belongs to this recent stream of new decomposition methodologies. It is based on a parametric representation of the way in which household income per capita or individual earnings are linked to household or individual sociodemographic characteristics, or endowments. From this point of view, it bears great resemblance to the Oaxaca-Blinder approach, except for two points: (a) it deals with the entire distribution, rather than just the means of income or earnings, and (b) the parametric representation of the income-generation process for a household is more complex than the determination of individual earnings, in ways that we shall discuss below. As in the Oaxaca-Blinder method, however, the decomposition of distributional change essentially consists of contrasting representations of the income-generation process (that is, evaluating differences in estimated parameters) for two different distributions (for example, two points in time), on the one hand, and accounting for changes in the joint distribution of endowments, on the other hand.
Other methods, which do not rely so much on a parametric representation of individual or household income generation, could also have been applied to the case studies in the chapters that follow.1 Yet it turns out that the parametric representation used throughout this volume is of inherent interest, because the parameters lend themselves directly to relevant economic interpretations.

This chapter presents this methodology for decomposing observed changes in the (entire) distribution of household income per capita. It opens with a brief survey of decomposition techniques applied to the mean or to summary measures of income inequality. It continues with a general statement of the decomposition techniques that handle the whole distribution, focusing on the parametric method used in this volume. It then shows the detail of the parametric representation of household income-generation processes that, in one way or another, underlies all case studies in this volume. The last section addresses a number of general econometric issues that arise in the estimation of the model.

Decomposing Distributional Change: Scalar Methods

The general problem is that of comparing two distributions of income (or of any other welfare measure2) in a population at two points in time, $t$ and $t'$. Without too much loss of generality, the two distributions will be represented by their density functions: $f^t(y)$ and $f^{t'}(y)$. The objective is to explain the change from $f^t(y)$ to $f^{t'}(y)$ by a series of elementary changes concerned with changes in the sociodemographic structure of the population, in income disparities across sociodemographic groups or, possibly, in the relative importance and distribution of a particular income source. Before considering this general functional problem, we briefly review simple ways of performing that decomposition when density functions are replaced by some scalar summary index.
The Oaxaca-Blinder Decomposition of Changes in Means

Although it refers to a decomposition of differences in means, rather than in distributions, it is convenient to start this short review with the so-called Oaxaca-Blinder method. Indeed, this method relies on a general principle that will be extensively used later. In addition, dealing with the first moments of the distributions $f^t(y)$ and $f^{t'}(y)$ should provide some indication as to how one could deal with higher-order moments and, therefore, with inequality or poverty.

Oaxaca (1973) and Blinder (1973) independently found the following way of comparing the mean earnings of two different populations.3 Assume that income may satisfactorily be approximated by the following linear model in both periods $t$ and $t'$:

$$y_{it} = X_{it} \cdot \beta_t + u_{it}, \qquad y_{jt'} = X_{jt'} \cdot \beta_{t'} + u_{jt'}. \tag{2.1}$$

In other words, the income of individual $i$ observed in period $t$ is supposed to depend linearly on a vector of his or her observed characteristics, $X_{it}$, and on some unobserved characteristics summarized by the residual term, $u_{it}$. The same relationship holds for individual $j$ observed in period $t'$, who presumably is different from the individual observed in period $t$. The coefficients $\beta_t$ and $\beta_{t'}$ simply map individual characteristics, $X$, into income, $y$. If the components of $X$ are seen as individual endowments, then the $\beta$ coefficients may be interpreted as rates of return on those endowments, or as the "prices" of the services associated with them. Given a sample of individual observations at time $t$ and another at time $t'$, these prices may be estimated by ordinary least squares, under the usual assumption that the residual terms are independent of the observed endowments.

Consider now the change in mean earnings or income between periods $t$ and $t'$.
Under the innocuous assumption that the expected value of the residual terms is zero, an elementary transformation leads to the following decomposition of the change in the (cross-sectional) means:

$$\Delta\bar{y} = \bar{y}_{t'} - \bar{y}_t = \beta_t \cdot (\bar{X}_{t'} - \bar{X}_t) + \bar{X}_{t'} \cdot (\beta_{t'} - \beta_t). \tag{2.2}$$

The change in mean earnings thus appears as the sum of two effects: (a) that of a change in mean endowments at constant prices (the endowment effect), and (b) that of a change in prices at constant mean endowments (the price effect). In other words, the change in the mean earnings of the population between times $t$ and $t'$ is explained by a change in its mean characteristics (education, age, area of residence, and so on) and by a change in the rates of return to these characteristics. For instance, when the Oaxaca-Blinder decomposition bears on gender differences, the gender gap is decomposed into what is due to (a) the fact that working women and men do not have the same characteristics in terms of education, age, or occupation, and (b) the fact that, at constant characteristics, they are not paid the same rate.

The practical interest of a decomposition such as equation 2.2 is obvious. If economic analysis were able to predict or explain changes in the price system, $\beta$, then it would be easy to figure out what such changes may imply for the evolution of mean earnings or incomes. Of course, this decomposition ignores any possible causal relationship between the two sources of change. Yet it is likely that observed changes in prices are caused at least partly by changes in the sociodemographic structure of the population, and also that changes in prices in turn induce some changes in that structure. For instance, a more educated labor force may lead to narrower wage-skill gaps, and a wider wage-skill gap may be an incentive for part of the population to become more educated.
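The arithmetic of equation 2.2 can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the data are synthetic (a constant plus years of schooling), and the helper names `ols`, `oaxaca_blinder`, and `sample` are hypothetical, not taken from any of the studies in this volume. Because each regression includes a constant, the two effects add up exactly to the observed change in the mean.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ beta + u (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def oaxaca_blinder(X_t, y_t, X_t1, y_t1):
    """Split the change in mean income along the path of equation 2.2.

    Endowment changes are valued at period-t prices; price changes are
    weighted by period-t' mean endowments.
    """
    beta_t, beta_t1 = ols(X_t, y_t), ols(X_t1, y_t1)
    xbar_t, xbar_t1 = X_t.mean(axis=0), X_t1.mean(axis=0)
    endowment = beta_t @ (xbar_t1 - xbar_t)
    price = xbar_t1 @ (beta_t1 - beta_t)
    return endowment, price

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)

def sample(n, mean_school, beta):
    school = rng.normal(mean_school, 2.0, n)
    X = np.column_stack([np.ones(n), school])
    y = X @ beta + rng.normal(0.0, 1.0, n)
    return X, y

X_t, y_t = sample(2000, 6.0, np.array([1.0, 0.50]))    # period t
X_t1, y_t1 = sample(2000, 8.0, np.array([1.0, 0.40]))  # period t'

endowment, price = oaxaca_blinder(X_t, y_t, X_t1, y_t1)
total = y_t1.mean() - y_t.mean()
print(f"endowment {endowment:.3f} + price {price:.3f} = total {total:.3f}")
```

In this made-up example schooling rises while its return falls, so the endowment effect is positive and the price effect negative.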
Three additional points must be noted about the Oaxaca-Blinder decomposition. First, the decomposition identity (equation 2.2) is path dependent. Indeed, an identity similar to equation 2.2 is as follows:

$$\Delta\bar{y} = \bar{y}_{t'} - \bar{y}_t = \beta_{t'} \cdot (\bar{X}_{t'} - \bar{X}_t) + \bar{X}_t \cdot (\beta_{t'} - \beta_t).$$

In this case, the endowment effect is evaluated using the prices at period $t'$, whereas the price effect is estimated using the initial mean endowments. There is no reason for this decomposition to give the same estimates of the price and endowment effects as equation 2.2. The path that is used for the decomposition matters.4

A second point to be stressed is that different interpretations may be given to the endowment and the price effects identified by the preceding decomposition formula. For instance, the endowment effect may be interpreted as the effect of simply changing the weight of various population subgroups that are predefined by common endowments. The price effect could then be interpreted as the effect of changing the relative mean incomes of these groups. This interpretation may be closer to the definition of the decomposition of distributional changes given at the beginning of this chapter. Note also that the decomposition formula (equation 2.2) may be interpreted simply as the effect on the mean income of changing the importance of various income sources, either through the coefficients, $\beta$, or through the mean endowments, $\bar{X}$. In effect, the decomposition operates through the components $\beta_t^k \bar{X}_t^k$ of the scalar product $\beta_t \bar{X}_t$, which may rather naturally be interpreted as different sources of income.

Finally, the way the Oaxaca-Blinder approach was just presented might give the impression that it has little to do with the analysis of inequality, because it is concerned with means. This impression is not entirely appropriate.
Suppose that the decomposition formula (equation 2.2) is applied at time $t$ to the difference in the mean incomes of two population groups $A$ and $B$ (men and women, for instance) rather than being applied to a time difference. Equation 2.2 could then be rewritten as

$$\Delta\bar{y} = \bar{y}_B - \bar{y}_A = \beta_A \cdot (\bar{X}_B - \bar{X}_A) + \bar{X}_B \cdot (\beta_B - \beta_A).$$

This earnings differential represents part of the inequality in the distribution of earnings (at time $t$): that part which is due to differences between groups $A$ and $B$. The change in inequality between periods $t$ and $t'$ will therefore include, among other things, the change in the A/B earnings differential. It might thus be decomposed into a change in the difference in endowments between groups $A$ and $B$ and a change in the difference in prices faced by the two groups. This argument simply combines an application of the Oaxaca-Blinder decomposition in a cross-section with an application over time. We will see below that the generalization of the Oaxaca-Blinder method to handle entire distributions, rather than their first-order moments, involves an argument of this type.

Decomposing Changes in Income Inequality Measures

The principle behind the foregoing decomposition may also be applied to higher moments and, in particular, to summary inequality measures. The "decomposable" or Generalized Entropy inequality measures are endowed with very convenient decomposition properties.5 Suppose that the population of income earners is partitioned into $G$ groups, $g = 1, 2, \ldots, G$, and denote by $I_g$ the inequality measure for group $g$ and by $I$ the inequality for the whole population. These measures satisfy the following general property:

$$I = \sum_{g=1}^{G} I_g \, w(n_g, m_g) + \bar{I}(n_1, \bar{y}_1; n_2, \bar{y}_2; \ldots; n_G, \bar{y}_G) = I_W + I_B \tag{2.3}$$

where $n_g$ and $m_g$ stand respectively for the population and income shares of group $g$ within the whole population and $\bar{I}(\ldots)$
is the inequality between groups, that is to say, the inequality that would be observed in the population if all incomes were equal within each group $g$. The distribution of income would thus consist of $n_1$ times the income $\bar{y}_1$, $n_2$ times the income $\bar{y}_2$, and so forth. Total inequality, $I$, thus decomposes into two terms: the mean within-group inequality, where each group $g$ is weighted by a weight, $w$, which depends on population and income shares, and the between-group inequality, $\bar{I}(\ldots)$.

The preceding property is intuitive because it resembles the well-known decomposition of variances across population subgroups. In the present context, however, we are less interested in the decomposition among groups at a point in time than in that of the change in inequality between two points in time. Differentiating equation 2.3, it follows that the change in overall inequality, $\Delta I$, may be expressed as the sum of the change in within-group inequality, $\Delta I_W$, and the change in between-group inequality, $\Delta I_B$. In turn, both changes may be expressed as linear combinations of changes in within-group inequality measures, $\Delta I_g$, and changes in population and income shares, $\Delta n_g$ and $\Delta m_g$.6

The mean logarithmic deviation is the simplest of all decomposable measures. Its expression for a population of $n$ individuals $i$ is the following:

$$L = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Log}(\bar{y}/y_i).$$

It is easily shown that the preceding decomposition formula (equation 2.3) writes, in this case,

$$L = \sum_{g=1}^{G} n_g L_g + \sum_{g=1}^{G} n_g \, \mathrm{Log}(\bar{y}/\bar{y}_g) = I_W + I_B.$$

Finally, differencing this expression between two periods $t$ and $t'$ yields the following:7

$$\Delta L \approx -\sum_{g=1}^{G} n_g \, \frac{\bar{y}}{\bar{y}_g} \, \Delta\!\left(\frac{\bar{y}_g}{\bar{y}}\right) + \sum_{g=1}^{G} \left[ L_g + \mathrm{Log}(\bar{y}/\bar{y}_g) \right] \Delta n_g + \sum_{g=1}^{G} n_g \, \Delta L_g. \tag{2.4}$$

The total change in inequality is thus expressed as the sum of three types of effects: (a) changes in the relative mean income of the groups, (b) changes in group population weights,8 and (c) changes in within-group inequality.
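The within-group and between-group split of the mean logarithmic deviation can be checked numerically. The sketch below assumes synthetic lognormal incomes and three hypothetical education groups; the function names `mld` and `mld_decomposition` are illustrative only.

```python
import numpy as np

def mld(y):
    """Mean logarithmic deviation: L = (1/n) * sum_i log(mean(y) / y_i)."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.log(y.mean() / y)))

def mld_decomposition(y, groups):
    """Return the (within, between) components of L for a partition into groups."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    within = between = 0.0
    for g in np.unique(groups):
        y_g = y[groups == g]
        n_g = y_g.size / y.size                      # population share of group g
        within += n_g * mld(y_g)                     # n_g * L_g
        between += n_g * np.log(y.mean() / y_g.mean())
    return float(within), float(between)

# Synthetic incomes (an assumption): three education groups with rising mean log income.
rng = np.random.default_rng(1)
groups = rng.integers(0, 3, size=3000)
y = np.exp(rng.normal(0.3 * groups, 0.5))

within, between = mld_decomposition(y, groups)
total = mld(y)
print(f"within {within:.4f} + between {between:.4f} = total {total:.4f}")
```

Applying the same function to two survey years and differencing the components gives the ingredients of the change decomposition, although the exact weights of equation 2.4 are not reproduced here.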
Analogous expressions can be derived for the other members of the family of decomposable inequality measures.

For practical purposes, this decomposition methodology is implemented as follows. Suppose that the population of earners has been partitioned by educational attainment: no schooling, primary, lower secondary, and so forth. Then, following the preceding decomposition, the change in overall inequality between years $t$ and $t'$ may be analyzed as the sum of (a) the effects of changes in relative earnings by educational level, (b) the effects of changes in the educational structure of the population, and (c) the effects of changes in inequality within educational groups. Thus, the last term is often taken as a kind of residual, corresponding to that part of the change in inequality that is not explained by the change in mean incomes across educational groups and the educational structure of the population. Of course, the preceding decomposition can be implemented for all possible observed characteristics of individuals in the population and, indeed, for all possible combinations of characteristics. For instance, groups may be defined simultaneously by the education of the household head, his or her age, his or her area of residence, or the number of people in the household.

There are numerous applications of this decomposition methodology, starting with the analysis of the evolution of inequality in the United Kingdom by Mookherjee and Shorrocks (1982). One of the reasons for its appeal is its analogy with the Oaxaca-Blinder decomposition: changes in group relative incomes play a role similar to the changes in the price coefficients, $\beta$, whereas the change in groups' population weights is another way of representing the changes in the sociodemographic structure of the population, $\bar{X}_{t'} - \bar{X}_t$. There are two basic differences between these two approaches, beyond the fact that one is applied to mean incomes and the other to income inequality.
First, the inequality decomposition formula is nonparametric, whereas the Oaxaca-Blinder decomposition relies on a linear income model.9 Second, the inequality decomposition has a residual term (the change in within-group inequality), which is independent of the inputs of the Oaxaca-Blinder decomposition.10

This residual is one of the sources of dissatisfaction with the preceding methodology. In empirical applications, it turns out to be an important component of observed change in inequality, even though it does not lend itself to an economic interpretation as easily as the other two components. Another source of dissatisfaction is that it seems somewhat restrictive to analyze changes in distribution through a single summary inequality measure. Of course, this decomposition might be combined with the Oaxaca-Blinder decomposition, thus yielding information on the change in the mean as well as on the disparity of incomes. But that disparity is still summarized by a single index. Using alternative indices belonging to the general class of decomposable inequality measures is always possible but never quite as convincing as looking at differences across the entire distribution.

A final problem with the decomposition of changes in decomposable inequality measures is that it applies to a disaggregation of the population into subgroups, but not to a disaggregation of income by sources. Suppose that the income of individual $i$ may be expressed as the sum of incomes coming from two sources, say, wages (1) and self-employment (2):

$$y_i = y_i^1 + y_i^2.$$

It may be interesting to decompose the change in the inequality of total income into what is due to the changes in the means and in the inequality of income sources 1 and 2. The preceding decomposition formulas do not work in this case. In particular, it is simply not true that total inequality is the weighted average of the inequality of each income source.
The covariance of the two sources within the population is of obvious importance. Shorrocks (1982) shows the way in which total inequality $I_y$ at a point in time can be decomposed into the inequality coming from the various income sources. In particular, he shows that, for $E_2$, the Generalized Entropy measure with parameter equal to 2, it is identically true that

$$E_2 = \sum_j \frac{\mathrm{cov}(y^j, y)}{\sqrt{\mathrm{var}(y^j)\,\mathrm{var}(y)}} \cdot \frac{\bar{y}^j}{\bar{y}} \cdot \left( E_2 \, E_2^j \right)^{1/2} \tag{2.5}$$

where $\mathrm{cov}(y^j, y)$ is the covariance between the income source $j$ ($= 1, 2$) and total income in the population. In other words, the ratio of this covariance and the variance of total income may be interpreted as the percentage contribution of income source $j$ to total inequality, whatever the inequality measure being used.

It turns out that this decomposition is somewhat difficult to use when time changes are considered. Indeed, to analyze how a change in the distribution of an income source (say, source 1) may modify the overall inequality of income, one must first figure out how this change may modify the covariance between that income source and total income. Doing so requires figuring out how the change in the distribution of source 1 may itself modify the covariance between the incomes of sources 1 and 2. In other words, the analyst must operate not only at the level of the marginal distribution of income of one source but at the level of the joint distribution of incomes arising from the various sources. The need to handle this joint distribution may explain why the preceding property of decomposability by income source is seldom used in empirical work on distributional changes.11

Decomposing Changes in Poverty and the Need for Distributional Analysis

Poverty measures are scalars that summarize the shape of the distribution of income up to some arbitrary poverty line, $z$. The simplest poverty measure is the headcount ratio, $H$, which is simply the value of the cumulative distribution function at the poverty line.
Other poverty measures may be defined on the basis of specific axioms. There is an infinity of poverty measures associated with any given poverty line, $z$, as there is an infinity of inequality measures.

Among the properties frequently desired from poverty measures is subgroup decomposability, which simply requires poverty to be additive with respect to a partition of the whole population into two groups. Thus, if $P_z$ is the poverty measure for the whole population when the poverty line is $z$ and if $P_z^j$ measures poverty in group $j$, the following property should hold:

$$P_z = \sum_j w_j \cdot P_z^j$$

where $w_j$ stands for the demographic weight of group $j$, as before. Clearly, this property holds for the headcount ratio. In effect, all poverty measures based on the sum of individual income deprivation $(z - y_i)$ caused by poverty, whatever way in which this deprivation is measured, satisfy this property.12

Given the linear structure implied by subgroup decomposability, something akin to the Oaxaca-Blinder decomposition principle applies. Differencing the preceding expression with respect to time, we obtain the following:

$$\Delta P_z = \sum_j w_j \cdot \Delta P_z^j + \sum_j P_z^j \, \Delta w_j. \tag{2.6}$$

In other words, the change in total poverty is decomposed into a component that is due to changes in poverty within groups and into a component that is due to changes in the population weights of the groups. If groups are defined by common sociodemographic characteristics, it may be said that the second term corresponds to the endowment effect in the Oaxaca-Blinder decomposition. The first term partly accounts for changes in prices and behavior that may generate changes in the mean income of a group and, therefore, changes in total poverty. But the change in total poverty also partly depends on changes in the distribution of income within groups. This was already the case with the residual term in the decomposition of a change in inequality (see equation 2.4).
Unlike in the decomposition of inequality, however, here it is not possible to isolate these two effects. The basic reason is that inequality is defined on relative incomes, and it is therefore independent of the general scale of incomes and of the mean. Poverty, by contrast, depends on the distribution of absolute incomes. As a consequence, a change in the general scale of incomes, and therefore in mean income, has a complex effect on poverty, which depends on the shape of the distribution around (and below) the poverty line. It is, therefore, impossible to have changes in group mean incomes, which we have suggested are analogous to price and possibly behavioral effects, appearing explicitly in a simple way in the decomposition formula for poverty changes, as was the case for decomposable inequality measures. For poverty measurement, changes in mean incomes cannot be straightforwardly disentangled from distributional changes. Thus, poverty changes cannot be decomposed into endowment, price, and behavioral effects without considering the actual distribution within groups, rather than merely some summary poverty measure for each of those groups.13

A better understanding of changes in poverty thus requires a more disaggregated approach to distributional dynamics. And poverty is not the only reason to invest in developing such an approach. As indicated earlier, a combination of the standard Oaxaca-Blinder decomposition of changes in means with various inequality decompositions by population subgroup is hardly a direct and effective method to understand disaggregated changes in a distribution of income. The next section proposes a generalization of the Oaxaca-Blinder framework to deal directly with full distributions, rather than just means or other scalar indices.
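As an illustration of the subgroup decomposition of poverty changes in equation 2.6, the following Python sketch applies it to the headcount ratio. Several choices here are assumptions rather than part of the text: the within-group term uses initial-period weights and the weight term uses final-period poverty rates (one of several possible paths, chosen so that the identity is exact), the incomes are synthetic, and the names `headcount` and `poverty_change` are hypothetical.

```python
import numpy as np

def headcount(y, z):
    """Share of incomes strictly below the poverty line z (0.0 for an empty group)."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(y < z)) if y.size else 0.0

def poverty_change(y0, g0, y1, g1, z):
    """Split the change in the headcount into within-group and weight components."""
    y0, g0, y1, g1 = map(np.asarray, (y0, g0, y1, g1))
    within = weights = 0.0
    for j in np.union1d(g0, g1):
        w0, w1 = np.mean(g0 == j), np.mean(g1 == j)          # group population shares
        p0, p1 = headcount(y0[g0 == j], z), headcount(y1[g1 == j], z)
        within += w0 * (p1 - p0)       # change in group poverty at initial weights
        weights += p1 * (w1 - w0)      # change in weights at final poverty rates
    return float(within), float(weights)

# Synthetic rural/urban incomes (an assumption for illustration only).
rng = np.random.default_rng(2)

def sample(n, urban_share, mu_rural, mu_urban):
    g = np.where(rng.random(n) < urban_share, "urban", "rural")
    mu = np.where(g == "urban", mu_urban, mu_rural)
    return np.exp(rng.normal(mu, 0.7)), g

y0, g0 = sample(5000, 0.4, 0.0, 0.8)
y1, g1 = sample(5000, 0.6, 0.1, 0.9)
z = 1.0

within, weights = poverty_change(y0, g0, y1, g1, z)
total = headcount(y1, z) - headcount(y0, z)
print(f"within {within:.4f} + weights {weights:.4f} = total {total:.4f}")
```

Note that the within-group term mixes the effect of higher group means with any change in the shape of each group's distribution, which is precisely the limitation discussed above.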
Decomposing Distributional Change: Nonparametric and Parametric Methods for Entire Distributions

A Simple Generalization of Oaxaca-Blinder: Distributional Counterfactuals

This section offers a general formulation of the way in which the preceding scalar decomposition analysis may be extended to the case of distributional changes. Let $f^t(y)$ and $f^{t'}(y)$ be the density functions of the distribution of income, $y$, or any other definition of economic welfare, at times $t$ and $t'$. The objective of the analysis is to identify the factors responsible for the change from the first to the second distribution.

To do so, it seems natural to start from the joint distributions $\varphi^\tau(y, X)$, where $X$ is a vector of observed individual or household characteristics, such as age, education, occupation, and family size. The superscript $\tau$ ($= t, t'$) denotes the period in which this joint distribution is observed. The distribution of household incomes, $f^\tau(y)$, is of course the marginal distribution of the joint distribution $\varphi^\tau(y, X)$:

$$f^\tau(y) = \int \cdots \int_{C(X)} \varphi^\tau(y, X)\, dX \tag{2.7}$$

where the summation is over the domain $C(X)$ on which $X$ is defined. Denoting by $g^\tau(y \mid X)$ the distribution of income conditional on $X$, an equivalent expression of the marginal income distribution at time $\tau$ is

$$f^\tau(y) = \int \cdots \int_{C(X)} g^\tau(y \mid X)\, \chi^\tau(X)\, dX \tag{2.8}$$

where $\chi^\tau(X)$ is the joint distribution of all elements of $X$ at time $\tau$.

Given that elementary decomposition, it is a simple matter to express the observed distributional change from $f^t(\cdot)$ to $f^{t'}(\cdot)$ as a function of the change in the two distributions appearing in equation 2.8, that is to say, the distribution of income conditional on characteristics $X$, $g^\tau(y \mid X)$, and the distribution of these characteristics, $\chi^\tau(X)$. To do so, define the following counterfactual experiment:

$$f_g^{t \to t'}(y) = \int \cdots \int_{C(X)} g^{t'}(y \mid X)\, \chi^t(X)\, dX. \tag{2.9}$$

This distribution would have been observed at time $t$ if the distribution of income conditional on characteristics $X$ had been that observed in time $t'$.
This counterfactual distribution may be calculated easily once the conditional distributions $g^t(y \mid X)$ and $g^{t'}(y \mid X)$, as well as the marginal distribution $\chi^t(X)$, have been identified. Likewise, one may define the counterfactual

$$f_\chi^{t \to t'}(y) = \int \cdots \int_{C(X)} g^t(y \mid X)\, \chi^{t'}(X)\, dX \tag{2.10}$$

where, this time, it is the joint distribution of characteristics that has been modified. Note that this latter distribution could also have been obtained starting from the period $t'$ and replacing the conditional income distribution of that period by the one observed in period $t$. In other words, it is identically the case that, with obvious notations,

$$f_g^{t \to t'}(y) \equiv f_\chi^{t' \to t}(y) \quad \text{and} \quad f_\chi^{t \to t'}(y) \equiv f_g^{t' \to t}(y). \tag{2.11}$$

On the basis of the definition of these counterfactuals, the observed distributional change $f^{t'}(y) - f^t(y)$ may now be identically decomposed into

$$f^{t'}(y) - f^t(y) \equiv \left[ f_g^{t \to t'}(y) - f^t(y) \right] + \left[ f^{t'}(y) - f_g^{t \to t'}(y) \right]. \tag{2.12}$$

As in the Oaxaca-Blinder equation, the observed distributional change is expressed as the sum of a price-behavioral effect and an endowment effect. Indeed, the first term on the right-hand side of equation 2.12 describes the way in which the distribution of income has changed over time because of the change in the distribution conditional on characteristics $X$. In other words, it shows how the same distribution of characteristics (that of period $t$) would have resulted in a different income distribution had the conditional distribution $g(y \mid X)$ been that of period $t'$. To see that the second term is indeed the effect of the change in the distribution of endowments that took place between times $t$ and $t'$, one can use equation 2.11 and rewrite the preceding decomposition formula as follows:

$$f^{t'}(y) - f^t(y) = \left[ f_g^{t \to t'}(y) - f^t(y) \right] + \left[ f^{t'}(y) - f_\chi^{t' \to t}(y) \right]. \tag{2.13}$$
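One simple way to compute a counterfactual such as that of equation 2.9 when the characteristics take a small number of discrete values is to reweight the period-$t'$ sample so that its characteristics follow their period-$t$ distribution. The Python sketch below is offered under explicit assumptions: the single characteristic is a hypothetical three-level schooling variable, the data are synthetic, every cell observed in period $t$ must also be observed in period $t'$, and the function names are illustrative. This reweighting is a nonparametric shortcut, not the parametric procedure used in this volume.

```python
import numpy as np

def reweight_to_period_t(x_t, x_t1):
    """Weights on period-t' observations so that X follows its period-t distribution."""
    cells = np.unique(x_t1)                       # sorted cell labels seen in t'
    if not np.isin(x_t, cells).all():
        raise ValueError("some period-t cells are never observed in period t'")
    share_t = np.array([np.mean(x_t == c) for c in cells])
    share_t1 = np.array([np.mean(x_t1 == c) for c in cells])
    ratio = share_t / share_t1                    # cell-level reweighting factor
    w = ratio[np.searchsorted(cells, x_t1)]       # factor for each t' observation
    return w / w.sum()

def weighted_headcount(y, w, z):
    """Weighted share of incomes below z (weights sum to one)."""
    return float(np.sum(w[np.asarray(y) < z]))

# Synthetic data (an assumption): schooling level 0, 1, or 2 and lognormal income.
rng = np.random.default_rng(3)

def sample(n, probs, ret):
    x = rng.choice(3, size=n, p=probs)
    y = np.exp(0.2 + ret * x + rng.normal(0.0, 0.6, n))
    return x, y

x_t, y_t = sample(4000, [0.5, 0.3, 0.2], 0.5)     # period t
x_t1, y_t1 = sample(4000, [0.3, 0.4, 0.3], 0.7)   # period t'
z = 1.5

w = reweight_to_period_t(x_t, x_t1)
H_t, H_t1 = float(np.mean(y_t < z)), float(np.mean(y_t1 < z))
H_cf = weighted_headcount(y_t1, w, z)   # period-t' conditional distribution, period-t X

price_effect = H_cf - H_t               # first bracket of equation 2.12
endowment_effect = H_t1 - H_cf          # second bracket of equation 2.12
print(price_effect, endowment_effect, H_t1 - H_t)
```

Any other statistic of the distribution (a quantile, an inequality index) can be computed on the reweighted sample in the same way.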
The main difference with respect to the Oaxaca-Blinder approach and the decomposition of scalar inequality measures reviewed earlier is that this decomposition, and the counterfactuals it relies on, refer to full distributions, rather than just to their means. Taking means on equation 2.12 or 2.13 under the parametric assumption that the conditional mean of $g^\tau(y \mid X)$ may be expressed as $X \cdot \beta^\tau$ would actually lead to the Oaxaca-Blinder equation (equation 2.2). More generally, the decomposition formula (equation 2.13) may be applied to any statistic defined on the distribution of income, $f(y)$: mean, summary inequality measures (and not only those which are explicitly decomposable), poverty measures for various poverty lines, and so forth.

The only restrictive property in the preceding decomposition is the path dependence already discussed in connection with the Oaxaca-Blinder equation. In the present framework, this property means that changing the conditional income distribution from the one observed in $t$ to that observed in $t'$ does not have the same effect on the distribution when this is done with the distribution of characteristics $X$ observed in $t$ as when $X$ is observed in $t'$. Formally,

$$f_g^{t \to t'}(y) - f^t(y) \neq f^{t'}(y) - f_g^{t' \to t}(y).$$

However, the difference is likely to be small when the change in conditional income distributions $g(y \mid X)$ is small.14

Extending the Scope of Counterfactuals

In the preceding specification, all the characteristics $X$ were considered on the same footing. But it might be of interest in some instances to decompose further the change in the distribution of these characteristics. For example, one might want to single out the effect of the change in the distribution of schooling or of family size. Doing so simply requires extending the conditioning chain in equation 2.8 and defining new counterfactuals as described below.
For any partition $(V, W)$ of the variables in $X$, the conditioning chain (equation 2.8) may be rewritten as

$$f^\tau(y) = \int \cdots \int_{C(V,W)} g^\tau(y \mid V, W)\, h^\tau(V \mid W)\, \psi^\tau(W)\, dV\, dW$$

where $h^\tau(V \mid W)$ is the distribution of $V$ conditional on $W$ and $\psi^\tau(W)$ the marginal distribution of $W$. The set of counterfactuals may then be enlarged by modifying the conditional distribution of $V$. All combinations of the three distributions, $g^{\tau_g}(y \mid V, W)$, $h^{\tau_h}(V \mid W)$, and $\psi^{\tau_\psi}(W)$ with $\tau_g, \tau_h, \tau_\psi = t$ or $t'$, may be considered as generating a specific counterfactual. Two particular counterfactuals are the actual distributions themselves. They are obtained with the combinations $\tau_g = \tau_h = \tau_\psi = t$ or $t'$.

Comparing two counterfactuals that differ by only one distribution gives an estimate of the contribution of the change in that particular distribution to the overall distributional change. Of course, there are many paths for evaluating this contribution, with no guarantee that all these paths will generate the same estimate. For instance, the contribution of the change in the distribution of $V$ conditional on $W$ may be evaluated by comparing $f^t(y)$ and the following:

$$f_h^{t \to t'}(y) = \int \cdots \int_{C(V,W)} g^t(y \mid V, W)\, h^{t'}(V \mid W)\, \psi^t(W)\, dV\, dW.$$

But, with obvious notations, it could also be obtained by comparing $f_g^{t \to t'}(y)$ and $f_{g,h}^{t \to t'}(y)$, or $f_h^{t' \to t}(y)$ and $f^{t'}(y)$.

If necessary, a more detailed conditioning breakdown of variables in $V$ could be considered. For instance, it might be of interest to analyze the effect of a change in the distribution of some components of $V$ conditional on the others, thus breaking down $h^\tau(V \mid W)$ into $h_1^\tau(V_1 \mid V_{-1}, W)\, h_{-1}^\tau(V_{-1} \mid W)$, where $V_{-1}$ stands for the components of $V$ different from $V_1$.
Following the same steps as above, this breakdown opens other counterfactuals and other decomposition paths.15

A Parametric Implementation of the Decomposition of Distributional Change

This decomposition analysis may be directly implemented using nonparametric representations, such as kernel density estimates, of the appropriate distributions. With enough observations, it is indeed possible to obtain a nonparametric representation of all the conditional distributions involved in defining counterfactuals. In practical terms, however, this may require a discretization of the distribution of the conditioning variables $(V, W)$ or, in other words, defining groups of individuals with specific combinations of variables $V$ and $W$. An example of such a use of the general decomposition principle above is provided by DiNardo, Fortin, and Lemieux (1996).16

For reasons that have mostly to do with the interpretation of the results of this decomposition, the various studies in this book rely instead on a parametric representation of some of the distributions used for defining counterfactuals. Indeed, dealing with changes in parameters with direct economic meaning, such as the return to education or the age elasticity of labor force participation, makes the discussion of the decomposition results quite fruitful. This section discusses the general principles behind this parametric analysis.

A general parametric representation of the conditional functions $g^\tau(y \mid V, W)$ and $h^\tau(V \mid W)$ relates $y$ and $(V, W)$, on the one hand, and $V$ and $W$, on the other hand, according to some predetermined functional form. These relationships may be denoted as follows:

$$y = G[V, W, \varepsilon; \Omega^\tau]$$
$$V = H[W, \eta; \Phi^\tau]$$

where $\Omega^\tau$ and $\Phi^\tau$ are sets of parameters and $\varepsilon$ and $\eta$ are random variables ($\eta$ is a vector if $V$ is a vector). These random variables play a role similar to the residual term in standard regressions.
They are meant to represent the dispersion of income y or individual characteristics V for given values of individual characteristics (V, W), and W, respectively. They are also assumed to be distributed independently of these characteristics, according to density functions $\pi^{\tau}(\,)$ and $\mu^{\tau}(\,)$. Finally, the functions G and H have preimposed functional forms.

With this parameterization, the marginal distribution of income in period $\tau$ may be written as follows:

$$f^{\tau}(y) = \int \cdots \int_{C(V,W)} \left[\int_{G(V,W,\varepsilon;\,\Omega_{\tau})=y} \pi^{\tau}(\varepsilon)\, d\varepsilon\right] \times \left[\int_{H(W,\eta;\,\Phi_{\tau})=V} \mu^{\tau}(\eta)\, d\eta\right] \psi^{\tau}(W)\, dV\, dW. \qquad (2.14)$$

Counterfactuals may be generated by modifying some or all of the parameters in sets $\Omega_{\tau}$ and $\Phi_{\tau}$, the distributions $\pi^{\tau}(\,)$ and $\mu^{\tau}(\,)$, or the joint distribution of exogenous characteristics, $\psi^{\tau}(W)$. These counterfactuals may thus be defined as follows:

$$D[\psi, \pi, \mu; \Omega, \Phi] = \int \cdots \int_{C(V,W)} \left[\int_{G(V,W,\varepsilon;\,\Omega)=y} \pi(\varepsilon)\, d\varepsilon\right] \times \left[\int_{H(W,\eta;\,\Phi)=V} \mu(\eta)\, d\eta\right] \psi(W)\, dV\, dW \qquad (2.15)$$

where any of the three distributions $\psi(\,)$, $\pi(\,)$, and $\mu(\,)$, and the two sets of parameters, $\Omega$ and $\Phi$, can be those observed at time t or $t'$. For instance, $D[\psi^{t}, \pi^{t}, \mu^{t}; \Omega_{t'}, \Phi_{t}]$ would correspond to the distribution of income obtained by applying to the population observed at time t the income model parameters of period $t'$, while keeping constant the distribution of the random residual term, $\varepsilon$, and all that is concerned with the variables V and W. Thus, the contribution of the change in parameters from $\Omega_{t}$ to $\Omega_{t'}$ may be measured by the difference between $D[\psi^{t}, \pi^{t}, \mu^{t}; \Omega_{t'}, \Phi_{t}]$ and $D[\psi^{t}, \pi^{t}, \mu^{t}; \Omega_{t}, \Phi_{t}]$, which is $f^{t}(y)$. But, of course, other decomposition paths may be used. For instance, the comparison may be performed using the population observed at time $t'$ as a reference, in which case the contribution of the change in the parameters $\Omega$ would be given by $D[\psi^{t'}, \pi^{t'}, \mu^{t'}; \Omega_{t'}, \Phi_{t'}] - D[\psi^{t'}, \pi^{t'}, \mu^{t'}; \Omega_{t}, \Phi_{t'}]$ (where the notation "−" stands for distributional differences). Note that the decomposition may also bear on some subset of the $\Omega$ and $\Phi$ parameters. In this parametric framework, the number of decomposition paths may become very large.
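A toy calculation may help to show why the choice of reference population matters. The sketch below is not taken from the chapter: it assumes, purely for illustration, a one-regressor model $\log y = \beta x + \sigma \varepsilon$ with x and $\varepsilon$ independent and $\varepsilon$ of unit variance, and it summarizes the distribution by the variance of log income, $\beta^2 \mathrm{Var}(x) + \sigma^2$. All numbers are made up.

```python
def var_log_income(beta, var_x, sigma):
    """Variance of log income under log y = beta*x + sigma*eps (illustrative model)."""
    return beta ** 2 * var_x + sigma ** 2

# Hypothetical parameter values for the two periods.
period_t = {"beta": 0.5, "var_x": 1.0, "sigma": 0.6}
period_t_prime = {"beta": 0.8, "var_x": 2.0, "sigma": 0.6}

def beta_contribution(reference):
    """Effect of switching beta from its t value to its t' value,
    holding var_x and sigma at the reference period's values."""
    with_new_beta = var_log_income(period_t_prime["beta"],
                                   reference["var_x"], reference["sigma"])
    with_old_beta = var_log_income(period_t["beta"],
                                   reference["var_x"], reference["sigma"])
    return with_new_beta - with_old_beta

effect_on_t_population = beta_contribution(period_t)
effect_on_t_prime_population = beta_contribution(period_t_prime)
print(effect_on_t_population, effect_on_t_prime_population)
```

In this example the same parameter change is credited with twice as much of the change in dispersion when it is evaluated on the period-$t'$ population, simply because the characteristic x is more dispersed there.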
Thus, the contribution of each individual change in the $\Omega$ and $\Phi$ parameters, in the distribution of the random or residual terms, $\pi(\,)$ and $\mu(\,)$, and finally in the whole distribution of exogenous characteristics, $\psi(\,)$, may be evaluated in many different ways. The choice depends on what value is given to the other parameters or the functions used for the other distributions. In general, a single decomposition path is used. But it is important to compare the results with those obtained on different paths to see whether they are very different and, if so, to understand the reasons for the differences (note 17).

A Parametric Representation of the Income-Generation Process

This section is devoted to particular applications of the preceding methodology, that is, to a specific set of variables X = (V, W) and some specification of the functions G( ) and H( ) above. The actual specifications used in the various chapters in this volume differ somewhat across economies, but they do share a common base, which is described below.

The Simple Case of Individual Earnings

If it were to be applied to the distribution of individual earnings, the preceding methodology would be rather simple. If we ignore for the moment the partition of X into exogenous characteristics (W) and nonexogenous individual characteristics (V), a simple and familiar parametric representation of individual earnings as a function of individual characteristics is given by the following:

$$\operatorname{Log} y = X \cdot \Omega + \varepsilon. \qquad (2.16)$$

In this particular case, the function G( ) is thus written as follows:

$$G(X, \varepsilon; \Omega) = e^{X \cdot \Omega + \varepsilon}.$$

To obtain estimates for the set of parameters $\Omega$ and for the distribution of the random term $\varepsilon$, one may rely on standard econometric techniques. Running a regression on samples of observations i available at time $\tau$,

$$\operatorname{Log} y_i^{\tau} = X_i^{\tau} \cdot \Omega_{\tau} + \varepsilon_i^{\tau}$$

yields an estimate $\hat{\Omega}_{\tau}$ of the set of parameters, as well as of the distribution $\pi^{\tau}(\,)$ of the random term.
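The estimation step just described can be sketched with ordinary least squares in NumPy. This is a minimal illustration under stated assumptions: the data are synthetic, the single regressor (years of schooling) and its coefficient are invented, and `fit_log_earnings` is a hypothetical helper name.

```python
import numpy as np

def fit_log_earnings(X, y):
    """OLS of log(y) on X, where X already contains a constant column.
    Returns (coefficients, residuals, residual standard deviation)."""
    log_y = np.log(y)
    coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    resid = log_y - X @ coef
    dof = X.shape[0] - X.shape[1]          # degrees of freedom
    sigma = np.sqrt(resid @ resid / dof)
    return coef, resid, sigma

# Synthetic sample standing in for one survey year (assumed values).
rng = np.random.default_rng(0)
n = 500
schooling = rng.integers(0, 16, size=n)
X = np.column_stack([np.ones(n), schooling])
y = np.exp(1.0 + 0.08 * schooling + 0.5 * rng.standard_normal(n))

coef, resid, sigma = fit_log_earnings(X, y)
print("estimated coefficients:", coef)
print("residual standard deviation:", sigma)
```

The vector `coef` plays the role of the estimated parameter set for the period, and the empirical distribution of `resid` stands in for the estimated distribution of the random term.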
Then, the counterfactuals D( ) defined earlier in (2.15) can be computed easily. Without the (V, W) distinction, a counterfactual is now defined as $D(\chi, \pi; \Omega)$, where $\chi(\,)$ is the joint distribution of the components of (V, W), all of which are treated as exogenous here. Switching to a discrete representation $\{y_i\}_{\tau} = (y_1, y_2, \ldots, y_{N_{\tau}})$ of the distribution at time $\tau$, where $N_{\tau}$ is the number of observations in the sample available at time $\tau = t, t'$, it is identically the case that $D(\chi_t, \pi_t; \Omega_t) = \{y_i\}_t$. The counterfactual $D(\chi_t, \pi_t; \Omega_{t'}) = \{y_i\}^{\Omega}_{t \to t'}$ is obtained by computing

$$\operatorname{Log}\,(y_i)^{\Omega}_{t \to t'} = X_i^{t} \cdot \hat{\Omega}_{t'} + \hat{\varepsilon}_i^{t} \quad \text{for } i = 1, 2, \ldots, N_t$$

where the notation ^ stands for ordinary least squares estimates. This counterfactual is thus obtained by simulating the preceding model on the sample of observations available at time t. This simulation shows what would have been the earnings of each individual in the sample if the returns to each observed characteristic had been those observed at time $t'$ rather than the actual returns observed at time t (note 18). The returns to the unobserved characteristics that may be behind the residual term $\hat{\varepsilon}_i^{t}$ are supposed to be unchanged, though. This is equivalent to the evaluation of the price effect for observed characteristics in the Oaxaca-Blinder calculation. The difference is that the evaluation is carried out for every individual in the sample.

The counterfactual on the distribution of the random term, $D(\chi_t, \pi_{t'}; \Omega_t) = \{y_i\}^{\pi}_{t \to t'}$, is a little more difficult to construct. Importing the distribution of residuals from time $t'$ to time t requires an operation known as a rank-preserving transformation, whereby the residual in the nth percentile (of residuals) at time t is replaced by the residual in the nth percentile at time $t'$, for all n. As this operation is not immediate when the number of observations is not the same in the two samples, an approximate solution is used.
It consists of assuming that both distributions of residual terms are the same up to a proportional transformation. An example would be if residuals were normally distributed, with mean zero. The rank-preserving transformation is then equivalent to multiplying the residual observed at time t by the ratio of standard deviations at time $t'$ and t (note 19). $D(\chi_t, \pi_{t'}; \Omega_t) = \{y_i\}^{\pi}_{t \to t'}$ is thus defined by

$$\operatorname{Log}\,(y_i)^{\pi}_{t \to t'} = X_i^{t} \cdot \hat{\Omega}_{t} + \hat{\varepsilon}_i^{t} \cdot (\hat{\sigma}_{t'} / \hat{\sigma}_{t}) \quad \text{for } i = 1, 2, \ldots, N_t.$$

With those counterfactuals at hand, estimates of the contribution to the observed overall distributional change between t and $t'$ of the change in the parameters $\Omega$, in the distribution of residuals ($\pi$), and possibly of these two changes taken together may easily be found. The effect of changing the distribution of individual endowments, X, is obtained as the complement of the two previous changes: $\{y_i\}_{t'} - D(\chi_t, \pi_{t'}; \Omega_{t'})$.

This technique is intuitively simple, and a very similar methodology has been in use in the literature on earnings distribution ever since it was introduced by Juhn, Murphy, and Pierce (1993). Things are slightly more complicated when dealing with household incomes. The additional complication arises from the need to take into account behavior related to participation in the labor force or, equivalently, the presence of various potential earners within a household.

A Household Income-Generation Model

Moving from individual earnings to household income per capita requires adding the earnings of the various members of the household and dividing by the total number of persons, or adult equivalents. This computation in turn requires considering not only the earnings of those people who are active but also the participation behavior of all the people of working age.
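Before the household extensions are introduced, the two individual-earnings counterfactuals discussed above (returns imported from the other period, and residuals rescaled by the ratio of standard deviations) can be sketched as follows. The data are synthetic and the function names are hypothetical; the sketch assumes each X matrix already contains a constant column.

```python
import numpy as np

def ols(X, log_y):
    """Return OLS coefficients, residuals, and residual standard deviation."""
    coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    resid = log_y - X @ coef
    return coef, resid, resid.std(ddof=X.shape[1])

def earnings_counterfactuals(X_t, y_t, X_tp, y_tp):
    """Simulate period-t earnings under (a) period-t' returns and
    (b) period-t' residual dispersion, one individual at a time."""
    coef_t, resid_t, sd_t = ols(X_t, np.log(y_t))
    coef_tp, _, sd_tp = ols(X_tp, np.log(y_tp))
    # (a) returns of t' applied to the period-t sample, residuals unchanged
    y_returns = np.exp(X_t @ coef_tp + resid_t)
    # (b) period-t returns, residuals rescaled by the ratio of standard deviations
    y_residuals = np.exp(X_t @ coef_t + resid_t * (sd_tp / sd_t))
    return y_returns, y_residuals

# Synthetic samples of different sizes for the two periods (assumed values).
rng = np.random.default_rng(1)

def make_sample(n, slope, sd):
    x = rng.integers(0, 16, size=n)
    X = np.column_stack([np.ones(n), x])
    return X, np.exp(1.0 + slope * x + sd * rng.standard_normal(n))

X_t, y_t = make_sample(400, 0.06, 0.4)
X_tp, y_tp = make_sample(650, 0.10, 0.6)

y_returns, y_residuals = earnings_counterfactuals(X_t, y_t, X_tp, y_tp)
print(np.log(y_t).var(), np.log(y_returns).var(), np.log(y_residuals).var())
```

Both counterfactual vectors have one entry per period-t individual, so any inequality or poverty statistic can be recomputed on them and compared with the same statistic on the actual period-t earnings.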
Indeed, one reason the distribution of household income may change over time is that members may change occupation (note 20). In an imperfect labor market, moreover, it may also be necessary to take into account the segment of the labor market in which active people work. The model presented below incorporates these various aspects in the specification of the function $G(V, W, \varepsilon; \Omega)$.

The first component of the model is an identity that defines income per capita in a household h, with $n_h$ persons in it:

$$y_h = \frac{1}{n_h} \left[ \sum_{i=1}^{n_h} \sum_{j=1}^{J} I_{hi}^{j}\, y_{hi}^{j} + y_h^{se} + y_h^{0} \right]. \qquad (2.17)$$

In this expression, household income is defined as the aggregation of the earnings $y_{hi}^{j}$ across individual members i and activities j, of joint household self-employment income $y_h^{se}$, and of unearned income such as transfers or capital income, $y_h^{0}$. Individual earnings may come from different activities, j = 1, 2, ..., J. The variables $I_{hi}^{j}$ are indicator variables that take the value 1 if individual i participates in earning activity j, and 0 otherwise. The set of activities may differ across studies. In studies in which self-employment income is reported at the individual level, this set essentially comprises wage work or self-employment, both full- and part-time, and possibly a combination of part-time wage work and self-employment. In studies in which self-employment income is reported at the household level, being employed in the family business may be taken as an additional activity, J + 1, whereas J would include full-time or part-time wage work, possibly combined with part-time work in the family business. Since some of these alternative occupations involve both wage work and self-employment, the occupations in the J + 1 set are defined so as to be mutually exclusive. It is thus the case that $\sum_{j=1}^{J+1} I_{hi}^{j} = 0$ or 1, with 0 corresponding to inactivity.

The allocation of individuals across these J or J + 1 activities is represented through a multinomial logit model.
It is well known (see McFadden 1974) that this model may be specified in the following way:

$$I_{hi}^{s} = 1 \ \text{ if } \ Z_{hi}\,\Omega^{Ls} + \varepsilon_i^{Ls} > \operatorname{Max}\left(0,\ Z_{hi}\,\Omega^{Lj} + \varepsilon_i^{Lj}\right), \quad j = 1, \ldots, J+1,\ j \neq s$$
$$I_{hi}^{s} = 0 \ \text{ for all } s = 1, \ldots, J+1 \ \text{ if } \ Z_{hi}\,\Omega^{Ls} + \varepsilon_i^{Ls} \leq 0 \ \text{ for all } s = 1, \ldots, J+1 \qquad (2.18)$$

where $Z_{hi}$ is a vector of characteristics specific to individual i and household h, $\Omega^{Ls}$ are vectors of coefficients, and $\varepsilon_i^{Ls}$ are random variables identically and independently distributed across individuals and occupations according to the law of extreme values. Within a discrete utility-maximizing framework, $Z_{hi}\,\Omega^{Ls} + \varepsilon_i^{Ls}$ is to be interpreted as the utility associated with occupation s, with $\varepsilon_i^{Ls}$ standing for unobserved utility determinants of occupation s and the utility of inactivity being arbitrarily set to 0 (note 21). Note, however, that this interpretation in terms of utility-maximizing behavior is not fully justified because occupational choices may actually be constrained by the demand side of the market, as in the case of selective rationing, rather than by individual preferences.

Observed heterogeneity in earnings in each occupation j can be described by a log-linear model reminiscent of the well-known Mincer model:

$$\log y_{hi}^{j} = X_{hi}\,\Omega^{wj} + \sigma^{wj}\,\varepsilon_{hi}^{wj} \quad \text{for } i = 1, \ldots, n_h \qquad (2.19)$$

where $X_{hi}$ is a vector of individual characteristics, $\Omega^{wj}$ a vector of coefficients, and $\varepsilon_{hi}^{wj}$ a random variable supposed to be distributed identically and independently across individuals and occupations, according to the standard normal law. Under those conditions, $\sigma^{wj}$ is to be interpreted as the unobserved heterogeneity of individual earnings in occupation j. To simplify, earnings functions are often assumed to differ across activities only through the intercepts, so that all components of $\Omega^{wj}$ but one are identical across occupations and $\sigma^{wj} = \sigma^{w}$.

Finally, self-employment income at the household level is assumed to be given by

$$\operatorname{Log} y_h^{se} = \left[\, Y_h,\ \sum_i I_{hi}^{se},\ \frac{\sum_i I_{hi}^{se}\, X_{hi}}{\sum_i I_{hi}^{se}} \,\right] \cdot \Omega^{se} + \sigma^{se}\,\varepsilon_h^{se}. \qquad (2.20)$$

The first component of the vector in brackets is a set of household characteristics, including available assets in the self-employment activity. The second component is the number of family members involved in that activity, and the third is a vector that corresponds to their average individual characteristics. As before, $\Omega^{se}$ is a vector of coefficients, and $\varepsilon_h^{se}$ is a random variable distributed as a standard normal. Thus, $\sigma^{se}$ stands for the unobserved heterogeneity of household-level self-employment income.

The model is now complete. Together, equations 2.17 to 2.20 give a full description of household income-generation behavior and correspond to the function G( ) discussed earlier. The (V, W) variables are now replaced by the X, Y, and Z characteristics of households and household members. Parameters $\Omega$ are all the coefficients included in ($\Omega^{Lj}$, $\Omega^{wj}$, $\Omega^{se}$), and random variables $\varepsilon$ are the residual terms in the occupational-choice model, $\varepsilon^{Lj}$; the earning equations, $\varepsilon^{wj}$; and the self-employment function, $\varepsilon^{se}$. The only difference with respect to the general parametric formulation discussed earlier is that the parameterization now extends to the distribution of the random variable terms. These terms are now assumed to be distributed according to some prespecified law, with parameters given by the standard errors ($\sigma^{w}$, $\sigma^{se}$) in the case of the normal distributions for ($\varepsilon^{wj}$, $\varepsilon^{se}$). This parameterization of the distribution of random terms introduces some approximation in the decomposition methodology. However, because the normal distribution fits rather well with distributions of (log) earnings or self-employment income, the approximation error is likely to be small.

Econometric estimates of all parameters ($\hat{\Omega}^{Lj}$, $\hat{\Omega}^{wj}$, $\hat{\Omega}^{se}$), of the standard errors ($\hat{\sigma}^{wj}$, $\hat{\sigma}^{se}$), and of individual residual terms ($\hat{\varepsilon}^{Lj}$, $\hat{\varepsilon}^{wj}$, $\hat{\varepsilon}^{se}$) may be obtained on the basis of samples of observations available in t and $t'$.
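The accounting identity in equation 2.17 is straightforward to express in code. The following is a minimal sketch with invented numbers; `household_income_per_capita` is a hypothetical helper, and the check that each member holds at most one activity reflects the mutual exclusivity of occupations described above.

```python
import numpy as np

def household_income_per_capita(participation, earnings,
                                self_emp_income, unearned_income):
    """Per capita income of one household, following identity 2.17.

    participation: (n_members, J) array of 0/1 activity indicators
    earnings:      (n_members, J) array of earnings in each activity
    self_emp_income, unearned_income: household-level scalars
    """
    participation = np.asarray(participation, dtype=float)
    earnings = np.asarray(earnings, dtype=float)
    if np.any(participation.sum(axis=1) > 1):
        raise ValueError("each member may hold at most one activity")
    n_members = participation.shape[0]
    labor_income = (participation * earnings).sum()
    return (labor_income + self_emp_income + unearned_income) / n_members

# Hypothetical household: three members, two wage activities (J = 2).
participation = [[1, 0],   # member 0 works in activity 1
                 [0, 1],   # member 1 works in activity 2
                 [0, 0]]   # member 2 is inactive
earnings = [[100.0, 80.0],
            [60.0, 50.0],
            [30.0, 20.0]]

income_pc = household_income_per_capita(participation, earnings,
                                        self_emp_income=30.0,
                                        unearned_income=20.0)
print(income_pc)
```

In a microsimulation, the participation matrix and the earnings entries would come from the occupational-choice and earnings equations evaluated with the chosen counterfactual parameters; only the aggregation step is shown here.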
Then the parametric decomposition technique described in the preceding section may be applied, after substituting ($\hat{\sigma}^{wj}$, $\hat{\sigma}^{se}$) for the distributions $\pi(\,)$ and $\mu(\,)$. Typically, the model described in equations 2.17 through 2.20 is evaluated for each household in the sample of period t after substituting the parameters ($\hat{\Omega}^{Lj}$, $\hat{\Omega}^{wj}$, $\hat{\Omega}^{se}$), or a subset of them, by their counterpart in period $t'$. This microsimulation exercise is less simple than the derivation of counterfactual distributions in the case of individual earnings but does not involve any particular difficulty.

Some issues concerning the econometric estimation of the model are discussed in the next section. Yet an important point must be stressed at this stage. The estimates of the earnings functions (equation 2.19) and self-employment functions (equation 2.20) are based on subsamples of individuals and households with nonzero earnings or income in the corresponding activity, which requires controlling for selection biases. The residual terms ($\hat{\varepsilon}^{wj}$, $\hat{\varepsilon}^{se}$) are directly observed only for those individuals or households with nonzero earnings or self-employment income. Simulating the complete household income model (equations 2.17 to 2.20) requires that an estimate be available for every random term ($\varepsilon^{wj}$, $\varepsilon^{se}$). For instance, it is possible that individual i in household h who is observed as inactive in period t would become a wage worker when the coefficients of year $t'$, $\hat{\Omega}^{Lj}_{t'}$, are used in the occupational model (equation 2.18). The earnings to be imputed to that individual in this counterfactual experiment are given by equation 2.19. The first part on the right-hand side of that equation is readily evaluated, but some value must be given to the corresponding random term $\varepsilon_{hi}^{wj}$, because it is not observed. A simple solution consists of drawing that value randomly in a standard normal distribution.
Strictly speaking, doing so involves drawing from conditional distributions rather than from an unconditional standard normal distribution, because of the obvious endogenous selection of people into the various types of occupations (see below for more detail). Note also that the same remark applies to the residual terms, $\varepsilon^{Lj}$, which are also unobserved. They must be drawn from extreme value distributions in a way that is consistent with observed occupational choices.

The preceding specification of the income-generation model may appear unnecessarily general. The reason for such a general formulation is that it encompasses different specifications used in the case studies in this book. Each of these specifications is individually simpler than the preceding general model in some aspects and slightly more complicated in others. A simplification common to all case studies is that both the occupational model (equation 2.18) and the individual earnings equation (equation 2.19) are logically defined on household members of working age. Another important simplification is that individual and household self-employment income are never observed simultaneously. Thus, equation 2.20 is irrelevant when self-employment income is observed at the individual level, and equation 2.19 is estimated only for wage employment (rather than allowing for individual self-employment) when self-employment income is registered at the household level. Additional complexity arises from the facts that (a) some studies rely on earnings functions that differ across labor-market segments (defined by gender and by rural and urban areas) and (b) most studies rely on different occupational-choice models for household heads, spouses, and other household members of working age. Those variations do not modify the underlying logic of the income-generation model (equations 2.17 to 2.20). They were ignored in the preceding discussion for the sake of notational simplicity.
At the same time, they show how rich the representation of the income-generation model summarized by the function G( ) can be.

Before turning to some econometric issues linked to the estimation of the model, we should say a word about the specification adopted for the second stage of decomposition, that is to say, the function H( ), which relates the set of variables V to those in W and $\eta$. Two characteristics are treated as conditional at this second stage: individual education and the number of children in the household. The conditional distribution of the latter variable is represented through a multinomial logit, as in equation 2.18:

$$nc_h = m \ \text{ if } \ Y_h^{D}\,\Phi^{Nm} + \eta_h^{Nm} > \operatorname{Max}\left[0,\ Y_h^{D}\,\Phi^{Nj} + \eta_h^{Nj}\right], \quad j = 1, \ldots, M,\ j \neq m$$
$$nc_h = 0 \ \text{ if } \ Y_h^{D}\,\Phi^{Nj} + \eta_h^{Nj} \leq 0 \ \text{ for all } j = 1, \ldots, M \qquad (2.21)$$

where $Y_h^{D}$ is a subset of household and individual characteristics: essentially the age, the education level, and the region of residence of the household head and of his or her spouse, if present. Here $\Phi^{Nj}$ is a vector of coefficients, $\eta_h^{Nj}$ are independent random variables distributed according to the law of extreme values, and M is some upper limit on the number of children. Likewise, the number of years of schooling, $X_{hi}^{E}$, of an individual i in household h is related to some simple demographic variables $X_{hi}^{D}$, such as age, gender, and region of residence, through the same type of multinomial logit specification:

$$X_{hi}^{E} = s \ \text{ if } \ X_{hi}^{D}\,\Phi^{Es} + \eta_{hi}^{Es} > \operatorname{Max}\left[0,\ X_{hi}^{D}\,\Phi^{Ej} + \eta_{hi}^{Ej}\right], \quad j = 1, \ldots, S,\ j \neq s$$
$$X_{hi}^{E} = 0 \ \text{ if } \ X_{hi}^{D}\,\Phi^{Ej} + \eta_{hi}^{Ej} \leq 0 \ \text{ for all } j = 1, \ldots, S \qquad (2.22)$$

where $\Phi^{Ej}$ is a matrix of coefficients, $\eta_{hi}^{Ej}$ a set of independent random variables distributed according to the law of extreme values, and S the maximum number of years of schooling (note 22).

The preceding multinomial logit specification is not particularly restrictive.
As before, applying the microsimulation methodology to this specification amounts to modifying the distribution of education or family size conditionally on demographic characteristics, by replacing the coefficients $\Phi$ estimated for period t with those for period $t'$ in the preceding conditional system. Doing so requires drawing values for the residual variables, $\eta$, in a way that is consistent with observed choices. But then, it may readily be seen that this is equivalent to changing the distribution of education or family sizes through simple rank-preserving transformations, conditionally on demographic characteristics.

It is worth concluding the discussion of the income-generation model used in the rest of this book with an important warning on the epistemological nature of this decomposition exercise. It will have been noted that equations 2.21 and 2.22 are not proper economic models of fertility or schooling. They are purely statistical models, aimed at representing in a simple way the distribution of some variable conditionally on others, thus enabling us to perform the switches required by the methodology for decomposing distributional changes in a manner consistent with the covariance patterns observed in the data. To some extent, the same may be said of the income-generation model shown in equations 2.17 through 2.20. Earnings or income equations 2.19 and 2.20 might be interpreted as the outcome of the labor market and self-employment production. In that sense, there is something of an economic model behind these equations. This is not true, however, of system 2.18, which describes the allocation of individuals across occupations. If this discrete choice specification were to be taken as a structural model of labor supply, then it would be necessary to explicitly introduce the wage rate or productivity of self-employment in that specification, as well as to introduce nonlabor income.
Instead, equation 2.18 should be seen as a reduced-form specification. Comparing it at two points in time provides information on the identity of the individuals who modified their occupation over time, but not on the reasons they did so.

It would thus be incorrect to rely on counterfactual distributions where only earnings equations are modified to identify the total distributional effect of changes in wages. Only the direct effects can be captured in this way. Indirect effects that operate through the impact of these wage changes on labor supply cannot be identified separately from changes in the occupational structure of the labor force. Without a structural specification of occupational choices in place of the reduced form (equation 2.18) (note 23), and without additional economywide modeling, there is unfortunately no solution to this identification problem. It is important to keep this "partial equilibrium" nature of the decomposition methodology in mind when analyzing the results obtained in the case studies in this book (note 24).

Some General Econometric Issues

Estimating the complete household income model (equations 2.17 to 2.20) in its general form above would be a formidable undertaking, for several reasons. First, all the equations of the model clearly should be estimated simultaneously, with nonlinear estimation techniques, because of the discrete occupational-choice model and because of the likely correlation among the unobservable terms in the various equations. In particular, if the allocation of individuals across occupations is in some sense consistent with utility maximization, then the random term $\varepsilon^{L}$ cannot be considered independent from the random terms in the earnings and self-employment equations, $\varepsilon^{w}$ and $\varepsilon^{se}$. Indeed, if an individual finds a salaried job with higher earnings than individuals who have the same observable characteristics, he or she is likely to be observed in that job, too.
Although extremely intricate, such simultaneous estimation might be manageable, probably under some simplifying assumptions, if every household comprised a single individual. But the obvious correlation across the earnings equations and labor-supply equations of the working-age members of the same household, the number of which varies across households, makes things hopelessly complicated. An additional risk is that the estimates obtained with such a complex econometric specification might not be robust. They might, in particular, show artificially high time variability, thus jeopardizing the decomposition principle shown above.

The microeconometric estimation work undertaken in the case studies reported in this volume relies on a simplified, but possibly more robust, specification, based on the following three principles:

1. Individual earnings functions and household self-employment functions, if applicable, are estimated separately and consistently through the instrumentation of endogenous right-hand-side variables and the usual two-step Heckman correction for selection bias. This standard correction for selection bias allows us to draw the unobserved residual terms, $\varepsilon^{w}$ and $\varepsilon^{se}$, of those individuals with no earnings (or households with no self-employment income) in the appropriate conditional distribution. In particular, it accounts for the fact that the latter should logically expect earnings and self-employment income that are smaller than those of people who are actually observed in a wage-earning job or a self-employment activity. Yet we do not attempt to link this selection bias correction procedure and the drawing of residuals in the earnings and self-employment income equations to the estimation of the occupational-choice model and to the drawing of residuals in that model (note 25). This is unlikely to be a problem if no significant bias is present in the earnings and self-employment equations, as occurs in most cases.
It is less satisfactory, of course, when the bias is strongly significant.

2. The simultaneity between household members' labor-supply decisions is taken into account by considering the behavior of household heads and that of the other members sequentially, as conventionally done in much of the labor-supply literature. Thus, the occupational decision of the household head is estimated first with the preceding multinomial logit model, using both the general exogenous characteristics of the household and those of all household members as explanatory variables. Second, the labor-supply and occupation decision of other members is estimated conditionally on the decision made by the head of household and possibly on his or her income. In addition, different models were sometimes estimated depending on the position of a person in the family. Indeed, it seems natural that, other things being equal, the spouse does not behave in the same way with respect to labor supply as the daughter of the head of household. The categories for which distinct labor-supply models were estimated include spouses, sons, daughters, and other household members.

3. The drawing of residual terms in the multinomial logit model raises some difficulties. First, none of the error terms is actually observed. What is observed is that the J + 1 random terms lie in some region of $\mathbb{R}^{J+1}$, such that all the inequality conditions are satisfied for the observed choice $I_{hi}$ in system 2.18. Specifically, if individual i is observed in occupation 2, rather than in any of the other J occupations ($j \neq 2$) that he or she might have chosen, then the vector of $\varepsilon_i^{Lj}$ must be such that

$$Z_{hi}\,\Omega^{L2} + \varepsilon_i^{L2} > Z_{hi}\,\Omega^{Lj} + \varepsilon_i^{Lj}, \quad j \neq 2, \quad \text{and} \quad Z_{hi}\,\Omega^{L2} + \varepsilon_i^{L2} > 0.$$
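One way to obtain residuals satisfying conditions of this kind is accept-reject sampling: draw all J + 1 extreme-value terms independently and keep the first draw that reproduces the observed choice. The sketch below assumes standard extreme-value (Gumbel) draws generated as −log[−log(x)] from uniform x; the utility values, the function name `draw_consistent_residuals`, and the `max_tries` safeguard are illustrative assumptions.

```python
import numpy as np

def draw_consistent_residuals(utilities, observed, rng, max_tries=100_000):
    """Draw extreme-value residuals consistent with an observed choice.

    utilities: deterministic parts Z*Omega for the J+1 occupations
    observed:  index of the observed occupation, or None for inactivity
    Returns one vector of residuals for which the choice conditions hold.
    """
    utilities = np.asarray(utilities, dtype=float)
    for _ in range(max_tries):
        x = rng.uniform(size=utilities.size)
        eps = -np.log(-np.log(x))            # independent extreme-value draws
        total = utilities + eps
        if observed is None:
            accepted = np.all(total <= 0)     # inactivity: no positive utility
        else:
            others = np.delete(total, observed)
            accepted = total[observed] > 0 and np.all(others < total[observed])
        if accepted:
            return eps
    raise RuntimeError("no consistent draw found; increase max_tries")

rng = np.random.default_rng(42)
utilities = [0.5, -0.2, 0.1]                  # hypothetical values, J + 1 = 3
eps_occupation = draw_consistent_residuals(utilities, observed=1, rng=rng)
eps_inactive = draw_consistent_residuals(utilities, observed=None, rng=rng)
print(eps_occupation, eps_inactive)
```

Rejection is simple but can be slow when the observed choice has a low predicted probability, since most draws are then discarded.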
Drawing consistent values for these residual terms essentially consists of independently drawing J + 1 values in the law of extreme values and checking whether they satisfy the above condition for the observed $I_{hi}$, that is, the occupation observed for individual i in household h. Drawings for which these conditions are not satisfied are discarded, and the operation is repeated until a (single) set of values is drawn such that the conditions in system 2.18 hold (note 26).

Finally, combined with the random drawing of residual terms for the potential earnings and self-employment incomes of individuals not observed in such an activity, this procedure for drawing multinomial logit residuals implies that any counterfactual distribution generated by the microsimulation of the model is, in effect, random. This is not too great a problem if the microsimulation relies on a sufficiently large number of observations. For this practical reason, the law of large numbers was assumed to hold in the case studies gathered in this book. If that were not the case, one should repeat each microsimulation a large number of times, so as to obtain a distribution of counterfactual distributions. In the context of the large sample sizes available to the case studies in the chapters that follow, the computation time necessary to generate these Monte Carlo experiments was generally judged excessive. How much the results of single-draw simulations differ from analogous Monte Carlo microsimulations remains an interesting question for further research.

Another concern that is left for future research is perhaps even more basic. Estimates of distributions in this book, whether they are scalar measures or quantile interval means in some curve, are derived from samples and are thus subject to sampling error.
Ideally, therefore, one would present confidence intervals for the various statistics and seek to determine their implications for the estimated counterfactual distributions. Recent analytical and software developments in the realm of inference for stochastic dominance may be a promising avenue for further investigation of this important issue (see, for instance, Davidson and Duclos 2000). As microeconomic simulation research evolves, a more rigorous treatment of its statistical inference properties is certain to become necessary.

Notes

1. A powerful semiparametric method for constructing counterfactual distributions that is very similar in spirit to the parametric alternative we use here has been proposed by DiNardo, Fortin, and Lemieux (1996). We return to it later in this chapter.
2. In theory, using income or consumption per capita as a welfare measure should not make any difference for a number of methods discussed in this chapter. Yet the parametric model discussed later is definitely better suited to an income view of welfare. Hence, this chapter generally refers to income distribution or income inequality, rather than to their consumption expenditure counterparts.
3. They were both interested in earnings discrimination across individual characteristics such as gender or race. Therefore, the populations they considered were defined by some given sociodemographic characteristic. Conceptually, this is no different from considering two populations at two different points in time, as in what follows.
4. To avoid this problem, some authors use the mean characteristics across periods t and t′ to evaluate the price effect and use the time average of prices to evaluate the endowment effect. It will be seen later that such efforts are an application of a more general method to deal with path dependence.
5. For an introduction to decomposable inequality measures, see Cowell (2002) and the references therein.
6. To see this, note that the mean income in group g is such that $y_g = y \cdot m_g / n_g$.
7. The approximation in equation 2.4 tends to an equality as the changes become infinitesimally small.
8. Note that the change in population group weights is also present in the change in the overall mean, but this point is overlooked for the sake of simplicity.
9. But, of course, the Oaxaca-Blinder method could also be cast in terms of groups' means and group weights, rather than in terms of a linear income model.
10. Conversely, the inequality decomposition is path independent.
11. Two exceptions are Fields and O'Hara (1996) and Morduch and Sicular (2002). In both cases, however, the authors ignore the preceding point and the need to handle the joint distribution rather than the marginal distributions of income by sources.
12. The concept of subgroup decomposability was first introduced by Foster, Greer, and Thorbecke (1984). For a discussion of the normative implications of this property, see Sen (1997, appendix).
13. Poverty changes can, of course, be decomposed into a growth component (changes in means) and an inequality component (changes in Lorenz curves). See Datt and Ravallion (1992). But this decomposition is not analogous to a decomposition into price effects, endowment effects, and behavioral changes, because both components are influenced by all three effects.
14. One way to investigate how small these differences are, and to address the problem of path dependence, would be to consider a large number of paths and to estimate the "average" contribution of a particular change over them. Shorrocks (1999) provides a formal definition of the appropriate "averaging" concept, on the basis of Shapley values.
15. A general formulation of these various decomposition paths is given in Bourguignon, Ferreira, and Leite (2002).
16. See also the semiparametric technique proposed by Donald, Green, and Paarsch (2000).
17. An alternative would be to use the Shapley-value approach referred to in note 14.
18. Because the simulation actually bears on micro data rather than aggregate data, this operation is often referred to as microsimulation.
19. For situations in which selection into the sample differs across t and t′ (say, because participation behavior has changed), an alternative approach exists for generating a counterfactual distribution of residuals. This approach, discussed in Cunha, Heckman, and Navarro (2004), relies on factor analysis (and a number of assumptions) to decompose the variance of residuals into a component due to predictable individual heterogeneity and another due to pure uncertainty (or "luck"). Such a decomposition would enable one to consider estimates of "unobserved" individual fixed effects separately from pure randomness.
20. This change in the population of earners because of changing labor-force participation behavior was only implicit in the preceding analysis of individual earnings. It was simply part of the endowment effect or, in other words, the change in the sociodemographic characteristics of the active population.
21. Ex ante, the probability that individual i of household h takes occupation s is given by the following:
$$P^{s}(Z_{hi}, \lambda^{L}) = \frac{e^{Z_{hi}\lambda^{Ls}}}{1 + \sum_{j} e^{Z_{hi}\lambda^{Lj}}},$$
whereas the probability of inactivity, $P^{0}(Z_{hi}, \lambda^{L})$, is such that all probabilities sum to unity.
22. The multinomial logit specification is also compatible with schooling being defined by achievement levels, rather than by number of years.
23. The occupational models are not always in pure reduced form. For instance, many case studies model the occupational choice of spouses or secondary household members as a function of the income of the household head, as in much of the standard labor-supply literature.
Such studies make it possible to account for the typically structural effect of a change in the occupation or earnings of the household head on the occupation of other household members.
24. The same caveat about the partial equilibrium nature of the exercise applies to the original Oaxaca-Blinder decomposition; to the semiparametric approach of DiNardo, Fortin, and Lemieux (1996); and, indeed, to all other approaches previously reviewed.
25. An equivalent to the well-known Heckman two-stage procedure for the correction of selection bias in the case of a dichotomous choice represented by a probit exists with polychotomous choice and the multinomial logit model (see Lee 1983). Yet this method has been shown to be problematic (see Bourguignon, Fournier, and Gurgand 2002; Schmertmann 1994).
26. Specific $\epsilon^{L}_{i}$ terms can be obtained as $\epsilon^{L}_{i} = -\log[-\log(x)]$, where x is a random draw in a uniform distribution in [0, 1]. An alternative method is proposed in Bourguignon, Fournier, and Gurgand (2001).

References

Blinder, Alan S. 1973. "Wage Discrimination: Reduced Form and Structural Estimates." Journal of Human Resources 8(Fall): 436–55.
Bourguignon, François. 1979. "Decomposable Income Inequality Measures." Econometrica 47: 901–20.
Bourguignon, François, Francisco Ferreira, and Phillippe Leite. 2002. "Beyond Oaxaca-Blinder: Accounting for Differences in Household Income Distributions across Countries." Policy Research Working Paper 2828. World Bank, Washington, D.C.
Bourguignon, François, Martin Fournier, and Marc Gurgand. 2001. "Fast Development with a Stable Income Distribution: Taiwan, 1979–1994." Review of Income and Wealth 47(2): 1–25.
------. 2002. "Selection Bias Correction Based on the Multinomial Logit Model." Working paper. DELTA, Paris.
Cowell, Frank A. 1980. "On the Structure of Additive Inequality Measures." Review of Economic Studies 47: 521–31.
------. 2002. "Measurement of Inequality." In Anthony Atkinson and François Bourguignon, eds., Handbook of Income Distribution, Vol. 1. Amsterdam: Elsevier.
Cunha, Flávio, James Heckman, and Salvador Navarro. 2004. "Counterfactual Analysis of Inequality and Social Mobility." University of Chicago. Processed.
Datt, Gaurav, and Martin Ravallion. 1992. "Growth and Redistribution Components of Changes in Poverty Measures." Journal of Development Economics 38: 275–95.
Davidson, Russell, and Jean-Yves Duclos. 2000. "Statistical Inference for Stochastic Dominance and for the Measurement of Poverty and Inequality." Econometrica 68(6): 1435–64.
DiNardo, John, Nicole Fortin, and Thomas Lemieux. 1996. "Labor Market Institutions and the Distribution of Wages, 1973–1992: A Semi-Parametric Approach." Econometrica 64(5): 1001–44.
Donald, Stephen, David Green, and Harry Paarsch. 2000. "Differences in Wage Distributions between Canada and the United States: An Application of a Flexible Estimator of Distribution Functions in the Presence of Covariates." Review of Economic Studies 67: 609–33.
Fields, Gary, and Jennifer O'Hara. 1996. "Changing Income Inequality in Taiwan: A Decomposition Analysis." Cornell University, Ithaca, New York. Processed.
Foster, James, Joel Greer, and Erik Thorbecke. 1984. "A Class of Decomposable Poverty Measures." Econometrica 52: 761–65.
Juhn, Chinhui, Kevin Murphy, and Brooks Pierce. 1993. "Wage Inequality and the Rise in Returns to Skill." Journal of Political Economy 101: 410–42.
Lee, Lung-Fei. 1983. "Generalized Econometric Models with Selectivity." Econometrica 51: 507–12.
McFadden, Daniel L. 1974. "Conditional Logit Analysis of Qualitative Choice Behavior." In Paul Zarembka, ed., Frontiers in Econometrics. New York: Academic Press.
Mookherjee, Dilip, and Anthony Shorrocks. 1982. "A Decomposition Analysis of the Trend in U.K. Income Inequality." Economic Journal 92: 886–902.
Morduch, Jonathan, and Terry Sicular. 2002. "Rethinking Inequality Decomposition, with Evidence from Rural China." Economic Journal 112: 93–106.
Oaxaca, Ronald. 1973. "Male-Female Wage Differentials in Urban Labor Markets." International Economic Review 14: 673–709.
Schmertmann, Carl P. 1994. "Selectivity Bias Correction Methods in Polychotomous Sample Selection Models." Journal of Econometrics 60: 101–32.
Sen, Amartya. 1997. On Economic Inequality. Oxford, U.K.: Clarendon Press.
Shorrocks, Anthony. 1980. "The Class of Additively Decomposable Inequality Measures." Econometrica 48: 613–25.
------. 1982. "Inequality Decomposition by Factor Components." Econometrica 50(1): 193–211.
------. 1999. "Decomposition Procedures for Distributional Analysis: A Unified Framework Based on the Shapley Value." University of Essex, Essex, U.K. Processed.

3 Characterization of Inequality Changes through Microeconometric Decompositions: The Case of Greater Buenos Aires

Leonardo Gasparini, Mariana Marchionni, and Walter Sosa Escudero

The main economic variables have oscillated widely in the past two decades in Argentina in association with deep macroeconomic and structural transformations. After reaching a peak of 172 percent monthly in 1989, the inflation rate fell to less than 1 percent a year within a few years; gross domestic product fell drastically at the end of the 1980s and then grew at unprecedented rates in the first half of the 1990s; unemployment rose steadily from around 5 percent to 14 percent in a short period of time. Income inequality was no exception in this turbulent period. The Gini coefficient increased from 41.9 to 46.7 between 1986 and 1989, fell to 40.0 toward 1991, and rose steadily in the following seven years, reaching a record level of 47.4 in 1998.1 In recent economic history, it is difficult to find periods with such marked changes in inequality, either in Argentina or in the rest of the world.

The reasons for these changes in inequality are varied and complex.
The main aim of this chapter is to assess the relevance of some forces believed to have affected income inequality in the Greater Buenos Aires area between 1986 and 1998. More specifically, the microeconometric decomposition methodology proposed in chapter 2 is used to measure the relevance of various factors that appear to have driven changes in inequality. In particular, this methodology is used to identify to what extent changes in (a) returns to education and experience, (b) endowments of unobservable factors and their returns, (c) the wage gap between men and women, (d) labor-market participation and hours of work, and (e) the educational structure of the population contributed to the observed changes in income distribution.

The results presented in this chapter suggest that the observed similarity between the inequality indices of 1986 and 1992 is in fact the consequence of mild forces that operated in different directions but compensated for each other in the aggregate. By contrast, between 1992 and 1998, nearly all the determinants under study contributed to increased inequality. The dominating forces appear to be the increase in the returns to education; a higher dispersion in the endowments of, or in the returns to, unobservable factors; and the dramatic fall in the hours of work of less skilled, low-income people. Perhaps surprisingly, neither the narrowing of the gender wage gap nor the increase in the average education of the population was a significant equalizing factor. In addition, the dramatic jump in unemployment in the 1990s does not appear to have had a very significant direct effect on household income inequality.

The rest of this chapter is organized as follows.
The basic facts and some issues that might have affected inequality in the past two decades are described first. Next, the decomposition methodology implemented to assess the relevance of those factors is presented, and the estimation strategy is explained. The main results of the analysis are then presented. The chapter concludes with some brief final comments.

Income Inequality: Basic Facts and Sources of Changes

Income inequality in Argentina has fluctuated considerably around an increasing trend initiated in the mid-1970s. Figure 3.1 shows the Gini coefficient of equivalent household income between 1985 and 1998 in the Greater Buenos Aires area.2 After a substantial increase in the late 1980s, inequality plunged in the first two years of the 1990s. A new stage of rising inequality started in 1992 and has not stopped yet. The level of income inequality reached in 1998 was the highest recorded in the Greater Buenos Aires area, at least since reliable household data sets became available.3

For simplicity, this study focused on three years of relative macroeconomic stability separated by equal intervals: 1986, 1992, and 1998. In addition, we restricted the analysis mainly to labor income (that is, wage earnings and self-employment earnings) for two reasons: (a) the Permanent Household Survey (Encuesta Permanente de Hogares, or EPH) has various deficiencies in capturing capital income, and (b) modeling capital income and retirement payments is not an easy task, especially considering the scarce information contained in the EPH. We also ignored those households whose heads or spouses were older than 65 or received retirement payments.

Figure 3.1 Gini Coefficient of Equivalent Household Income Distribution in Greater Buenos Aires, 1985–98
[Line chart. Vertical axis: Gini coefficient (38 to 48); horizontal axis: year (1985 to 1998).]
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Table 3.1 Distributions of Income in Greater Buenos Aires, Selected Years
(Gini coefficient)

| Type of distribution | 1986 | 1992 | 1998 |
|---|---|---|---|
| Earnings | 39.4 | 37.7 | 44.9 |
| Equivalent household labor income | 40.3 | 41.0 | 49.5 |

Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Table 3.1 shows the basic facts characterized in this chapter. Inequality in individual labor income and in equivalent household labor income, as measured by the Gini, did not change very much between 1986 and 1992; by contrast, both measures rose dramatically in the next six years.4,5

Countless factors may have caused the changes in inequality documented in table 3.1. We concentrate on seven: (a) returns to education, (b) the gender wage gap, (c) returns to experience, (d) unobservable factors and their returns, (e) hours of work, (f) employment, and (g) the education of the working-age population. The objective of this chapter is to estimate the sign and the relative magnitude of the effect of those factors on the distribution of earnings and the equivalent household labor income. Although microeconometric decompositions will be used toward that aim, this section begins with an analysis of the basic statistics and regressions to provide some intuitions about the results and to understand the need for, and usefulness of, a microsimulation decomposition technique.

Returns to Education

An increase in the returns to education implies a widening of the wage gap between workers with high levels of education and those with low levels of education.
This wider gap, in turn, would imply a more unequal distribution of earnings and probably a more unequal distribution of household income.6 Table 3.2 shows hourly earnings in real pesos (Arg$) for workers between 14 and 65 years old by educational level. The average wage fell 19 percent between 1986 and 1992 and increased 9.3 percent over the following six years. Changes were not uniform among educational groups. Although in the first period of the analysis the most dramatic drop in hourly earnings was for the college complete group, that group was the only one whose wages rose during the 1992–98 period (by 19.1 percent). Table 3.2 is a first piece of evidence that changes in relative wages among schooling groups implied a decrease in earnings inequality between 1986 and 1992 and an increase thereafter.

Table 3.2 Hourly Earnings by Educational Level in Greater Buenos Aires, Selected Years
(means in 1998 Arg$; changes in percent)

| Educational level | Mean 1986 | Mean 1992 | Mean 1998 | Change 1986–92 | Change 1992–98 | Change 1986–98 |
|---|---|---|---|---|---|---|
| Primary incomplete | 6.6 | 5.7 | 5.3 | -13.6 | -6.8 | -19.5 |
| Primary complete | 7.7 | 6.3 | 5.9 | -18.1 | -6.0 | -23.0 |
| Secondary incomplete | 9.2 | 6.8 | 6.6 | -26.1 | -2.8 | -28.1 |
| Secondary complete | 11.6 | 9.1 | 9.1 | -21.2 | -0.4 | -21.5 |
| College incomplete | 14.5 | 11.9 | 10.6 | -17.5 | -11.1 | -26.7 |
| College complete | 24.1 | 16.3 | 19.4 | -32.3 | 19.1 | -19.4 |
| Total | 10.4 | 8.4 | 9.2 | -19.0 | 9.3 | -11.4 |

Note: Data cover workers between ages 14 and 65 with valid answers.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Table 3.3 shows the results of Mincerian log hourly earnings functions, estimated using the Heckman procedure to correct for sample selection. The first three columns refer to household heads (mostly men), the second three columns refer to spouses (nearly all women), and the last three columns refer to other members of the family (roughly half men and half women). Because the EPH does not record years of education, we included dummy variables that capture the maximum educational level achieved.
The omitted category is primary incomplete. A gender dummy variable, age and age squared, and a dummy variable for youths younger than 18 years old (only relevant for other family members) also were included in the regression. In addition to those variables, the selection equation included marital status, number of children, and a dummy variable that takes the value 1 when the individual attends school. Following Bourguignon, Fournier, and Gurgand (2001), our analysis assumed that labor-market participation choices were made within the household in a sequential fashion. Spouses consider the labor-market status of the head of household when deciding whether to enter the labor market themselves. Other members of the family consider the labor-market status of both the head of household and the spouse before deciding to enter the labor market.

The coefficients of most educational levels are positive, significant, and increasing with the educational level; that is, the returns to education are always positive.7 For family heads in 1998, an individual who had completed primary school had an hourly wage 18 percent greater than an individual whose primary education was incomplete, if all other factors were constant. The corresponding figures for individuals whose secondary education was incomplete, those who completed secondary school, those whose college education was incomplete, and those who completed college education are 36, 65, 94, and 146 percent, respectively, all with respect to individuals who had not completed primary school. In many cases, returns to education are increasing; that is, the hourly wage gap between educational levels increases with education.8 For heads of household in 1998, the difference in wages between an individual who had completed primary school and one whose secondary education was incomplete is 18 percent, whereas the difference between an individual at the latter level and one who completed secondary school is 29 percent.
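The percentages quoted above read the education coefficients of table 3.3 directly as log-point differences. As a hedged illustration, and not part of the authors' procedure, the Python sketch below compares that reading with the exact proportional difference implied by a log-earnings coefficient, exp(β) − 1. The dictionary and function names are hypothetical; the coefficient values are the 1998 estimates for heads of household reported in table 3.3.

```python
import math

# Education dummy coefficients for heads of household, 1998 (table 3.3).
# Omitted category: primary incomplete. Names below are illustrative only.
HEAD_1998_COEFFICIENTS = {
    "primary complete": 0.1828,
    "secondary incomplete": 0.3630,
    "secondary complete": 0.6534,
    "college incomplete": 0.9382,
    "college complete": 1.4634,
}


def log_point_gap(coefficient):
    """Wage gap read as log points times 100 (the reading used in the text)."""
    return 100.0 * coefficient


def exact_percent_gap(coefficient):
    """Exact percentage wage gap implied by a coefficient in a log-earnings equation."""
    return 100.0 * (math.exp(coefficient) - 1.0)


if __name__ == "__main__":
    for level, beta in HEAD_1998_COEFFICIENTS.items():
        print(f"{level:22s} log points: {log_point_gap(beta):6.1f}   "
              f"exact: {exact_percent_gap(beta):6.1f} percent")
```

For small coefficients the two readings are close (about 18 log points against roughly 20 percent for primary complete); they diverge substantially for large coefficients such as the one for college complete.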
The greatest jump is between individuals who did not complete and who completed college: 52 percent.

Table 3.3 Log Hourly Earnings Equation Applied to Greater Buenos Aires, Selected Years

| Variable | Head of household 1986 | Head of household 1992 | Head of household 1998 | Spouse 1986 | Spouse 1992 | Spouse 1998 | Other family members 1986 | Other family members 1992 | Other family members 1998 |
|---|---|---|---|---|---|---|---|---|---|
| Log hourly earnings equation | | | | | | | | | |
| Primary complete | 0.2150 (5.496) | 0.2162 (4.011) | 0.1828 (2.978) | 0.0393 (0.496) | -0.1731 (-1.695) | 0.0575 (0.462) | 0.0407 (0.441) | 0.3349 (2.884) | 0.0417 (0.287) |
| Secondary incomplete | 0.3994 (9.206) | 0.3367 (5.661) | 0.3630 (5.620) | 0.2241 (2.342) | -0.0243 (-0.211) | 0.2306 (1.848) | 0.2278 (2.400) | 0.4361 (3.795) | 0.1366 (0.953) |
| Secondary complete | 0.6219 (12.649) | 0.6229 (10.185) | 0.6534 (9.664) | 0.5595 (6.720) | 0.2652 (2.445) | 0.4841 (3.861) | 0.4053 (3.927) | 0.5726 (4.546) | 0.3646 (2.447) |
| College incomplete | 0.9121 (15.469) | 0.9516 (12.713) | 0.9382 (12.714) | 0.6446 (5.210) | 0.5173 (3.666) | 0.6579 (4.347) | 0.5646 (5.289) | 0.7100 (5.919) | 0.6699 (4.592) |
| College complete | 1.3079 (22.778) | 1.2607 (18.242) | 1.4634 (20.282) | 0.8607 (7.824) | 0.5764 (4.183) | 0.9607 (5.600) | 0.7439 (5.577) | 0.8109 (5.432) | 0.9456 (5.830) |
| Male | 0.2915 (5.106) | 0.1834 (3.707) | 0.1675 (3.474) | -0.1865 (-0.774) | 0.2626 (1.280) | 0.2859 (1.706) | 0.0454 (0.827) | 0.0701 (1.405) | 0.1678 (3.250) |
| Age | 0.0401 (3.969) | 0.0546 (4.882) | 0.0452 (3.951) | 0.0413 (2.120) | 0.0343 (1.533) | 0.0454 (2.028) | 0.0766 (4.351) | 0.0797 (4.267) | 0.0846 (4.138) |
| Age squared | -0.0004 (-3.295) | -0.0006 (-4.661) | -0.0004 (-3.155) | -0.0005 (-2.057) | -0.0004 (-1.393) | -0.0005 (-1.813) | -0.0009 (-3.646) | -0.0009 (-3.545) | -0.0011 (-3.735) |
| Younger than 18 | | | | | | | -0.0218 (-0.250) | -0.0338 (-0.406) | -0.3601 (-2.811) |
| Constant | 0.5599 (2.400) | 0.1959 (0.806) | 0.2051 (0.792) | 1.0778 (2.554) | 1.1095 (2.283) | 0.6169 (1.178) | 0.1849 (0.577) | -0.2793 (-0.749) | -0.3190 (-0.799) |
| Selection equation (dep. var. = 1 if hourly earnings > 0) | | | | | | | | | |
| Primary complete | 0.2931 (2.240) | 0.2212 (1.429) | 0.3955 (3.052) | -0.3295 (-3.289) | -0.0513 (-0.381) | -0.1975 (-1.346) | 0.2137 (1.203) | 0.5917 (2.874) | 0.2573 (1.126) |
| Secondary incomplete | 0.3494 (2.238) | 0.5737 (2.987) | 0.4556 (3.234) | -0.1980 (-1.612) | 0.0129 (0.083) | 0.0398 (0.258) | 0.2258 (1.215) | 0.8538 (4.015) | 0.2308 (1.021) |
| Secondary complete | 0.4875 (2.580) | 0.5575 (2.827) | 0.5866 (3.736) | -0.0736 (-0.639) | 0.1556 (1.066) | 0.2299 (1.489) | 0.4315 (1.901) | 0.7899 (3.296) | 0.4376 (1.829) |
| College incomplete | 0.4760 (1.827) | 1.0563 (3.318) | 0.4125 (2.177) | 0.4776 (2.355) | 0.5239 (2.433) | 0.4153 (2.102) | 0.8123 (3.441) | 1.5396 (5.794) | 0.5657 (2.284) |
| College complete | 1.2176 (3.085) | 1.0181 (3.750) | 0.8111 (4.537) | 0.7033 (4.467) | 1.0577 (5.620) | 1.3115 (7.588) | 0.8274 (2.101) | 1.3888 (3.766) | 0.8389 (2.752) |
| Male | 0.8594 (5.175) | 0.7840 (4.001) | 0.6528 (5.263) | 1.2982 (2.235) | 1.7185 (2.970) | 1.3967 (5.353) | 0.8164 (8.451) | 0.4630 (4.703) | 0.5007 (6.182) |
| Age | 0.1099 (3.160) | 0.1141 (3.012) | 0.1045 (3.748) | 0.1288 (4.907) | 0.1757 (5.577) | 0.1203 (4.279) | 0.1960 (5.682) | 0.1686 (4.836) | 0.2719 (9.111) |
| Age squared | -0.0014 (-3.541) | -0.0016 (-3.589) | -0.0014 (-4.269) | -0.0017 (-5.117) | -0.0023 (-5.668) | -0.0015 (-4.353) | -0.0029 (-6.385) | -0.0022 (-4.832) | -0.0036 (-8.656) |
| Married | 0.1986 (1.204) | 0.1559 (0.841) | 0.0588 (0.477) | | | | -0.7813 (-4.386) | -0.4060 (-2.373) | -0.4761 (-3.360) |
| Children | -0.0087 (-0.202) | -0.0178 (-0.442) | -0.0464 (-1.387) | -0.1929 (-6.496) | -0.1797 (-5.460) | -0.1768 (-5.477) | | | |
| Younger than 18 | | | | | | | -0.5983 (-3.823) | -0.3063 (-1.925) | -0.5995 (-4.087) |
| Attending school | -0.8669 (-2.850) | -1.0407 (-3.280) | -0.5569 (-2.509) | -0.3036 (-0.963) | 0.3501 (1.040) | 0.2020 (0.900) | -1.6477 (-11.458) | -1.7389 (-11.237) | -0.9050 (-7.556) |
| Head of household employed | | | | -0.7922 (-3.982) | -0.6382 (-3.314) | -0.6148 (-4.386) | -0.0351 (-0.212) | -0.1624 (-1.112) | -0.2210 (-1.951) |
| Spouse employed | | | | | | | -0.0763 (-0.706) | -0.0005 (-0.005) | 0.0488 (0.547) |
| Constant | -1.3555 (-1.892) | -1.3567 (-1.682) | -1.3239 (-2.296) | -1.5356 (-3.015) | -2.7184 (-4.571) | -1.8346 (-3.390) | -2.6080 (-4.233) | -2.6912 (-4.358) | -4.0987 (-8.127) |
| Number of observations | 1,961 | 1,404 | 1,967 | 1,575 | 1,116 | 1,413 | 1,292 | 1,090 | 1,631 |
| Chi2 | 153.77 | 124.61 | 148.96 | 164.62 | 154.13 | 303.14 | 767.13 | 590.80 | 861.41 |
| Log likelihood | -1,888.35 | -1,368.31 | -2,281.71 | -1,311.55 | -998.04 | -1,354.19 | -841.52 | -769.56 | -1,191.27 |
| Rho | 0.2179 | 0.6786 | 0.1247 | -0.1691 | 0.0379 | -0.1035 | 0.1705 | 0.3726 | 0.3600 |
| Sigma | 0.5562 | 0.5747 | 0.6361 | 0.5603 | 0.5492 | 0.6434 | 0.4848 | 0.4770 | 0.5569 |
| Lambda | 0.1212 | 0.3900 | 0.0793 | -0.0948 | 0.0208 | -0.0666 | 0.0827 | 0.1777 | 0.2005 |

Note: Data cover all individuals between ages 14 and 65 with valid answers. Data represent Heckman maximum likelihood estimation; z values are in parentheses.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Figure 3.2 shows predicted hourly earnings for all educational levels. The first panel refers to male household heads and the second to other male household members, both with age kept constant at 40. The wage-education profiles for family heads have a marked positive slope and are almost parallel everywhere, except for the substantial increase in the slope between 1992 and 1998 in the highest educational levels. This situation certainly contributes to increased earnings inequality among household heads. For other male family members, the wage-education profile became flatter between 1986 and 1992 and substantially steeper and more convex in the following six years. The latter movement could imply a dramatic widening of the earnings gap by educational level.

Figure 3.2 Hourly Earnings–Education Profiles for Men (Heads of Household and Other Family Members), Age 40
[Two line charts, each with series for 1986, 1992, and 1998. Panel A: heads of household. Panel B: other family members. Vertical axis: hourly earnings (Arg$); horizontal axis: educational level.]
Note: Prii = primary incomplete, Pric = primary complete, Seci = secondary incomplete, Secc = secondary complete, Coli = college incomplete, Colc = college complete.
Source: Predicted hourly earnings from models in table 3.3.

Figure 3.3 shows the profiles for 40-year-old females. As in the case of men, the wage-education profiles show a decreasing slope between 1986 and 1992 and an opposite movement between 1992 and 1998.

Figure 3.3 Hourly Earnings–Education Profiles for Women (Spouses), Age 40
[Line chart with series for 1986, 1992, and 1998. Vertical axis: hourly earnings (Arg$); horizontal axis: educational level.]
Note: Prii = primary incomplete, Pric = primary complete, Seci = secondary incomplete, Secc = secondary complete, Coli = college incomplete, Colc = college complete.
Source: Predicted hourly earnings from models in table 3.3.

In summary, the changes in the returns to education appear to have been mildly inequality reducing between 1986 and 1992 and strongly inequality increasing in the next six years. Those conclusions are the most detailed we can draw with basic statistics and regressions. To get a more complete assessment of the relative significance of these effects on the income distribution, we need to go beyond this simple analysis. Later sections present a microsimulation methodology that builds from the results of this section and allows a richer analysis.

Gender Wage Gap

Table 3.4 presents mean hourly wages by gender. Wages were higher for males in every year.
In 1986, males' hourly wages were on average 16 percent higher than females' hourly wages. The gender gap narrowed to 3 percent in 1998. A conditional analysis also shows a shrinking wage gap for household heads. From table 3.3, the coefficients of the male dummy variable in the regression for household heads are always positive and significant but clearly decrease over time.9 This narrowing gender wage gap has undoubtedly been an equalizing factor for the earnings distribution.

The effect of the narrowing gender wage gap on the distribution of equivalent household labor income depends on the relative position of working women in that distribution. Two factors work in opposite directions. On the one hand, female workers are more concentrated in the upper part of the distribution than male workers (partly because of their own labor decisions), and hence a relative wage change in favor of women implies an increase in household income inequality.10 On the other hand, a proportional wage increase for all women is more relevant in low-income families because women's earnings are a more significant part of total resources in those households than in rich families. An extreme example is the disproportionate number of poor households headed by working women. The total effect of a shrinking gender wage gap on the household income distribution is then ambiguous. We need a more powerful methodology to get a more precise assessment of that effect.

Table 3.4 Hourly Earnings by Gender in Greater Buenos Aires, Selected Years
(means in 1998 Arg$; changes in percent)

| Gender | Mean 1986 | Mean 1992 | Mean 1998 | Change 1986–92 | Change 1992–98 | Change 1986–98 |
|---|---|---|---|---|---|---|
| Female | 9.3 | 8.1 | 9.0 | -12.6 | 10.2 | -3.7 |
| Male | 10.8 | 8.5 | 9.3 | -21.2 | 9.0 | -14.1 |
| Total | 10.4 | 8.4 | 9.2 | -18.9 | 9.3 | -11.4 |

Note: Data cover workers between ages 14 and 65 with valid answers.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Returns to Experience

Age is used as a proxy for experience in the labor market. The coefficients of age and age squared in the log hourly earnings equation of table 3.3 suggest an inverted U-shaped wage-age profile. The comparison between 1986 and 1998 reveals no major changes in the returns to experience. In contrast, the relevant coefficients did change in subperiods 1986–92 and 1992–98. For instance, between 1992 and 1998, the wage-age profile for heads of household and spouses changed in favor of workers older than 50. Because the mean hourly wage of this group is somewhat lower than the overall mean, in principle we expect a mild equalizing effect on the earnings distribution.11 Older workers are better placed in the distribution of equivalent household income than in the earnings distribution, perhaps because of smaller families; thus, the effect of the change in the returns to experience on that distribution is not clear.12 The results presented in this chapter help assess the quantitative relevance of those arguments.

Unobservable Factors

Earnings equations allow the estimation of returns to observable factors such as education and experience. The error term usually is interpreted as capturing the joint effect of the endowment of unobservable factors (such as individual ability) and their market value on earnings. In general, the variance of this error term captures the contribution of dispersion in unobservable factors to general inequality. Table 3.3 reports the standard deviation of the error terms of each log hourly earnings equation (labeled as "sigma"). For instance, for household heads, the standard deviation took a value of 0.56 in 1986, 0.57 in 1992, and 0.64 in 1998. The substantial increase between 1992 and 1998 also is present in the spouses' and other members' equations. According to these results, the effect of changes in unobservable factors would have been mildly unequalizing between 1986 and 1992 and substantially unequalizing in the next six-year period.
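To give a rough sense of what these standard deviations imply, the sketch below relies on a textbook result that is an assumption here and not the chapter's method: if a variable is lognormally distributed and its logarithm has standard deviation σ, its Gini coefficient equals erf(σ/2), or equivalently 2Φ(σ/√2) − 1. Applying it to the sigma values for household heads in table 3.3 illustrates how a larger residual dispersion would translate into a higher Gini for the unobservable component alone. The names `SIGMA_HEADS` and `lognormal_gini` are hypothetical.

```python
import math

# Standard deviation of the residual of the log hourly earnings equation
# for heads of household (row "Sigma" in table 3.3).
SIGMA_HEADS = {1986: 0.5562, 1992: 0.5747, 1998: 0.6361}


def lognormal_gini(sigma):
    """Gini coefficient of a lognormal variable whose log has std. dev. sigma.

    Uses the closed form G = erf(sigma / 2); the lognormal shape is an
    illustrative assumption, not something estimated in the chapter.
    """
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    return math.erf(sigma / 2.0)


if __name__ == "__main__":
    for year in sorted(SIGMA_HEADS):
        gini = 100.0 * lognormal_gini(SIGMA_HEADS[year])
        print(f"{year}: sigma = {SIGMA_HEADS[year]:.4f}, implied Gini = {gini:.1f}")
```

Under this assumption the implied Gini changes little between 1986 and 1992 and rises markedly by 1998, which matches the qualitative reading in the text; the microsimulations presented later in the chapter are the appropriate tool for the actual magnitudes.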
Hours of Work

During the period under analysis, there was a slight fall in average weekly hours of work: one hour between 1986 and 1992 and less than one-half hour in the next six years. This fall was not uniform across categories of workers. Table 3.5 classifies workers by educational level and records the average hours of work of each group. Although there is no clear pattern of changes between 1986 and 1992, the 1990s witnessed a dramatic fall in hours of work by workers with low levels of education. This change would have a nonnegligible unequalizing effect on the earnings and income distributions.

Table 3.5 Weekly Hours of Work by Educational Levels in Greater Buenos Aires, Selected Years
(means in hours per week; changes in percent)

| Educational level | Mean 1986 | Mean 1992 | Mean 1998 | Change 1986–92 | Change 1992–98 | Change 1986–98 |
|---|---|---|---|---|---|---|
| Primary incomplete | 45.7 | 45.6 | 40.2 | -0.3 | -11.7 | -12.0 |
| Primary complete | 48.5 | 46.8 | 46.5 | -3.3 | -0.8 | -4.1 |
| Secondary incomplete | 47.0 | 47.0 | 47.5 | 0.1 | 1.0 | 1.1 |
| Secondary complete | 46.9 | 45.1 | 46.7 | -3.9 | 3.5 | -0.5 |
| College incomplete | 42.7 | 41.9 | 41.8 | -1.9 | -0.1 | -2.0 |
| College complete | 42.6 | 42.3 | 42.8 | -0.5 | 1.1 | 0.5 |
| Total | 46.5 | 45.5 | 45.2 | -2.1 | -0.8 | -2.9 |

Note: Data cover workers between ages 14 and 65 with valid answers.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Figure 3.4 Weekly Hours of Work by Educational Level for Men (Heads of Household), Age 40
[Line chart with series for 1986, 1992, and 1998. Vertical axis: weekly hours of work (20 to 55); horizontal axis: educational level.]
Note: Prii = primary incomplete, Pric = primary complete, Seci = secondary incomplete, Secc = secondary complete, Coli = college incomplete, Colc = college complete.
Source: Predicted weekly hours of work from models in table 3.6.

A conditional analysis yields similar results. Figure 3.4 shows predicted weekly hours of work for male household heads from the Tobit censored data model presented in table 3.6.
Although hours of work clearly decreased between 1986 and 1998 for the less educated male household heads, changes in hours for the rest of the educational groups were only marginal.

Table 3.6 Hours of Work Equation for Greater Buenos Aires, Selected Years

| Variable | Head of household 1986 | Head of household 1992 | Head of household 1998 | Spouse 1986 | Spouse 1992 | Spouse 1998 | Other family members 1986 | Other family members 1992 | Other family members 1998 |
|---|---|---|---|---|---|---|---|---|---|
| Primary complete | 3.6994 (3.059) | 2.9998 (1.690) | 9.1412 (4.416) | -12.8134 (-3.047) | -0.8331 (-0.158) | -3.7478 (-0.694) | 9.5376 (2.186) | 16.5169 (3.094) | 9.4156 (1.467) |
| Secondary incomplete | 3.6777 (2.722) | 7.4547 (3.780) | 13.2170 (6.057) | -8.1969 (-1.583) | 1.6344 (0.270) | 5.4827 (0.973) | 6.1322 (1.352) | 19.4012 (3.583) | 9.4008 (1.484) |
| Secondary complete | 4.6707 (3.075) | 3.7789 (1.853) | 13.1584 (5.770) | -1.4443 (-0.301) | 8.1426 (1.432) | 12.0399 (2.135) | 8.5914 (1.686) | 17.9790 (3.009) | 12.4789 (1.890) |
| College incomplete | 3.1701 (1.552) | 5.7436 (2.149) | 10.8928 (3.979) | 16.6182 (2.095) | 18.4916 (2.277) | 20.2824 (2.858) | 24.1386 (4.185) | 39.7456 (5.859) | 21.8096 (3.152) |
| College complete | 1.7271 (0.985) | 5.2378 (2.255) | 13.2734 (5.535) | 21.8548 (3.546) | 32.6159 (4.806) | 36.5539 (6.181) | 8.2748 (1.156) | 23.1498 (3.108) | 13.3421 (1.777) |
| Male | 13.0310 (7.291) | 11.0772 (4.845) | 15.1987 (8.093) | 45.2329 (2.677) | 44.9860 (3.380) | 43.9907 (6.512) | 21.8135 (9.650) | 14.8407 (6.134) | 15.2718 (7.018) |
| Age | 1.5803 (4.980) | 0.9534 (2.468) | 1.3565 (3.351) | 5.4816 (4.942) | 6.5939 (5.414) | 4.4250 (4.335) | 4.7870 (6.111) | 3.4066 (4.049) | 7.5266 (9.468) |
| Age squared | -0.0212 (-5.620) | -0.0150 (-3.248) | -0.0186 (-3.895) | -0.0722 (-5.098) | -0.0850 (-5.455) | -0.0562 (-4.368) | -0.0714 (-6.934) | -0.0468 (-4.130) | -0.1020 (-8.948) |
| Married | 2.7919 (1.826) | 3.3768 (1.652) | 4.4988 (2.608) | | | | -15.8565 (-3.813) | -7.6537 (-1.818) | -12.1374 (-3.241) |
| Children | 0.2807 (0.835) | 0.0064 (0.015) | -0.4745 (-1.036) | -8.7386 (-7.070) | -7.3819 (-5.847) | -7.3587 (-6.414) | | | |
| Younger than 18 | | | | | | | -18.8104 (-4.861) | -14.6823 (-3.618) | -23.3702 (-5.426) |
| Attending school | -14.2282 (-4.665) | -16.2041 (-4.315) | -13.1575 (-3.902) | -13.9652 (-1.077) | 9.0871 (0.772) | 2.5146 (0.330) | -51.2882 (-14.003) | -54.3772 (-13.539) | -33.9044 (-10.203) |
| Head of household employed | | | | -28.5008 (-3.686) | -25.8924 (-3.669) | -19.6188 (-4.138) | -4.0485 (-1.095) | -5.6771 (-1.619) | -3.6361 (-1.224) |
| Spouse employed | | | | | | | -3.5146 (-1.327) | 1.2688 (0.478) | 0.0257 (0.011) |
| Constant | 3.5987 (0.559) | 17.5783 (2.193) | -3.6110 (-0.435) | -70.2406 (-3.282) | -99.8388 (-4.321) | -70.4622 (-3.570) | -51.8461 (-3.641) | -43.2795 (-2.895) | -108.5699 (-7.913) |
| Number of observations | 1,961 | 1,404 | 1,967 | 1,575 | 1,116 | 1,413 | 1,292 | 1,090 | 1,631 |
| Censored observations | 112 | 97 | 201 | 81 | 705 | 848 | 780 | 609 | 982 |
| Chi2 | 279.96 | 174.00 | 292.40 | 143.34 | 129.49 | 252.91 | 877.47 | 658.90 | 941.13 |
| Log likelihood | -8,148.67 | -5,880.39 | -8,369.46 | -3,111.35 | -2,502.00 | -3,352.46 | -2,769.37 | -2,602.48 | -3,576.47 |
| Pseudo R2 | 0.0169 | 0.0146 | 0.0172 | 0.0225 | 0.0252 | 0.0363 | 0.1368 | 0.1124 | 0.1163 |
| Sigma | 18.2014 | 19.6320 | 24.0450 | 45.6327 | 42.6309 | 40.6468 | 30.5604 | 31.4037 | 33.2833 |

Note: Data cover all individuals between ages 14 and 65 with valid answers. Data represent Tobit maximum likelihood estimation; t ratios are in parentheses.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Employment

Household income inequality can change not only because of changes in hours of work but also because of changes on the extensive margin of the labor market. This aspect is particularly interesting in the case of Argentina, because many analysts consider the dramatic jump in the unemployment rate in the 1990s to be the main reason for the increase in inequality.

In table 3.7, adults are grouped according to whether they are employed, unemployed, or out of the labor force (inactive). The percentage of unemployed individuals rose from 2.3 percent in 1986 to 6.5 percent in 1998.13 The major increase took place between 1992 and 1998. However, the increase in unemployment between 1986 and 1998 was accompanied by a decrease in inactivity of roughly the same magnitude.
Despite the jump in the unemployment rate, the proportion of working-age people with zero income remained roughly unchanged between 1986 and 1998. Notice that for inequality measures, it is irrelevant whether the individual has zero income because he or she is unemployed or because he or she is not looking for a job. Hence, it is not likely that aggregate changes in labor-market participation played a significant role in inequality changes.[14]

Table 3.7 Labor Status by Role in the Household in Greater Buenos Aires, Selected Years
(proportions by group, percent)

Labor status      1986    1992    1998
All
  Employed        59.4    60.9    59.5
  Unemployed       2.3     3.5     6.5
  Inactive        38.3    35.6    34.0
Head
  Employed        94.6    93.1    89.8
  Unemployed       2.0     3.1     5.2
  Inactive         3.4     3.8     5.0
Spouse
  Employed        31.7    36.8    40.1
  Unemployed       1.4     1.7     5.6
  Inactive        66.9    61.5    54.3
Other
  Employed        39.6    44.1    39.8
  Unemployed       4.0     5.9     8.8
  Inactive        56.3    50.0    51.4

Note: Data cover individuals between ages 14 and 65 with valid answers.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Table 3.7 suggests three different stories in the labor market: one for household heads, one for spouses, and one for other family members. Some household heads lost or quit their jobs, especially in the period between 1992 and 1998, becoming either unemployed or out of the labor force. By contrast, many of the spouses tried to enter the labor force between 1986 and 1992; most of them found a job, but some of them did not. Other family members were less fortunate; nearly all members of this group who started to look for a job became unemployed (or caused another employed individual to move into the unemployed category).

Education

In Argentina, as in many developing countries, substantial changes in the educational composition of the population have been taking place in recent decades. Table 3.8 presents the proportion of individuals between 14 and 65 years old by level of education.
Between 1986 and 1998, there was a strong contraction in the proportion of youths and adults with primary education (both those who completed primary schooling and those who did not). Simultaneously, the share of individuals in all other educational groups increased, particularly in the secondary complete group between 1986 and 1992 and in the college group between 1992 and 1998.

Table 3.8 Composition of Sample by Educational Level in Greater Buenos Aires, Selected Years

Educational level        1986    1992    1998
Primary incomplete       15.4    11.0     7.3
Primary complete         32.0    31.1    25.2
Secondary incomplete     26.0    26.8    30.6
Secondary complete       13.5    15.8    15.2
College incomplete        7.1     8.1    11.7
College complete          6.0     7.3    10.0

Note: Data cover individuals between ages 14 and 65 with valid answers.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

To understand the effects of these changes, one can think of overall inequality as a function of inequality between educational groups and a weighted average of inequality within educational groups. An increase in the share of a given educational group in the population can increase inequality (a) if the mean income of that group is far from the overall mean (or median) so that inequality between that group and the others grows, and (b) if inequality within that group is high so that the weighted average of inequalities within the group increases. In Argentina, the educational structure has changed in the 1990s in favor of a group with an earnings distribution with a relatively high mean and dispersion: the college group.
This change feeds the presumption of an unequalizing education effect on the earnings and income distribution, operating through both of the previously mentioned channels.[15] The first channel is linked to Kuznets's (1955) observation: if the highly educated rich are a minority and only some poor children manage to achieve the highest educational (and income) levels, it is likely that inequality grows as the average education of the population increases, at least until the highly educated group is relatively large. The second channel rests on the convexity of the returns to education, which implies higher wage dispersion for the group of highly educated people.

So far we have analyzed several factors that might have affected inequality. Although we have offered some evidence to argue for each effect, we still do not have a consistent framework to use to confirm the sign of each effect and to assess its quantitative relevance. Were changes in the returns to education really an unequalizing force? Were they really a significant force compared with other factors? The next section presents a framework to tackle these questions.

Methodology

To assess the relevance of the various factors discussed in the previous section on income inequality changes, we adapted the microeconometric decomposition methodology proposed in chapter 2 to our case.[16]

Let $Y_{it}$ be individual $i$'s labor income at time $t$, which can be written as a function $F$ of the vector $X_{it}$ of individual observable characteristics that affect wages and employment, the vector $\varepsilon_{it}$ of unobservable characteristics, the vector $\beta_t$ of parameters that determine market hourly wages, and the vector $\lambda_t$ of parameters that affect employment outcomes (participation and hours of work).

(3.1)  $Y_{it} = F(X_{it}, \varepsilon_{it}, \beta_t, \lambda_t), \quad i = 1, \ldots, N$

where $N$ is total population. The distribution of individual labor income can be represented as follows:[17]

(3.2)  $D_t = \{Y_{1t}, \ldots, Y_{Nt}\}$.
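As a concrete illustration of equations 3.1 and 3.2, the following Python sketch builds a labor income distribution from individual characteristics, unobservables, and the two parameter vectors. The functional form (log-linear hourly wages, hours censored at zero, earnings equal to hours times wage) and all names and numbers are illustrative assumptions, not the specification estimated in this chapter.

```python
import math

def labor_income(x, eps, beta, lam):
    """Y = F(X, eps, beta, lam) under an assumed, purely illustrative form."""
    # Assumed log-linear hourly wage: log w = X * beta + eps_wage
    log_wage = sum(b * v for b, v in zip(beta, x)) + eps["wage"]
    # Assumed hours rule: hours = max(0, X * lambda + eps_hours)
    hours = max(0.0, sum(c * v for c, v in zip(lam, x)) + eps["hours"])
    return hours * math.exp(log_wage)

def distribution(people, beta, lam):
    """D = {Y_1, ..., Y_N} for one population and one set of parameters."""
    return [labor_income(p["x"], p["eps"], beta, lam) for p in people]

# Hypothetical population: x = [constant, college dummy]
people = [
    {"x": [1.0, 0.0], "eps": {"wage": 0.0, "hours": 0.0}},
    {"x": [1.0, 1.0], "eps": {"wage": 0.0, "hours": -50.0}},  # does not work
]
beta_t = [1.0, 0.5]   # hypothetical wage parameters
lam_t = [40.0, 5.0]   # hypothetical hours parameters

D_t = distribution(people, beta_t, lam_t)
```

Holding `people` fixed while passing a different `beta` or `lam` to `distribution` returns a different income vector for the same population.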
We can simulate individual labor incomes by changing one or some arguments in equation 3.1. For instance, the following expression represents the labor income that individual $i$ would have earned in time $t$ if the parameters determining wages had been those of time $t'$, keeping all other things constant:

(3.3)  $Y_{it}(\beta_{t'}) = F(X_{it}, \varepsilon_{it}, \beta_{t'}, \lambda_t), \quad i = 1, \ldots, N$.

More generally, we can define $Y_{it}(k_{t'})$, where $k$ is any set of arguments in equation 3.1. Hence, the simulated distribution will be

(3.4)  $D_t(k_{t'}) = \{Y_{1t}(k_{t'}), \ldots, Y_{Nt}(k_{t'})\}$.

The contribution to the overall change in the distribution of a change in $k$ between $t$ and $t'$, holding all else constant, can be obtained by comparing equations 3.2 and 3.4. Although we can make the comparisons in terms of the whole distributions, in this chapter, we compared inequality indices $I(D)$. Therefore, the effect of a change in argument $k$ on the earnings distribution is given by

(3.5)  $I[D_t(k_{t'})] - I(D_t)$.

As discussed in the previous section, this chapter is devoted to discussing the following effects:

• Returns to education ($k = \beta_{ed}$) measures the effect of changes in the parameters that relate education to hourly wages ($\beta_{ed}$) on inequality.
• Gender wage gap ($k = \beta_{g}$) measures the effect of changes in the parameters that relate gender to hourly wages ($\beta_{g}$) on inequality.
• Returns to experience ($k = \beta_{ex}$) measures the effect of changes in the parameters that relate experience (or age) to hourly wages ($\beta_{ex}$) on inequality.
• Endowment and returns to unobservable factors ($k = \varepsilon^{w}$) measures the effect of changes in the unobservable factors and their remunerations affecting hourly wages ($\varepsilon^{w}$) on inequality.
• Hours of work and employment ($k = \lambda$) measures the effect of changes in the parameters that determine hours of work and labor-market participation ($\lambda$) on inequality.
• Education ($k = X_{ed}$) measures the effect of changes in the educational levels of the population ($X_{ed}$) on inequality.

The previous discussion refers to the distribution of earnings.
However, from a social point of view, it is more relevant to study the distribution of household income because a person's utility usually depends not on his or her own earnings but on the household income and the demographic composition of the family. Equivalent household income for each individual in household $h$ in time $t$ is defined as

(3.6)  $Y^{q}_{ht} = \dfrac{\sum_{j \in h_t} \left( Y_{jt} + Y^{0}_{jt} \right)}{\left( \sum_{j \in h_t} a_{jt} \right)^{\theta}}, \quad h = 1, \ldots, H$

where $Y^{q}$ stands for equivalent household income, $h$ indexes households, $Y^{0}$ is income from other sources, $a$ is the adult-equivalent weight of each individual, and $\theta$ is a parameter that captures household economies of scale.[18] The distribution of equivalent household income for the population of $N$ individuals can be expressed as follows:

(3.7)  $D^{q}_{t} = \{Y^{q}_{1t}, \ldots, Y^{q}_{Nt}\}$.

Changing argument $k$ to its value in $t'$ yields the following simulated equivalent household income in year $t$:

(3.8)  $Y^{q}_{ht}(k_{t'}) = \dfrac{\sum_{j \in h_t} \left( Y_{jt}(k_{t'}) + Y^{0}_{jt} \right)}{\left( \sum_{j \in h_t} a_{jt} \right)^{\theta}}, \quad h = 1, \ldots, H$.

Hence, the simulated distribution is

(3.9)  $D^{q}_{t}(k_{t'}) = \{Y^{q}_{1t}(k_{t'}), \ldots, Y^{q}_{Nt}(k_{t'})\}$.

The effect of a change in argument $k$, holding all else constant, on equivalent household income inequality is given by

(3.10)  $I[D^{q}_{t}(k_{t'})] - I(D^{q}_{t})$.

Estimation Strategy

To compute expressions 3.5 and 3.10, we need to estimate parameters $\beta$ and $\lambda$ and the residual terms $\varepsilon$. Also, because we do not have panels, we need a mechanism to replicate the structure of observable and unobservable individual characteristics of one year into the population of another year. This section is devoted to explaining the strategies to address these problems.

Estimation of $\beta$ and $\lambda$

Let $L_i$ denote the number of hours worked by person $i$ and $w_i$ be the hourly wage received. Total labor income is given by $Y_i = L_i w_i$. The number of hours of work $L_i$ comes from a utility maximization process that determines optimal participation in the labor market, whereas wages are determined by market forces.
The estimation stage specifies models for wages and hours of work, which are used in the simulation stage described earlier.

The econometric specification of the model is similar to the one used by Bourguignon, Fournier, and Gurgand (2001), which corresponds to the reduced form of the labor decisions model originally proposed by Heckman (1974). Heckman shows how it is possible to derive an estimable reduced form starting from a structural system obtained from a utility maximization problem of labor-consumption decisions.

Leaving technical details aside, the scheme proposed by Heckman has the following structure. Individuals allocate hours to work and domestic activities (or leisure) to maximize their utility subject to time, wealth, wages, and other constraints. As usual, the solution to this optimization problem can be characterized as demand relations for goods and leisure as functions of the relevant prices. Under general conditions, it is possible to invert these functions to obtain prices and wages as functions of quantities of goods and leisure consumed (or their counterpart, hours of work). In particular, the wages obtained in this fashion (denoted as $w^{*}$) are interpreted as marginal valuations of labor, which will be a function of hours of work and other personal characteristics, and represent the minimum wage for which the individual would accept work for a given number of hours. In equilibrium, if the individual decides to work, the number of hours devoted to labor should equate their marginal value $w^{*}$ with the wage effectively received. Conversely, a decision not to work is made if the marginal value is greater than the wage offered, given the individual's personal characteristics. This discussion suggests a way to determine wages demanded by individuals.
In parallel, it is possible to model market determinants of wages offered ($w^{o}$) as a function of characteristics such as years of education, experience, and age as a standard Mincer equation (Mincer 1974). In equilibrium, it is assumed that the number of hours of work adjusts to make $w^{*} = w^{o}$.

The demand-supply relations discussed so far are structural forms in the sense that they reflect relevant economic behavior in which wages offered and demanded depend on the number of hours of work. Under general conditions, it is possible to derive a reduced form for the equilibrium relations in which wages and hours of work are expressed as functions of the variables taken as exogenous. In this way, the model has two equations, one for wages ($w^{*}$) and one for the number of hours of work ($L^{*}$), and both are a function of factors taken as given that affect wages ($X_1$) and hours ($X_2$), which may or may not have elements in common. The error terms $\varepsilon_1$ and $\varepsilon_2$ represent unobservable factors that affect the determination of endogenous variables.

According to the characteristics of the problem, we observe positive values of $w$ and $L$ for a particular individual if and only if the individual actually works. If the person does not work, we only know that the offered wage is smaller than the wage demanded. Consequently, the reduced form model for wages and hours of work is specified as follows:

(3.11)  $w^{*}_{i} = X_{1i}\beta + \varepsilon_{1i}, \quad i = 1, \ldots, N$
(3.12)  $L^{*}_{i} = X_{2i}\lambda + \varepsilon_{2i}$

with

$w_i = w^{*}_{i}$ if $L^{*}_{i} > 0$
$w_i = 0$ if $L^{*}_{i} \le 0$
$L_i = L^{*}_{i}$ if $L^{*}_{i} > 0$
$L_i = 0$ if $L^{*}_{i} \le 0$

where $w_i$ and $L_i$ correspond to observed wages and hours of work, respectively. This notation emphasizes that, consistent with the data used for the estimation, observed wages for a nonworking individual are zero.

Following Heckman (1979), for estimation purposes we assume that $\varepsilon_{1i}$ and $\varepsilon_{2i}$ have a bivariate normal distribution with $E(\varepsilon_{1i}) = E(\varepsilon_{2i}) = 0$, variances $\sigma_1^2$ and $\sigma_2^2$, and correlation coefficient $\rho$.
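The censoring rule in equations 3.11 and 3.12 can be illustrated with a short simulation. The sketch below draws correlated normal errors and applies the observation rule; the parameter values, the single-regressor design, and the function names are hypothetical, and the code illustrates the data-generating process only, not the estimation procedure.

```python
import math
import random

def draw_errors(rng, sigma1, sigma2, rho):
    """Draw (eps1, eps2) from a zero-mean bivariate normal distribution."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    eps1 = sigma1 * z1
    # Mixing z1 into eps2 gives the pair correlation rho
    eps2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return eps1, eps2

def observe(x1, x2, beta, lam, eps1, eps2):
    """Apply the observation rule: wage and hours are seen only if L* > 0."""
    w_star = sum(b * v for b, v in zip(beta, x1)) + eps1
    l_star = sum(c * v for c, v in zip(lam, x2)) + eps2
    if l_star > 0:
        return w_star, l_star
    return 0.0, 0.0

rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = []
for _ in range(1000):
    e1, e2 = draw_errors(rng, sigma1=0.5, sigma2=20.0, rho=0.3)
    sample.append(observe([1.0], [1.0], [2.0], [10.0], e1, e2))

# Share of simulated individuals with positive hours of work
share_working = sum(1 for _, hours in sample if hours > 0) / len(sample)
```

Individuals with nonpositive latent hours appear in `sample` with zero wage and zero hours, which is how nonworkers enter the data used for estimation.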
The specification in equations 3.11 and 3.12 corresponds to the Tobit type III model in Amemiya's (1985) classification. Even though it is possible to estimate all the parameters using a full information maximum likelihood method, we adopted a limited information approach that has notable computational advantages. If instead of hours of work, we had information only about whether or not the individual works, the model would correspond to the type II model in Amemiya's classification, whose parameters can be estimated on the basis of a simple selectivity model. More specifically, the regression equation would be the wage equation, and the selection equation would be a censored version of the labor supply equation, simply indicating whether or not the individual works. Table 3.3 shows the estimation results of these equations for our case.

Conversely, the hours of work equation corresponds to the Tobit type I model in Amemiya's classification in which the variable is observed only if it is positive. In this case, the parameters of interest could be estimated using a standard censored regression Tobit model (see table 3.6). This strategy is consistent but not fully efficient. In any case, the efficiency loss is not necessarily significant for a small sample.

Unobservable Factors

Unobservable factors that affect wages are modeled as regression error terms of the wage equation 3.11. Their mean is trivially normalized to zero, and their variance is estimated as an extra parameter in the Heckman procedure. To simulate the effect of changes in those unobservable factors between $t$ and $t'$ on inequality, we have rescaled the estimated residuals of the wage equation of year $t$ by $\sigma_{t'}/\sigma_{t}$, where $\sigma$ is the estimated standard deviation of the wage equation.[19]

To study employment effects, the decomposition methodology requires simulating earnings for people who do not work.
Because we do not observe wages, we cannot apply equations 3.11 and 3.12 to estimate the unobservables. For each individual in that situation, we assigned as an "error term" a random draw from the bivariate normal distribution implicit in the wage-labor supply model (equations 3.11 and 3.12), whose parameters are consistently estimated by the Heckman procedure. Error terms were drawn from the bivariate normal distribution, and a prediction (based on observable characteristics, estimated parameters, and sampled errors) was computed for wages and hours worked. If the resulting prediction yields positive hours of work (a prediction inconsistent with observed behavior in this group), the error term is sampled again until nonpositive hours of work are predicted.

Individual Characteristics

For the estimation of the education effect, it is necessary to simulate the educational structure of year $t'$ on the year $t$ population. Instead of following Bourguignon, Fournier, and Gurgand (2001) and estimating a parametric equation that relates individual educational level to other individual characteristics (age and gender), we apply a rough nonparametric mechanism. We divide the adult population into homogeneous groups by gender and age and then replicate the educational structure of a given cell in year $t'$ into the corresponding cell in year $t$.

Results

This section reports the results of performing the decompositions described in the methodology using the estimation strategy outlined in the previous section. The objective is to shed light on the quantitative relevance of the various phenomena discussed earlier in this chapter on inequality changes during 1986–98.

Before we show the results, two explanations are in order. First, the decompositions are path dependent. Hence, we report the results using alternatively $t$ and $t'$ as the base year. Second, the simulations are carried out for the whole distribution.
To save space, we show only the results for the Gini coefficient. There were no significant variations when other indices were used.[20]

Tables 3.9 to 3.11 show the results both with $t$ and $t'$ as base years. Table 3.12 reports the average of these results.[21] A positive number indicates an unequalizing effect. A large number compared with the other figures in the column suggests a significant effect. For instance, the price effect of education on the earnings distribution in the 1992–98 period (column ii) is 2.9. This finding roughly means that the Gini would have increased 2.9 points if only the returns to education (that is, the coefficients of the educational dummy variables in the wage equation) had changed between those years. The number 2.9 tells us two things: (a) because it is a positive number, it implies that the returns-to-education effect increased inequality, and (b) because it is large compared with the other numbers in the column, it indicates that the change in the returns to education was a very significant factor affecting inequality in the distribution of earnings.

The rest of this section is devoted to studying the effects on the earnings and equivalent household labor income distributions of the seven factors that were discussed earlier, with the help of tables 3.9 to 3.12.

Returns to Education

Table 3.12 confirms the presumptions of the earlier section on basic facts and sources for change. Changes in the returns to education had an equalizing effect on the individual labor income distribution between 1986 and 1992 and a strong unequalizing effect over the next six years. The effects on the equivalent income distribution were similar. Over the whole period from 1986 to 1998, changes in the returns to education (in terms of hourly wages) represented an important inequality-increasing factor.

Gender Wage Gap

As expected, changes in the gender parameter of the wage equation implied an equalizing effect on the earnings distribution.
During the past decade, the gender wage gap has shrunk substantially. Given that women earn less than men, that movement had an unambiguous inequality-decreasing effect on the earnings distribution.

Table 3.9 Decompositions of the Change in the Gini Coefficient: Earnings and Equivalent Household Labor Income in Greater Buenos Aires, 1986–92

Using 1992 coefficients
                                 Earnings         Equivalent income
Indicator                      Level  Change       Level  Change
1986 observed                  39.4                40.3
1992 observed                  37.7   -1.7         41.0    0.7
Effect
1. Returns to education        38.9   -0.5         39.7   -0.6
2. Gender wage gap             38.4   -1.0         40.4    0.1
3. Returns to experience       41.5    2.1         40.0   -0.3
4. Unobservable factors        39.9    0.5         40.7    0.4
5. Hours of work               39.8    0.4         41.7    1.4
6. Employment                  39.4    0.0         40.1   -0.3
7. Education                   39.2   -0.2         40.5    0.1
8. Other factors                      -3.1                -0.1

Using 1986 coefficients
                                 Earnings         Equivalent income
Indicator                      Level  Change       Level  Change
1986 observed                  39.4   -1.7         40.3    0.7
1992 observed                  37.7                41.0
Effect
1. Returns to education        39.2   -1.5         42.2   -1.2
2. Gender wage gap             38.8   -1.1         40.9    0.1
3. Returns to experience       36.4    1.3         41.7   -0.7
4. Unobservable factors        37.2    0.5         40.7    0.3
5. Hours of work               38.8   -1.2         40.4    0.6
6. Employment                  37.6    0.1         41.0    0.0
7. Education                   38.6   -1.0         40.8    0.2
8. Other factors                       1.2                 1.2

Average changes
Indicator                      Earnings   Equivalent income
1986–92 observed                 -1.7           0.7
Effect
1. Returns to education          -1.0          -0.9
2. Gender wage gap               -1.0           0.1
3. Returns to experience          1.7          -0.5
4. Unobservable factors           0.5           0.4
5. Hours of work                 -0.4           1.0
6. Employment                     0.0          -0.1
7. Education                     -0.6           0.2
8. Other factors                 -0.9           0.5

Note: The earnings distribution includes those individuals with $Y_{it} > 0$ and $Y_{it}(k_{t'}) > 0$. The equivalent household labor income distribution includes those individuals with $Y^{q}_{it} \ge 0$ and $Y^{q}_{it}(k_{t'}) \ge 0$. Nonlabor income is not considered.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.
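The layout of table 3.9 (one panel per base year, followed by their average) can be mirrored in a toy calculation. The following Python sketch computes a returns-to-education effect on the Gini coefficient using each year in turn as the base and then averages the two, in the spirit of expression 3.5. The one-parameter earnings function, the two populations, and the return values are invented for illustration; none of the numbers comes from the EPH.

```python
import math

def gini(values):
    """Gini coefficient of a list of nonnegative incomes."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def earnings(population, returns):
    """Hypothetical earnings: hours * exp(returns * years of schooling)."""
    return [hours * math.exp(returns * school) for school, hours in population]

def education_price_effect(pop_t, pop_t1, ret_t, ret_t1):
    # Base year t: give year t's population the returns of year t'
    forward = gini(earnings(pop_t, ret_t1)) - gini(earnings(pop_t, ret_t))
    # Base year t': give year t' population the returns of year t
    backward = gini(earnings(pop_t1, ret_t1)) - gini(earnings(pop_t1, ret_t))
    return forward, backward, (forward + backward) / 2.0

# Invented data: (years of schooling, weekly hours) for each person
pop_t = [(6, 40.0), (12, 40.0), (16, 40.0)]
pop_t1 = [(6, 35.0), (12, 40.0), (16, 45.0)]

forward, backward, average = education_price_effect(pop_t, pop_t1, 0.08, 0.10)
```

The two base years give different answers because the populations differ, which is the path dependence that motivates reporting both panels and their average.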
Table 3.10 Decompositions of the Change in the Gini Coefficient: Earnings and Equivalent Household Labor Income in Greater Buenos Aires, 1992–98

Using 1998 coefficients
                                 Earnings         Equivalent income
Indicator                      Level  Change       Level  Change
1992 observed                  37.7                41.0
1998 observed                  44.9    7.2         49.5    8.5
Effect
1. Returns to education        40.8    3.2         43.8    2.7
2. Gender wage gap             37.3   -0.4         41.0    0.0
3. Returns to experience       36.8   -0.9         41.9    0.8
4. Unobservable factors        39.9    2.2         42.8    1.8
5. Hours of work               40.7    3.0         42.9    1.9
6. Employment                  37.5   -0.2         41.0    0.0
7. Education                   38.2    0.5         41.3    0.2
8. Other factors                      -0.2                 1.0

Using 1992 coefficients
                                 Earnings         Equivalent income
Indicator                      Level  Change       Level  Change
1992 observed                  37.7    7.2         41.0    8.5
1998 observed                  44.9                49.5
Effect
1. Returns to education        42.2    2.7         46.5    3.0
2. Gender wage gap             45.3   -0.4         49.6   -0.1
3. Returns to experience       45.9   -1.0         48.8    0.7
4. Unobservable factors        43.1    1.8         48.0    1.5
5. Hours of work               43.0    1.9         47.8    1.7
6. Employment                  44.8    0.1         49.2    0.3
7. Education                   44.8    0.1         48.7    0.8
8. Other factors                       2.0                 0.6

Average changes
Indicator                      Earnings   Equivalent income
1992–98 observed                  7.2           8.5
Effect
1. Returns to education           2.9           2.8
2. Gender wage gap               -0.4          -0.1
3. Returns to experience         -0.9           0.7
4. Unobservable factors           2.0           1.7
5. Hours of work                  2.5           1.8
6. Employment                    -0.1           0.1
7. Education                      0.3           0.5
8. Other factors                  0.9           0.8

Note: The earnings distribution includes those individuals with $Y_{it} > 0$ and $Y_{it}(k_{t'}) > 0$. The equivalent household labor income distribution includes those individuals with $Y^{q}_{it} \ge 0$ and $Y^{q}_{it}(k_{t'}) \ge 0$. Nonlabor income is not considered.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.
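The "Other factors" row is a residual, so it can be checked by simple arithmetic from the published figures. The Python sketch below does this for the earnings columns of table 3.10, subtracting the sum of effects 1 to 7 from the observed change. The variable and function names are illustrative. Because the published entries are rounded to one decimal, the same check on other columns can differ from the printed residual by 0.1.

```python
# Observed change in the earnings Gini between 1992 and 1998 (table 3.10)
observed_change = 7.2

# Effects 1 to 7 on earnings, copied from the two panels of table 3.10
effects_1998_coeffs = [3.2, -0.4, -0.9, 2.2, 3.0, -0.2, 0.5]
effects_1992_coeffs = [2.7, -0.4, -1.0, 1.8, 1.9, 0.1, 0.1]

def residual(observed, effects):
    """Part of the observed change not accounted for by the listed effects."""
    return round(observed - sum(effects), 1)

res_1998 = residual(observed_change, effects_1998_coeffs)  # row 8, first panel
res_1992 = residual(observed_change, effects_1992_coeffs)  # row 8, second panel
res_avg = round((res_1998 + res_1992) / 2, 1)              # row 8, average panel

print(res_1998, res_1992, res_avg)
```

The three values reproduce the earnings entries of row 8 in the three panels of the table.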
Table 3.11 Decompositions of the Change in the Gini Coefficient: Earnings and Equivalent Household Labor Income in Greater Buenos Aires, 1986–98

Using 1998 coefficients
                                 Earnings         Equivalent income
Indicator                      Level  Change       Level  Change
1986 observed                  39.4                40.3
1998 observed                  44.9    5.5         49.5    9.2
Effect
1. Returns to education        41.1    1.7         42.0    1.7
2. Gender wage gap             38.1   -1.3         40.5    0.1
3. Returns to experience       39.8    0.4         40.6    0.2
4. Unobservable factors        42.2    2.8         42.7    2.4
5. Hours of work               42.3    3.0         43.5    3.2
6. Employment                  39.2   -0.2         40.1   -0.2
7. Education                   39.8    0.4         41.2    0.9
8. Other factors                      -1.3                 0.9

Using 1986 coefficients
                                 Earnings         Equivalent income
Indicator                      Level  Change       Level  Change
1986 observed                  39.4    5.5         40.3    9.2
1998 observed                  44.9                49.5
Effect
1. Returns to education        43.0    1.9         47.6    1.9
2. Gender wage gap             46.4   -1.5         49.7   -0.2
3. Returns to experience       44.5    0.4         49.2    0.3
4. Unobservable factors        42.7    2.2         47.7    1.8
5. Hours of work               43.5    1.4         46.7    2.8
6. Employment                  44.7    0.2         49.4    0.1
7. Education                   45.7   -0.8         48.5    1.0
8. Other factors                       1.7                 1.6

Average changes
Indicator                      Earnings   Equivalent income
1986–98 observed                  5.5           9.2
Effect
1. Returns to education           1.8           1.8
2. Gender wage gap               -1.4           0.0
3. Returns to experience          0.4           0.3
4. Unobservable factors           2.5           2.1
5. Hours of work                  2.2           3.0
6. Employment                     0.0          -0.1
7. Education                     -0.2           0.9
8. Other factors                  0.2           1.2

Note: The earnings distribution includes those individuals with $Y_{it} > 0$ and $Y_{it}(k_{t'}) > 0$. The equivalent household labor income distribution includes those individuals with $Y^{q}_{it} \ge 0$ and $Y^{q}_{it}(k_{t'}) \ge 0$. Nonlabor income is not considered.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

Table 3.12 Decomposition of the Change in the Gini Coefficient: Average Results Changing the Base Year in Greater Buenos Aires, Selected Periods

                                      Earnings              Equivalent household income
                               1986–92 1992–98 1986–98      1986–92 1992–98 1986–98
Indicator                        (i)     (ii)    (iii)        (iv)    (v)     (vi)
Observed                        -1.7     7.2     5.5          0.7     8.5     9.2
Effect
1. Returns to education         -1.0     2.9     1.8         -0.9     2.8     1.8
2. Gender wage gap              -1.0    -0.4    -1.4          0.1    -0.1     0.0
3. Returns to experience         1.7    -0.9     0.4         -0.5     0.7     0.3
4. Unobservable factors          0.5     2.0     2.5          0.4     1.7     2.1
5. Hours of work                -0.4     2.5     2.2          1.0     1.8     3.0
6. Employment                    0.0    -0.1     0.0         -0.1     0.1    -0.1
7. Education                    -0.6     0.3    -0.2          0.2     0.5     0.9
8. Other factors                -0.9     0.9     0.2          0.5     0.8     1.2

Note: The earnings distribution includes those individuals with $Y_{it} > 0$ and $Y_{it}(k_{t'}) > 0$. The equivalent household labor income distribution includes those individuals with $Y^{q}_{it} \ge 0$ and $Y^{q}_{it}(k_{t'}) \ge 0$. Nonlabor income is not considered.
Source: Authors' calculations based on the EPH, Greater Buenos Aires, October.

However, the gender effect becomes negligible in the equivalent household labor income distribution. Earlier, we argued that, on the one hand, the shrinking gender wage gap could increase inequality in the household income distribution because of the concentration of female workers in the upper part of that distribution. On the other hand, however, it could decrease inequality because women's earnings are a more significant part of total resources in low-income households. It appears that these two factors cancel each other out.

Returns to Experience (Age)

The age coefficients in the wage equations of 1986 and 1998 are not substantially different. This fact translates into a small value for the effect of returns to experience seen in columns iii and vi of table 3.12. Changes were greater in the two subperiods. For instance, the relative increase in earnings for people older than 50 between 1992 and 1998 implies a sizable equalizing effect on the earnings distribution.
By contrast, the sign of the returns-to-experience effect in column v is positive, perhaps because of the different location of the age groups in the earnings and household income distributions, as argued in the section on basic facts and sources for changes.

Unobservables

Changes in endowments and returns to unobservable factors have implied unequalizing changes in wages, which have translated into unequalizing changes in the earnings and equivalent household labor income distributions. These effects were particularly strong in the 1992–98 period. The results of the decompositions suggest that the increase in the dispersion of unobservables was one of the main factors affecting earnings and household inequality over the period under analysis.

Hours of Work

To assess the relevance of changes in hours of work and employment status on inequality, we simulate the distribution in a base year using the parameters of the Tobit employment equations of table 3.6 for a different year. To single out the effect of changes in hours worked, we ignore observations for people who changed labor status between the base year and the simulation (that is, we keep their actual earnings) and change hours of work only for individuals who worked both in the base year and in the simulation. As discussed earlier, the 1990s witnessed a substantial fall in hours of work by low-income workers and an increase in hours of work for the rest. From columns ii and v of table 3.12, it appears that this fact has had a very significant effect on the earnings and household income distributions.

Employment

To assess the effect of changes in individual employment status, we assign zero earnings to people with nonpositive simulated hours of work, whereas people who worked in the simulation are assigned the actual base year earnings.[22]

Unemployment rates skyrocketed in the mid-1990s and have remained very high since then.
There is a widespread belief that the increase in unemployment is the main cause of the strong increase in household inequality. Results in column v of table 3.12 suggest that we scale down those conclusions because the employment effect is positive but negligible.[23]

Two reasons help explain why the great increase in unemployment had so small an effect on household inequality. First, during 1992–98, the unemployment rate jumped, but the employment rate did not change much, implying a minor change in the number of individuals without earnings. As stressed earlier, this number, rather than the number of unemployed people, is the relevant number for household inequality. The second point is that the newly unemployed (those who did not work in 1998 but who would have worked given the 1992 parameters) had extremely low individual labor incomes in 1992 (just 10 percent of the rest), but their equivalent household incomes were not far from the median (75 percent of the median). This finding implies that in the simulation using the 1992 parameters, the change in labor status (from unemployed to employed) of some individuals would not have a very strong effect on household inequality because (a) those individuals had very low incomes anyway, and (b) they were not concentrated in the lower tail of the household income distribution.

Education

Argentina has witnessed a dramatic change in the educational composition of its population in the past two decades. According to the results shown in table 3.12, that change had a mild inequality-increasing effect on the earnings and equivalent household income distributions in the 1990s. This result is not surprising given our earlier discussion on sources for change.

Other Factors and Interactions

The last row in table 3.12 is calculated as a residual. It encompasses the effects of interaction terms and of many factors not considered in the analysis.
According to table 3.12, these terms are not too large, implying either that the factors not considered in the analysis are not extremely important or that they tend to compensate for each other.

Concluding Remarks

This chapter contributes to a highly discussed topic in Argentina, the increase in income inequality, by using a microeconometric decomposition methodology. This technique allows us to assess the relevance of various factors that affected inequality between 1986 and 1998.

The results of the chapter suggest that the small change in inequality between 1986 and 1992 is the result of mild forces that compensated for each other. In contrast, between 1992 and 1998, nearly all effects played in the same direction. Changes in the returns to education and experience, in the endowments of unobservable factors and their remunerations, and in hours of work and employment status, as well as the transformation of the educational structure of the population, have all had some role in increasing inequality in Argentina to unprecedented levels. Even the decrease in the wage gap between men and women, which is a potential force for reducing inequality, has not induced a significant decrease in household income inequality.

The increase in the returns to education and unobservable factors and the relative fall in hours of work for unskilled workers are particularly important to characterize the growth in inequality. Perhaps surprisingly, although Argentina witnessed dramatic changes in the gender wage gap, the unemployment rate, and the educational structure, these factors appear to have had only a mild effect on the household income distribution.

Notes

This article is part of a project on income distribution financed by the Convenio Ministerio de Economía de la Provincia de Buenos Aires and the Facultad de Ciencias Económicas de la Universidad Nacional de La Plata. We appreciate the financial support of these institutions.
We are grateful to the editors of this volume and seminar participants at the Universidad Nacional de La Plata, Universidad Torcuato Di Tella, Latin American and Caribbean Economic Association meetings in Rio de Janeiro, meetings of the Asociación Argentina de Economía Política in Córdoba, and Hewlett Foundation Conference at the University of California at Los Angeles for helpful comments and suggestions. We also thank Verónica Fossati and Alvaro Mezza for efficient research assistance. All opinions and remaining errors are the responsibility of the authors.

1. These values correspond to the distribution of the equivalent household income in Greater Buenos Aires. All figures in this chapter were calculated from the Permanent Household Survey (Encuesta Permanente de Hogares, or EPH) for the Greater Buenos Aires area, because data for the rest of urban Argentina are available only from the beginning of the 1990s. Following Buhmann and others (1988), the equivalent household income was obtained by dividing household income by the number of equivalent adults (taken from the National Institute of Statistics and Census, INDEC) raised to 0.8, a parameter that implies mild household economies of scale.

2. The use of other indices does not change the main conclusions derived from the graph. See Gasparini and Sosa Escudero (2001).

3. These broad trends are also reported by other authors. See Altimir, Beccaria, and González Rozada (2001); Gasparini, Marchionni, and Sosa Escudero (2001); Lee (2000); and Llach and Montoya (1999).

4. All households with valid incomes (including those with no income) were considered in the equivalent household labor income statistics. Ignoring those with zero income did not alter the main results; see our companion paper, Gasparini, Marchionni, and Sosa Escudero (2000). Only workers with positive earnings were included in the individual labor income statistics.
Results in table 3.1 are robust to changes in inequality indices (see our companion paper).

5. Gasparini and Sosa Escudero (2001) used bootstrap methods to show that it is possible to reject the null hypothesis that the Gini coefficients of 1986 and 1998 are equal. Although the same is true for the Gini coefficients of 1992 and 1998, one cannot reject the null hypothesis that the Gini coefficients of 1986 and 1992 are equal.

6. Throughout the paper, wage refers to hourly labor income earned by wage workers and self-employed workers.

7. We refer to returns to education as the change in hourly wages owing to a change in the educational level (and not in years of education). It takes approximately seven years to complete primary school, five or six additional years to complete high school, and approximately five years to complete college.

8. The increasing returns to education could be caused by a selectivity bias in the schooling decision. High-ability people have lower costs of acquiring knowledge and hence are more prone to make a higher human capital investment.

9. Surprisingly, the time pattern is the opposite for other members of the household. However, because the number of working individuals in this group is much smaller than in the head of household group, the global conclusion of a narrowing gender wage gap holds.

10. Although 44 percent of working women are in the highest income quintile of the equivalent household labor income distribution, only 25 percent of men are in that quintile (the Greater Buenos Aires area, 1998). At the other extreme, 6 percent of working women are in the lowest income quintile, whereas 9 percent of men are in that quintile.

11. In 1998, the mean wage for workers between 50 and 60 years old was 86 percent of the overall mean.

12. For instance, although 22 percent of working household heads in their 50s are in the richest quintile of the earnings distribution, 28 percent are in the top quintile of the equivalent household labor income distribution (the Greater Buenos Aires area, 1998). Instead, for working household heads in their 30s, the figures are 36 percent and 28 percent.

13. This implies an unemployment rate of 3.8 percent in 1986 and 9.9 percent in 1998. These figures refer to our restricted sample. The unemployment rates reported by INDEC for the whole country are somewhat higher.

14. Furthermore, there are no signs that the strong increase in unemployment has translated into a disproportionate increase in adults with no income in any of the educational groups. The results of the selection equations in table 3.3 are in line with this conclusion. See Gasparini, Marchionni, and Sosa Escudero (2000) for more information.

15. Between 1986 and 1992, the greatest increase in share was for adults who had completed or not completed secondary school, a group with wages close to the mean and with relatively low dispersion; therefore, we expect an equalizing education effect on the earnings distribution.

16. See also Altimir, Beccaria, and González Rozada (2001) and the other chapters in this book.

17. It is typical to restrict this distribution to those individuals with $Y_{it} > 0$. We followed that practice in the empirical implementation.

18. In the empirical implementation, we ignore $Y^{0}_{jt}$.

19. Under bivariate normal assumptions implicit in the Heckman model, once the correlation between unobservable factors affecting wages and hours worked is kept constant, all remaining effects of unobservable factors on wages come through the variance. Machado and Mata (1998) allowed for heterogeneous behavior of the error term using quantile regression methods.

20. See Gasparini, Marchionni, and Sosa Escudero (2000).

21. According to table 3.12, the observed Gini coefficient of the individual earnings distribution grew 7.2 points between 1992 and 1998. The return to education in column ii is 2.9. This figure is the average of two numbers: (a) the difference between the Gini that results from applying 1998 vector ed of educational dummy variables to the 1992 distribution and the actual Gini in 1992, and (b) the difference between the actual Gini in 1998 and the Gini that results from applying 1992 vector ed to the 1998 distribution.

22. Some people did not work in the base year but did work in the simulation. For those individuals, we simulated the base year hours of work and wages using the base year parameters of equations 3.11 and 3.12 and adding error terms obtained by following the procedure described in the section on estimation strategy.

23. Naturally, the role of unemployment as the main source of the increase in inequality can be stressed again if it is argued that the fall in the relative wages of the poorest workers was generated by a relative increase in the unemployment rate of that group. However, the evidence on this point is far from conclusive.

References

Altimir, Oscar, Luis Beccaria, and Martín González Rozada. 2001. "La Evolución de la Distribución del Ingreso Familiar en la Argentina: Un Análisis de Determinantes." Serie de Estudios en Finanzas Públicas 7. Maestría en Finanzas Públicas Provinciales y Municipales, Universidad Nacional de La Plata, La Plata, Argentina.

Amemiya, Takeshi. 1985. Advanced Econometrics. Cambridge, Mass.: Harvard University Press.

Bourguignon, François, Martin Fournier, and Marc Gurgand. 2001. "Fast Development with a Stable Income Distribution: Taiwan, 1979–94." Review of Income and Wealth 47(2): 139–63.

Buhmann, Brigitte, Lee Rainwater, Guenther Schmaus, and Timothy Smeeding. 1988. "Equivalence Scales, Well-Being, Inequality and Poverty: Sensitivity Estimates across Ten Countries Using the Luxembourg Income Study Database." Review of Income and Wealth 34: 115–42.

Gasparini, Leonardo, and Walter Sosa Escudero. 2001. "Assessing Aggregate Welfare: Growth and Inequality in Argentina." Cuadernos de Economía (Latin American Journal of Economics) 38(113): 49–71.

Gasparini, Leonardo, Mariana Marchionni, and Walter Sosa Escudero. 2000. "La Distribución del Ingreso en la Argentina y en la Provincia de Buenos Aires." Cuadernos de Economía 49: 1–50.

Gasparini, Leonardo, Mariana Marchionni, and Walter Sosa Escudero. 2001. La Distribución del Ingreso en la Argentina: Perspectivas y Efectos Sobre el Bienestar. Buenos Aires: Fundación Arcor.

Heckman, James. 1974. "Shadow Prices, Market Wages, and Labor Supply." Econometrica 42: 679–94.

Heckman, James. 1979. "Sample Selection Bias as a Specification Error." Econometrica 47: 153–61.

Kuznets, Simon. 1955. "Economic Growth and Income Inequality." American Economic Review 45(1): 1–28.

Lee, Haeduck. 2000. "Poverty and Income Distribution in Argentina. Patterns and Changes." In Argentina: Poor People in a Rich Country. Washington, D.C.: World Bank.

Llach, Juan, and Silvia Montoya. 1999. En Pos de la Equidad. La Pobreza y la Distribución del Ingreso en el Area Metropolitana del Buenos Aires: Diagnóstico y Alternativas de Política. Buenos Aires: Instituto de Estudios sobre la Realidad Argentina y Latinoamericana.

Machado, José, and José Mata. 1998. "Sources of Increased Inequality." Universidade Nova de Lisboa, Lisbon. Processed.

Mincer, Jacob. 1974. Schooling, Experience, and Earnings. New York: Columbia University Press for National Bureau for Economic Research.

4
The Slippery Slope: Explaining the Increase in Extreme Poverty in Urban Brazil, 1976–96

Francisco H. G.
Ferreira and Ricardo Paes de Barros

By both the standards of its own previous growth record during the "Brazilian miracle" years of 1968–73 and those of other leading developing countries thereafter (notably in Asia), the two decades between 1974 and 1994, from the first oil shock to the return of stability with the Real Plan, were dismal for Brazil. Primarily, these years were characterized by persistent macroeconomic disequilibrium, the main symptoms of which were stubbornly high and accelerating inflation and a gross domestic product (GDP) time series marked by unusual volatility and a very low positive trend. Figures 4.1 and 4.2 plot annual inflation and GDP per capita growth rates for the 1976–96 period. The macroeconomic upheaval involved three price and wage freezes (during the Cruzado Plan of 1986, the Bresser Plan of 1987, and the Verão Plan of 1989), all of which were followed by higher inflation rates. Then there was one temporary financial asset freeze (with the Collor Plan of 1990), and finally a successful currency reform followed by the adoption of a nominal anchor in 1994 (the Real Plan). In less than a decade, the national currency changed names four times.1 Throughout the period, macroeconomic policy was almost without exception characterized by relative fiscal laxity and growing monetary stringency.

[Figure 4.1 Macroeconomic Instability in Brazil: Inflation. Vertical axis: inflation rate (percent); horizontal axis: year, 1976–96. Source: Fundação Getulio Vargas 1999 and Instituto Brasileiro de Geografia e Estatística 1999.]
[Figure 4.2 Macroeconomic Instability in Brazil: Per Capita GDP. Vertical axis: per capita GDP growth rate (percent); horizontal axis: year, 1976–96. Source: Instituto Brasileiro de Geografia e Estatística 1999.]

In addition, substantial structural changes were taking place. Brazil's population grew by 46.6 percent between 1976 and 1996 (see note 2) and became more urban (the urbanization rate rose from 68 percent to 77 percent). The average education of those 10 years or older rose from 3.2 to 5.3 effective years of schooling.3 Open unemployment grew steadily more prevalent. The sectoral composition of the labor force moved away from agriculture and manufacturing and toward the service industries. The degree of formalization of the labor force declined substantially: the proportion of formal workers (wage workers with formal documentation) dropped by nearly half, from just less than 60 percent to just more than 30 percent of all workers (see table 4.1). However, despite the macroeconomic turmoil and continuing structural changes, a casual glance at the headline inequality indicators and poverty incidence measures reported at the bottom of table 4.1 suggests that little changed in the Brazilian urban income distribution between 1976 and 1996.

Nevertheless, as is often the case, casual glances at the data can be misleading. This apparent distributional stability belies a number of powerful, and often countervailing, changes in four realms: the returns to education in the labor markets, the distribution of educational endowments over the population, the pattern of occupational choices, and the demographic structure resulting from household fertility choices.
In this chapter, we discuss two puzzles about the evolution of Brazil's urban income distribution in the 1976–96 period and suggest explanations for them. The first puzzle is posed by the combination of growth in mean incomes and stable or slightly declining inequality on the one hand and rising extreme poverty on the other hand. We argue that this enigma can be explained only by the growth in the size of a group of very poor households, who appear to be effectively excluded both from the labor markets and the system of formal safety nets. This group is trapped in indigence at the very bottom of the urban Brazilian income distribution and contributes to rises in poverty measures, particularly to bottom-sensitive measures like the depth [P(1)] and severity [P(2)] of poverty.4 This is especially the case when poverty is defined with respect to a low poverty line. E(0) fails to respond to this group because of a rise in the share of families reporting (valid) zero incomes.5 Other inequality measures, which also fell slightly between 1976 and 1996, compensated for these increases in poverty by declining dispersion further along the distribution.
However, the reality of the loss in income to the poorest group of urban households is starkly captured by figure 4.3, which plots the observed (truncated) Pen parades for the four years being studied.6 The main endogenous channel through which the marginalization of this group is captured in our model is a shift in their occupational "decisions" away from either wage or self-employment, toward unemployment or out of the labor force.7

[Figure 4.3 Truncated Pen Parades, 1976–96. Vertical axis: income (R$); horizontal axis: percentile (1 to 61); series: 1976, 1981, 1985, 1996. Source: Authors' calculations.]

Table 4.1 General Economic Indicators for Brazil, Selected Years

Economic indicator                                                           1976     1981     1985     1996
Gross national product per capita (constant 1996 reais) [a]                 4,040    4,442    4,540    4,945
Annual inflation rate (percent) [a, b]                                         42       84      190        9
Open unemployment (percent) [c]                                              1.82     4.26     3.38     6.95
Average years of schooling [d, e]                                            3.23     4.01     4.36     5.32
Rate of urbanization [e]                                                     67.8     77.3     77.3     77.0
Self-employed workers (percentage of the labor force) [e]                   27.03    26.20    26.19    27.21
Percentage of formal employment [e, f]                                      57.76    37.97    36.41    31.51
Mean (urban) household per capita income (constant 1996 reais) [e, g]      265.10   239.08   243.15   276.46
Inequality (Gini) [e]                                                       0.595    0.561    0.576    0.591
Inequality (Theil T) [e]                                                    0.760    0.610    0.657    0.694
Poverty incidence (R$30 per month) [e]                                     0.0681   0.0727   0.0758   0.0922
Poverty incidence (R$60 per month) [e]                                     0.2209   0.2149   0.2274   0.2176

a. Annual figure is given.
b. Rate shown is for January to December. The 1976 figure is based on the Índice Geral de Preços–Disponibilidade Interna (General Price Index). All other years are based on the Índice Nacional de Preços Consumidor–Real (National Consumer Price Index).
c. Rate is based on the Instituto Brasileiro de Geografia e Estatística (Brazilian Geographical and Statistical Institute) Metropolitan Unemployment Index.
d. Rate is for all individuals 10 years of age or older in urban areas.
e. Rate is calculated from the urban Pesquisa Nacional por Amostra de Domicilios (National Sample Survey) samples by the authors. See appendix 4A.
f. Defined as the number of formal sector (com carteira) employees as a fraction of the sum of all wage employees and self-employment workers.
g. Urban only, monthly and spatially deflated.
Source: Authors' calculations.

Second, the evidence we examine reveals general downward shifts in the earnings-education profile, controlling for age and gender, in both the wage and self-employment sectors over the 20-year data period, 1976–96 (figure 4.4). Despite a slight convexification of the profile, the magnitude of the shift implies a decline in the (average) rate of return to education for all relevant education levels. Similarly, average returns to experience also fell unambiguously for 0 to 50 years of experience (figure 4.5). The combined effect of changes in these returns, the price effects, was an increase in simulated poverty for all measures and for both lines. Simulated inequality also rose, albeit much more mildly. Both effects were exacerbated when the changes (to 1996) of the determinants of labor-force participation decisions also were taken into account. The second puzzle, then, is what forces counterbalance these price and occupational choice effects to explain the observed stability in inequality and "headline poverty."8 We found that these forces were fundamentally the combination of increased educational endowments, which move workers up along the flattening earnings-education slopes, with an increase in the correlation between family income and family size, caused by a more-than-proportional reduction in dependency ratios and family sizes for the poor. This demographic
This demographic 88 FERREIRA AND PAES DE BARROS Figure 4.4 Plotted Quadratic Returns to Education (Wage Earners) Returns 25 20 15 10 5 0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 Years of schooling 1976 1981 1985 1996 Source: Authors' calculations. factor had direct effects on per capita income--through a reduction in the denominator--but also had indirect effects--through partici- pation decisions leading to higher incomes. Naturally, the coexistence of these two phenomena or puzzles implies that these last combined educational and demographic effects did not extend to all of Brazil's poor. At the very bottom, some of the poor are being cut off from the benefits of greater edu- cation and economic growth and remain trapped in indigence. We address these issues by means of a microsimulation-based decomposition of distributional changes, which builds on the work of Almeida dos Reis and Paes de Barros (1991) and of Juhn, Murphy, and Pierce (1993). The approach, which was described in chapters 1 and 2 of the book, has two distinguishing features. First, unlike other dynamic inequality decompositions, such as the approach proposed by Mookherjee and Shorrocks (1982), it decom- poses the effects of changes on an entire distribution rather than on a scalar summary statistic (such as the mean log deviation). This approach allows for much greater versatility: within the same THE SLIPPERY SLOPE: EXPLAINING THE INCREASE 89 Figure 4.5 Plotted Quadratic Returns to Experience (Wage Earners) Returns 6 5 4 3 2 1 0 0 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 Years of experience 1976 1981 1985 1996 Source: Authors' calculations. framework, a wide range of simulations can be performed to inves- tigate the effects of changes in specific parameters on any number of inequality or poverty measures (and then for any number of poverty lines or assumptions about equivalence scales). 
Second, the evolving distribution, which it decomposes, is a distribution of household incomes per capita (with the recipient unit generally being the individual). Therefore, moving beyond pure labor-market studies, the approach explicitly takes into account the effect of household composition on living standards and participation decisions. As it turns out, these factors are of great importance for a fuller understanding of the dynamics at hand.

The remainder of the chapter is organized as follows. The next section briefly reviews the main findings of the literature on income distribution in Brazil over the period of study and presents summary statistics and dominance comparisons for the four observed distributions analyzed: 1976, 1981, 1985, and 1996. The methodology section outlines how the basic model in chapter 2 of this book was adapted to the case of Brazil. The section on estimating the model presents the results of the estimation stage and discusses some of its implications. It is followed by a section presenting the main results of the simulation stage and decomposing the observed changes in poverty and inequality. The chapter then concludes and draws some policy implications.

Income Distribution in Brazil from 1976 to 1996: A Brief Review of the Literature and the Data Set

There is little disagreement in the existing literature about the broad trends in Brazilian inequality since reasonable data first became available in the 1960s. The Gini coefficient rose substantially during the 1960s, from around 0.500 in 1960 to 0.565 in 1970 (see Bonelli and Sedlacek 1989).9 There was a debate over the causes of this increase, spearheaded by Albert Fishlow (1972) on the one hand and Carlos Langoni (1973) on the other. However, there was general agreement that the 1960s saw substantially increased dispersion in the Brazilian income distribution.10

The 1970s displayed a more complex evolution.
Income inequality rose between 1970 and 1976, reached a peak in that year, and then fell, both for the distribution of total individual incomes in the economically active population and for the complete distribution of household per capita incomes, from 1977 to 1981. This decline was almost monotonic, except for an upward blip in 1980 (Bonelli and Sedlacek 1989; Hoffman 1989; Ramos 1993). The recession year of 1981 was a local minimum in the inequality series, whether measured by the Gini coefficient or the Theil T index. From 1981, income inequality rose during the recession years of 1982 and 1983. Some authors report small declines in some indices in 1984, but the increase resumed in 1985. In 1986, the year of the Cruzado Plan, a break in the series was caused both by a sudden (if short-lived) decline in inflation and by a large increase in reported household incomes. Stability and economic growth led to a decline in measured inequality, according to all of the authors cited in table 4B.1 in appendix 4B. Thereafter, with the failure of the Cruzado stabilization attempt and the return to stagflation, inequality resumed its upward trend, with the Gini coefficient finishing the decade at 0.606. Table 4B.1 summarizes the findings of this literature, both for per capita household incomes and for the distribution of total individual incomes in the economically active population.

The general trends identified in the existing literature are mirrored in the statistics for the years covered in this chapter: 1976, 1981, 1985, and 1996. The distributions for each of these years were taken from the Pesquisa Nacional por Amostra de Domicílios (National Sample Survey, or PNAD), run by the Instituto Brasileiro de Geografia e Estatística (Brazilian Geographical and Statistical Institute, or IBGE).
Except where otherwise explicitly specified, we deal with distributions for urban areas only, where the welfare concept is total household income per capita (in constant 1996 reais, spatially deflated to adjust for regional differences in the average cost of living), and the unit of analysis is the individual. Details of the PNAD sampling coverage and methodology, sample sizes, definitions of key income variables, spatial and temporal deflation issues, and adjustments with respect to the national accounts baseline are discussed in appendix 4A.

Table 4.2 presents a number of summary statistics for these distributions in addition to the mean, which was provided in table 4.1. The four inequality indices used throughout this chapter are the Gini coefficient and three members of the generalized entropy class of inequality indexes, E(·).

Table 4.2 Basic Distributional Statistics for Different Degrees of Household Economies of Scale

Statistic                    1976     1981     1985     1996
Median (1996 R$) [a]       127.98   124.04   120.83   132.94
Inequality
  Gini, θ = 1.0             0.595    0.561    0.576    0.591
  Gini, θ = 0.5             0.566    0.529    0.548    0.567
  E(0), θ = 1.0             0.648    0.542    0.588    0.586
  E(0), θ = 0.5             0.569    0.472    0.524    0.534
  E(1), θ = 1.0             0.760    0.610    0.657    0.694
  E(1), θ = 0.5             0.687    0.527    0.580    0.622
  E(2), θ = 1.0             2.657    1.191    1.435    1.523
  E(2), θ = 0.5             2.254    0.918    1.134    1.242
Poverty, R$30 per month
  P(0), θ = 1.0            0.0681   0.0727   0.0758   0.0922
  P(0), θ = 0.5            0.0713   0.0707   0.0721   0.0847
  P(1), θ = 1.0            0.0211   0.0337   0.0326   0.0520
  P(1), θ = 0.5            0.0235   0.0315   0.0303   0.0442
  P(2), θ = 1.0            0.0105   0.0246   0.0224   0.0434
  P(2), θ = 0.5            0.0132   0.0226   0.0204   0.0357
Poverty, R$60 per month
  P(0), θ = 1.0            0.2209   0.2149   0.2274   0.2176
  P(0), θ = 0.5            0.2407   0.2229   0.2382   0.2179
  P(1), θ = 1.0            0.0830   0.0879   0.0920   0.1029
  P(1), θ = 0.5            0.0901   0.0875   0.0927   0.0960
  P(2), θ = 1.0            0.0428   0.0525   0.0534   0.0703
  P(2), θ = 0.5            0.0471   0.0508   0.0521   0.0625

a. For urban areas only, and spatially deflated. See appendix 4A.
Source: Authors' calculations.
Specifically, we chose E(0), also known as the mean log deviation or the Theil L index; E(1), better known as the Theil T index; and E(2), which is one-half of the square of the coefficient of variation. These indices provide a useful range of sensitivities to different parts of the distribution. E(0) is more sensitive to the bottom of the distribution, whereas E(2) is more sensitive to higher incomes. E(1) is somewhere in between, whereas the Gini places greater weight around the mean.

We also present three poverty indices from the Foster, Greer, and Thorbecke (1984) additively decomposable class P(α). P(0), also known as the headcount index, measures poverty incidence. P(1) is the normalized poverty deficit, and P(2) is an average of squared normalized deficits, thus placing greater weight on incomes furthest from the poverty line. We calculated each of these indices with respect to two poverty lines, representing R$1 and R$2 per day, at 1996 prices.11

Each of these poverty and inequality indices is presented both for the (individual) distribution of total household incomes per capita and for an equivalized distribution using the Buhmann and others (1988) parametric class of equivalence scales (with θ = 0.5). This method provides a rough test that the trends described are robust to different assumptions about the degree of economies of scale in consumption within households. Although a per capita distribution does not allow for any such economies of scale, taking the square root of family size allows for economies of scale to a rather generous degree. As usual, per capita incomes generate an upper bound for inequality measures, whereas allowing for some extent of local public goods within households raises the income of (predominantly poor) large households and lowers inequality.
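The indices described above can be sketched in a few lines of code. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each array is assumed to hold one equally weighted observation per individual (survey weights are ignored), and the generalized entropy formulas are restricted here to strictly positive incomes, since E(0) is undefined when an income is zero.

```python
import numpy as np

def equivalized_income(household_income, household_size, theta=1.0):
    """Household income divided by size**theta (theta=1.0 is per capita)."""
    y = np.asarray(household_income, dtype=float)
    n = np.asarray(household_size, dtype=float)
    return y / n ** theta

def gini(y):
    """Gini coefficient from the rank-weighted sum of sorted incomes."""
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * y) / (n * y.sum()) - (n + 1.0) / n

def generalized_entropy(y, alpha):
    """E(alpha): E(0) is the mean log deviation, E(1) the Theil T index,
    and E(2) one-half of the squared coefficient of variation."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        # Assumption of this sketch: zero incomes are excluded by the caller.
        raise ValueError("this sketch requires strictly positive incomes")
    r = y / y.mean()
    if alpha == 0:
        return -np.mean(np.log(r))
    if alpha == 1:
        return np.mean(r * np.log(r))
    return np.mean(r ** alpha - 1.0) / (alpha * (alpha - 1.0))

def fgt(y, z, alpha):
    """Foster-Greer-Thorbecke P(alpha) for poverty line z."""
    y = np.asarray(y, dtype=float)
    gap = np.clip((z - y) / z, 0.0, None)   # normalized deficit, 0 if not poor
    poor = y < z
    return np.mean(np.where(poor, gap ** alpha, 0.0))
```

In this sketch, `theta=1.0` reproduces the per capita distribution and `theta=0.5` divides household income by the square root of household size, the two cases compared in table 4.2.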
In the case of the poverty measures, the poverty lines were adjusted as follows: $z^{*} = z\,[\mu(n)]^{1-\theta}$, where $z^{*}$ is the adjusted line, $z$ is the per capita line, and $\mu(n)$ is the mean household size in the distribution (see Deaton and Paxson 1997).

Table 4.2 also confirms that the evolution of inequality over the period was marked by a decline from 1976 to 1981 and by a subsequent deterioration over the remaining two subperiods. Furthermore, this trend is robust to the choice of equivalence scale, proxied here by two different values for θ, although the inequality levels are always lower when we allow for economies of scale within households. It is also robust to the choice of inequality measure, at least with regard to the inequality increases from 1981 to 1996 and from 1985 to 1996, as the Lorenz dominance results identified in table 4.3 indicate.

The results for poverty are more ambiguous. With respect to the higher poverty line, incidence is effectively unchanged throughout the period (and even displays a slight decline for the equivalized distribution). P(1) and P(2), however, showed increases over the period, which become both more pronounced and more robust with respect to θ as the concavity of the poverty measure increases. This trend suggests that depth and severity of poverty, affected mostly by falling incomes at the very bottom of the distribution, were rising.

Table 4.3 Stochastic Dominance Results

        1976  1981  1985  1996
1976                 F
1981                       L
1985                       L
1996

Source: Authors' calculations.

These results are reflected in table 4.3, in which a letter L (F) in cell (i, j) indicates that the distribution for year i Lorenz dominates (first-order stochastically dominates) that for year j. Both 1981 and 1985 display Lorenz dominance over 1996, as suggested earlier. There is only one case of first-order welfare dominance throughout the period, and symptomatically, it is not a case of a later year over an earlier one.
Instead, money-metric social welfare was unambiguously higher in 1976 than in 1985. Indeed, all poverty measures reported for both of our lines (and for = 1.0) are higher in 1985 than in 1976.12 This finding is conspicuously not the case for a comparison between 1976 and 1996. Although poverty mea- sures very sensitive to the poorest are higher for 1996, poverty incidence for "higher" lines fall from 1976 to 1996, suggesting a crossing of the distribution functions. Figure 4.3 shows this cross- ing by plotting the Pen parades [F-1(y)], truncated at the 60th percentile for all four years analyzed. Note that although 1976 lies everywhere above 1985, all other pairs cross. In particular, 1976 and 1996 cross somewhere near the 17th percentile. Before we turn to the model used to decompose changes in the distribution of household incomes, which will shed some light on all of these changes, it is helpful to gather some evidence on the evolu- tion of educational attainment (as measured by average effective years of schooling) and on labor-force participation, for different groups in the Brazilian population, partitioned by gender and eth- nicity. Table 4.4 presents these statistics. As seen, there was some progress in average educational attainment in urban Brazil over this period. Average effective years of schooling for all individuals 10 years or older, as reported in table 4.1, rose from 3.2 to 5.3 years. 94 FERREIRA AND PAES DE BARROS Table 4.4 Educational and Labor-Force Participation Statistics, by Gender and Race Statistic 1976 1981 1985 1996 Average years of schooling Males 3.32 4.04 4.36 5.20 Females 3.14 3.99 4.37 5.43 Blacks and mixed-race individuals -- -- -- 4.20 Whites -- -- -- 6.16 Asians -- -- -- 8.13 Labor-force participation (percent) Males 73.36 74.63 76.04 71.31 Females 28.62 32.87 36.87 42.00 Blacks and mixed-race individuals -- -- -- 55.92 Whites -- -- -- 56.41 Asians -- -- -- 54.88 -- Not available. 
Notes: Table shows the average effective years of schooling for persons age 10 or older in urban areas. Labor-force participation rates are for urban areas only.
Source: Authors' calculations.

In fact, this piece of good news was vital in preventing a more pronounced increase in poverty. Table 4.4 now reveals that the male-female educational gap has been eliminated, with females 10 years or older being on average slightly more educated than males of the same age. Clearly, this finding must imply a large disparity in favor of girls in recent cohorts. Although a cohort analysis of educational trends is beyond the scope of this chapter,13 such a rapid reversal may in fact warrant a shift in public policy toward programs aimed at keeping boys in school, without in any way discouraging the growth in schooling of girls. Finally, note the remarkable disparity in educational attainment across ethnic groups, with Asians substantially above average and blacks and those of mixed race below average.

As for labor-force participation, the persistent and substantial increase in female participation from 29 percent to 42 percent over the two decades was partly mitigated by a decline in male participation rates. Those trends notwithstanding, the male-female participation gap remains high, at around 30 percentage points. There is little evidence of differential labor-force participation across ethnic groups.

The Model and the Decomposition Methodology

Let us now turn to the Brazilian version of the general semireduced-form model for household income and labor supply in chapter 2. It is used here to investigate the evolution of the distribution of household incomes per capita over the two decades from the mid-1970s to the mid-1990s. Specifically, we analyzed the distributions of 1976, 1981, 1985, and 1996 and simulated changes between them.
As stated earlier, this chapter covers only Brazil's urban areas (which account for some three-quarters of its population). The general model, therefore, collapses to two occupational sectors: wage earners and self-employed workers in urban areas.14 Total household income ($Y_h$) is given by

(4.1)  $Y_h = \sum_{i=1}^{n} w_i L_i^{w} + \sum_{i=1}^{n} \pi_i L_i^{se} + Y_{0h}$

where $w_i$ is the total wage earnings of individual i; $L_i^{w}$ is a dummy variable that takes the value 1 if individual i is a wage earner (and 0 otherwise); $\pi_i$ is the self-employment profit of individual i; $L_i^{se}$ is a dummy variable that takes the value 1 if individual i is self-employed (and 0 otherwise); and $Y_{0h}$ is income from any other source, such as transfer income or capital income. Equation 4.1 is not estimated econometrically. It aggregates information on right-hand-side terms 1 (from equations 4.2 and 4.4), 2 (from equations 4.3 and 4.4), and 3 directly from the household data set.

The wage-earnings equation is given as follows:

(4.2)  $\log w_i = X_i^{P} \beta^{w} + \varepsilon_i^{w}$

where $X_i^{P} = (ed, ed^2, exp, exp^2, D_g)$ and ed denotes completed effective years of schooling. Experience (exp) is defined simply as (age - education - 6), because a more desirable definition would require the age when a person first entered employment, a variable that is not available for 1976.15 $D_g$ is a gender dummy variable, which takes the value of 1 for females and 0 for males; $w_i$ is the monthly earnings of individual i; and $\varepsilon_i$ is a residual term that captures any other determinant of earnings, including any unobserved individual characteristics, such as innate talent. This extremely simple specification was chosen to make the simulation stage of the decomposition feasible, as described below. Analogously, the self-employed earnings equation is given as follows:

(4.3)  $\log \pi_i = X_i^{P} \beta^{se} + \varepsilon_i^{se}$

Equations 4.2 and 4.3 are estimated using ordinary least squares (OLS).
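As a minimal sketch of the estimation step for equation 4.2, the snippet below builds the regressor vector (constant, ed, ed squared, exp, exp squared, gender dummy) and fits it by least squares. The function names `mincer_design` and `fit_ols` are hypothetical, and the data are synthetic draws generated around the 1976 wage coefficients reported in table 4.5; nothing here reproduces the PNAD sample or the authors' own code.

```python
import numpy as np

def mincer_design(ed, age, female):
    """Build (1, ed, ed^2, exp, exp^2, D_g), with exp = age - ed - 6."""
    ed = np.asarray(ed, dtype=float)
    age = np.asarray(age, dtype=float)
    female = np.asarray(female, dtype=float)
    exp = age - ed - 6.0
    return np.column_stack([np.ones_like(ed), ed, ed**2, exp, exp**2, female])

def fit_ols(X, log_earnings):
    """Return least-squares coefficients and residuals of log earnings on X."""
    beta, *_ = np.linalg.lstsq(X, log_earnings, rcond=None)
    return beta, log_earnings - X @ beta

# Synthetic illustration only (assumed data, not the PNAD).
rng = np.random.default_rng(0)
n = 2000
ed = rng.integers(0, 16, n)        # years of schooling
age = rng.integers(25, 60, n)      # age in years
female = rng.integers(0, 2, n)     # 1 = female
X = mincer_design(ed, age, female)

# 1976 wage-earner coefficients from table 4.5 (squared terms divided by 100).
true_beta = np.array([4.350, 0.123, 0.00225, 0.075, -0.00105, -0.638])
log_w = X @ true_beta + rng.normal(0.0, 0.5, n)

beta_hat, resid = fit_ols(X, log_w)
```

The residuals returned by `fit_ols` play the role of the unobserved earnings determinants that the decomposition later holds fixed when coefficients are swapped between years.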
Equation 4.2 is estimated for all employees, whether or not they are heads of household and whether or not they have formal sector documentation (com or sem carteira). Equation 4.3 is estimated for all self-employed individuals (whether or not they are heads of households). Because the errors $\varepsilon$ are unlikely to be independent from the exogenous variables, a sample selection bias correction procedure might be used. However, the standard Heckman procedure for sample selection bias correction requires equally strong assumptions about the orthogonality between the error terms $\varepsilon$ and $\xi$ (from the occupational-choice multinomial logit below). The assumptions required to validate OLS estimation of equations 4.2 and 4.3 are not more demanding than those required to validate the results of the Heckman procedure. We assume, therefore, that all errors are independently distributed and do not correct for sample selection bias in the earnings regressions.

We now turn to the labor-force participation model. Because we had a two-sector labor market (segmented into the wage employment and self-employment sectors), labor-force participation and the choice of sector (occupational choice) could be treated in two different ways. One could assume that the choices were sequential, with a participation decision independent from the occupational choice and the latter conditional on the former. That approach, which would be compatible with a sequential probit estimation, was deemed less satisfactory than an approach in which individuals face a single three-way choice, between staying out of the labor force, working as employees, or being self-employed. Such a choice can be estimated by a multinomial logit model.
According to that specification, the probability of being in state s = (0, w, se) is given by equation 4.4:

(4.4)  $P_i^{s} = \dfrac{e^{Z_i \gamma_s}}{e^{Z_i \gamma_s} + \sum_{j \neq s} e^{Z_i \gamma_j}}$, where $s, j \in (0, w, se)$

where the explanatory variables differ for household heads and other household members, by assumption, as follows. For household heads,

$Z_1^{h} = \Big\{ X_1^{P};\ n_{0-13},\ n_{14-65},\ n_{>65},\ \frac{1}{n_{14-65}} \sum_{-1} D_{14-65}\, ed,\ \frac{1}{n_{14-65}} \sum_{-1} D_{14-65}\, ed^2,\ \frac{1}{n_{14-65}} \sum_{-1} D_{14-65}\, age,\ \frac{1}{n_{14-65}} \sum_{-1} D_{14-65}\, age^2,\ \frac{1}{n_{14-65}} \sum_{-1} D_{14-65}\, D_g,\ D \Big\}$

Notice that this is essentially a reduced-form model of labor supply, in which own earnings are replaced by the variables that determine them, according to equation 4.2 or 4.3. For other members of the household,

$Z_i^{h} = \Big\{ X_i^{P};\ n_{0-13},\ n_{14-65},\ n_{>65},\ \frac{1}{n_{14-65}} \sum_{-i} D_{14-65}\, ed,\ \frac{1}{n_{14-65}} \sum_{-i} D_{14-65}\, ed^2,\ \frac{1}{n_{14-65}} \sum_{-i} D_{14-65}\, age,\ \frac{1}{n_{14-65}} \sum_{-i} D_{14-65}\, age^2,\ \frac{1}{n_{14-65}} \sum_{-i} D_{14-65}\, D_g,\ D_1^{se},\ L_1^{w} w_1,\ D \Big\}$

where $n_{k-m}$ is the number of persons in the household whose age falls between k and m, $D_{14-65}$ is a dummy variable that takes the value of 1 for individuals whose age is between 14 and 65, $D_1^{se}$ is a dummy variable for a self-employed head of household, and the penultimate term is the earnings of a wage-earning head. These last two variables establish a direct conduit for the effect of the household head's occupational choice (and possibly income) on the participation decisions of other members. D is a dummy variable that takes the value 1 if there are no individuals age 14 to 65 years in the household. The sums defined over {-j} are sums over $\{i \in h \mid i \neq j\}$.

The multinomial logit model in equation 4.4 corresponds to the following discrete choice process:

(4.5)  $s = \arg\max_j \{ U_j = Z_i^{h} \gamma_j + \xi_j,\ j = (0, w, se) \}$

where Z is given above, separately for household heads and other members; the $\xi_j$ are random variables with a double exponential density function; and $U_j$ may be interpreted as the utility of alternative j.
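The link between equations 4.4 and 4.5 can be illustrated numerically: drawing double exponential (Gumbel) shocks and picking the alternative with the highest utility reproduces, on average, the logit probabilities. The sketch below is only an illustration under assumptions: the coefficient values are invented, the out-of-labor-force coefficients are normalized to zero (a common identification convention that the chapter does not spell out), and the function names are hypothetical.

```python
import numpy as np

STATES = ("0", "w", "se")  # out of labor force, wage earner, self-employed

def choice_probabilities(z, gammas):
    """Multinomial logit probabilities as in equation 4.4.

    z      : (n, k) array of explanatory variables
    gammas : (3, k) array, one coefficient row per state in STATES
    """
    v = z @ gammas.T
    v = v - v.max(axis=1, keepdims=True)   # stabilize the exponentials
    e = np.exp(v)
    return e / e.sum(axis=1, keepdims=True)

def simulate_choices(z, gammas, rng):
    """Utility-maximizing state as in equation 4.5, with i.i.d. Gumbel shocks."""
    xi = rng.gumbel(size=(z.shape[0], gammas.shape[0]))
    return np.argmax(z @ gammas.T + xi, axis=1)

# Invented coefficients for a constant and years of schooling (assumption).
gammas = np.array([[0.0, 0.0],     # state 0: normalized to zero
                   [0.5, 0.05],    # state w
                   [-0.2, 0.08]])  # state se
z = np.tile([1.0, 8.0], (200_000, 1))     # identical individuals, 8 years of schooling

probs = choice_probabilities(z[:1], gammas)[0]
draws = simulate_choices(z, gammas, np.random.default_rng(1))
shares = np.bincount(draws, minlength=3) / draws.size
```

With many simulated individuals, `shares` should sit close to `probs`, which is why a drawn random term plus estimated coefficients is enough to assign each person an occupation in the simulations.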
Once the vector $\gamma_j$ is estimated by equation 4.4, and a random term $\xi$ is drawn, each individual chooses an occupation j so as to maximize the above utility function.

Once equations 4.2, 4.3, and 4.4 have been estimated, we have two vectors of parameters for each of the four years in our sample ($t \in \{1976, 1981, 1985, 1996\}$): $\beta_t$ from the earnings equations for both wage earners and the self-employed (including constant terms $\alpha_t$) and $\gamma_t$ from the participation equation. In addition, from equation 4.1, we have $Y_{0ht}$ and $Y_{ht}$. Let $X_{ht} = \{X_i^{P}, Z_i^{h} \mid i \in h\}$ and $\Omega_{ht} = \{\varepsilon_i^{w}, \varepsilon_i^{se}, \xi_i^{j} \mid i \in h\}$. We can then write the total income of household h at time t as follows:

(4.6)  $Y_{ht} = H(X_{ht}, Y_{0ht}, \Omega_{ht}; \beta_t, \gamma_t)$,  $h = 1, \ldots, m$.

On the basis of this representation, changes in the distribution of incomes can be decomposed into price effects ($\beta$), occupational-choice effects ($\gamma$), endowment effects (X, $Y_0$), and residual effects ($\Omega$), as outlined in chapter 2. Calculating the price and occupational-choice effects is reasonably straightforward once the relevant exogenous parameters have been estimated. Estimating individual endowment effects requires a further step because elements of the X and Y vectors are jointly distributed and a change in the value of any one variable must be understood conditionally on all other observable characteristics.

Specifically, if we are interested in the effect of a change in the distribution of a single specific variable $X_k$ on the distribution of household incomes between times t and t′, it is first necessary to identify the distribution of $X_k$ conditional on other relevant characteristics $X_{-k}$ (and possibly other incomes $Y_0$). This can be done by regressing $X_k$ on $X_{-k}$ at dates t and t′, as follows:

(4.7)  $X_{kit} = X_{-kit}\, \mu_t + u_{kit}$

where k is the variable, i is the individual, and t is the date. The vector of residuals $u_{kit}$ represents the effects of unobservable characteristics (assumed to be orthogonal to $X_{-k}$) on $X_k$.
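The regression in equation 4.7, together with the counterfactual substitution that the chapter goes on to define in equation 4.8, can be sketched as below. This is a hedged illustration, not the authors' code: the function names are hypothetical, the residual standard deviation is computed with a degrees-of-freedom correction by assumption, and the data are synthetic.

```python
import numpy as np

def fit_conditional(xk, x_other):
    """Regress X_k on X_{-k}; return coefficients, residuals, residual std."""
    xk = np.asarray(xk, dtype=float)
    x_other = np.asarray(x_other, dtype=float)
    mu, *_ = np.linalg.lstsq(x_other, xk, rcond=None)
    resid = xk - x_other @ mu
    sigma = resid.std(ddof=x_other.shape[1])   # assumed d.o.f. correction
    return mu, resid, sigma

def counterfactual_xk(xk_t, x_other_t, mu_other, sigma_other):
    """Keep each person's date-t residual, rescale it by the ratio of
    residual standard deviations, and apply the other date's coefficients."""
    _, resid_t, sigma_t = fit_conditional(xk_t, x_other_t)
    x_other_t = np.asarray(x_other_t, dtype=float)
    return x_other_t @ mu_other + resid_t * (sigma_other / sigma_t)

# Synthetic example (assumed data): schooling regressed on a constant and age.
rng = np.random.default_rng(2)
age = rng.integers(10, 70, 500).astype(float)
x_other = np.column_stack([np.ones_like(age), age])
ed_t = 2.0 + 0.05 * age + rng.normal(0.0, 1.0, age.size)

mu_t, u_t, sigma_t = fit_conditional(ed_t, x_other)
# Hypothetical later-date parameters: a higher intercept, same dispersion.
mu_later = mu_t + np.array([1.5, 0.0])
ed_sim = counterfactual_xk(ed_t, x_other, mu_later, sigma_t)
```

Because residuals are carried over person by person, each individual keeps his or her rank within the conditional distribution; only the conditional mean and spread are moved to their values at the other date.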
The vector $\mu_t$ is a vector of coefficients capturing the dependency of $X_k$ on the true exogenous variables $X_{-k}$, at time t. For the sake of simplicity, let us assume that the error terms u are normally distributed with a mean of zero and a common standard deviation $\sigma_t$.

The same equation can, of course, be estimated at date t′, generating a corresponding vector of coefficients $\mu_{t'}$, and a standard error of the residuals given by $\sigma_{t'}$. We are then ready to simulate the effect of a change in the conditional distribution of $X_k$ from t to t′ by replacing the observed values of $X_{kit}$ in the sample observed at time t, with

(4.8)  $X_{kit}^{*} = X_{-kit}\, \mu_{t'} + u_{kit}\, \dfrac{\sigma_{t'}}{\sigma_t}$.

The contribution of the change in the distribution of the variable $X_k$ to the change in the distribution of incomes between t and t′ may now be written as follows:

(4.9)  $R_{tt'}^{x} = D[\{X_{kit}^{*}, X_{-kit}, Y_{0ht}, \Omega_{ht}\}, \beta_t, \gamma_t] - D[\{X_{kit}, X_{-kit}, Y_{0ht}, \Omega_{ht}\}, \beta_t, \gamma_t]$.

In this study, we perform four regression estimations such as equation 4.7, and hence four simulations such as equation 4.8. The four variables estimated are $X_k = \{n_{0-13}, n_{14-65}, n_{>65}, ed\}$. In the case of the education regression, the vector of explanatory variables $X_{-kit}$ was (1, age, age², $D_g$, regional dummy variables). In the case of the regressions with the numbers of household members in certain age intervals as dependent variables, the vector $X_{-kit}$ was (1, age, age², ed, ed², regional dummy variables), where age and education are those of the household head. The simulations permitted by these estimations allow us to investigate the effects of the evolution of the distribution of educational attainment and of the demographic structure on the distribution of income. We now turn to the results of the estimation stage of the model.

Estimating the Model

The results of the OLS estimation of equation 4.2 for wage earners (formal and informal) are shown in table 4.5. The static results are not surprising.
All variables are significant and have the expected signs. The coefficients on education and its square are positive and significant. The effect of experience (defined as age - education - 6) is positive but concave. The gender dummy variable (female = 1) is negative, significant, and large.

Table 4.5 Equation 4.2: Wage Earnings Regression for Wage Employees

Variable                        1976       1981       1985       1996
Intercept                       4.350      4.104      3.877      4.256
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Education                       0.123      0.136      0.129      0.080
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Education squared (× 100)       0.225      0.181      0.283      0.438
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Experience                      0.075      0.085      0.087      0.062
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Experience squared (× 100)     -0.105     -0.119     -0.121     -0.080
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Gender (1 = female)            -0.638     -0.590     -0.635     -0.493
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
R2                              0.525      0.538      0.547      0.474
Note: P-values are in parentheses.
Source: Authors' calculations based on the PNAD.

The dynamics are more interesting. Between 1976 and 1996, the earnings-education profile changed shape. After rising in the late 1970s, the linear component fell substantially between 1981 and 1996. Meanwhile, the coefficient of squared years of schooling fell to 1981 but then more than doubled to 1996, ending the period substantially above its initial 1976 value. Overall, the relationship became more convex, suggesting a steepening of marginal returns to education at high levels. However, plotting the parabola that models the partial earnings-education relationship from equation 4.2, the lowering of the linear term dominates. The profile shifts up from 1976 to 1981 and again to 1985, before falling precipitously (although convexifying) to 1996 (see figure 4.4). The net effect across the entire period was a fall in the cumulative returns to education (from zero to t years) for the entire range.
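The shape of these profiles can be checked directly from the 1976 and 1996 coefficients in table 4.5. The short sketch below evaluates the cumulative return (linear term times years plus quadratic term times years squared), the marginal return (its derivative), and the turning point of a concave profile. The helper names are hypothetical; the only inputs are the table's coefficients, with the squared terms divided by 100 as the table's scaling indicates.

```python
# Wage-earner coefficients from table 4.5 (squared terms are reported x 100).
WAGE_COEFFS = {
    1976: {"ed": 0.123, "ed2": 0.225 / 100, "exp": 0.075, "exp2": -0.105 / 100},
    1996: {"ed": 0.080, "ed2": 0.438 / 100, "exp": 0.062, "exp2": -0.080 / 100},
}

def cumulative_return(years, linear, quadratic):
    """Log-earnings gain from zero to `years` along the fitted parabola."""
    return linear * years + quadratic * years ** 2

def marginal_return(years, linear, quadratic):
    """Slope of the fitted parabola at `years`."""
    return linear + 2.0 * quadratic * years

def turning_point(linear, quadratic):
    """Years at which a concave profile (quadratic < 0) reaches its maximum."""
    return -linear / (2.0 * quadratic)

c76, c96 = WAGE_COEFFS[1976], WAGE_COEFFS[1996]

# Cumulative returns to schooling, 1996 minus 1976, for 1 to 17 years.
education_gap = [
    cumulative_return(s, c96["ed"], c96["ed2"]) - cumulative_return(s, c76["ed"], c76["ed2"])
    for s in range(1, 18)
]

# Experience level at which each year's earnings-experience profile peaks.
peak_1976 = turning_point(c76["exp"], c76["exp2"])
peak_1996 = turning_point(c96["exp"], c96["exp2"])
```

On these coefficients every entry of `education_gap` is negative, while the 1996 marginal return overtakes the 1976 one at roughly 10 years of schooling; the experience profile peaks a few years later in 1996 than in 1976.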
This effect coexisted with increasing marginal returns at high levels of education. The implications for poverty and inequality are clear, with the education price effect leading to an increase in the former and a decline in the latter, all other things being equal.

Returns to experience also increased from 1976 to 1981 and from 1981 to 1985 with a concave pattern and a maximum at around 35 years of experience (see figure 4.5). However, from 1985 to 1996, there was a substantial decline in cumulative returns to experience, even with respect to 1976, until 50 years of experience. The relationship became less concave, and the maximum returns moved up to around 40 years. Over the entire period, the experience price effect was mildly unequalizing (although it contributed to increases in inequality until 1985, which were later reversed) and seriously poverty increasing.

The one piece of good news comes from a reduction in the male-female earnings disparity. Although, when we controlled for both education and experience, female earnings remained substantially lower in all four years (suggesting that some labor-market discrimination may be at work), there was nevertheless a decline in this effect between 1976 and 1996. As we will see from the simulation results, this effect was both mildly equalizing and poverty reducing.

Let us now turn to equation 4.3, which seeks to explain the earnings of the self-employed with the same set of independent variables as equation 4.2. The results are reported in table 4.6. This table reveals that education is also an important determinant of incomes in the self-employment sector. The coefficient on the linear term has a higher value in all years than for wage earners, but the quadratic term is lower. This result implies that, all other things equal, the return to low levels of education might be higher in self-employment than in wage work, but these returns eventually become lower as years of schooling increase.
This result will have an effect on occupational choice, estimated through equation 4.4. Dynamically, the same trend was observed as for wage earners: the coefficient on the linear term fell over time, but the relationship became more convex.16 The coefficients on experience and experience squared follow a similar pattern to that observed for wage earners, as shown in figure 4.5. Once again, the cumulative return to experience fell over the bulk of the range from 1976 to 1996, contributing to the observed increase in poverty. The effect of being female, all other things equal, is even more markedly negative in this sector than in the wage sector. It also fell from 1976 to 1996, despite a temporary increase in disparity in the 1980s.

Table 4.6 Equation 4.3: Total Earnings Regression for the Self-Employed

Variable                        1976       1981       1985       1996
Intercept                       4.319      4.192      3.853      4.250
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Education                       0.196      0.148      0.165      0.114
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Education squared (× 100)      -0.206      0.021      0.012      0.219
                               (0.0001)   (0.4892)   (0.6545)   (0.0001)
Experience                      0.074      0.079      0.084      0.063
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Experience squared (× 100)     -0.101     -0.108     -0.111     -0.082
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
Gender                         -1.092     -1.148     -1.131     -0.714
                               (0.0001)   (0.0001)   (0.0001)   (0.0001)
R2                              0.431      0.434      0.438      0.336
Note: P-values are in parentheses.
Source: Authors' calculations based on the PNAD.

A cautionary word is in order before proceeding. All of the estimation results reported in table 4.6 refer to equations with total earnings as dependent variables. The changes in coefficients will, therefore, reflect changes not only in the hourly returns to a given characteristic but also in any supply responses that may have taken place. The analysis is to be understood in this light.

Let us now turn to the estimation of the multinomial logit in equation 4.4.
This estimation was made separately for household heads and for others because the set of explanatory variables was slightly different in each case (see the description of vectors $Z_1$ and $Z_i$ in the previous section).17

For household heads, education was not significantly related to the likelihood of choosing to work in the wage sector compared with staying out of the labor force, at any time. In addition, the positive effect of education decreased from 1976 to 1996 to the point where it was no longer statistically significant. The dominant effect on the occupational choices of urban household heads over this period, however, was a substantial decline in the constant term affecting the probability of participating in either productive sector, as opposed to remaining outside the labor force or in unemployment. Because it is captured by the constant, this effect is not related to the educational or experience characteristics of the head of household or to the endowments of his or her household. We interpret it, instead, as the effect of labor-market demand-side conditions, leading to reduced participation in paid work.18 In the occupational-choice simulations reported in the next section, this effect will be shown to be both unequalizing and immiserizing.

For other members of the household, education did appear to raise the probability of choosing wage work compared with staying out of the labor force, with the relationship changing from concave to convex over the period. It also enhanced the probability of being in self-employment compared with being outside the labor force in both periods, although this relationship remained concave. The number of children in the household significantly discouraged participation in both sectors, although more so in the wage-earning sector.
The change in the constant term was much smaller than for household heads, suggesting that negative labor-market conditions hurt primary earners to a greater extent. Consequently, we observed the effect of the occupational choices of other household members on poverty and inequality to be much milder than that of the occupational choices of the heads of households. This finding is in contrast to those in other economies where similar methodologies have been applied. For example, in Taiwan, China, changes in labor-force participation rates of spouses (particularly female spouses) had important consequences for the distribution of incomes (see chapter 9).

The results of the estimation of equation 4.7, with education of individuals 10 years old or older as the dependent variable regressed against the vector (1, age, age², $D_g$, regional dummy variables), are also given in Ferreira and Paes de Barros (1999). Over time, there is a considerable increase in the value of the intercept, which will yield higher predicted values for educational attainment, controlling for age, gender, and regional location. In addition, the gender dummy variable went from large and negative to positive and significant, suggesting that women have more than caught up with men in educational attainment in Brazil over the past 20 years. The effect of individual age is stable, and regional disparities persist, with the South and Southeast ahead of the three central and northern regions.

Regressing the number of household members in the age intervals 0-13, 14-65, and older than 65 years, respectively, on the vector (1, ed, ed², age, age², regional dummy variables) yields the finding that the schooling of the head of household has a large, negative, and significant effect on the demand for children; hence, as education levels rise, family sizes tend to fall, all other things equal.
In addition, some degree of convergence across regions in family size can be inferred, with the positive 1976 regional dummy coefficients for all regions (with respect to the Southeast) declining over time and more than halving in value to 1996.

Simulation Results

We have now estimated earnings equations for both sectors of the model, that is, for wage earners (equation 4.2) and the self-employed (equation 4.3); participation equations for both household heads and other household members (equation 4.4); and endowment equations for the exogenous determination of education and family composition (equation 4.7). We are therefore in a position to carry out the decompositions described in chapter 2. These simulations, as discussed earlier, are carried out for the entire distribution. The results are summarized in table 4.7, through the evolution of (a) the mean household per capita income µ(y); (b) four inequality indices: the Gini coefficient, the Theil L index [E(0)], the Theil T index [E(1)], and E(2); and (c) the standard three members of the Foster, Greer, and Thorbecke (1984) class of poverty measures, P(α), α = 0, 1, 2, computed with respect to two monthly poverty lines: an indigence line of R$30 and a poverty line of R$60 (both expressed in 1996 São Paulo metropolitan area prices).19

Table 4.7 contains a great wealth of information about a large number of simulated economic changes, always by bringing combinations of 1996 coefficients to the 1976 population.
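For readers who want to reproduce indicators of the kind reported in table 4.7 on their own data, the sketch below gives textbook formulas for the Foster-Greer-Thorbecke measures, the Gini coefficient, and the generalized entropy indices E(0), E(1), and E(2). It is a minimal illustration with hypothetical function names and a four-person toy income vector; it assumes strictly positive incomes, counts as poor those with income strictly below the line, and ignores the sampling weights and equivalence-scale adjustments that a survey-based calculation would need.

```python
import numpy as np

def fgt(incomes, z, alpha):
    """Foster-Greer-Thorbecke measure P(alpha) for poverty line z."""
    y = np.asarray(incomes, dtype=float)
    gap = np.clip((z - y) / z, 0.0, None)          # normalized shortfall
    return float(np.mean(np.where(y < z, gap ** alpha, 0.0)))

def gini(incomes):
    """Gini coefficient from the rank-weighted sum of sorted incomes."""
    y = np.sort(np.asarray(incomes, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * y) / (n * y.sum()) - (n + 1.0) / n)

def generalized_entropy(incomes, c):
    """E(c): Theil L for c = 0, Theil T for c = 1, half the squared
    coefficient of variation for c = 2."""
    y = np.asarray(incomes, dtype=float)
    r = y / y.mean()
    if c == 0:
        return float(np.mean(-np.log(r)))
    if c == 1:
        return float(np.mean(r * np.log(r)))
    return float(np.mean(r ** c - 1.0) / (c * (c - 1.0)))

# Toy monthly incomes per capita (assumed values, in R$).
toy = [15.0, 45.0, 60.0, 120.0]
headcount = fgt(toy, z=60.0, alpha=0)   # share below the R$60 line
depth = fgt(toy, z=60.0, alpha=1)       # average normalized gap
severity = fgt(toy, z=60.0, alpha=2)    # average squared gap
```

Raising α puts more weight on the largest shortfalls, which is why P(1) and P(2) can rise in table 4.7 even when the headcount P(0) barely moves.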
To address the two puzzles posed in the introduction to this chapter (namely, the increase in extreme urban poverty between 1976 and 1996 despite sluggish growth and mildly reducing inequality, and the coexistence of a deteriorating labor market with stable headline poverty), we now plot differences in the logarithms of incomes between the simulated distribution of household incomes per capita and that observed for 1976 for a number of the simulations in table 4.7.20 Figure 4.6 plots the combined price effects (α and β) separately for wage earners and the self-employed. As can be seen, these effects were negative (that is, they would have implied lower income in 1976) for all percentiles. The losses were greater for wage earners than for the self-employed and, for the latter, were regressive. Those losses are exactly what one would have expected from the downward shifts of the partial earnings-education and earnings-experience profiles, shown in figures 4.4 and 4.5.

Table 4.7 Simulated Poverty and Inequality for 1976, Using 1996 Coefficients

                                            Mean                                   Poverty: z = R$30       Poverty: z = R$60
                                            income     Inequality                  per month               per month
Indicator                                   per capita Gini   E(0)   E(1)   E(2)   P(0)    P(1)    P(2)    P(0)    P(1)    P(2)
1976 observed                               265.101    0.595  0.648  0.760  2.657  0.0681  0.0211  0.0105  0.2209  0.0830  0.0428
1996 observed                               276.460    0.591  0.586  0.694  1.523  0.0922  0.0530  0.0434  0.2176  0.1029  0.0703
Price effects
  α, β for wage earners                     218.786    0.598  0.656  0.752  2.161  0.0984  0.0304  0.0141  0.2876  0.1129  0.0596
  α, β for self-employed                    250.446    0.597  0.658  0.770  2.787  0.0788  0.0250  0.0121  0.2399  0.0932  0.0490
  α, β for both                             204.071    0.598  0.655  0.754  2.190  0.1114  0.0357  0.0169  0.3084  0.1249  0.0673
  α only, for both                          233.837    0.601  0.664  0.774  2.691  0.0897  0.0275  0.0129  0.2688  0.1040  0.0545
  All β (but no α), for both                216.876    0.593  0.644  0.736  2.055  0.0972  0.0303  0.0143  0.2837  0.1114  0.0590
  Education β, for both                     232.830    0.593  0.639  0.759  2.691  0.0779  0.0234  0.0110  0.2531  0.0953  0.0488
  Experience β, for both                    240.618    0.600  0.664  0.771  2.694  0.0851  0.0265  0.0125  0.2592  0.1000  0.0525
  Gender β, for both                        270.259    0.595  0.649  0.751  2.590  0.0650  0.0191  0.0090  0.2160  0.0797  0.0404
Occupational-choice effects
  γ for both sectors (heads and others)     260.323    0.609  0.650  0.788  2.633  0.0944  0.0451  0.0331  0.2471  0.1082  0.0671
  γ for both sectors (only other members)   265.643    0.598  0.657  0.757  2.482  0.0721  0.0231  0.0119  0.2274  0.0867  0.0454
  α, β, γ for all                           202.325    0.610  0.649  0.788  2.401  0.1352  0.0597  0.0402  0.3248  0.1466  0.0902
Demographic patterns
  µd only, for all                          277.028    0.574  0.585  0.704  2.432  0.0365  0.0113  0.0063  0.1711  0.0554  0.0264
  µd, α, β, γ for all                       210.995    0.587  0.577  0.727  2.177  0.0931  0.0433  0.0321  0.2724  0.1129  0.0677
Education endowment effects
  µe only, for all                          339.753    0.594  0.650  0.740  2.485  0.0424  0.0136  0.0073  0.1593  0.0567  0.0287
  µe, µd for all                            353.248    0.571  0.584  0.688  2.320  0.0225  0.0078  0.0049  0.1131  0.0359  0.0173
  µe, µd, α, β, γ for all                   263.676    0.594  0.600  0.727  1.896  0.0735  0.0374  0.0296  0.2204  0.0913  0.0561
Source: Authors' calculations based on the PNAD.

[Figure 4.6 Combined Price Effects by Sector. Vertical axis: difference of log incomes; horizontal axis: percentile. Series: wage earners; self-employment. Source: Authors' calculations.]

In figure 4.7, we adopt a different tack to the price effects by plotting the income differences for each price-effect simulation (for both sectors combined) and then aggregating them. As we would expect from figures 4.4 and 4.5, the returns to education and experience are both immiserizing. The change in partial returns to education alone is mildly equalizing (as can be seen from table 4.7). The change in the partial returns to experience is unequalizing as well as immiserizing. The change in the intercept, calculated at the mean values of the independent variables, was also negative throughout.
This change proxies for a "pure growth" effect, capturing the effects on earnings from processes unrelated to education, experience, gender, or the unobserved characteristics of individual workers. It is intended to capture the effects of capital accumulation, managerial and technical innovation, macroeconomic policy conditions, and other factors likely to determine economic growth that are not included explicitly in the Mincer equation. Its negative effect in this simulation suggests that these factors were immiserizing in urban Brazil over the period.

[Figure 4.7 Price Effects Separately and for Both Sectors Combined. Vertical axis: difference of log incomes; horizontal axis: percentile. Series: combined price effect; economic growth; returns to education; returns to experience; returns to being female. Source: Authors' calculations based on the 1976 and 1996 PNAD.]

The one piece of good news, once again, comes from the gender simulation, which reports a poverty-reducing effect as a result of the decline in male-female earnings differentials captured in tables 4.5 and 4.6. However, this effect was far from being sufficient to offset the combined negative effects of the other price effects. As the thick line at the bottom of figure 4.7 indicates, the combined effect of imposing the 1996 parameters of the two Mincerian equations on the 1976 population was substantially immiserizing.

Figure 4.8 plots the logarithm of the income differences between the distribution that arises from imposing the 1996 occupational-choice parameters (the vector γ from the multinomial logit in equation 4.4) on the 1976 population and the observed 1976 distribution.
It does so both for all individuals (the lower line) and for non-household heads (the upper line). The effect of this simulated change in occupational-choice and labor-force participation behavior is both highly immiserizing and unequalizing, as an inspection of the relevant indices in table 4.7 confirms. It suggests the existence of a group of people who are becoming increasingly impoverished by voluntarily or involuntarily leaving the labor force, entering unemployment, or being consigned to very ill-remunerated occupations (likely in the informal sector).

[Figure 4.8 Occupational-Choice Effects. Vertical axis: difference of log incomes; horizontal axis: percentile. Series: all γs; γs (non-head of household). Source: Authors' calculations based on the 1976 and 1996 PNAD.]

Combining the negative price and occupational-choice effects provides a sense of the overall effect of Brazil's urban labor-market conditions over this period. This finding is shown graphically in figure 4.9, where the lowest curve plots (a) the differences between the household per capita incomes from a distribution in which all αs, βs, and γs change, and (b) the observed 1976 distribution. It shows the substantially poverty-augmenting (and unequalizing) combined effect of changes in labor-market prices and occupational-choice parameters on the 1976 distribution.

[Figure 4.9 The Labor Market: Combining Price and Occupational-Choice Effects. Vertical axis: difference of log incomes; horizontal axis: percentile. Series: αs and βs; all γs; αs, βs, and γs. Source: Authors' calculations based on the 1976 and 1996 PNAD.]
At this point, the second puzzle can be stated clearly: given these labor-market circumstances, what factors can account for the facts that mean incomes rose, headline poverty did not rise, and inequality appears to have fallen slightly? The first part of the answer is shown graphically in figure 4.10, where the upper line plots the differences between the log incomes from a distribution arising from imposing on the 1976 population the transformation (equation 4.8) for the demographic structure of the population. The changes in the parameters µd (and in the variance of the residuals in the corresponding regression) have a positive effect on incomes for all percentiles and in an equalizing manner. However, when combined with a simulation in which the values of all αs, βs, and γs also change, it can be seen that the positive demographic effect is still overwhelmed. Nevertheless, it is clear that the reduction in dependency ratios (and subsequently in family sizes) in urban Brazil over this period had an important mitigating effect on the distribution of incomes.

[Figure 4.10 Demographic Effects. Vertical axis: difference of log incomes; horizontal axis: percentile. Series: µ(d); µ(d), αs, βs, and γs. Source: Authors' calculations based on the 1976 and 1996 PNAD.]

One final piece of the puzzle is needed to explain why the deterioration in labor-market conditions did not have a worse effect on poverty. That, as should be evident from the increase in mean years of effective schooling registered in table 4.1, is the rightward shift in the distribution function of education. This effect is shown in figure 4.11, which reveals that gains in educational attainment were particularly pronounced at lower levels of education and thus, presumably, among the poor.
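The curves in figures 4.6 to 4.13 compare a simulated distribution with the observed 1976 one, percentile by percentile. A minimal sketch of such a comparison follows. It assumes that the plotted quantity is the difference between the logs of the two quantile functions, which is one natural reading of "differences in the logarithms of incomes" but is not confirmed in detail by the text; the function name and the synthetic incomes are hypothetical.

```python
import numpy as np

def log_income_gap_by_percentile(simulated, observed, percentiles=np.arange(1, 100)):
    """Difference of log incomes, simulated minus observed, at each percentile."""
    sim_q = np.percentile(np.asarray(simulated, dtype=float), percentiles)
    obs_q = np.percentile(np.asarray(observed, dtype=float), percentiles)
    return np.log(sim_q) - np.log(obs_q)

# Synthetic incomes (assumed lognormal), not survey data.
rng = np.random.default_rng(3)
observed_1976 = rng.lognormal(mean=5.0, sigma=1.0, size=10_000)

# A proportional 10 percent loss for everyone gives a flat line below zero.
uniform_loss = log_income_gap_by_percentile(0.9 * observed_1976, observed_1976)

# A loss concentrated in the bottom decile gives a curve that dips only there.
cutoff = np.percentile(observed_1976, 10)
bottom_hit = np.where(observed_1976 < cutoff, 0.5 * observed_1976, observed_1976)
bottom_loss = log_income_gap_by_percentile(bottom_hit, observed_1976)
```

A flat curve corresponds to a pure scaling of incomes, whereas a curve that falls only at low percentiles signals losses concentrated among the poorest, the pattern the chapter associates with the rise in indigence.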
A gain in educational endowments across the income distribution, but particularly among the poor, has both direct and indirect effects on incomes. The direct effects are through equations 4.2 and 4.3, where earnings are positive functions of schooling. The indirect effects are both through the occupational choices that individuals make and through the additional effect that education has on reducing the demand for children and, hence, family size. A simulation of the effect of education is thus quite complex.21 After it is completed, one observes, in figure 4.12, a rather flat improvement in log incomes across the distribution (that is, a scaling effect). However, when this effect is again combined with changes in the parameters of the demographic equations, it gains strength and becomes not only more poverty reducing but also mildly equalizing.

[Figure 4.11 Shift in the Distribution of Education, 1976-96. Horizontal axis: years of schooling. Series: 1976; 1996. Source: Authors' calculations based on the 1976 and 1996 PNAD.]

The bottom line in figure 4.12, in keeping with the pattern, combines both of these effects with the changing αs, βs, and γs. The result is striking: this complex combined simulation suggests that all of these effects, during 20 turbulent years, cancel out almost exactly from the 15th percentile up, hence the small changes in headline poverty. However, from around the 12th percentile down, the simulation suggests a prevalence of the negative occupational-choice (and, to a lesser extent, price) effects, with substantial income losses. These findings account for the rise in indigence captured by the R$30 per month poverty line.

[Figure 4.12 Education Endowment and Demographic Effects. Vertical axis: difference of log incomes; horizontal axis: percentile. Series: µ(e); µ(d) and µ(e); µ(d), µ(e), αs, βs, and γs. Source: Authors' calculations based on the 1976 and 1996 PNAD.]

The bottom line in figure 4.12 is, in a sense, the final attempt by this methodology to simulate the various changes that led from the 1976 to the 1996 distribution. Figure 4.13 is a graphical test of the approach.
Here the line labeled "1996–76" plots the differences in actual (log) incomes between the observed 1996 and the observed 1976 distributions. Along with it, we also plotted every (cumulative) stage of our simulations: first, the immiserizing (but roughly equal) price effects; then these effects combined with the highly immiserizing occupational-choice effects; then the slightly less bleak picture arising from a combination of the latter with the parameters of the family size equations; and, finally, the curve plotting the differences between the incomes from the simulation with all parameters changing and the observed 1976 incomes. As can be seen in figure 4.13, the last line replicates the actual differences reasonably well. Of course, the point of the exercise is not to replicate the actual changes perfectly but rather to learn the different effects of different parameters and possibly to infer any policy implications from them. However, the success of the last simulation in approximately matching the actual changes does provide some extra confidence in the methodology and in any lessons we may derive from it.

Figure 4.12 Education Endowment and Demographic Effects (vertical axis: difference of log incomes; horizontal axis: percentile; series: education endowment effect alone; demographic and education effects combined; demographic and education effects combined with the price and occupational-choice effects). Source: Authors' calculations based on the 1976 and 1996 PNAD.

Figure 4.13 A Complete Decomposition (vertical axis: difference of log incomes; horizontal axis: percentile; series: price effects; price and occupational-choice effects; these combined with demographic effects; these combined with demographic and education effects; actual 1996–76 difference). Source: Authors' calculations based on the 1976 and 1996 PNAD.

Conclusions

In the end, does this exercise help improve our understanding of the evolution of Brazil's urban income distribution over this turbulent 20-year period?
Although many traditional analysts of income distribution dynamics might have inferred from the small changes in mean income, in various inequality indices, and in poverty incidence that there was little, if anything, to investigate, digging a little deeper has unearthed a wealth of economic factors interacting to determine substantial changes in the environment faced by individuals and families and in their responses.

In particular, we have found that, despite a small fall in measured inequality (although the Lorenz curves cross, as expected) and a small increase in mean income, extreme poverty has increased for sufficiently low poverty lines or sufficiently high poverty-aversion parameters. This result appears to have been caused by outcomes related to participation decisions and occupational choices, in combination with declines in the labor-market returns to education and experience. These changes were associated with greater unemployment and informality, as one would expect, but more research appears necessary. Although we appear to have identified the existence of a group excluded both from the productive labor markets and from any substantive form of safety net, we have not been able to interpret fully the determinants of their occupational choices. Issues of mobility, exacerbated by the fact that the welfare indicator is current monthly income, will also require further study in this context. Policy implications appear to lie in the area of self-targeted labor programs or other safety nets, but it would be foolhardy to go into greater detail before the profile of the group that appears to have fallen into extreme poverty in 1996 is better understood.

Second, we have found that, even above the 15th percentile, where urban Brazilians have essentially stayed put, this lack of change was the result of some hard climbing up a slippery slope.
These urban Brazilians had to gain an average of two extra years of schooling (still leaving them undereducated for the country's per capita income level) and to substantially reduce fertility in order to counteract falling returns in both the formal labor market and self-employment.

It may well be, as many now claim, that an investigation of nonmonetary indicators, such as access to services or life expectancy at birth, would lead us to consider the epithet of "a lost decade" too harsh for the 1980s. Unfortunately, we find that if one is sufficiently narrow-minded to consider only money-metric welfare, urban Brazil has in fact experienced two, rather than one, lost decades.

Appendix 4A: Data and Methodology

Macroeconomic Data

All macroeconomic indicators reported in this chapter were based on original data from the archives of the IBGE. GDP and GDP per capita figures reported in the introduction came from the series shown in table 4A.1. This series was constructed from the current GDP series (A), which was revised in 1995 and backdated to 1990, and from the old series (B), from 1976 to its final year, 1995. The series reported in table 4A.1 comprises the values of series A from 1990 to 1996 and the values of series B scaled down by a factor of 0.977414 from 1976 to 1989. This factor is the simple average of the ratios A/B over the years 1990–95. The series is expressed in 1996 reais, using the IBGE GDP deflator.
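The splicing rule just described can be sketched in a few lines. The function name and all numbers below are made up for illustration (they are not the IBGE series); only the rule itself, scaling the old series by the simple average of the A/B ratios over the overlap years and using the new series from its first year onward, is taken from the text.

```python
def splice_series(series_a, series_b, overlap_years):
    """Join an old series (B) to a revised series (A).

    The old series is rescaled by the simple average of the ratios A/B over
    the overlap years and used for years before series A begins; series A
    is used as is from its first year onward.
    """
    ratios = [series_a[year] / series_b[year] for year in overlap_years]
    factor = sum(ratios) / len(ratios)
    first_year_a = min(series_a)
    spliced = {year: value * factor
               for year, value in series_b.items() if year < first_year_a}
    spliced.update(series_a)
    return spliced, factor

# Hypothetical values, chosen only so the arithmetic is easy to follow.
old_b = {1988: 100.0, 1989: 110.0, 1990: 120.0, 1991: 130.0}
new_a = {1990: 114.0, 1991: 126.1, 1992: 128.0}
spliced, factor = splice_series(new_a, old_b, overlap_years=[1990, 1991])
```

In this toy case the two overlap ratios are 0.95 and 0.97, so the old series is scaled by 0.96 before 1990; in the chapter the corresponding factor is 0.977414, computed over 1990–95.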
Table 4A.1 Real GDP and GDP Per Capita in Brazil, 1976–1996 (constant 1996 prices)

Year    GDP (thousand reais)    Population     GDP per capita (reais)
1976    434,059,220             107,452,000    4,040
1977    455,477,123             110,117,000    4,136
1978    478,113,823             112,849,000    4,237
1979    510,432,394             115,649,000    4,414
1980    562,395,141             118,563,000    4,743
1981    538,474,976             121,213,000    4,442
1982    542,971,306             123,885,000    4,383
1983    527,054,370             126,573,000    4,164
1984    555,515,747             129,273,000    4,297
1985    599,129,793             131,978,000    4,540
1986    644,002,821             134,653,000    4,783
1987    666,708,887             137,268,000    4,857
1988    666,304,312             139,819,000    4,765
1989    687,391,828             142,307,000    4,830
1990    651,627,236             144,091,000    4,522
1991    658,339,124             146,408,000    4,497
1992    654,759,303             148,684,000    4,404
1993    687,004,026             150,933,000    4,552
1994    727,213,139             153,143,000    4,749
1995    757,918,030             155,319,000    4,880
1996    778,820,353             157,482,000    4,945

Source: Instituto Brasileiro de Geografia e Estatística 1999.

The GDP per capita growth rates plotted in figure 4.1 were derived from this series. Annual inflation and unemployment rates also came from the relevant IBGE series.

The PNAD Data Sets

All of the distributional analyses performed in this chapter were based on four data sets (1976, 1981, 1985, and 1996) of Brazil's National Household Survey (Pesquisa Nacional por Amostra de Domicílios, or PNAD), which is fielded annually by the IBGE. For the latter three years, the survey was nationally and regionally representative, except for the rural areas of the North region (except the state of Tocantins). For 1976, rural areas were not surveyed in the North or in the Center-West regions. In this chapter, we were concerned only with urban areas, which are defined by state-level legislative decrees. The urban proportions of the population in each year are given in table 4.1. The PNAD sample sizes, as well as the proportion of missing income values, are given in table 4A.2.
Table 4A.2 PNAD Sample Sizes and Missing or Zero Income Proportions

Year    Number of households    Number of individuals    Proportion of individuals with missing income    Proportion of individuals whose income is zero
1976    84,660                  385,282                  0.0052                                           0.0063
1981    110,151                 477,607                  0.0073                                           0.0141
1985    127,128                 520,069                  0.0073                                           0.0108
1996    91,621                  329,434                  0.0291                                           0.0313

Note: Income is total household income per capita.
Source: Authors' calculations based on PNAD.

Each PNAD questionnaire contains a range of questions pertaining to both the household and the individuals within the household. The household-related questions included regional location, demographic composition, quality of the dwelling, ownership of durables, and so forth. The individual questions included age, gender, race, educational attainment, labor-force status, sector of occupation, and incomes (in both cash and kind) from various sources. The main variables used in our analysis were those related to incomes, education, demographic structure of the household, and labor-force participation. Tables A.6 to A.9 in Ferreira and Paes de Barros (1999) summarize the main items in the questionnaire for these variables and the changes from 1976 to 1996.

Most importantly, the distributions analyzed in this chapter (except where explicitly otherwise indicated) have, as welfare concept, total household income per capita (regionally deflated). It is constructed by summing all income sources for each individual within the household and across all such individuals, except for lodgers or resident domestic servants. The latter two categories constitute separate households. Total nominal incomes were deflated spatially to compensate for differences in average cost of living across various areas in the country, according to the spatial price index given in table 4A.3.
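As an illustration of the welfare concept just described, the sketch below computes spatially deflated household income per capita for one hypothetical household. The class and function names and the incomes are invented for the example, and dividing nominal income by the regional index is an assumption about how the deflator is applied; the Recife value is the one listed in table 4A.3.

```python
from dataclasses import dataclass

@dataclass
class Person:
    income: float                     # sum of this person's income sources
    separate_household: bool = False  # lodger or resident domestic servant

def real_income_per_capita(members, spatial_deflator):
    """Household income per capita, deflated by a regional price index.

    Lodgers and resident domestic servants are excluded from both the income
    total and the head count, since they are treated as separate households.
    Assumption: nominal income is divided by the index (Sao Paulo = 1.0).
    """
    core = [p for p in members if not p.separate_household]
    if not core:
        raise ValueError("household has no core members")
    total = sum(p.income for p in core)
    return total / len(core) / spatial_deflator

RECIFE_DEFLATOR = 1.072469  # Recife metropolitan area, table 4A.3

# Hypothetical household: two earners, one child, one lodger.
household = [Person(300.0), Person(200.0), Person(0.0),
             Person(150.0, separate_household=True)]
y_recife = real_income_per_capita(household, RECIFE_DEFLATOR)
y_sao_paulo = real_income_per_capita(household, 1.0)
```

The same nominal income buys less where the index exceeds 1, so the Recife figure is lower than the figure the household would be assigned in São Paulo.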
We assumed, largely because of the lack of earlier comparable regional price information, that the structure of average regional cost of living described earlier remained constant over the period. Temporal deflation was undertaken on the basis of the Brazilian consumer price indices: the Índice Geral de Preços-Disponibilidade Interna (General Price Index, or IGP-DI) for 1976 and the Índice Nacional de Preços ao Consumidor-Real (National Consumer Price Index, or INPC-R) for the three subsequent years. For 1996, the INPC-R was upwardly adjusted by 1.2199 to compensate for the actual price increases that took place in the second half of June 1994 and that were not computed into the July index, because the latter was already computed in terms of the unidade real de valor (real value unit). This adjustment is becoming the standard deflation procedure at the Instituto de Pesquisa Econômica Aplicada when comparing incomes across June–July 1994 (see Macrométrica 1994). To center the indices on the first day of the month, which is the reference date for PNAD incomes, the geometric average of the index for a month and the index for the preceding month was used as that month's deflator. Once again, this procedure is now best practice for price deflation in hyperinflationary periods. Once the deflators were constructed in this way, the values to convert current incomes into 1996 reais were developed, as shown in table 4A.4.

Table 4A.3 A Brazilian Spatial Price Index (São Paulo metropolitan area = 1.0)

PNAD region                          Spatial price deflator
Fortaleza metropolitan area          1.014087
Recife metropolitan area             1.072469
Salvador metropolitan area           1.179934
Northeast (other urban areas)        1.032056
Northeast rural                      0.953879
Belo Horizonte metropolitan area     0.958839
Rio de Janeiro metropolitan area     1.002163
São Paulo metropolitan area          1.000000
Southeast (other urban areas)        0.904720
Southeast rural                      0.889700
Porto Alegre metropolitan area       0.987001
Curitiba metropolitan area           0.987001
South (other urban areas)            0.904720
South rural                          0.889700
Belém metropolitan area              1.088830
North (other urban areas)            1.032056
Brasília metropolitan area           1.037915
Center-West (other urban areas)      0.968388

Note: This regional price index is based on the consumption patterns and implicit prices from the 1996 Pesquisa de Padrões de Vida (Living Standard Measurement Survey) for the Northeast and Southeast regions and was extrapolated to the rest of the country according to a procedure specified in Ferreira, Lanjouw, and Neri (2003), where the exact derivation of the index is also discussed in detail.
Source: Ferreira, Lanjouw, and Neri 2003.

Table 4A.4 Brazilian Temporal Price Deflators, Selected Years

Year    Value
1976    4.115
1981    49.512
1985    2257.294
1996    1.000

Source: Authors' calculations based on IBGE: IGP-DI and INPC-R.

Table 4A.5 Ratios of GDP Per Capita to PNAD Mean Household Incomes, 1976–96

Year    GDP per capita (A)    Mean PNAD income (B)    (A)/(B)
1976    336.6                 190.2                   1.770
1981    370.2                 187.3                   1.976
1985    378.3                 188.6                   2.005
1996    412.1                 233.0                   1.769

Source: Authors' calculations based on PNAD and National Accounts data.

A final possible adjustment to the PNAD data concerns deviations between survey-based welfare indicators (such as mean household income per capita) and national accounts–based prosperity indicators (such as GDP per capita). The international norm is that household survey means are lower than per capita GNP, both because the latter includes the value of public and publicly provided goods and services, which are generally not imputed into the survey indicators, and because of possible underreporting by respondents. Given that the levels of the two series are not expected to match exactly, analysts are usually concerned instead with divergent trends, which may indicate a problem with the survey instrument.
Conversely, it may be argued that national accounts data have errors of their own and that many of the "correction" procedures applied to household data rely on reasonably strong assumptions, such as equiproportional underreporting by source.

In deciding whether to adjust the PNAD data with reference to the Brazilian national accounts over this period, we examined the evolution of the ratios of GDP per capita to mean household incomes from the PNAD (for the entire country and without regional price deflation, for comparability). As table 4A.5 shows, these ratios were remarkably stable. In particular, the ratios for the starting and ending points of the period covered, which are of particular importance for our analysis, are almost identical. In this light and because even the disparity with respect to 1981 and 1985 is reasonably small, we judged that the costs of making rough adjustments to the PNAD household incomes on the basis of the national accounts outweighed the benefits.

Appendix 4B: Summary of the Literature

Table 4B.1 shows the evolution of mean income and inequality in Brazil during the period studied and provides a summary of the literature.
Table 4B.1 Evolution of Mean Income and Inequality: A Summary of the Literature (annual values, 1976–90, of mean income, the Gini coefficient, and the Theil T and Theil L indices, as reported by Bonelli and Sedlacek 1989, Hoffman 1989, Ramos 1993, and Ferreira and Litchfield 1996).

Notes

1. The changes were from the cruzeiro to the cruzado in 1986, to the novo cruzado in 1989, back to the cruzeiro in 1990, and to the real in 1994.
2. See table 4A.1 in appendix 4A for a complete population series.
3. Effective years of schooling are based on the last grade completed and are thus net of repetition.
4. All poverty measures reported in this chapter are from the P(α) class of decomposable measures from Foster, Greer, and Thorbecke (1984). An increase in α implies an increase in the weight placed on the distance between households' income and the poverty line.
5. E(·) denotes members of the decomposable generalized entropy class of inequality measures. A lower parameter value means an increased weight placed on distances between poorer people and the mean. E(0), the Theil L index, is particularly sensitive to the poorest people but ignores zero incomes by construction. See Cowell (1995). For the zero incomes in our sample, see appendix 4A, table 4A.2.
6. Pen parades, or quantile functions, are the mathematical inverse of distribution functions; that is, they plot the incomes earned by each person (or group of persons) when these people are ranked by income.
7. The use of terms such as occupational choice or decision should not be taken to imply an allocation of responsibility. It will become clear when the model is presented that, as usual, these are choices under constraints.
8. By headline poverty, we mean poverty incidence computed with respect to the R$60 per month poverty line.
9. Throughout this chapter, this comparison and other comparisons between sample-based statistics are subject to sampling error, and one would ideally like to estimate their level of statistical significance. As discussed in chapter 2, the application of inference procedures to microsimulation-based decompositions remains an item on the agenda for future research.
10. The Fishlow-Langoni debate concerned the importance of education vis-à-vis repressive labor-market policies in determining the high level of Brazilian inequality. See, for example, Fishlow (1972), Langoni (1973), and Bacha and Taylor (1980).
11. At 1996 market exchange rates, these amounts were roughly equal to US$1 and US$2. In real terms, because of U.S. inflation in the intervening decade, these amounts would be slightly lower than the conventional purchasing power parity poverty lines of US$1 and US$2 valued at 1985 prices, which the World Bank often uses for international comparisons.
12. Note that this first-order welfare dominance is not robust to a change in to 0.5.
13. See Duryea and Székely (1998) for such an educational cohort analysis of Brazil and other Latin American countries.
14. In Brazil, wage earners include employees with or without formal documentation (com or sem carteira). The self-employed are own-account workers (conta própria).
15. Because education is given by the last grade completed and is thus net of repetition, this definition overestimates the experience of those who repeated grades at school and, hence, biases the experience coefficient downward. The numbers involved are not substantial enough to alter any conclusions on trends.
16. In this case, the relationship actually switched from concave to convex.
17. Space constraints prevent the presentation of the tables reporting these estimations. They are available in Appendix 3 of the working paper version (Ferreira and Paes de Barros 1999).
18. In terms of the occupational-choice framework, these are changes in the constraints with respect to which those choices are made.
19. Table 4.7 and the remaining figures in this chapter refer to the simulation of imposing the coefficients estimated for 1996 on 1976. Similar exercises were conducted for 1981 and 1985 and are reported in Ferreira and Paes de Barros (1999). Likewise, the return simulation of applying the 1976 coefficients on 1996 was conducted, and the directions and broad magnitudes of the changes confirm the results presented here.
20. In computing these differences, we compared the percentiles of the two different distributions described earlier. A different, but equally interesting, exercise is to compare the percentiles of the simulated distribution ranked as in the observed 1976 distribution with that 1976 distribution. These exercises were performed but are not reported because of space constraints. In any case, the plots presented are those that correspond to the summary statistics presented in table 4.7.
21. Note that the different effects are not simply being summed. The effect of greater educational endowments is simulated through every equation in which it appears in the model, thereby affecting fertility choices and occupational statuses, as well as earnings.

References

Almeida dos Reis, José G., and Ricardo Paes de Barros. 1991. "Wage Inequality and the Distribution of Education: A Study of the Evolution of Regional Differences in Inequality in Metropolitan Brazil." Journal of Development Economics 36: 117–43.
Bacha, Edmar L., and Lance Taylor. 1980. "Brazilian Income Distribution in the 1960s: Facts, Model Results, and the Controversy." In Lance Taylor, Edmar L. Bacha, Eliana Cardoso, and J. Frank, eds., Models of Growth and Distribution for Brazil. New York: Oxford University Press: 296–342.
Bonelli, Regis, and Guilherme L. Sedlacek. 1989. "Distribuição de Renda: Evolução no Ultimo Quarto de Seculo." In Guilherme L. Sedlacek and Ricardo Paes de Barros, eds., Mercado de Trabalho e Distribuição de Renda: Uma Coletanea. Serie Monografica 35. Rio de Janeiro: Instituto de Pesquisa Econômica Aplicada.
Buhmann, Brigitte, Lee Rainwater, Guenther Schmaus, and Timothy Smeeding. 1988. "Equivalence Scales, Well-Being, Inequality, and Poverty: Sensitivity Estimates across Ten Countries Using the Luxembourg Income Study Database." Review of Income and Wealth 34: 115–42.
Cowell, Frank A. 1995. Measuring Inequality, 2nd ed. New York: Harvester Wheatsheaf.
Deaton, Angus, and Christina Paxson. 1997. "Poverty among Children and the Elderly in Developing Countries." Working Paper 179. Princeton University Research Program in Development Studies. Princeton, N.J.
Duryea, Suzanne, and Miguel Székely. 1998. "Labor Markets in Latin America: A Supply-Side Story." Paper prepared for the Inter-American Development Bank and Inter-American Investment Corporation Annual Meeting, Cartagena de Indias, Colombia, March 16–18.
Ferreira, Francisco H. G., and Julie A. Litchfield. 1996. "Growing Apart: Inequality and Poverty Trends in Brazil in the 1980s." Distributional Analysis Research Programme Discussion Paper 23. Suntory and Toyota International Centres for Economics and Related Disciplines, London School of Economics and Political Science.
Ferreira, Francisco H. G., and Ricardo Paes de Barros. 1999. "The Slippery Slope: Explaining the Increase in Extreme Poverty in Urban Brazil, 1976–1996." Revista de Econometria 19(2): 211–96.
Ferreira, Francisco H. G., Peter Lanjouw, and Marcelo Neri. 2003. "A Robust Poverty Profile for Brazil Using Multiple Data Sources." Revista Brasileira de Economia 57(1): 59–92.
Fishlow, Albert. 1972. "Brazilian Size Distribution of Income." American Economic Association: Papers and Proceedings 1972: 391–402.
Foster, James, Joel Greer, and Erik Thorbecke. 1984. "A Class of Decomposable Poverty Measures." Econometrica 52: 761–5.
Hoffman, Rodolfo. 1989. "Evolução da Distribuição da Renda no Brasil: Entre Pessoas e Entre Familias, 1979/86." In Guilherme L. Sedlacek and Ricardo Paes de Barros, eds., Mercado de Trabalho e Distribuição de Renda: Uma Coletanea. Serie Monografica 35. Rio de Janeiro: Instituto de Pesquisa Econômica Aplicada.
Juhn, Chinhui, Kevin Murphy, and Brooks Pierce. 1993. "Wage Inequality and the Rise in Returns to Skill." Journal of Political Economy 101(3): 410–42.
Langoni, Carlos Geraldo. 1973. Distribuição da Renda e Desenvolvimento Econômico do Brasil. Rio de Janeiro: Expressão e Cultura.
Macrométrica. 1994. "Inflação: Primeiros Meses do Real." Boletim Mensal Macrométrica 111 (July–August).
Mookherjee, Dilip, and Anthony F. Shorrocks. 1982. "A Decomposition Analysis of the Trend in U.K. Income Inequality." Economic Journal 92: 886–902.
Ramos, Lauro. 1993. A Distribuição de Rendimentos no Brasil: 1976/85. Rio de Janeiro: Instituto de Pesquisa Econômica Aplicada.

5
The Reversal of Inequality Trends in Colombia, 1978–95: A Combination of Persistent and Fluctuating Forces

Carlos Eduardo Vélez, José Leibovich, Adriana Kugler, César Bouillón, and Jairo Núñez

By the late 1970s, the Colombian economy had completed two decades of consistent reduction in income inequality.
For some time, income inequality in Colombia was exemplary of Kuznets's well-known inverted U-shaped curve: after the growing inequality of the first half of the 20th century, substantial reductions in inequality were observed during the 1960s and 1970s as the economy grew. The improvements became marginal during the late 1970s and the 1980s, and income inequality took a U-turn in the late 1980s, completely reversing the equity gains of the two preceding decades.

The rise in national inequality during the 1988–95 period in Colombia was driven by a large increase in inequality in the urban sector, as well as by the simultaneous increase in inequality between urban and rural areas. At the same time, Colombia experienced significant changes in the sociodemographic characteristics of the population. Between 1978 and 1995, the most significant changes in those respects were the following: (a) higher educational attainment of the labor force, particularly among women, and greater work experience; (b) a drop in fertility, leading to smaller family size; (c) a decrease in the gender earnings gap; (d) pronounced fluctuations in the structure of wages by educational level; and (e) increased female participation in the labor market. At the same time, the Colombian economy was subjected to major structural reforms and macroeconomic changes that modified key labor-market parameters and affected labor-market performance through different channels. The structural reforms of the early 1990s covered several areas: trade liberalization and trade integration agreements with neighboring countries, liberalization of the capital account, and major changes in labor and social security legislation. The latter increased the relative cost of labor with respect to capital and became a source of difficulty for job creation. In addition, the economy suffered supply shocks linked to major discoveries of oil reserves.
Rural economic activities experienced a marginal shift from agriculture, strictly speaking, and industry to mining and services. In addition, during the late 1970s and early 1980s, agriculture was subjected to a faster process of concentration of land and rural credit. Finally, that sector was hit by a set of negative shocks in the early 1990s: lower tariff protection, real exchange rate appreciation, lower international prices, drought, and violence.

The purpose of this chapter is to decompose the dynamics of income inequality, urban and rural, so as to measure the specific contribution of some of the preceding factors to changes in income inequality. Within a microsimulation framework based on a reduced-form model of individual earnings and participation in the labor market, we evaluate the following factors:1 (a) the returns to observable human assets (such as education or experience) and individual characteristics (such as gender, location, or occupational status); (b) the changes in the distribution of these assets and individual characteristics in the population; (c) the changes in labor-force participation and occupational choice behavior; and finally (d) the changes in the overall effect of unobservable earning determinants. This approach is used to decompose the changes in inequality and measure the contribution of each of the preceding factors for the periods 1978–88 and 1988–95 for both individual earnings and household income.

Our findings show that periods of moderate changes in inequality conceal strong counterbalancing effects of equalizing and unequalizing forces. The strongest determinants of individual income distribution dynamics are returns to education, education endowments (that is, how many years of education an individual has), and effects of unobservable factors on earning inequality, in addition to family size and nonlabor income for household income.
Some of these factors are persistent, while others are less stable and are strongly dependent on economic conditions. The analysis also shows that the forces that determine changes in the distribution of individual earnings differ in intensity from those that determine changes in the distribution of household income.

A combination of persistent and fluctuating forces characterizes the dynamics of income inequality in the urban sector in Colombia between 1978 and 1995 and explains the reversal that took place in 1988. The persistent forces are linked to demographics and labor supply: the evolution of family behavior (smaller family size and increased labor participation by women) and the growth of educational endowments. The unstable or fluctuating factors tend to respond to changes in the labor demand function, namely, to its labor skills profile. Although the aggregate effect of persistent factors is moderate relative to the effect of fluctuating factors, it is perhaps the best indicator of long-run trends in inequality. Some of these effects are also present, but of much less importance, in the rural sector.

Two of our main findings are contrary to our expectations. First, and intuitively, a greater and more egalitarian education endowment in both urban and rural areas is expected to reduce income inequality. However, according to our decomposition exercise, this intuition held true only in rural areas. Paradoxically, equalization of education endowment led to a deterioration in the income distribution in urban areas in both periods, 1978–88 and 1988–95. This apparent contradiction is explained by the strong convexity of the earnings functions and by the larger interquintile differences in returns to education prevalent in urban areas, with respect to rural areas.
Second, increasing female participation in the labor market generated asymmetric effects on per capita income distribution vis-à-vis changes in the per capita labor earnings distribution. The effects were regressive for income distribution and progressive for labor earnings distribution. This surprising discrepancy is easily explained with a simple statistical line of reasoning, which is laid out later in this chapter.

This chapter is divided into four sections. In the first section, we examine the evolution of inequality and poverty indicators for three years: 1978, 1988, and 1995. We examine the changes in some labor-market indicators and in the distribution of sociodemographic characteristics. We also briefly review the main structural reforms and macroeconomic developments that affected labor-market performance. In the second section, we model the income-generating process and provide estimates of parameters that describe the evolution of the structure of earnings and participation behavior. The third section discusses the outcome of the decomposition exercises, which measure the contribution of different factors to the total change in inequality. Finally, we offer some conclusions.

Colombian Income Distribution between 1978 and 1995

The Recent U-Turn in Inequality

Several authors have identified the mid-1960s as the break point in the regressive trend of income distribution during the first half of the 20th century.2 However, the evolution of the income distribution over the past two decades suggests instead that the regressive trend of the 1960s only presaged a high-water mark. The reduction in inequality was steady from the mid-1960s until the late 1970s.
Inequality plateaued from 1978 to 1988 then increased significantly from 1988 to 1995, practically erasing the equity gains of previous decades.3

As may be seen in table 5.1, indexes of household income inequality for urban and rural areas are relatively stable from 1978 to 1988 but exhibit opposite tendencies during the 1988-95 period. In urban areas, the Gini coefficient is flat and the Theil index fell a little in the first period. Some reduction of inequality in the upper tail and some increase in the lower tail of the urban distribution are revealed by the simultaneous drop in the transformed coefficient of variation and the increase in the mean log deviation index. After 1988, urban inequality deteriorated significantly, as indicated by all summary inequality measures reported in table 5.1.4

In rural areas, the evolution is almost identical between 1978 and 1988: the Gini coefficient and the Theil index deteriorate a little, and the lower and upper tail inequalities show the same rise and decline as in urban areas. From 1988 to 1995, however, rural inequality follows a different path. A clear improvement is noticeable in all inequality indices shown in table 5.1.

This improvement in the rural income distribution was not sufficient to prevent national inequality from rising under the pressure of the increase in the inequality of urban incomes, which represent approximately 80 percent of national household income. It is true that the urban-rural income gap increased after 1988, as urban income per capita nearly doubled between 1978 and 1995 while rural income increased by only 50 percent. However, this evolution is of little importance in explaining the overall worsening of the national distribution of household income.
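The limited role of the urban-rural gap can be checked directly from the population and income shares in table 5.1. The sketch below, in Python, computes the between-group component of the generalized entropy indexes E(0), E(1), and E(2) from those shares alone. The function name `between_group` and the layout of `SHARES` are illustrative choices rather than part of the authors' procedure, and the shares are as read from table 5.1.

```python
import math

# Urban and rural shares (percent) as read from table 5.1:
# year -> ((urban, rural) population shares, (urban, rural) income shares)
SHARES = {
    1978: ((57.4, 42.6), (76.1, 23.9)),
    1988: ((60.2, 39.8), (79.0, 21.0)),
    1995: ((60.7, 39.3), (82.6, 17.4)),
}

def between_group(pop_shares, income_shares, alpha):
    """Between-group generalized entropy index E(alpha).

    Every household is assigned the mean income of its group, so only
    each group's population share p and its relative mean income
    r = income share / population share enter the formula.
    """
    total_pop = sum(pop_shares)
    total_inc = sum(income_shares)
    index = 0.0
    for pop, inc in zip(pop_shares, income_shares):
        p = pop / total_pop
        r = (inc / total_inc) / p
        if alpha == 0:      # mean log deviation
            index += p * math.log(1.0 / r)
        elif alpha == 1:    # Theil index
            index += p * r * math.log(r)
        else:               # alpha = 2: half the squared coefficient of variation
            index += p * (r ** alpha - 1.0) / (alpha * (alpha - 1.0))
    return index

for year, (pop, inc) in SHARES.items():
    parts = [round(100 * between_group(pop, inc, a)) for a in (0, 1, 2)]
    print(year, parts)
```

Rounded to whole points, the results are 8, 8, 7 for 1978; 9, 8, 7 for 1988; and 13, 11, 10 for 1995. Set against total E(0), E(1), and E(2) values of roughly 45 to 330 points, the between-group term is small in every year, which is why most of the movement has to come from within the two sectors.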
Most of the increase in national inequality after 1988 is explained by changes within urban areas, whereas the limited changes in the national distribution of income during the preceding decade reflect parallel distributional changes within both urban and rural areas.

Table 5.1 Decomposition of Inequality between Urban and Rural Areas, Selected Years

                                                                       Decomposition
Indicator                                      Urban   Rural   Total   Between   Within

1978
Household income
  Gini coefficient                              50.2    43.5    53.9
  Mean log deviation, E(0)                      38.0    33.8    44.7         8       36
  Theil, E(1)                                   52.6    34.6    56.0         8       48
  Transformed coefficient of variation, E(2)   153.6    60.3   170.4         7      163
  Population share                              57.4    42.6
  Income share                                  76.1    23.9
  Relative income (to the mean)                  1.3     0.6

1988
Household income
  Gini coefficient                              50.2    44.4    54.1
  Mean log deviation, E(0)                      42.5    37.3    49.6         9       40
  Theil, E(1)                                   50.3    35.0    55.2         8       47
  Transformed coefficient of variation, E(2)   105.1    50.5   122.2         7      115
  Population share                              60.2    39.8
  Income share                                  79.0    21.0
  Relative income (to the mean)                  1.3     0.5

1995
Household income
  Gini coefficient                              54.4    40.7    56.1
  Mean log deviation, E(0)                      50.5    30.0    55.8        13       42
  Theil, E(1)                                   70.6    29.4    74.7        11       63
  Transformed coefficient of variation, E(2)   282.7    45.8   331.5        10      321
  Population share                              60.7    39.3
  Income share                                  82.6    17.4
  Relative income (to the mean)                  1.4     0.4

                                                                  Urban
Gini coefficients of individuals               Urban   Rural    Male   Female

1978
  All earners                                   47.8    38.5
  Wage earners                                                  42.1     32.7
  Self-employed                                                 60.8     54.0

1988
  All earners                                   44.7    39.0
  Wage earners                                                  39.5     34.3
  Self-employed                                                 53.5     59.0

1995
  All earners                                   50.3    36.6
  Wage earners                                                  45.0     39.1
  Self-employed                                                 59.4     57.4

Source: Authors' calculations based on data from DANE, Encuesta Nacional de Hogares.

In view of that relative autonomy of the evolution of urban inequality and rural inequality and their clear contribution to overall inequality, the two sectors are analyzed separately in the rest of this chapter.

In urban and rural areas, the inequality of earnings among all employed persons follows a pattern somewhat similar to household inequality. Data from 1978 to 1988 reveal a pronounced decrease in income inequality for all individual urban workers (see the bottom of table 5.1) and stability for rural workers.
From 1988 to 1995, earnings inequality for individual rural workers decreases slightly, whereas inequality for urban workers increases quite significantly.

To conclude this short review of the distributional trend in Colombia since 1978, we should mention that, despite fluctuations in income inequality, social welfare in urban Colombia improved substantially and unambiguously both from 1978 to 1988 and from 1988 to 1995. The doubling of income per capita compensated for all changes in income distribution. In rural areas, welfare improvements are unambiguous between 1978 and 1988 but somewhat ambiguous between 1988 and 1995. Vélez and others (2001) find first-order stochastic dominance in both periods in urban areas and during the first period in rural areas as well. However, from 1988 to 1995 in rural areas, second-order stochastic dominance is only satisfied up to the 90th percentile.

Main Forces Driving the Dynamics of Income Distribution

The purpose of this chapter is to identify the forces that shaped the changes of income inequality within urban and rural areas during the 1980s and early 1990s. Before turning to a detailed analysis, we first review the social and demographic developments that may have affected the distribution of income either directly or through the supply of labor. We also assess the simultaneous structural reforms and macroeconomic events that had major impacts on the demand side of the labor market.

EVOLUTION OF THE SOCIODEMOGRAPHIC STRUCTURE OF THE WORKING POPULATION

Greater and More Egalitarian School Attainment. Urban education levels became higher and more equally distributed throughout the period. The proportion of urban workers who had only completed or had not completed primary education fell by nearly 20 percentage points (see table 5.3), whereas the average number of years of schooling went up from 6.4 to 8.9 years.
A more detailed analysis also shows that the increase in educational attainment was greater among women--specifically among younger women, who either caught up with or surpassed men. This general increase in education came with some equalizing of schooling attainment. For instance, the coefficient of variation of the number of years of schooling in the cohort born in 1975 was half what it was four decades earlier. Progress in educational attainment was also observed in the rural population: the average number of years of schooling went up from 2.1 to 3.9 years. Overall, however, the rural sector remained considerably behind the urban sector. As for trends within the urban population, the inequality of educational achievements fell substantially.

Higher Participation in the Labor Force, Particularly by Women. Changes in labor-force participation have been substantial over the period, especially among women. Table 5.2 shows that the average employment rate for women increased from 37.0 to 51.0 percent in urban areas and from 18.6 to 27.5 percent in rural areas. Interestingly, most of this gain in labor-force participation was among female household heads or spouses.

Overall, the share of wage earners in the urban labor force remained relatively constant at about 44 percent. However, the proportion of men employed as wage earners decreased noticeably, suggesting that a higher proportion of women were employed as wage workers. This tendency was still clearer in rural areas, where women entering the labor force tended to concentrate in wage work in commerce and services (López 1998).

Decreasing Fertility Rates. Table 5.3 shows that family size fell in urban areas from 5.1 persons in 1978 to 4.3 in 1988 and 4.1 in 1995.
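Holding total household income fixed, a smaller household mechanically raises income per capita. A minimal Python sketch of that arithmetic, using the average urban household sizes cited from table 5.3, is shown here. The function name is illustrative, and the fixed-income assumption is the "other things being equal" simplification rather than a claim about actual household incomes.

```python
def per_capita_gain(size_before, size_after):
    """Proportional rise in per capita income when household income is
    unchanged and household size falls from size_before to size_after."""
    return size_before / size_after - 1.0

# Average urban household size (persons) from table 5.3
gain_1978_1988 = per_capita_gain(5.1, 4.3)
gain_1978_1995 = per_capita_gain(5.1, 4.1)

print(f"1978-88: {gain_1978_1988:.1%}")
print(f"1978-95: {gain_1978_1995:.1%}")
```

Under this simplification, roughly three-quarters of the mechanical gain over 1978-95 was already in place by 1988, when most of the fall in family size had occurred.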
For the average household, this change in size produced, other things being equal, an increase in per capita income of 24 percent, which represents a fourth of the total gain in real earnings per capita for the average Colombian household over the period. This evolution was even more pronounced in rural areas. Overall, the reduction in family size affected all income groups, although in different proportions. Figure 5.1 shows that in urban areas family size fell proportionally more for lower-middle-income households.

MACRO EVENTS AND CHANGES IN DEMAND FOR LABOR

The growth performance of the Colombian economy was satisfactory between 1978 and 1995. Gross domestic product (GDP) per capita grew at an average annual rate of 1.8 percent. But the growth rate was higher by 1 percentage point between 1988 and 1995.5

Table 5.2 Labor-Market Indicators in Urban and Rural Areas, Selected Years

Urban
                                                        1978                   1988                   1995
Indicator                                        Male  Female  Total    Male  Female  Total    Male  Female  Total
Average employment rate                          88.9    37.0   62.4    88.6    43.3   64.4    90.4    51.0   69.2
Working population by gender                     61.4    38.6    100    58.7    41.3    100    56.6    43.4    100
Unemployment rate                                 6.9    10.3    8.2     7.8    13.9   10.3     6.8    11.4    8.8
Percentage of wage earners                       64.2    25.6   43.5    59.7    28.9   43.3    56.3    32.8   43.7
Percentage of self-employed                      24.0    10.0   16.5    28.3    12.6   19.9    33.4    16.3   24.2
Inactive                                         11.9    64.4   40.1    12.0    58.5   36.8    10.3    50.9   32.1
Total                                             100     100    100     100     100    100     100     100    100
Average earnings per month (1995 Col$ thousand)   239     150    211     253     182    228     296     206    261
Average wages per hour (1995 Col$ thousand)       1.3     0.8            1.2     0.9            1.5     1.2

Rural
                                                        1978                   1988                   1995
Indicator                                        Male  Female  Total    Male  Female  Total    Male  Female  Total
Average employment rate                          76.8    19.6   49.1    79.0    26.5   53.0    76.1    29.6   53.1
Working population by gender                     81.4    18.6  100.0    75.6    24.4  100.0    72.5    27.5  100.0
Unemployment rate                                 1.3     5.4    2.1     2.3     8.9    4.0     2.6     9.7    4.7
Percentage of wage earners                       46.5     7.6   26.7    47.9    13.7   30.6    46.9    16.5   31.7
Percentage of self-employed                      26.4     8.2   17.1    27.2    10.1   18.5    26.0    12.5   19.3
Inactive                                         27.1    84.2   56.2    24.9    76.2   50.9    27.1    71.1   49.1
Total                                             100     100    100     100     100    100     100     100    100
Average earnings per month (1995 Col$ thousand)   106      68     99     118      86    111     115      86    107

Source: Authors' calculations based on data from DANE, Encuesta Nacional de Hogares.

Table 5.3 Changes in Sociodemographic Characteristics in Urban and Rural Areas, Selected Years

                                                  Urban                   Rural
Indicator                                  1978    1988    1995    1978    1988    1995
Age structure (percentage)
  12-24                                    34.9    28.4    23.7    47.2    44.6    40.5
  25-34                                    27.4    32.7    32.7    18.4    20.7    22.1
  35-44                                    18.5    20.9    24.3    15.0    15.8    16.8
  45-65                                    19.1    18.0    19.2    19.4    18.9    20.6
Education structure (percentage)
  Illiterate                                4.2     2.1     2.1    37.9    22.1    19.8
  Primary                                  43.6    32.8    26.8    54.3    60.5    57.8
  Secondary incomplete                     28.9    28.8    27.4     6.6    13.0    15.8
  Secondary complete                       11.2    19.8    24.9     1.0     3.6     5.3
  Tertiary incomplete                       6.3     7.1     8.0     0.1     0.5     0.8
  Tertiary complete                         5.8     9.5    10.8     0.1     0.3     0.6
  Total                                     100     100     100     100     100     100
Average number of years of education        6.4     7.9     8.9     2.1     3.4     3.9
Household size                              5.1     4.3     4.1     5.9     5.1     4.7

Source: Authors' calculations based on data from DANE, Encuesta Nacional de Hogares.

Figure 5.1 Average Household Size by Income Decile in Urban Colombia, Selected Years
[Line chart: average household size (persons) on the vertical axis against income decile (1 to 10) on the horizontal axis, with one series each for 1978, 1988, and 1995.]
Source: Authors' calculations based on data from DANE, Encuesta Nacional de Hogares.

Labor demand was less dynamic, a change that is likely to have affected the evolution of income distribution. Employment growth fell quite significantly after 1990.
Several macroeconomic events and structural reforms during the early 1990s explain the lack of dynamism of labor demand for less skilled workers: (a) exchange rate appreciation and labor legislation reforms in the early 1990s that increased the cost of labor relative to capital; (b) a tendency of domestic industry to invest in more capital-intensive technology, as exposure to international competition rose because of tariff reductions and regional trade integration; and (c) a gradual shift of productive activities toward more capital-intensive activities, as production shifted from agriculture and industry to mining and services. The substantial rise in payroll taxation in the 1990s also slowed down the demand for unskilled labor and the generation of wage-earning jobs,6 despite the labor reform of 1990 (Ley 50), which reduced labor costs by diminishing the expected value of the cost of dismissals (cesantías). Only one factor helped reinforce the demand for low-skilled labor: the fivefold increase in construction activity in the early 1990s, closely related to exchange rate appreciation, which derived from unprecedented capital inflows.7

On the agricultural side, the first half of the 1990s was characterized by a set of negative circumstances and policy measures that produced a major reduction in output. The removal of import controls, the lowering of tariffs, the appreciation of the exchange rate, low international prices, scarce credit, frequent drought, and increasing violence all contributed to a substantial agricultural decline (Jaramillo 1998). Changes in rural credit and land ownership should have had more direct effects on the distribution of income. The 1974-84 decade witnessed an increase in the concentration of land ownership (Lorente, Salazar, and Gallo 1994).
However, this trend reversed in the subsequent decade, when the Gini coefficient of land ownership went down from 0.61 to 0.59. The same egalitarian evolution occurred in the credit market. Until 1984, credit and interest rate subsidies were concentrated among large-scale producers. But a shift occurred between 1984 and 1993. The controls over interest rates gradually crumbled, and credit tended to deconcentrate (Gutiérrez 1995).

Determinants of Household Income: 1978, 1988, and 1995

The explanation of the dynamics of income distribution relies on some representation of household income-generating behavior in the various periods under analysis. Household income is modeled as the outcome of two interrelated processes: (a) the determination of labor earnings as a function of observed and unobserved individual characteristics and (b) the individual decision to participate in the labor force as a wage worker or a self-employed worker and the probability of being employed.8 This section presents the main results of the estimation of earning and occupational choice equations. It also highlights the most prominent changes in underlying individual or market behavior that are likely to have led to changes in the distribution of income during the 1978-95 period.

Urban and rural earnings are modeled independently. In each case, four separate Mincer earning equations are estimated for the logarithm of self-employed workers' and wage workers' earnings and for each gender. Explanatory variables are the number of years of schooling, potential labor experience, and location. Both schooling and experience include quadratic terms that control for heterogeneity in results by levels of schooling or experience. For urban areas, equations for men are estimated by ordinary least squares (OLS), and a two-stage Heckman selection-bias correction is used for women.
For rural areas, the Heckman correction is applied to wage earners of both genders; OLS is used for self-employed workers because the selection bias failed to be significant.

Occupational choice behavior is estimated as a multinomial logit model with three possible situations: (a) self-employed, (b) wage earner, and (c) inactive. This model is estimated separately for household heads, spouses, and other members of the household--with gender dummy variables included in each case. The same occupational model is used for all individuals of working age in rural areas. Explanatory variables include the variables likely to affect potential individual earnings--schooling, experience, region, and gender. They also include variables that describe the earning and domestic production capacity of all other household members--that is, household composition summarized by number of household members by gender and age group, average schooling, and average experience.

Changes in the Earnings Equations

The eight panels of tables 5.4 and 5.5 show the individual regressions for log earnings of male and female wage earners and self-employed workers in urban and rural areas for the three years considered in this analysis.

For all years and for all occupational situations, the coefficients have the expected sign and are generally highly significant. The positive estimate of the quadratic term for education reveals that the marginal rate of return to schooling increases with schooling within all groups--except for male, rural, self-employed workers in 1995--and the reverse is true of experience, as predicted by the Mincerian model.

Figures 5.2 and 5.3 show how the changes in parameter estimates for schooling affected wage differentials across schooling levels for urban male and female wage and self-employed workers. Changes in returns to schooling clearly contributed to flattening the earnings-schooling profile of men between 1978 and 1988 and, therefore, to equalizing the earnings distribution.
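With schooling entering the earnings equation both linearly and squared, the marginal return to a year of schooling is b1 + 2*b2*S, so a positive quadratic coefficient means that returns rise with schooling. The Python sketch below evaluates that expression, together with the change in log earnings relative to a reference schooling level, using the schooling coefficients for urban male wage earners as read from table 5.4. Two assumptions here are not taken from the chapter: completed secondary education is set at 11 years of schooling, and log-point differences are read as approximate percentage changes. All function names are illustrative.

```python
# Schooling coefficients (linear, squared) for urban male wage earners,
# as read from table 5.4.
COEF = {
    1978: (0.0474, 0.0046),
    1988: (0.0027, 0.0055),
    1995: (-0.0379, 0.0075),
}
SECONDARY_YEARS = 11  # assumption: years of schooling at completed secondary

def marginal_return(year, schooling):
    """Derivative of log earnings with respect to years of schooling."""
    b1, b2 = COEF[year]
    return b1 + 2.0 * b2 * schooling

def log_premium(year, schooling, reference=SECONDARY_YEARS):
    """Log earnings at `schooling` minus log earnings at `reference`,
    other characteristics held equal."""
    b1, b2 = COEF[year]
    return b1 * (schooling - reference) + b2 * (schooling ** 2 - reference ** 2)

def change_in_relative_income(start, end, schooling):
    """Change between two years in log earnings relative to the reference."""
    return log_premium(end, schooling) - log_premium(start, schooling)

for years in (5, 16):
    first = change_in_relative_income(1978, 1988, years)
    second = change_in_relative_income(1988, 1995, years)
    print(f"{years:2d} years of schooling: "
          f"1978-88 {first:+.3f}, 1988-95 {second:+.3f}")
```

Under these assumptions, a male wage earner with 5 years of schooling gains about 18 log points relative to one with completed secondary education between 1978 and 1988, while one with 16 years loses about 10, which corresponds to the flattening of the profile described above. Between 1988 and 1995 both ends gain slightly, by about 5 and 7 log points.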
Indeed, the relative income of low-educated workers increased much more than that of those with more education. No change took place for self-employed women, whereas middle-educated wage-earning women seemed to lose in comparison with those women of other educational levels.

Table 5.4 Earnings Equations of Wage and Self-Employed Male and Female Urban Workers, Selected Years

                               Male wage earners               Male self-employed workers
                            (ordinary least squares)            (ordinary least squares)
Variable                    1978       1988       1995       1978       1988       1995
Constant                  9.0234*    9.5537*    9.8234*    8.4609*    8.9284*    9.3611*
Schooling                 0.0474*    0.0027    -0.0379*    0.1232*    0.0901*    0.0321*
Schooling squared         0.0046*    0.0055*    0.0075*    0.0007     0.0024*    0.0051*
Experience                0.0727*    0.0541*    0.0476*    0.0867*    0.0561*    0.0536*
Experience squared       -0.0011*   -0.0007*   -0.0007*   -0.0013*   -0.0007*   -0.0007*
Residual variance         0.5142     0.457      0.5211     0.885      0.7913     0.8156
Number of observations    2,234      9,762      8,534        834      4,635      5,059
R2                        0.4774     0.4659     0.3983     0.2818     0.3216     0.3029

                              Female wage earners             Female self-employed workers
                             (Heckman correction)                (Heckman correction)
Variable                    1978       1988       1995       1978       1988       1995
Constant                  9.2313*    9.3672*    9.4141*    8.2978*    8.3962*    8.8958*
Schooling                 0.0267     0.0383*   -0.0015     0.0361     0.0457**   0.0254
Schooling squared         0.0049*    0.0034*    0.0062*    0.0068**   0.0063*    0.0061*
Experience                0.0399*    0.0416*    0.0337*    0.0342**   0.0461*    0.0448*
Experience squared       -0.0007*   -0.0006*   -0.0006*   -0.0004    -0.0006*   -0.0006*
Residual variance         0.4587     0.458      0.4934     0.8905     0.9159     0.9127
Number of observations    4,046     18,676     17,621      4,046     18,676     11,837
Chi2                        774*     3,229*     3,082*       201*       792*       844*

Note: Regional dummy variables are omitted from the table. * indicates significance at the 1 percent level or better; ** indicates significance at a weaker level (the note cites the 5 and 10 percent levels).
Source: Authors' calculations.

Table 5.5 Earnings Equations of Wage and Self-Employed Male and Female Rural Workers, Selected Years

                             Male wage earners       Male self-employed workers
                           (Heckman correction)       (ordinary least squares)
Variable                     1988        1995           1988        1995
Constant                  10.4208*    10.7522*        9.2593*     9.2058*
School                     0.0221*    -0.0050         0.0749*     0.0738*
School squared             0.0021*     0.0042*        0.0005     -0.0005
Age                        0.0668*     0.0474*        0.0656*     0.0730*
Age squared               -0.0008*    -0.0005*       -0.0006*    -0.0007*
Atlantic                  -0.3041*    -0.2729*       -0.0335     -0.0317
Oriental                  -0.2324*    -0.0454*       -0.2765*    -0.2297*
Central                   -0.2345*    -0.2016*        0.0583     -0.2490*
Model chi2                 1,237.2     1,970.0        n.a.        n.a.
Adjusted R2                n.a.        n.a.           0.1243      0.1180
Number of observations     4,438       4,691          2,515       2,604

                            Female wage earners        Female self-employed
                           (Heckman correction)       (ordinary least squares)
Variable                     1988        1995           1988        1995
Constant                   9.8676*    10.0758*       10.5254*    10.0828*
School                     0.0800*     0.0527*        0.0636*     0.0647*
School squared             0.0015      0.0021*        0.0014      0.0035*
Age                        0.0576*     0.0508*        0.0040      0.0186*
Age squared               -0.0005*    -0.0005*        0.0000     -0.0001
Atlantic                  -0.2306*    -0.1884*       -0.1274      0.1923*
Oriental                  -0.1947*    -0.0025        -0.5907*    -0.0297
Central                   -0.1825*    -0.1305*       -0.1065     -0.0722
Model chi2                 n.a.        n.a.           1,028.6     1,081.3
Adjusted R2                0.4211      0.3877         n.a.        n.a.
Number of observations     1,300       1,645            965       1,246

*Significant at the 5 percent level.
n.a. Not applicable.
Source: Authors' calculation based on DANE, Encuesta Nacional de Hogares.

The evolution of income distribution was radically different between 1988 and 1995. For men, relative incomes increased at both the lower and the upper end of the distribution of schooling, with a priori ambiguous effects on inequality.
The same was observed for female wage workers, as in the previous period, whereas the evolution was unambiguously equalizing for female self-employed workers.

This evolution of earning differential with respect to education is broadly consistent with the macroeconomic factors that affected the labor market through the early 1990s: capital deepening and a complementary demand for skilled workers at the top of the distribution, and a construction boom and a demand for unskilled workers at the bottom.

Figure 5.2 Change in Income from Changes of Returns to Education, Relative to Workers Who Have Completed Secondary Education: Male and Female Wage Earners in Urban Colombia, Selected Periods
[Two panels plotting percent change in relative income (vertical axis) against years of schooling (horizontal axis, 1 to 18), with one series for 1978-88 and one for 1988-95.]
Source: Authors' calculations.

Figure 5.3 Change in Income from Changes of Returns to Education, Relative to Workers Who Have Completed Secondary Education: Male and Female Self-Employed Workers in Urban Colombia, Selected Periods
[Two panels plotting percent change in relative income (vertical axis) against years of schooling (horizontal axis, 1 to 18), with one series for 1978-88 and one for 1988-95.]
Source: Authors' calculations.

Compared with returns to education in the urban labor market, returns in rural areas behaved similarly but showed more heterogeneity over time and across labor groups (table 5.5). Returns to education increased with years of schooling attained, except for self-employed male workers in 1988 and 1995.
As in the urban case, the convexity of the earnings equation with respect to years of schooling decreased from 1978 to 1988 and increased again afterward.

The variance in the residuals of the earnings equations represents the joint dispersion across earners of the rewards for unobserved skills, as well as measurement error and transitory components of earnings.9 Table 5.4 shows a reduction in that variance between 1978 and 1988 and an increase between 1988 and 1995 for all male urban earners, whereas changes are somewhat limited for female urban earners. Observed changes in that variance seem large enough to affect the inequality of individual earnings and that of household incomes.10

It is clear from tables 5.4 and 5.5 that shifts in earning differentials across gender and occupational groups depend on the characteristics of earners. For example, for otherwise equal men and women who have 8 years of schooling and 10 years of experience in urban areas, we would expect to find a small increase in the male-female wage differential but a large drop in the differential between men (wage or self-employed workers) and self-employed women. Most of the resulting substantial drop in the male-female earnings gap actually took place between 1988 and 1995. In the rural sector, equal men and women who have three years of schooling will likely exhibit a continuous substantial drop in the earning differential between male self-employed workers and male wage workers but an increasing gender wage differential in favor of men. Changes related to experience are of limited amplitude. Regional differences declined for all groups between 1978 and 1988 but did the opposite during the 1990s.11

Changes in Participation and Occupational Choice Behavior

Occupational choices are modeled as a multinomial logit. Three choices are considered: inactivity, wage work, and self-employment.
Explanatory variables include all characteristics of individuals as well as summary characteristics for the household they belong to. The estimation is made independently for household heads, spouses, and other male and other female adult members. The main features of occupational choice behavior within those groups of individuals and their evolution over time are summarized in the following paragraphs.

URBAN

Labor-force participation displays the usual features (see table 5.6). Higher levels of education increase the probability of being employed, in particular for spouses.12 Participation decreases with experience or age for household heads and spouses, but it tends to increase for other household members. Spouse participation is particularly sensitive to demographics and household potential income. It falls with the number of children in the household and with the average human capital endowment (education and experience) of other household members. The latter effect is quite strong.13

From 1978 to 1988, changes in the average participation rate are insignificant among male household heads. Changes are substantially positive for spouses and female household heads and negative for other household members. All these findings are in full agreement with the aggregate evolution shown in table 5.3. More interestingly, this evolution was not neutral with respect to education, but the bias depends on the group being considered. Married women's participation increased more among the least educated (see figure 5.4), whereas participation declined relatively more for the least-educated, secondary, male household members. From 1988 to 1995, participation kept increasing for all women, with the same bias toward the least educated. Other male household members also saw a tilt in participation in favor of the least skilled. As in the preceding period, changes in participation among household heads were negligible.
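To make concrete how a multinomial logit turns coefficients into effects on participation of the kind discussed here, the following Python sketch computes choice probabilities and the marginal effect of one variable (years of schooling) on each probability. The coefficients are invented for illustration and are not the chapter's estimates, and taking inactivity as the base category is an assumption about the normalization.

```python
import math

CHOICES = ("inactive", "wage", "self_employed")

# Hypothetical (intercept, schooling slope) pairs; inactivity is the base.
COEF = {
    "inactive": (0.0, 0.0),
    "wage": (-1.0, 0.10),
    "self_employed": (-0.5, 0.02),
}

def choice_probabilities(schooling):
    """Multinomial logit probabilities: exp(v_j) / sum_k exp(v_k)."""
    v = {j: a + b * schooling for j, (a, b) in COEF.items()}
    top = max(v.values())  # subtract the maximum for numerical stability
    weights = {j: math.exp(v[j] - top) for j in CHOICES}
    total = sum(weights.values())
    return {j: weights[j] / total for j in CHOICES}

def marginal_effects(schooling):
    """dP_j/dS = P_j * (b_j - sum_k P_k * b_k) for the schooling variable."""
    probs = choice_probabilities(schooling)
    mean_slope = sum(probs[j] * COEF[j][1] for j in CHOICES)
    return {j: probs[j] * (COEF[j][1] - mean_slope) for j in CHOICES}

probs = choice_probabilities(8)
effects = marginal_effects(8)
for j in CHOICES:
    print(f"{j:14s} P = {probs[j]:.3f}  dP/dS = {100 * effects[j]:+.2f} points")
```

Because the probabilities sum to one, the marginal effects across the three alternatives sum to zero: a variable that raises the probability of wage work must lower the combined probability of inactivity and self-employment by the same amount.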
The negative impact of family size on female participation in the labor force shifted over time too. It ended up concentrating among spouses in households with very young children, but most of that evolution took place between 1978 and 1988 (see figure 5.4). With respect to the effect of the characteristics of other household members on spouse participation, figure 5.4 shows an interesting evolution. It would seem that the increase in spouse participation tended to concentrate first in households that had a relatively higher potential income, as summarized by the average educational level of nonspouse members. But between 1988 and 1995, that increase concentrated more among less educated households. This feature will prove important.

Concerning the choice between wage work and self-employment, estimates conform to what is observed elsewhere. Wage work tends to be more common for younger and more educated individuals. The effect of education tends to be more pronounced among spouses and other household members than among heads of household.14 The education gradient for wage employment became positive and significant for household heads in 1995 also.
Over time, two
[Table 5.6: marginal effects (percent) of selected variables on occupational choice (wage earner, self-employed, inactive), from the multinomial logit model, for urban heads of household, spouses, and other household members and for all rural workers, 1978, 1988, and 1995. Cell values are not legible in this copy.]