Capital and Labor: The Factor Income Composition of Top Incomes in the United States, 1962-2006

This paper finds that capital and labor incomes in the United States have become more closely associated since the 1980s. This contributed to the well-known increase in the top 1 percent's share of total income, exacerbating rising inequality in capital incomes and earnings. The paper shows that the trend in the association is U-shaped, as the recent increase contrasts with a tendency toward a weakening association until the 1980s. The paper uses data derived from tax records, studies the asymmetries in the association, tests for robustness to alternative income definitions, and discusses the potential role of declining top marginal tax rates.


Introduction
Recent literature has documented the increase in income inequality at the very top of the distribution (e.g. Piketty, 2007, 2010; Alvaredo et al., 2017). Between the mid-1980s and the mid-2000s, the income share of the top 1% in the US approximately doubled, while around the same time their share of income from capital declined, and salaries and self-employment incomes became more important (Figure 1, also see Piketty and Saez, 2007a).
These changes in the income composition have been even more pronounced over the long run, with the share of income from capital among the top 1% dropping from close to 50% in the 1920s to less than 20% in the 2000s (Piketty and Saez, 2003, 2007a). Such an increase in the labor share at the top could arise from a change in the association between labor and capital, or because the share of tax units in the top who receive only capital income declines while there remain separate classes of laborers and capitalists. As part of the hypermeritocratic society (Piketty, 2014), the inequality in the wage distribution has increased substantially (Piketty and Saez, 2007a), providing support for the second channel. At the same time, tax units increasingly have income from both capital and labor (Wolff and Zacharias, 2009), and pure rentiers have virtually disappeared (Atkinson, 2009).
While the literature (e.g. Piketty and Saez, 2007a) has focused on the distribution of total income, as well as the distributions of capital and labor income separately, the association between the two income sources has received little attention. To extend the existing literature, this paper examines the association between capital and labor incomes at the top of the US income distribution. Using data based on US tax returns, we can directly measure the association. In a classical society, capitalists are at the top of the capital distribution and at the bottom of the wage distribution, i.e. the correlation between capital and labor incomes is negative. On the other hand, this correlation is generally positive (but less than one) in modern economies (Piketty, 2014). Our paper also links the study of the functional (capital v. labor) and the personal (rich v. poor) income distributions. In a classical society, the two map directly onto each other, with capitalists being rich and workers poor (Milanovic, 2017). With many people having income from more than one source (Atkinson, 2009), and the increasing inequality within capital and labor incomes (Lydall, 1968), this mapping is more complicated in modern economies.
We find that the association between capital and labor incomes increased between 1985 and 2006. Tax units at the top of the distributions of capital and labor incomes are increasingly the same people. There is some evidence of a U-shaped pattern: the association became weaker from 1966 to 1985 and then started increasing. The turning point comes after a fall in top marginal tax rates, which suggests that the rising association may be explained by high earners accumulating savings (i.e. future capital incomes) during a period of low top marginal tax rates. The association is found to be asymmetric in some parts of the distribution, with four-fifths of the top 1% of earners being among the top quintile of capital incomes, compared with only two-thirds of the top 1% capitalists being in the top quintile of earnings. That is, top wage earners are very likely to also receive high capital incomes, while capital incomes and rentiers have not disappeared from the top. This asymmetry is not found at the top of the distribution (top 5% and upwards). Our results are robust to treating negative incomes and capital gains differently, and to how we allocate income from self-employment and closely-held corporations to capital and labor.
We examine the association in two ways. We begin by decomposing the top 1% income share by factor incomes, which is frequently done with standard inequality measures, but has not been applied to top income shares. The inequality in total income is decomposed into the labor share in total income, the top 1% share within the distributions of capital and labor incomes, and the alignment coefficient, which captures the association. The alignment coefficient, like the Pearson correlation coefficient, is affected by monotone transformations in the marginal distributions. In the second part of the paper, we thus use a rank-based measure of association that is more general and invariant to such transformations. Specifically, we analyze the association matrices between labor and capital, which are a discrete approximation to the copula density and equivalent to transition matrices in the study of economic mobility. The literature on mobility also provides us with a test of increasing association based on cumulative association matrices.
The remainder of the paper is structured as follows. Section 1 describes the data. In Section 2, we develop and estimate a decomposition of top income shares by factor incomes.
Section 3 reports the results from the association matrices. Section 4 concludes and discusses how the changes in the association may be related to top marginal tax rates. income tax returns.
2 The data are based on a random sample of tax records filed during a particular calendar year. Importantly for our analysis, high-income returns are over-sampled, and we use the appropriate sampling weights to adjust for this. Following Piketty and Saez (2007a) and the literature using tax records more generally, the unit of analysis is a tax unit as defined under US tax law. We thus include singles and married couples without adjusting for differences in tax unit size. 3 Not every tax unit files a tax return; historically there were high exemption levels and income taxes only applied to the most affluent taxpayers.
We ignore the non-filing issue in this paper because we would have to make an arbitrary assumption about the income composition of the non-filers. Furthermore, this is unlikely to affect our results, since the fraction of filers is quite high and stable over the period of analysis (94% on average, compared with 9% before World War II). The PUFs are subject to some adjustments, especially at the top, that try to minimize the risk that individual taxpayers can be identified (Winglee et al., 2002). As a result, an observation in the PUF never contains all the information on a tax return, and may include information from other returns. Further information is provided in Appendix A.3, where we also show some robustness checks.
Income is defined as (taxable) gross market income, as reported on federal income tax returns. 4 We follow Piketty and Saez (2007b), Piketty et al. (2018) and Saez and Zucman (2016) to construct income components from the raw data that are comparable over time. We define labor income as the sum of wages and (taxable) pensions. 5 Self-employment income is the sum of sole-proprietorship (Schedule-C) and partnership income. In the baseline, capital income is defined as the sum of dividends, (taxable) interest, rents, estate income, royalties, and profits from S-corporations. 6 Because negative incomes can result in top shares for the distribution of capital or labor income that are greater than one, we drop observations which are negative in labor, self-employment or capital income.
2 Like other papers using these data (e.g. Zucman, 2016 or Piketty et al., 2018), we exclude the micro data for 1960, since it contains fewer tax return variables. There exist no PUFs for 1963 and 1965.
3 Our results are robust to including only tax units that have two adults. Between the 1960s and 2000s, the average tax unit size declined from 2.6 to almost 2 persons. The proportion of married tax units declined by 10pp in the whole population but slower at the top of the distribution (Saez, 2004). Lakner (2014) shows that the trend in top shares is robust to accounting for tax unit size. Piketty et al. (2018) also find very similar trends for tax units and adults. Tax units tend to be smaller than households; Hungerford (2010) estimates 75% of households to have one tax unit, and another 17% to have two tax units.
4 We exclude any income that is not taxable (e.g. non-taxable fringe benefits such as health insurance), since it is not reported on the tax return. Imputing such non-taxable incomes from the US Current Population Survey does not have a large effect on top income shares (Bivens and Mishel, 2013; CBO, 2012). We also exclude non-market or transfer income such as Social Security and unemployment insurance benefits. Since the PUFs do not capture all tax liabilities (e.g. exclusion of state and local taxes), we focus on gross incomes, like Piketty and Saez (2007a) and most of the literature on top incomes. It is also unclear how to split total federal income taxes between capital and labor without additional imputations. Finally, our analysis uses income that is observed from tax records (like Piketty and Saez, 2007a), which does not match national income (like Piketty et al., 2018) or macro totals in the financial accounts (like Saez and Zucman, 2016) (also see Figure A.5).
7 Self-employment income reflects returns to both human and physical capital, so it needs to be split between labor and capital.
In the baseline, we allocate two-thirds of self-employment income to labor and one-third to capital. While these weights are arbitrary, they are similar to earlier literature, close to factor shares found in national accounts (Gollin, 2002; Feldstein, 2008; Elsby et al., 2013; Karabarbounis and Neiman, 2014), and intermediate compared with more extreme weights considered in the robustness checks (Appendix A.5).
8 Table 1 presents summary statistics for the baseline income definition. The number of observations is 87,000 per year on average.
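The baseline income definition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names of `TaxUnit` are hypothetical (the PUF uses its own variable codes), sampling weights are ignored, and we assume that the negative-income check is applied to the three components before self-employment income is split.

```python
from dataclasses import dataclass

# Baseline weight from the text: two-thirds of self-employment income is labor.
LABOR_WEIGHT = 2.0 / 3.0


@dataclass
class TaxUnit:
    # Hypothetical field names, chosen for readability only.
    wages: float = 0.0
    pensions: float = 0.0
    sole_proprietorship: float = 0.0
    partnership: float = 0.0
    dividends: float = 0.0
    interest: float = 0.0
    rents: float = 0.0
    estate: float = 0.0
    royalties: float = 0.0
    s_corp_profits: float = 0.0


def factor_incomes(unit, labor_weight=LABOR_WEIGHT):
    """Return (labor, capital) for one tax unit, or None if the unit is dropped."""
    wage_like = unit.wages + unit.pensions
    self_emp = unit.sole_proprietorship + unit.partnership
    capital_like = (unit.dividends + unit.interest + unit.rents
                    + unit.estate + unit.royalties + unit.s_corp_profits)
    # Assumption: a unit is dropped if any of the three components is negative.
    if wage_like < 0 or self_emp < 0 or capital_like < 0:
        return None
    labor = wage_like + labor_weight * self_emp
    capital = capital_like + (1.0 - labor_weight) * self_emp
    return labor, capital
```

Setting `labor_weight=0.0` in this sketch corresponds to allocating all of self-employment income to capital.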
In Appendix A.5, we show that our results are robust to several alternative income definitions. First, instead of dropping negative observations, we set them to zero (similar to Saez and Stantcheva, 2017). Second, we include capital gains, which are an important income source at the top. The tax data only report realized capital gains, which are lumpy because realizations respond to changes in the tax code or asset prices. An accruals-based approach changes the timing of capital gains in the short run, but the long-run trend remains
5 Stock options are taxed as wage income when they are exercised.
6 S-corporations are businesses with few shareholders that are taxed at the personal instead of the corporate level. Including S-corporation profits with capital is similar to CBO (2012), and tries to address the shift of corporate income to the personal sector following the Tax Reform Act of 1986 (TRA86) (see below). Since we only include income that is taxable at the personal level, our definition of capital income excludes undistributed corporate profits.
7 This affects on average 7.5% of the (weighted) sample, due to negatives in self-employment (4.9% of self-employment incomes are negative) and/or capital income (3.0% of capital incomes are negative). The median total income of the excluded observations is similar to the 70th percentile. Using the same dataset, Auerbach and Hassett (2002) also drop observations with negative adjusted gross income.
8 Johnson (1954)
9 As Figure 1 shows, this pass-through income has become increasingly important at the top (also see Cooper et al., 2016). 10 Using evidence from firm owner deaths, Smith et al. (2017) estimate that 54% of S-corporation profits at the top represent labor income. (b) Our conclusions are also robust to moving in the opposite direction and allocating a greater share of self-employment income to capital. Following Saez and Zucman (2016), we include all of self-employment income (as well as S-corporation profits) with capital, instead of only one-third in the baseline.

Decomposition by factor incomes
We begin the analysis with a decomposition of top income shares by factor incomes. This is a formal derivation and the first empirical application of the decomposition by Atkinson (2007), who builds on Meade (1964). It is closely related to factor income decompositions of
Like these decompositions, inequality in total income is decomposed into three elements: The share of each factor in total income, the inequality in the distribution of income from each of the factors, and a term capturing the association between the incomes from different factors and total income.
9 Wages that the owner-manager of the S-corporation pays herself would have already been included in wage income.
10 S-corporation filing status became more attractive following TRA86, which reduced the top personal tax rate below the corporate tax rate (Slemrod, 1996; Auerbach and Slemrod, 1997). Since then, the incentives to file as an S-corporation or a C-corporation have remained similar (Smith et al., 2017).
The income share of top quantile $i$ can be written as
$$S_i = \frac{Y_i}{Y},$$
where $Y$ is total income in the data and $Y_i$ denotes total income of tax units with income greater than or equal to $y_i$, the threshold income (e.g. the 99th percentile in the case of the top 1% income share). For factor incomes $m$, the top income share can be decomposed as
$$S_i = \sum_m \frac{X_m}{Y}\, S_{i,m}\, A_{i,m}, \qquad A_{i,m} = \frac{\tilde{S}_{i,m}}{S_{i,m}},$$
where $X_m = \sum_{j=1}^{N} x_{j,m}$ is the total income from factor $m$, $S_{i,m}$ is the share of the top quantile $i$ within the distribution of income from factor $m$, $A_{i,m}$ is the alignment coefficient, and $\tilde{S}_{i,m} = \tilde{X}_{i,m}/X_m$ is the share of total income from factor $m$ received by the top quantile $i$ of total income recipients. 11 The alignment coefficient lies between 0 and 1 since shares are non-negative and $S_{i,m} \geq \tilde{S}_{i,m}$. 12 If top income recipients (according to total income) receive no labor income, $\tilde{X}_{i,l} = 0$ and $A_{i,l} = 0$. On the other hand, if everybody in the top quantile $i$ of the total income distribution is also found in the top quantile $i$ of the distribution of labor income, then $\tilde{S}_{i,l} = S_{i,l}$ and $A_{i,l} = 1$.
Figure 2 shows the results of the factor income decomposition for the top 1%. Panel A shows $S_1$, the top 1% share of total income, which roughly doubled over this period (as was already shown in Figure 1). While we estimate the top share at a somewhat lower level than Piketty and Saez (2007a), the two series track each other very closely, as discussed in the appendix (see Figure A.5).
11 Following Shorrocks (1982), $\tilde{S}_{i,m}$ may be called the pseudo share. It is different from $S_{i,m}$ because observations are ranked according to total, not factor, income.
12 After canceling out the incomes of tax units who are in the top quantile $i$ of both income from factor $m$ and total income, we can write $S_{i,m} - \tilde{S}_{i,m}$ as the difference between two sums of $x_{j,m}$, each divided by $X_m$: the first sum runs over tax units who are in the top quantile of factor $m$ but not of total income, the second over tax units who are in the top quantile of total income but not of factor $m$. This is weakly positive because in the first term all $x_{j,m}$ are at least as big as the cut-off level $x_{i,m}$, while in the second term they are strictly less than $x_{i,m}$.
13 Tax reforms may lead to re-timing or accounting responses (Slemrod, 1992), which can explain short-run fluctuations but they are unlikely to account for the long-term trends that we analyze in this paper (Alvaredo et al., 2013). Furthermore, it is reassuring that our results are robust to alternative income definitions (see Appendix A.5), such as the inclusion of capital gains, an important channel of tax avoidance.
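As a rough illustration of the decomposition, the following Python sketch computes, for each factor, its share in total income, the top share within the factor distribution, the pseudo share (ranking by total income), and the alignment coefficient, and then checks that the three elements multiply back to the top share of total income. It is a simplified sketch, not the estimation code behind Figure 2: it ignores sampling weights, takes the top group as the `k` highest-ranked units, and breaks ties by position. All function names are hypothetical.

```python
def top_indices(ranking, k):
    """Indices of the k units with the largest ranking value."""
    order = sorted(range(len(ranking)), key=lambda j: ranking[j], reverse=True)
    return order[:k]


def decompose_top_share(labor, capital, k):
    """Decompose the top-k share of total income by factor incomes."""
    total = [l + c for l, c in zip(labor, capital)]
    grand_total = sum(total)                       # Y
    top_by_total = top_indices(total, k)
    result = {
        "S": sum(total[j] for j in top_by_total) / grand_total,
        "factors": {},
    }
    for name, x in (("labor", labor), ("capital", capital)):
        factor_total = sum(x)                      # X_m
        # Top share within the factor distribution (ranked by the factor).
        share = sum(x[j] for j in top_indices(x, k)) / factor_total
        # Pseudo share: same factor, but units ranked by total income.
        pseudo = sum(x[j] for j in top_by_total) / factor_total
        result["factors"][name] = {
            "factor_share": factor_total / grand_total,
            "top_share": share,
            "pseudo_share": pseudo,
            "alignment": pseudo / share,
        }
    return result


example = decompose_top_share(
    labor=[10.0, 20.0, 30.0, 40.0],
    capital=[50.0, 0.0, 5.0, 5.0],
    k=1,
)
```

In this toy example the richest unit by total income is also the top capital recipient, so the capital alignment coefficient equals one, while the labor alignment coefficient is well below one because the top earner is not the richest unit overall.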

Results
13 As we already discussed, TRA86 brought the personal top marginal income tax rate below the corporate rate, thus providing incentives to move income from the corporate to the personal income tax base (Auerbach and Slemrod, 1997). TRA86 also raised tax rates on capital gains by including all realized capital gains in taxable income (Slemrod, 1996). The inequality in labor incomes, as measured by the top 1% share, increased very similarly to the inequality in total income, although the top labor share remains at a slightly lower level. The share of labor income going to the top 1% of earners approximately doubled, from 6% in the 1960s to 12% in the 2000s (panel C, left axis). These results mimic the estimates by Piketty and Saez (2007a), who impute for non-filers and also present independent evidence on executive compensation. 14 Capital incomes are distributed much more unequally than either labor or total income, as one would expect (Piketty, 2014). The top 1% share of capital incomes fell until the 1980s, then increased similarly to labor and total income, but continued to rise in the 2000s. The top 1% of capitalists now account for more than half of capital incomes, compared with around 30% in the 1980s (panel C, right axis). These results follow a very similar trend to the taxable capital income shares reported by Saez and Zucman (2016), who allocate all self-employment income to capital (also see Appendix A.5). 15 The alignment coefficient for labor income declined slightly from 91% to 88% in the late-1970s, before rising to 95% in the 2000s (panel D, left axis).
14 Saez and Veall (2007) find that in Canada top wages increased similarly to the US without the same changes in fiscal policy, suggesting that the US increase was real, not simply an accounting response due to changes in the tax code.
16 For capital income, the alignment coefficient is lower and follows a U-shaped pattern; it declined from almost 80% in the 1960s to 65% in the 1980s, before rising to 83% by the end of the period. 17 A value of 83% for the capital alignment coefficient means that 83% of total capital income of the top 1% of capitalists goes to tax units who are also in the top 1% of total income. Given that the top 1% capitalists receive 55% of capital income (panel C), this implies that 46% of all capital income goes to tax units who are in the richest 1% (this is $\tilde{S}_{1,c}$ above). The same statistic was 20% of all capital income in the mid-1980s. These estimates suggest that over the last 20 years capitalists are increasingly also at the top of the income distribution. Labor income has an even stronger association with total income: Around 95% of labor income of the top 1% of earners is received by tax units that are also in the richest 1%, compared with 83% for capital.
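The arithmetic behind these figures can be made explicit. Writing $S_{1,c}$ for the top 1% share within the capital income distribution, $A_{1,c}$ for the capital alignment coefficient, and $\tilde{S}_{1,c}$ for the share of all capital income received by the top 1% of total income recipients, and using the rounded values quoted in the text (so the products are approximate):

```latex
\begin{align*}
\tilde{S}_{1,c} &= A_{1,c} \times S_{1,c} \\
\text{2000s:} \quad \tilde{S}_{1,c} &\approx 0.83 \times 0.55 \approx 0.46 \\
\text{mid-1980s:} \quad \tilde{S}_{1,c} &\approx 0.65 \times 0.30 \approx 0.20
\end{align*}
```

Both elements rose between the two periods, which is why the share of all capital income going to the richest 1% more than doubled.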
18 We will examine this asymmetry in the association in more detail below.
tion is invariant to all monotone transformations in the marginals (Dardanoni and Lambert, 2001). In the remainder of the paper, we use an analytical framework based on the copula function, which offers a clean separation of the joint distribution of labor and capital into the marginal distributions and a rank-based measure of the association.
19 Our rank-based association measure is also more general because it considers the entire distribution, while the alignment coefficient for, say, the top 1% is determined only by whether observations cross the 99th percentile.
Total income is a two-dimensional vector $X = (L, K)$, where $L$ denotes labor income and $K$ refers to capital income. By Sklar's Theorem (Nelsen, 2006), there exists a copula function $C_X$ such that $H_X(l, k)$, the joint distribution function of $X$, can be written as
$$H_X(l, k) = C_X\{F(l), G(k)\},$$
where $F(l)$ and $G(k)$ are the marginal distribution functions of labor and capital income.
The density of the joint distribution is obtained by differentiating with respect to $l$ and $k$:
$$h_X(l, k) = C_{FG}\{F(l), G(k)\}\, f(l)\, g(k),$$
where $f$ and $g$ are the marginal densities and $C_{FG}\{F(l), G(k)\}$ is the copula density. The joint density can thus be expressed as the product of the marginals and the copula density, which is a rank-based measure of the association. The association matrix between labor and capital, shown in Table A.1 for 2006, is a discrete approximation to the copula density (Bonhomme and Robin, 2009). 20 The bins are defined in terms of ranks, splitting the distributions of labor and capital income into eight quantile groups: The bottom 50% (≤ P50), the next 10% (P50-P60), the next 20% (P60-P80), the next 10% (P80-P90), the next 5% (P90-P95), the next 4% (P95-P99), the next 0.5% (P99-P99.5) and the top 0.5% (>P99.5). 21 The association matrix is equivalent to transition matrices used to study economic mobility. Following Atkinson (1981), who examines transition matrices, we can test whether the degree of association between labor and capital has increased. Consider the following two association matrices $A$ and $A^*$:
$$A = \begin{pmatrix} p_{i-1,j-1} & p_{i-1,j} \\ p_{i,j-1} & p_{i,j} \end{pmatrix}, \qquad A^* = \begin{pmatrix} p_{i-1,j-1} + \gamma & p_{i-1,j} - \gamma \\ p_{i,j-1} - \gamma & p_{i,j} + \gamma \end{pmatrix},$$
where $i$ and $j$ are particular quantile groups (of labor and capital), $p_{i,j}$ is the frequency in the association matrix, and $\gamma > 0$. $A^*$ is obtained from $A$ by a correlation-increasing (or diagonalizing) switch, which adds $\gamma$ to the diagonal elements and subtracts it from the off-diagonal elements. This switch increases the weight on the diagonal, such that $A^*$ exhibits a stronger association between labor and capital, but it leaves the marginal distributions unchanged. Let $\alpha$ and $\alpha^*$ be the survival association matrices of $A$ and $A^*$, which are obtained by cumulating the association matrices from above. The only element that differs between them is $\alpha^*_{i,j} = \alpha_{i,j} + \gamma$, so that $\alpha^* - \alpha \geq 0$ (see Appendix A.1).
19 Copula functions have been widely used in actuarial science to describe multidimensional risks. In economics, they have been used to study the joint distribution of income and wealth (Kennickell, 2009; Jäntti et al., 2015), the horizontal equity of the tax system (Dardanoni and Lambert, 2001), income mobility by considering the dependence over time (Bonhomme and Robin, 2009), and multi-dimensional inequality and poverty (Atkinson, 2011; Ferreira and Lugo, 2013; Decancq, 2014).
20 Parametric copulas tend to impose symmetry. Since the association between capital and labor is asymmetric, we will adopt a non-parametric approach and use association matrices directly.
of the capital income distribution. This is greater than 0.0025%, which would be the frequency if the two variables were independent, but less than 0.5%, the frequency with perfect association.
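To make the construction of the association matrix concrete, the sketch below assigns each tax unit to one of the eight quantile groups of labor and of capital income and tabulates the joint frequencies. It is an unweighted illustration with hypothetical function names; the actual analysis uses sampling weights, and the tie-breaking rule (by position in the data) is an assumption of this sketch.

```python
from bisect import bisect_left

# Upper rank bounds of the first seven groups; the eighth group is > P99.5.
CUTS = [0.50, 0.60, 0.80, 0.90, 0.95, 0.99, 0.995]


def rank_fractions(values):
    """Fractional rank in (0, 1] of each value; ties broken by position."""
    n = len(values)
    order = sorted(range(n), key=lambda j: values[j])
    ranks = [0.0] * n
    for position, j in enumerate(order, start=1):
        ranks[j] = position / n
    return ranks


def quantile_group(rank, cuts=CUTS):
    """Index of the quantile group; a rank equal to a cut stays in the lower group."""
    return bisect_left(cuts, rank)


def association_matrix(labor, capital, cuts=CUTS):
    """Joint frequencies of labor groups (rows) and capital groups (columns)."""
    n = len(labor)
    size = len(cuts) + 1
    matrix = [[0.0] * size for _ in range(size)]
    labor_ranks = rank_fractions(labor)
    capital_ranks = rank_fractions(capital)
    for rl, rk in zip(labor_ranks, capital_ranks):
        matrix[quantile_group(rl, cuts)][quantile_group(rk, cuts)] += 1.0 / n
    return matrix


# Perfect association: the same ordering for labor and capital.
incomes = [float(v) for v in range(1000)]
perfect = association_matrix(incomes, incomes)
# Benchmark frequency of the top-right cell under independence: 0.5% x 0.5%.
independent_top_cell = 0.005 * 0.005
```

With perfect association the cell for the top 0.5% of both distributions has frequency 0.5%, and under independence it would be 0.0025%, which are the two benchmarks quoted in the text.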
22 Given our interest in the top tail of the distribution, it makes sense to consider the survival copula.
Similar to the expression above, the joint survival function can be written as $\bar{H}_X(l, k) = \bar{C}_X\{\bar{F}(l), \bar{G}(k)\}$, where $\bar{C}_X$ is the survival copula, and $\bar{F}(l) = 1 - F(l)$ and $\bar{G}(k) = 1 - G(k)$ are the survival distributions (or complementary cumulative distribution functions) (Nelsen, 2006).
Therefore, if the difference between the survival association matrices in years t+1 and t is everywhere positive, labor and capital incomes have become more closely associated between those years, thus moving away from a class model, where one class is at the top of the labor distribution and the other at the top of the capital distribution. 23
23 This is a test of first-order dominance, which will be sufficient for this paper. To go beyond first-order dominance, one would need to place additional restrictions on the social welfare functions, effectively giving a different weight to the association in different parts of the distribution (Atkinson, 1981; Aaberge, 2009; Aaberge et al., 2017).
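The dominance check can be sketched as follows. The survival association matrix is built by cumulating cell frequencies from the top, and the later year dominates the earlier one if no element of the difference is negative. The function names and the numerical tolerance are assumptions of this sketch, and the check is for weak dominance (no negative differences), without any allowance for sampling error.

```python
def survival_matrix(p):
    """alpha[i][j] = sum of p[a][b] over all a >= i and b >= j."""
    rows, cols = len(p), len(p[0])
    # Pad with a border of zeros so the recursion also works at the edges.
    alpha = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            alpha[i][j] = (alpha[i][j + 1] + alpha[i + 1][j]
                           - alpha[i + 1][j + 1] + p[i][j])
    return [row[:cols] for row in alpha[:rows]]


def weakly_dominates(p_late, p_early, tol=1e-12):
    """True if the survival matrix of p_late is nowhere below that of p_early."""
    late = survival_matrix(p_late)
    early = survival_matrix(p_early)
    return all(x - y >= -tol
               for row_late, row_early in zip(late, early)
               for x, y in zip(row_late, row_early))


# A correlation-increasing switch of size gamma on a 2x2 association matrix.
gamma = 0.1
A = [[0.25, 0.25], [0.25, 0.25]]
A_star = [[0.25 + gamma, 0.25 - gamma], [0.25 - gamma, 0.25 + gamma]]
```

In this toy case the two survival matrices differ only in the bottom-right element, by `gamma`, so `A_star` dominates `A` but not the other way round.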

Results
We begin by examining the long-run evolution of some statistics from the association matrix before testing for first-order dominance for selected years. Figure 3 shows several conditional probabilities that are obtained from the survival association matrix. 80% were also among the top quintile of capitalists, which is shown in Figure 3. However, this asymmetry is not present at the very top, since the results for the top 5% (as opposed to the top quintile) are more similar across labor and capital. These results are confirmed when we condition on the top 5% or the top 0.5% (Figures A.3 and A.4), and when we use alternative income definitions (Figure A.7). In any case, these estimates suggest a high degree of association: If the top 1% of earners had randomly been assigned capital incomes, 20% of them would be among the richest quintile of capitalists, compared to the observed 80%.
We test for first-order dominance between 1966, 1985 and 2006, capturing the two 20-year periods that we have just described. 25 The differences between the survival association matrices are shown in Table 2.

Conclusion
This paper has studied the association between capital and labor incomes at the top of the distribution using tax return data. This helps to understand the driving forces behind the rise in the top 1% income share, which has been documented before (Piketty and Saez, 2003).
We find that capital and labor incomes have become more closely associated between 1985 and 2006, such that top capitalists and top earners are increasingly the same people. This rising association has contributed to the well-known increase in the top 1% income share, exacerbating the effects of rising inequality within capital incomes and earnings. In contrast, the 20 years leading up to 1985 saw a tendency towards a declining association, thus resulting in a U-shaped pattern. The association is asymmetric in some parts of the distribution, as a top earner is almost guaranteed to also be among the richest fifth of capitalists, while a sizable share of top capitalists fall into the bottom four-fifths of earnings. The association is more symmetric for richer quantiles, such as the top 5%. Our conclusions are robust to alternative treatments of negative incomes and capital gains, and how profits from self-employment and closely-held businesses are split between capital and labor.
The reversal from declining to increasing association coincided with a strong fall in the top marginal income tax rate in the US (dotted line in Figure 3). The top marginal rate declined from 91% in the early 1960s to 28% in 1986, and remained below 40% for the rest of the period. Lower taxes at the top raise the reward to bargaining more aggressively for higher pay, and therefore may explain the rapid rise in (gross) salaries at the top, which account for a large share of the increase in top income shares (Bakija et al., 2012; Alvaredo et al., 2013; Piketty et al., 2014). 26 Lower taxes may also account for the increasing association: When top marginal tax rates are low, tax units can save a greater share of their wages, thus accumulating more capital income over time.
27 This explanation assumes high saving rates and only limited mobility at the top of the wage distribution, which is confirmed by the empirical evidence. 28 Our finding of an asymmetric association in some parts of the distribution also fits a model in which high earners accumulate capital incomes out of labor income.
Our paper sheds light on the evolution of the association between capital and labor incomes during the last 40 years, when the top marginal tax rate declined strongly. It is unclear how the association will evolve in the future as there are two opposing forces. On the one hand, the high concentration of labor incomes, coupled with low top marginal tax rates and high saving rates at the top, is unlikely to go away. On the other hand, we may see a reemergence of rentiers, as the high earners retire, which would reduce the association. 29

26 The decline in top marginal tax rates is not the only possible explanation for these patterns (also see Alvaredo et al., 2013). Other explanations may include a superstar theory together with a globalized economy (Atkinson, 2008), the spread of performance-based pay (Lemieux et al., 2009)
27 Saez and Zucman (2016) discuss the effect of increasing top incomes and high savings rates for wealth inequality, which is closely related to the distribution of capital incomes. Kaymak and Poschke (2016) present a formal model where a decline in income tax progressivity leads to an increase in wealth inequality.
28 Saez and Zucman (2016) find high and increasing saving rates for the top 1%. Kopczuk et al. (2010) show that around two-thirds of the top 1% of earners are still there after three years, with little change since the late 1970s.
29 The average age at the top of the income distribution has increased since the 2000s, after having fallen for 20 years (Piketty et al., 2018).

A.1 Derivation of first-order dominance test
Association matrices $A$ and $A^*$ are defined as
$$A = \begin{pmatrix} p_{i-1,j-1} & p_{i-1,j} \\ p_{i,j-1} & p_{i,j} \end{pmatrix}, \qquad A^* = \begin{pmatrix} p_{i-1,j-1} + \gamma & p_{i-1,j} - \gamma \\ p_{i,j-1} - \gamma & p_{i,j} + \gamma \end{pmatrix},$$
where $i$ and $j$ are particular quantile groups (of labor and capital), $p_{i,j}$ is the frequency in the association matrix, and $\gamma > 0$. $A^*$ is obtained from $A$ by a correlation-increasing switch, which raises the weight on the diagonal without changing the marginal distributions. Hence $A^*$ exhibits stronger association between labor and capital. The survival association matrix of $A$, denoted by $\alpha$, has elements $\alpha_{i,j} = \Pr(l > l_i \cap k > k_j)$. These satisfy $\alpha_{i,j} = \alpha_{i,j+1} + \alpha_{i+1,j} - \alpha_{i+1,j+1} + p_{i,j}$; the diagonal element $\alpha_{i+1,j+1}$ needs to be subtracted because adding the adjacent elements $\alpha_{i,j+1}$ and $\alpha_{i+1,j}$ double-counts these cells. All other cells follow from the same formula. Similarly, the survival association matrix of $A^*$ is denoted by $\alpha^*$, for which $\alpha^*_{i,j} = \alpha_{i,j} + \gamma$, while $\alpha^*_{i,j-1} = \alpha_{i,j-1}$ and $\alpha^*_{i-1,j} = \alpha_{i-1,j}$ because $\gamma$ cancels out. After canceling out $\gamma$, it is clear that the only difference between $\alpha$ and $\alpha^*$ is $\alpha^*_{i,j}$, such that $\alpha^*_{i,j} = \alpha_{i,j} + \gamma$. Therefore, taking the difference between $\alpha^*$ and $\alpha$ yields the following result (also see equation 4 in the main text):
$$\alpha^*_{i,j} - \alpha_{i,j} = \gamma > 0, \quad \text{with all other elements of } \alpha^* - \alpha \text{ equal to zero.}$$

A.3 Adjustments made to public use files
The public use files (PUF) are subject to some adjustments that try to avoid individual taxpayers being identified. Because public data on executive compensation may be available, salaries reported on tax returns have been blurred (or micro-aggregated) since 1983 by replacing adjacent records by their average. Therefore, an observation in the PUF never contains all the information on a tax return, and may include information from other returns.
Before 1996, this only affected salaries at the top (the top 1% or less), so it is unlikely to affect our results substantially. 30 We present two robustness checks which confirm that the association has increased between 1982 and 1995. None of the income components we use were blurred in 1982. 1995 is the last year before blurring was applied to salaries throughout the distribution, and before profits from sole-proprietorships were also blurred.
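The blurring procedure can be illustrated with a generic micro-aggregation sketch: records are sorted, split into runs of adjacent records, and each value is replaced by the mean of its run. The group size of three and the choice to sort by the blurred variable itself are assumptions made for illustration, not the documented PUF procedure (see Winglee et al., 2002, for the actual construction).

```python
def micro_aggregate(values, group_size=3):
    """Replace each run of adjacent (sorted) records by the mean of the run."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    blurred = list(values)
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        mean = sum(values[j] for j in group) / len(group)
        for j in group:
            blurred[j] = mean
    return blurred


salaries = [100.0, 10.0, 40.0, 20.0, 30.0, 90.0]
blurred_salaries = micro_aggregate(salaries, group_size=3)
```

In this sketch the total across records is preserved by construction, but no individual blurred value needs to equal the value on the original return.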
Since the blurring affects income components, but not total income, one can attempt to recreate the raw salaries from the correctly recorded total income, as we do here for 1995. It will be impossible to reproduce the raw salaries with certainty, since multiple income components have been removed or blurred at the same time, and components have been rounded, but it is nevertheless a useful robustness check. For example, in 1995 alimony paid and received, which is part of total taxable income, was removed for high-income tax units and blurred for low-income tax units. For the low-income observations, for whom salaries were not blurred, our recreated salaries are 5% greater on average than the raw salaries in 1995. The recreated salaries also contain a substantial number of negatives (almost 8% of the high-income observations), which we set to zero. Our recreated salary variable combines the raw salaries for the low-income observations (approximately the bottom 99%) with the recreated salaries for the high-income observations.
30 Other variables such as alimony payments, real estate deductions or the state of residence were also blurred or removed, but we do not use them in our analysis. Because the PUFs also exclude some records at the very top (between 13 and 191 records during 1996 to 2008, as reported by Piketty et al.), Saez and Zucman (2016) and Piketty et al. (2018) augment the PUF with a synthetic observation to match the totals above $10m. The cut-off for the top 0.5%, which is the smallest group we consider in our analysis, is far lower than that, so this is unlikely to affect our results. Sailer et al. (2001) find that the original and blurred data match well for the top 1%, but they find larger differences for the top 400 taxpayers. For a full description of the PUF construction, see Winglee et al. (2002) and http://users.nber.org/~taxsim/gdb/.
31 Salaries were blurred also for the low-income observations. Sole-proprietorship profits were now blurred.
Figure A.5 compares our top 1% income share with the estimates by Piketty and Saez (2007a) (taken from Saez, 2013).
Our methodology differs from Piketty and Saez in several aspects: (a) Piketty and Saez adjust for non-filing and define top quantiles relative to the entire US population of potential taxpayers, while we only consider the tax units that file a return. (b) Piketty and Saez use total gross market income reported on the tax return, while we report the sum of income components. We thus exclude some small income sources, such as alimony.
The disclosure avoidance procedures, which affect income components but not total income, could also lead to differences (see

A.5 Robustness checks for alternative income definitions
We replicate the main results for five alternative income definitions, which are summarized in Table A.4 and in the main text (Section 1). Capital gains have been adjusted to account for changes in legislation affecting the taxable portion of capital gains. 33
Allocating self-employment income entirely to capital income has a large effect (also see panel E in Figure A.6), but this is a very extreme allocation rule that seems unrealistic, since self-
33 This income definition is used when the US results of this paper are discussed in comparison with Norway, see unpublished note by Aaberge et al. (2017).
34 In addition to the sample selection adopted in the baseline, observations with negative S-corporation profits are excluded in this robustness check.