Policy Research Working Paper                    9804




                 Fair Inheritance Taxation
                               Benoit Decerf
                             François Maniquet




Development Economics
Development Research Group
October 2021


  Abstract
 This paper studies the optimal taxation of bequests in a model in which agents have
 heterogeneous preferences over their consumption and the net-of-tax bequest received by
 their heir. The bequest left by an individual depends on both her degree of altruism and
 the bequest received from her parents. First, the paper studies two principles that are
 at the heart of the debates on taxing inheritances: (1) children should not be penalized
 by the lack of altruism of their parents, and (2) parents should be free to choose their
 bequests. Only one social welfare function satisfies these two principles, together with
 Pareto efficiency and a separability principle. Second, the paper studies the shape of the
 inheritance tax scheme that maximizes this social welfare function. It shows that in the
 aggregate, the inheritance tax must collect money (redistributed through a non-negative
 demogrant). Moreover, small bequests cannot be taxed (they can potentially be subsidized),
 while bequests that are larger than those of the most altruistic individuals who did not
 receive bequests from their parents should be taxed as much as efficiency permits.




 This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the
 World Bank to provide open access to its research and make a contribution to development policy discussions around the
 world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may
 be contacted at bdecerf@worldbank.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
                       Fair Inheritance Taxation∗
                   Benoit Decerf†               François Maniquet‡




          Keywords: fairness, inheritance taxation, responsibility, compensation,
      tax exemption

          JEL: D63, D64, H21

   ∗ Acknowledgments: We are grateful to Dirk Neumann and Kevin Spiritus for helpful
discussions. We thank all the participants in the DEFIPP workshop of March 2017 (University
of Namur) and the “What is socialism today” conference (Yale University) for their comments.
Funding from the Banque Nationale de Belgique (Exercice 2017) is gratefully acknowledged.
The findings, interpretations, and conclusions expressed in this paper are entirely those of the
authors and should not be attributed in any manner to the World Bank, to its affiliated
organizations, or to members of its Board of Executive Directors or the countries they represent.
The World Bank does not guarantee the accuracy of the data included in this publication and
accepts no responsibility for any consequence of their use. The usual disclaimer applies.
   † World Bank. bdecerf@worldbank.org
   ‡ University of Louvain-la-Neuve. francois.maniquet@uclouvain.be
1     Introduction
Piketty and Saez (2013) deeply questioned the social desirability of recent
reforms of inheritance tax systems. Indeed, these reforms have lowered
inheritance taxes, whereas Piketty and Saez (2013) showed that the historically
high top inheritance tax rates, near 70%, observed in the United States and the
United Kingdom over the 1950s, 1960s and 1970s were within the range of optimal
rates (see Figure 3 in Piketty and Saez (2013)). Optimality here is measured
with respect to the maximization of some social welfare function.
    In this paper, we solve two problems that are left unsolved in Piketty and
Saez (2013). First, we show how fairness principles can be used to endogenize
social welfare weights. In Piketty and Saez (2013), indeed, the formula of the
optimal tax is a function of how society values the utility of different
individuals, but in a world in which individuals differ in many dimensions, all
of which are likely to create inequality, it is not clear who should be given
priority.1 We solve this problem by resorting to two fairness principles, which
are at the heart of the debates on taxing inheritance. The first principle is
that parents should be free to choose their bequest (McCaffery, 1994). This is
consistent with the principle of responsibility for one’s preferences, which has
inspired recent developments in optimal taxation theory (like, among others,
Fleurbaey and Maniquet (2018), Lockwood and Weinzierl (2016), Piketty and Saez
(2013)). The second fairness principle is that children should not be penalized
by the lack of altruism of their parents.
    These two fairness principles may seem to conflict, but we show that they
do not. This comes from the fact that each individual is both a child, who may
wish to receive a part of the bequest allocated to children of other families, and
a parent, who wishes to be free to bequeath the amount she prefers. As a result,
there exists one social welfare function (SWF) that reconciles both principles.
This is the social welfare function that we maximize to identify the shape of the
optimal tax function.
    Second, and most importantly, we identify the shape of the optimal
inheritance tax on low bequests, that is, bequests left by parents who did not
receive anything from their own parents. In Piketty and Saez (2013), indeed, the
tax is assumed to be either linear, or linear after an interval of exemption.
Starting with a much broader set of admissible tax functions, we prove that the
optimal function has one of the following two shapes. Either it exempts low
bequests, or it first subsidizes them and then taxes them, in which case the
largest tax amount on low bequests cannot be larger than the supremum of the
bequests that are subsidized.
    An intermediate result in our analysis of the optimal tax function is that
bequests should, on average, be taxed rather than subsidized. This coincides
with a result of Piketty and Saez (2013), except that they reach this conclusion
only for some distribution of normative weights. This further illustrates how
important it is to be able to endogenize these weights as a function of fairness
principles.
   1 Piketty and Saez (2013) take advantage of the flexibility allowed by the theory of general
social marginal welfare weights (Saez and Stantcheva, 2016) in order to evaluate different
social welfare choice criteria, all of which are different from the one we derive.

    Three policy recommendations are consistent with our results. First, tax-
ing bequests should be viewed as a way to redistribute from individuals who
inherited from their parents to those who were less lucky. That means that the
tax/transfer scheme of bequests should bring some strictly positive surplus to
the government, money that should be allocated to all individuals independently
of how much they receive from their parents and how much they leave to their
children.
    Second, the trade-oﬀ between increasing the transfers to all and decreasing
inheritance taxes should be solved by looking at how they aﬀect the well-being
of those who did not receive inheritance from their parents. Indeed, the most
altruistic among them will prefer a decrease in inheritance tax (or an increase
in inheritance subsidy) whereas the self-centered among them will prefer an
increase in transfer.
    Third, in spite of a positive average tax on bequests, there can be two reasons
to subsidize low bequests. The ﬁrst reason, reminiscent of results from the
literature that we review below, is eﬃciency, because subsidizing may be a way
to obtain a dominating distribution of bequests. The second reason is fairness,
because it is a way to redistribute among the poorest individuals (the ones who
did not get any bequests from their parents) from self-centered to altruistic
individuals. This, however, goes with a precise condition: fairness can only
justify subsidies in an interval that goes up to the bequests left by the poorest
and most altruistic individuals.
    The literature on inheritance taxation has raised a number of diﬀerent is-
sues, among which the issue of the shape of the tax function maximizing an
inequality-sensitive social welfare function is a central one. This literature has
not reached any consensus. This comes from diﬀerences in, ﬁrst, the modeling
of the interactions between parents and children, and second, the objective that
the egalitarian planner is supposed to follow. In many papers, individuals are
assumed to have the same preferences but different abilities to earn income, with
the consequence that the planner tries to redistribute from high- to low-wage
individuals. Because inheritance inequality does not reveal new information about
individuals’ wages, Farhi and Werning (2010) and Kaplow (2001), following an
Atkinson and Stiglitz (1976) type of argument, prove that taxing labor income
is more efficient than taxing inheritance. In a similar model, Kopczuk (2013a)
makes the point that a countervailing force in favor of bequest taxation is that
receiving a large inheritance discourages labor supply. Kaplow (1995), Farhi and
Werning (2010) and Kopczuk (2013a) all make the point that subsidizing bequests
is a way to incentivize parents to internalize the positive externality of giving.
    In Piketty and Saez (2013), on the contrary, diﬀerences in bequests do not
necessarily come from diﬀerences in parents’ ability to earn income. They can
come from parents’ altruism, which differs across parents. As a result, taxing
inheritance can be the most eﬃcient way to redistribute from lucky to unlucky
children. Moreover, Piketty and Saez (2013) do not divide individuals into
parents and children. They rather consider the entire lifetime, so that all in-
dividuals are children and parents in turn. Fiscal policies then aﬀect both the
resources that individuals receive early in life and the tax they pay at the end
of their life. The positive externality is now reﬂected in the level of sustainable
tax and transfer policies. As a consequence, Piketty and Saez (2013) show that
taxing bequests may end up being optimal.


    As explained above, we keep the same intergenerational setting as Piketty
and Saez (2013), but we characterize a speciﬁc social welfare function and we
study tax functions in a larger domain. We prove that tax functions that ﬁrst
subsidize low bequests and then tax larger bequests can be optimal. These
functions are not studied by Piketty and Saez (2013), but Kopczuk (2013a)
conjectures that they might be optimal.
    As discussed in reviews by Cremer and Pestieau (2006) and Kopczuk (2013b),
the eﬃciency and fairness implications of inheritance taxation may depend on
the bequest motive. For instance, accidental bequests, which exist in the ab-
sence of a perfect annuity market when parents die before consuming all their
savings, can be taxed without any eﬃciency costs. We study inheritance taxa-
tion under a joy-of-giving motive, which explicitly acknowledges the desire that
parents may hold to leave a bequest. The legitimacy of such desire is central in
discussions surrounding the taxation of bequests (McCaﬀery, 1994). An alter-
native bequest motive consistent with this desire is altruism, whereby parents
care for the utility of their child (whereas parents care only about the net-of-tax
inheritance received by their child under a joy-of-giving motive). The altru-
istic motive is at the center of the Barro-Becker dynastic model, which has
been widely studied in the literature on optimal capital/inheritance taxation.
Most centrally, Chamley (1986) and Judd (1985) conclude that the tax on
inheritances should be zero in the long run. More recently, Straub and Werning
(2020) overturn this early result by showing that it holds only for high values
of the intertemporal elasticity of substitution; otherwise, the tax is positive
and significant. We do not consider altruism, and it remains an open question
whether our results also hold in this alternative setting. One result that is very
likely to carry through is that the inheritance tax should globally collect a
non-negative amount. The reason is that any tax violating this condition would be
dominated by laissez-faire, because such a tax would hurt self-centered
individuals whose parents are self-centered. We also note that the altruistic
model of bequests has been empirically tested and rejected by Wilhelm (1996).
    Some authors, such as Cremer et al. (2001), study how the optimal tax is
affected when the number of children differs across families and parents may
decide not to give equal bequests to all their children. We could take that into
account at least to some extent: the worst-off would remain the same under
more general assumptions on the number of children.
    Some other aspects of inheritance taxation are completely ignored in our
analysis. Cremer et al. (2003) study capital income taxation as subsidiary to
inheritance taxation in a world in which bequests may not be observable. Nord-
blom and Ohlsson (2006) study the possibilities to escape bequest taxation
through inter-vivos gifts. Stantcheva (2015) studies inheritance taxation in its
relationship to investments in human capital. Mirrlees et al. (2010) discuss the
administrative cost implied by the collection of an inheritance tax. Golosov
et al. (2003) and Kocherlakota (2005) study tax instruments that are allowed to
vary over time, leaving more room for improving welfare. Fleurbaey et al. (2018)
study the implications of ex-post egalitarianism for the taxation of accidental
bequests resulting from premature mortality.
    In Section 2, we describe the model, emphasizing the similarities and
differences with the pioneering model of Piketty and Saez (2013). In Section 3, we
discuss our social welfare function and the axioms that justify it. In Section 4,
we study the optimal tax function and we state our main results. In Section 5,
we provide some concluding comments. In Section 6, we develop the proofs of
the results.


2     The model
We consider an economy with a discrete set of successive generations, 0, 1, . . ..
Each generation contains a set [0, 1] of individuals of measure 1. We use λ to
denote the probability measure on [0, 1], that is the mass of individuals whose
names are between i and j (i, j ∈ [0, 1], i < j ) is equal to j − i. We let M [0, 1]
denote the set of Lebesgue-measurable subsets of [0, 1] and µ(J ) denote the
measure of J ∈ M [0, 1].
    Remember that, contrary to Piketty and Saez (2013), we are not interested
in the trade-off between labor income taxation and bequest taxation. We only
raise the question of whether total lifetime labor income should be taxed
(resp., subsidized) so as to subsidize (resp., tax) (at least some) bequest leavers.
As a result, we assume that all individuals earn an identical lifetime income
of w. Therefore, diﬀerences in lifetime budgets only come from the bequests
individuals get (or not) at the beginning of their life. The assumption of equal
w among individuals, however, is far from necessary for our results. We come
back to this issue in the conclusion.
    Preferences are deﬁned over lifetime consumption, c, and the inheritance
received by their heir, h. Preferences are heterogeneous in the population. We
make three assumptions on the joy-of-giving utility functions representing these
preferences.

    1. We assume preferences to be normal in both goods, consumption and
       inheritance, at all prices.
    2. We assume that at each period there exist some selfish individuals, that
       is, individuals whose utility function is
                                    $u^s(c, h) = c.$
       A consequence of this assumption is that at each period there exist some
       individuals who do not receive any inheritance.
    3. Finally, we assume that at each period there exist individuals exhibiting
       the largest level of altruism. That is, some individuals have utility function
       $u^a$ such that, for every compact opportunity set $B \subset \mathbb{R}^2_+$,
       if bundle $(c^a, h^a)$ is the best bundle in $B$ according to $u^a$ and
       $(c_i, h_i)$ is the best bundle in $B$ according to the utility function
       $u_i$ of any other individual, then $h^a \geq h_i$. Observe
       that this assumption would be a consequence of imposing the classical
       single-crossing property (which amounts to assuming that individuals can
       be ranked according to their level of altruism) and a kind of compactness
       of the domain of preferences. We do not need these assumptions and limit
       ourselves to imposing the existence of most altruistic individuals.

Individuals live one period, after which they are replaced by the individual of
the same dynasty in the next generation. In addition to goods c and h, it
is convenient to consider the quantity of money that an individual receives at
the beginning of her life, g, and the bequest left by an individual, b, which is
the quantity of money that she does not consume, for the benefit of her heir.
Quantities g, c, b and h will be related to each other when we model taxation
below, but we do not need to introduce these relations in the first step of our
analysis, when we discuss the social welfare function.
    Indeed, we begin by focusing on what happens to one generation. Anticipating
that we will later restrict our attention to long-run equilibrium allocations,
we restrict our attention to allocations in which the distribution of money
received by this generation from the previous one, $g$, is identical to the
distribution of money received by the following one, $h$. Formally, individual
$i \in [0, 1]$ consumes a bundle

$$z_i := (g_i, c_i, h_i) \in X = \mathbb{R}^3_+.$$

An allocation $z \in Z := X^{[0,1]}$ is a function $z : [0, 1] \to X$. An allocation
$z = (g_i, c_i, h_i)_{i \in [0,1]} \in Z$ is a steady-state allocation if the
distribution of the $g_i$'s is equal to the distribution of the $h_i$'s, that is, if
$(\hat{g}_i)_{i \in [0,1]} = (\hat{h}_i)_{i \in [0,1]}$, where
$(\hat{x}_i)_{i \in [0,1]}$ denotes the permutation of $(x_i)_{i \in [0,1]}$ in
which elements are ranked in increasing order. We let $S$ denote the set of
steady-state allocations. A (one generation) economy is a profile of utility
functions $u = (u_i)_{i \in [0,1]} \in \mathcal{U} = U^{[0,1]}$,
where $U$ is the set of acceptable utility functions. In particular, $u^s, u^a \in U$.
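
The steady-state condition can be illustrated on a finite population, which is a simplifying assumption here since the model uses a continuum of agents. The sketch below (the function name and the numbers are hypothetical) checks whether the sorted list of inheritances received coincides with the sorted list of inheritances passed on.

```python
def is_steady_state(g, h, tol=1e-9):
    """Return True if the increasing rearrangements of g and h coincide.

    g[i] is the money received by agent i, h[i] the inheritance received
    by agent i's heir. Finite lists stand in for the continuum [0, 1].
    """
    if len(g) != len(h):
        return False
    return all(abs(a - b) <= tol for a, b in zip(sorted(g), sorted(h)))


# Hypothetical four-agent economy: which dynasty holds which amount
# changes from one generation to the next, but the distribution does not.
received = [0.0, 2.0, 0.0, 5.0]    # g_i
passed_on = [2.0, 0.0, 5.0, 0.0]   # h_i

print(is_steady_state(received, passed_on))             # True
print(is_steady_state(received, [1.0, 1.0, 5.0, 0.0]))  # False
```

Note that the condition constrains only the distribution: an individual dynasty may move up or down between generations.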


3     Social welfare
In this section, we deﬁne the social welfare function (SWF) that we use in the
next section to study the optimal tax and we discuss its axiomatic foundation.
    The SWF works by applying the lexicographic aggregator to individual
well-being indices that capture the fairness principles of the planner. Each index
represents the preferences of an agent. Let us define this index first. It is
illustrated in Fig. 1. Agent $i$ is consuming $(c_i, h_i)$. The indifference curve
through $(c_i, h_i)$ shows that agent $i$ is indifferent between $(c_i, h_i)$ and
maximizing her utility over a budget of slope $-R$ starting at $(c, 0)$, where $R$
is the exogenous rate of return on savings per generation. Budgets of slope $-R$
are first-best, or laissez-faire, budgets, that is, budgets in the absence of
taxation. Put differently, this agent is indifferent between her actual
consumption, $(c_i, h_i)$, and being free to allocate a wealth of $c$ between own
consumption today and children’s inheritance tomorrow in the absence of taxation.
We state that this agent has a current well-being of $c$. The objective of the
planner is to maximize the lowest well-being, and in case of a tie, to maximize
the second lowest well-being, etc.
    To define this SWF formally, we need the following notation. A social
ordering is a complete ordering on steady-state allocations. A Social Welfare
Function (SWF) is a function $\mathbf{R}$ associating each economy
$u \in \mathcal{U}$ with a social ordering $\mathbf{R}(u)$.
    We define the intertemporal budget set of agent $i$ with bundle
$z_i = (g_i, c_i, h_i)$ when laissez-faire prevails as2

$$B^{LF}(z_i) := \left\{ z_i' = (g_i', c_i', h_i') \in X \;\middle|\; c_i' + \frac{h_i'}{R} \leq c_i + \frac{h_i}{R} \right\}.$$

   2 As mentioned above, $g_i$, $c_i$ and $h_i$ will be related to each other when we model taxation.
Thus, even if $g_i$ does not appear in the inequality defining $B^{LF}$, $c_i$ and $h_i$ will depend on $g_i$
when we model taxation.

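[Figure 1: The c-equivalent utility. Horizontal axis: $c_i$; vertical axis: $h_i$.
The figure shows the indifference curve of $u_i$ through $z_i$ and the
laissez-faire budget set $B^{LF}((0, c, 0))$, whose frontier meets the
horizontal axis at $c$.]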



    The well-being index we are interested in, which we denote by $u^c$ and call
c-equivalent utility, can be defined as follows.
Definition 1 (c-equivalent utility).
For all $i \in [0, 1]$, $z_i = (g_i, c_i, h_i) \in X$ and $u_i \in U$,

$$u^c(z_i, u_i) = c \;\Leftrightarrow\; u_i(c_i, h_i) = u_i\left( \arg\max_{u_i} B^{LF}(g_i, c, 0) \right).$$

    The SWF $\mathbf{R}^{c\text{-lex}}$ compares two allocations by applying the
leximin aggregator to the lists of c-equivalent utilities associated with the
allocations.
SOF 1 ($\mathbf{R}^{c\text{-lex}}$). For all $u \in \mathcal{U}$ and any two
allocations $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g_i', c_i', h_i')_{i \in [0,1]} \in S$,

$$z \, \mathbf{R}^{c\text{-lex}}(u) \, z' \;\Leftrightarrow\; (u^c(z_i, u_i))_{i \in [0,1]} \geq_{lex} (u^c(z_i', u_i))_{i \in [0,1]}.$$
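
For a finite list of well-being indices (again an assumption, since the paper works with a continuum), the leximin comparison amounts to sorting each list in increasing order and comparing the sorted lists lexicographically. A minimal sketch, with a hypothetical function name:

```python
def leximin_weakly_better(x, y):
    """Return True if the list of indices x is at least as good as y
    under leximin: compare the lowest values first, then the second
    lowest, and so on. Both lists must have the same length.
    """
    if len(x) != len(y):
        raise ValueError("allocations must cover the same population")
    # Python compares lists lexicographically, element by element.
    return sorted(x) >= sorted(y)


# Both allocations give the worst-off agent an index of 1, so the
# comparison is decided by the second-lowest index (3 versus 2).
x = [3.0, 1.0, 4.0]
y = [2.0, 1.0, 5.0]
print(leximin_weakly_better(x, y))  # True
print(leximin_weakly_better(y, x))  # False
```

The example shows the infinite inequality aversion of the criterion: the larger top index in `y` (5 versus 4) plays no role once a lower-ranked position already differs.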
    This way of measuring well-being has two key properties. First, it does not
depend on gi , that is the quantity of money one agent received as inheritance
does not matter per se. The only thing that matters is the quantity of money
that this agent and her child consume. Second, how precisely an agent allocates
their wealth between own consumption and bequest does not matter, provided
this agent allocates it freely. As a result, more altruistic or more selﬁsh agents
have the same c-equivalent utility when they allocate the same quantity of money
in the absence of taxation.
    The combination of these two properties is the basis on which the axiomatiza-
tion of this SWF is grounded. Indeed, it satisﬁes the following three important
axioms. The ﬁrst one is the classical Pareto axiom. It requires weak social
preference when all individuals weakly prefer one allocation over another. In
addition, it requires strict social preference when one set (of positive measure)
of individuals strictly prefer the former allocation.


Axiom 1 (Pareto).
For every economy $u \in \mathcal{U}$ and steady-state allocations
$z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g_i', c_i', h_i')_{i \in [0,1]} \in S$,
if for all $i \in [0, 1]$

$$u_i(c_i, h_i) \geq u_i(c_i', h_i')$$

then $z \, \mathbf{R}(u) \, z'$, and if, in addition, there exists a subset of
individuals $J \in M[0, 1]$ such that $\mu(J) > 0$ and for all $j \in J$

$$u_j(c_j, h_j) > u_j(c_j', h_j')$$

then $z \, \mathbf{P}(u) \, z'$.
    The fact that Rc−lex satisﬁes Pareto comes from c-equivalent utility being
identical at all points of the indiﬀerence curve. Actually, c-equivalent utility is
a recalibration of the utility function.
    The second axiom, compensation for children’s lack of luck, in short Com-
pensation, encapsulates the idea that individuals should not be held responsible
for the lack of altruism of their parents, that is they should be compensated
for receiving low inheritance. Formally, it requires that if two individuals with
identical preferences consume bundles such that one dominates the other (that is,
one individual has both a larger consumption and a larger inheritance received by
her child), then a transfer from the richer to the poorer of these individuals is a
strict social improvement. We add the restriction that individuals have identical
preferences in order to avoid a classical impossibility with Pareto.
Axiom 2 (Compensation).
For every economy $u \in \mathcal{U}$, steady-state allocations
$z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g_i', c_i', h_i')_{i \in [0,1]} \in S$,
subsets of individuals $J, K \in M[0, 1]$ such that $\mu(J) = \mu(K) > 0$, and
$\delta \in (0, \frac{1}{2}]$, if for all $j, q \in J$ and $k, \ell \in K$,

   • $u_j = u_q = u_k = u_\ell$, $c_j = c_q$, $c_k = c_\ell$, $h_j = h_q$, $h_k = h_\ell$,
   • $c_j + \delta (c_k - c_j) = c_j' = c_q' \leq c_\ell' = c_k' = c_k - \delta (c_k - c_j)$,
   • $h_j + \delta (h_k - h_j) = h_j' = h_q' \leq h_\ell' = h_k' = h_k - \delta (h_k - h_j)$,

and $z_i = z_i'$ for all $i \notin J \cup K$, then $z' \, \mathbf{P}(u) \, z$.

   The fact that Rc−lex satisﬁes Compensation comes from c-equivalent utility
being independent of the inheritance received g . As a consequence, equalizing
consumption and bequest among agents with the same preferences is a way to
make individual well-being independent of how much one inherited from their
parents.
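
The transfer described in the Compensation axiom can be sketched for two representative agents with identical preferences. The function below is a hypothetical illustration: it moves a share $\delta \in (0, 1/2]$ of each gap (in consumption and in the heir's inheritance) from the richer to the poorer agent. The bound $\delta \leq 1/2$ guarantees that the transfer does not reverse the ranking of the two agents.

```python
def compensation_transfer(poor, rich, delta):
    """Apply a Compensation-type transfer between two bundles (c, h).

    poor, rich: tuples (c, h), with rich weakly larger in both goods.
    delta: share of each gap that is transferred, in (0, 1/2].
    Returns the post-transfer bundles (new_poor, new_rich).
    """
    if not 0 < delta <= 0.5:
        raise ValueError("delta must lie in (0, 1/2]")
    (cj, hj), (ck, hk) = poor, rich
    if cj > ck or hj > hk:
        raise ValueError("the rich bundle must dominate the poor one")
    new_poor = (cj + delta * (ck - cj), hj + delta * (hk - hj))
    new_rich = (ck - delta * (ck - cj), hk - delta * (hk - hj))
    return new_poor, new_rich


# Hypothetical bundles: the poor agent has (c, h) = (2, 0), the rich (6, 4).
print(compensation_transfer((2.0, 0.0), (6.0, 4.0), 0.25))
# With delta = 1/2 the two bundles are fully equalized.
print(compensation_transfer((2.0, 0.0), (6.0, 4.0), 0.5))
```

The axiom states that the post-transfer allocation is strictly socially preferred whenever such a transfer involves sets of agents of equal positive measure and leaves everyone else unaffected.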
   The third axiom, responsibility for parents’ choices, in short Responsibility,
encapsulates the idea that individuals should be considered responsible for their
preferences, that is they should be free to allocate their wealth the way they
wish. It requires that two individuals with the same inheritance from their
parents should ideally be free to choose their preferred bundle in the same
budget set of slope −R, the laissez-faire slope. Rather than requiring that they
should choose in the same budget, the axiom requires that budget inequality
between two such agents should be reduced.



Axiom 3 (Responsibility).
For every economy $u \in \mathcal{U}$, steady-state allocations
$z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g_i', c_i', h_i')_{i \in [0,1]} \in S$,
subsets of individuals $J, K \in M[0, 1]$ such that $\mu(J) = \mu(K) > 0$, if there
exists $\delta > 0$ such that for all $j, q \in J$ and $k, \ell \in K$,
   • $z_i \in \max|_{u_i} B^{LF}(z_i)$ and $z_i' \in \max|_{u_i} B^{LF}(z_i')$, $\forall\, i \in \{j, q, k, \ell\}$,
   • $y_j + \delta = y_q + \delta = y_j' = y_q' < y_k' = y_\ell' = y_k - \delta = y_\ell - \delta$,
where

$$y_i = c_i + \frac{h_i}{R}, \quad y_i' = c_i' + \frac{h_i'}{R}, \quad \forall\, i \in \{j, q, k, \ell\},$$

and $z_i = z_i'$ for all $i \notin J \cup K$, then $z' \, \mathbf{P}(u) \, z$.
    The fact that Rc−lex satisﬁes Responsibility comes from c-equivalent utility
assigning the same well-being to two agents as soon as they both freely allocate
their wealth absent any taxation.
    In Appendix 6.3, we formally prove that Rc−lex is the only SWF that satisﬁes
these three axioms together with a consistency requirement of the SWF across
economies. A number of important remarks have to be made at this stage.
   1. Both Compensation and Responsibility are transfer axioms, that is, they
      are satisfied if a transfer of goods or of budget is implemented from the
      richer to the poorer agent; in other words, they display a positive degree
      of inequality aversion. The SWF we arrive at, though, exhibits infinite
      inequality aversion. The fact that the combination of Pareto, transfer and
      consistency axioms leads to a SWF exhibiting infinite inequality aversion is
      common in the literature and the underlying logic is now well understood.
      This is the main reason why we relegate the proof to the Appendix.
   2. There is something less common, though, in the axiomatic foundation
      of Rc−lex . Compensation and responsibility axioms, indeed, are typically
      incompatible with each other. It is therefore a surprise that they turn
      out to be compatible in this model. This comes from the fact that agents
      in this model are both parents and children (building on the novelty of
      Piketty and Saez (2013)) and each agent’s utility is inﬂuenced both when
      she receives inheritance, so that her wealth increases, and when she is
      prevented from bequeathing/incentivized to bequeath money to her child.
      It can be illustrated through the following simple example. Assume an
      altruistic parent plans to bequeath some amount of money to her child,
      whereas a selﬁsh parent with the same wealth does not wish to bequeath
      anything to her own child. As a result of the bequest, the child of the
      altruistic parent would turn out better-oﬀ than that of the selﬁsh parent.
      On the other hand, if the altruistic parent is prevented from bequeathing,
      she will end up worse-oﬀ than the selﬁsh parent. The solution to this
      paradox, assuming non-distortionary transfers can take place, would be
      to withdraw some wealth from both parents and to allocate it to the child
      of the selﬁsh parent in compensation for the lack of inheritance, taking
      into account that the two children themselves can be required to give a
      part of their wealth in case other agents are worse-off. That is, the ideal
      non-distortionary allocation would be to equalize wealth of all agents of all
      generations, independently of whether this wealth comes from bequest or
      redistribution. Needless to say, such an allocation is impossible to achieve
      through bequest taxation.


4     Optimal tax
In this section, we study the allocations that maximize our SWF among those
that can be implemented by a bequest tax function and a demogrant. Contrary
to what we did in the previous section, we now take account of the inﬂuence
of the tax scheme on the transmission of bequests, as the behavior of members
of one generation inﬂuences the well-being of members of the next generation.
To take account of these long-run eﬀects of the tax, we restrict our attention to
tax functions and demogrants that do not depend on time and we look at the
corresponding long-run allocations, that is the allocations obtained when the
distribution of bequests is stabilized across generations.
    More precisely, we assume that at time $t$, each individual $i_t$ (from dynasty
$i$ living in generation $t$) receives inheritance $g_{it} \geq 0$ (which is a
function of the bequest left by individual $i_{t-1}$) and demogrant $D$, so that
their total resources are $w + D + g_{it}$. Individual $i_t$ chooses consumption
$c_{it} \geq 0$ and bequest $b_{it+1} \geq 0$ under the budget constraint

$$c_{it} + b_{it+1} = w + D + g_{it}.$$

Bequests are taxed according to tax function $\tau$, so that amount
$b_{it+1} - \tau(b_{it+1})$ is transferred to individual $i_{t+1}$, who receives

$$h_{it+1} = R \, (b_{it+1} - \tau(b_{it+1})),$$

where $R$ is the interest rate. We assume that $\tau(0) = 0$, because any other value
would amount to transferring the same (negative or positive) amount to all,
which is exactly what $D$ achieves.
    The same process takes place at t + 1, with inheritance git+1 = hit+1 ≥ 0.
Starting conditions at time t + 1 may, therefore, diﬀer from those at time t.
Note that, in this model, it is the money collected at time t through τ that is
used to fund demogrant D. This captures the fact that, had we considered a
model in which individuals live for many periods, with a fraction of them born
and dead at each period, what an agent gets out of the redistribution system at
each period of her life, D, is funded by the taxes on bequests of the individuals
that lived (and died) during this individual’s own life. This modeling is par-
ticularly appropriate to our objective to study the trade-oﬀ between modifying
the budgets of individuals through a demogrant, funded by taxes on bequests,
or subsidies to bequests, at the price of a lower or even negative demogrant.
    For a given tax-demogrant scheme $(\tau, D)$, an equilibrium allocation at time $t$
is an allocation $z_t = (z_{it})_{i\in[0,1]} = (g_{it}, c_{it}, h_{it+1})_{i\in[0,1]} \in Z$ for which all $i \in [0, 1]$
choose in the budget set defined by this scheme and their inheritance $g_{it}$, i.e.

$$B^{\tau}(w + D + g_{it}, 0) := \left\{ (c_{it}, h_{it+1}) \in \mathbb{R}^2_+ \,\middle|\, h_{it+1} \leq R\,\big(w + D + g_{it} - c_{it} - \tau(w + D + g_{it} - c_{it})\big) \right\},$$

implying for all $i \in [0, 1]$ that

$$h_{it+1} = R\,\big(w + D + g_{it} - c_{it} - \tau(w + D + g_{it} - c_{it})\big). \qquad (1)$$


    A long-run equilibrium allocation for a tax scheme $(\tau, D)$ is a steady-state
equilibrium allocation $z = (g_i, c_i, h_i)_{i\in[0,1]} \in S$ to which this sequence of
equilibrium allocations at time $t$ may converge. That is, at a long-run equilibrium
allocation, Eq. (1) holds and the profile of inheritances received, $(\hat{g}_i)_{i\in[0,1]}$, is
equal to the profile of inheritances left, $(\hat{h}_i)_{i\in[0,1]}$.
    We need some further assumptions to guarantee that long-run equilibrium
allocations exist and that we are able to apply our SWF to them.
    As in Piketty and Saez (2013), we assume that the stochastic transmission
of preferences across generations is such that the distribution of preferences
remains constant through time. In the terms of the previous section, that means
that economy $u \in U$ is constant through time, up to some (measure-preserving)
permutation of $i \in [0, 1]$.
    We also assume that the economy converges over time to a unique long-run
equilibrium independent of the initial distribution of inheritances $(g_{i0})_{i\in[0,1]}$.
Piketty and Saez (2012) show that this assumption is met in their framework
under reasonable conditions. In particular, the average taste for bequest cannot
be too strong and the stochastic transmission of preferences across generations
must satisfy an ergodicity property.3 Importantly, this property does not imply
that all members of one dynasty have the same preferences. Some altruistic
parents have selfish children and the converse is true as well.
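    The convergence assumption can be illustrated with a small numerical sketch. The code below is not part of the paper's analysis. It assumes a single dynasty whose members all share the same Cobb-Douglas preferences $(1-\alpha)\log c + \alpha \log h$ (whereas the model lets preferences change stochastically across generations), and all function names are hypothetical. It iterates the individual budget constraint and the inheritance rule $h = R\,(b - \tau(b))$, so that paths starting from different initial inheritances can be compared.

```python
import math

def net_inheritance(bequest, tau, R):
    """Inheritance received by the heir: h = R * (b - tau(b))."""
    return R * (bequest - tau(bequest))

def chosen_bequest(resources, alpha, tau, R, steps=1000):
    """Grid search for the bequest maximizing an ASSUMED Cobb-Douglas utility
    (1 - alpha) * log(c) + alpha * log(h). alpha = 0 is the self-centered case."""
    if alpha == 0.0:
        return 0.0
    best_b, best_u = 0.0, -math.inf
    for k in range(1, steps):
        b = resources * k / steps
        c = resources - b
        h = net_inheritance(b, tau, R)
        if c <= 0 or h <= 0:
            continue  # log utility is undefined at these bundles
        u = (1 - alpha) * math.log(c) + alpha * math.log(h)
        if u > best_u:
            best_b, best_u = b, u
    return best_b

def long_run_inheritance(w, D, alpha, tau, R, g0=0.0, periods=100):
    """Iterate g_{t+1} = R * (b_{t+1} - tau(b_{t+1})) starting from inheritance g0."""
    g = g0
    for _ in range(periods):
        b = chosen_bequest(w + D + g, alpha, tau, R)
        g = net_inheritance(b, tau, R)
    return g
```

    For instance, with the illustrative values w = 1, D = 0.1, alpha = 0.3, R = 1.5 and a linear tax tau(b) = 0.2 b, calling long_run_inheritance with g0 = 0 and with g0 = 5 should return nearly the same number. Under these assumptions the recursion is g' = 0.36 (1.1 + g), whose fixed point is 0.61875.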
    Observe that (τ, D) may yield a long-run equilibrium allocation in which
not enough money is collected through tax τ to fund demogrant D. We need to
further restrict our attention to tax schemes that meet the government budget
constraint
$$D \leq \int_i \tau(g_i + w + D - c_i)\, di. \qquad (2)$$
    A demogrant D is sustainable for the tax τ if the long-run equilibrium allo-
cation associated to (τ, D) satisﬁes Eq. (2).4 When this is the case, we say that
the tax-demogrant scheme (τ, D) is sustainable. Observe that a given tax may
admit several sustainable demogrants. For instance, both a zero demogrant and
a negative demogrant are sustainable under a linear tax with rate zero.
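    A minimal sketch of the sustainability check in Eq. (2), assuming that a finite sample of equal-weight individuals stands in for the continuum $[0, 1]$ and that the long-run equilibrium bequests have already been computed and are held fixed (in the model they depend on the scheme itself). The function names and the bequest values are hypothetical. The example reproduces the observation that a zero-rate tax sustains a zero or a negative demogrant, but not a positive one.

```python
def average_revenue(tau, bequests):
    """Average tax collected per individual, approximating the integral in Eq. (2)."""
    return sum(tau(b) for b in bequests) / len(bequests)

def is_sustainable(D, tau, bequests):
    """Government budget constraint: D must not exceed average tax revenue."""
    return D <= average_revenue(tau, bequests)

# Made-up long-run bequests; under a zero tax no revenue is collected.
zero_tax = lambda b: 0.0
bequests = [0.0, 0.0, 0.4, 1.2]

print(is_sustainable(0.0, zero_tax, bequests))   # zero demogrant
print(is_sustainable(-0.1, zero_tax, bequests))  # negative demogrant
print(is_sustainable(0.1, zero_tax, bequests))   # positive demogrant
```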
    A sustainable tax-demogrant scheme $(\tau, D)$ is optimal if there is no other
sustainable tax scheme whose associated long-run equilibrium allocation is preferred
by SWF $R^{c\text{-lex}}$ to that associated to $(\tau, D)$. A tax $\tau$ is optimal in some
domain if there is no alternative tax $\tau'$ in that domain for which the long-run
equilibrium allocation associated to a sustainable scheme $(\tau', D')$ is preferred by
SWF $R^{c\text{-lex}}$ to the long-run equilibrium allocation associated to all sustainable
schemes $(\tau, D)$.
    An individual is among the worst-oﬀs if all other individuals (in the long-
run equilibrium generation) have a c-equivalent utility at least as large as this
individual.
    Laissez-Faire is a tax-demogrant scheme deﬁned by a zero tax and a zero
demogrant: τ LF (b) = 0 ∀b ≥ 0 and DLF = 0.
   3 See Piketty and Saez (2013), page 1854.
   4 This sustainability constraint allows us to link the government and individuals' budget
constraints in the following way. A sustainable $(\tau, D)$ is optimal only if
$D \leq \int_i \tau(g_i + w + D - c_i)\, di$ and $h_i = R\,(w + D + g_i - c_i - \tau(w + D + g_i - c_i))$. These two
equations give us the following sustainability constraint:
$D \leq \int_i \left(g_i + w + D - c_i - \frac{h_i}{R}\right) di \Leftrightarrow \int_i \left(c_i + \frac{h_i}{R}\right) di \leq \int_i (g_i + w)\, di$.



4.1     A positive average tax on bequests
Our ﬁrst result answers the following question: at the optimal tax, should in-
dividuals’ incomes be taxed so that bequests can be, on average, subsidized, or
should bequests be taxed, on average, so as to subsidize individuals’ incomes
(through a demogrant)? The answer is that bequests should be taxed: the
amount globally collected by an optimal inheritance tax cannot be negative. If
subsidies are provided for some bequest levels, they must be paid for by taxes
collected at other bequest levels. This answer is the opposite to that of Atkinson
and Stiglitz (1976), Kaplow (2001) and Farhi and Werning (2010), conﬁrming
the crucial importance of taking account of the inﬂuence of taxing bequests on
the inheritance distribution, and, therefore, the wealth of the parents. The op-
timal formula of Piketty and Saez (2013), on the other hand, is consistent with
taxing bequests on average, depending on the distribution of normative weights.
Our result shows that the distribution of normative weights that follows from
imposing the axioms we propose unambiguously leads to a positive average tax
rate on bequests.
    The proof goes by comparing the optimal tax scheme with Laissez-Faire.
Under Laissez-Faire, any individual i freely allocates her lifetime resources be-
tween consumption and bequest, implying that her c-equivalent utility is equal
to her lifetime resources (w + gi ). Thus, the worst-oﬀ individuals are those
with gi = 0, and they all have a well-being level equal to w. Now, under any
tax-demogrant scheme (τ, D), the c-equivalent utility of any self-centered indi-
vidual is also equal to her consumption (w + D + gi ). Provided that at least one
individual who inherits nothing is self-centered,5 an assumption that we impose
(see assumption A1 below), her c-equivalent utility is equal to w + D. If the
demogrant is negative, then her well-being is smaller than the well-being of the
worst-oﬀ under Laissez-Faire. The result follows from the fact that our SWF
ranks tax-demogrant schemes by comparing the long-run equilibrium well-being
of the worst-oﬀ. To sum up, redistribution cannot take place from the general
population towards those who leave some bequests, because such a redistribu-
tion hurts those who did not receive anything from their parents and do not plan
to leave anything to their children either, and these are among the worst-oﬀs.
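    The argument can be checked on a toy example. The sketch below assumes a finite population with made-up inheritance levels, and it uses only the feature of the SWF invoked in this step, namely that schemes are compared through the well-being of the worst-off. The function names are hypothetical.

```python
def worst_off_laissez_faire(w, inheritances):
    """Under Laissez-Faire the c-equivalent utility of individual i is w + g_i."""
    return min(w + g for g in inheritances)

def self_centered_zero_inheritor(w, D):
    """A self-centered individual with g = 0 consumes w + D and bequeaths nothing."""
    return w + D

w = 1.0
inheritances = [0.0, 0.0, 0.3, 2.5]  # hypothetical long-run Laissez-Faire inheritances
lf_floor = worst_off_laissez_faire(w, inheritances)

for D in (-0.2, 0.0, 0.1):
    # The worst-off under (tau, D) cannot be better off than this individual.
    upper_bound = self_centered_zero_inheritor(w, D)
    print(D, "dominated by Laissez-Faire:", upper_bound < lf_floor)
```

    Only the negative demogrant is flagged here: with D < 0 the self-centered zero-inheritor falls below the Laissez-Faire floor w, whatever the tax function is.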
    The property of Laissez-Faire that all individuals for whom $g_i = 0$ are among
the worst-offs and they all have the same c-equivalent utility is shared by all tax
schemes $(\tau^*, D^*)$ in which $\tau^*$ exempts the bequests left by those who inherit
nothing. This suggests that if $\tau^*$ maximizes the sustainable demogrant $D^*$
under the constraint that $\tau^*$ exempts the bequests left by those who inherit
nothing, $(\tau^*, D^*)$ is a strong candidate to be optimal.
    Indeed, our second result identifies a necessary condition on the optimal
$(\tau', D')$ to be different from $(\tau^*, D^*)$: it needs to be the case that $D' \geq D^*$. It is
an easy consequence of what we already said. If $(\tau', D')$ has $D' < D^*$, then the
c-equivalent utility of a self-centered individual who did not inherit anything is
lower at $(\tau', D')$, where it is equal to $w + D'$, than at $(\tau^*, D^*)$, where it is equal
to $w + D^*$ and where this individual is among the worst-offs.
    To deﬁne the largest bequest left by a zero-inheritor precisely, we impose the
assumption that there are zero-inheritors with the most altruistic preferences.
   5 Piketty and Saez (2013) document that about half the population in France and the US

receives negligible bequests in 2010, which suggests that some individuals in those countries
do not enjoy leaving a bequest.



The needed assumption for the following proposition is, therefore:
    Assumption A1: In any long-run equilibrium allocation, there are two disjoint
subsets $I^s, I^a \subset [0, 1]$ with $\mu(I^s) > 0$ and $\mu(I^a) > 0$ such that for all $s \in I^s$
and all $a \in I^a$ we have $g_a = g_s = 0$, $u_a = u^a$ and $u_s = u^s$.
    Our ﬁrst proposition summarizes the discussion above.
    Proposition 1 makes use of the following definition. Let $b^{LF}_a(w + D)$ denote
the optimal bequest left by individual $a \in I^a$ under a scheme $(\tau, D)$ that provides
an exemption strictly larger than $b^{LF}_a(w + D)$, i.e.

$$b^{LF}_a(w + D) = \arg\max_{\tilde{b}_a \geq 0} u_a(w + D - \tilde{b}_a, R\tilde{b}_a).$$

Proposition 1. (i) Under A1, a tax-demogrant scheme $(\tau, D)$ is optimal only
if $D \geq 0$. (ii) Under A1, a tax-demogrant scheme $(\tau^*, D^*)$ that provides an
exemption up to $b^{LF}_a(w + D^*)$ is optimal if there is no other sustainable
tax-demogrant scheme $(\tau', D')$ such that $D' \geq D^*$.
Proof. The proof is relegated to Appendix 6.1.
    The condition $D' \geq D^*$ is necessary for $(\tau', D')$ to be optimal, but it is of course
not sufficient. In the long-run equilibrium allocation associated to $(\tau^*, D^*)$, all
zero-inheritors are equally well-off, and the optimal tax scheme needs to make
all of them at least as well-off as at $(\tau^*, D^*)$.
    Note that the long-run equilibrium allocation associated with the optimal
tax scheme depends on the preference profile considered: preferences are
heterogeneous and the long-run equilibrium profile of inheritances received is
endogenous to the shape of the tax. Even without being able to characterize this
allocation exactly, we derive in the next section some constraints on the shape
of an optimal tax.

4.2     Tax exemption on low bequests, or limited subsidies
        and taxes
In the previous section, we proved that D, the demogrant, cannot be too low,
and, in particular, cannot be negative. The demogrant should be thought of
as the average amount transferred from those who leave a bequest for their
children to the general population. In this section, we prove that D cannot
be too large, either. The intuition for this result is the following one. Worst-
oﬀ individuals are to be found among the zero-inheritors. Among them, the
well-being of the self-centered individuals is entirely determined by D. It is not
the case for the other zero-inheritors. In particular, the most altruistic among
them, individuals $a$, receive $D$ but they pay $\tau(b_a)$, in which $b_a$ stands for their
bequest. Proposition 1 implies that $D \geq \tau(b_a)$: individuals $a$ cannot be strict
contributors to the tax system. It suggests that even if individuals $a$ end up with a lower
well-being than self-centered individuals, the difference in well-being should be
limited.
    This has the following implication for the optimal tax scheme. Let $b_a$ denote
the bequest of these individuals $a$ at $(\tau, D)$. Let $\beta$ be a positive bequest level
smaller than $b_a$ and let $\Delta$ be an amount of money smaller than $\beta$. Consider the
alternative tax scheme $(\tau', D')$ satisfying

$$\tau'(b) = \tau(b + \Delta) - \Delta, \quad \forall b \geq \beta - \Delta,$$
$$D' = D - \Delta,$$

which is illustrated in Figure 2. Facing $(\tau', D')$, individuals $a$ do not see any
difference between $(\tau', D')$ and $(\tau, D)$: the decrease in their lifetime income is
perfectly compensated by the decrease in the tax they pay on their bequest.
Zero-inheritor-self-centered individuals, on the contrary, are affected by the
change, as their well-being decreases by $\Delta$. If, moreover, $(\tau', D')$ leaves money
on the table, which is quite likely because $D$ decreases, this money can be
redistributed to the entire population, thereby making individuals $a$ better-off.
That illustrates that decreasing $D$ and decreasing the tax below some threshold
is a policy tool to increase the well-being of $a$ at the expense of $s$ (the
zero-inheritor-self-centered individuals).
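    The compensation claim can be verified numerically. The sketch below is only an illustration under stated assumptions: the tax function, the parameter values and the bequest level are made up, the function names are hypothetical, and the bequest is taken as given rather than derived from the preferences of individuals a. It checks that a zero-inheritor who bequeaths an amount above the threshold reaches the same bundle of consumption and inheritance left under both schemes, whereas a self-centered zero-inheritor loses exactly the shift.

```python
import math

def shifted_scheme(tau, D, delta, beta):
    """Build (tau', D') with tau'(b) = tau(b + delta) - delta for b >= beta - delta."""
    def tau_prime(b):
        if b < beta - delta:
            raise ValueError("tau' is only specified here for b >= beta - delta")
        return tau(b + delta) - delta
    return tau_prime, D - delta

def bundle(w, D, bequest, tau, R):
    """(consumption, inheritance left) of a zero-inheritor bequeathing `bequest`."""
    return (w + D - bequest, R * (bequest - tau(bequest)))

def same_bundle(x, y):
    return all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(x, y))

# Illustrative numbers: exemption up to 0.2, then a 30% marginal rate.
tau = lambda b: 0.3 * max(b - 0.2, 0.0)
w, D, R, delta, beta = 1.0, 0.15, 1.5, 0.05, 0.2
tau_prime, D_prime = shifted_scheme(tau, D, delta, beta)

b_old = 0.5                                     # a bequest above beta under (tau, D)
old = bundle(w, D, b_old, tau, R)
new = bundle(w, D_prime, b_old - delta, tau_prime, R)
print(same_bundle(old, new))                    # same bundle under both schemes
print((w + D) - (w + D_prime))                  # loss of a self-centered zero-inheritor
```

    The equality holds for any bequest above the threshold, because the lower demogrant reduces resources by the shift while the tax due on the correspondingly smaller bequest falls by the same amount.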


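    [Figure 2 omitted. Axes: consumption $c_i$ (horizontal) and inheritance left $h_i$ (vertical). The plot shows the budget sets $B^{LF}((0, w + D, 0))$, $B^{\tau}(w + D, 0)$ and $B^{\tau'}(w + D', 0)$, the bundles $z_a$ and $z_s$, the indifference curve $u_a$, and the quantities $\Delta$, $\beta$, $R\beta$, $b_a$ and $R\tau(b_a)$.]

    Figure 2: Tax-demogrant schemes $(\tau, D)$ and $(\tau', D')$ provide equal
    c-equivalent utility to the worst-off individuals $a$ even if the latter has a
    smaller demogrant.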


    This illustration is too simple, however, as $(\tau', D')$ is typically non-sustainable.
To prove Proposition 2, we do identify a truncation of (τ, D) that is sustainable
and that increases social welfare as soon as D, or, equivalently, τ (ba ), is too
large.6 As a consequence, we show in the next proposition that
   1. tax functions with positive taxes on small bequest amounts are not opti-
      mal,
  6 When   deﬁning this truncation below, we give a precise value to β and to ∆.



  2. monotonically increasing tax functions are not optimal unless they provide
     an exemption up to the amount of bequest that would be chosen under
     Laissez-Faire by individuals a (who inherit nothing and have the most-
     altruistic preference),
  3. a positive tax on the amount of bequest left by a is not excluded though,
     at least when subsidies are provided on smaller bequest amounts, and
  4. even when the optimal tax function subsidizes smaller bequest amounts,
     the tax on the bequest amount left by a must be limited (see Eq. (3) in
     the proposition below).
    To prove these claims, we restrict our attention to tax functions τ for which
−τ is single-peaked. This domain contains all tax functions that are policy rel-
evant. In particular, this domain is the union of the two most relevant subdo-
mains. The ﬁrst subdomain contains all (weakly) monotonically increasing tax
functions τ (bequests are taxed at an increasing non-negative rate), in which 0 is
a peak for −τ (there may be an entire interval of peaks, in which case τ exempts
bequests on this interval). The second subdomain contains all tax functions τ
that are ﬁrst monotonically (weakly) decreasing (small bequests are subsidized)
and then monotonically (weakly) increasing. In this subdomain the peaks are all
positive. We refer to the latter subdomain as that of positive peak tax functions,
and to the former as that of (weakly) monotonically increasing tax functions.
Observe that, by Proposition 1, any monotonically decreasing tax function is
dominated because such tax cannot sustain a non-negative demogrant. We do
not comment on these functions anymore.
    In this domain, the worst-oﬀ individuals are either individuals a or s. As
illustrated in Figure 3, the reason is that the consumption of any zero-inheritor
is at least as large as the consumption of a, but not larger than the consumption
of s. As a result, their c-equivalent utility cannot be smaller than both the c-
equivalent utilities of s and a. Since either s or a are among the worst-oﬀs,
we can restrict the normative analysis to these two individuals. When a tax
function τ implies a tax on the amount of bequest left by a, this individual is
among the worst-oﬀs whereas s is not. Welfare would be improved if it were
possible to increase the well-being of a while keeping the well-being of s above
the well-being of a. The diﬃculty here is to make sure that such improvement
materializes in the new long-run equilibrium allocation, which depends on the
maximal demogrant that can be sustained.
    Two additional mild assumptions are required for these results. First, the
preferences of individuals who are not self-centered must be strictly increasing
in the inheritance received by their child whenever they consume at least as
much as their labor income $w$. This assumption will guarantee that
there exists a sufficiently large subsidy rate that induces these individuals to
leave at least a threshold amount to their children.
    Assumption A2: For all $i \in [0, 1]$ with $u_i \neq u^s$ we have that $u_i(c_i, h_i)$ is
strictly monotonic in $h_i$ when $c_i \geq w$.
    Second, the long-run equilibrium amount collected by a tax-demogrant scheme
is continuous in the demogrant.
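    Assumption A3: For any tax $\tau$, the amount of tax collected, i.e.

$$\int_i \tau(b_i)\, di,$$

where $b_i$ is the long-run equilibrium bequest left under $(\tau, D)$, is continuous in
$D$.

    [Figure 3 omitted. Two panels, (a) and (b), with axes consumption $c_i$ (horizontal) and inheritance left $h_i$ (vertical). Each panel shows the budget set $B^{\tau}$, the bundles $z_a$ and $z_s$ and the indifference curves $u_a$ and $u_s$. Panel (a) shows the budget set $B^{LF}((0, u^c_s, 0))$ with $u^c_s = w + D$; panel (b) shows the budget set $B^{LF}((0, u^c_a, 0))$ with $u^c_a$ and $w + D$ marked on the horizontal axis.]

    Figure 3: (a) Individuals $s$ with $g_s = 0$ and $u_s = u^s$ are among the
    worst-offs, where $u^c_s = u^c(z_s, u_s)$. (b) Individuals $a$ with $g_a = 0$ and
    $u_a = u^a$ are among the worst-offs, where $u^c_a = u^c(z_a, u_a)$.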
   Assumption A3 implies that, if $(\tau, D)$ is sustainable and leaves money on
the table, then for some $D' > D$ the tax-demogrant scheme $(\tau, D')$ is also
sustainable. Observe that assumption A3 is not necessary in the proposition
below for the constraint derived on monotonically increasing tax functions.
   We introduce the following notation. Let $b \geq 0$ be the minimal bequest
amount above which the tax $\tau$ provides zero subsidy, i.e.

$$b = \min \{\tilde{x} \in \mathbb{R}_+ \text{ such that } \tau(x) \geq 0 \text{ for all } x \geq \tilde{x}\}.$$

Obviously, amount $b$ is endogenous to the tax considered. For any positive peak
tax, we have $b > 0$ and we construct the alternative tax scheme $(\tau', D')$ using
$\beta = b$. For any monotonically increasing tax, we have $b = 0$ and we construct
the alternative tax scheme $(\tau', D')$ using some $\beta > b$.
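   As an illustration of this notation, the following sketch approximates the no-subsidy threshold on a grid. It is not the paper's procedure: the grid, the upper bound b_max, the function name and the two example tax functions are assumptions made for the example only.

```python
def no_subsidy_threshold(tau, b_max, steps=1000):
    """Smallest grid point x in [0, b_max] such that tau(y) >= 0 for every
    grid point y >= x. Returns None if tau(b_max) is itself negative."""
    grid = [b_max * k / steps for k in range(steps + 1)]
    threshold = None
    for x in reversed(grid):      # scan downwards from the largest bequest
        if tau(x) < 0:
            break                 # a subsidy is paid at x, so stop just above it
        threshold = x
    return threshold

# Hypothetical positive peak tax: 20% subsidy up to 0.5, then a 40% marginal tax.
def positive_peak_tax(b):
    return -0.2 * b if b <= 0.5 else -0.1 + 0.4 * (b - 0.5)

# Hypothetical monotonically increasing tax: exemption up to 0.2, then 30%.
def increasing_tax(b):
    return 0.3 * max(b - 0.2, 0.0)

print(no_subsidy_threshold(positive_peak_tax, b_max=2.0))
print(no_subsidy_threshold(increasing_tax, b_max=2.0))
```

   In the first example the tax turns non-negative where the accumulated subsidy of 0.1 is exhausted, at a bequest of 0.75, so the threshold is positive. In the second example no bequest is subsidized and the threshold is zero.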
   Proposition 2, our main result, provides us with the formal statements from
which the four claims above are deduced.
Proposition 2. Consider any tax $\tau$ for which $-\tau$ is single-peaked. Let $b$ be
the minimal bequest amount above which no subsidies are provided under $\tau$. Let
$D_{\max}$ be the maximal sustainable demogrant under $\tau$. Let $a$ be an individual
who inherits nothing and holds the most altruistic preference. Let $b_a$ be the
equilibrium bequest left by $a$ under $(\tau, D_{\max})$. Let $b^{LF}_a(w)$ be the equilibrium
bequest left by $a$ under Laissez-Faire.
    (i) Under A1 and A2, if $\tau$ is monotonically increasing, then $\tau$ is optimal
only if $\tau$ provides an exemption up to $b^{LF}_a(w)$.
    (ii) Under A1, A2 and A3, $\tau$ is optimal only if

$$\tau(b_a) \leq b. \qquad (3)$$

Proof. The proof is relegated to Appendix 6.2.


    An important feature of Proposition 2 is that the shape of monotonically
increasing taxes is completely characterized up to the minimal amount exempted.
Importantly, this amount is exogenous to the optimal tax: it only
depends on the preference $u^a$, the interest rate $R$ and the wage rate $w$. This
implies that this shape is valid regardless of the exact preference profile defining the
economy. This contrasts with the characteristics of optimal taxes derived in the
literature (Piketty and Saez, 2013), which typically depend on statistics
endogenous to the optimal tax. The bequest amount $b_a$ in claim (ii)
of Proposition 2, on the other hand, is endogenous to the tax.
    To conclude this section, we note that our two propositions have an inter-
esting corollary, namely that linear taxes are never optimal.
Corollary 1. Under A1 and A2, no linear tax different from Laissez-Faire is
optimal.

Proof. Linear taxes with negative rates cannot sustain non-negative demogrants.
By Proposition 1, any tax scheme based on a negative demogrant is not optimal.
Therefore, linear taxes with negative rates are not optimal. Linear taxes with
positive rates are monotonically increasing and do not exempt bequests up to
$b^{LF}_a(w)$. By Proposition 2, these tax functions are not optimal.



5    Conclusion
The model that we study in this paper is designed to focus on the trade-oﬀ
between subsidizing bequests or transferring a demogrant to all. A number of
simplifying assumptions have been needed, to which we now come back.
    We assumed away the issue of taxing labor incomes. We assumed that all
individuals have the same labor time and the same lifetime income, so that
fairness does not require redistributing labor income. Our result that bequests
should, on average, be taxed so as to transfer a demogrant to all does not,
therefore, come from the need to alleviate income inequalities, but only from
the need to compensate children of selﬁsh parents while preserving the parents’
freedom to allocate their lifetime income the way they wish. As a consequence,
our conclusions are compatible with heterogeneity of wages and labor times and
the existence of a labor income tax system maximizing social welfare. This
claim, however, calls for two qualiﬁcations.
    First, our formal analysis can be replicated in a more general model only
under the assumption that individuals’ lifetime incomes are not inﬂuenced by
the design of our bequest taxation system. In case this assumption is not valid,
we should study the income redistribution and bequest taxation systems simul-
taneously. This task does not look feasible. Intuitively, though, the result of
such an exercise is likely to remain that bequests are, on average, taxed, so as to
compensate the children of self-centered parents while maintaining a suﬃciently
high utility level to self-centered individuals who did not receive anything from
their own parents. Identifying who are the worst-oﬀ individuals and dealing
with sustainability issues, however, would become much harder.
    The second qualiﬁcation has to do with the identiﬁcation of the worst-oﬀ in-
dividuals in the case in which lifetime incomes are not inﬂuenced by the bequest
taxation system. Given that the labor income taxation system aims at redis-
tributing from higher wage individuals to lower wage individuals, it is extremely


likely that the worst-oﬀ have to be found among the minimal-wage individuals.
Consequently, our assumption A1 has to be strengthened into the existence, in
any long-run equilibrium allocation, of individuals who did not inherit anything
from their parents, who have the minimum wage and who have either the most
altruistic or self-centered preferences. As a result, Proposition 2, part (i), for
instance, would become that bequests should be exempted from taxes up to the
amount left by the most altruistic individuals who did not receive any bequest
from their parents and worked all their life at the minimum wage.
    Our main results do not give us a formula that can be calibrated, yet
they can be used to qualitatively assess current tax systems. According to
a recent report (see OCDE (2021)), 12 of the 36 OECD countries do not tax
bequests. Our Proposition 1, part (i), implies that this cannot be optimal given
our social welfare function. All 24 countries that do tax bequests to children
have a system consistent with the optimal tax system of Proposition 1, part (ii),
and Proposition 2, part (i): an exemption for small bequests and a positive tax
on larger ones. The range of exemptions varies considerably across countries,
from $17,133 in Belgium to $11,580,000 in the United States (numbers in 2020
USD). While the former amount is likely to be smaller than the bequest left by
the most altruistic parents having not received anything from their own parents
and having worked at the minimum wage, the latter amount is clearly above this
threshold. The money collected through bequest taxes is below 2% of the total
ﬁscal revenues in all countries, and even below 1% in most countries, suggesting
that the corresponding demogrant is not maximized. More research is needed,
however, to compute the optimal tax systems in these countries.
    In the model, we also assumed that the number of children is identical across
households. Allowing heterogeneity in the number of children would not
change the fact that the worst-off individuals have to be found among those
who did not inherit anything from their parents. A new question would emerge,
however, regarding the amount of exempted bequest. It would still be defined
with reference to the amount left by the most altruistic individuals who did not
receive any bequest from their parents, but one would have to choose between
the bequests of parents with the largest number of children and those of parents with only one child.
The former choice is appropriate if parents are modeled as caring about the per
capita bequest received by their children. The latter choice is appropriate if
parents are modeled as caring about the total bequest left.
    The interest rate, R, is exogenous in our model. This is typical of a small
open economy. If it is endogenous, but further assumptions make it depend
only on the distribution of preferences in the economy, then our analysis carries
over with R being replaced with the endogenous rate. If, on the contrary, the
interest rate may vary across time, then our results change. The intuition is
that our optimal tax scheme needs to be amended so as to redistribute further
from the lucky ones who face higher interest rates towards those who face lower
interest rates. Moreover, the leximin nature of our SWF implies that the worst-offs
belong to those who did not inherit anything from their parents. Therefore,
whether an individual is lucky or not only depends on the future interest rates.
So, an optimal tax system should redistribute from those who can save for their
children at a high rate towards those who save at a low rate. How precisely this
should be done requires additional research.
    Other assumptions would be much more difficult to relax. They would require
redefining the social welfare function or the policy tools. This would be
the case, for instance, if individuals are interested in the entire lifetime of their
children and not only in how much they bequeath, in which case an increase of
the demogrant benefits the altruistic parents more than the self-centered ones;
if they have unequal life expectancy, in which case the social planner may wish
to subsidize the bequest of short-lived individuals; if fertility choices are
constrained, in which case the social planner may wish to favor those who wanted to
have children but could not and, therefore, do not leave any bequest; if children
can inherit from different adults, raising the question of whether the tax should
be donor-based or recipient-based; if bequests are only partially observable; etc.


6     Appendix
6.1     Proof of Proposition 1
First, we prove claim (i). We show that any (τ, D) with D < 0 is dominated by
Laissez-Faire.
    We start by showing for the long-run equilibrium allocation $z^{LF} \in S$ associated
to Laissez-Faire that $u^c(z^{LF}_i, u_i) \geq w$ for all $i \in [0, 1]$. Under Laissez-Faire, any
$i \in [0, 1]$ chooses in the budget set

$$B^{LF}((0, w + g^{LF}_i, 0)),$$

implying that $u^c(z^{LF}_i, u_i) = w + g^{LF}_i \geq w$.
    We then show for the long-run equilibrium allocation $z \in S$ associated to
$(\tau, D)$ that some subset $J \subset [0, 1]$ with $\mu(J) > 0$ is such that $u^c(z_j, u_j) = w + D$
for all $j \in J$. By assumption A1, there is a subset $J \subset [0, 1]$ with $\mu(J) > 0$ such
that $g_j = 0$ and $u_j = u^s$ for all $j \in J$. Since these individuals are self-centered,
we have that $z_j = (0, w + D, 0)$ and $u^c(z_j, u_j) = w + D$.
    As $D < 0$, this shows that $u^c(z_j, u_j) < w$ for all $j \in J$, showing that
Laissez-Faire is preferred to $(\tau, D)$ by $R^{c\text{-lex}}$.

    Second, we prove claim (ii). We show that any (τ , D ) with D < D∗ is
dominated by (τ ∗ , D∗ ).
    We start by showing for the long-run equilibrium allocation $z^* \in S$ associated
to $(\tau^*, D^*)$ that $u^c(z_i^*, u_i) \ge w + D^*$ for all $i \in [0, 1]$. In equilibrium, any
$i \in [0, 1]$ chooses in the budget set $B^{\tau^*}(w + D^* + g_i^*, 0)$. For all $i, j \in [0, 1]$
with $u_i = u_j$ and $g_j^* = 0$ we must have $u_i(z_i^*) \ge u_j(z_j^*)$ because
$B^{\tau^*}(w + D^*, 0) \subseteq B^{\tau^*}(w + D^* + g_i^*, 0)$. As $u_i = u_j$, this implies that
$u^c(z_i^*, u_i) \ge u^c(z_j^*, u_j)$. It remains to show that $u^c(z_j^*, u_j) = w + D^*$. As $\tau^*$
provides an exemption up to $b_a^{LF}(w + D^*)$, any $j \in [0, 1]$ with $g_j^* = 0$ chooses the same
bundle $z_j^*$ in her budget set $B^{\tau^*}(w + D^*, 0)$ as she would choose in the Laissez-Faire
budget set

\[ B^{LF}((0, w + D^*, 0)), \]

implying that $u^c(z_j^*, u_j) = w + D^*$.
   We then show for the long-run equilibrium allocation $z' \in S$ associated to
$(\tau', D')$ that some subset $J \subset [0, 1]$ with $\mu(J) > 0$ is such that
$u^c(z'_j, u_j) = w + D'$ for all $j \in J$. By assumption A1, there is a subset
$J \subset [0, 1]$ with $\mu(J) > 0$ such that $g_j = 0$ and $u_j = u^s$ for all $j \in J$. Since
these individuals are self-centered, we have under the long-run equilibrium
allocation $z' \in S$ associated to $(\tau', D')$ that $z'_j = (0, w + D', 0)$ and so
$u^c(z'_j, u_j) = w + D'$.
   As $D' < D^*$, this shows that $u^c(z'_j, u_j) < w + D^*$ for all $j \in J$, showing
that $(\tau^*, D^*)$ is preferred to $(\tau', D')$ by $R^{c\text{-lex}}$.
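Both claims of the proof compare schemes through their worst-off groups. A minimal sketch of that comparison for a finite population follows. It assumes, as the name suggests and as the proofs in this appendix use it, that $R^{c\text{-lex}}$ ranks allocations by the leximin of c-equivalent utilities. The function name, the finite-population discretization and all numbers are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def leximin_prefers(first: Sequence[float], second: Sequence[float]) -> bool:
    """Return True if `first` is strictly leximin-preferred to `second`.

    Both arguments list the c-equivalent utilities of the same finite
    population (a hypothetical discretization of the continuum [0, 1]).
    """
    if len(first) != len(second):
        raise ValueError("both allocations must cover the same population")
    # Sorting puts the worst-off first; Python compares lists
    # lexicographically, which matches the leximin order.
    return sorted(first) > sorted(second)


w = 1.0  # hypothetical wage
# Under Laissez-Faire every individual has c-equivalent utility of at least w.
laissez_faire = [w, w + 0.3, w + 0.9]
# With a negative demogrant D = -0.2, a self-centered non-heir gets w + D < w.
negative_demogrant = [w - 0.2, w + 0.5, w + 1.5]

print(leximin_prefers(laissez_faire, negative_demogrant))
```

The second allocation gives more to the better-off, but the comparison is decided by the worst-off individual, which is the step the proof relies on.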

6.2    Proof of Proposition 2
Proposition 2 provides a limit on the tax paid on the amount of bequest left by
a. The proof constructs another sustainable tax-demogrant scheme that domi-
nates (τ, D). The construction is based on a sustainable tax-demogrant scheme
(τ ∆ , D − ∆) illustrated in Figure 4. As shown in the ﬁgure, this second scheme
linearly truncates the budget set faced by individuals under (τ, D). This trunca-
tion reduces the demogrant by an amount ∆, but the new tax τ ∆ provides large
subsidies on small bequests. Provided the inheritance they receive is unchanged,
individuals choose the same bundle under both schemes, at least if this bundle
does not lie in the truncated part of their budget set. This implies that the well-
being of a is the same under both schemes. The smaller demogrant reduces the
well-being of s, but s still has a larger well-being than a. In the proof, we show
that when (τ, D) is sustainable, scheme (τ ∆ , D − ∆) leaves money on the table.
Money is left on the table because (i) the new scheme saves on self-centered
individuals who receive a smaller demogrant and (ii) the new scheme does not
spend more on altruistic individuals because its subsidies on small bequests are
ﬁnanced by the reduction in the demogrant. Then, the money saved by the new
scheme can finance a small increase in the demogrant, such that $(\tau^{\Delta}, D - \Delta + \varepsilon)$
is sustainable for some $\varepsilon > 0$. The larger demogrant increases the well-being of
all individuals, including that of the worst-oﬀ.
    Here is the intuition for why scheme $(\tau^{\Delta}, D - \Delta)$ is sustainable. The
truncation creates a kink in the budget set faced by individuals under $(\tau, D)$.
Importantly, this kink is located at an amount of bequest $b$ above which the tax $\tau$ is
non-negative and monotonically increasing. The larger the rate of subsidies on
small bequests associated to τ ∆ , the more numerous the altruistic individuals
who “bunch” at the kink, at least for those who used to leave a bequest smaller
than b under (τ, D). If all of these individuals “bunch” at the kink, then the
long-run equilibrium proﬁle of inheritances under (τ, D) would be ﬁrst-order
stochastically dominated by the long-run equilibrium proﬁle of inheritances un-
der (τ ∆ , D − ∆). We can then show that (τ ∆ , D − ∆) is sustainable if (τ, D) is
sustainable because, above the kink, the tax paid is increasing in the bequest
left.

Both claims (i) and (ii) in Proposition 2 rely on the following two lemmas.
Recall that we denote by $a$ an individual for whom the long-run equilibrium
$g_a = 0$ and for whom $u_a = u^a$ (the most altruistic preference), and by $s$ an
individual for whom the long-run equilibrium $g_s = 0$ and for whom $u_s = u^s$
(the self-centered preference).
Lemma 1. Under A1, for any tax-demogrant scheme $(\tau, D)$ for which $-\tau$ is
single-peaked, either $a$ or $s$ is among the worst-off.
Proof. Let $z \in S$ denote the long-run equilibrium allocation associated to $(\tau, D)$.
By assumption A1, there exist two individuals $a$ and $s$ with $g_a = 0$, $u_a = u^a$,
$g_s = 0$ and $u_s = u^s$. We derive a contradiction when assuming that for some
$k \in [0, 1]$ we have $u^c(z_k, u_k) < u^c(z_a, u_a)$ and $u^c(z_k, u_k) < u^c(z_s, u_s)$.


           Figure 4 (plot omitted; horizontal axis $c_i$, vertical axis $h_i$): The
           tax-demogrant scheme $(\tau, D)$ is dominated because the tax function
           $\tau$ taxes the bequest $b_a$ left by $a$ too heavily. Individual $a$ is the
           worst-off because $u^c(z_a, u_a) \le u^c_a$. The sustainable scheme
           $(\tau^{\Delta}, D - \Delta)$ has a smaller demogrant, does not affect
           $u^c(z_a, u_a)$ and leaves money on the table.



   Under $z$, any $i \in [0, 1]$ chooses in her budget set $B^{\tau}(w + D + g_i, 0)$. This
implies for all j ∈ [0, 1] with uj = uk and gj = 0 that uj (zj ) ≤ uk (zk ), and
hence uc (zj , uj ) ≤ uc (zk , uk ). Thus, we can assume without loss of generality
that gk = 0.
   By definition of $a$ and $s$, the fact that $g_a = g_k = g_s = 0$ implies that
$c_a \le c_k \le c_s = w + D$. Letting $u^c_k = u^c(z_k, u_k)$, we show that
$z_k \notin B^{LF}((0, u^c_k, 0))$. There are two cases, which are illustrated in Figure 5.
    • Case 1: $z_s \in B^{LF}((0, u^c_k, 0))$.
       Let $z'_s = \arg\max_{u_s} B^{LF}((0, u^c_k, 0))$. This case is such that
       $u_s(z_s) \le u_s(z'_s)$. As by definition $u^c(z'_s, u_s) = u^c_k$, we have that
       $u^c(z_s, u_s) \le u^c_k$ and thus $u^c(z_s, u_s) \le u^c(z_k, u_k)$, a contradiction.
    • Case 2: $z_s \notin B^{LF}((0, u^c_k, 0))$.
       As $z_s = (0, w + D, 0) \notin B^{LF}((0, u^c_k, 0))$ and $g_k = g_s = 0$, we have
       $z_k \notin B^{LF}((0, u^c_k, 0))$ unless $\tau(w + D - c_k) \ge 0$.$^{7}$ As $-\tau$ is
       single-peaked and $c_a \le c_k$, this implies in turn that
       $\tau(w + D - c_a) \ge \tau(w + D - c_k)$. Since $g_k = g_a = 0$, the fact that
       $\tau(w + D - c_a) \ge \tau(w + D - c_k) \ge 0$ and $z_k \in B^{LF}((0, u^c_k, 0))$
       together imply that $z_a \in B^{LF}((0, u^c_k, 0))$. Then, $u^c(z_a, u_a) \le u^c_k$
       and thus $u^c(z_a, u_a) \le u^c(z_k, u_k)$, a contradiction.

   7 By definition, $B^{LF}((0, u^c_k, 0)) = \{(c_i, h_i) \mid c_i + h_i/R \le u^c_k\}$. When
$z_s = (0, w + D, 0) \notin B^{LF}((0, u^c_k, 0))$, we have $u^c_k < w + D$. Since in equilibrium
$h_k = R(w + D - c_k - \tau(w + D - c_k))$, we have $c_k + h_k/R < w + D$ only if
$\tau(w + D - c_k) > 0$.
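The step in footnote 7 can be spelled out as a short worked equation. It only combines the equilibrium expression for $h_k$ with the definition of the Laissez-Faire budget set. The shorthand $b_k = w + D - c_k$ for the bequest left by $k$ (who inherits nothing) is introduced here for readability.

```latex
\begin{align*}
c_k + \frac{h_k}{R}
  &= c_k + \bigl(w + D - c_k - \tau(w + D - c_k)\bigr) \\
  &= w + D - \tau(b_k), \qquad b_k = w + D - c_k .
\end{align*}
```

Hence $c_k + h_k/R < w + D$ holds exactly when $\tau(b_k) > 0$, so a bundle $z_k$ lying in $B^{LF}((0, u^c_k, 0))$ with $u^c_k < w + D$ requires a positive tax on $b_k$.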


           Figure 5 (plot omitted; two panels, Case 1 and Case 2, each with
           horizontal axis $c_i$ and vertical axis $h_i$): Constructions used in
           the proof of Lemma 1.


   Since $z_k \notin B^{LF}((0, u^c_k, 0))$ but $u^c(z_k, u_k) = u^c_k$, this implies that for some
bundle $\hat{z}_k = (0, \hat{c}_k, \hat{h}_k) \in B^{LF}((0, u^c_k, 0))$ we have
$u_k(\hat{z}_k) = u_k(z_k)$ (see Figure 5, Case 2). This implies that

\[
\hat{z}_k \in \arg\max_{\tilde{z}_k \in B^{\tau}(w + D, 0) \,\cup\, B^{LF}((0, u^c_k, 0))} u_k(\tilde{z}_k).
\]

But by definition of the most-altruistic preferences $u^a$, it means that $a$ would
choose in $B^{\tau}(w + D, 0) \cup B^{LF}((0, u^c_k, 0))$ a bundle
$\hat{z}_a = (0, \hat{c}_a, \hat{h}_a)$ such that $\hat{h}_a \ge \hat{h}_k$. By construction, we have
$\hat{z}_a \in B^{LF}((0, u^c_k, 0))$ and $\hat{z}_a \notin B^{\tau}(w + D, 0)$.
This shows that $u^c(z_a, u_a) \le u^c(\hat{z}_a, u_a) = u^c_k$ and thus
$u^c(z_a, u_a) \le u^c(z_k, u_k)$, a contradiction. This concludes the proof of Lemma 1.
   Let $\mu^c_a = u^c(z_a, u_a)$ denote the long-run equilibrium c-equivalent utility of
individual $a$ with $g_a = 0$ and $u_a = u^a$ under scheme $(\tau, D)$. Let $b_a$ denote the
equilibrium bequest left by individual $a$ under scheme $(\tau, D)$.
Lemma 2. Consider any sustainable tax-demogrant scheme $(\tau, D)$ such that
$D \ge 0$ and $-\tau$ is single-peaked. Under A1 and A2, if $\tau$ is monotonically
increasing, then $(\tau, D)$ is dominated if $\mu^c_a < w + D$ and $b_a > 0$. Under A1, A2
and A3, $(\tau, D)$ is dominated if $\mu^c_a < w + D - b$ and $b_a > b$.

Proof. Let $z = (g_i, c_i, h_i)_{i \in [0,1]} \in S$ denote the long-run equilibrium allocation
associated to $(\tau, D)$ and let $(b_i)_{i \in [0,1]}$ be the long-run equilibrium profile of
bequests left, i.e. $b_i = w + D + g_i - c_i$ for all $i \in [0, 1]$. Let $A \subset [0, 1]$ be the
subset of altruistic individuals, i.e. $u_i \ne u^s$ for all $i \in A$.


    There are two cases to consider. For each case, the proof proceeds in three
steps. In Step 1, we construct a particular tax-demogrant scheme $(\tau', D')$. In
Step 2, we show that $(\tau', D')$ is sustainable if $(\tau, D)$ is sustainable. In Step 3,
we show that the long-run equilibrium allocation $z' \in S$ associated to $(\tau', D')$
is preferred by $R^{c\text{-lex}}$ over $z$.

CASE 1: for every bequest amount $b > 0$ there is a subset $J \subseteq A$ with $\mu(J) > 0$
such that $b_j < b$ for all $j \in J$.
    Step 1. We construct a particular tax-demogrant scheme $(\tau', D')$. The
construction of $\tau'$ is based on a particular bequest amount $\beta > 0$, whose definition
depends on the type of $\tau$.
   • If $\tau$ is monotonically increasing, then $b = 0$ and we take any $\beta$ such that
     $0 < \beta < \min(b_a, w + D - \mu^c_a)$.

   • If $-\tau$ is positive peak, then $b > 0$ and we take $\beta = b$.
For both types, we have $0 < \beta < \min(b_a, w + D - \mu^c_a)$.
    Given $\beta$, we construct $(\tau', D')$ from a specific member of a parametric family
of “truncated” tax-demogrant schemes $(\tau^{\Delta}, D - \Delta)$ with parameter $\Delta \in (0, \beta)$.
The construction of $(\tau^{\Delta}, D - \Delta)$ is illustrated in Figure 6 for the case $b = 0$
and in Figure 4 for the case $b > 0$ (where $\beta = b$). All members of this family
linearly truncate the budget set $B^{\tau}(w + D + g_i, 0)$ for bequests smaller than $\beta$
and differ by their associated demogrant $D - \Delta$.$^{8}$ Formally, we define $\tau^{\Delta}$ as

\[
\tau^{\Delta}(x) :=
\begin{cases}
\tau(x + \Delta) - \Delta & \text{for all } x \ge \beta - \Delta, \\[4pt]
\dfrac{\tau(\beta) - \Delta}{\beta - \Delta}\, x & \text{for all } x \in [0, \beta - \Delta].
\end{cases}
\tag{4}
\]
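Eq. (4) can be made concrete with a small sketch. The helper `truncate_tax` below is a hypothetical name, and the linear tax `example_tau` and the values of $\beta$ and $\Delta$ are illustrative assumptions rather than objects from the paper. The sketch only checks that the two branches of Eq. (4) meet at the kink $x = \beta - \Delta$ and that small bequests are subsidized when $\tau(\beta) < \Delta$.

```python
import math
from typing import Callable


def truncate_tax(tau: Callable[[float], float], beta: float,
                 delta: float) -> Callable[[float], float]:
    """Build tau^Delta from Eq. (4) for a given tax function tau."""
    if not 0.0 < delta < beta:
        raise ValueError("Eq. (4) requires 0 < delta < beta")
    # Slope of the linear segment applied to bequests in [0, beta - delta].
    slope = (tau(beta) - delta) / (beta - delta)

    def tau_delta(x: float) -> float:
        if x < 0.0:
            raise ValueError("bequests are non-negative")
        if x >= beta - delta:
            # Upper branch: the original tax, shifted by delta.
            return tau(x + delta) - delta
        # Lower branch: linear interpolation between 0 and the kink.
        return slope * x

    return tau_delta


def example_tau(b: float) -> float:
    """Hypothetical monotonically increasing tax (not from the paper)."""
    return 0.2 * b


beta, delta = 1.0, 0.5
tau_delta = truncate_tax(example_tau, beta, delta)
kink = beta - delta

# Both branches of Eq. (4) give tau(beta) - delta at the kink.
print(tau_delta(kink), example_tau(beta) - delta)
# Below the kink the tax is negative here because tau(beta) < delta.
print(tau_delta(0.25))
```

In this example a bequest of 0.25 carries a negative tax, which corresponds to the subsidy on small bequests that the truncation is designed to create.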

  The particularity of scheme $(\tau^{\Delta}, D - \Delta)$ is that any individual $i \in [0, 1]$ for
whom $b_i \ge \beta$ chooses the same bundle under both $(\tau^{\Delta}, D - \Delta)$ and $(\tau, D)$, i.e.

\[
\arg\max_{\tilde{z}_i \in B^{\tau}(w + D + g_i, 0)} u_i(\tilde{z}_i)
= \arg\max_{\tilde{z}_i \in B^{\tau^{\Delta}}(w + D - \Delta + g_i, 0)} u_i(\tilde{z}_i).
\]

    Let $z^{\Delta} = (g_i^{\Delta}, c_i^{\Delta}, h_i^{\Delta})_{i \in [0,1]} \in S$ denote the long-run equilibrium
allocation associated to $(\tau^{\Delta}, D - \Delta)$ and let $(b_i^{\Delta})_{i \in [0,1]}$ be the long-run
equilibrium profile of bequests left, i.e. $b_i^{\Delta} = w + D - \Delta + g_i^{\Delta} - c_i^{\Delta}$ for all
$i \in [0, 1]$. Recall that $R\beta - R\tau(\beta)$ is the amount inherited by the child of any
$i \in [0, 1]$ for whom $b_i = \beta$. Consider the subset $J^{\Delta} \subseteq A$ for whom
$h_j^{\Delta} < R\beta - R\tau(\beta)$ for all $j \in J^{\Delta}$.
    We show that $\mu(J^{\Delta}) \to 0$ when $\Delta \to \beta$. The intuition for this statement is
that the “truncated” budget set

\[ B^{\tau^{\Delta}}(w + D - \Delta, 0) \]

has a kink at bundle $(0, w + D - \beta, R\beta - R\tau(\beta))$. Therefore, the larger is $\Delta$,
the steeper is the slope of this truncated budget set for small bequests (this
slope tends to $-\infty$ when $\Delta \to \beta$), and the greater is the incentive to “bunch” at
the kink for the “moderately” altruistic individuals who inherit nothing. More
formally, for any altruistic preference $u \in U \setminus \{u^s\}$, there is a $\Delta < \beta$ such that
for any $i \in [0, 1]$ with $u_i = u$ and $g_i = 0$ we have $h_i^{\Delta} \ge R\beta - R\tau(\beta)$. This
follows from A2, which requires that the altruistic preference $u_i$ is strictly
monotonic in $h_i$ when $c_i \ge w$. The latter is guaranteed because the kink is
located at bundle $(0, w + D - \beta, R\beta - R\tau(\beta))$ where $w + D - \beta \ge w$.$^{9}$ By the
binormality of preferences, we also have $h_i^{\Delta} \ge R\beta - R\tau(\beta)$ for any $i \in [0, 1]$ with
an altruistic preference $u_i$ and $g_i > 0$, which yields the result.

  8 Demogrant $D - \Delta$ need not be the maximal sustainable demogrant under tax $\tau^{\Delta}$.

       Figure 6 (plot omitted; horizontal axis $c_i$, vertical axis $h_i$): Construction
       of scheme $(\tau^{\Delta}, D - \Delta)$ for the case $b = 0$.
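Two properties of the truncated scheme used in this step can be checked directly from Eq. (4). The sketch assumes, as in footnote 7, that a bequest $x$ taxed by $\tau$ yields an inheritance $R(x - \tau(x))$. The change of variable $x = b_i - \Delta$ for the bequest left under $(\tau^{\Delta}, D - \Delta)$, and the condition $\beta > \tau(\beta)$ used in the closing remark, are introduced here for illustration.

```latex
% (a) Above the kink (b_i >= beta): the bundle chosen under (tau, D)
%     is reproduced under (tau^Delta, D - Delta) by bequeathing b_i - Delta.
\begin{align*}
c_i^{\Delta} &= w + D - \Delta + g_i - (b_i - \Delta) = w + D + g_i - b_i, \\
h_i^{\Delta} &= R\bigl[(b_i - \Delta) - \tau^{\Delta}(b_i - \Delta)\bigr]
              = R\bigl[b_i - \Delta - \tau(b_i) + \Delta\bigr]
              = R\bigl[b_i - \tau(b_i)\bigr].
\end{align*}
% (b) Below the kink (0 <= x <= beta - Delta): the budget frontier is linear.
\begin{align*}
h = R\bigl[x - \tau^{\Delta}(x)\bigr]
  = R\,\frac{\beta - \tau(\beta)}{\beta - \Delta}\,x,
\qquad
\frac{dh}{dc} = -R\,\frac{\beta - \tau(\beta)}{\beta - \Delta}.
\end{align*}
```

Setting $b_i = \beta$ in (a) recovers the kink $(0, w + D - \beta, R\beta - R\tau(\beta))$. With $\beta > \tau(\beta)$, the slope in (b) diverges to $-\infty$ as $\Delta \to \beta$, which is the source of the incentive to bunch at the kink.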
    We are now equipped for the definition of scheme $(\tau', D')$. This definition is
based on the per-capita money amount $\beta\lambda_s$, which is saved by the government
on the mass $\lambda_s$ of self-centered individuals when reducing the demogrant by
an amount $\beta$. For $\frac{\beta\lambda_s}{2} > 0$, consider the subset of altruistic individuals
$J^{\beta\lambda_s/2} \subseteq A$ for whom $h_j < \frac{\beta\lambda_s}{2}$ for all $j \in J^{\beta\lambda_s/2}$. In words, all
altruistic individuals in $J^{\beta\lambda_s/2}$ leave an inheritance smaller than half the
per-capita amount saved on self-centered individuals. Under Case 1, we have
$\mu(J^{\beta\lambda_s/2}) > 0$. Since $\mu(J^{\Delta}) \to 0$ when $\Delta \to \beta$, there is a value $\Delta^*$ with
$\Delta^* > \frac{\beta}{2}$ such that$^{10}$

   • $\mu(J^{\Delta^*}) < \mu(J^{\beta\lambda_s/2})$, and
   • $\tau^{\Delta^*}$ subsidizes bequests smaller than $\beta - \Delta^*$.$^{11}$
In fact any $\Delta > \Delta^*$ also satisfies these two properties. We define scheme $(\tau', D')$
as

\[
(\tau', D') = \left(\tau^{\Delta^*},\; D - \Delta^* + \frac{\beta\lambda_s}{2}\right),
\]

whose construction is illustrated in Figure 7.

    9 Indeed, we assume that $\beta \le w + D - u^c_a$ and we have $u^c_a \ge w$ (otherwise by A1 $\tau$ is
dominated by Laissez-Faire).
  10 Recall that by definition of $\tau^{\Delta}$ we have $\Delta < \beta$.
  11 If the tax $\tau$ is positive peak, then $\tau^{\Delta^*}$ subsidizes bequests smaller than $\beta - \Delta^*$ for
all $\Delta^* > 0$. In contrast, when the tax $\tau$ is monotonically increasing, $\tau^{\Delta^*}$ subsidizes small
bequests when $\tau(\beta) < \Delta^*$.

            Figure 7 (plot omitted; horizontal axis $c_i$, vertical axis $h_i$):
            Construction of scheme $(\tau', D')$ for the case $b = 0$.


     Step 2. We show that $(\tau', D')$ is sustainable if $(\tau, D)$ is sustainable. Let
$z' = (g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$ denote the long-run equilibrium allocation associated
to $(\tau', D')$ and let $(b'_i)_{i \in [0,1]}$ be the long-run equilibrium profile of bequests left,
i.e. $b'_i = w + D' + g'_i - c'_i$ for all $i \in [0, 1]$. Let $(\hat{h}_i)_{i \in [0,1]}$ be the profile obtained
from $(h_i)_{i \in [0,1]}$ when sorting dynasties by increasing order of $h_i$, i.e. $\hat{h}_j \le \hat{h}_k$
for any $j, k \in [0, 1]$ with $j < k$. Similarly, we use symbol “ˆ” to denote other
bequest or inheritance profiles sorted in increasing order. Step 2 relies on the
following Technical Claim.

      Technical Claim: For any $i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}), 1]$ we have $\hat{h}'_i \ge \hat{h}_i$ and
      $\hat{h}'_i \ge R\beta - R\tau(\beta)$.$^{12}$

      For $t \in \{0, 1, \dots\}$, we consider successive equilibrium allocations
      $z'_t = (g'_{it}, c'_{it}, h'_{it+1})_{i \in [0,1]}$ for which all $i \in [0, 1]$ choose in
      $B^{\tau'}(w + D' + g'_{it}, 0)$ and $(\hat{g}'_{it+1})_{i \in [0,1]} = (\hat{h}'_{it+1})_{i \in [0,1]}$, under an initial
      profile of inheritances $(g'_{i0})_{i \in [0,1]} = (h_i)_{i \in [0,1]}$, which corresponds to the
      long-run equilibrium profile associated to $(\tau, D)$. We show for all
      $t \in \{0, 1, \dots\}$ that
       (1) $\hat{h}'_{it+1} + \frac{\beta\lambda_s}{2} \ge \hat{h}_i$ for all $i \in [0, 1]$,
       (2) $\hat{h}'_{it+1} \ge \hat{h}_i$ and $\hat{h}'_{it+1} \ge R\beta - R\tau(\beta)$ for any
           $i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}), 1]$.

 12 Index $i$ need not refer to the same dynasty in the two sorted distributions.
      Observe that, at all $t$, (1) and (2) compare the sorted profile of
      inheritances left in $t$ to the sorted profile of inheritances left under the
      long-run equilibrium allocation $z$.
      If (1) and (2) hold for all $t \ge 0$, then (2) holds as well for the profile
      $(h'_i)_{i \in [0,1]}$ associated to the long-run equilibrium allocation $z'$, because we
      assume that for any given tax-demogrant scheme, the economy converges
      over time to a unique long-run equilibrium allocation independent of the
      initial distribution of inheritances.
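Conditions (1) and (2) are statements about sorted profiles, so they can be checked mechanically once the profiles are known. The sketch below does this for a finite population, which is an illustrative discretization of the continuum $[0, 1]$. The function name `check_claims`, the argument `n_excluded` (the number of lowest-ranked dynasties, standing in for the mass $\lambda_s + \mu(J^{\beta\lambda_s/2})$) and the numbers are assumptions made for this example.

```python
from typing import Sequence, Tuple


def check_claims(h_new: Sequence[float], h_old: Sequence[float],
                 shift: float, kink_inheritance: float,
                 n_excluded: int) -> Tuple[bool, bool]:
    """Check finite-population analogues of conditions (1) and (2).

    h_new: inheritances left under the new scheme in some period.
    h_old: long-run equilibrium inheritances under the original scheme.
    shift: the amount beta * lambda_s / 2 added in condition (1).
    kink_inheritance: the amount R*beta - R*tau(beta) in condition (2).
    n_excluded: number of lowest-ranked dynasties exempt from condition (2).
    """
    if len(h_new) != len(h_old):
        raise ValueError("both profiles must cover the same population")
    # Sort each profile separately: rank i need not be the same dynasty.
    new_sorted, old_sorted = sorted(h_new), sorted(h_old)
    claim1 = all(n + shift >= o for n, o in zip(new_sorted, old_sorted))
    claim2 = all(n >= o and n >= kink_inheritance
                 for n, o in zip(new_sorted[n_excluded:],
                                 old_sorted[n_excluded:]))
    return claim1, claim2


# Hypothetical profiles for five dynasties.
h_old = [0.0, 0.0, 0.1, 0.8, 1.2]
h_new = [0.0, 0.9, 0.0, 1.2, 0.0]
print(check_claims(h_new, h_old, shift=0.2, kink_inheritance=0.8, n_excluded=3))
```

Sorting each profile on its own mirrors footnote 12: the comparison is between ranks of the two distributions, not between the inheritances of a given dynasty.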
      Consider first $t = 0$. In order to show that claims (1) and (2) hold for
      $t = 0$ when the tax-demogrant scheme is $(\tau', D')$, it is sufficient to show
      that claims (1) and (2) hold for $t = 0$ when the tax-demogrant scheme is
      $(\tau^{\Delta^*}, D - \Delta^*)$ instead of $(\tau', D')$. Indeed, let $(h^{\Delta^*}_{i1})_{i \in [0,1]}$ be the profile
      of inheritances obtained in period $t = 0$ if the tax-demogrant scheme is
      $(\tau^{\Delta^*}, D - \Delta^*)$. Since $\tau' = \tau^{\Delta^*}$ and $D' > D - \Delta^*$, the binormality of
      preferences implies that $h'_{i1} \ge h^{\Delta^*}_{i1}$ for all $i \in [0, 1]$, which shows that it is
      indeed sufficient to prove these claims when the tax-demogrant scheme is
      $(\tau^{\Delta^*}, D - \Delta^*)$.
      Consider first claim (1) in $t = 0$, i.e.

\[
\hat{h}^{\Delta^*}_{i1} + \frac{\beta\lambda_s}{2} \ge \hat{h}_i \quad \text{for all } i \in [0, 1]. \tag{5}
\]

      Consider the profile $(h_{i1})_{i \in [0,1]}$ of inheritances obtained in period $t = 0$ if
      the tax-demogrant scheme is $(\tau, D)$ instead of $(\tau^{\Delta^*}, D - \Delta^*)$. As the initial
      profile $(g_{i0})_{i \in [0,1]} = (h_i)_{i \in [0,1]}$, we have $(\hat{h}_{i1})_{i \in [0,1]} = (\hat{h}_i)_{i \in [0,1]}$ because
      $z$ is the long-run equilibrium allocation associated to $(\tau, D)$. It is thus
      sufficient to show that

\[
\hat{h}^{\Delta^*}_{i1} + \frac{\beta\lambda_s}{2} \ge \hat{h}_{i1} \quad \text{for all } i \in [0, 1] \tag{6}
\]

      for Eq. (5) to hold.
      Since we have $h^{\Delta^*}_{i1} = h_{i1} = 0$ for all $i \in [0, 1]$ for whom $u_i = u^s$, we can
      focus on the subset $A = \{i \in [0, 1] \mid u_i \ne u^s\}$ of altruistic individuals. We
      partition $A$ into three subgroups $A^1$, $A^2$ and $A^3$, respectively defined as
         – $A^1 = \{i \in A \mid h_{i1} \ge R\beta - R\tau(\beta)\}$.
           By construction of $(\tau^{\Delta^*}, D - \Delta^*)$, any $i \in A^1$ chooses the same bundle
           in $B^{\tau}(w + D + g_{i0}, 0)$ and in $B^{\tau^{\Delta^*}}(w + D - \Delta^* + g_{i0}, 0)$. This implies
           that $h^{\Delta^*}_{i1} = h_{i1}$ for all $i \in A^1$.
         – $A^2 = \{i \in A \mid h_{i1} < R\beta - R\tau(\beta) \text{ and } h^{\Delta^*}_{i1} \ge R\beta - R\tau(\beta)\}$.
           By definition, we have $h^{\Delta^*}_{i1} > h_{i1}$ for all $i \in A^2$.
         – $A^3 = \{i \in A \mid h_{i1} < R\beta - R\tau(\beta) \text{ and } h^{\Delta^*}_{i1} < R\beta - R\tau(\beta)\}$.
           By definition, any altruistic $j \in A^3$ leaves a smaller inheritance than
           $R\beta - R\tau(\beta)$, and thus a smaller inheritance than any $i \in A^1 \cup A^2$,
           i.e. $h^{\Delta^*}_{j1} \le h^{\Delta^*}_{i1}$.
           We can assume without loss of generality that $\mu(A^3) \le \mu(J^{\beta\lambda_s/2})$. If
           it is not the case, consider for the construction of $(\tau', D')$ a larger
           $\Delta \in (\Delta^*, \beta)$ for which we have $\mu(A^3) \le \mu(J^{\beta\lambda_s/2})$. Such a larger value
           exists by assumption A2.


       By the definition of the above partition of $[0, 1]$, we have

\[
h^{\Delta^*}_{i1} \ge h_{i1} \tag{7}
\]

       for all $i \in [0, 1] \setminus A^3$,$^{13}$ but Eq. (7) may not hold for some $i \in A^3$. However,
       we have $\mu(A^3) \le \mu(J^{\beta\lambda_s/2})$ and there is a subset of individuals of mass
       $\mu(J^{\beta\lambda_s/2})$ for whom $h_{i1} < \frac{\beta\lambda_s}{2}$. Therefore, even if $h^{\Delta^*}_{i1} = 0$ for all $i \in A^3$,
       we have $h^{\Delta^*}_{i1} + \frac{\beta\lambda_s}{2} \ge \frac{\beta\lambda_s}{2}$ for all $i \in A^3$,$^{14}$ which shows that (6) holds,
       and thus (5) holds.
       We now turn to claim (2) for $t = 0$ if the tax-demogrant scheme is
       $(\tau^{\Delta^*}, D - \Delta^*)$ instead of $(\tau', D')$, i.e.

\[
\hat{h}^{\Delta^*}_{i1} \ge \hat{h}_{i1} \ \text{ and } \ \hat{h}^{\Delta^*}_{i1} \ge R\beta - R\tau(\beta)
\quad \text{for all } i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}), 1]. \tag{8}
\]

       By definition, if $i \in A^1 \cup A^2$ and $j \notin A^1 \cup A^2$ we have $h^{\Delta^*}_{i1} \ge h^{\Delta^*}_{j1}$. Also,
       as $A^2 \cap J^{\beta\lambda_s/2}$ may be non-empty, the mass of individuals in $A^1 \cup A^2$ is
       such that $\mu(A^1 \cup A^2) \ge 1 - \lambda_s - \mu(J^{\beta\lambda_s/2})$. Hence, it is sufficient that
       Eq. (8) holds for individuals in $A^1 \cup A^2$. As shown when defining these
       two subgroups, we have $h^{\Delta^*}_{i1} = h_{i1} \ge R\beta - R\tau(\beta)$ for all $i \in A^1$ and
       $h^{\Delta^*}_{i1} = R\beta - R\tau(\beta) > h_{i1}$ for all $i \in A^2$, which proves Eq. (8).
       Together, we have shown claims (1) and (2) for $t = 0$. We next prove
       these claims for $t = 1$.
       In $t = 1$, the profile of inheritances received under $(\tau', D')$ is such that
       $(\hat{g}'_{i1})_{i \in [0,1]} = (\hat{h}'_{i1})_{i \in [0,1]}$. By construction, since $\tau' = \tau^{\Delta^*}$ and
       $D' = D - \Delta^* + \frac{\beta\lambda_s}{2}$, we have for all $i \in [0, 1]$ that

\[
B^{\tau^{\Delta^*}}\!\left(w + D - \Delta^* + g'_{i1} + \frac{\beta\lambda_s}{2},\, 0\right) = B^{\tau'}(w + D' + g'_{i1}, 0),
\]

       as can be seen in Figure 7 (for the case $g'_{i1} = 0$). Therefore, the profile
       $(\hat{h}'_{i2})_{i \in [0,1]}$ obtained under $(\tau', D')$ for $(\hat{g}'_{i1})_{i \in [0,1]} = (\hat{h}'_{i1})_{i \in [0,1]}$ is the same
       as the profile $(\hat{h}^{\Delta^*}_{i2})_{i \in [0,1]}$ that would be obtained under $(\tau^{\Delta^*}, D - \Delta^*)$ if
       the profile of inheritances received in $t = 1$ was instead $(\hat{h}'_{i1} + \frac{\beta\lambda_s}{2})_{i \in [0,1]}$.
       From claim (1) for $t = 0$, we have that $\hat{g}'_{i1} + \frac{\beta\lambda_s}{2} \ge \hat{g}'_{i0}$ for all $i \in [0, 1]$.
       Therefore, by the binormality of preferences, the same reasoning implies
       again (1) and (2) for $t = 1$.
       The same reasoning extends (1) and (2) to any $t \ge 2$, which concludes the
       proof of the Technical Claim.


   Using the Technical Claim, we show that $(\tau', D')$ is sustainable if $(\tau, D)$ is
sustainable. If $(\tau, D)$ is sustainable, then we have from the government's budget
constraint that
\[ 0 \le \int_{i\in[0,1]} \bigl(\tau(b_i) - D\bigr)\, di, \]
  13 We have shown that $h^{\Delta^*}_{i1} = h_{i1}$ for all $i \in [0,1]\setminus A$, $h^{\Delta^*}_{i1} = h_{i1}$ for all $i \in A^1$ and $h^{\Delta^*}_{i1} > h_{i1}$
for all $i \in A^2$.
  14 In words, even if all individuals in $A^3$ leave no inheritances, there are enough altruistic
individuals who, in the long-run equilibrium allocation $z$, leave inheritances smaller than the
additional amount $\frac{\beta\lambda_s}{2}$ considered in claim (1).


where $(b_i)_{i\in[0,1]}$ is the long-run equilibrium profile of bequests left under $(\tau, D)$.
In order to show that $(\tau', D')$ is sustainable, it is sufficient that$^{15}$

\[ \int_{i\in[0,1]} \bigl(\tau(b_i) - D\bigr)\, di \;\le\; \int_{i\in[0,1]} \bigl(\tau'(b'_i) - D'\bigr)\, di \]

where $(b'_i)_{i\in[0,1]}$ is the long-run equilibrium profile of bequests left under $(\tau', D')$.
    Recalling that the mass of self-centered individuals is $\lambda_s$, we have for all
$i \in [0, \lambda_s]$ that $\hat b_i = \hat b'_i = 0$ and thus $\tau(\hat b_i) = \tau'(\hat b'_i) = 0$. The last inequality holds if
the money saved by reducing the demogrant from $D$ to $D'$ is sufficient to cover
the reduction in tax collected on altruistic individuals, i.e.

\[ \int_{i\in(\lambda_s,1]} \tau(\hat b_i)\, di - \int_{i\in(\lambda_s,1]} \tau'(\hat b'_i)\, di \;\le\; D - D'. \tag{9} \]

    In the remainder of Step 2, we ﬁrst show that Eq. (9) holds in a special
case. Then, we build on this special case in order to show that Eq. (9) holds in
general. To do that, using the Technical Claim, we show that the special case
is in fact the worst-case scenario.
    We now show that Eq. (9) holds for the special case for which $\hat h'_i = \hat h_i \ge R\beta - R\tau(\beta)$
for all $i \in (\lambda_s, 1]$. When $\hat h_i \ge R\beta - R\tau(\beta)$, because $\tau' = \tau^{\Delta^*}$
we have by the definition of $\tau^{\Delta}$ in Eq. (4) that $\tau'(\hat b'_i) = \tau(\hat b'_i + \Delta^*) - \Delta^*$. If
$\hat h'_i = \hat h_i$, which is equivalent to $\hat b'_i - \tau'(\hat b'_i) = \hat b_i - \tau(\hat b_i)$, the definition of $\tau'$ implies
that $\hat b'_i = \hat b_i - \Delta^*$ for all $i \in (\lambda_s, 1]$. In turn, this implies that $\tau'(\hat b'_i) = \tau(\hat b_i) - \Delta^*$.
Replacing this expression in Eq. (9) yields

\[ (1 - \lambda_s)\,\Delta^* \le D - D'. \]

Since $D - D' = \Delta^* - \frac{\beta\lambda_s}{2}$, the last inequality becomes $\Delta^* \ge \frac{\beta}{2}$, which holds as the
construction of $\Delta^*$ is such that $\frac{\beta}{2} < \Delta^* < \beta$.
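The algebra behind the special case can be spelled out as follows. This is only a sketch: it writes $\tau'$, $D'$ and $\hat b'_i$ for the objects of the new scheme (our reading of the notation), uses $D' = D - \Delta^* + \beta\lambda_s/2$ from the construction, and assumes the `amsmath` package for the `align*` environment.

```latex
\begin{align*}
\int_{i\in(\lambda_s,1]} \tau(\hat b_i)\,di - \int_{i\in(\lambda_s,1]} \tau'(\hat b'_i)\,di
  &= \int_{i\in(\lambda_s,1]} \bigl[\tau(\hat b_i) - \bigl(\tau(\hat b_i) - \Delta^*\bigr)\bigr]\,di
   = (1-\lambda_s)\,\Delta^*, \\
D - D' &= \Delta^* - \frac{\beta\lambda_s}{2}, \\
(1-\lambda_s)\,\Delta^* \le \Delta^* - \frac{\beta\lambda_s}{2}
  &\iff \lambda_s\,\Delta^* \ge \frac{\beta\lambda_s}{2}
   \iff \Delta^* \ge \frac{\beta}{2} \qquad (\text{as } \lambda_s > 0).
\end{align*}
```

The first line uses $\tau'(\hat b'_i) = \tau(\hat b_i) - \Delta^*$ on the mass $1 - \lambda_s$ of altruistic individuals; the last line is where the lower bound $\Delta^* > \beta/2$ from the construction is needed.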
    There remains to show that Eq. (9) holds in general if Eq. (9) holds for the
special case for which $\hat h'_i = \hat h_i \ge R\beta - R\tau(\beta)$ for all $i \in (\lambda_s, 1]$. By the Technical
Claim, we have $\hat h'_i \ge \hat h_i$ and $\hat h'_i \ge R\beta - R\tau(\beta)$ for all $i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}), 1]$.
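From profiles $(\hat h_i)_{i\in(\lambda_s,1]}$ and $(\hat h'_i)_{i\in(\lambda_s,1]}$, we show it is possible to construct two
alternative profiles $(\tilde{\hat h}_i)_{i\in(\lambda_s,1]}$ and $(\tilde{\hat h}'_i)_{i\in(\lambda_s,1]}$ that correspond to the special
case, i.e. $\tilde{\hat h}'_i = \tilde{\hat h}_i \ge R\beta - R\tau(\beta)$ for all $i \in (\lambda_s, 1]$, and for which

\[ \int_{i\in(\lambda_s,1]} \tau(\hat b_i)\,di - \int_{i\in(\lambda_s,1]} \tau'(\hat b'_i)\,di \;\le\; \int_{i\in(\lambda_s,1]} \tau(\tilde{\hat b}_i)\,di - \int_{i\in(\lambda_s,1]} \tau'(\tilde{\hat b}'_i)\,di, \tag{10} \]

where $\tilde{\hat h}_i = R\tilde{\hat b}_i - R\tau(\tilde{\hat b}_i)$ and $\tilde{\hat h}'_i = R\tilde{\hat b}'_i - R\tau'(\tilde{\hat b}'_i)$ for all $i \in (\lambda_s, 1]$. If Eq. (10)
holds, then Eq. (9) holds for $(\hat b_i)_{i\in(\lambda_s,1]}$ and $(\hat b'_i)_{i\in(\lambda_s,1]}$ because the inheritance
profiles associated to $(\tilde{\hat b}_i)_{i\in(\lambda_s,1]}$ and $(\tilde{\hat b}'_i)_{i\in(\lambda_s,1]}$ correspond to the special case
as $\tilde{\hat h}'_i = \tilde{\hat h}_i \ge R\beta - R\tau(\beta)$ for all $i \in (\lambda_s, 1]$.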
   15 Recall that the sustainability of a scheme (τ, D ) relates only to the government budget

constraint under its associated long-run equilibrium allocation. Indeed, a long-run equilibrium
allocation is by deﬁnition a steady-state allocation, implying that the proﬁle of inheritances
left corresponds to the proﬁle of inheritances received.


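    We construct $(\tilde{\hat h}_i)_{i\in(\lambda_s,1]}$ from $(\hat h_i)_{i\in(\lambda_s,1]}$ as follows (see Figure 8.a for an
illustration for the case of a positive peak tax τ, for which $\tau(\beta) = 0$):
\[
\begin{aligned}
\tilde{\hat h}_j &= R\beta - R\tau(\beta) && \text{for all } j \in (\lambda_s,\ \lambda_s + \mu(J^{\beta\lambda_s/2})], \\
\tilde{\hat h}_i &= \hat h_i && \text{for all } i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}),\ 1],
\end{aligned}
\]
and construct $(\tilde{\hat h}'_i)_{i\in(\lambda_s,1]}$ from $(\hat h_i)_{i\in(\lambda_s,1]}$ and $(\hat h'_i)_{i\in(\lambda_s,1]}$ as follows (see Figure
8.b for an illustration for the case of a positive peak tax τ, for which $\tau(\beta) = 0$):
\[
\begin{aligned}
\tilde{\hat h}'_j &= R\beta - R\tau(\beta) && \text{for all } j \in (\lambda_s,\ \lambda_s + \mu'], \\
\tilde{\hat h}'_i &= \tilde{\hat h}_i && \text{for all } i \in (\lambda_s + \mu',\ 1].
\end{aligned}
\]
In words, profiles $(\tilde{\hat h}_i)_{i\in(\lambda_s,1]}$ and $(\tilde{\hat h}'_i)_{i\in(\lambda_s,1]}$ are constructed by replacing the
inheritances $\hat h_j$ and $\hat h'_j$ that are smaller than $R\beta - R\tau(\beta)$ by $R\beta - R\tau(\beta)$, and
then replacing inheritances $\hat h'_i$ that are larger than $\hat h_i$ by $\hat h_i$.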

[Figure 8: two panels, (a) and (b), each plotting inheritance left $h_i$ against consumption $c_i$. Panel (a) shows the budget sets $B^{LF}((0, w + D, 0))$ and $B^{\tau}$ for the case in which $-\tau$ is single-peaked with $\tau(\beta) = 0$; panel (b) shows $B^{LF}((0, w + D', 0))$, $B^{\tau}$ and $B^{\tau'}$.]

Figure 8: (a) Construction of profile $(\tilde{\hat h}_i)_{i\in(\lambda_s,1]}$, where $j \in (\lambda_s, \lambda_s + \mu(J^{\beta\lambda_s/2})]$ and $i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}), 1]$.
(b) Construction of profile $(\tilde{\hat h}'_i)_{i\in(\lambda_s,1]}$, where $j \in (\lambda_s, \lambda_s + \mu']$ and $i \in (\lambda_s + \mu', 1]$. Case of a positive
peak tax τ, for which $\tau(\beta) = 0$.


    There remains to show that Eq. (10) holds. First, we show that

\[ \int_{i\in(\lambda_s,1]} \tau(\hat b_i)\,di \;\le\; \int_{i\in(\lambda_s,1]} \tau(\tilde{\hat b}_i)\,di. \tag{11} \]

For all $i \in (\lambda_s + \mu(J^{\beta\lambda_s/2}), 1]$, we have $\hat h_i = \tilde{\hat h}_i \ge R\beta - R\tau(\beta)$, implying that
$\tau(\hat b_i) = \tau(\tilde{\hat b}_i)$. For all $j \in (\lambda_s, \lambda_s + \mu(J^{\beta\lambda_s/2})]$, we have $\hat h_j < R\beta - R\tau(\beta)$ and
$\tilde{\hat h}_j = R\beta - R\tau(\beta)$, and we show that $\tau(\hat b_j) \le \tau(\tilde{\hat b}_j)$. This is obvious if τ is
monotonically increasing because $\hat b_j < \tilde{\hat b}_j = \beta$. If τ is positive peak (the case
illustrated in Figure 8.a), then our construction is such that β = b. As −τ is
single-peaked, this implies that $\tau(\hat b_j) \le 0$ whereas $\tau(\beta) = 0$. Therefore Eq. (11)
holds.
    Second, we show that

\[ \int_{i\in(\lambda_s,1]} \tau'(\hat b'_i)\,di \;\ge\; \int_{i\in(\lambda_s,1]} \tau'(\tilde{\hat b}'_i)\,di. \tag{12} \]

For all $i \in (\lambda_s + \mu', 1]$, we have $\hat h'_i \ge \tilde{\hat h}'_i \ge R\beta - R\tau(\beta)$ and we show that
$\tau'(\hat b'_i) \ge \tau'(\tilde{\hat b}'_i)$ (see illustration in Figure 8.b, where $-\tau'(\hat b'_i) \le -\tau'(\tilde{\hat b}'_i)$). The
construction of $\tau'$ from τ is such that $-\tau'$ is single-peaked when −τ is single-
peaked. Also, as $\hat h'_i \ge \tilde{\hat h}'_i \ge R\beta - R\tau(\beta)$ we have that $\hat b'_i \ge \tilde{\hat b}'_i \ge \beta - \Delta^*$.
For bequest amounts larger than $\beta - \Delta^*$, $\tau'$ is monotonically increasing in
bequest, which implies that $\tau'(\hat b'_i) \ge \tau'(\tilde{\hat b}'_i)$. For all $j \in (\lambda_s, \lambda_s + \mu']$, we have
$\hat h'_j < R\beta - R\tau(\beta)$ and $\tilde{\hat h}'_j = R\beta - R\tau(\beta)$, and we show that $\tau'(\hat b'_j) \ge \tau'(\tilde{\hat b}'_j)$. For
such $j$, we have thus $\hat b'_j < \tilde{\hat b}'_j = \beta - \Delta^*$. As $\tau' = \tau^{\Delta^*}$ and $\hat b'_j < \beta - \Delta^*$, we have
$\tau'(\hat b'_j) < 0$. What is more, the construction of $\tau'$ is such that $\tau'(x') < \tau'(x)$
for all $0 \le x < x' \le \beta - \Delta^*$, i.e. the subsidy received under $\tau'$ is increasing in
the bequest left, when the bequest is smaller than $\beta - \Delta^*$. Therefore Eq. (12)
holds.
     Eq. (11) and Eq. (12) together imply that Eq. (10) holds, which concludes Step
2.
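The way Eqs. (11) and (12) feed into Eq. (9) can be summarized in one chain. This is a sketch under our reading of the notation, with primes denoting objects of the new scheme and all integrals taken over $i \in (\lambda_s, 1]$; `amsmath` is assumed.

```latex
\begin{align*}
\int \tau(\hat b_i)\,di - \int \tau'(\hat b'_i)\,di
  &\le \int \tau(\tilde{\hat b}_i)\,di - \int \tau'(\hat b'_i)\,di
     && \text{by Eq. (11)} \\
  &\le \int \tau(\tilde{\hat b}_i)\,di - \int \tau'(\tilde{\hat b}'_i)\,di
     && \text{by Eq. (12), which is Eq. (10)} \\
  &\le D - D'
     && \text{by Eq. (9) in the special case.}
\end{align*}
```

The last step applies because the tilde profiles satisfy $\tilde{\hat h}'_i = \tilde{\hat h}_i \ge R\beta - R\tau(\beta)$, which is exactly the special case treated first.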
     Step 3. We show that allocation $z'$ is preferred by $R^{c\text{-}lex}$ to allocation $z$.
     By assumption A1, there is a positive mass of individuals a and a positive
mass of individuals s. The construction of $\tau'$ from τ is such that $-\tau'$ is single-
peaked when −τ is single-peaked. By Lemma 1, under both $z$ and $z'$, either
a or s is among the worst-off. Individual a is among the worst-off under $z$
because $\mu^c_a = u^c(z_a, u_a) < w + D = u^c(z_s, u_s)$, as illustrated in Figure 6.
     There remains to show that $\mu^c_a < u^c(z'_a, u_a)$ and $\mu^c_a < u^c(z'_s, u_s)$. We have
$\mu^c_a < u^c(z'_s, u_s)$ because $u^c(z'_s, u_s) = w + D' > w + D - \beta$ and $\beta \le w + D - \mu^c_a$.$^{16}$
Finally, we show that $\mu^c_a < u^c(z'_a, u_a)$. We have selected β such that $\beta < b_a$.$^{17}$
When $\beta < b_a$, the construction of $\tau^{\Delta^*}$ implies that $z^{\Delta^*}_a = z_a$, where $z^{\Delta^*}_a$ is
the equilibrium bundle of a under scheme $(\tau^{\Delta^*}, D - \Delta^*)$. Since $\tau' = \tau^{\Delta^*}$ but
$D' > D - \Delta^*$, bundle $z_a$ lies in the interior of$^{18}$
\[ B^{\tau'}(w + D', 0), \]
which is a's budget set under $(\tau', D')$, as illustrated in Figure 7. This implies
that $u_a(z_a) < u_a(z'_a)$, hence $\mu^c_a < u^c(z'_a, u_a)$.
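The comparison for individual s in Step 3 rests on a short inequality chain. The sketch below assumes our reading of the notation ($D' = D - \Delta^* + \beta\lambda_s/2$ for the new demogrant, $\mu^c_a$ for the well-being index of a under $z$) and the `amsmath` package.

```latex
\begin{align*}
D' - (D - \beta)
  &= (\beta - \Delta^*) + \frac{\beta\lambda_s}{2} > 0
     && \text{since } \Delta^* < \beta \text{ and } \lambda_s > 0, \\
u^c(z'_s, u_s) = w + D'
  &> w + D - \beta \;\ge\; \mu^c_a
     && \text{since } \beta \le w + D - \mu^c_a .
\end{align*}
```

Only the upper bound $\Delta^* < \beta$ is used here; the lower bound $\Delta^* > \beta/2$ is what Step 2 requires for sustainability.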


CASE 2: there exists a bequest amount $b^*$ with $0 < b^* < \min(b_a,\ w + D - \mu^c_a)$
such that for all $J \subseteq A$ with $\mu(J) > 0$ we have $b_j \ge b^*$ for all $j \in J$.
  16 We have $w + D' > w + D - \beta$ because $D' = D - \Delta^* + \frac{\beta\lambda_s}{2}$ and $\Delta^* \in (\beta/2, \beta)$ and
$\lambda_s \in (0, 1)$.
  17 If τ is monotonically increasing, then b = 0 and this case is such that $\beta < b_a$. If τ is
positive peak, then b > 0 and this case is such that β = b and $b_a > b$.
  18 As $D' > D - \Delta^*$, all bundles in $B^{\tau^{\Delta^*}}(w + D - \Delta^*, 0)$ lie in the interior of $B^{\tau'}(w + D', 0)$.


    Step 1. We construct $(\tau', D')$ from a particular tax-demogrant scheme
$(\tau'', D'')$. Take $D'' = D - b^*/2$. The tax $\tau''$ is constructed from τ by linearly
truncating the budget set $B^{\tau}(w + D + g_i, 0)$ for bequests smaller than $b^*$
(as illustrated in Figure 9). Formally, we define $\tau''$ as
\[
\tau''(x) :=
\begin{cases}
\tau\!\left(x + \frac{b^*}{2}\right) - \frac{b^*}{2} & \text{for all } x \ge b^*/2, \\[4pt]
\dfrac{\tau(b^*) - b^*/2}{b^*/2}\; x & \text{for all } x \in [0, b^*/2].
\end{cases}
\]
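As a quick consistency check on this piecewise definition, the two branches can be compared at the kink. The sketch below writes $\tau''$ for the truncated tax (the double prime is our reading of the notation) and assumes `amsmath`.

```latex
\begin{align*}
\text{upper branch at } x = \tfrac{b^*}{2}: \quad
  & \tau\!\left(\tfrac{b^*}{2} + \tfrac{b^*}{2}\right) - \tfrac{b^*}{2}
    = \tau(b^*) - \tfrac{b^*}{2}, \\
\text{lower branch at } x = \tfrac{b^*}{2}: \quad
  & \frac{\tau(b^*) - b^*/2}{b^*/2} \cdot \tfrac{b^*}{2}
    = \tau(b^*) - \tfrac{b^*}{2}, \\
\text{lower branch at } x = 0: \quad
  & \tau''(0) = 0 .
\end{align*}
```

So $\tau''$ is continuous at $b^*/2$, levies nothing on a zero bequest, and is linear on $[0, b^*/2]$, which is what "linearly truncating" the budget set amounts to.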




[Figure 9: plot of inheritance left $h_i$ against consumption $c_i$, drawn for b = 0 (τ has no subsidy). It shows the budget sets $B^{LF}((0, w + D, 0))$, $B^{LF}((0, u^c_a, 0))$, $B^{\tau}$ and $B^{\tau''}$, the bundles $z_a$ and $z_s$, and the point $(w + D - b^*,\ Rb^* - R\tau(b^*))$.]

    Figure 9: The tax-demogrant scheme $(\tau, D)$ is dominated because the
    tax function τ taxes small bequests. Individual a is the worst-off because
    $u^c(z_a, u_a) = u^c_a$. The sustainable scheme $(\tau'', D'')$ has a smaller
    demogrant, does not affect $u^c(z_a, u_a)$ and leaves money on the table.


  The particularity of scheme $(\tau'', D'')$ is that any individual $i \in [0, 1]$ for
whom $b_i \ge b^*$ chooses the same bundle under both $(\tau'', D'')$ and $(\tau, D)$, i.e.

\[ \arg\max_{\tilde z_i \in B^{\tau''}(w + D'' + g_i,\, 0)} u_i(\tilde z_i) \;=\; \arg\max_{\tilde z_i \in B^{\tau}(w + D + g_i,\, 0)} u_i(\tilde z_i). \]

Let $z'' = (g''_i, c''_i, h''_i)_{i\in[0,1]} \in S$ denote the long-run equilibrium allocation
associated to $(\tau'', D'')$ and let $(b''_i)_{i\in[0,1]}$ be the long-run equilibrium profile of
bequests left, i.e. $b''_i = w + D'' + g''_i - c''_i$ for all $i \in [0, 1]$. Case 2 is such that
the profile $(\hat h''_i)_{i\in[0,1]} = (\hat h_i)_{i\in[0,1]}$ because for all $J \subseteq A$ with $\mu(J) > 0$ we have
$b_j \ge b^*$ for all $j \in J$. This implies for all $i \in (\lambda_s, 1]$ that $\hat b''_i = \hat b_i - b^*/2$ and thus
$\tau''(\hat b''_i) = \tau(\hat b_i) - b^*/2$. As $(\tau, D)$ is sustainable, we have that $(\tau'', D'')$ leaves on
the table an amount at least $\frac{\lambda_s b^*}{2}$, i.e.

\[ \frac{\lambda_s b^*}{2} \;\le\; \int_{i\in[0,1]} \bigl(\tau''(b''_i) - D''\bigr)\, di. \]
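The budget surplus of at least $\lambda_s b^*/2$ can be traced step by step. The sketch below rests on our reading of the notation (double primes for the truncated scheme), on $h = Rb - R\tau(b)$ and $c = w + D + g - b$ as used elsewhere in the proof, on $g''_i = g_i$ because the inheritance profiles coincide, and on self-centered individuals leaving no bequest and paying no tax; `amsmath` is assumed.

```latex
\begin{align*}
c''_i &= w + D'' + g_i - \hat b''_i
       = w + D - \tfrac{b^*}{2} + g_i - \hat b_i + \tfrac{b^*}{2} = c_i, \\
h''_i &= R\hat b''_i - R\tau''(\hat b''_i)
       = R\bigl(\hat b_i - \tfrac{b^*}{2}\bigr) - R\bigl(\tau(\hat b_i) - \tfrac{b^*}{2}\bigr) = h_i, \\
\int_{i\in[0,1]} \bigl(\tau''(\hat b''_i) - D''\bigr)\,di
      &= \int_{i\in(\lambda_s,1]} \bigl(\tau(\hat b_i) - \tfrac{b^*}{2}\bigr)\,di - D + \tfrac{b^*}{2} \\
      &= \int_{i\in[0,1]} \bigl(\tau(\hat b_i) - D\bigr)\,di + \frac{\lambda_s b^*}{2}
       \;\ge\; \frac{\lambda_s b^*}{2}.
\end{align*}
```

The first two lines hold for altruistic individuals, $i \in (\lambda_s, 1]$, and show that their bundles are unchanged. In the last two lines, each altruistic individual pays $b^*/2$ less tax while everyone receives $b^*/2$ less demogrant, so the net saving comes from the mass $\lambda_s$ of self-centered individuals.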


If τ is positive peak, then by assumption A3 there exists a sustainable scheme
$(\tau', D')$ with $\tau' = \tau''$ and $D' > D''$. If τ is monotonically increasing, then A3 is
not assumed and we define scheme $(\tau', D')$ as

\[ (\tau', D') = \left(\tau'',\ D'' + \frac{b^*\lambda_s}{4}\right). \]

    Step 2. We show that $(\tau', D')$ is sustainable if $(\tau, D)$ is sustainable. We have
already shown it using A3 in Step 1 in the case for which τ is positive peak.
In the case for which τ is monotonically increasing, a simplified version of
the argument used in Step 2 of Case 1 shows that $(\tau', D')$ is sustainable. The
argument can be simplified because all altruistic individuals leave a bequest
larger than $b^*$, implying that the partition of A as $A = A^1 \cup A^2 \cup A^3$ is such
that $A^2 = A^3 = \emptyset$. We do not repeat this argument.
    Step 3. The long-run equilibrium allocation $z'$ associated to $(\tau', D')$ is
preferred by $R^{c\text{-}lex}$ to allocation $z$. The argument is the same as the argument
used in Step 3 of Case 1. We do not repeat this argument. This concludes the
proof of Lemma 2.


First, we prove claim (i) of Proposition 2. Assume to the contrary that τ is
monotonically increasing but τ does not provide an exemption up to $b^{LF}_a(w)$,
i.e. $\tau(b^{LF}_a(w)) > 0$. Under this contradiction assumption, we show that scheme
(τ, D) is not optimal whatever the value of D. As any scheme (τ, D) with D < 0
is not optimal (Proposition 1), we consider any (τ, D) with D ≥ 0.
     As τ is monotonically increasing, we have b = 0 and either τ (ba ) > 0 or
τ (ba ) = 0. (Recall that ba denotes the equilibrium bequest left by individual
a with ga = 0 and ua = ua .) If τ (ba ) > 0, then Eq. (3) is violated and
(τ, D) is not optimal by Proposition 2 (ii), whose proof is given below. So as-
sume that τ (ba ) = 0, which implies that τ (x) = 0 for all x ∈ [0, ba ] because
τ is monotonically increasing. If $b_a \ge b^{LF}_a(w)$, then we have $\tau(b_a) > 0$
because $\tau(b^{LF}_a(w)) > 0$ and τ is monotonically increasing, a contradiction to our
assumption that $\tau(b_a) = 0$.
     There remains the case for which $b_a < b^{LF}_a(w)$ and $\tau(b_a) = 0$. First, we show
that any optimal (τ, D) has ba > 0. If ba = 0 under (τ, D) with D ≥ 0, then
we can show that the economy converges to a long-run equilibrium allocation
for which all inheritances are zero, i.e. D = 0. Indeed, if ba = 0 under a
scheme (τ, D) with D ≥ 0, the binormality of preferences implies that a leaves
no bequest under scheme (τ, 0). Therefore, any dynasty i with a member $i_t$
such that $u_{i_t} = u_s$ leaves no bequest for all $i_{t'}$ with $t' \ge t$. As there is in each
generation a mass $\lambda_s$ of individuals $i \in [0, 1]$ with $u_i = u_s$, and preferences are
drawn at random in each generation, all dynasties have a member $i_t$ such that
$u_{i_t} = u_s$ for some $t \le t'$ when $t'$ is sufficiently large. Therefore all inheritances
are zero in the long-run equilibrium allocation, which implies that the largest
sustainable demogrant is D = 0 when ba = 0. We show that (τ, D = 0) is
dominated by Laissez-Faire when ba = 0. Under Laissez-Faire, the sustainable
demogrant is also zero. The equilibrium bundle of a under (τ, D = 0) is $z_a = (0, w, 0)$
because $b_a = 0$, whereas it is $z^{LF}_a = (0,\ w - b^{LF}_a(w),\ R\,b^{LF}_a(w))$ with
$b^{LF}_a(w) > 0$ under Laissez-Faire. As illustrated in Figure 10.a, we have
$u_a(z_a) < u_a(z^{LF}_a)$ because $z_a \in B^{LF}((0, w, 0))$ but
\[ z_a \ne z^{LF}_a = \arg\max_{\tilde z_a \in B^{LF}((0, w, 0))} u_a(\tilde z_a). \]

Under Laissez-Faire, any individual $i \in [0, 1]$ allocates her lifetime resources
freely in the laissez-faire budget $B^{LF}((0, w + g_i, 0))$, thus we have $u^c(z^{LF}_i, u_i) = w + g_i$.
This shows that $u^c(z^{LF}_i, u_i) \ge u^c(z^{LF}_a, u_a)$ for all $i \in [0, 1]$ because
$u^c(z^{LF}_a, u_a) = w$ as $g_a = 0$. Now, since $u_a(z^{LF}_a) > u_a(z_a)$, we have
$u^c(z^{LF}_a, u_a) > u^c(z_a, u_a)$, implying that $u^c(z^{LF}_i, u_i) > u^c(z_a, u_a)$ for all $i \in [0, 1]$. By A1, there
is a positive mass of individuals a with ga = 0 and ua = ua , showing that (τ, D = 0) is
dominated by Laissez-Faire, i.e. not optimal.
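The domination argument reduces to a single chain of comparisons of the well-being index $u^c$. The following sketch uses only relations stated in the paragraph above; $z^{LF}$ denotes the Laissez-Faire allocation and `amsmath` is assumed.

```latex
\begin{align*}
u^c(z^{LF}_i, u_i) = w + g_i
  \;\ge\; w
  \;=\; u^c(z^{LF}_a, u_a)
  \;>\; u^c(z_a, u_a)
  \qquad \text{for all } i \in [0,1].
\end{align*}
```

The weak inequality uses $g_i \ge 0$, the equality uses $g_a = 0$, and the strict inequality uses $u_a(z^{LF}_a) > u_a(z_a)$. Every individual under Laissez-Faire is therefore strictly better off, in terms of $u^c$, than the positive mass of individuals a under $(\tau, D = 0)$.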

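[Figure 10: two panels, (a) and (b), each plotting inheritance left $h_i$ against consumption $c_i$. Panel (a) shows $B^{LF}((0, w, 0))$, $B^{LF}((0, u^c(z_a, u_a), 0))$ and $B^{\tau}$ with bundles $z^{LF}_a$ and $z_a = z_s$; panel (b) shows $B^{LF}((0, w + D, 0))$, $B^{LF}((0, u^c(z_a, u_a), 0))$ and $B^{\tau}$ with bundles $\hat z_a$, $z_a$ and $z_s$.]

Figure 10: (a) (τ, D = 0) is dominated by Laissez-Faire when $b_a = 0$. (b) $u^c(z_a, u_a) < w + D$
when $0 < b_a < b^{LF}_a(w)$.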



    Now, for the case $0 < b_a < b^{LF}_a(w)$, the binormality of preferences implies
that $b^{LF}_a(w) \le b^{LF}_a(w + D)$ and thus $b_a < b^{LF}_a(w + D)$. In words, a would
increase her bequest if the exemption proposed by τ was larger. As illustrated
in Figure 10.b, because $z_a \in B^{LF}((0, w + D, 0))$ but
\[ z_a \ne \hat z_a = \arg\max_{\tilde z_a \in B^{LF}((0, w + D, 0))} u_a(\tilde z_a), \]
we have $u_a(z_a) < u_a(\hat z_a)$.
   As $u^c(\hat z_a, u_a) = w + D$, this implies that $u^c(z_a, u_a) < w + D$. As b = 0 and
$b_a > 0$, we have $b_a > b$, and Lemma 2 implies that (τ, D) is not optimal.

Second, we prove claim (ii) of Proposition 2. We show that, if τ (ba ) > b, then
scheme (τ, D) is not optimal whatever the value of D. As any scheme (τ, D)
with D < 0 is not optimal (Proposition 1), we consider any (τ, D) with D ≥ 0.
As any sustainable (τ, D) for which D < Dmax is dominated by (τ, Dmax ), and
thus not optimal, we can focus on D = Dmax . We show that the preconditions
for Lemma 2 are all met, which implies that (τ, D) is not optimal.
    By deﬁnition of b we have τ (b) = 0. Since τ (ba ) > 0, we have ba > b because
−τ is single-peaked, implying that τ is monotonically increasing in x for all
x ≥ b.


    Any individual a with ga = 0 and ua = ua chooses the equilibrium bundle
za = (0, w + D − ba , Rba − Rτ (ba )) in the budget set B τ (w + D, 0). Bundle
za is on the frontier of the Laissez-Faire budget set B LF ((0, w + D − τ (ba ), 0)),
which implies that uc (za , ua ) ≤ w + D − τ (ba ). As τ (ba ) > b, this implies that
uc (za , ua ) < w + D − b.
    Together, we have $D \ge 0$ and we have shown $\mu^c_a < w + D - b$ and $b_a > b$.
Therefore, Lemma 2 implies that (τ, D) is not optimal.

6.3     Proof of Proposition 3
The following axiom is the well-known Separability axiom, according to which
agents who are assigned identical bundles in two allocations should not matter
for the social ranking between these two allocations. The idea that they should
not matter is captured by the requirement that the social ranking remain the
same if the preferences and bundles assigned to these agents change in such
a way that the bundles assigned to these agents remain identical in the two
allocations.
Axiom 4 (Separability).
For all economies $u, u' \in U$, steady-state allocations $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' =
(g'_i, c'_i, h'_i)_{i \in [0,1]}$, $z'' = (g''_i, c''_i, h''_i)_{i \in [0,1]}$, $z''' = (g'''_i, c'''_i, h'''_i)_{i \in [0,1]} \in S$, and subsets
of individuals $J \in M[0, 1]$, if
    • for all $j \in J$: $z_j = z'_j$ and $z''_j = z'''_j$,
    • for all $j \in [0, 1] \setminus J$: $u_j = u'_j$, $z_j = z''_j$ and $z'_j = z'''_j$,
then $z \, R(u) \, z'$ if and only if $z'' \, R(u') \, z'''$.
    The bite of this axiom is that it allows us to modify the economy in such a
way that sets of agents of positive measure have the same preferences, which is
unlikely in a generic economy, whereas it is crucial to allow us to use Compen-
sation (see Step 1 in the proof below). We now state and prove the following
result, which justifies using SWF $R^{c\text{-}lex}$.
Proposition 3. If a SWF ($R$) satisfies axioms Pareto, Compensation, Responsibility
and Separability, then for all $u \in U$, $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' =
(g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$, if there exists $J \in M[0, 1]$ such that $\mu(J) > 0$ and

$$\sup_{j \in J} u^c(z_j, u_j) < \inf_{i \in [0,1]} u^c(z'_i, u_i)$$

then

$$z' \, P(u) \, z.$$

Proof. This proof is reminiscent of similar proofs developed in models of labor
income taxation in Fleurbaey and Maniquet (2006) and Fleurbaey and Maniquet
(2007). There are two main differences. First, we deal here with economies with
a continuum of agents, which makes some arguments longer. Second, all agents
face the same prices (that is, the price of h is equal to R), which allows us to
simplify the proof.
    The proof is divided into three steps. In the first step, we show that the
combination of the four axioms implies a strengthening of the Responsibility
axiom in which inequality aversion is infinite. In the second step, we show that
the infinite inequality aversion is extended to $u^c(z_i, u_i)$. In the final step, we
show that this allows us to derive the desired property.
   Step 1. We begin by defining the following strengthening of Responsibility.
Axiom 5 (Responsibility*).
For all economies $u \in U$, steady-state allocations $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' =
(g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$, subsets of individuals $J, K \in M[0, 1]$ such that $\mu(J) =
\mu(K) > 0$, if there exist $\delta, \Delta > 0$ such that for all $j, q \in J$ and $k, \ell \in K$,
    • $z_i \in \max|_{u_i} B^{LF}(z_i)$, $\forall i \in \{j, q, k, \ell\}$,

    • $z'_i \in \max|_{u_i} B^{LF}(z'_i)$, $\forall i \in \{j, q, k, \ell\}$,
    • $y_j + \delta = y_q + \delta = y'_j = y'_q \le y'_k = y'_\ell = y_k - \Delta = y_\ell - \Delta$,
where
$$y_i = c_i + \frac{h_i}{R}, \quad y'_i = c'_i + \frac{h'_i}{R}, \quad \forall i \in \{j, q, k, \ell\},$$
and $z_i = z'_i$ for all $i \notin J \cup K$, then $z' \, P(u) \, z$.
    With Responsibility*, we require strict social preference as soon as all agents
in J gain, even if their budget gain is arbitrarily small and the budget loss of
members of K is arbitrarily large. This is why Responsibility*, contrary to
Responsibility, conveys an inﬁnite inequality aversion.
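The budget condition of Responsibility* can be made concrete with a small check. The sketch below is an assumption-laden illustration: the function name and the numbers are hypothetical, and it reads the condition as "each agent in J gains $\delta > 0$, each agent in K loses $\Delta > 0$, and after the change the budget of J does not exceed that of K".

```python
def responsibility_star_transfer(y_j, y_j_new, y_k, y_k_new):
    """Return (delta, Delta) if the budget change fits the condition, else None.

    y_j, y_k: budgets c + h / R before the change (agents in J and K).
    y_j_new, y_k_new: budgets after the change.
    """
    delta = y_j_new - y_j   # gain of each agent in J
    Delta = y_k - y_k_new   # loss of each agent in K
    if delta > 0 and Delta > 0 and y_j_new <= y_k_new:
        return delta, Delta
    return None

# A tiny gain for J paid for by a very large loss for K still qualifies,
# which is the sense in which the axiom carries infinite inequality aversion.
small_gain_big_loss = responsibility_star_transfer(1.0, 1.01, 100.0, 2.0)

# A change that would make J richer than K does not qualify.
rank_reversal = responsibility_star_transfer(1.0, 3.0, 100.0, 2.0)

print(small_gain_big_loss, rank_reversal)
```

Note that the check places no bound on the ratio between the loss and the gain; only their signs and the final ranking of budgets matter.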
    We prove the following claim: If a SWF ($R$) satisfies Pareto, Compensation,
Responsibility and Separability, then it satisfies Responsibility*.
    Let $u \in U$, $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$, and $J, K \in M[0, 1]$
satisfy the conditions of Responsibility*. Let us assume, contrary to the claim,
that $z \, R(u) \, z'$. We can assume, without loss of generality, that $\mu(J) = \mu(K) \le \frac{1}{4}$.
If it is not the case, then the claim is proven by repeating this proof twice.
Let $y_j, y'_j, y_k, y'_k \in \mathbb{R}_+$ be defined by

$$y_i = c_i + \frac{h_i}{R}, \quad y'_i = c'_i + \frac{h'_i}{R}, \quad \forall i \in \{j, k\},$$

so that $y_j < y'_j \le y'_k < y_k$. By Pareto, we can assume, w.l.o.g., that $y'_j < y'_k$.
Indeed, if it is not the case, then we can create $z'' = (g''_i, c''_i, h''_i)_{i \in [0,1]} \in S$ by
replacing $z'_k = (g'_k, c'_k, h'_k)$ with $z''_k = (g''_k, c''_k, h''_k)$ such that

$$y'_k < y''_k = c''_k + \frac{h''_k}{R} < y_k$$
for all k ∈ K , so that z P(u) z and, by transitivity, z P(u) z and continue
the proof. So, we assume yj < yk . We can even assume, w.l.o.g., that

$$(y'_j - y_j) + (y_k - y'_k) < \frac{y_k - y_j}{2}.$$
Indeed, if it is not the case, then the claim is proven by repeating this proof the
required number of times.19
  19 The fact that $y_k > y_j$ always allows us to construct sets $A$ and $B$ with $\bar{y}'_a < \bar{y}_a = \bar{y}_b < \bar{y}'_b$,
and the proof below has to be replicated a finite number of times at least as large as
$\frac{y'_j - y_j}{\bar{y}_a - \bar{y}'_a}$.



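    Let $u^* \in U$ and $\bar{z}_a = (\bar{g}_a, \bar{c}_a, \bar{h}_a)$, $\bar{z}'_a = (\bar{g}'_a, \bar{c}'_a, \bar{h}'_a)$, $\bar{z}''_a = (\bar{g}''_a, \bar{c}''_a, \bar{h}''_a)$,
$\bar{z}'''_a = (\bar{g}'''_a, \bar{c}'''_a, \bar{h}'''_a)$, $\bar{z}_b = (\bar{g}_b, \bar{c}_b, \bar{h}_b)$, $\bar{z}'_b = (\bar{g}'_b, \bar{c}'_b, \bar{h}'_b)$, $\bar{z}''_b = (\bar{g}''_b, \bar{c}''_b, \bar{h}''_b)$,
$\bar{z}'''_b = (\bar{g}'''_b, \bar{c}'''_b, \bar{h}'''_b) \in X$, $\bar{y}_a, \bar{y}'_a, \bar{y}_b, \bar{y}'_b \in \mathbb{R}$ be such that
$$
\begin{aligned}
&\bar{y}_i = \bar{c}_i + \frac{\bar{h}_i}{R}, \quad \bar{y}'_i = \bar{c}'_i + \frac{\bar{h}'_i}{R}, \quad \forall i \in \{a, b\}, \\
&\bar{y}_a = \bar{y}_b, \\
&\bar{y}_a - \bar{y}'_a = y'_j - y_j, \\
&\bar{y}'_b - \bar{y}_b = y_k - y'_k, \\
&y'_j \le \bar{y}'_a, \\
&\bar{y}'_b \le y'_k, \\
&\bar{z}_a \in \max|_{u^*} B^{LF}(\bar{z}_a), \quad \bar{z}'_a \in \max|_{u^*} B^{LF}(\bar{z}'_a), \\
&\bar{z}_b \in \max|_{u^*} B^{LF}(\bar{z}_b), \quad \bar{z}'_b \in \max|_{u^*} B^{LF}(\bar{z}'_b), \\
&\bar{c}''_a < \bar{c}''_b, \quad \bar{h}''_a < \bar{h}''_b, \\
&(\bar{c}'''_a, \bar{h}'''_a) = (\bar{c}'''_b, \bar{h}'''_b) = \frac{(\bar{c}''_b, \bar{h}''_b) + (\bar{c}''_a, \bar{h}''_a)}{2}, \\
&u^*(\bar{z}_a) = u^*(\bar{z}'''_a), \\
&u^*(\bar{z}'_a) = u^*(\bar{z}''_a), \\
&u^*(\bar{z}'_b) = u^*(\bar{z}''_b).
\end{aligned}
$$
The construction of $u^*$, $\bar{z}_a$, $\bar{z}'_a$, $\bar{z}''_a$, $\bar{z}'''_a$, $\bar{z}_b$, $\bar{z}'_b$, $\bar{z}''_b$, $\bar{z}'''_b$, $\bar{y}_a$, $\bar{y}'_a$, $\bar{y}_b$, $\bar{y}'_b$ is illustrated in
Fig. 11.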
    The intuition of the proof and the role of the axioms can be illustrated with
the figure. We need to prove that a budget increase of an amount $\delta$ for agents
$j$ and $q$, at the expense of a budget decrease of an amount $\Delta$ for agents $k$ and
$\ell$, with $\Delta$ possibly much larger than $\delta$, is a social improvement. Separability
allows us to modify the preferences and bundles of a sufficiently large number
of agents and insert agents of type $a$ and $b$ in the economy. The design of
their preferences is key: they are indifferent between a transfer of resources,
represented by bundles $\bar{z}''_i$ and $\bar{z}'''_i$, $i \in \{a, b\}$, in which the beneficiary gets an
amount equal to that left by the contributor, and a transfer of resources in which
the beneficiary gets a different amount of resources, possibly much smaller, than
the one lost by the contributor, represented by bundles $\bar{z}_i$ and $\bar{z}'_i$, $i \in \{a, b\}$.
Pareto forces us to be indifferent between these two sets of transfers. Transfers
are calibrated in such a way that a sequence of transfers between agents $j$ and
$a$ (using Responsibility), agents $a$ and $b$ (using Compensation) and then agents
$b$ and $k$ (using Responsibility) allows us to reach the desired conclusion.
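The calibration of the auxiliary budgets can be sketched numerically. The code below is only one admissible choice under stated assumptions: the function name, the particular placement of the common budget of types a and b, and the "enough room" check are this sketch's own, not the paper's construction. It takes the budgets of J and K before and after the change and returns budgets for a and b such that a gives up exactly what J gains, b receives exactly what K loses, and the auxiliary budgets stay between those of J and K.

```python
def auxiliary_budgets(y_j, y_j_new, y_k, y_k_new):
    """Pick budgets for the auxiliary types a and b (one admissible choice).

    Returns (y_common, y_a_new, y_b_new), where a and b both start at y_common,
    a ends at y_a_new = y_common - delta and b ends at y_b_new = y_common + Delta.
    """
    delta = y_j_new - y_j   # what each agent in J gains
    Delta = y_k - y_k_new   # what each agent in K loses
    if delta + Delta > y_k_new - y_j_new:
        # Sketch-level check: the gap between the two groups is too small
        # to fit both auxiliary moves in a single round.
        raise ValueError("not enough room: split the transfer into smaller steps")
    y_common = y_j_new + delta      # common starting budget of a and b
    y_a_new = y_common - delta      # a gives up delta, landing at J's new budget
    y_b_new = y_common + Delta      # b receives Delta, staying below K's new budget
    return y_common, y_a_new, y_b_new

# Illustrative numbers: J moves from 1 to 2, K moves from 10 to 7.
y_common, y_a_new, y_b_new = auxiliary_budgets(1.0, 2.0, 10.0, 7.0)
print(y_common, y_a_new, y_b_new)
```

With these numbers the three Responsibility-type comparisons line up: J's new budget does not exceed a's new budget, and b's new budget does not exceed K's new budget.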
    The following axiom, known in the literature as Pareto indifference, is a
well-known consequence of Pareto.
Axiom 6 (Pareto Indifference).
For all economies $u \in U$ and steady-state allocations $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' =
(g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$, if for all $i \in [0, 1]$
$$u_i(c_i, h_i) = u_i(c'_i, h'_i)$$
then $z \, I(u) \, z'$.

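[Figure 11: in the $(c_i, h_i)$ plane, indifference curves labelled $u^*$ through the bundles $z_a = z_b$, $z'_a$, $z'_b$, $z''_a$, $z''_b$ and $z'''_a = z'''_b$, together with the bundles $z_j$, $z'_j$, $z_q$, $z'_q$, $z_k$, $z'_k$, $z_\ell$, $z'_\ell$. On the horizontal axis the budgets appear in the order $y_j$, $y'_j$, $y'_a$, $y_a = y_b$, $y'_b$, $y'_k$, $y_k$, with gaps marked $\delta$, $\delta$, $\Delta$, $\Delta$.]

    Figure 11: Construction of $u^*$, $\bar{z}_a$, $\bar{z}'_a$, $\bar{z}''_a$, $\bar{z}'''_a$, $\bar{z}_b$, $\bar{z}'_b$, $\bar{z}''_b$, $\bar{z}'''_b$, $\bar{y}_a$, $\bar{y}'_a$, $\bar{y}_b$, $\bar{y}'_b$.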



    Let $A, B \in M[0, 1]$ be such that $\mu(A) = \mu(B) = \mu(J) = \mu(K)$ and $A$, $B$, $J$ and
$K$ are all disjoint. Since they are all disjoint, we have $(c_i, h_i) = (c'_i, h'_i)$ for all
$i \in A \cup B$. Let $u' \in U$ be defined by
$$
\begin{aligned}
u'_a &= u^*, && \forall a \in A, \\
u'_b &= u^*, && \forall b \in B, \\
u'_i &= u_i, && \forall i \in [0, 1] \setminus (A \cup B).
\end{aligned}
$$
Let allocations $z^1 = (g^1_i, c^1_i, h^1_i)_{i \in [0,1]}$, $z^2 = (g^2_i, c^2_i, h^2_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined
by
$$
\begin{aligned}
(c^1_a, h^1_a) = (c^2_a, h^2_a) &= (\bar{c}_a, \bar{h}_a), && \forall a \in A, \\
(c^1_b, h^1_b) = (c^2_b, h^2_b) &= (\bar{c}_b, \bar{h}_b), && \forall b \in B,
\end{aligned}
$$
which implies $(c^1_a, h^1_a) = (c^1_b, h^1_b)$, and by
$$
\begin{aligned}
(c^1_i, h^1_i) &= (c_i, h_i), && \forall i \in [0, 1] \setminus (A \cup B), \\
(c^2_i, h^2_i) &= (c'_i, h'_i), && \forall i \in [0, 1] \setminus (A \cup B),
\end{aligned}
$$
and $g^1_i$, $g^2_i$, $i \in [0, 1]$, are fixed so as to guarantee that $z^1, z^2 \in S$. By Separability,

$$z \, R(u) \, z' \Leftrightarrow z^1 \, R(u') \, z^2,$$

so that, by the premise of the argument, $z^1 \, R(u') \, z^2$.


    Let $z^3 = (g^3_i, c^3_i, h^3_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined by
$$
\begin{aligned}
(c^3_a, h^3_a) &= (\bar{c}'_a, \bar{h}'_a), && \forall a \in A, \\
(c^3_j, h^3_j) &= (c'_j, h'_j), && \forall j \in J, \\
(c^3_i, h^3_i) &= (c^1_i, h^1_i), && \forall i \in [0, 1] \setminus (A \cup J),
\end{aligned}
$$
and $g^3_i$, $i \in [0, 1]$, are fixed so as to guarantee that $z^3 \in S$. By Responsibility,
$z^3 \, P(u') \, z^1$, so that, by transitivity, $z^3 \, P(u') \, z^2$.
    Let $z^4 = (g^4_i, c^4_i, h^4_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined by
$$
\begin{aligned}
(c^4_b, h^4_b) &= (\bar{c}'_b, \bar{h}'_b), && \forall b \in B, \\
(c^4_k, h^4_k) &= (c'_k, h'_k), && \forall k \in K, \\
(c^4_i, h^4_i) &= (c^3_i, h^3_i), && \forall i \in [0, 1] \setminus (B \cup K),
\end{aligned}
$$
and $g^4_i$, $i \in [0, 1]$, are fixed so as to guarantee that $z^4 \in S$. By Responsibility,
$z^4 \, P(u') \, z^3$, so that, by transitivity, $z^4 \, P(u') \, z^2$.
    Let $z^5 = (g^5_i, c^5_i, h^5_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined by
$$
\begin{aligned}
(c^5_a, h^5_a) &= (\bar{c}''_a, \bar{h}''_a), && \forall a \in A, \\
(c^5_b, h^5_b) &= (\bar{c}''_b, \bar{h}''_b), && \forall b \in B, \\
(c^5_i, h^5_i) &= (c^4_i, h^4_i), && \forall i \in [0, 1] \setminus (A \cup B),
\end{aligned}
$$
and $g^5_i$, $i \in [0, 1]$, are fixed so as to guarantee that $z^5 \in S$. By Pareto Indifference,
$z^5 \, I(u') \, z^4$, so that, by transitivity, $z^5 \, P(u') \, z^2$.
    Let $z^6 = (g^6_i, c^6_i, h^6_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined by
$$
\begin{aligned}
(c^6_a, h^6_a) &= (\bar{c}'''_a, \bar{h}'''_a), && \forall a \in A, \\
(c^6_b, h^6_b) &= (\bar{c}'''_b, \bar{h}'''_b), && \forall b \in B, \\
(c^6_i, h^6_i) &= (c^5_i, h^5_i), && \forall i \in [0, 1] \setminus (A \cup B),
\end{aligned}
$$
and $g^6_i$, $i \in [0, 1]$, are fixed so as to guarantee that $z^6 \in S$. By Compensation,
$z^6 \, P(u') \, z^5$, so that, by transitivity, $z^6 \, P(u') \, z^2$.
    Let $z^7 = (g^7_i, c^7_i, h^7_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined by
$$
\begin{aligned}
(c^7_a, h^7_a) &= (\bar{c}_a, \bar{h}_a), && \forall a \in A, \\
(c^7_b, h^7_b) &= (\bar{c}_b, \bar{h}_b), && \forall b \in B, \\
(c^7_i, h^7_i) &= (c^6_i, h^6_i), && \forall i \in [0, 1] \setminus (A \cup B),
\end{aligned}
$$
and $g^7_i$, $i \in [0, 1]$, are fixed so as to guarantee that $z^7 \in S$. By Pareto Indifference,
$z^7 \, I(u') \, z^6$, so that, by transitivity, $z^7 \, P(u') \, z^2$.
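    Let $z^8 = (g^8_i, c^8_i, h^8_i)_{i \in [0,1]}$, $z^9 = (g^9_i, c^9_i, h^9_i)_{i \in [0,1]} \in \mathbb{R}_+^{3[0,1]}$ be defined by
$$
\begin{aligned}
z^8_a &= z_a, && \forall a \in A, \\
z^8_b &= z_b, && \forall b \in B, \\
z^8_i &= z^7_i, && \forall i \in [0, 1] \setminus (A \cup B),
\end{aligned}
$$
and
$$
\begin{aligned}
z^9_a &= z_a, && \forall a \in A, \\
z^9_b &= z_b, && \forall b \in B, \\
z^9_i &= z^2_i, && \forall i \in [0, 1] \setminus (A \cup B).
\end{aligned}
$$
By Separability,
$$z^8 \, R(u) \, z^9 \Leftrightarrow z^7 \, R(u') \, z^2,$$
so that, by transitivity, $z^8 \, P(u) \, z^9$. Finally, observe that $z^8 = z^9 = z'$, so that
$z' \, P(u) \, z'$, the desired contradiction.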
    Step 2. We now prove the following claim: If a SWF ($R$) satisfies axioms
Pareto, Compensation, Responsibility and Separability, then it satisfies
the following property, which amounts to requiring an infinite aversion towards
inequality in $u^c$: For all $u \in U$, $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$,
$J, K \in M[0, 1]$ such that $\mu(J) = \mu(K) > 0$, if for all $j \in J$ and $k \in K$,
    • $u^c(z_j, u_j) < u^c(z'_j, u_j) < u^c(z'_k, u_k) < u^c(z_k, u_k)$,

    • $\sup_{i \in J} u^c(z_i, u_i) < \inf_{i \in J} u^c(z'_i, u_i)$,
and $z_i = z'_i$ for all $i \notin J \cup K$, then $z' \, P(u) \, z$.
    Let $u \in U$, $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$, and $J, K \in M[0, 1]$
satisfy the conditions of this property. Let us assume, contrary to the claim,
that $z \, R(u) \, z'$.
    By Pareto Indifference, we can assume, without loss of generality, that

$$z_j \in \max|_{u_j} B^{LF}(z_j), \quad z'_j \in \max|_{u_j} B^{LF}(z'_j),$$
$$z_k \in \max|_{u_k} B^{LF}(z_k), \quad z'_k \in \max|_{u_k} B^{LF}(z'_k).$$

Indeed, if it is not the case, then, by Pareto Indifference, we can replace bundles
$z_j$, $z'_j$ and $z_k$, $z'_k$ by a bundle on the same indifference curve that is optimal in
the corresponding budget.
     By Pareto, we can assume that for all $j, q \in J$, $c_j + \frac{h_j}{R} = c_q + \frac{h_q}{R}$ and
for all $k, \ell \in K$, $c_k + \frac{h_k}{R} = c_\ell + \frac{h_\ell}{R}$. Indeed, if it is not the case, we can
replace each $z_j$ with $z''_j$ such that $u^c(z''_j, u_j) = \sup_{i \in J} u^c(z_i, u_i)$, each $z'_j$ with $z'''_j$
such that $u^c(z'''_j, u_j) = \inf_{i \in J} u^c(z'_i, u_i)$, each $z_k$ with $z''_k$ such that $u^c(z''_k, u_k) =
\sup_{i \in K} u^c(z_i, u_i)$, each $z'_k$ with $z'''_k$ such that $u^c(z'''_k, u_k) = \inf_{i \in K} u^c(z'_i, u_i)$. By
Pareto, $z'' \, P(u) \, z$ and $z' \, P(u) \, z'''$, so that $z'' \, P(u) \, z'''$, and the proof continues.
     By Responsibility*, $z''' \, P(u) \, z''$, so that, by transitivity, $z' \, P(u) \, z$, the desired
contradiction.
     Step 3. We now prove the claim presented in the statement of the Proposition.
Let $u \in U$, $z = (g_i, c_i, h_i)_{i \in [0,1]}$, $z' = (g'_i, c'_i, h'_i)_{i \in [0,1]} \in S$, $J \in M[0, 1]$ such
that $\mu(J) > 0$ and

$$\sup_{j \in J} u^c(z_j, u_j) < \inf_{i \in [0,1]} u^c(z'_i, u_i).$$

Let us assume, contrary to the claim, that $z \, R(u) \, z'$. Let $z'' = (g''_i, c''_i, h''_i)_{i \in [0,1]} \in
S$ be such that $u^c(z''_i, u_i) = u$ for all $i \in [0, 1]$ and

$$\sup_{j \in J} u^c(z_j, u_j) < u < \inf_{i \in [0,1]} u^c(z'_i, u_i).$$


Let $N \in \mathbb{N}$ be an integer such that $N \mu(J) > 1$. Let $J' \subseteq J$ be such that
$\mu(J') = \frac{1 - \mu(J)}{N}$. We can create a sequence $z^0, \dots, z^n, \dots, z^N$ such that $z^0_i = z_i$
for all $i \in [0, 1] \setminus (J \setminus J')$, $u^c(z^0_j, u_j) = u$ for all $j \in J \setminus J'$, $z^N = z''$ and for
each $n \in \{1, \dots, N\}$, there exists a set $K^n \in M[0, 1]$ such that $\mu(K^n) = \mu(J')$,
$\bigcup_{n \in \{1, \dots, N\}} K^n \cup J = [0, 1]$,
$$
\begin{aligned}
u^c(z^n_k, u_k) &= u, && \forall k \in K^n, \\
u^c(z^n_j, u_j) &= u^c(z^{n-1}_j, u_j) + \frac{1}{N}\left(u - u^c(z_j, u_j)\right), && \forall j \in J', \\
u^c(z^n_j, u_j) &= u, && \forall j \in J \setminus J', \\
z^n_i &= z^{n-1}_i, && \forall i \in [0, 1] \setminus (K^n \cup J).
\end{aligned}
$$
By Pareto, $z^0 \, P(u) \, z$. By the property proven in Step 2, $z^n \, P(u) \, z^{n-1}$. By
transitivity, $z'' \, P(u) \, z$. By Pareto, $z' \, P(u) \, z''$, so that $z' \, P(u) \, z$, the desired
contradiction.
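The incremental construction in Step 3 can be traced with a few lines of arithmetic. The sketch below is illustrative only: the function names and the numbers are hypothetical, and it tracks a single agent whose money-metric level is raised toward the common target by one N-th of the initial gap at each step.

```python
import math

def steps_needed(mu_J):
    """Smallest integer N with N * mu(J) > 1."""
    return math.floor(1 / mu_J) + 1

def welfare_path(uc_start, u_target, N):
    """Money-metric level of the agent after each of the N steps.

    Each step adds (u_target - uc_start) / N, so the last entry is u_target.
    """
    step = (u_target - uc_start) / N
    return [uc_start + n * step for n in range(N + 1)]

# Illustrative numbers: a group of measure 0.25 needs 5 rounds.
N = steps_needed(0.25)

# An agent starting at level 2 and ending at the common level 5 in 3 steps.
path = welfare_path(2.0, 5.0, 3)
print(N, path)
```

Each individual step is a comparison of the kind covered by the property of Step 2, and the chain of steps links the initial allocation to the equalized one.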



References
Atkinson, A. and Stiglitz, J. (1976). The design of tax structure: direct versus
  indirect taxation. Journal of Public Economics, 6(1):55–75.

Chamley, C. (1986). Optimal Taxation of Capital Income in General Equilibrium
  with Infinite Lives. Econometrica, 54(3):607–622.
Cremer, H. and Pestieau, P. (2006). Chapter 16 Wealth transfer taxation: a
  survey of the theoretical literature. Handbook of the Economics of Giving,
  Altruism and Reciprocity, 2(06):1107–1134.

Cremer, H., Pestieau, P., and Rochet, J.-C. (2001). Direct versus Indirect
  Taxation: The Design of the Tax Structure Revisited. International Economic
  Review, 42(3):781–799.
Cremer, H., Pestieau, P., and Rochet, J.-C. (2003). Capital income taxation
  when inherited wealth is not observable. Journal of Public Economics,
  87(11):2475–2490.
Farhi, E. and Werning, I. (2010). Progressive Estate Taxation. The Quarterly
  Journal of Economics, 125(2):635–673.
Fleurbaey, M., Leroux, M.-L., Pestieau, P., Ponthiere, G., and Zuber, S. (2018).
  Premature deaths, accidental bequests and fairness.
Fleurbaey, M. and Maniquet, F. (2006). Fair income tax. The Review of Eco-
  nomic Studies, 73(1):55–83.
Fleurbaey, M. and Maniquet, F. (2007). Help the low skilled or let the
  hardworking thrive? A study of fairness in optimal income taxation. Journal of
  Public Economic Theory, 9(3):467–500.
Fleurbaey, M. and Maniquet, F. (2018). Optimal Taxation Theory and Princi-
  ples of Fairness. Journal of Economic Literature, 56(3):1029–1079.
Golosov, M., Kocherlakota, N. R., and Tsyvinski, A. (2003). Indirect and Cap-
 ital Optimal Taxation. Review of Economic Studies, 70(3):569–587.


Judd, K. L. (1985). Redistributive taxation in a simple perfect foresight model.
  Journal of Public Economics, 28(1):59–83.

Kaplow, L. (1995). A note on subsidizing gifts. Journal of Public Economics,
 58(3):469–477.
Kaplow, L. (2001). A Framework for Assessing Estate and Gift Taxation. In
 Gale, W., Hines, J., and Slemrod, J., editors, Rethinking Estate and Gift
 Taxation. Brookings Institution Press, Washington, D.C.

Kocherlakota, N. R. (2005). Zero expected wealth taxes: A Mirrlees approach
 to dynamic optimal taxation. Econometrica, 73(5):1587–1621.
Kopczuk, W. (2013a). Incentive Effects of Inheritances and Optimal Estate
 Taxation. American Economic Review, 103(3).

Kopczuk, W. (2013b). Taxation of intergenerational transfers and wealth. In
 Handbook of public economics, volume 5, pages 329–390. Elsevier.
Lockwood, B. B. and Weinzierl, M. (2016). Positive and normative judgments
  implicit in U.S. tax policy, and the costs of unequal growth and recessions.
  Journal of Monetary Economics, 77:30–47.

McCaffery, E. (1994). The Uneasy Case for Wealth Transfer Taxation. Yale
 Law Journal, 104(2):283–365.
Mirrlees, J., Adam, S., Besley, T., Blundell, R., Bond, S., Chote, R., Gammie,
 M., Johnson, P., Myles, G., and Poterba, J. (2010). Reforming the tax system
 for the 21st century: The Mirrlees Review. Institute for Fiscal Studies.

Nordblom, K. and Ohlsson, H. (2006). Tax avoidance and intra-family transfers.
  Journal of Public Economics, 90(8-9):1669–1680.
OECD (2021). Inheritance Taxation in OECD Countries.
Piketty, T. and Saez, E. (2012). A theory of optimal capital taxation. Technical
  report, National Bureau of Economic Research.
Piketty, T. and Saez, E. (2013). A theory of optimal inheritance taxation.
  Econometrica, 81(5):1851–1886.
Saez, E. and Stantcheva, S. (2016). Generalized social marginal welfare weights
  for optimal tax theory. American Economic Review, 106(1):24–45.
Stantcheva, S. (2015). Optimal income, education, and bequest taxes in an
  intergenerational model. Technical report, National Bureau of Economic Re-
  search.
Straub, L. and Werning, I. (2020). Positive long-run capital taxation:
  Chamley-Judd revisited. American Economic Review, 110(1):86–119.
Wilhelm, M. O. (1996). Bequest behavior and the effect of heirs’ earnings:
 Testing the altruistic model of bequests. The American Economic Review,
 pages 874–892.


