Policy Research Working Paper 10978

Poverty Lines and Spatial Differences in the Cost of Living

Nicola Amendola, Federico Belotti, María Edo, Marco Ranzani, Giovanni Vecchi

Poverty and Equity Global Practice
November 2024

Abstract

This paper proposes a new method for estimating a full-coverage spatial price index using data typically available in household budget surveys. The food component of the index is estimated at the household level using reported expenditures and quantities, while the nonfood component is derived indirectly as a ratio among subnational poverty lines. The paper extends the analytical framework described in Deaton and Zaidi (2002) and discusses the advantages of this new methodology.

This paper is a product of the Poverty and Equity Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at mranzani@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team

Nicola Amendola (Tor Vergata University of Rome)
Federico Belotti (Tor Vergata University of Rome)
María Edo (Universidad de San Andrés)
Marco Ranzani (The World Bank)
Giovanni Vecchi (Tor Vergata University of Rome)

Keywords: cost-of-living differences, poverty lines, welfare analysis.
JEL codes: D63, I32, R20.

Acknowledgements: Corresponding Author: Marco Ranzani, mranzani@worldbank.org. The authors would like to thank Benoit Decerf and Nobuo Yoshida for their thorough review of the manuscript and for providing valuable and constructive suggestions. The authors thank Dean Jolliffe for his insightful comments.

1 Introduction

In April 2024, the average price of one gallon of fresh fortified milk in US cities was $3.868.[1] Consumers in the northeast paid, on average, $4.395, and consumers in the west purchased milk at an average price per gallon of about $0.60 less ($3.757). In the same month, the urban consumer index of rent on a primary residence was about 5.4 percent higher in the west than in the northeast.[2]

The existence of differences in the prices of consumption goods and services across space is well established. These variations are sometimes considerable between urban and rural areas, across regions, provinces or states, cities, and so on. The simple statistical evidence has, however, important effects on the purchasing power of money. The higher the prices, the lower the purchasing power of households.
Welfare economists, practitioners, statisticians, and analysts interested in living standards are aware that any meaningful comparison of material well-being among individuals living in different geographical areas requires some adjustment to transform a nominal welfare aggregate, be it consumption or income, into a metric that does not conceal spatial differences in the cost of living.[3] First, adjusting for spatial price differentials has important implications for monetary poverty and inequality. Examples of relevant statistics that are affected by spatial differences in prices include poverty headcount ratios, the depth of poverty, the number of the poor, poverty profiles, and indicators of inequality. Second, such statistics on living standards are key inputs to poverty reduction strategies. Adjustments for spatial price differences are therefore relevant to the design of policies aimed at helping people move out of poverty and to the allocation across ministries or across subnational state or provincial governments of resources derived from national budgets, international donors, or multilateral development banks.

Somewhat paradoxically, it is precisely low-income countries, in which the statistical information needed to measure price variability adequately is rarely available, that are characterized by a pronounced geographical variability in the cost of living (Gaddis 2016). In these countries, high transaction and transportation costs, together with other deviations from the ideal of perfect competition, lead to significant market segmentation. The data that are typically available allow for the construction of, at best, a spatial price index limited solely to food items (Mancini and Vecchi 2022). As a result, the geographical variation in prices, particularly in sectors in which markets are less well integrated, is overlooked.

This paper proposes a method to estimate a household-level spatial price index that goes beyond food to include all items consumed by households, that is, a full-coverage index. The index is relative to the expenditure aggregate constructed and used for poverty analysis. This method requires, at most, the availability of the data typically used to estimate national poverty and geographical poverty profiles. It is an indirect estimation method based primarily on the analytical relationship between multiple poverty lines, usually calculated at the regional level, and “true” cost-of-living indexes. Specifically, the food component of the index is estimated directly using information on unit values and expenditure shares that is commonly available in household surveys. By contrast, the nonfood component is estimated indirectly through the ratio of multiple poverty lines calculated at the subnational level, for instance, the regional level.

[1] See BLS Data Viewer (graphic), time series: Milk, fresh, whole, fortified, per gal. (3.8 lit), time period: 2024, series ID: APU0000709112.
[2] The base period of the index is 1982–84, and the index is not seasonally adjusted. For an overview, see Measuring Price Change in the CPI: Rent and Rental Equivalence (web page), US Bureau of Labor Statistics, Washington, DC, https://www.bls.gov/cpi/factsheets/owners-equivalent-rent-and-rent.htm.
[3] This is in addition to any required correction for differences in the prices of consumption goods and services over time (inflation) and differences in household size and composition (adjusted through the use of equivalence scales).
This results in a full-coverage household-level spatial price index that is easy to implement and interpret, and that can be expressed as follows:

$$P_h = \left[ w_h^{F} \left(P_h^{F}\right)^{-1} + \left(1 - w_h^{F}\right) \left(\tilde{P}^{NF}_{h(s)}\right)^{-1} \right]^{-1}$$

The index can be obtained as a weighted (harmonic) average of a household-level Paasche spatial food price index ($P_h^{F}$) and a stratum-level nonfood spatial price index ($\tilde{P}^{NF}_{h(s)}$), with the weight $w_h^{F}$ calculated as the household-level food budget share.

This paper is not the first to propose an indirect method for estimating spatial price variability. Thomas (1980) introduced a method for estimating the nonfood component of the price index based on household expenditures. That method, though, sets explicit restrictions on individual preferences that the method here does not require. Jolliffe, Datt and Sharma (2004) and Marivoet and De Herdt (2015) have developed a similar estimation method based on regional poverty lines, but given that it jointly addresses both the food and nonfood components, the minimum level of aggregation allowed is regional. The method here adopts a mixed strategy whereby only the nonfood component is estimated indirectly, allowing for the construction of a spatial index that takes full advantage of household-level heterogeneity. Almås (2012) proposes an alternative empirical strategy based on the relationship between food expenditure share and total expenditure, that is, the Engel curve. Although this approach does not require the direct observation of market prices, it relies on the highly problematic assumption of the stability of the Engel curve across regions. By contrast, the method here does not impose any restrictions on household behavior or preferences. Instead, it relies on the accuracy and consistency of the referencing used in constructing multiple poverty lines. It is therefore essential that regional poverty lines be utility consistent, that is, based on a common reference standard of living (Ravallion and Lokshin 2006).
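As a purely illustrative aid, the combination rule stated above can be sketched in a few lines of Python. The function name, argument names, and the numbers for the example household are hypothetical and are not taken from the paper; the sketch only assumes that the food index, the nonfood index, and the food budget share have already been estimated.

```python
def full_coverage_index(food_share, food_index, nonfood_index):
    """Weighted harmonic mean of a food and a nonfood spatial price index.

    food_share    : household food budget share, between 0 and 1
    food_index    : household-level Paasche food price index (positive)
    nonfood_index : stratum-level nonfood price index (positive)
    All names are illustrative assumptions, not the paper's notation.
    """
    if not 0.0 <= food_share <= 1.0:
        raise ValueError("food_share must lie in [0, 1]")
    if food_index <= 0 or nonfood_index <= 0:
        raise ValueError("price indexes must be positive")
    inverse = food_share / food_index + (1.0 - food_share) / nonfood_index
    return 1.0 / inverse


# Hypothetical household: 60 percent of the budget goes to food, food is
# 10 percent dearer than the national average, nonfood is 5 percent cheaper.
example = full_coverage_index(0.6, 1.10, 0.95)
print(round(example, 4))
```

Because the weights are household specific, two households in the same stratum facing the same nonfood index can still end up with different full-coverage deflators.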
The rest of this paper is structured as follows. Section 2 introduces a theoretical framework for spatial deflation based on ratios among poverty lines. Section 3 discusses the challenges involved in estimating multiple poverty lines versus estimating a single national poverty line and a real consumption aggregate. Section 4 introduces a full-coverage household-level spatial price index, together with two slightly different estimation methods. Section 5 concludes.

2 Spatial deflation, poverty lines, and poverty measurement: A theoretical framework

From a welfarist perspective (Ravallion 1998, 2016), the measurement of poverty and inequality relies on the assessment of individual well-being, typically derived from household-level consumption expenditure.[4] To ensure meaningful welfare comparisons across individuals, a form of price adjustment is needed to maintain consistency with the analytical microeconomic framework that underpins the definition of poverty concepts and measures of well-being (Deaton and Muellbauer 1980; Deaton and Zaidi 2002). This section aims to provide a theoretical framework that demonstrates the equivalence of two strategies for poverty measurement: one based on the spatial deflation of expenditures relative to a national poverty line and another relying on nominal expenditures using multiple spatial poverty lines under the condition of homothetic preferences. The price adjustment mechanism thus acts as a bridge between an empirical strategy that estimates poverty using a single poverty line and one that uses multiple lines.

Household expenditure can be interpreted as a money-metric utility function representing the minimum cost sustained by households to achieve a specific standard of living or, equivalently, a certain level of utility. Let $\mathbf{p}_h$ denote the market price vector faced by household $h$ that achieves utility level $u_h$.
The household’s well-being is then measured by an expenditure function $c(\mathbf{p}_h, u_h)$, that is, by the minimum cost to achieve the utility level $u_h$ which, keeping constant the price vector, is a utility function representing the household’s preferences.[5] Consider now two households, say $i$ and $j$, enjoying the same utility level $u$. In general, one may observe that $c(\mathbf{p}_i, u) \neq c(\mathbf{p}_j, u)$, indicating that households facing different market prices will spend different amounts of money to attain the identical utility level $u$. This is where price adjustments come into play. A strategy to compare household well-being through household expenditure consists in choosing a reference price vector $\mathbf{p}_0$, whereby, if $u_i = u_j = u$, then $c(\mathbf{p}_0, u_i) = c(\mathbf{p}_0, u_j) = c(\mathbf{p}_0, u)$ and vice versa.

A similar challenge arises in poverty measurement, where the standard of living of each household needs to be compared with a common minimum standard of living. Let $u_z$ represent the utility level that defines such a minimum standard of living. A household $h$ is deemed to be poor if and only if $u_h \le u_z$. To express this identification criterion in monetary terms, expenditure functions are used wherein the minimum expenditure to attain the standard of living $u_z$ is referred to as the poverty line. To maintain the poverty line constant across households, a reference price vector must be chosen. Household $h$ is poor if and only if the following is true:

$$c(\mathbf{p}_0, u_h) \le c(\mathbf{p}_0, u_z) \tag{1}$$

Using a reference price vector is analytically equivalent to applying a price adjustment factor or a price index to convert nominal expenditure into real terms. Alternatively, a theoretically equivalent strategy involves adopting multiple household-specific poverty lines.

[4] In this section, household and individual are used interchangeably, that is, it is assumed that the preferences of household members may be specified by the preferences of a representative household member.
These two strategies are inherently linked, establishing an analytical connection between multiple poverty lines and price deflators. The following sections make this relationship explicit.

Equation 1 makes clear that identifying poverty status through a single national level poverty line requires the establishment of a reference price system $\mathbf{p}_0$. The challenge lies in the fact that observed expenditures depend on the actual prices $\mathbf{p}_h$ faced by households rather than the reference price system used to define the poverty line $z = c(\mathbf{p}_0, u_z)$. Conceptually, there is a simple solution to this issue. Define a “true” cost-of-living index for household $h$ as follows:

$$P_h(u) = \frac{c(\mathbf{p}_h, u)}{c(\mathbf{p}_0, u)} \tag{2}$$

Equation 2 is the ratio between the minimum expenditure faced by household $h$ in achieving the utility level $u$ and the same minimum expenditure at the reference price vector $\mathbf{p}_0$. This allows equation 1 to be rewritten as follows:

$$c(\mathbf{p}_0, u_h) \le c(\mathbf{p}_0, u_z) \iff \frac{c(\mathbf{p}_h, u_h)}{P_h(u_h)} \le z \tag{3}$$

One problem with equation 3 is that the estimation of a true cost-of-living index represents a complex econometric challenge mainly because of the constraints posed by the available data (Ray 2018). Deaton and Zaidi (2002) suggest approximating $P_h(u)$ by employing a price index, such as a Paasche or Laspeyres index, provided that spatial price variability and substitution effects remain relatively low. Hence, if $x_h$ is the nominal expenditure of household $h$, the empirical counterpart of equation 3 may be expressed as follows:

$$\frac{x_h}{P_h} \le z \tag{4}$$

where $P_h$ denotes a household-level spatial price index. Equation 4 describes the standard practice for estimating poverty.

An alternative strategy consists in using multiple poverty lines and price-unadjusted household expenditures.

[5] The expenditure function depends on household preferences and should be indexed by $h$. To simplify the notation, this index is omitted here and reintroduced in section 4.1, where it becomes essential.
The idea, in this case, is to choose $\mathbf{p}_h$ as a reference price vector, which implies that poverty lines are now household specific and given by $c(\mathbf{p}_h, u_z) = z_h$. Household $h$ is considered poor if and only if the following is true:

$$c(\mathbf{p}_h, u_h) \le z_h \tag{5}$$

And the equivalent of equation 4 is as follows:

$$x_h \le z_h \tag{6}$$

Unlike in equation 4, in which poverty status is determined by a single scalar poverty line $z$ applied to the distribution of a household-level price-adjusted expenditure aggregate ($x_h / P_h$), poverty status in equation 6 is identified by applying household-level poverty lines $z_h$ (as many lines as the number of households) to price-unadjusted household expenditure ($x_h$).

The link between strategy 1 (real expenditure applied to a single poverty line) and strategy 2 (nominal expenditure applied to multiple poverty lines) may be established by rewriting equation 5 as follows:

$$\frac{c(\mathbf{p}_h, u_h)}{P_h(u_z)} \le z \tag{7}$$

where $P_h(u_z)$ is the true cost-of-living index for household $h$ calculated at utility level $u_z$. Equation 7 may be compared with equation 3. If the preferences are homothetic, the true cost-of-living index does not depend on $u$ and $P_h(u_z) = P_h(u)$. Equation 7 is therefore equivalent to equation 3. Thus, if preferences are homothetic, strategy 1 and strategy 2 lead to identical results (Blackorby and Donaldson 1988).

The central finding of this section, the equivalence of two strategies for setting poverty lines, holds within the outlined theoretical framework. However, the empirical application of the two strategies entails several deviations from the theoretical tenets, resulting in the nonequivalence of equations 4 and 6. The next section explores these deviations in detail.

3 From theory to practice: Challenges in estimating one or multiple poverty lines

Consider first the empirical strategy based on a single national poverty line.
In this case, as shown in equation 4, it is necessary to estimate a spatial price index that allows nominal expenditures to be transformed into real expenditures comparable with the single national poverty line. To enhance precision in defining the deflator, it is beneficial to introduce additional notation. Denote the spatial price index as $P(S, J, R)$, where the three arguments refer to (a) the aggregation level of the index (strata are indexed by $s = 1, \dots, S$, where $S$ represents the number of strata), (b) the coverage of the index ($J$ indicates the number of commodities and services considered), and (c) the choice of the underlying reference group (where $R$ shows the reference group of households used in computing the index). To illustrate, consider a Paasche price index whereby $S = H$ is the total number of households in the sample; $J = N$ is the total number of goods included in the consumption aggregate; and $H \supset R$, that is, the reference group is a subset of all households, for instance, households in the bottom two deciles of the expenditure distribution ($R = R_{1,2}$). It follows then that

$$P_h(H, N, R_{1,2}) = \left[ \sum_{j=1}^{N} w_{hj} \cdot \frac{E[p_{hj}]}{p_{hj}} \right]^{-1}$$

where $w_{hj}$ denotes the budget share of good $j$ for household $h$, $p_{hj}$ is the price of good $j$ faced by household $h$, and $E[p_{hj}]$ is the average price of good $j$ at the national level. Such an index has the lowest possible level of aggregation (the household level), provides full coverage in terms of consumption items, and is based on a reference group that includes the poorest 20 percent of households.

Different countries implement different choices.
Recent poverty assessments produced by the World Bank make use of a Paasche index for food items (for example, Indonesia, Kenya, and Mozambique), regional or provincial (and urban or rural) consumer price indexes (CPIs) (for instance, Ethiopia, Gabon, Ghana, and Viet Nam), a mix of Paasche indexes for food items and CPIs for nonfood items (such as Cambodia and the Lao People’s Democratic Republic), and, in some cases, the ratio between regional and national poverty lines (for example, Central African Republic, Chad, Niger, and Uganda). A common choice would be $P(PSU, F, R_{2,3})$, where $PSU$ denotes the number of primary sampling units (PSUs), which would be read as follows: a Paasche index calculated at the PSU level, covering food items only ($J = F$), and based on consumption patterns of households in the second and third deciles of the per capita expenditure distribution.

Consistently with the notation introduced for the spatial price index (SPI), the national level poverty line becomes $z(J, R)$ to emphasize that it depends on both coverage $J$ and reference group $R$. Accordingly, equation 4 can be rewritten as:

$$\frac{x_h}{P(S, J, R)} \le z(J, R) \tag{8}$$

In equation 8, households are identified as poor based on a single poverty line. The same equation indirectly identifies a set of implicit multiple poverty lines. Indeed, it is convenient to distinguish implicit or indirectly estimated variables from directly estimated variables by denoting the former with a tilde:

$$x_h \le z(J, R) \cdot P(S, J, R) = \tilde{z}_s(S, J, R) \tag{9}$$

Equations 8 and 9 are interchangeable. One may use the real consumption aggregate compared with a single poverty line or the nominal consumption aggregate compared with a set of indirect poverty lines. The number of indirect poverty lines $\tilde{z}_s(S, J, R)$ depends on the aggregation level $S$, while the consistency of $\tilde{z}_s(S, J, R)$ with respect to the definition of the consumption aggregate depends on the coverage $J$ of the index.

There is also a second empirical strategy based on the direct estimation of a set of multiple poverty lines.
Let $z_s(S', J', R'_s)$ denote the poverty line directly estimated for stratum $s$, where $S'$ is the number of strata considered; $J'$ is the implicit coverage of the line; and $R'_s$ is the reference group utilized in stratum $s$. A household $h$ belonging to stratum $s$ is poor if and only if the following is true:

$$x_h \le z_s(S', J', R'_s) \tag{10}$$

A key point is that a set of multiple lines $z_s(S', J', R'_s)$ implicitly defines a set of spatial price deflators, as follows:

$$\tilde{P}_s(S', J', R'_s) = \frac{z_s(S', J', R'_s)}{E[z_s(S', J', R'_s)]} = \frac{z_s(S', J', R'_s)}{\tilde{z}(S', J', R')} \tag{11}$$

where $R'$ is the aggregated reference group. Equation 10 may be rewritten as follows:

$$\frac{x_h}{\tilde{P}_s(S', J', R'_s)} \le \tilde{z}(S', J', R') \tag{12}$$

Equations 10 and 12 are interchangeable. The analyst may equivalently use the nominal consumption aggregate in combination with (directly estimated) multiple lines (equation 10) or a consumption aggregate deflated by the implicit deflators $\tilde{P}_s(S', J', R'_s)$ in combination with a national-level implicit poverty line $\tilde{z}(S', J', R')$ (equation 12).[6]

Table 1 summarizes the empirical strategies available to the analyst, described by the set of explicit and implicit variables identified by empirical strategy 1, based on a single national poverty line, and empirical strategy 2, based on the estimate of multiple poverty lines.

Table 1. Empirical strategies to derive spatial price deflators for measuring living standards

Method | Strategy 1 (single poverty line) | Strategy 2 (multiple poverty lines)
Direct estimation | $z(J, R)$ | $z_s(S', J', R'_s)$
Direct estimation | $P(S, J, R)$ | -
Indirect calculation | $\tilde{z}_s(S, J, R)$ | $\tilde{z}(S', J', R')$
Indirect calculation | - | $\tilde{P}_s(S', J', R'_s)$

Note: Variables implicitly or indirectly defined are denoted by a tilde.

It is convenient to start focusing on the two spatial price indexes $P(S, J, R)$ and $\tilde{P}_s(S', J', R'_s)$. First, the spatial price indexes may differ because they are constructed following different approaches.
Typically, $P(S, J, R)$ is a Paasche price index directly estimated from survey data, while $\tilde{P}_s(S', J', R'_s)$ is an implicit spatial deflator based on a ratio of subnational (for example, regional) poverty lines. Spatial deflators may also differ because of differences in the arguments: $(S, J, R) \neq (S', J', R'_s)$. There may be differences in the aggregation level, in the coverage, and in the reference group.

Consider, for instance, the aggregation levels denoted by $S$ and $S'$. Household surveys provide information on unit values and budget shares at the household level, which makes it possible to set $S = H$, that is, to estimate the SPI at the household level. The same option is not available for the implicit indexes $\tilde{P}_s(S', J', R'_s)$ because of the problem that Ravallion (2016) defines as referencing. The reference group $R'_s$ constitutes the empirical basis for estimating the money-metric utility function associated with the reference utility level $u_z$ defined in equation 1. Let $u^{-1}(\{u\}) = \{h : u_h = u\}$ denote the preimage of $u$, that is, the set of households $h$ that attain the same utility level $u$. Hence, in a multiple poverty line setting, the referencing problem amounts to identifying correctly the preimage of $u_z$ for every stratum $s$, as follows:

$$u^{-1}(\{u_z\}) = R'_s \quad \forall s \tag{13}$$

Ravallion (2016) emphasizes that the identification of $R'_s$ is a complicated empirical issue, and increasing the number of strata $S'$ increases the risk of violating condition 13, with $u^{-1}(\{u_z\}) \neq R'_s$.[7] This means that multiple poverty lines $z_s(S', J', R'_s)$ may be inconsistent because they are based on standards of living that are different from $u_z$.[8] The risk then turns into certainty if the maximum level of disaggregation is reached by choosing $S' = H$. In such a case, the reference group $R'_s$ is reduced, by construction, to one household, and condition 13 may be satisfied only if all households perceived the same standard of living $u_z$, which is clearly unlikely. Thus, as a rule, $S' < S \le H$. In practice, multiple poverty lines and implicit spatial deflators are computed at the regional level or even at a higher level of aggregation.

If the focus is on the coverage of the spatial price index, implicit deflators $\tilde{P}_s(S', J', R'_s)$ have undeniable advantages. The multiple poverty lines $z_s(S', J', R'_s)$ are, by construction, consistent with the consumption aggregate and include all the commodities and services consumed by households, that is, $J' = N$. Hence, even though the implicit deflator $\tilde{P}_s(S', J', R'_s)$ does not specify the prices and quantities of all goods, it can serve as a full-coverage index. The same is not true of $P(S, J, R)$, the spatial price indexes that are often calculated by proxying market prices through unit values and relying on household-level budget shares. While surveys often provide accurate information on food items, that task is challenging for nonfood items: the quantities consumed are rarely reported, and the high heterogeneity in the quality of goods makes it impossible to construct a nonfood price index. Furthermore, for certain components of the consumption aggregate, such as the consumption flow from durable goods, defining a notion of unitary prices is conceptually challenging. This explains why, typically, $J < J' = N$, and, in practice, the coverage of index $P(S, J, R)$ is limited to food consumption goods, that is, $J = F$.

[6] The number of implicit spatial deflators $\tilde{P}_s(S', J', R'_s)$ depends on the aggregation level $S'$, while the consistency of $\tilde{z}(S', J', R')$ with respect to the definition of the consumption aggregate depends on the implicit coverage of the index.
In summary, compared with the implicit index $\tilde{P}_s(S', J', R'_s)$, the index $P(S, J, R)$ allows for a higher level of disaggregation, properly accounts for the specificity of household consumption patterns, and minimizes the risk of inconsistently setting the reference standard of living: in terms of our notation, $S > S'$ and $u^{-1}(\{u_z\}) = R$. By contrast, $P(S, J, R)$ has partial coverage and is likely to be inconsistent with respect to the composition of the consumption aggregate (that is, $J < J' = N$).

Similar conclusions hold for the other terms in table 1. The number of directly estimated multiple poverty lines $z_s(S', J', R'_s)$ depends on the aggregation level $S'$ and is positively correlated with the ability to capture the specificity of household consumption patterns, but also with the risk of inconsistency regarding the standards of living $u_z$. There is a trade-off between maximizing $S'$ and minimizing the risk of inconsistency. The implicit poverty lines $\tilde{z}_s(S, J, R)$ may be computed at the household level without risk of inconsistency, but there is a price to be paid in terms of coverage. All these issues also affect the single poverty lines, $z(J, R)$ and $\tilde{z}(S', J', R')$, that cannot be evaluated separately from the spatial deflators, $P(S, J, R)$ and $\tilde{P}_s(S', J', R'_s)$, respectively.

The analysis in this section indicates a preference for the empirical strategy centered on a single poverty line over the one grounded on multiple lines. Within the single-line strategy, specificity and consistency in the standard of living, $u_z$, may be independently pursued by increasing $S$ and selecting the appropriate unique reference group, $R$, respectively.

[7] The most widely accepted approach to addressing the referencing issue is based on the caloric intake required to meet a minimum daily energy threshold and on per capita or per adult equivalent food expenditure (Ravallion 2016).
[8] See also Arndt and Simler (2010); Ravallion and Lokshin (2006).
In the strategy involving multiple lines, raising the value of $S'$ entails a trade-off, potentially jeopardizing consistency with respect to the standard of living, $u_z$. This trade-off is inevitable and cannot be circumvented.

However, there are significant empirical issues that restrict the coverage of a directly estimated spatial index and undermine its consistency with the composition of the consumption aggregate. One might therefore argue that, to make an empirical strategy based on a single poverty line appealing, it would still be necessary to establish an acceptable degree of consistency between the composition of household expenditure and the coverage of the price index. This objective can be achieved in two ways: (a) directly, by increasing the coverage level $J$ of the spatial price index $P(S, J, R)$, or (b) indirectly, by adjusting the composition of the consumption aggregate to make it consistent with the given coverage level $J$ of the spatial price index $P(S, J, R)$.

Consider the second option. The idea is that, in poverty measurement, what matters is not the composition or comprehensiveness of the consumption aggregate, but the consistency of the aggregate with the poverty line. Lanjouw and Lanjouw (2001) show that poverty rates across countries or over time may be compared even if the underlying consumption aggregates are different, provided that the poverty lines are constructed consistently with the aggregates. However, the conditions necessary for the validity of the outcome are rather restrictive. Food consumption aggregates must still be consistent across countries; there must be a monotonic relationship between the share of household food expenditure and total household expenditure; and this relationship must be stable across groups or over time.
Moreover, the invariance of the poverty measures with respect to the composition of the consumption aggregate holds true for the poverty headcount, but does not extend to all the other Foster, Greer, and Thorbecke (1984) poverty measures. Hence, even if option (b) is simple and viable, it does not guarantee empirical robustness in poverty profiles and poverty dynamics.

In summary, the assumption is highly plausible that narrowing the consumption aggregate to only food expenditures would result in a nonnegligible bias in poverty estimates and poverty profiles. For this reason, option (a), which involves extending the coverage of $P(S, J, R)$ to enhance the alignment with the consumption aggregate, albeit more demanding in empirical terms, seems to be the preferable choice. The next section is therefore dedicated to possible ways of implementing option (a) by expanding the coverage of the spatial price index, which is typically limited to food items.

4 A full-coverage household-level spatial price index

Two possible ways may be conceived to extend the coverage of spatial price indexes. The first way is direct and involves the estimation of the price variability of additional components of the consumption aggregate beyond food items. The empirical implementation of this approach is often challenging because of data scarcity (Chen et al. 2020). The Appendix presents two price indexes that include two additional components of the consumption aggregate, housing and durable goods. In both cases, the index may be biased because of the heterogeneity of goods, difficulties in detecting market prices, and the possible absence of prices (and necessary imputation). This approach does not allow a price index to be obtained that is fully consistent with the consumption aggregate.

Alternatively, indirect estimation methods may be employed. Such approaches have been suggested in the literature.
Perhaps the most well-known is the method based on Engel curves, originally proposed for temporal price indexes (Hamilton 2001) and later extended to spatial indexes by Almås (2012). The Engel curve method indirectly reconstructs the entire price index based on expenditure ratios and hinges on the highly restrictive assumption of a single homogeneous Engel curve among strata or regions. However, Gibson, Le, and Kim (2017) have strongly criticized this approach, highlighting not only the theoretical limitations, but also the empirical shortcomings.

4.1 The indirect method as a ratio of poverty lines

The approach described in this subsection uses (a) the implicit spatial deflator $\tilde{P}_s(S', J', R'_s)$, defined as the ratio between the poverty line $z_s(S', J', R'_s)$ for stratum $s$ and the average poverty line $E[z_s(S', J', R'_s)]$, and (b) the food Paasche deflator $P_h(H, F, R)$ to obtain a household-level spatial price index for all nonfood goods and services included in the consumption aggregate. This nonfood SPI serves as a component that can be combined with a household-level food Paasche spatial price index, $P_h(H, F, R)$, to produce a full-coverage index that captures both food and nonfood items.

The underlying idea is that poverty lines contain information on the variability of the cost of nonfood items that cannot be directly reconstructed because of the lack of data on prices and quantities. This information is contingent upon the methodology used to estimate the poverty lines. For example, the food share method (Ravallion 2016), the most commonly used approach for estimating the nonfood component of poverty lines, leverages data on the consumption patterns of poor households and on the directly estimated cost of a basic-needs food basket.
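Ingredient (a) above, the implicit deflator obtained as the ratio of each stratum poverty line to the average line, can be sketched as follows. The sketch assumes that the average poverty line is a population-share-weighted mean of the stratum lines; the paper does not spell out the weighting in this passage, so this is an assumption, as are the stratum names, the poverty lines, and the household expenditures used in the example.

```python
def implicit_deflators(poverty_lines, population_shares):
    """Return ({stratum: deflator}, average poverty line).

    poverty_lines     : {stratum: directly estimated poverty line}
    population_shares : {stratum: population share}, summing to 1
    ASSUMPTION: the average line is weighted by population shares.
    """
    average_line = sum(
        poverty_lines[s] * population_shares[s] for s in poverty_lines
    )
    deflators = {s: z / average_line for s, z in poverty_lines.items()}
    return deflators, average_line


# Hypothetical regional poverty lines (local currency per person per month).
lines = {"north": 120.0, "south": 90.0, "capital": 150.0}
pop = {"north": 0.5, "south": 0.3, "capital": 0.2}
deflators, average_line = implicit_deflators(lines, pop)

# Hypothetical households: (stratum, nominal expenditure).
households = [("north", 110.0), ("south", 100.0), ("capital", 140.0)]

# Poverty status from nominal expenditure and stratum-specific lines ...
poor_nominal = [x <= lines[s] for s, x in households]
# ... and from deflated expenditure and the single implicit line.
poor_deflated = [x / deflators[s] <= average_line for s, x in households]
print(poor_nominal, poor_deflated)
```

In this toy example both criteria classify the same households as poor, which is the interchangeability between nominal expenditure with multiple lines and deflated expenditure with a single implicit line.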
The advantage of this method is that it allows a full-coverage price index to be estimated, potentially at the household level, that is perfectly consistent, by construction, with the consumption aggregate used for poverty and inequality analysis. The drawbacks are that the strategy overlooks the price variability in nonfood items within stratum s and that the price variability pertains to the subset of poor households. 9

9 The latter aspect may not necessarily be deemed a weakness, and, in any case, its significance is negligible if preferences are homothetic.

A true cost-of-living index for households that belong to stratum s and attain a utility level $u_z$ may be written as follows:

\[
P_s(u_z) = \frac{e(p^s, u_z)}{e(p^0, u_z)}
= \frac{p^s_{food} \cdot q^s_{food}(p^s, u_z)}{p^0_{tot} \cdot q^s_{tot}(p^0, u_z)}
+ \frac{p^s_{nfood} \cdot q^s_{nfood}(p^s, u_z)}{p^0_{tot} \cdot q^s_{tot}(p^0, u_z)}
\]

where $e(\cdot)$ is the expenditure function, $p^s$ is the average price vector in stratum s and $p^0$ is the average price vector at the national level, while $p^s_k$ and $p^0_k$ are the corresponding price vectors for k = food, nfood, tot; $q^s$ is the average quantity vector in stratum s, while $q^s_k$ are the quantity vectors for k = food, nfood, tot. Now, define the average food budget share calculated at the national level for a household around the utility level $u_z$, that is, around the poverty line, as follows:

\[
w_s(u_z) = \frac{p^0_{food} \cdot q^s_{food}(p^0, u_z)}{p^0_{tot} \cdot q^s_{tot}(p^0, u_z)}
\]

The budget share $w_s(u_z)$ captures the consumption pattern among poor households in stratum s at the reference price vector $p^0$. We assume that these budget shares are constant across strata, i.e. $w_s(u_z) = w$, and may be estimated, for instance, by the average food budget share of the bottom deciles of the distribution of per capita expenditure. 10

10 If preferences are homogeneous across strata, then the shape of the expenditure functions is the same for all households and $w_s(u_z) = w$.

The true cost-of-living index, $P_s(u_z)$, may be rewritten by plugging in the budget share defined above:

\[
P_s(u_z) = w \, \frac{p^s_{food} \cdot q^s_{food}(p^s, u_z)}{p^0_{food} \cdot q^s_{food}(p^0, u_z)}
+ (1 - w) \, \frac{p^s_{nfood} \cdot q^s_{nfood}(p^s, u_z)}{p^0_{nfood} \cdot q^s_{nfood}(p^0, u_z)} \tag{14}
\]

Recall now that the stratum-level poverty lines $z_s$ (written without their arguments for simplicity), based on a representative bundle, implicitly define a set of regional spatial deflators $\tilde{P}_s$ (again written without arguments), which may be interpreted as an estimate of a true cost-of-living index for poor households, defined in equation (11):

\[
P_s(u_z) \cong \tilde{P}_s = \frac{z_s}{E[z_s]}
\]

At the same time, the stratum-level Paasche spatial food price index $P_s^{food}$ (written without arguments) provides an estimate of a true cost-of-living index for food items, as follows:

\[
P_s^{food} = \frac{p^s_{food} \cdot q^s_{food}(p^s, u_z)}{p^0_{food} \cdot q^s_{food}(p^0, u_z)}
= \left( \sum_j E_s[w_{hj}] \cdot \frac{E[p_{hj}]}{E_s[p_{hj}]} \right)^{-1} \tag{15}
\]

where $w_{hj}$ is the food item j budget share of household h, $p_{hj}$ is the price paid by household h for food item j, $E_s[\cdot]$ denotes the average over households in stratum s, and $E[\cdot]$ denotes the average over all households. Substituting in equation (14) produces an estimate of a regional-level spatial price index for nonfood items, as follows:

\[
P_s^{nfood} = \frac{1}{1 - w} \, \tilde{P}_s - \frac{w}{1 - w} \, P_s^{food} \tag{16}
\]

The index $P_s^{nfood}$ provides an estimate of a nonfood spatial price index that is focused on the consumption pattern of poor households. This index may be standardized and used to construct a household-level full-coverage Paasche price index, as follows:

\[
P_h^{tot} = \left( \sum_j w_{hj} \cdot \frac{E[p_{hj}]}{p_{hj}}
+ \Big( 1 - \sum_j w_{hj} \Big) \big( P^{nfood}_{h(s)} \big)^{-1} \right)^{-1} \tag{17}
\]

where $w_{hj} = x_{hj} / x_h$ is the budget share over total expenditure of food item j, and $P^{nfood}_{h(s)} = P^{nfood}_s$ if household h belongs to stratum s. 11 Equation (17) can be equivalently written as:

\[
P_h^{tot} = \left( w_h^{food} \big( P_h^{food} \big)^{-1}
+ \big( 1 - w_h^{food} \big) \big( P^{nfood}_{h(s)} \big)^{-1} \right)^{-1} \tag{18}
\]

where $w_h^{food}$ is the food budget share of household h.
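The steps leading from the poverty lines to the full-coverage index can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two strata, and all numbers are hypothetical, and the average poverty line is taken as an unweighted mean across strata, which is an assumption (a population-weighted mean may be more appropriate in practice).

```python
# Illustrative sketch of equations (16) and (18); all names and numbers are hypothetical.


def implicit_deflators(poverty_lines):
    """Ratio of each stratum poverty line to the average line.

    The average is an unweighted mean across strata (an assumption).
    """
    mean_line = sum(poverty_lines.values()) / len(poverty_lines)
    return {s: z / mean_line for s, z in poverty_lines.items()}


def nonfood_index(implicit_deflator, food_index, w):
    """Equation (16): back out the stratum nonfood index given the food share w."""
    return (implicit_deflator - w * food_index) / (1.0 - w)


def full_coverage_index(food_share_h, food_index_h, nonfood_index_s):
    """Equation (18): share-weighted harmonic mean of the food and nonfood indexes."""
    return 1.0 / (food_share_h / food_index_h
                  + (1.0 - food_share_h) / nonfood_index_s)


poverty_lines = {"north": 120.0, "south": 80.0}  # made-up stratum poverty lines
food_index = {"north": 1.10, "south": 0.90}      # made-up stratum food Paasche indexes
w = 0.5                                          # assumed food share of poor households

deflators = implicit_deflators(poverty_lines)
nonfood = {s: nonfood_index(deflators[s], food_index[s], w) for s in poverty_lines}

# A household in "north" with a 50 percent food share and a food Paasche index of 1.10.
index_h = full_coverage_index(0.5, 1.10, nonfood["north"])
print(nonfood, index_h)
```

By construction, the food index and the backed-out nonfood index, weighted by w and 1 - w, reproduce the implicit deflator of each stratum, which is the sense in which the resulting index is consistent with the poverty lines.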
The full-coverage index $P_h^{tot}$ of equations (17) and (18) differs from the household-level index defined in section 3, primarily because the estimation of the nonfood component depends on the reference group used to estimate the poverty lines. An alternative to equations (17) or (18) would be to use the ratio of the stratum-level poverty lines to estimate the food component of $P_h^{tot}$ as well. However, this approach would fail to capture the variability in the cost of living of food items within the stratum. Furthermore, the estimation of the spatial index would rely entirely on the accurate identification of the standard of living of poor households, a topic that is addressed in more detail in section 4.3.

11 It is important to note that the full-coverage index described in equation (17) is at the household level. This is straightforward in the case of the food component: it reflects the cost each household faces for individual food items compared to their average prices across households. By contrast, the nonfood component is calculated at the stratum level, based on the ratio between stratum-level poverty lines and the average of such lines. However, both components are weighted by each household's expenditure share on food and nonfood items, therefore reflecting household-level variability.

4.2 The indirect method as the ratio among nonfood poverty lines

A different method involves directly calculating the ratio among the nonfood components of the poverty lines at the stratum level to estimate a nonfood price index. The total poverty line for stratum s can be expressed as the sum of the food and nonfood poverty lines:

\[
z_s = z_s^{food} + z_s^{nfood}
\]
Hence, recalling equation (11), we can write the implicit spatial price index for stratum s as follows:

\[
\tilde{P}_s = \frac{z_s}{E[z_s]} = \frac{z_s^{food}}{E[z_s]} + \frac{z_s^{nfood}}{E[z_s]}
\]

Assume now that the food budget share of poor households, $w$, may be approximated by the ratio between the average food poverty line and the average total poverty line:

\[
w \cong \frac{E[z_s^{food}]}{E[z_s]} \tag{19}
\]

Substituting (19) in equation (16) produces the following:

\[
\tilde{P}_s^{nfood} = \frac{z_s^{nfood}}{E[z_s^{nfood}]}
= \frac{1}{1 - w} \, \tilde{P}_s - \frac{w}{1 - w} \, \frac{z_s^{food}}{E[z_s^{food}]}
= \frac{1}{1 - w} \, \tilde{P}_s - \frac{w}{1 - w} \, \tilde{P}_s^{food} \tag{20}
\]

Equation (20) differs from equation (16), first, because equation (19) might not hold, i.e. the budget share $w$ is different from the ratio between the food component of the poverty line and the total poverty line and, second, because the food price index $\tilde{P}_s^{food}$, derived implicitly as the ratio of the food poverty line for stratum s to the average of the food poverty lines across strata, is different from the Paasche food price index $P_s^{food}$ defined in equation (15). 12

12 The ratio between the average food poverty line and the average total poverty line is a good approximation of the average food budget share of poor households if and only if the relationship between total expenditure and the food budget share is a monotonic increasing function and not a correspondence or a nonmonotonic function (Lanjouw and Lanjouw 2001).

Subsection 4.3 compares the two methods to determine whether a systematic bias arises from using the direct ratio of nonfood poverty lines in equation (20) instead of the indirect method in equation (16).

4.3 A comparison of the two methods

Assume that equation (19) holds with equality. Subtracting (20) from (16) gives the following:

\[
P_s^{nfood} - \tilde{P}_s^{nfood} = \frac{w}{1 - w} \left( \tilde{P}_s^{food} - P_s^{food} \right) \tag{21}
\]

The difference between the two methods depends on the difference between the two possible estimates of a food spatial price index, where $P_s^{food}$ is a Paasche price index, typically computed over all households, while $\tilde{P}_s^{food}$ is an implicit price deflator based on the consumption pattern of a reference group of households with total expenditure approximately around the poverty line.
Insofar as the typical consumption bundle of poor households includes a small number of items of more homogeneous quality compared with the consumption bundle of the entire population, the estimated food price index $P_s^{food}$ may exhibit more variability than $\tilde{P}_s^{food}$. Hence, when $P_s^{food}$ is greater than 1, we would expect $P_s^{food} - \tilde{P}_s^{food} > 0$, and vice versa. That means that the estimation method described in equation (16) tends to capture more spatial price variability than the estimation method described in equation (20).

To conclude, the advantages and limitations of the approach proposed in this paper deserve discussion. The primary advantage is the ability to extend the coverage of the index, thus ensuring full consistency with the composition of the consumption aggregate. Theoretically, this is equivalent to constructing an index at the highest level of disaggregation (the household level), encompassing all goods and services included in the welfare indicator (full coverage), thereby minimizing the risk of identifying inconsistent standards of living.

On this last point, however, it is necessary to be more precise. The nonfood index, which captures the variability of prices of nonfood items, is an implicit index that does not rely on the direct observation of prices and quantities. Thus, there is no direct control over the potential heterogeneity of goods. The underlying assumption is that the quality of commodities and services is closely related to the standards of living of households, so that households attaining the same level of utility purchase goods of similar quality. At this juncture, the reliability of the referencing, that is, the consistency of the underlying utility levels used in estimating multiple poverty lines, becomes crucial.

To clarify the point, refer to equation (13) in section 2. Let $u_s$ be the underlying utility level for the poverty line of stratum s, and let $\bar{u} = E[u_s]$ be the average utility for all households belonging to the reference groups. If utility levels, that is, the standards of living, were entirely consistent, then $u_s / \bar{u} = 1$ for every s. For the sake of simplicity, suppose also that preferences are homothetic. In this case, the Hicksian demand functions $q(p, u)$ are linear in utility, and the following is therefore true:

\[
\tilde{P}_s = \frac{z_s}{E[z_s]}
\cong \frac{p^s \cdot q(p^s, u_s)}{p^0 \cdot q(p^0, \bar{u})}
= \frac{u_s}{\bar{u}} \cdot \frac{p^s \cdot q(p^s, \bar{u})}{p^0 \cdot q(p^0, \bar{u})}
\]

Hence, the variability of the nonfood price index used to complement the food index may not hinge solely on the variability in the cost of living, but also on potential fluctuations in standards of living among strata. These fluctuations might arise from errors or inconsistencies in the referencing underlying multiple poverty lines. In other words, if higher standards of living were presumed in wealthier regions (or strata), $u_s > E[u_s]$ would occur, thus leading the nonfood component of the full-coverage index to overstate the cost of living in those regions.

For this reason, it is advisable to run the utility-consistency tests that have been proposed in the literature, starting with Gibson and Rozelle (1999), which are based on the theory of revealed preferences. Ravallion and Lokshin (2006) use quantity indexes in comparing alternative price and quantity combinations. Another option is to follow the approach proposed by Arndt and Simler (2010), based on information theory. In brief, Arndt and Simler impose on households' behavior a rationality constraint, represented by the revealed preference conditions on the poverty bundles underlying multiple poverty lines. The rationality constraint guarantees that the consumption bundles of poor households are consistent with at least one preference system, which can then be interpreted normatively as the “representative” preference system of rational poor households. At the same time, they take into account the actual consumption of poor households by minimizing the distance of the utility-consistent budget shares from the original budget shares. 13 The advantage of this last option is that utility consistency is satisfied by construction; the drawback is that the utility-consistent budget shares could differ significantly from the actual consumption shares even if poor households behave rationally. 14

13 Additional examples of applications of the method proposed by Arndt and Simler (2010) are Campenhout et al. (2016), Stifel and Woldehanna (2016), and Stifel et al. (2016).

14 Availability of suitable data is key for the implementation of the Arndt and Simler (2010) procedure. The maximum entropy approach is reasonable and (hopefully) feasible for the food component; in contrast, one would expect systematic violation of the utility consistency constraint once the nonfood component is added, for a number of reasons, including differences in needs or heterogeneous preferences. See Arndt and Tarp (2017) for additional material.

5 Conclusion

This paper proposes a methodology to adjust welfare aggregates for differences in prices across geographical areas, treating poverty lines as spatial deflators. Poverty lines may be compared against the nominal expenditure of households to measure poverty status. When used indirectly, poverty lines facilitate the conversion of household nominal expenditure into real expenditure, thus adjusting the welfare aggregate for differences in prices across geographical areas. This real expenditure aggregate can then be compared against a single national-level poverty line. The paper leverages this analytical property to develop a method for estimating a household-level full-coverage spatial deflator.

The method relies on a mixed empirical strategy. The estimation of the food component of the index is based on observed prices and quantities, typically derived from household surveys. By contrast, the nonfood component of the index is derived indirectly from multiple poverty lines, usually defined at the subnational level.
For nonfood items, prices and quantities are rarely available; the proposed approach therefore relies on expenditure, in particular the minimum expenditure required to achieve a fixed standard of living.

The proposed method offers two main advantages. First, it retains detailed information on the prices (unit values) and quantities involved in food expenditure. Second, it integrates these data with nonfood information derived from subnational poverty lines. This results in a deflator that, by construction, is fully consistent in terms of coverage with the consumption aggregate used for poverty and inequality estimates. This deflator may also be estimated at the household level.

A key limitation of the method is that the ability to capture accurately the spatial variability of the prices of nonfood items depends on the quality of the identification process in the estimation of multiple poverty lines (Ravallion 2016). The underlying assumption is that multiple poverty lines are utility consistent (Ravallion and Lokshin 2006). This assumption may be tested or analytically imposed (Arndt and Simler 2010).

Overall, the proposed method represents a significant advancement in measuring living standards by accounting for spatial differences in the prices of nonfood items, which comprise an increasingly large share of household consumption. This enhancement leads to more accurate measures of poverty and inequality.

References

Almås, Ingvild. 2012. “International Income Inequality: Measuring PPP Bias by Estimating Engel Curves for Food.” American Economic Review 102 (2): 1093–1117.

Amendola, Nicola, Giulia Mancini, Silvia Redaelli, and Giovanni Vecchi. 2024. “Deflation by Expenditure Components: A Harmless Adjustment?” Review of Income and Wealth. Published ahead of print, March 12, 2024. https://doi.org/10.1111/roiw.12685.

Amendola, Nicola, and Giovanni Vecchi. 2022. “Durable Goods and Welfare Measurement.” Journal of Economic Surveys 36 (4): 1179–1211.

Arndt, Channing, and Kenneth R. Simler. 2010. “Estimating Utility-Consistent Poverty Lines with Applications to Egypt and Mozambique.” Economic Development and Cultural Change 58 (3): 449–74.

Blackorby, Charles, and David Donaldson. 1988. “Money Metric Utility: A Harmless Normalization?” Journal of Economic Theory 46 (1): 120–29.

Campenhout, Bjorn, Haruna Sekabira, and Fiona Nattembo. 2016. “Uganda: A New Set of Utility-Consistent Poverty Lines.” In Measuring Poverty and Wellbeing in Developing Countries, edited by Channing Arndt and Finn Tarp. Oxford University Press.

Chen, Xiaomeng, Rose Mungai, Shohei Nakamura, Thomas Pearson, Ayago Esmubancha Wambile, and Nobuo Yoshida. 2020. “How Useful Is CPI Price Data for Spatial Price Adjustment in Poverty Measurement? A Case from Ghana.” Policy Research Working Paper 9388, World Bank, Washington, DC.

Deaton, Angus S., and John Muellbauer. 1980. Economics and Consumer Behavior. Cambridge, UK: Cambridge University Press.

Deaton, Angus S., and Salman Zaidi. 2002. “Guidelines for Constructing Consumption Aggregates for Welfare Analysis.” LSMS Working Paper 135, Living Standards Measurement Study, World Bank, Washington, DC.

Foster, James E., Joel Greer, and Erik Thorbecke. 1984. “A Class of Decomposable Poverty Measures.” Econometrica 52 (3): 761–66.

Gaddis, Isis. 2016. “Prices for Poverty Analysis in Africa.” Policy Research Working Paper 7652, World Bank, Washington, DC.

Gibson, John, Trinh Le, and Bonggeun Kim. 2017. “Prices, Engel Curves, and Time-Space Deflation: Impacts on Poverty and Inequality in Vietnam.” World Bank Economic Review 31 (2): 504–30.

Gibson, John, and Scott Rozelle. 1999. “Results of the Household Survey Component for the 1996 Poverty Assessment for Papua New Guinea.” Discussion Paper, Poverty and Human Resources Division, World Bank, Washington, DC.

Hamilton, Bruce W. 2001. “Using Engel’s Law to Estimate CPI Bias.” American Economic Review 91 (3): 619–30.

Jolliffe, D., G. Datt, and M. Sharma. 2004. “Robust Poverty and Inequality Measurement in Egypt: Correcting for Spatial-Price Variation and Sample Design Effects.” Review of Development Economics 8: 557–72.

Lanjouw, Jean Olson, and Peter F. Lanjouw. 2001. “How to Compare Apples and Oranges: Poverty Measurement Based on Different Definitions of Consumption.” Review of Income and Wealth 47 (1): 25–42.

Mancini, Giulia, and Giovanni Vecchi. 2022. On the Construction of a Consumption Aggregate for Inequality and Poverty Analysis. Washington, DC: World Bank.

Marivoet, Wim, and Tom De Herdt. 2015. “Poverty Lines as Context Deflators: A Method to Account for Regional Diversity with Application to the Democratic Republic of Congo.” Review of Income and Wealth 61 (2): 329–52.

Ravallion, Martin. 1998. “Poverty Lines in Theory and Practice.” LSMS Working Paper 133, Living Standards Measurement Study, World Bank, Washington, DC.

Ravallion, Martin. 2016. The Economics of Poverty: History, Measurement, and Policy. New York: Oxford University Press.

Ravallion, Martin, and Michael M. Lokshin. 2006. “Testing Poverty Lines.” Review of Income and Wealth 52 (3): 399–421.

Ray, Ranjan. 2018. Household Behaviour, Prices, and Welfare: A Collection of Essays Including Selected Empirical Studies. Themes in Economics Series. Singapore: Springer Nature.

Stifel, David, Tiaray Razafimanantena, and Faly Rakotomanana. 2016. “Estimating Utility-Consistent Poverty in Madagascar, 2001–10.” In Measuring Poverty and Wellbeing in Developing Countries, edited by Channing Arndt and Finn Tarp. Oxford University Press.

Stifel, David, and Tassew Woldehanna. 2016. “Estimating Utility-Consistent Poverty in Ethiopia, 2000–11.” In Measuring Poverty and Wellbeing in Developing Countries, edited by Channing Arndt and Finn Tarp. Oxford University Press.

Thomas, Vinod. 1980. “Spatial Differences in the Cost of Living.” Journal of Urban Economics 8 (1): 108–22.
Appendix

This appendix considers methods to partially extend the coverage of the food Paasche price index by accounting for two of the most relevant components of the consumption aggregate: housing expenditures and the consumption flow of durable goods. In both cases, typical household budget surveys provide statistical information that may be used to approximate the spatial variation in the unit cost of the two components.

A Paasche spatial price index with rent

Consider, first, the housing component. Some analysts have explored the inclusion of a price index for rent, calculated at the level of stratum s, in the Paasche food price index (Amendola et al. 2024). From an analytical standpoint, the method is straightforward. The extended household-level spatial price index may be written as follows:

\[
P_h^{food+rent} = \left( \sum_j w_{hj} \cdot \frac{E[p_{hj}]}{p_{hj}}
+ w_h^{rent} \, \frac{E_0\big[E_s[R_h]\big]}{E_s[R_h]} \right)^{-1} \tag{A.1}
\]

where

\[
w_{hj} = \frac{x_{hj}}{x_h^{food} + R_h}; \qquad
w_h^{rent} = \frac{R_h}{x_h^{food} + R_h}
\]

are the budget shares on food item j and on housing, respectively, $x_{hj}$ is the expenditure of household h on food item j, $x_h^{food}$ is its total food expenditure, $R_h$ is its rent, $E_s[R_h]$ is the average rent for households belonging to stratum s, and $E_0[E_s[R_h]]$ is the mean of the stratum-level rents $E_s[R_h]$.

The implicit assumption underlying equation (A.1) is that $E_s[R_h]$ adequately measures the cost of a representative housing unit and that the quality of such housing is relatively homogeneous across strata. However, this assumption can be systematically violated, especially if the average quality of housing is positively correlated with the average level of income and expenditure by stratum.

A Paasche spatial price index with rent and consumer durables

The coverage of the index may also be broadened by incorporating a measure of the price variability of durable goods, which may be estimated using information commonly found in household budget surveys. To use such information effectively, though, some assumptions need to be made. Let $\delta_i$ represent the estimated depreciation rate for a durable i.
The method used to estimate the depreciation rate $\delta_i$ depends on the data provided by the household survey (Amendola and Vecchi 2022). A geometric depreciation model is assumed in the following description. Let $p^h_{i,v}$ be the current market value of a durable i of age v, a variable usually available in household surveys. Hence:

\[
p_i^h = p^h_{i,v} \cdot (1 - \delta_i)^{-v}
\]

where $p_i^h$ is the market value of a new durable i. By aggregating at the stratum level, the following may be written:

\[
p_i^s \cong E_s\big[p^h_{i,v}\big] \cdot (1 - \delta_i)^{-E_s[v_i]}
\]

where $E_s[p^h_{i,v}]$ is the average current price for durable i for stratum s, and $E_s[v_i]$ is the average age of durable i for stratum s. The same relationship holds at the national level, as follows:

\[
p_i^0 \cong E\big[p^h_{i,v}\big] \cdot (1 - \delta_i)^{-E[v_i]}
\]

Thus, a price index for durable i is given by the following:

\[
\frac{p_i^s}{p_i^0} \cong \frac{E_s\big[p^h_{i,v}\big]}{E\big[p^h_{i,v}\big]} \cdot (1 - \delta_i)^{\left( E[v_i] - E_s[v_i] \right)}
\]

The elementary price index for the durable good i at the level of stratum s is therefore given by the ratio of the average market prices reported by households, adjusted by a factor $(1 - \delta_i)^{(E[v_i] - E_s[v_i])}$ that depends on the depreciation rate and on the difference between the average age of the durable good i in stratum s and the average age of the durable good recorded at the national level. Note that, if the average ages were the same, the adjustment factor would be equal to 1.

According to the geometric model (Amendola and Vecchi 2022), the consumption flow of household h belonging to stratum s from durable i is given by the following:

\[
c_i^{s,h} = (r + \delta_i) \cdot p^h_{i,v}
\]

where r is the real interest rate on safe assets.
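As a minimal sketch of the geometric-depreciation formulas above, the following Python functions compute the implied value of a new durable, the age-adjusted stratum-to-national price ratio, and the consumption flow. The function names and all numbers are hypothetical and serve only to illustrate the arithmetic; estimating the depreciation rate itself is outside the scope of the sketch.

```python
# Illustrative sketch of the geometric-depreciation formulas; names and numbers are hypothetical.


def new_value(current_value, age, delta):
    """Value of the durable when new, implied by its current value, age, and rate delta."""
    return current_value * (1.0 - delta) ** (-age)


def durable_price_ratio(mean_value_s, mean_age_s, mean_value_0, mean_age_0, delta):
    """Stratum-to-national price ratio for one durable, adjusted for the gap in average ages."""
    return (mean_value_s / mean_value_0) * (1.0 - delta) ** (mean_age_0 - mean_age_s)


def consumption_flow(current_value, r, delta):
    """Consumption flow from a durable: (r + delta) times its current market value."""
    return (r + delta) * current_value


# Same average age in the stratum and nationally: the adjustment factor equals 1.
ratio_same_age = durable_price_ratio(90.0, 3.0, 100.0, 3.0, 0.10)

# Durables in the stratum are one year older on average: the raw ratio is scaled up.
ratio_older = durable_price_ratio(90.0, 3.0, 100.0, 2.0, 0.10)

flow = consumption_flow(200.0, 0.03, 0.10)
print(ratio_same_age, ratio_older, flow)
```

The second call shows why the age adjustment matters: a stratum where durables are older reports lower current values even if the price of a new durable is the same as elsewhere.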
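The budget share on consumption services from durable i for household h belonging to stratum s is given by the following:

\[
\omega_{hi} = \frac{c_i^{s,h}}{\sum_i c_i^{s,h}}
\]

where $c_i^{s,h}$ is the consumption flow of household h from durable i. Accordingly, the following is true:

\[
\omega_{si} = E_s\left[ \frac{c_i^{s,h}}{\sum_i c_i^{s,h}} \right]
\]

A Paasche price index for durable goods may be estimated as follows:

\[
P_s^{dur} = \left( \sum_i \omega_{si} \left( \frac{p_i^0}{p_i^s} \right) \right)^{-1}
\]

Hence, a household-level spatial price index with rent and durables may be estimated as follows:

\[
P_h^{food+rent+dur} = \left( \sum_j w_{hj} \cdot \frac{E[p_{hj}]}{p_{hj}}
+ w_h^{rent} \, \frac{E_0\big[E_s[R_h]\big]}{E_s[R_h]}
+ \sum_i w_{hi}^{dur} \left( \frac{p_i^0}{p_i^s} \right) \right)^{-1} \tag{A.2}
\]

where

\[
w_{hj} = \frac{x_{hj}}{x_h^{food} + R_h + c_h}; \qquad
w_h^{rent} = \frac{R_h}{x_h^{food} + R_h + c_h}; \qquad
w_{hi}^{dur} = \frac{c_i^{s,h}}{x_h^{food} + R_h + c_h}
\]

with $x_h^{food}$ the food expenditure of household h, $R_h$ its rent, and $c_h = \sum_i c_i^{s,h}$ its total consumption flow from durables, and where $p_i^s$ and $p_i^0$ are the estimated average prices for a single new durable good i for households belonging to stratum s and at the national level, respectively.

The index for durable goods suffers from the same limitations outlined in the case of the housing index. The observed variability may reflect the variability in the quality of durable goods.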
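The extended indexes in this appendix share one structure: a share-weighted harmonic mean of relative prices, with one term per food item, one for rent, and one per durable. The short Python sketch below illustrates that structure under stated assumptions; the function name, the component labels, and all numbers are hypothetical, and relative prices are expressed as local price over reference price.

```python
# Illustrative sketch of a Paasche-type index with food, rent, and durables.
# All names and numbers are hypothetical.


def paasche_index(shares, relative_prices):
    """Inverse of the share-weighted sum of (reference price / local price).

    `relative_prices[k]` is the local price of component k divided by its
    reference price; `shares` are budget shares that must sum to one.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("budget shares must sum to one")
    return 1.0 / sum(w / relative_prices[k] for k, w in shares.items())


# Made-up household: two food items, rent, and one durable.
shares = {"rice": 0.3, "beans": 0.3, "rent": 0.2, "fridge": 0.2}
relative_prices = {
    "rice": 1.20,    # household unit value over the average unit value
    "beans": 1.00,
    "rent": 1.25,    # stratum average rent over the mean of stratum-level rents
    "fridge": 0.80,  # stratum price of a new durable over the national price
}

index = paasche_index(shares, relative_prices)
print(index)
```

Setting every relative price to 1 returns an index of 1, which is a convenient sanity check when assembling the components from survey data.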