The World Bank Economic Review, 36(2), 2022, 433–454
https://doi.org/10.1093/wber/lhab027
Article
Downloaded from https://academic.oup.com/wber/article/36/2/433/6501138 by LEGVP Law Library user on 08 December 2023

How Important Is Temptation Spending? Maybe Less than We Thought

Lasse Brune, Jason T. Kerwin, and Qingxiao Li

Abstract

Temptation plays a key role in theoretical work on spending and saving in developing countries. The limited empirical evidence on its importance, however, suggests that cash transfers do not induce increases in temptation spending. This paper expands the evidence base by studying the effect of randomized exposure to temptation on spending decisions in rural Malawi. Consistent with the cash transfer literature, a more tempting environment does not induce significant changes in temptation spending. However, the magnitudes of both temptation spending levels and the treatment effects are somewhat sensitive to the definition of temptation spending used. This paper examines the potential factors that may be driving these null results, and suggests that future research may find a limited role for temptation in the economic decisions of the poor.

JEL classification: D90, D91, O12
Keywords: temptation spending, self-control, behavioral economics, development economics

Does temptation play a major role in the spending decisions of the poor? Prominent theoretical work suggests that it does (Banerjee and Mullainathan 2010), and policymakers are often concerned that participants will mis-spend cash transfers on temptation goods (Harvey 2007; Ikiara 2009; Evans and Popova 2017). Temptation goods are typically defined by researchers to include goods that are commonly perceived as harmful (as in Evans and Popova 2017) or that the people themselves would prefer not to buy when asked at a different time (as in O’Donoghue and Rabin 1999). Since temptation goods are valued only in the moment, and not ahead of time, money spent on them is as good as wasted.
Lasse Brune is a research assistant professor at the Global Poverty Research Lab at Northwestern University, in Evanston, IL; his email is lasse.brune@northwestern.edu. Jason Kerwin (corresponding author) is an assistant professor in the Department of Applied Economics at the University of Minnesota, in Saint Paul, MN, and an affiliated professor at J-PAL; his email is jkerwin@umn.edu. Qingxiao Li is a postdoctoral scholar in the Department of Applied Economics at the University of Minnesota, in Saint Paul, MN; his email is lixx5376@umn.edu. We thank Marc Bellemare, Jeff Bloem, Audrey Dorélien, Anett John, Mike Kuhn, Soomin Lee, Joe Ritter, Laura Schechter, and Dan Silverman for helpful comments. We are grateful for research support from the IPA/Yale Savings and Payments Research Fund (funded by the Bill and Melinda Gates Foundation), the University of Michigan Population Studies Center, and the Michigan Institute for Teaching and Research in Economics. Kerwin’s work on this study was supported in part by an NIA training grant to the Population Studies Center at the University of Michigan (T32 AG000221), as well as by fellowship funding from the Rackham Graduate School. This study is registered with the AEA RCT Registry under registration number AEARCTR-0000437. All errors and omissions are our own. A supplementary online appendix for this article can be found at The World Bank Economic Review website.

© The Author(s) 2022. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Despite the important role of temptation in both theory and policy, empirical evidence suggests that cash transfers leave spending on temptation goods either unchanged or reduced (see Evans and Popova 2017 for an in-depth literature review on the impact of cash transfers on expenditures for temptation goods). The disconnect between the theoretical literature and the evidence from cash transfers raises the question of how important temptation really is in the financial lives of people living in poor countries.

This paper studies the relevance of temptation using a field experiment in southern Malawi that attempts to vary workers’ exposure to temptation at the time that they receive cash payments, and examines whether and how workers’ spending differs. We do this by having treatment-group workers pick up their pay at the location of a weekly market on market day. In contrast, control-group workers picked up their pay at the same market location but on the day before market day. Market days at local markets are commonly identified by participants of our study as a highly tempting environment. Thus, the intervention was designed to induce the treatment group to have cash on hand in a tempting environment, while the control group had cash on hand in an otherwise-equivalent environment without the source of temptation. The experiment held transaction costs, such as time and transportation costs, constant by requiring each worker to come to the payment location at the local market both on market day and the day before.1

Our results do not provide strong evidence in favor of the typical temptation narrative. Direct exposure to the tempting market environment at the time of payment induces no appreciable changes in expenditure, nor in temptation spending in particular. These findings align with the results of cash transfer studies, and also with previous research in Malawi, which has found that recipients of a large cash windfall spent little on temptation goods (Brune et al. 2017).
This study also contributes evidence on methods for measuring temptation spending. Most previous studies define temptation spending as spending on alcohol and tobacco (Banerjee and Duflo 2007), and high-calorie savory and sweet foods (Aker 2017; Dasso and Fernandez 2014) are sometimes included as well. In our study, we collect rich data on temptation spending, defining it not only using the standard definitions from the literature but also by allowing the respondents to identify categories of expenditure that they themselves see as problematic and computing the share of all expenditure that is deemed to be temptation spending. Our study finds that respondents’ own designations of temptation goods can differ substantially from the typical definitions and that the magnitude of temptation spending varies substantially across definitions. For example, if workers are directly asked how much money they wasted or were tempted into spending that they should not have spent, temptation spending is 20 times larger than under the conventional approaches that focus on alcohol and tobacco.

Crucially, the different ways of measuring temptation spending do not alter the interpretation of our core finding: we cannot reject the possibility that spending decisions are unaffected by the variation in the receipt of payment in a tempting environment that the experiment induces. A caveat to this finding is that the magnitude of the point estimate is quite sensitive to the definition used, ranging from 3 percent of average control-group temptation spending up to 23 percent.

We discuss seven potential reasons why the market day treatment may not produce substantial changes in spending and savings behaviors. The first is statistical power: the effects could be small enough that the sample is not sufficiently large to detect them. The second is that the treatment may simply have been too weak, that is, the variation in payment timing did not induce sufficient variation in temptation.
Third, the experiment may have suffered from substitution bias: since there are other opportunities available for workers in the study to purchase temptation goods, the effects from the market day treatment may be limited. Fourth, under-reporting of temptation spending could attenuate any effects of the treatment. Fifth, workers may have successfully precommitted to spending plans that prevent them from spending money on temptation goods. A sixth potential factor is peer effects: since workers show up together on paydays and interact with each other, their choices could mirror one another’s, leading to similar temptation spending decisions across study arms. The seventh reason is costly self-control: workers are able to resist the temptations posed by the market, but at a cost in terms of utility or willpower. Models of costly self-control imply a diminishing ability to resist as temptations increase.

1 This market day treatment is cross-randomized against a second experiment that varied whether workers received their pay weekly or in a deferred lump sum (Brune and Kerwin 2019), which also makes it possible to test whether the effects of the deferred lump sum payment differ by temptation exposure.

Out of the seven explanations mentioned above, the most compelling is that the treatment was simply not very intense. Specifically, we believe that we exposed workers to one of the most tempting environments they commonly encounter, and that the treatment was implemented successfully, but that this exposure has intrinsically small effects on behavior. If the upper bound on the practical effect of temptation exposure on spending is small, this implies that policymakers and economists may both be overly concerned about the role of temptation in driving spending and savings decisions in developing countries.
This paper contributes to the empirical literature on temptation by measuring the effects of a natural temptation exposure on overall temptation expenditures in a real-world setting. Previous research has shown strong effects of temptation in lab settings (Toussaert 2018). Sadoff, Samek, and Sprenger (2020) find large dynamic inconsistency effects and a strong demand for commitment, but focus on food choices alone, and examine choices out of a restricted set of foods. There is also evidence that paying people in cash (as opposed to a bank account) leads to large changes in consumption, but not to increases in temptation spending (Somville and Vandewalle 2018). While this previous work implies that temptation exposure is very important in economic decision making, our findings suggest that it is not.2

These results provide important insights into the potential role of temptation in the economic lives of the poor. Given our findings, we argue that researchers should not expect strong effects of temptation exposure on temptation spending. Moreover, the measure of temptation spending used may mask effects in temptation spending studies. Spending on alcohol and tobacco is the most commonly used definition of temptation spending; however, not everyone drinks or smokes. Moreover, just because policymakers or consumers themselves want to reduce spending on a good like alcohol does not mean that the good qualifies as a temptation good in the theoretical sense. In the present sample, only 16 percent of people consume alcohol or tobacco, but more than 34 percent of people have spent on goods they consider a waste of money or are tempted to buy; the most common such goods are savory snacks, gifts for their children, clothes, and food. Estimates of temptation effects that focus on alcohol and tobacco alone may be downward biased.
Relying on individuals’ own determination of which purchases are temptation spending may generate more-useful measures, while also giving people more agency over how their choices are evaluated and how policies are designed.

The remainder of the paper proceeds as follows. Section 1 describes the data and the experimental design, while Section 2 lays out our econometric strategy. In Section 3 we present the effects of receiving pay in a tempting environment, and in Section 4 we discuss the mechanisms behind the null effects that we find. Section 5 concludes.

2 The study’s intervention could also be framed as inducing a waiting period before people could make consumption choices. Previous research has shown that waiting periods induce more patient choices (Imas, Kuhn, and Mironova 2016) and healthier choices (Brownback, Imas, and Kuhn 2019) by facilitating a shift from heuristic to deliberative processing. In this study, the control group had to wait to make purchases on the market day, while the treatment group could make them immediately. Thus one channel for increasing temptation spending is that the treatment could have caused choices that were more heuristic.

1. Data and Experimental Design

The data in this study come from a field experiment that randomly assigned workers to receive their wages in environments with varying levels of temptation, as well as either in a smooth stream or a lump sum (Brune and Kerwin 2019). The wages were paid through an income support program organized by Mulanje Mountain Conservation Trust (MMCT), a local NGO in the Mulanje District of Malawi’s Southern Region, which provides temporary informal employment opportunities during the agricultural offseason. While the workers in our sample have other sources of income, the wages received from this program are an important supplement to their livelihoods. Two rounds of the experiment took place over a period of three months from November 2013 to January 2014.
There were initially 350 workers from 7 villages recruited into the study for round 1, and an additional 15 workers were added for round 2 to replace the workers who dropped out after round 1.3 Workers were selected for participation by their respective village development committees, which chose people largely on the basis of perceived disadvantage; thus the sample is predominantly female and poorer than average for the region. Each worker worked for two weeks during each round of the program, and for about four days per week. The daily wage rate was MK400 (PPP USD $2.50), which was at the national minimum wage level, and is approximately 160 percent of average daily spending for the workers in the sample. The experiment occurred during the lean season of the year, which begins around maize planting and continues until the harvest. At this time of the year, money is typically tight and jobs are scarce. Workers were assigned to work on conservation-oriented activities that promoted the sustainable use of natural resources.

In our study, workers received their wages after the work was completed and were all paid at the site of the largest local market. They were randomly assigned, independently by round, to receive their wages either on market day (Saturday) or the day before market day (Friday); all payments were made at the same location. The total nominal income received by all workers was identical, and workers were informed about when they would be receiving their pay at the beginning of each round. Workers’ pay schedules were fixed for each round, the procedure was explained verbally, and workers were also provided with a simple handout explaining their schedule. The weekly payment was MK700 in round 1 and MK800 in round 2; correspondingly, the lump-sum payment was MK2800 in the first round and MK3200 in the second.
To ensure that transaction costs, such as transportation and time costs, were held constant across wage payment modes, workers assigned to Saturday paydays were also asked to come to the payroll site on Fridays, and vice versa. An MK100 show-up stipend, on top of any money workers were slated to receive, was provided to encourage attendance and defray workers’ time cost. Equalizing transaction costs is key to interpreting differences in spending between Friday-payment workers and Saturday-payment workers as an effect of the differences in the degree of temptation of the environment. Since Saturday-payment workers are at the market on market days to get their pay, their marginal cost of purchasing a good during the market is just the price of the good. A worker who was not at the market would have to pay not just the price of the good but also the transportation and time cost of getting to the market. By requiring Friday-payment workers to show up during the market day as well, this difference in the overall cost of goods is eliminated.

The market day treatment was cross-randomized against another experiment that varied payment frequency: workers received their pay in four weekly installments or in a deferred lump-sum payment at the end of the month. The two variations in the timing of pay (the frequency of payments and the temptation level of the environment when workers received the pay) were cross-randomized, creating four study arms in each round.

Table 1 presents the payment schedule in each round across the four payday weekends with show-up stipends and wage disbursements per study arm. The market day and non–market day arms have an identical number of paydays in the lump-sum and weekly payment schemes. The total payment excluding the MK100 show-up stipend was MK2800 in round 1 and MK3200 in round 2, because there were seven work days during the first round and eight days during the second round.
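The pay arithmetic in this section can be checked with a short sketch. It uses only figures stated in the text (the MK400 daily wage, seven and eight work days, four payday weekends) and the approximate PPP rate of MK160 per U.S. dollar reported in the table notes; the function and variable names are hypothetical and chosen for illustration only.

```python
# Illustrative check of the pay schedule arithmetic; names are hypothetical.
DAILY_WAGE_MK = 400        # daily wage stated in the text
PPP_MK_PER_USD = 160       # approximate PPP rate from the table notes
WORK_DAYS = {1: 7, 2: 8}   # work days in round 1 and round 2
PAYDAY_WEEKENDS = 4        # the weekly arm is paid in four installments


def round_totals(round_number):
    """Return lump-sum pay, weekly installment, and PPP value for a round."""
    total_mk = DAILY_WAGE_MK * WORK_DAYS[round_number]
    return {
        "lump_sum_mk": total_mk,
        "weekly_mk": total_mk // PAYDAY_WEEKENDS,
        "total_ppp_usd": total_mk / PPP_MK_PER_USD,
    }


for r in (1, 2):
    print(r, round_totals(r))
# 1 {'lump_sum_mk': 2800, 'weekly_mk': 700, 'total_ppp_usd': 17.5}
# 2 {'lump_sum_mk': 3200, 'weekly_mk': 800, 'total_ppp_usd': 20.0}
```

Because the PPP rate is only approximate, the dollar figures should be read as rough magnitudes rather than exact conversions.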
Table 1. Average Pay Schedule across Rounds 1 and 2

Source: Authors’ summary of the research design.
Note: This table presents the average payment schedule (combining rounds 1 and 2) across the four payday weekends with show-up stipends and wage disbursements per study arm; the payments are rounded to the nearest MK10. Sample includes 359 respondents who participated in at least one round of the work program and have data from at least one data source for that round (either the payday data, the survey, or both). Each worker worked for two weeks during each round of the program but collected their pay in four weekly installments. The weekly payment is MK700 in round 1 and MK800 in round 2. All subjects received an MK100 show-up stipend. All money amounts are in Malawian Kwacha (MK); during the study period the market exchange rate was approximately MK400 to the U.S. dollar, and the PPP exchange rate was approximately MK160 to the U.S. dollar.

3 Attrition is balanced across treatment and control: see panel 2 of table S1.1 in the supplementary online appendix for the balance tests after attrition in round 2.

Workers in the study were randomly assigned to study arms in each round of the study, and the randomization for both rounds of the study was done prior to the baseline survey. The group assignments were not revealed to the workers until the beginning of each round of work. The randomization for the first round was stratified by village and gender, and the randomization for the second round was stratified on the round 1 assignment and village. We generally pool observations across rounds and the cross-randomized lump-sum treatment to improve statistical power.

This study uses three rounds of survey data: a baseline and a survey after each round of the study.
The surveys after each round were conducted on the Monday immediately after the last payday of each round. The order in which workers were visited for the surveys was randomized by village, and workers were interviewed at their homes. The survey collected information on income, household-level expenditure, physical assets, saving, transfers, cash on hand, and details on the worker’s expenditures since the first day of the final payday weekend.4 We also utilize brief survey questions asked of workers when they came to collect their pay; see Brune and Kerwin (2019) for a detailed discussion of the payroll survey data.

The random assignment produced a sample of workers that is balanced across study arms on observable characteristics. Table 2 shows balance tests for baseline (pretreatment) variables for the main comparison we use in this paper, which is between all workers who were paid on Fridays (the control group) and all those who were paid on Saturdays (the market-day payments treatment). We find no statistically significant imbalance on any of the covariates in the table, and we also fail to reject the joint null hypothesis of zero difference on all covariates together (p = 0.91).5 We also fail to reject the null when we compare Friday and Saturday workers who received their pay in a lump sum (p = 0.97) or weekly (p = 0.79).

The analytic sample is 70 percent female and 70 percent married; the average age is 40, and the workers have about 3.5 years of formal schooling on average. The average worker has received about MK3,000 (PPP USD $18.77) in cash income and has spent about MK4,000 (PPP USD $25.02) since the previous Friday. Workers have received substantially more in loans than they have given out, and are also net beneficiaries of transfers. On average, the midline surveys took place 2.5 days after the last payday, and 74 percent of workers preferred the lump-sum wage payments.

4 The cash on hand variable was measured as the household’s remaining cash holdings out of income received since the Friday before the survey interview, and so is not necessarily equal to the household’s entire savings.

5 This table pools workers across rounds; the two study arms of interest are also balanced when analyzed separately by round (table S1.1 in the supplementary online appendix). The difference in income across study arms is statistically insignificant (p=0.503) but somewhat large in magnitude: 17 percent higher for the Friday payment group. The study therefore controls for baseline income in the main analyses.

Table 2. Balance of Background Characteristics and Financial Outcomes

Columns (1)-(3): lump sum, Friday; (4)-(6): lump sum, Saturday; (7)-(9): weekly, Friday; (10)-(12): weekly, Saturday; (13): Friday vs. Saturday balance test p-value; (14): Friday vs. Saturday balance test p-value (logs).

                                                    Mean      SD       N    Mean      SD       N    Mean      SD       N    Mean      SD       N p-value p-value
                                                     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)    (11)    (12)    (13)    (14)
Background characteristics
Male                                                0.33    0.47     177    0.34    0.47     173    0.28    0.45     174    0.34    0.47     176   0.376       –
Married                                             0.70    0.46     176    0.66    0.48     169    0.70    0.46     171    0.73    0.44     172   0.879       –
Age (Years)                                        39.83   15.55     177   40.35   15.41     173   40.09   16.00     174   39.26   14.43     176   0.890       –
Years of education completed                        3.60    3.36     176    3.58    3.11     172    3.24    3.08     173    3.66    3.08     173   0.408       –
Midline survey date (days after Sunday)             2.50    1.15     177    2.47    1.13     168    2.61    1.16     169    2.57    1.15     175   0.701       –
Prefers lump-sum wage payments                      0.73    0.45     177    0.72    0.45     172    0.72    0.45     173    0.77    0.42     176   0.520       –
Roundtrip Distance to Closest Two Markets (mi.)    12.30    1.12     153   12.33    1.12     147   12.20    1.25     149   12.27    1.18     147   0.592       –
Financial outcomes (in units of MK unless noted)
Income received since past Friday                  2,680   4,671     177   2,360   3,185     173   3,803  12,044     174   3,165   5,994     176   0.503   0.826
Remaining cash holdings out of income received       603   1,770     177     532   2,743     173     540   1,766     174     893   3,476     176   0.555   0.159
Total spending since Friday                        3,955   4,588     177   3,290   3,319     173   4,027   5,148     174   3,553   4,116     176   0.144   0.137
Asset ownership (PCA)                               0.14    2.81     177    0.04    2.62     173   -0.31    2.09     174    0.13    3.10     176   0.410       –
Loans received in past month                       3,416   9,886     177   2,435   7,091     173   1,790   4,714     174   4,121  17,798     176   0.524   0.283
Loans made in past month                             800   3,441     177     571   1,645     173     562   1,636     174     993   4,211     176   0.571   0.678
Transfers received in past month                     802   2,197     177     768   1,410     173     994   2,648     174     851   2,269     176   0.597   0.623
Transfers made in past month                         779   2,322     177     464   1,360     173     509   2,493     174     633   2,562     176   0.421   0.634

p-value from joint significance of 15 covariates: 0.97 (lump sum), 0.79 (weekly), 0.91 (Friday vs. Saturday, pooled)

Source: Authors’ calculations based on the baseline data.
Note: Sample includes 359 respondents who participated in at least one round of the work program and have data from at least one data source for that round (either the payday data, the survey, or both). All money amounts are in Malawian Kwacha (MK); during the study period the market exchange rate was approximately MK400 to the U.S. dollar, and the PPP exchange rate was approximately MK160 to the U.S. dollar. Roundtrip Distance to Two Closest Markets measures the distance for the worker to walk from home to the payroll site, then to the closest other market to their home, and then back home again.

1.1. Measures of Temptation Spending

Temptation spending has been defined in different ways in previous studies, and temptation goods are typically goods that are commonly perceived as harmful (Evans and Popova 2017). For instance, alcohol, tobacco, high-calorie savory foods, and sweets are commonly included in the definition of temptation spending.
In general, temptation spending is defined as money “wasted” by the poor on things that policymakers would prefer them to not purchase. This approach presumes that otherwise-competent adults cannot be trusted to make their own decisions, and that policymakers or people in other countries could do better on their behalf. At the same time, the poor, like most consumers, commonly identify categories of spending that they wish to reduce (Banerjee and Duflo 2007).

In an attempt to expand on the frameworks used in previous studies, this paper explores a number of different ways to capture temptation spending that build on the respondent-driven approach in Banerjee and Duflo (2007). We use detailed survey instruments that allow us both to see how spending levels and the results in the experiment vary across approaches and to shed light on their relative merits. Table 3 overviews the seven temptation outcomes that we use, listing key characteristics as well as advantages and disadvantages.

We use five definitions that are based on respondents’ own categorizations. For three of those additional definitions (A, C, and E) we first allow respondents to identify categories of expenditure that they themselves typically see as problematic. In a second step we match actual spending from an itemized expenditure module with each respondent’s categories to compute measures of temptation spending, separately for each definition.
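The two-step construction just described can be illustrated with a minimal sketch. It is not the authors’ code: the respondent IDs, category names, amounts, and the function name below are all hypothetical, and the only thing taken from the text is the logic of matching itemized purchases against each respondent’s own flagged categories, separately for each definition.

```python
# Minimal sketch of the category-matching step; all data here are hypothetical.

def temptation_spending(purchases, flagged_categories):
    """Sum each respondent's itemized spending in categories they flagged.

    purchases: list of (respondent_id, category, amount_mk) tuples.
    flagged_categories: dict mapping respondent_id to a set of categories.
    """
    totals = {}
    for respondent_id, category, amount_mk in purchases:
        # Every respondent with purchases appears, even with zero flagged spending.
        totals.setdefault(respondent_id, 0.0)
        if category in flagged_categories.get(respondent_id, set()):
            totals[respondent_id] += amount_mk
    return totals


# Hypothetical itemized expenditure module.
purchases = [
    ("w1", "clothes", 500.0),
    ("w1", "maize", 900.0),
    ("w2", "sweets", 50.0),
    ("w2", "clothes", 300.0),
]

# Hypothetical categories flagged by each respondent under two definitions.
flagged_by_definition = {
    "A": {"w1": {"clothes"}, "w2": {"sweets"}},
    "C": {"w1": set(), "w2": {"clothes"}},
}

by_definition = {
    definition: temptation_spending(purchases, flags)
    for definition, flags in flagged_by_definition.items()
}
print(by_definition)
# {'A': {'w1': 500.0, 'w2': 50.0}, 'C': {'w1': 0.0, 'w2': 300.0}}
```

In this sketch, all spending in a flagged category counts toward the measure and no spending in an unflagged category does, so the same purchase of clothes is classified differently for the two hypothetical respondents.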
For this category-elicitation approach our instruments included the following definitions of temptation goods: goods that the respondent is tempted into purchasing that they should not buy or that are a waste of money (definition A); purchases that the respondent commonly regrets after the fact (definition C); and goods that are commonly unplanned purchases (definition E).6

We also use two other respondent-driven measures of temptation spending. For definition B, we ask the respondent to report the total amount of money they “wasted,” where the latter term is the English translation of the Chichewa expression that, based on extensive piloting, best captures the spirit of temptation goods. Definition D focuses on unplanned purchases, and is captured by asking, for every good in the itemized list of purchases, whether the purchase was planned beforehand (following Brune et al. 2017). Unplanned purchases are a proxy for temptation spending in this case. The English translations of the exact survey questions that we used for all respondent self-reports of temptation spending are shown in table S1.2 in the supplementary online appendix.

There are several empirical patterns worth highlighting when considering the types of goods that are included in the different definitions of temptation spending (see table S1.3 in the supplementary online appendix for a detailed breakdown). First, the items that respondents categorize as temptation goods differ starkly from the narrow definitions of only harmful goods that are often used to capture temptation spending. For example, under definition A, sweets and alcohol are mentioned by 11 percent and 6 percent of respondents, respectively. In contrast, three of the top four categories are gifts for children (15 percent), clothes or shoes (14 percent), and food (11 percent). Second, the items included in respondents’ own categorizations differ across definitions.
For example, gifts for children are commonly reported under definition A. However, this category is not commonly reported as regretted (definition C) or unplanned (definition E).

Third, categorizing purchases based on what respondents consider to be common temptation goods can lead to important miscategorizations: items that are typically considered temptation spending in a given category are not necessarily temptation goods all or even most of the time. For example, all spending on food or clothes, commonly mentioned as categories for definitions A, C, and E, would be classified as temptation spending for respondents who list this category. Obviously some purchases of clothing and food are necessities and do not fit a reasonable definition of temptation spending; what respondents clearly mean is that they sometimes make impulse purchases of these goods, rather than that all purchases of them are temptation spending. Conversely, if a respondent does not consider an item a common temptation good, no purchases of the good will be coded as temptation spending, even if some actually are. This inclusion/exclusion error is a downside that does not apply to the direct elicitation of spending under definition B. The main disadvantage of using definition B is that it is susceptible to ex-post rationalizations: unlike the categorizations of common temptation goods, elicitation of total temptation spending is only done after the fact.

6 The survey offered options for the most common responses from pilot testing, and also an “other” category where workers could list up to three additional goods. Very few workers used all three “other” spaces on the survey. The survey module where workers categorized temptation goods was conducted after the study elicited actual purchases, to avoid potential underreporting due to priming.

Table 3. Temptation Variable Construction, Advantages, and Disadvantages

Columns: (1) A. Waste/temptation goods; (2) B. Money wasted; (3) C. Frequently regretted; (4) D. Unplanned purchases; (5) E. Commonly against plans; (6) F. Alcohol and tobacco; (7) G. Alcohol, tobacco, donuts, and soda.

Based on respondent categorization: X X X X X
Uses itemized spending within pre-specified categories: X X X X X X
Timing of elicitation of categorization, by survey round: 2; n/a; Baseline; n/a; Baseline; n/a; n/a
Data availability, by survey round: 1&2; 2 only; 1&2; 1&2; 1&2; 1&2; 1&2
Advantages
  Individualized measure: X X X X X
  Focuses on harmful goods: X X
Disadvantages
  Paternalistic: X X
  Only elicits most common categories of goods: X X X
  Can categorize non-tempt. purchases as tempt.: X† X X
  Spending measure can include non-tempt. goods: X X X X X X
  Ex-post rationalization: X X
  Undercounting of items in total: X

Source: Authors’ summary of the research design.
Note: † Examples include health care spending and funerals.

Figure 1. Comparison of Available Definitions of Temptation Spending

Source: Authors’ calculations based on the two waves of follow-up data.
Note: Sample includes 359 respondents who participated in at least one round of the work program and have data from at least one data source for that round. Details for Panel B: All variables are standardized. Upper right triangle shows binned scatterplots, diagonal shows histograms, lower left triangle shows Pearson correlation coefficients.

Fourth, the definitions that rely on eliciting regretted and unplanned goods have an important limitation in that not all purchases that are unplanned or regretted are temptation spending.
For example, healthcare expenses and funerals are commonly listed under unplanned purchases (definition E) but are not plausible common temptation goods and are in fact not listed as tempting or a waste of money (under definition A). Under definition C, respondents frequently regret buying “expensive food” (as opposed to just “food,” which was mentioned separately), presumably because they feel they paid too much for it.

We supplement the subjective self-judgments of temptation goods with two objective measures drawn from the previous literature. First, following Evans and Popova (2017), we consider purchases of alcohol and tobacco to be temptation spending (definition F). Second, we use an expanded version of their definition, by including all goods that are mentioned as temptation goods in the studies they summarize and that also appear in our surveys’ itemized lists of purchases; this adds donuts and soda to their list (definition G).7

7 All fried sweet breads are categorized as donuts.

Figure 1 presents summary statistics and correlations between the definitions of temptation spending that we use in the analysis. The recorded level of temptation spending varies substantially based on the definition. Moreover, the various measures are only weakly correlated with one another: the only correlation coefficient that exceeds 0.25 is between “Alcohol and Tobacco” (Row F) and “Alcohol, Tobacco, Donuts, and Soda” (Row G), an artifact of the overlapping definitions. The low pair-wise correlations suggest that the different types of approaches are potentially capturing different aspects of the same concept or, alternatively, are mismeasuring the target concept in mostly uncorrelated ways.
If the different approaches captured different aspects or if the approaches were unbiased but noisy measures of the same concept, the best approach empirically would be to aggregate the different measures. To the extent that one approach is capturing the concept of temptation spending better than others, we must rely on conceptual arguments about which definition seems most fitting.

The discussion of the items included under the different definitions above suggests that temptation spending is best captured by self-reported aggregate money wasted (definition B, “Money wasted”), and to a lesser extent, by purchases of goods in categories that the workers listed separately as items they often wasted money on or as items they are tempted to buy (definition A, “Waste/Temptation”). First, regretted purchases (definition C) and unplanned purchases (D and E) often capture other mistakes and deviations from plans that are not conceptually equivalent to being tempted into wasting money. Second, the common researcher-imposed definitions of temptation spending (F and G) miss important categories of goods that the workers in our sample report being tempted into purchasing, such as savory snacks, gifts for children, and clothing.

Our temptation measures show nontrivial average levels of temptation spending. Notably, definition B shows an average temptation spending level of MK306, which is 11 percent of average baseline income from table 2 (MK2,680). Temptation spending under definition A is 4 percent of average income, while the other definitions are less than 3 percent.
Although we think that definitions A and B are the best measures of temptation spending, since we do not have a pre-analysis plan, we use all seven definitions to limit researcher degrees of freedom (see for example Simmons, Nelson, and Simonsohn 2011, who discuss how researcher choices over variable definitions can lead to false-positive findings). We report our main analyses separately for each temptation spending measure, but focus primarily on a combined index of temptation spending. We create this index by taking the first principal component of the seven individual temptation measures for the control (non–market day payment) group, generating index values for the entire sample, and normalizing to the control group’s mean and standard deviation. The weights of the components of the index are calculated using the control group only, so that variable construction is unaffected by the treatment. Since one of the seven measures (total money wasted) was collected only in round 2 of the study, we construct the index in two ways: one that includes all seven outcomes but is only computed for round 2, and one that excludes the “total money wasted” variable and is computed for both rounds.

1.2. Market Days as a Source of Temptation

Market days are a common institution in rural Africa. They bring together traders and customers at fixed locations and on a regular fixed schedule. The locations, called trading centers, contain a few permanent businesses and have a large number of spaces for other vendors to come and sell additional goods on the market day. In the local area where we ran the experiment, there are seven of these trading centers, and typically each one holds two market days per week, with one of the two days being a bigger event than the other. The largest market day in a given region is typically on Saturday.
Vendors will travel long distances to the most important market days, usually arriving the night before so they can set up their stalls early in the morning. As an illustration of what a market day in southern Malawi looks like, fig. S1.1 in the supplementary online appendix shows a picture of a market day at another trading center in the region, outside of the site of our study.8

8 To protect the confidentiality of the research subjects’ responses, no pictures of the participants in the study are included.

In Malawi, market days are very tempting environments. They are typically lively, noisy affairs with many goods on offer, and with salespeople trying to convince customers to browse their wares. Clothes are put out on display and vendors cook fragrant foods. The environment, with many sources of temptation, is a fairly stark contrast to ordinary days in rural Malawi. These temptations are also not completely avoidable: market days are often the only feasible option for people living in rural Malawi to buy common consumption goods. Anecdotally, people in Mulanje District often describe market days as tempting situations, in which excitement can cause them to purchase things they would rather not.

We chose market days as the tempting environment for the study based on extensive qualitative and descriptive work with people in the local area. Prior to running the experiment, we did open-ended interviews with people from the local area to ask them about which situations they find tempting. Based on their responses, we chose market days as a potentially tempting environment and conducted a pilot test of the experiment at a market near the study site to refine our field procedures and to ensure that the experiment was feasible. Participants in that pilot reported that they found the market highly tempting.
Survey data from our sample of workers (table S1.4 in the supplementary online appendix) confirm that people find markets tempting: for a free-response question about situations that are tempting or in which respondents may waste money, 37 percent of all respondents volunteered market days as a tempting situation, by far the most common response (panel A). Multiple-choice questions (panel B) show the same pattern: 69 percent of people said that market days are more tempting than the day before market days, and 66 percent of people said that having a lot of cash on hand at the trading center was more tempting than having it on hand elsewhere. These answers suggest that payments during market days could exacerbate temptation-based psychological savings constraints, by inducing people to spend money on tempting goods that they would prefer to save. Panel D confirms that markets are an important part of life in the area, with the typical person reporting they went to the market six times in the past month. Saturdays are the most common days that people visit the market (32 percent of all visits), although other trading centers do hold market days on Fridays and so 26 percent of visits happen on Fridays.

We compare payments during the market day to payments at the same site the day before, when the market does not take place. We chose the day before (Friday) as the alternate day for several reasons. First, it was logistically simpler to manage payments on two consecutive days than on non-adjacent ones; Sunday was not an option because the vast majority of our sample goes to church on Sunday mornings. Second, using the day before the market ensured that all respondents had the liquid cash needed to make purchases at the market: if we had paid the control group on a later day, then for the first week they would not have had any money to spend at the market on Saturday.
Third, and most important, if the control group was paid after the Saturday group, then any differences in temptation could simply be a function of having the money for a shorter period. The control group does not collect their wages in the tempting market day environment; thus the money will not “burn a hole in their pocket” in the sense of Fudenberg and Levine (2006) unless they keep the money and come back to the market again on another day or find another market to spend their wages right away.

The location and timing of the payroll was specifically chosen to maximize the likelihood that people would be exposed to temptation goods. The market at Mwanamulanje happens only on Wednesdays and Saturdays (with Saturdays having the larger market), and principally in the morning, which is when people were paid. Shops are still open on Fridays, and there are some mobile vendors, but the majority of market activity happens on Saturdays.

Table 4. Effects of Market Day Wage Payment on Cash on Hand

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Dependent variable | Cash on hand before payment, Friday | Cash on hand before payment, Saturday | Cash on hand after payment, Friday | Cash on hand after payment, Saturday |
| Market day payment | −26.6 | −140.4*** | −791.2*** | 644.0*** |
| | (20.8) | (24.5) | (22.0) | (26.3) |
| Control-group mean | 121.7 | 246.0 | 988.3 | 346.0 |
| Number of observations | 686 | 686 | 686 | 686 |

Source: Authors’ calculations based on the payroll survey data.
Note: Sample includes 359 respondents who participated in at least one round of the work program and have data from at least one data source for that round (either the payday data, the survey, or both). Regressions are run on pooled data from round 1 and round 2. Boldface type indicates the treatment variable of interest. 1 USD was worth approximately MK400 at market exchange rates and MK160 at PPP exchange rates during the study period.
All regressions control for stratification cell fixed effects, an index of baseline asset ownership based on first principal components, indicators for the number of days after the weekend the interview occurred, baseline income, baseline total spending, and (if available) the baseline value of the outcome variable. Heteroskedasticity-robust standard errors, clustered by worker, in parentheses: *p<0.1; **p<0.05; ***p<0.01.

2. Econometric Strategy

To estimate the mean effects of the exposure to a tempting environment on expenditure and temptation spending, we estimate regressions of the following form:

$$Y_{ir} = \alpha + \beta T_{ir} + \gamma X_{ir} + \varepsilon_{ir} \qquad (1)$$

where $i$ denotes worker and $r$ denotes the round of survey. The outcome of interest for worker $i$ in round $r$ is $Y_{ir}$. The treatment variable $T_{ir}$ equals 1 if the worker receives wages in a tempting environment (the market day) and 0 otherwise. The vector $X_{ir}$ is a set of controls, which comprise stratification cell dummies, two household financial variables,9 indicators for the day-of-week of the exogenously assigned (first attempted) interview date, baseline income, and the baseline values of the outcome variable.10 Finally, $\varepsilon_{ir}$ is a mean-zero error term.11 Standard errors are clustered at the worker level when we use pooled data from both rounds to account for the statistical dependence of outcome measures for the same worker across two rounds. This means that our standard errors are arguably conservative, since treatment status is randomized within-worker (Abadie et al. 2017). The stratification cells are defined separately by round to control for round fixed effects.

3. Effects of Receiving Pay in a Tempting Environment

We begin by showing that the intervention substantially shifted both when workers had cash on hand and the timing of their expenditures. Table 4 shows the effect of being paid on Saturday on workers’ cash on hand before the wage payments (columns 1 and 2) and after the payments (columns 3 and 4).
Prior to the wage payments, there is no appreciable difference in cash on hand on Fridays, whereas treatment-group workers have MK140 less on hand on Saturdays, presumably because the Friday-payment workers keep some of their pay with them and bring it back to the market the next day. The post-payment cash-on-hand results confirm that the study did in fact induce treatment-group workers to have more cash during the market day: they have MK644 more cash on hand on Saturdays, and MK791 less on Fridays.

9 The household financial variables are an index of physical assets and livestock ownership using principal component analysis (PCA) and total spending out of income received since the past Friday. Both variables were measured before the randomized assignment.
10 Missing values of the baseline covariates are dummied out.
11 It is not possible to rule out potential spillovers between groups because workers in the sample can interact with each other. It is possible, however, to rule out any within-household spillovers as eligibility is restricted to a single person per household. Section 4 also shows that there is no evidence of spillovers between workers who were next to one another in the payroll queue.

Table 5 shows the effect of the market-day payment treatment on spending. Panel A shows pooled results across both workers paid weekly and workers paid in deferred lump sums. Panel B presents the results using only the lump-sum payment group. Panel C examines treatment effects just for workers in the weekly payment group. There are no substantive differences across the three approaches, so our discussion focuses on panel A. The market-day payment treatment leads to large shifts in the exact timing of expenditure.
Column 1 shows that combined spending on Friday and Saturday drops by MK814 (PPP USD $5.08) for the market-day treatment group relative to the workers who are paid on a non-market day. In the presence of liquidity constraints, this is to be expected: workers paid on Friday have had an additional day to spend their income. Taking into account the difference in income timing, the market day treatment induces no meaningful changes in total expenditure: workers spend a similar amount immediately upon receiving their income (Columns 2 and 3) and have statistically indistinguishable total income, remaining cash holdings, and total spending between the previous Friday and the survey date.

The next logical question is whether exposing workers to a tempting environment (the market day) induced any changes in temptation spending in particular. Table 6 shows the results of this analysis.12 It shows that the market-day payment treatment does not substantially change temptation spending for any of the temptation-spending measures that we use. The estimates in all columns are statistically insignificant, and the signs for the effect of market-day payment on temptation-spending measures are mostly negative. The only positive treatment effect estimates for temptation-spending measures are regretted purchases (which rise by 9 percent) and alcohol and tobacco (which rise by 23 percent). The other temptation-spending measures show negative effects ranging from 3 percent to 22 percent of the control-group mean. We focus on the PCA index measure (which uses data from both rounds and thus omits self-reported money wasted), which prevents issues arising from researcher degrees of freedom. It allows us to rule out all but the smallest treatment effects: the upper bound of the 95 percent confidence interval for the combined index across both rounds is 0.15 SDs. Our results imply a null effect of exposure to a tempting environment on spending among our sample of workers.
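The percentage effects and the confidence bound quoted above can be reproduced from the coefficients, standard errors, and control-group means reported in table 6. The following is a minimal sketch rather than the authors' code: the function names are hypothetical, and the confidence interval assumes a normal critical value of 1.96, which may differ slightly from the exact critical value used in the paper.

```python
# Coefficients, standard errors, and control-group means transcribed from table 6.
# Each entry is (coefficient, standard error, control-group mean), in MK.
TABLE6_MEASURES = {
    "A. Waste/temptation goods": (-9.6, 25.4, 131.2),
    "B. Money wasted": (-49.2, 71.2, 324.3),
    "C. Frequently regretted": (4.0, 16.0, 42.7),
    "D. Unplanned purchases": (-5.1, 8.4, 52.7),
    "E. Commonly against plans": (-14.9, 20.1, 69.1),
    "F. Alcohol and tobacco": (3.0, 3.2, 12.9),
    "G. Alcohol, tobacco, donuts, and soda": (-2.2, 5.7, 67.2),
}

# PCA index pooled over both rounds (column 1): coefficient and standard error.
PCA_INDEX_COEF, PCA_INDEX_SE = -0.008, 0.078


def percent_of_control_mean(coef, control_mean):
    """Treatment effect expressed as a percentage of the control-group mean."""
    return 100.0 * coef / control_mean


def confidence_interval(coef, se, critical_value=1.96):
    """Two-sided interval; 1.96 is an assumed normal approximation for 95 percent."""
    half_width = critical_value * se
    return coef - half_width, coef + half_width


for label, (coef, se, mean) in TABLE6_MEASURES.items():
    print(f"{label}: {percent_of_control_mean(coef, mean):+.1f}% of control mean")

lower, upper = confidence_interval(PCA_INDEX_COEF, PCA_INDEX_SE)
print(f"PCA index 95% CI: [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions the two positive effects come out at roughly 9 and 23 percent, the negative effects fall between about 3 and 22 percent, and the upper bound for the index lies just under 0.15.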
In addition, the market-day payment treatment does not have appreciable impacts on temptation spending, irrespective of the choice of definition.

12 Table S1.5 in the supplementary online appendix shows the same results without any control variables. The results are substantively identical.

Table 5. Effects of Market Day Wage Payment on Expenditures

Columns 1 to 3: Payday survey panel, spending at market on the four payday weekends. Columns 4 to 6: Household survey data.

| Dependent variable | (1) Amount spent on Friday and Saturday, all weekends (MK) | (2) Amount spent on payday (MK) | (3) (Spending on payday) / (income received) | (4) Income received since last Friday (MK) | (5) Remaining cash out of income received since last Friday (MK) | (6) Total spending since Friday from itemized expenditure data (MK) |
|---|---|---|---|---|---|---|
| Panel A: Lump sum and weekly payment group pooled | | | | | | |
| Market day payment | −814.2*** | −28.0 | 0.013 | 17.9 | −93.6 | 127.1 |
| | (113.3) | (89.5) | (0.029) | (194.9) | (75.9) | (161.3) |
| Control-group mean | 3,293.0 | 1,688.0 | 0.622 | 3,081.0 | 579.2 | 3,147.0 |
| Number of observations | 689 | 689 | 689 | 689 | 689 | 689 |
| Panel B: Lump sum payment group only | | | | | | |
| Market day payment | −758.4*** | −23.7 | 0.038 | 161.6 | −160.9 | 186.3 |
| | (171.6) | (120.0) | (0.042) | (230.4) | (107.0) | (237.0) |
| Control-group mean | 3,068.0 | 1,247.0 | 0.534 | 3,753.0 | 670.6 | 3,341.0 |
| Number of observations | 345 | 345 | 345 | 345 | 345 | 345 |
| Panel C: Weekly payment group only | | | | | | |
| Market day payment | −810.9*** | −26.1 | −0.003 | −3.9 | −30.3 | 53.0 |
| | (153.5) | (115.1) | (0.039) | (253.9) | (114.7) | (234.3) |
| Control-group mean | 3,530.0 | 2,151.0 | 0.714 | 2,378.0 | 483.5 | 2,944.0 |
| Number of observations | 344 | 344 | 344 | 344 | 344 | 344 |

Source: Authors’ calculations based on the baseline data, the payroll survey data, and the two waves of follow-up data.
Note: Sample includes 359 respondents who participated in at least one round of the work program and have data from at least one data source for that round (either the payday data, the survey, or both). Regressions are run on pooled data from round 1 and round 2. Boldface type indicates the treatment variable of interest. 1 USD was worth approximately MK400 at market exchange rates and MK160 at PPP exchange rates during the study period. All regressions control for stratification cell fixed effects, an index of baseline asset ownership based on first principal components, indicators for the number of days after the weekend the interview occurred, baseline income, baseline total spending, and (if available) the baseline value of the outcome variable. Heteroskedasticity-robust standard errors, clustered by worker, in parentheses: *p<0.1; **p<0.05; ***p<0.01.

Table 6. Effects of Market Day Wage Payment on Temptation Spending

Columns 1 and 2: PCA indices of temptation spending measures (columns 3 to 9). Columns 3 to 9: Measures of temptation spending (MK).

| Dependent variable | (1) Omitting column 4 | (2) Including column 4 | (3) A. Waste/temptation goods | (4) B. Money wasted | (5) C. Frequently regretted | (6) D. Unplanned purchases | (7) E. Commonly against plans | (8) F. Alcohol and tobacco | (9) G. Alcohol, tobacco, donuts, and soda |
|---|---|---|---|---|---|---|---|---|---|
| Rounds available | 1&2 | 2 only | 1&2 | 2 only | 1&2 | 1&2 | 1&2 | 1&2 | 1&2 |
| Market day payment | −0.008 | −0.129 | −9.6 | −49.2 | 4.0 | −5.1 | −14.9 | 3.0 | −2.2 |
| | (0.078) | (0.126) | (25.4) | (71.2) | (16.0) | (8.4) | (20.1) | (3.2) | (5.7) |
| Control-group mean | −0.012 | 0.000 | 131.2 | 324.3 | 42.7 | 52.7 | 69.1 | 12.9 | 67.2 |
| Control-group SD | 1.234 | 1.378 | 382.7 | 719.7 | 178.6 | 122.6 | 321.4 | 44.3 | 93.7 |
| MDE on temptation spending at 80 percent power (SDs) | 0.177 | 0.256 | 0.186 | 0.277 | 0.251 | 0.191 | 0.175 | 0.204 | 0.171 |
| Number of observations | 689 | 346 | 689 | 346 | 689 | 689 | 689 | 689 | 689 |

Source: Authors’ calculations based on the baseline data and the two waves of follow-up data.
Note: Sample includes 359 respondents who participated in at least one round of the work program and have data from at least one data source for that round (either the payday data, the survey, or both). The dependent variable in column 1 is the PCA index constructed using data from both rounds, and thus it excludes the variable in column (4) since it is only available in round 2. Regressions are run on pooled data from round 1 and round 2. Boldface type indicates the treatment variable of interest. 1 USD was worth approximately MK400 at market exchange rates and MK160 at PPP exchange rates during the study period. All regressions control for stratification cell fixed effects, an index of baseline asset ownership based on first principal components, indicators for the number of days after the weekend the interview occurred, baseline income, baseline total spending, and (if available) the baseline value of the outcome variable. Heteroskedasticity-robust standard errors, clustered by worker, in parentheses: *p<0.1; **p<0.05; ***p<0.01.

4. Mechanisms for the Null Effect of Temptation

The finding of the null effect of temptation is consistent with previous research on cash transfers (Evans and Popova 2017). However, it runs contrary to a prominent strain of theoretical work (Banerjee and Mullainathan 2010) which argues that temptation plays a key role in the spending of the poor. Moreover, the previous evidence simply shows that very little of cash transfers is spent on temptation goods; this paper shows that even receiving cash in a more tempting environment seems to lead to little change in behavior. Here we discuss the potential mechanisms that may provide explanations for the null results in our study.

4.1. Statistical Power

A first possibility is that our intervention might have had meaningful effects that are simply too small for us to detect. The fourth row of table 6 presents the minimum detectable effect (MDE) size on temptation spending at 80 percent power based on our estimated standard errors.13 We have 80 percent power to detect changes of 0.17 to 0.28 SDs in temptation spending, and the MDE for the aggregate index of temptation is 0.18 SDs. For our specific temptation spending measures, we find MDEs of 0.19 SDs for goods respondents say they waste money on or are tempted to buy and 0.28 SDs for self-reported aggregate money “wasted,” with similar values for the other goods. These values correspond to 54 percent and 61 percent changes relative to the control-group mean. One estimate of the effect of exposure to temptation on temptation spending comes from Wansink, Painter, and Lee (2006), who varied whether a candy dish was located next to or further away from office workers.14 They find increases in candy consumption of nearly 100 percent.
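The MDE figures discussed in this subsection follow from the rule stated in the paper's footnote on statistical power: 2.8 times the standard error, divided by the control-group standard deviation. As a hedged illustration, the sketch below applies that rule to three columns of table 6; the function names are hypothetical and the inputs are transcribed from the table rather than recomputed from data.

```python
# Inputs transcribed from table 6: (standard error, control-group mean, control-group SD).
TABLE6_INPUTS = {
    "PCA index, rounds 1&2": (0.078, -0.012, 1.234),
    "A. Waste/temptation goods": (25.4, 131.2, 382.7),
    "B. Money wasted": (71.2, 324.3, 719.7),
}

# 2.8 is the multiplier for 80 percent power at the 5 percent level, as in the text.
POWER_MULTIPLIER = 2.8


def mde_in_sds(se, control_sd, multiplier=POWER_MULTIPLIER):
    """Minimum detectable effect in control-group standard deviations."""
    return multiplier * se / control_sd


def mde_share_of_mean(se, control_mean, multiplier=POWER_MULTIPLIER):
    """Minimum detectable effect as a share of the control-group mean."""
    return multiplier * se / control_mean


for label, (se, mean, sd) in TABLE6_INPUTS.items():
    line = f"{label}: MDE = {mde_in_sds(se, sd):.3f} SDs"
    if mean > 0:
        # The share of the mean is only meaningful for the MK-denominated measures.
        line += f" ({100 * mde_share_of_mean(se, mean):.0f}% of control mean)"
    print(line)
```

For definitions A and B this gives about 0.19 and 0.28 SDs, or roughly 54 and 61 percent of the respective control-group means, matching the values in the text.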
While the candy-dish intervention may be more tightly controlled and stronger than the one we conducted, our study is arguably well-powered based on their estimated treatment effects.

An alternative way of assessing this present study’s statistical power follows an argument made by Evans and Popova (2017), who point out that cash transfer studies that find null effects on temptation spending nevertheless do find significant treatment effects on other outcomes. This suggests that power should be less of a concern. The same logic applies to our study: while we find null effects on temptation spending, the intervention did have large and statistically significant effects on the exact timing of expenditure (table 5, column 1). We also find that our MDEs are within the range of MDEs for 80 percent power in the studies reviewed in Evans and Popova (2017), which they assess to be reasonably well-powered, albeit for a different intervention (they study changes in the level of income received while we study changes in the timing of income receipt). Overall we conclude that limited statistical power is not the main issue driving our null results: our data would let us detect treatment effects that are large enough to be of interest and consistent with the literature.

4.2. Treatment Intensity

Along similar lines, our treatment simply may not have been intense enough to produce appreciable changes in behavior. The available evidence suggests this is somewhat unlikely. Table S1.4 in the supplementary online appendix presents survey evidence from the sample of workers on the tempting nature of market days. Market days are the situation workers most commonly report as one in which they waste money or are tempted into spending (panel A). Out of the workers who report any situation as being tempting, 61 percent choose market days, and this number is by far the most common response (about four times more frequent than the second-most-common selection).
Panel B shows that large majorities of respondents find the market day more tempting than the day before the market day (69 percent) and are more tempted to spend money (they will later regret) when they have cash in their pocket at a trading center as opposed to elsewhere (66 percent). In addition, the market day is considered much more tempting than the night before (which is a common night for drinking in this setting); 74 percent of our sample found the market day to be more tempting. Temptation spending is also self-perceived to be a major driver of waste: 42 percent of workers report it as a reason they waste money (panel C).

13 The minimal detectable effect at 80 percent power is 2.8 times the standard error divided by the control group standard deviation (Ioannidis, Stanley, and Doucouliagos 2017).
14 Some of Wansink’s research has been retracted due to poor scientific practice and data that does not match the published results. However, the authors are aware of no issues that have been raised with this specific paper.

While the survey responses above strongly suggest that getting paid on market days creates a more tempting environment than getting paid the prior day, the receipt of the MK100 show-up stipend for both treatment and control on market days might have weakened the treatment. It is possible that workers from both study arms spent the show-up stipend on temptation goods, satisfying their demand (this could be the case, in particular, due to a mental accounting effect that made the “unearned” income psychologically more available for spending). For all but two of the temptation-spending measures the average in the control group is in fact smaller than the show-up stipend, which could support this view. However, the two measures for which average spending is higher than the stipend amount are the ones that we argue are conceptually superior in “Measures of Temptation Spending” in section 1.
More importantly, the show-up stipend is a very small amount of money: just 3 percent of wages for the lump-sum payment group and 12 percent for the weekly payments group. If temptation purchases are easily satisfied by such small amounts of spending, this finding still supports our overall conclusion that receipt of cash in a more tempting environment does not have practically meaningful effects on temptation spending.

A different way of thinking about treatment intensity is that even if our treatment was as strong as it could be, receiving cash in a tempting environment might simply have small effects on spending decisions. If so, our small treatment effects have important implications: they imply that even the most-tempting situation that a person could practically be exposed to will not strongly affect their behavior. This means that policymakers should worry less about the potentially temptation-enhancing aspects of a program’s design, and that economists should focus, at the margin, on other explanations for low savings rates in developing countries.

4.3. Substitution Bias

One specific reason why the treatment may have been weaker than expected is “substitution bias.” Heckman et al. (2000) define substitution bias as an attenuation in the estimates of treatment effect that occurs because the control group gets access to the treatment or to a close substitute.15 In our case, the intervention exposed treatment-group workers to an environment that was fairly tempting, but control-group workers could have chosen to substitute toward other temptation spending opportunities. Our treatment was designed around the market at Mwanamulanje Trading Centre, which operates on Saturdays (with a smaller market day on Wednesdays).
However, there are a number of other nearby trading centers that do have market days on Fridays, which was the alternate day on which workers received their pay. It is possible that the workers who are assigned to a low-temptation environment on payday (Friday at Mwanamulanje) simply substitute toward other sources of temptation, such as the market days happening elsewhere. Evidence of this potential substitution is found in the baseline survey data, presented in table S1.4 in the supplementary online appendix: while Saturdays are the most common day on which members of our sample go to the market prior to the experiment, Fridays are nearly as common (26 percent vs. 32 percent of all market visits), and 42 percent of all visits happen on a different day of the week entirely. The Mwanamulanje market day does not occur on Fridays, so this suggests that workers are visiting other markets for market day.

There are three other major markets within easy walking distance of Mwanamulanje. None of them holds an official market day on Fridays, but all have some permanent vendors that are open every day. Moreover, workers may also travel longer distances to other markets that do hold market days on Fridays. The likelihood of this possibility is mitigated somewhat by the fact that the control-group workers, who were paid on Fridays, were required to come to the market on Saturday even though they were not being paid their wages. This reduces the time available for workers to seek out other markets, making it less likely that they would visit an alternative market the day before.

As a partial test of this possibility, we look at treatment effect heterogeneity by the time it would take workers to travel to another market. Specifically, we compute the perimeter of a triangular path that starts at each worker’s home, goes to the payroll site, then to the closest alternative market to their home, and then back home again.
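The triangular path described above can be illustrated with a short sketch. The paper does not state how distances or travel times were measured, so this example assumes straight-line (great-circle) distances between latitude and longitude pairs; the function names and any coordinates passed to them are hypothetical and are not taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for the haversine formula


def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_alternative_market(home, alternative_markets):
    """Alternative market nearest to the worker's home (assumed selection rule)."""
    return min(alternative_markets, key=lambda market: haversine_km(home, market))


def triangular_path_km(home, payroll_site, alternative_markets):
    """Perimeter of the path home -> payroll site -> closest alternative market -> home."""
    market = closest_alternative_market(home, alternative_markets)
    return (
        haversine_km(home, payroll_site)
        + haversine_km(payroll_site, market)
        + haversine_km(market, home)
    )
```

A call such as `triangular_path_km(home, payroll_site, markets)` would return one value per worker, which could then be interacted with the treatment indicator to study heterogeneity. Road distances or walking times would be a natural alternative to the straight-line assumption used here.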
This travel-distance variable is included in the balance tables, and there are no significant differences across study arms. Table S1.6 in the supplementary online appendix examines treatment effect heterogeneity by this variable. There are no statistically significant differences in the treatment effects for any of the outcome measures, so we find no evidence in favor of the substitution bias mechanism.

15 “Substitution bias” also has a separate definition in the literature on price indices.

At the same time, the mandatory attendance at all paydays could have led to another form of substitution, over time instead of across space: if workers primarily save their earnings by holding cash on their persons, then workers who are paid on a non–market day may simply hold onto the cash and face the same temptations as those paid on a market day.16 This explanation helps reconcile our results with the workers’ own evaluations of markets as being extremely tempting, and with the fairly high levels of temptation spending that we observe (1–10 percent of total expenditure, depending on the definition we use). The cash-on-hand results that are discussed above suggest some role for this mechanism, although Friday-payment workers do not bring large amounts of cash with them to the market.

Even if substitution bias does explain our null results, they have important theoretical implications. If people take intentional actions to seek out other temptation-spending opportunities, then temptation spending is conceptually quite different from how it is typically conceived in economic models. It is hard to reconcile the active seeking of temptations with dual-self style theoretical frameworks in which temptations are valued only by the instantaneous, current self.

4.4. Under-Reporting

Another reason for the small measured effects of the treatment is under-reporting of temptation spending. There is evidence that socially undesirable behaviors are misreported in surveys (Mathiowetz et al. 2002), and it is very common for people to under-report spending on alcohol and tobacco. For instance, cigarette smoking is significantly under-reported compared with cigarette sales figures (Warner 1978), and survey reports of alcohol use are less than half of retail sales in the United States (Cook 2007). It is unlikely that under-reporting would be systematically related to treatment status in our setting: respondents did not know the intent of the study and had no incentive to alter their responses based on their treatment status. Still, if temptation spending is sufficiently under-reported across the board, this would cause our coefficient estimates to shrink toward zero. This is also implausible: some of the measures of temptation spending comprise 10 percent of overall money spent. In addition, two of the temptation measures allow respondents to self-designate spending as problematic (rows A and B of fig. 1), and are thus less focused on purchases that respondents would be likely to under-report. We observe higher spending on these temptation spending measures than on alcohol, tobacco, donuts, and soda. This mitigates the concerns that our results are driven primarily by under-reporting.

4.5. Precommitted Spending Plans

Alternatively, the small average treatment effects could mask important heterogeneity. Both models of temptation (e.g., Gul and Pesendorfer 2001) and dual-self models of self-control (e.g., Fudenberg and Levine 2006) imply that if workers are aware of their self-control problems, they should demand commitment devices.
If workers in our sample are aware of the temptation of the market, they need not actually succumb to temptation; instead, they can find ways to constrain their behavior through commitment devices (Bryan, Karlan, and Nelson 2010). In particular, since workers know their wage payment schedule, they may precommit to spending plans ahead of time, which would reduce the scope for temptation spending. This could be done in two ways: workers might have made promises to friends or family members or agreements with vendors, or workers might have mental accounts (Thaler 1985) that drive them to spend the money in the planned-upon fashion. Both mechanisms could constrain workers’ spending decisions and mute any temptation-spending effects.

16 The survey did not distinguish between savings kept at home and those carried on one’s person, so it is not possible to assess how common this is. Anecdotally, however, the workers in the sample commonly carry at least some of their savings on their person.

4.6. Peer Effects

Another factor that could mask the effects of the treatment is peer effects: workers picked up their pay in a queue with all other workers, including those who were not paid (but who still appeared in order to receive a small attendance incentive). Interactions with these peers might lead all workers to make similar temptation-spending decisions. For example, workers might all go buy beers or snacks together, or workers exposed to temptation might instead follow their peers and not spend money on tempting goods. Evidence on peer effects in temptation spending suggests that the spillovers are most likely to be positive (Chuang 2018). This would cause the workers to behave more similarly to one another, leading to attenuated treatment effects.
Peer effects have previously been documented in this context: Brune, Chyn, and Kerwin (2020) find evidence of workplace peer effects at an agricultural firm in the same district of Malawi. Our study’s design makes it possible to examine one potential source of peer effects, via the order in which workers queued up to receive their pay. Workers had assigned ID numbers, and the sign-in sheet was in order by number. To speed up the payroll process, the workers typically queued in the order of their names on the payroll sheet, which was sorted by village and then alphabetically by last name. Thus workers were exposed to the same neighbors in line throughout the study, and those neighbors had randomly assigned treatment statuses. We can estimate peer effects from line neighbors by including the average treatment status of the workers immediately ahead of and behind each worker in line as an additional variable in the regression equation:

$$Y_{ir} = \alpha + \beta T_{ir} + \delta \bar{T}_{-i,r} + \gamma X_{ir} + \varepsilon_{ir} \qquad (3)$$

where $\bar{T}_{-i,r} = (T_{(i-1),r} + T_{(i+1),r})/2$ is the average treatment status17 of the workers $i-1$ and $i+1$ in round $r$, and takes the values 0, 0.5, or 1.18

We find no evidence that peer effects are driving the substantive results. The estimates of equation (3) in table S1.7 in the supplementary online appendix reveal no statistically significant effect of peers’ treatment status on workers’ temptation-spending decisions.19 More importantly, the point estimates for the effect of the workers’ own treatment status are essentially unchanged relative to the estimates of equation (1) from table 6. An important limitation of this analysis is that it relies on the assumption that, if peer effects exist, they operate at least in part through the workers one interacts with in line. We have no measures of other social networks such as workers’ friends, extended families, or neighbors. A further limitation is that this test does not cover the case where peer pressure leads to no responses to the treatment whatsoever.
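The line-neighbor regressor in equation (3) can be illustrated with a minimal sketch. It assumes that the treatment statuses for one round are stored as a list of 0/1 values in queue order; the function name and the example queue are hypothetical. Workers at either end of the line use the status of their only neighbor, as footnote 17 describes.

```python
def neighbor_avg_treatment(treatment):
    """Average treatment status of each worker's immediate line neighbors.

    `treatment` is a list of 0/1 statuses in queue order for a single round.
    Interior workers get (T[i-1] + T[i+1]) / 2; the first and last workers
    use the status of their only neighbor.
    """
    n = len(treatment)
    if n < 2:
        raise ValueError("need at least two workers in the queue")
    averages = []
    for i in range(n):
        neighbors = []
        if i > 0:
            neighbors.append(treatment[i - 1])  # worker ahead in line
        if i < n - 1:
            neighbors.append(treatment[i + 1])  # worker behind in line
        averages.append(sum(neighbors) / len(neighbors))
    return averages


# Hypothetical queue of five workers (1 = treated, 0 = control).
queue = [1, 0, 0, 1, 1]
peer_exposure = neighbor_avg_treatment(queue)
print(peer_exposure)  # [0.0, 0.5, 0.5, 0.5, 1.0]
```

The resulting vector would enter the regression alongside each worker's own treatment indicator, so that the coefficient on it plays the role of δ in equation (3).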
If peer pressure does suppress all responses in this way, then workers would not respond to their peers’ treatment status either.

17 To keep the first and last members of the line in the sample, the study uses the treatment status of their only neighbor. The results are also robust to several alternative specifications: using two workers ahead of and two workers behind in the line, controlling separately for the treatment statuses of the workers ahead and behind the focal worker, and controlling separately for the treatment statuses of the 10 workers ahead of the focal worker (results available upon request).

18 This test is not affected by the exclusion bias problem of Caeyers and Fafchamps (2019) because treatment status is randomly assigned and thus independent of any worker characteristics.

19 One of the individual components has a significant difference at the 0.1 level; this is likely due to random sampling variation.

4.7. Costly Resistance to Temptation

One prediction of some models of temptation is that people can resist actually purchasing tempting goods, but must pay a utility cost to do so (Gul and Pesendorfer 2001). This suggests that the workers in our study may have been able to forgo the tempting items they faced during the market day. Models of finite willpower (Ozdenoren, Salant, and Silverman 2012) suggest that people can overcome temptations by drawing on a limited well of self-restraint in order to control their impulsive behavior. This theory predicts that people who exert greater self-control in one consumption problem will exhibit less self-control in subsequent activities. A similar implication holds if the worker simply pays a utility cost and the cost is convex. A partial empirical test of this prediction is possible.
Workers who have more exposure to other temptations should be less able to resist the temptation of the market day, and vice versa.20 If we use temptation-spending levels as a proxy for temptation exposure, this implies that the treatment should have higher effects on workers at the top of the distribution of temptation spending. Under the assumption of rank preservation, we can test this via quantile regressions. Figure S1.2 in the supplementary online appendix shows the results for our preferred index of temptation spending, which uses data from both rounds of the experiment. The results reveal no evidence of systematic differences in treatment effects across quantiles of the outcome distribution: none of the quantiles show a statistically significant effect, and the point estimates fluctuate between positive and negative. While this is not a high-powered test, it does not provide evidence that costly self-control plays a role in driving our results.

5. Conclusion

This paper examines the importance of temptation for spending by studying the effect of exposure to temptation on spending decisions. Our data come from a randomized field experiment in Malawi that varied the temptation level of the environment in which workers received cash payments by having some workers receive their pay during the major local market day, which is identified as the most-tempting local environment. Our experiment suggests that the exact context in which workers are paid may not be an important consideration for designing payment systems.
This paper finds results that are consistent with, and extend, the findings from the literature on cash transfer studies: that literature shows that receiving additional income does not raise temptation spending, while our results show directly that the specific timing and environment of income receipt do not raise temptation spending. We discuss a range of potential factors that could have led to the null effects of exposure to temptation on spending decisions. The most plausible explanation, given the evidence, is that the intensity of the treatment was low. However, our treatment was implemented well and captures one of the most tempting situations that the people in the sample ever face. This suggests that the role of exposure to temptation in driving consumption and savings choices in Sub-Saharan Africa may be less of a concern than researchers and policymakers have heretofore thought. More broadly, the findings of our study imply that the specific location or day of income receipt is not a major driver of spending decisions in a broad range of settings in rural Africa.

We also show the deficits of some temptation-spending measures and recommend measurements that use respondents’ own determination of which purchases are temptation spending to estimate the temptation effects. Spending on tempting goods such as alcohol and tobacco is widely used in studies as the definition of temptation spending. However, these commonly used definitions may conceal important patterns. People may be tempted to spend on goods other than these conventional ones, which would result in downward-biased estimates of temptation levels as well as treatment effects. We show that respondents’ own designations of temptation goods can be quite different from the conventional definitions and could provide a better estimate of actual temptation spending.
The self-reported measure of temptation spending is 8 times higher than the level of spending on tobacco and alcohol, and 11 times higher for women.

20 The use of temptation-spending levels as a proxy for outside temptation exposure comes with the caveat that it may capture other factors as well. For example, people who differ in their baseline temptation spending may also have different willpower stocks.

A few factors limit the strength of the conclusions that can be drawn from this study. While the evidence best supports one of the mechanisms, it is not possible to convincingly rule out the other six. Also, although our sample is reasonably representative of low-income households in the local region that is studied, it is possible that the treatment effects would differ for other populations of people in Malawi or in other parts of Sub-Saharan Africa or the developing world. While replicating this study in other contexts would be valuable, it would also be worthwhile for future research to examine other potential drivers of self-control problems aside from temptation spending.

References

Abadie, A., S. Athey, G. W. Imbens, and J. Wooldridge. 2017. “When Should You Adjust Standard Errors for Clustering?” Working Paper 24003. National Bureau of Economic Research. Cambridge, MA, USA.
Aker, J. C. 2017. “Comparing Cash and Voucher Transfers in a Humanitarian Context: Evidence from the Democratic Republic of Congo.” World Bank Economic Review 31 (1): 44–70.
Banerjee, A. V., and E. Duflo. 2007. “The Economic Lives of the Poor.” Journal of Economic Perspectives 21 (1): 141–168.
Banerjee, A. V., and S. Mullainathan. 2010. “The Shape of Temptation: Implications for the Economic Lives of the Poor.” Working Paper No. 257, BREAD. Cambridge, MA.
Brownback, A., A. Imas, and M. Kuhn. 2019. “Behavioral Food Subsidies.” Working Paper. SSRN.
Brune, L., X. Giné, J. Goldberg, and D. Yang. 2017. “Savings Defaults and Payment Delays for Cash Transfers: Field Experimental Evidence from Malawi.” Journal of Development Economics 129: 1–13.
Brune, L., E. Chyn, and J. Kerwin. 2020. “Peers and Motivation at Work: Evidence from a Firm Experiment in Malawi.” Journal of Human Resources, 0919.
Brune, L., and J. T. Kerwin. 2019. “Income Timing and Liquidity Constraints: Evidence from a Randomized Field Experiment.” Journal of Development Economics 138: 294–308.
Bryan, G., D. Karlan, and S. Nelson. 2010. “Commitment Devices.” Annual Review of Economics 2 (1): 671–698.
Caeyers, B., and M. Fafchamps. 2019. “Exclusion Bias in the Estimation of Peer Effects.” Working Paper No. 22565. National Bureau of Economic Research. Cambridge, MA, USA.
Chuang, Y. 2018. Self Control or Social Control Peer Effects on Temptation Consumption. Working Paper.
Cook, P. 2007. Paying the Tab: The Costs and Benefits of Alcohol Control. Princeton, NJ: Princeton University Press.
Dasso, R., and F. Fernandez. 2014. “Temptation Goods and Conditional Cash Transfers in Peru.” Working Paper.
Dupas, P., and J. Robinson. 2013. “Why Don’t the Poor Save More? Evidence from Health Savings Experiments.” American Economic Review 103 (4): 1138–1171.
Evans, D. K., and A. Popova. 2017. “Cash Transfers and Temptation Goods.” Economic Development and Cultural Change 65 (2): 189–221.
Evans, W. N., and T. J. Moore. 2011. “The Short-Term Mortality Consequences of Income Receipt.” Journal of Public Economics 95 (11–12): 1410–1424.
Fudenberg, D., and D. K. Levine. 2006. “A Dual-Self Model of Impulse Control.” American Economic Review 96 (5): 1449–1476.
Gul, F., and W. Pesendorfer. 2001. “Temptation and Self-Control.” Econometrica 69 (6): 1403–1435.
Harvey, P. 2007. “Cash-Based Responses in Emergencies.” IDS Bulletin 38 (3): 79–81.
Heckman, J., N. Hohmann, J. Smith, and M. Khoo. 2000. “Substitution and Dropout Bias in Social Experiments: A Study of an Influential Social Experiment.” Quarterly Journal of Economics 115 (2): 651–694.
Ikiara, G. K. 2009. Political Economy of Cash Transfers in Kenya (p. 28). London: Overseas Development Institute.
Imas, A., M. Kuhn, and V. Mironova. 2016. Waiting to choose. Working Paper.
Ioannidis, J. P. A., T. D. Stanley, and H. Doucouliagos. 2017. “The Power of Bias in Economics Research.” Economic Journal 127 (605): F236–F265.
Mathiowetz, N. A., C. Brown, and J. Bound. 2002. “Measurement Error in Surveys of the Low-Income Population.” In Studies of Welfare Populations: Data Collection and Research Issues. Edited by Michele Ver Ploeg, Robert A. Moffitt and Constance F. Citro, 157–195. Washington, DC: National Academy Press.
O’Donoghue, T., and M. Rabin. 1999. “Doing It Now or Later.” American Economic Review 89 (1): 103–124.
Ozdenoren, E., S. W. Salant, and D. Silverman. 2012. “Willpower and the Optimal Control of Visceral Urges.” Journal of the European Economic Association 10 (2): 342–368.
Sadoff, S., A. Samek, and C. Sprenger. 2020. “Dynamic Inconsistency in Food Choice: Experimental Evidence from Two Food Deserts.” Review of Economic Studies 87 (4): 1954–1988.
Simmons, J. P., L. D. Nelson, and U. Simonsohn. 2011. “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science 22 (11): 1359–1366.
Somville, V., and L. Vandewalle. 2018. “Saving by Default: Evidence from a Field Experiment in Rural India.” American Economic Journal: Applied Economics 10 (3): 39–66.
Spears, D. 2011. “Economic Decision-Making in Poverty Depletes Behavioral Control.” B.E. Journal of Economic Analysis & Policy 11 (1).
Thaler, R. 1985. “Mental Accounting and Consumer Choice.” Marketing Science 4 (3): 199–214.
Toussaert, S. 2018. “Eliciting Temptation and Self-Control through Menu Choices: A Lab Experiment.” Econometrica 86 (3): 859–889.
Wansink, B., J. E. Painter, and Y.-K. Lee. 2006. “The Office Candy Dish: Proximity’s Influence on Estimated and Actual Consumption.” International Journal of Obesity 30 (5): 871–875.
Warner, K. E. 1978. “Possible Increases in the Underreporting of Cigarette Consumption.” Journal of the American Statistical Association 73 (362): 314–318.
White, J. S., and S. Basu. 2016. “Does the Benefits Schedule of Cash Assistance Programs Affect the Purchase of Temptation Goods? Evidence from Peru.” Journal of Health Economics 46: 70–89.