Cash for Coolers

This paper examines a large-scale appliance replacement program in Mexico that since 2009 has helped 1.5 million households replace their old refrigerators and air-conditioners with energy-efficient models. Using household-level electric billing records from the population of Mexican residential customers we find that refrigerator replacement reduces electricity consumption by an average of 11 kilowatt hours per month, about a 7% decrease. We find that air conditioning replacement, in contrast, increases electricity consumption by an average of 6 kilowatt hours per month, with larger increases during the summer. To put these results in context we present a simple conceptual framework in which energy-efficient durable goods cost less to operate, so households use them more. This behavioral response, sometimes called the "rebound" effect, is important for air-conditioners, but not important for refrigerators.


Introduction
Supporters of energy-efficiency policies argue that they represent a "win-win", helping participants reduce energy expenditures while also reducing carbon dioxide emissions and other negative externalities associated with energy use. 1 Skeptics of energy-efficiency policies question the magnitude of the potential reductions in consumption and argue that there are important economic costs to energy-efficiency programs that tend to be overlooked. 2 These claims are difficult to evaluate, however, because there is a surprisingly small amount of direct empirical evidence.
The lack of large-scale analyses of energy-efficiency programs is surprising given the immense policy importance of these questions. Electric utilities in the United States, for example, spent $22 billion on energy-efficiency programs between 1994 and 2010, leading to a reported total savings of more than 1 million gigawatt hours of electricity. 3 Every major piece of U.S. federal energy legislation since the Energy Policy and Conservation Act of 1975 has included a substantial energy-efficiency component.
Most recently, the American Recovery and Reinvestment Act of 2009 provides $17 billion for energy-efficiency programs. 4 Moreover, many low- and middle-income countries are now adopting energy-efficiency policies. In part, this reflects a widely held view that there is an abundant supply of low-cost, high-return investments in energy-efficiency, particularly in developing countries (Johnson et al., 2009; McKinsey and Company, 2009b). Over the next several decades most of the growth in global energy demand is expected to come from the developing world. Between 2010 and 2035, energy consumption in non-OECD countries is expected to increase 85%, compared to only 18% in OECD countries. 5 Meeting this increased demand for energy will be a substantial challenge, so understanding the potential role of energy-efficiency is extremely important.
1 McKinsey and Company (2009a), for example, argues that energy-efficiency investments are a "vast, low-cost energy resource" that could reduce energy expenditures by billions of dollars per year.
2 For a recent survey see Allcott and Greenstone (2012).
3 U.S. DOE (1994-2011). Expenditures reported in year 2010 dollars.
In this paper we examine a large-scale national appliance replacement program in Mexico. Since 2009, "Cash for Coolers" (hereafter, "C4C") has helped 1.5 million households replace their old refrigerators and air conditioners. To participate in the program a household's old appliance must be at least 10 years old and the household must agree to purchase an energy-efficient appliance of the same type. These old appliances are permanently destroyed, making the program similar to "Cash for Clunkers" and other vehicle retirement programs.
The objective of this paper is to quantify the overall impact of C4C on electricity consumption and greenhouse gas emissions. To guide the empirical analysis we use a conceptual framework that emphasizes that energy demand is derived from demand for household services produced using durable goods. More energy-efficient durable goods have lower energy costs of producing those services and hence households will tend to use them more. The more price elastic the demand for service, the smaller the energy savings. Hence, the cost-effectiveness of durable good replacement programs like C4C depends not only on the number of households that are induced to replace, but also on the price elasticity of the demand for the underlying services.
We find that refrigerator replacement reduces electricity consumption by an average of 11 kilowatt hours per month, about a 7% decrease. This is considerably less than what was predicted ex ante by the World Bank and McKinsey based on engineering models that ignore behavioral responses. 6 The World Bank study, for example, predicted savings for refrigerators that were about four times larger than our estimates. While electricity savings from refrigerator replacement are smaller than was predicted, we find that air-conditioning replacement actually increases electricity consumption. The magnitude varies substantially across months, with near zero changes during the winter and 20+ kilowatt hour increases per month in the summer.
We discuss and present ancillary empirical evidence supporting several possible explanations. One important explanation for the differences between our results and the ex ante estimates is changes in appliance utilization. This is important for air-conditioners, but not important for refrigerators. In addition, we discuss how increases in appliance size and appliance features (e.g., through-the-door ice) worked to substantially offset the potential reductions in electricity consumption. Finally, we argue that many of the old appliances were probably not working or not working well at the point of replacement.
This paper helps address an urgent need for credible empirical work in this area.
Allcott and Greenstone (2012) argue that, "much of the evidence on the energy cost savings from energy-efficiency comes from engineering analyses or observational studies that can suffer from a set of well-known biases." 7 In fact, the primary source of data on energy-efficiency programs in the United States comes from self-reported measures of energy savings from utilities. Economists have long argued that these measures of energy savings are overstated (Joskow and Marron, 1992), yet reporting practices have changed little over time.
A key feature of our analysis is the use of high-quality microdata. For this analysis we were granted access to the census of household-level electric billing records for the population of residential electricity customers in Mexico. There is some precedent for econometric analysis of electric billing data, but this has mostly been in the context of measuring the responsiveness of demand to changes in prices or weather. 8 Moreover, most previous analyses have been of a much smaller scale, for example, using data from a single electric utility. The sheer number of households in our analysis allows us to estimate effects precisely even with highly non-parametric specifications.
The fact that our analysis is based on a large-scale national program gives our results an unusually high degree of external validity. Program evaluation, particularly with energy-efficiency policies, is typically based on small-scale interventions implemented, for example, in one particular location or by one particular utility. In these settings a key question is how well parameter estimates generalize across sites.
Moreover, utilities that choose to participate in these programs tend to be considerably different from the population of utilities, raising important issues of selection bias. 9 With C4C, we have a program that was available in all Mexican states so our results are nationally representative.
7 Allcott and Greenstone (2012) go on to say, "We believe that there is great potential for a new body of credible empirical work in this area, both because the questions are so important and because there are significant unexploited opportunities for randomized control trials and quasi-experimental designs that have advanced knowledge in other domains."
8 See, for example, Engle, Granger, Rice, and Weiss (1986), Reiss and White (2008), and Ito (2011).
Our study is among the first to examine energy-efficiency policy in a developing country. 10 To understand energy demand in developing countries it is critical to focus on refrigerators and other appliances. Table 1 shows that in a large group of developing countries, almost one-third of households have refrigerators while only 5% have vehicles. As incomes increase and families emerge from poverty, they first acquire refrigerators and other household appliances, and it is not until income reaches substantially higher levels that households acquire vehicles (Wolfram, Shelef, and Gertler, 2012). In the near future, we can expect a very large number of households to purchase household electric appliances.
The format of the paper is as follows. Section 2 describes a conceptual framework that helps motivate the empirical analyses which follow. Section 3 provides background information about the electricity market in Mexico and the C4C program. Sections 4 and 5 describe the data and empirical strategy and present the main results. Section 6 evaluates cost-effectiveness, calculating the implied cost of the program per unit of energy savings. Section 7 concludes.

2. A Conceptual Framework

Program Design
Before turning to C4C it is valuable to briefly lay out a general framework for evaluating appliance replacement programs. When policymakers think about these programs, they typically have in mind a tradeoff between program costs and energy savings and/or environmental benefits. Let $s$ denote the subsidy amount per household, $N(s)$ denote the number of participating households, $\Delta E$ denote the savings in energy consumption per replacement, and let $\tau$ denote external costs per unit of consumption.
Thus the net benefits from an appliance replacement program can be expressed as:

$$N(s)\,\left[\tau \Delta E - s\right].$$

Program effectiveness depends on two key behavioral parameters: (i) program participation, and (ii) energy savings per replacement. The first parameter is the number of households the subsidy induces to replace their appliances who otherwise would not have. While we can observe the number of replacements for a given subsidy, it is difficult to empirically separate households who were truly induced to replace their appliances by the program from inframarginal households who would have replaced anyway. Joskow and Marron (1992) call these households "free riders". Typically even more difficult to measure is the second parameter ($\Delta E$). Accordingly, this is where we focus most of our attention in the empirical analysis.
The characterization of the government's objective function above illustrates an important tradeoff in program design. Higher subsidies attract more participants, $\partial N / \partial s > 0$, but only increase net benefits if $\tau \Delta E > s$, that is, if the value of the avoided external costs per household exceeds the value of the subsidy. This puts a limit on how large a subsidy should be used. A program that attracts a large amount of participation is of little use if the avoided external costs do not exceed the value of the subsidy. And, even a program that performs well per participant will be of limited value if it cannot be scaled up to a large number of participants.
9 Recent work by Allcott and Mullainathan (2011) provides evidence on the limitations of external validity. Examining 14 nearly-identical energy conservation field experiments in different cities across the United States, they show that estimated effects vary by 240 percent, an amount which is both statistically and economically significant. Moreover, they find strong evidence of selection bias, that electric utilities that choose to participate in these experiments are considerably different from the population of utilities.
10 The small existing literature on energy-efficiency is focused mostly on the United States. See, for example, Dubin, Miedema, and Chandran (1986), Metcalf and Hassett (1999) and Davis (2008). There is also a related literature which uses utility-level data to evaluate energy-efficiency programs, again mostly in the United States (Joskow and Marron, 1992; Loughran and Kulick, 2004; Auffhammer, Blumstein, and Fowlie, 2008; and Arimura, Li, Newell, and Palmer, 2011). Much of what is known about energy-efficiency in developing countries comes from studies based on highly-aggregated data (see, e.g., Zhou, Levine, and Price, 2010).
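The tradeoff described in this subsection, in which net benefits equal the number of participants times the avoided external cost per household minus the subsidy, can be illustrated with a small numerical sketch. The function names and every number below are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
def net_benefits(participants, savings_kwh, external_cost_per_kwh, subsidy):
    """Participants times (avoided external cost per household minus subsidy)."""
    return participants * (external_cost_per_kwh * savings_kwh - subsidy)


def subsidy_is_justified(savings_kwh, external_cost_per_kwh, subsidy):
    """True when avoided external costs per household exceed the subsidy."""
    return external_cost_per_kwh * savings_kwh > subsidy


# Hypothetical values for illustration only (not taken from the paper).
example = net_benefits(participants=1000, savings_kwh=2000.0,
                       external_cost_per_kwh=0.05, subsidy=80.0)
print(round(example, 2))
print(subsidy_is_justified(2000.0, 0.05, 80.0))
```

Halving the assumed savings per replacement in this sketch flips the sign of the per-household term, which is why the empirical analysis concentrates on measuring savings per replacement.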

How Appliance Replacement Programs Affect Household Energy Use
To evaluate the change in energy consumption per replacement ($\Delta E$) we turn to a household production model following Hausman (1979), Dubin and McFadden (1984), and Baker, Blundell and Micklewright (1989). In these models, demand for energy is derived from demand for household services that are produced in the home according to a household production technology. Durable goods play a central role, determining the parameters of the household production technology and thus the price of different household services.
Households are assumed to choose the durable good portfolio that yields the highest level of utility,

$$\max_{j} \; V(\theta_j, p, y - r_j),$$

where $V$ is a conditional indirect utility function and $\theta_j$ is a vector of characteristics for durable good portfolio $j$. Portfolios differ in terms of characteristics $\theta_j$ and rental prices $r_j$. Appliance replacement programs like C4C affect portfolio choices by changing the rental price of particular durable good portfolios.
The decision of which portfolio to purchase is made taking into account that whatever portfolio is purchased, it will be operated at the optimal level of utilization,

$$V(\theta_j, p, y - r_j) = \max_{z,\,x,\,e} \; U(z, x) \quad \text{subject to} \quad z = f(e \mid \theta_j), \quad p\,e + x \le y - r_j.$$

As in the original household production problem described by Becker (1965), this formalizes the relationship between market inputs and services produced within the home. Household utility $U$ is defined over household services $z$ and a composite good $x$ with a price normalized to one. The production function for $z$ is denoted $f$ and depends on inputs $e$. While in general there could be an entire vector of inputs, in the simplest case there is a single input, energy. The parameters of the household production technology depend on $\theta_j$, the characteristics of the household's durable goods. These characteristics could include, for example, the energy-efficiency of the household's refrigerator. Households evaluate expenditure on inputs based on the utility derived from $z$ and the disutility of foregone consumption of composite good $x$. The budget constraint depends on a vector of input prices $p$, household income $y$, and on $r_j$, the per-period rental cost net of any available subsidy. 11
11 These models typically assume that there are no borrowing constraints so households can spread the capital costs of durable good investments over many years. Gertler, Shelef, Wolfram, and Fuchs (2011) consider analytically and empirically how borrowing constraints can affect residential energy demand.
There are two mechanisms by which a change in energy-efficiency affects energy consumption. First, an increase in energy-efficiency decreases the amount of energy used per unit of household services. For a fixed level of demand for household services an increase in energy-efficiency results in a proportional decrease in energy consumption. Second, an increase in energy-efficiency decreases the price of household services produced with that durable good. Energy-efficient durable goods cost less to operate so households may use them more.
This second mechanism is indirect but not necessarily smaller in magnitude than the first mechanism. Which effect is larger depends on the price elasticity of demand for the underlying household service.
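The interaction of the two mechanisms can be sketched numerically. The example below assumes, purely for illustration, a constant-elasticity demand for household services and a technology in which energy use equals services divided by efficiency; this functional form, the function names, and the parameter values are assumptions and are not taken from the paper. The elasticity is entered as an absolute value.

```python
def energy_use(efficiency, elasticity, energy_price=1.0, scale=1.0):
    """Energy consumed under an assumed constant-elasticity service demand."""
    service_price = energy_price / efficiency          # cost of one unit of service
    services = scale * service_price ** (-elasticity)  # demand for services
    return services / efficiency                       # energy needed to produce them


def energy_change_pct(old_efficiency, new_efficiency, elasticity):
    """Percent change in energy use when efficiency improves."""
    old = energy_use(old_efficiency, elasticity)
    new = energy_use(new_efficiency, elasticity)
    return 100.0 * (new - old) / old


# Doubling efficiency under three hypothetical elasticities.
for eta in (0.0, 0.5, 1.5):
    print(eta, round(energy_change_pct(1.0, 2.0, eta), 1))
```

Under these assumptions energy use is proportional to efficiency raised to the power of the elasticity minus one, so energy use falls when the elasticity is below one, is unchanged at exactly one, and rises when it exceeds one.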
Sometimes called the "rebound" effect, this idea that improvements in energy-efficiency may lead to increased utilization goes back at least 150 years. 13 Most of what has been written on the topic, however, has been based on introspection rather than empirical evidence. Some have argued that this behavioral response is so large that high-efficiency durable goods increase energy consumption, implicitly claiming that the price elasticity of utilization exceeds one. 14 Others have argued that utilization elasticities are considerably smaller. 15 Our view is that the magnitude depends crucially on the particular end-use in question, for several reasons.
First, a key determinant of the price elasticity of utilization is the availability of substitutes. Demand for services with few substitutes is likely to be inelastic.
Refrigeration is a good example. A household can switch entirely to non-perishable food but this requires a drastic change in diet and likely an increase in total food expenditures. For other household services there are more available substitutes. Take air-conditioning, for example. In the production of "thermal comfort" there are many possible substitutes. A household can use an electric fan, make more or different use of natural ventilation, shut curtains during the day, spend more time outdoors or at work, or wear different clothing.
Second, for some household services there simply is not much of an intensive margin. Again consider refrigeration and air-conditioning as examples. Most households leave their refrigerators plugged in 24 hours a day so there is little scope in the short-run to adjust utilization in response to a change in energy-efficiency. In contrast a household can easily adjust the level of utilization of air-conditioning.
Households can adjust the settings on an air-conditioning unit, or turn it on and off, trading off the cost of operation versus thermal comfort.
Third, the price elasticity depends on the overall level of utilization. Consider air-conditioning as an example. Above a certain income level, a household is going to choose to maintain the ideal level of thermal comfort at all hours of the day regardless of energy costs, and the price elasticity of utilization for these households is very low. At lower income levels, however, households choose to operate their air-conditioners only on particularly hot days, or during particular hours of the day. Improvements in energy-efficiency will lead these households to increase their utilization, potentially by a substantial amount.

Context and Program Rationale
Growth in residential appliances is one of the major drivers of this increase in demand. Figure 1 plots ownership rates for televisions, refrigerators, and vehicles by income level in Mexico. As incomes increase households first acquire televisions, then refrigerators and other appliances, and it is not until income reaches substantially higher levels that households acquire vehicles. Again, this is typical of developing countries, and policymakers worldwide are making appliance energy-efficiency a major point of emphasis (Gertler, Shelef, Wolfram and Fuchs, 2011). For example, development of energy-efficient appliances is one of the major initiatives of the Clean Energy Ministerial, a partnership of 20+ major economies aimed at promoting clean energy. 22 Meeting this increased energy demand will require an immense investment in generation and transmission infrastructure. The Mexican Energy Ministry has calculated that $100 billion will need to be invested in new generation and transmission infrastructure between 2010 and 2025. 23 The C4C program is viewed by policymakers as one of the ways to potentially reduce these looming capital expenditures. Part of the broader goal of our analysis is to consider whether energy-efficiency programs like C4C could serve as a substitute for these capital-intensive investments.
Much of the promise of C4C is based on ex ante engineering analyses that estimate that replacements should lead to substantial decreases in electricity consumption. In independent studies of available energy-related investments in Mexico, both the World Bank and McKinsey concluded that replacing old refrigerators and air-conditioners would be extremely cost-effective. 24 In fact, both reports found a negative net cost for these investments. That is, these were found to be investments that would pay for themselves even without accounting for carbon dioxide emissions or other externalities. At the heart of these findings are optimistic predictions about the amount of electricity saved per replacement. The World Bank report, for example, considers an intervention essentially identical to C4C, in which refrigerators 10 years or older are replaced with refrigerators meeting current standards. The World Bank predicted that these refrigerator replacements would save 482 kilowatt hours per year, with larger savings for very old refrigerators. We will revisit these predictions below, contrasting them with the results from our empirical analysis.

Program Details
Launched in March 2009, the C4C program aims to reduce electricity consumption and thereby reduce carbon dioxide emissions. Unlike Cash for Clunkers, the program has never been viewed as an economic stimulus program. 25 The program provides both direct cash payments and subsidized financing. The direct cash payments come in two different amounts, approximately corresponding to $140 and $80 (see Appendix Table 1). To qualify for the more generous subsidy a household needs to have a fairly low level of mean electricity consumption. Households with medium levels of electricity consumption are eligible for the smaller subsidy, and households with high levels of electricity consumption are eligible for subsidized financing only. This structure was implemented out of distributional concerns in an attempt to target the program to lower-income households. Mean electricity consumption is calculated over the previous year. For refrigerator replacements mean consumption is calculated over non-summer months only. For air conditioners mean consumption is calculated over summer months.
27 Refrigerators must be between 9 and 13 cubic feet, and can have a maximum size no more than 2 cubic feet larger than the refrigerator which is replaced. Air conditioners are subject to similar requirements, both for the size of the new units and for the maximum size difference between the new and old units. In addition to these eligibility requirements there are several others. The individual requesting the subsidy must have their name on the electricity bill, have a public registered ID number (CURP), be 18 years old or older, be in good standing with the electricity company (i.e., no balance), and not be an employee of the electricity company or other affiliated governmental body. For air-conditioners participants additionally must reside in relatively hot parts of the country, corresponding to electricity tariffs 1C, 1D, 1E, or 1F.
Subsidized financing is subject to eligibility requirements that are similar to those in place for the direct cash payments. The financing comes in the form of a one-time credit that is paid back over a 4-year period. The loans are offered at a preferential interest rate that is below typical rates for consumer loans in Mexico. Households need not have a credit history in order to qualify for these loans, though a household that does have a credit history can be disqualified if that history is poor. The maximum credit amount available to a participating household depends on the household's mean electricity consumption, with higher maximum amounts available to households with higher levels of consumption.
Households can accept the cash subsidy, the subsidized financing, or both. In practice, all households choose to accept the cash subsidy, but many households decide not to use the subsidized financing. In addition to these two incentives, most participants are eligible for an additional subsidy (approximately $30) that is used to pay for the transport and disposal of the old appliance. The retired appliances are transported to recycling facilities and disassembled. 30 Stores are reimbursed for the subsidy about one month after the file is completed, which includes verified receipt of the old appliance at one of the recycling facilities.
Table 3. As of 2007, there were 23 million refrigerators owned nationwide. Of these, 57% were less than 10 years old, compared to 28% between 11 and 20 years old, and 15% more than 20 years old.
32 This number comes from personal correspondence with the Mexican National Association of Electric Materials (Cámara Nacional de Materiales Eléctricos, CANAME). Based on their own internal analysis of national-level sales data, CANAME concludes that C4C has generated through March 2012 a total of 900,000 additional refrigerator sales and 160,000 additional sales of air-conditioners (both about 60% of total C4C replacements).
33 This increase in sales also suggests that the economic incidence of the subsidy is largely on households. This makes sense given the structure of the market. Supply of appliances is highly competitive in Mexico with 10+ companies involved in manufacturing refrigerators and air-conditioners, and a similar number of large national retailers. Multinational appliance companies like GE, LG, Samsung, and Daewoo have a significant presence in Mexico and the global manufacturing capacity to quickly ramp up production in response to increases in demand.
34 In a closely related market Davis and Kahn (2010)
In Mexico residential electricity is billed every two months using overlapping billing cycles.
We assign billing cycles to calendar months based on the month in which the cycle ends. We then normalize consumption to reflect monthly consumption by dividing by the number of months in the billing cycle. The average number of months per billing cycle is 1.98 months, with 93% of all cycles representing two months. An additional 5% of all cycles represent one month, with the remaining 2% representing 3+ months. These irregular billing periods arise for a variety of reasons. For example, some households in extremely rural areas have their meters read fewer than six times per year.
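The normalization described above can be sketched as follows. The record layout and the function name `normalize_cycle` are hypothetical and used only for illustration; the actual structure of the billing records is not described in the text.

```python
from datetime import date


def normalize_cycle(cycle_end, cycle_months, kwh):
    """Assign a billing cycle to the calendar month in which it ends and
    convert total consumption to consumption per month."""
    return {
        "year": cycle_end.year,
        "month": cycle_end.month,
        "kwh_per_month": kwh / cycle_months,
    }


# A hypothetical two-month cycle ending in March 2010 with 300 kWh billed.
record = normalize_cycle(date(2010, 3, 15), cycle_months=2, kwh=300.0)
print(record)
```

Dividing by the cycle length keeps one-month and 3+ month cycles on the same per-month scale as the standard two-month cycles.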

Program Take-up
Equally important for the analysis is a second dataset which describes C4C participants. These data were provided by SENER and describe all participants in the program between March 2009 and June 2011, a total of 1,162,775 participants. We drop 51,823 participants (4.5%) for whom no installation date for the new appliance was recorded. We merge the remaining data with the billing records using customer account numbers, and are able to match 86% of C4C participants with identical account numbers in the billing records. Each record in the program data includes the exact date on which the appliance was replaced, whether the appliance replaced was a refrigerator or an air-conditioner, the amount of direct cash subsidy and credit received by the participant, the reported age of the appliance that was replaced, and other program information. We drop 93 households (less than 0.01% of participants) who replaced more than one air-conditioner, leaving us with 957,080 total treatment households.

Empirical Strategy
The main empirical challenge in analyses of energy-efficiency has been the lack of high-quality microdata. Utilities are often reluctant to share billing data of the form used in this analysis and analyses with more aggregate data struggle to credibly distinguish the impact of policies from other determinants of energy consumption that are changing over time. Thus a significant part of the contribution of this analysis is simply bringing the relevant data to bear.
This section describes the estimating equation used for our baseline estimates of the effect of refrigerator and air-conditioner replacement on household electricity consumption. The basic approach is difference-in-differences. In the preferred specification, impacts are measured by comparing electricity consumption before and after appliance replacement using as a comparison group households living in the same county. The sheer size of our dataset and immense number of treatment and control households allow us to estimate effects precisely while using highly non-parametric specifications.
Our empirical approach is described by the following regression equation,

$$y_{it} = \beta_1 \, 1[\text{New Refrigerator}]_{it} + \beta_2 \, 1[\text{New Air Conditioner}]_{it} + \gamma_{im} + \omega_t + \varepsilon_{it},$$

where $y_{it}$ is electricity consumption for household $i$ in month $t$. Parameters $\beta_1$ and $\beta_2$ measure the mean change in electricity consumption associated with appliance replacement, corresponding to $\Delta E$ in the conceptual framework.
Most of the estimates that we report include household by month-of-year fixed effects, $\gamma_{im}$. That is, for each household we include 12 separate fixed effects, one for each calendar month (e.g., January, February, etc.). 36 This controls not only for time-invariant household characteristics such as the number of household members and size of the home, but also household-specific seasonal variation in electricity demand. For example, these fixed effects capture the fact that some households have air-conditioning and some do not, and that demand for air-conditioning varies differentially across the year for different households.
36 In the billing data we observe both the housing unit and the household. Consequently, we can observe when a new household moves into an existing housing unit. In the empirical analysis we treat each household / housing unit pair as a separate "household". With household by month-of-year fixed effects we are identifying the effects of C4C using only households who remain in a housing unit for at least one year.
The estimates that we report also include month-of-sample fixed effects $\omega_t$. This controls for year-to-year differences in weather as well as for population-wide trends in electricity consumption, for example, driven by increased saturation of durable goods throughout this period for reasons that have nothing to do with C4C. Many specifications, in addition, include month-of-sample by county fixed effects. This richer specification controls for county-specific variation in year-to-year weather, as well as differential population-wide trends across counties. Finally, the error term $\varepsilon_{it}$ captures unobserved differences in consumption across months. In all results we cluster standard errors at the county level to allow for arbitrary serial correlation and correlation across households within counties.
Most specifications include a group of control households that we matched to the treatment households using account numbers. Account numbers are structured so that they identify not only the state and county where the household lives, but also the specific route used by meter readers. For each C4C participant, we select the account corresponding to the closest consecutive non-participating housing unit. In many cases this is the household living immediately next door. In some cases a non-participating household is selected as the control for more than one treatment household. These matched households are a better comparison group than the entire set of non-participating households would be because of their close physical proximity to the treatment households (see Figure 2). Weather is a major determinant of electricity consumption so this matching ensures, for example, that the distribution of counties of residence among the control households is the same as the distribution among treatment households.
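The matching step can be sketched as a nearest-neighbor search along a meter-reading route. The representation of a route as an ordered list of account numbers, the function name, and the tie-breaking rule (the earlier account on the route wins) are assumptions made for illustration; the text does not describe the actual account-number format or how ties were resolved.

```python
def nearest_nonparticipant(route_accounts, participants, treated):
    """Return the non-participating account closest to `treated` on the route.

    route_accounts: account numbers in meter-reading order (assumed layout).
    participants:   set of account numbers that took part in the program.
    """
    position = route_accounts.index(treated)
    candidates = [
        (abs(i - position), i, account)
        for i, account in enumerate(route_accounts)
        if account not in participants
    ]
    if not candidates:
        return None
    # Smallest distance first; ties go to the earlier position on the route.
    return min(candidates)[2]


route = [101, 102, 103, 104, 105]   # hypothetical account numbers
treated_set = {102, 103}
print(nearest_nonparticipant(route, treated_set, 103))
print(nearest_nonparticipant(route, treated_set, 102))
```

Because each treated household is searched independently, the same non-participating account can be returned for more than one treated household, as the text notes can happen.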
Because our preferred specifications include month-of-sample by county fixed effects, identification comes from within-county, within-month comparisons between treatment and control households.
We also report estimates from regressions that are estimated using only households that participated in C4C. In these specifications we simply drop all control households and the effects of C4C are identified using within-household changes in electricity consumption. We continue to include month-of-sample by county fixed effects to control for time effects. This is possible because households replaced their appliances at different times during the sample period.
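The logic of the difference-in-differences comparison can be sketched in its simplest two-group, two-period form. This is a deliberately stripped-down illustration: it omits the household by month-of-year and county by month-of-sample fixed effects and the clustered standard errors used in the paper, and the data below are made up.

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences from a list of (treated, post, kwh) tuples."""
    def cell_mean(treated, post):
        return mean(kwh for t, p, kwh in rows if t == treated and p == post)

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    return treated_change - control_change


# Hypothetical monthly consumption (kWh) for illustration only.
rows = [
    (True, False, 150.0), (True, False, 160.0),    # treated, before replacement
    (True, True, 140.0), (True, True, 148.0),      # treated, after replacement
    (False, False, 155.0), (False, False, 165.0),  # control, before
    (False, True, 156.0), (False, True, 166.0),    # control, after
]
print(did_estimate(rows))
```

Subtracting the control group's change removes common shocks such as weather, which is the role played by the month-of-sample by county fixed effects in the full specification.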

Graphical Results
This subsection presents graphical results intended to motivate the regression analyses that follow. We begin in this section with refrigerators rather than air-conditioners because they make up 90% of all appliance replacements, and because refrigerators lend themselves well to the "event study" analysis performed here. Whereas refrigerator electricity consumption is approximately constant across months of the year, air-conditioning usage has a strong seasonal pattern which is better examined in a regression context.
In the event-study regression, the event month is defined so that it equals 0 for the exact month in which the refrigerator is delivered, -12 for twelve months before replacement, 12 for twelve months after replacement, and so on. The coefficients are measured relative to the excluded category (event month -1). Both sets of fixed effects play an important role here. Without the county by month-of-sample fixed effects, for example, the effect of replacement is confounded with seasonal effects as well as slow-moving population-wide changes in residential electricity consumption.
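The event-month definition above can be made concrete with a short sketch. The function names are hypothetical, and the 12-month window on either side of replacement is an assumption chosen to match the examples in the text.

```python
def event_month(bill_year, bill_month, repl_year, repl_month):
    """Months elapsed between a billing month and the replacement month."""
    return (bill_year - repl_year) * 12 + (bill_month - repl_month)

def event_dummies(event, window=12, omitted=-1):
    """Indicator for each event month in [-window, window], excluding the omitted category."""
    return {k: int(event == k)
            for k in range(-window, window + 1)
            if k != omitted}

# A refrigerator delivered in March 2010, with a bill observed in March 2011.
e = event_month(2011, 3, 2010, 3)
d = event_dummies(e)
```

Because event month -1 is omitted, a household observed one month before replacement has all indicators equal to zero, so every estimated coefficient is read relative to that month.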
During the months leading up to replacement, electricity consumption is almost perfectly flat, suggesting that the fixed effects are adequately controlling for seasonal effects and underlying trends. Beginning with replacement, electricity consumption falls sharply by approximately 10 kilowatt hours per month. Consumption then continues to fall very gradually over the following year. We attribute the fact that the decrease appears to take a couple of months to the bimonthly billing cycles that underlie these data and to a modest amount of measurement error in the replacement dates. The gradual decline between months +2 and +12 reflects modest compositional changes in the treatment households. 37 In all periods the coefficients are estimated with enough precision to rule out small changes in consumption in either direction.
With Figure 4 we perform the exact same exercise but using households who did not participate in C4C. Our sample is the set of all households that were matched to households that replaced their refrigerators. We assigned each household the replacement date of the household to which that household is matched. We then constructed event time before and after t=0 exactly as we did for Figure 3. The figure exhibits no discernible pattern, with estimated coefficients near zero and statistically insignificant in all months, providing additional corroboration that our fixed effects are adequately controlling for seasonal effects and underlying trends.
37 We show later that households who replaced refrigerators early in the program tended to save more electricity after replacement. These households are disproportionately represented as one moves from left to right in the figure. For example, whereas all treatment households are observed at t=0, only households who replaced during the first year are represented at t=12.

Baseline Estimates
Column (1) includes household by calendar month and month-of-sample fixed effects. In this specification, refrigerator replacement decreases electricity consumption by 11.2 kilowatt hours per month. This is similar in magnitude to the difference observed in the event study figure. Mean electricity consumption among households who replaced their refrigerators is about 150 kilowatt hours per month, so this is a 7% decrease. Whereas refrigerator replacement decreases electricity consumption, the estimates indicate that air-conditioning replacement increases consumption by about 8.5 kilowatt hours per month. Mean electricity consumption among households who replaced their air-conditioners is about 400 kilowatt hours per month, so this is a 2% increase.
Column (2)  intervention period. However, we do find small but statistically significant differences in the pre-intervention period. A common correction for this is to include a linear time trend for program participants. The results are essentially unchanged when in column (4) we do indeed include a linear time trend for participants following Heckman and Hotz (1989). In Appendix Table 2 we report results from including a quadratic and cubic polynomial time trend for participants and results are again similar. This suggests that the statistical significance is driven more by our large sample size than by meaningfully differential trends.
Columns (5) and (6) present results from alternative specifications. In both columns we drop all control households, and estimate the regressions using only households who participated in the program. Column (6), in addition, drops the month during which replacement occurred. In these regressions, we are controlling for time effects by exploiting differential timing of replacement across sample months. The estimates change little in these specifications, suggesting that what matters most in these regressions is the within-household comparison. In additional results, not reported, we have also estimated regressions using a random sample of non-participating households. 38 Results are again very similar. Although we prefer to use the matched comparison group, we find it reassuring that the results are similar regardless of which comparison group we use, or whether we use a comparison group at all.
38 An additional possible specification would be to use a matched comparison group, in which the matching is done both on geography and historical electricity usage. Given the insensitivity of our results, however, we think that this would be highly unlikely to change the results. Moreover, matching on the basis of historical consumption would require us to exclude households who replace in the first couple of months in our sample.

Heterogeneous Effects
newly eligible households tend to have less to gain from replacement, and as time goes on an increasing proportion of the participants are those that just barely meet the eligibility requirements, for example, those with appliances that are exactly 10 years old in 2010 or 2011.
39 We merged in the 2010 Mexican Census data using state and county names. For 93% of households there was an exact match for both state and county. Requiring an exact match for the state and using probabilistic string matching for the county we increased this to 98%. For the remaining 2% of households we used the mean characteristics of the state.
Continuing to explore heterogeneity in the effect of appliance replacement we now turn to variation across months of the year. Figures 5A and 5B  Estimates are close to zero during winter months, but then large and positive during summer months. The largest coefficient corresponds to September. Because the billing data are bimonthly, this reflects changes in consumption during August and September, two of the warmest months in Mexico.

Comparing Our Results to Ex Ante Predictions
Available ex ante analyses predicted that the savings from appliance replacement
Hausman (1979), finds that the elasticity of utilization for room air-conditioners is 0.27. It is worth highlighting, however, that like most of the existing empirical work in this area, this estimate comes from the United States. At lower income levels and lower baseline utilization levels the price elasticity is going to be higher. Particularly important is where a household is relative to its "bliss point" for ambient temperature. A household far away from its optimal ambient temperature has more scope to change utilization in response to an improvement in energy-efficiency.
45 Here there is a bit of a distinction between C4C and most vehicle retirement programs. "Cash for Clunkers", for example, required vehicles to have been registered for at least 12 months prior to being traded. There is no equivalent registration system for appliances, making it more likely that a program brings in appliances that are not actually being used.
Second, appliance sizes have increased over time. Both refrigerators and air-conditioners were supposed to meet specific size requirements. New refrigerators were supposed to be between 9 and 13 cubic feet, and have a maximum size no more than 2 cubic feet larger than the refrigerator which is replaced. Similar requirements were imposed for air conditioners. Again, however, it is likely that enforcement was less than
46 The current standard both in the United States and Mexico specifies that refrigerators with top-mounted freezer and automatic defrost without through-the-door ice have a maximum energy use of 9.80AV+276.0 where AV is the total adjusted volume in cubic feet. Under C4C refrigerators had to be between 9 and 13 cubic feet, implying a range of minimum consumption from 364 to 403 kilowatt hours per year.
47 Current energy-efficiency standards both in the United States and Mexico provide separate requirements for refrigerators with and without through-the-door ice. Refrigerators without through-the-door ice have a maximum energy use of 9.80AV+276.0 where AV is the total adjusted volume in cubic feet. The equivalent formula for refrigerators with through-the-door ice is 10.20AV+356.0.
Fourth, there is no evidence that the program was particularly effective at targeting households with very old appliances. Program rules required appliances to be at least 10 years old and a disproportionate fraction of old appliances were reported to just barely meet this requirement. The average reported age of the refrigerators that were replaced is 13.2 years. Almost 70% were 10-14 years old, 20% were 15-19, and only 10% were 20 years or older. 48 The average reported age for air-conditioners is 10.9 years and only 5% of the air-conditioners that were replaced are more than 15 years old. There is likely to be a large amount of measurement error in these self-reported ages, but this apparent lack of success at targeting very old appliances is potentially important because energy-efficiency has improved steadily over time (see Figure 6). 49
48 There are a couple of possible explanations for the apparent lack of success at targeting very old refrigerators. One possible explanation lies in the eligibility criteria themselves. As we discussed earlier, the size of the subsidy is decreasing in household electricity consumption. This means that, ironically, households with very inefficient appliances tend to not qualify for the most generous subsidies. Another explanation is income. Low-income households tend to have older appliances, and also may be less likely to participate in the program due to borrowing constraints or other factors.

Cost-Effectiveness

Baseline Estimates
During this period there were about 850,000 refrigerator replacements and about 100,000 air-conditioning replacements. Our estimates imply that total reduction in electricity consumption associated with the program is about 100 gigawatt hours annually. As a
Panel (C) reports baseline estimates of cost-effectiveness. Based on the total number of participants and the subsidies that they received we calculate that direct program costs were $130 million for refrigerators, and $13 million for air-conditioners. This includes the cash subsidies received by households, but not costs incurred in program design, administration, advertising, or other indirect costs. Evidence from previous studies indicates that these indirect costs are important (Joskow and Marron, 1992). Dividing these costs by the estimated change in electricity consumption provides a measure of the direct program cost per kilowatt hour reduction. The relevant change here is the total discounted lifetime change in electricity consumption. For this calculation we adopt a 5% annual discount rate and assume that the program accelerated appliance replacement by 5 years. Under these assumptions the program cost per kilowatt hour is $0.25 for refrigerators and $0.30 overall. We do not report program cost per kilowatt hour separately for air-conditioners because the program led to an increase rather than a decrease in consumption. The program cost per ton of carbon dioxide emissions can be calculated similarly. For both refrigerators-only and for the entire program this exceeds $400 per ton.
54 These calculations capture the energy consumed in refrigerator operation but not the energy consumed in other parts of the refrigerator "life-cycle". Taking into account materials production and processing, assembly, transportation, dismantling, recycling, shredding, and recovery of refrigerant, Kim, Keoleian, and Horie (2006) find that energy usage during operation accounts for 90% of total life-cycle energy use.
55 Greenstone, Kopits, and Wolverton (2011) present a range of values for the social cost of carbon dioxide according to different discount rates and for different time periods that is intended to capture changes in net agricultural productivity, human health, property damages from increased flood risk, and other factors. In Table 4 with a 3% discount rate (their "central value") for 2010 they find a social cost of carbon dioxide of $21.40 (in 2007 dollars) per metric ton of carbon dioxide. In 2010 dollars this is approximately $22.
These estimates of cost-effectiveness change predictably under alternative assumptions. The choice of a 5% discount rate is fairly standard in the literature (see, e.g., Arimura, Li, Newell, and Palmer 2011). With a 0% discount rate the program cost per kilowatt hour is $0.27 compared to $0.30, and the program cost per ton of carbon dioxide is $460 compared to $506. The measures of cost-effectiveness are more sensitive to the assumption about how many years over which to calculate lifetime benefits. We have assumed that the program accelerates appliance replacement by 5 years but it seems likely that many of these participants were "free riders", i.e., households who would have replaced their appliances anyway, in which case the program does not accelerate replacement at all and 5 years is too generous. On the other hand, even for "free riders", the program prevents appliances from being resold to other households who might have continued to use them for many years. If one assumes that the program accelerated appliance retirement by 10 years, then the program cost per kilowatt hour is $0.17 compared to $0.30, and the program cost per ton of carbon dioxide is $283 compared to $506.
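The cost-per-kilowatt-hour figures can be approximated from the rounded inputs quoted in the text. The sketch below is not the authors' code. It assumes that annual changes in consumption are discounted from the start of each year (the first year undiscounted), which is a timing convention the text does not state, and it uses the rounded counts of 850,000 refrigerator and 100,000 air-conditioner replacements.

```python
def annuity_factor(rate, years):
    """Present value of one unit per year, first year undiscounted (assumed timing)."""
    return sum(1.0 / (1.0 + rate) ** t for t in range(years))

def cost_per_kwh(program_cost, annual_kwh_saved, rate=0.05, years=5):
    """Direct program cost divided by discounted lifetime kWh saved."""
    return program_cost / (annual_kwh_saved * annuity_factor(rate, years))

# Rounded inputs from the text: households, monthly kWh change, direct costs.
fridge_kwh = 850_000 * 11.2 * 12      # annual savings from refrigerators
ac_kwh = 100_000 * 8.5 * 12           # annual increase from air-conditioners
net_kwh = fridge_kwh - ac_kwh         # about 104 GWh per year

fridge_only = cost_per_kwh(130e6, fridge_kwh)             # baseline, refrigerators
overall = cost_per_kwh(143e6, net_kwh)                    # baseline, whole program
overall_no_discount = cost_per_kwh(143e6, net_kwh, rate=0.0)
overall_ten_years = cost_per_kwh(143e6, net_kwh, years=10)
```

Under these assumptions the four values round to $0.25, $0.30, $0.27, and $0.17 per kilowatt hour, in line with the figures reported in the text. The horizon matters more than the discount rate because doubling the number of years nearly doubles the discounted savings in the denominator.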
Some have argued that C4C would have been much more cost-effective if participants had been required to purchase more energy-efficient appliances. Program rules required participants to purchase refrigerators and air-conditioners that exceeded the minimum energy-efficiency standards by 5%. These standards date back to 2002, and the market for both refrigerators and air-conditioners has moved considerably past this.
The United States, for example, has adopted new energy-efficiency standards for refrigerators that will take effect in 2014 that are about 25% more energy-efficient than current standards. A typical refrigerator meeting these more stringent standards uses 63 fewer kilowatt hours annually. 56 Had the refrigerator replacements saved 63 more kilowatt hours per year, the program cost per kilowatt hour (for refrigerators) would have been $0.17 compared to $0.25, and the program cost per ton of carbon dioxide (for refrigerators) would have been $289 compared to $427.
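The counterfactual for more stringent standards can be sketched in the same way. As before, this is an illustration under assumed conventions rather than the authors' calculation: savings are discounted at 5% over 5 years with the first year undiscounted, and the household count is the rounded 850,000 quoted earlier.

```python
def refrigerator_cost_per_kwh(kwh_saved_per_household_year,
                              program_cost=130e6,
                              households=850_000,
                              rate=0.05,
                              years=5):
    """Direct program cost per discounted kWh saved, refrigerators only."""
    discount = sum((1.0 + rate) ** -t for t in range(years))
    lifetime_kwh = households * kwh_saved_per_household_year * discount
    return program_cost / lifetime_kwh

observed_savings = 11.2 * 12                 # kWh per household per year
baseline = refrigerator_cost_per_kwh(observed_savings)
stricter = refrigerator_cost_per_kwh(observed_savings + 63)
```

With these inputs the baseline rounds to $0.25 and the stricter-standard case to $0.17 per kilowatt hour, matching the comparison in the text.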

Discussion and Limitations
These estimates of the program cost per kilowatt hour avoided and per ton of carbon dioxide abated are high compared to most available estimates in the literature.
For example, electric utilities in the United States reported in 2010 spending $2.9 billion on energy-efficiency programs leading to 87 terawatt hours of energy savings, implying an average direct program cost per kilowatt hour of 3.3 cents. 57 Economists have long argued that these self-reported measures likely overstate the cost-effectiveness of these programs (Joskow and Marron, 1992). Nonetheless, it is striking that our estimate for C4C is 9 times larger. Allcott (2011)
It is important to emphasize that program cost per kilowatt hour avoided and program cost per ton of carbon dioxide abated are both measures of cost-effectiveness, and do not capture the full set of welfare implications of a program like C4C. Although commonly used in previous studies of energy-efficiency, these measures miss several important program components on both the cost and benefit side, and don't carefully distinguish between private and social costs and benefits. For example, a substantial benefit from the program is that households receive utility from using newer, more feature-rich appliances. With refrigerators, households enjoy better insulation, and in some cases automatic ice-makers, through-the-door water and ice and other features.
There may even be modest health benefits in the form of improved cooling and temperature consistency leading to less spoilage. These increases in utility were not the primary objective of the program, but they are a substantial source of benefits.
These measures also capture only some of the total costs of the program. These measures of cost-effectiveness include the direct cost of the subsidies, but not the indirect costs from administering the program. Another important component in the full welfare analysis is the private costs borne by households to purchase these appliances.
Even the households in C4C who qualified for the most generous subsidies ended up paying for a large part of the price of these appliances out of pocket. A comprehensive welfare analysis would also want to take into account the costs borne by buyers in the secondary market. These households are perhaps the biggest losers from the program, now without access to the over one million old refrigerators and air-conditioners that have been destroyed. These older appliances have real economic value, particularly in a country like Mexico where appliance saturation is less than 100% and electricity rates are subsidized.
The sense in which these cost-effectiveness numbers make sense is, instead, as a metric for comparing tradeoffs between government expenditure, electricity consumption, and the environment. As we noted at the beginning, when policymakers envision appliance replacement programs they typically have in mind these tradeoffs, so it makes sense to evaluate the program on this basis. And, along this dimension, C4C is an expensive policy. Viewed as strictly a tradeoff between government expenditure and the environment, the Mexican government could do much better buying permits in the European Union Emissions Trading System and tearing them up. As of March 2012, the price of a permit is about $10 per ton. Thus, purchasing permits and tearing them up would allow the Mexican government to "buy" carbon dioxide abatement at a cost about 1/50th the cost of C4C.
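The benchmark comparisons in this section reduce to simple ratios, which the following sketch makes explicit. All inputs are the rounded figures quoted in the text; the variable names are illustrative.

```python
# U.S. utility programs in 2010: reported spending and reported savings.
utility_spending = 2.9e9            # dollars
utility_savings_kwh = 87e9          # 87 terawatt hours expressed in kWh
utility_cost_per_kwh = utility_spending / utility_savings_kwh

# C4C direct program cost per kWh relative to the utility benchmark.
c4c_cost_per_kwh = 0.30
c4c_vs_utilities = c4c_cost_per_kwh / utility_cost_per_kwh

# C4C cost per ton of carbon dioxide relative to an emissions permit.
c4c_cost_per_ton = 500.0            # approximate, dollars per ton
permit_price_per_ton = 10.0         # approximate price as of March 2012
c4c_vs_permits = c4c_cost_per_ton / permit_price_per_ton
```

The first ratio is about 9 and the second is 50, which is the sense in which permits would deliver abatement at roughly 1/50th the cost.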

Conclusion
At first glance, there would seem to be much to like about an appliance replacement program like C4C. Over the last 30 years residential appliances have experienced unprecedented gains in energy-efficiency, so there would seem to be scope for these programs to substantially decrease energy consumption. Moreover, residential appliances like refrigerators are long-lived durable goods with a low baseline replacement rate, so it seems reasonable to believe that a subsidy could substantially accelerate their turnover.
Thus it is hard to not be somewhat disappointed by the program results. We found that households who replace their refrigerators with energy-efficient models indeed decrease their energy consumption, but by an amount considerably smaller than was predicted by ex ante analyses. Even larger decreases were predicted for air-conditioners, but we find that households who replace their air-conditioners actually end up increasing their electricity consumption. Our results indicate that C4C reduces electricity consumption at a program cost of about $0.30 per kilowatt hour, and reduces carbon dioxide emissions at a program cost of about $500 per ton.
These results underscore the urgent need for careful modeling of household behavior. A central feature in our household production framework, and a key theme throughout the study, has been the importance of accounting for changes in utilization.
A nice feature of the analysis is that refrigerators and air-conditioners occupy different ends of the spectrum along this dimension, making comparisons particularly interesting.
Households receive utility from using these appliances, and they can and should increase utilization when upgrading to more energy-efficient appliances. This "rebound" is a good thing: it means that households are increasing their utility. It does, however, complicate the design of energy-efficiency policy and, ceteris paribus, in pursuing environmental goals it will make sense for policymakers to target appliances for which demand for utilization is inelastic.
More broadly our results point to several lessons for the design and evaluation of energy-efficiency policies. Over time appliances have become more energy-efficient, but also bigger and better. These size and quality increases are another form of the demand for increased utilization, and it makes sense to take them into account when designing policy. Also, despite attempts by administrators to build enforcement mechanisms into the program design, it is difficult to prevent people from receiving subsidies for nonworking durable goods. While one can envision third-party enforcement mechanisms, this would add cost to the program and be susceptible to fraud.

FIGURE 2a
Comparing Participants to Non-Participants: Refrigerators

FIGURE 2b
Comparing Participants to Non-Participants: Air Conditioners
Note: These figures plot average electricity consumption by calendar month for households who replaced their refrigerators and air-conditioners through the C4C program ("participating households"), non-participating households matched to these treatment households using account number information ("matched non-participating households") and all non-participating households. For all households the sample is restricted to observations from the first year of the program (May 2009-April 2010). Additionally, for treatment households the sample is limited to those who adopted after the first year of the program (May 2010-April 2011). This restriction ensures that the figure describes pre-treatment means, i.e., from before households receive a new appliance.

FIGURE 3
The Effect of Refrigerator Replacement on Household Electricity Consumption
Note: This figure plots estimated coefficients and 95th percentile confidence intervals describing monthly electricity consumption before and after refrigerator replacement. Time is normalized relative to the delivery month of the appliance (t=0) and the excluded category is t=-1. The sample includes 858,962 households who received new refrigerators through C4C between March 2009 and May 2011 and an equal number of non-participating control households matched to treatment households using account number information. The regression includes household and county by month-of-sample fixed effects. Standard errors are clustered by county.

FIGURE 4
Assessing the Validity of the Comparison Group
Note: This figure is constructed similarly to Figure 3 but using households who did not participate in C4C. The sample includes 858,962 non-participating households matched using account number information to households who replaced their refrigerators through the C4C program. Each household is assigned the replacement date for the household to which that household is matched. Time is normalized relative to this month (t=0) and the excluded category is t=-1. The regression includes household and county by month-of-sample fixed effects. Standard errors are clustered by county.

Notes: This table describes data from the Mexican National Census (Censo de Poblacion y Vivienda) from the years indicated in the column headings. These statistics were compiled by the authors using microdata from the long-form survey which is completed by a 10% representative sample of all Mexican households. All statistics are calculated using sampling weights. We have cross-checked total population, number of households, and appliance saturation at the national and state level against published summary statistics and the measures correspond closely. Improved flooring includes any type of home flooring except for dirt floors.

In all regressions the dependent variable is monthly electricity consumption in kilowatt hours and the coefficients of interest correspond to indicator variables for households who have replaced their refrigerator or air-conditioner through C4C. The sample includes billing records from May 2009 through April 2011 from the complete set of households that participated in the program and a sample of non-participating households matched to treatment households using account number information. Mean electricity use is 153 kilowatt hours and 395 kilowatt hours per month for households who replaced refrigerators and air conditioners, respectively. Standard errors are clustered by county. Double asterisks denote statistical significance at the 1% level; single asterisks at the 5% level.

Notes: This table reports coefficient estimates and standard errors from six separate regressions, three per panel. In each regression the sample is restricted to a subset of C4C participants as indicated in the row headings, along with a matched sample of non-participating households. In all regressions the dependent variable is monthly electricity consumption in kilowatt hours. Coefficients are reported from indicator variables for whether the household had replaced their refrigerator or air conditioner. All regressions include household by calendar month and county by month-of-sample fixed effects. Standard errors are clustered by county. Double asterisks denote statistical significance at the 1% level; single asterisks at the 5% level. The total number of households in each panel is slightly larger than the sample size in Table 3 because 486 households replaced both a refrigerator and an air-conditioner.

Notes: This table reports coefficient estimates and standard errors from four separate regressions. In all regressions the dependent variable is monthly electricity consumption in kilowatt hours and the coefficients of interest correspond to indicator variables for households who have replaced their refrigerator or air-conditioner through C4C. Column (2) includes an interaction between month-of-sample and an indicator for whether the household participated in C4C. Column (3) contains that same term as well as an interaction between month-of-sample squared and an indicator for participation. Column (4) contains both of those terms plus an interaction between month-of-sample cubed and an indicator for participation. The sample includes billing records from May 2009 through April 2011 from the complete set of households that participated in the program and a sample of non-participating households matched to treatment households using account number information. Standard errors are clustered by county to allow for arbitrary serial correlation and correlation across households within municipalities.

Notes: This table describes data from the Mexican Encuesta Nacional de Ingreso y Gasto en los Hogares (ENIGH), a biannual survey conducted August through November. These statistics were compiled by the authors using microdata from the long-form survey which is completed by a representative sample of Mexican households. All statistics are calculated using sampling weights. We have cross-checked total population and number of households at the national and state level against published summary statistics and the measures correspond closely. We also carefully examined, but are not reporting, expenditure information in ENIGH including, in particular, purchases of refrigerators and air conditioners. We found that the implied number of purchases in these data varied by an unreasonably large amount across years and implied a total number of purchases considerably lower than total annual purchases for those appliances according to Mexican industry sources.