Urban CO2 Emissions: A Global Analysis with New Satellite Data

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

We sort cities into deciles by mean regression residual, with the largest negative residuals in the first decile, and identify the subset of 132 global subway cities in each decile. We find that subway cities are four times more common among first-decile cities than among tenth-decile cities. We also find that the representation of subway cities declines across deciles. While these results provide strong suggestive support for the non-Pigouvian view, they are subject to potential endogeneity that should be considered in future research.


Policy Research Working Paper 9845
This paper estimates an urban carbon dioxide emissions model using satellite-measured carbon dioxide concentrations from 2014 to 2020, for 1,236 cities in 138 countries. The model incorporates the global trend in carbon dioxide concentration, seasonal fluctuations by hemisphere, and a large set of georeferenced variables that incorporate carbon dioxide-intensive industry structure, emissions from agricultural and forest fires in neighboring areas, demography, the component of income that is uncorrelated with industry structure, and relevant geographic conditions. The income results provide the first test of an Environmental Kuznets Curve relationship for carbon dioxide based on actual observations. They suggest an environmental Kuznets curve that reaches a peak near or above $40,000 per capita, which is at the 90th percentile internationally. The research also finds that economic development has a significant effect on the direction of the relationship between population density and carbon dioxide emissions. The relationship is positive at very low incomes but becomes negative at higher incomes. The paper also uses cities' mean regression residuals to index their carbon dioxide emissions performance within and across regions, decomposes model carbon dioxide predictions into broad source categories for each city, and uses the regression residuals to explore the impact of subway systems. The findings show significantly lower carbon dioxide emissions for subway cities.
This paper is a product of the Urban, Disaster Risk Management, Resilience and Land Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at sdasgupta@worldbank.org, slall1@worldbank.org, and wheelrdr@gmail.com.

Third, we use the results to estimate the expected CO2 concentration for each city, given its geographic, demographic and economic characteristics. We explore the implications using regression residuals as performance indicators that identify cities whose CO2 emissions are less than or greater than model-based expectations. This provides the first empirical scorecard of city performance in CO2 emissions management.
Fourth, we use the econometric results to address a critical policy question on the role of public investment as a supplement to Pigouvian policy in mitigating CO2 emissions. Economists generally support emissions reduction via emissions taxation or permit trading (Stiglitz and Stern 2021; Jacobs and van der Ploeg 2019; King et al. 2019; Klenert et al. 2018). Many policy analysts who support Pigouvian pricing also argue for a non-Pigouvian supplement: coordinated public investment in low-carbon land development, energy and transport that will accelerate the transition to low-carbon economies (van der Ploeg and Venables 2020). Investments in subway systems provide an interesting test case for the non-Pigouvian approach. Reduced motor vehicle emissions and energy efficiencies associated with higher-density development are frequently cited as carbon-saving advantages of mass transit systems. By implication, cities with subways should have followed lower-carbon development paths, other things equal. We explore this proposition with our regression residuals, drawing on a recent global survey of subway systems (Turner and Gonzalez-Navarro 2018).
The remainder of the paper is organized as follows. Section 2 introduces the econometric model, while Section 3 describes the data. Section 4 performs an exploratory analysis of the data and the econometric results are presented in Section 5. Section 6 explores the implications, Section 7 offers a prospectus for future research, and Section 8 summarizes and concludes the paper.

Modeling Urban CO2 Concentrations
An extensive empirical literature has explored the determinants of CO2 emissions growth. Most of the attention has focused on drivers of CO2 emissions from fossil fuel combustion and cement production (e.g., Raupach et al. 2007; Jotzo et al. 2012). Land-use change also produces large CO2 emissions, which have been estimated more precisely by recent research (Gasser et al. 2020; Winkler et al. 2021). However, emissions drivers in this sector have received less attention than industrial determinants (Sanchez and Stern 2016). For both industrial and non-industrial sectors, previous studies have relied almost exclusively on estimates from CO2 emissions inventories that apply engineering parameters to measures of activity in industry, transport, land-clearing, residential heating and other sectors. Standard engineering estimates are particularly suspect for developing countries, because many of the parameters have been calibrated using databases and models developed for high-income economies.
This study takes a completely different approach, employing only direct CO2 observations from satellites. The dependent variable in our model is the atmospheric CO2 concentration above an urban area. For climate change analysis, the dominant concentration component is the global cumulative stock of long-duration CO2 molecules that have been emitted by human activity since the Industrial Revolution. Another global component is seasonal, reflecting differential CO2 absorption and respiration by vegetation over the annual cycle. The seasonal CO2 component is more pronounced in the Northern Hemisphere because it has more plant life than the Southern Hemisphere. The third component is local, reflecting the time lag between local emissions of CO2 molecules and their full dispersion into the global mix. In this paper, we use the term "concentration anomaly" for the local component because it measures the deviation from the global background CO2 concentration.
Our model incorporates the two global components in a global time trend and controls for seasonality and hemisphere location. The local component comprises variables in three broad categories. The first includes activities in the most critical CO2-emitting sectors. The Intergovernmental Panel on Climate Change (IPCC) (Gale et al. 2005) has identified four dominant industrial sources of CO2 emissions: power plants, steel mills, cement plants, and oil refineries. Another potentially-important factor is atmospheric "spillover" from agricultural and forest burning in neighboring areas. Traffic emissions are also important, but we do not have reliable measures of motor vehicle operation for most of the cities in our sample. The model therefore absorbs motor vehicle operations into the non-industrial income effects that are described below.
The second category includes demographic and economic variables. CO2 emissions should increase with urban population, ceteris paribus, because each resident accounts for some emissions. The spatial distribution of population may also play a significant role. In high-income areas that rely heavily on mechanized transport, increased population density may lower aggregate emissions by reducing travel requirements. On the other hand, higher population density in low-income areas may translate to higher CO2 emissions because small reductions in sparse mechanized transport do not offset increased CO2 emissions from factors like more concentrated household cooking and heating.
Economic development also has countervailing effects on CO2 emissions intensity. Higher-income urbanites use more goods and services, some of which are CO2-intensive (e.g., home heating, motor vehicles). However, locational economics inhibit CO2-intensive industrial activities in higher-income urban areas with higher land costs, fewer active resource mining sites, and stricter control of local air pollutants that are emitted along with CO2. In consequence, higher-income cities tend to import CO2-intensive goods and services from areas where the converse conditions hold. The net effect of urban economic development on CO2 emissions depends on the relative strength of direct income effects and indirect displacement and pollution control effects. Empirical research on the relationship between CO2 emissions and income has relied almost entirely on country-level emissions inventories. The results are mixed; some studies find a linear relationship, while others identify an inverse U-shaped relationship, or Environmental Kuznets Curve (EKC) (Ben Youssef, Hammoudeh and Omri 2016; Dasgupta et al. 2002). Where the EKC holds, domination passes from direct effects to displacement and pollution control effects as income rises. With satellite-based CO2 measures, the EKC investigation in this paper departs from conventional practice by avoiding emissions inventories and focusing on local rather than national CO2 emissions.
Climate comprises the third category. Other things equal, we would expect greater annual heating-related CO2 emissions from cities in cold climates.
We specify a linear estimation model because the atmospheric CO2 load should be additive in CO2 emissions from different sources. Spatially-referenced variables in the model are translated to consistent measures by resampling to centroids for grid cells with a resolution of 10 km. We allow for measurement "spillover" as emissions diffuse from source cells to neighboring grid cells.
We incorporate all of the previously mentioned factors in our estimation model. Our prior expectations for parameter signs are positive for elapsed time (β1), sectoral activity (β2, …, β8) and population (β9). Our population density specification has two terms, density (β10) and density interacted with inverse log income (β11). Sign-switching can occur for the case [β10 < 0, β11 > 0], where the first (negative) term dominates for high incomes and the second (positive) term dominates for low incomes. We use inverse log income because the log transformation increases our ability to differentiate effects at very low income levels.
For the relationship between CO2 concentration and income per capita, we expect (β12 > 0). In the income-squared term, we expect β13 < 0 (the EKC case) or β13 = 0 (a linear relationship). Heating requirements are greater in colder climates, so we expect the effect of heating degree days on CO2 emissions (β14) to be positive.
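The estimating equation itself does not appear in the text above. One specification consistent with the coefficient descriptions might be written as follows. This is only a sketch: the symbols (C for CO2 concentration, S for the seven sectoral activity measures, P for population, D for population density, y for income per capita, H for heating degree days), the subscripts for city i and date t, the intercept, the hemisphere-by-month term γ, the use of income in levels, and the omission of any regressor not assigned a coefficient in the text are all assumptions made for illustration.

```latex
C_{it} = \beta_0 + \beta_1 t + \sum_{k=2}^{8} \beta_k S_{k,it} + \beta_9 P_i
       + \beta_{10} D_i + \beta_{11} \frac{D_i}{\log y_{it}}
       + \beta_{12} y_{it} + \beta_{13} y_{it}^{2} + \beta_{14} H_{it}
       + \gamma_{h(i),m(t)} + \varepsilon_{it}
```

Under this reading, the density effect is the coefficient on D, namely β10 + β11/log(y), and the EKC case corresponds to β12 > 0 with β13 < 0.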
Clearly-exogenous variables in the model include fires in neighboring areas and the demographic and climate factors. We include georeferenced data on CO2-intensive industrial facilities because they should have important effects on observed CO2 concentrations above their locations. Their inclusion serves our principal objective, a relatively complete econometric accounting of city-level CO2 emissions that can be used for comparative performance benchmarking and exploration of residuals. However, we recognize the possibility of some estimation bias from interactions with income and local air pollutants. Industrial facilities emit locally-significant air pollutants (e.g., NO2, SO2, fine particulates) along with CO2. Technical measures to reduce local air pollutants (e.g., stack scrubbers for SO2 emissions) do not reduce CO2 emissions directly, but the associated costs may have significant indirect effects by altering facility location decisions. Stricter environmental regulation in higher-income cities may enhance this effect. In principle, we could treat the potential bias problem by using measures for input costs to instrument the facility-level variables. In practice, however, we have no prospect of measuring the relevant variables at the requisite spatial resolution. Joint determination of income and CO2 emissions could also be a significant problem for the EKC component of the model. Our results discussion will include some estimation exercises that shed further light on these issues.

Sources
Several satellite platforms provide CO2 measures (Pan et al. 2021). The design of the Orbiting Carbon Observatory-2 (OCO-2) supports comparative exercises like our analysis. It follows a sun-synchronous near-polar orbit, crossing the equator in ascending mode around 1330 hours local time. In practice, this means that the OCO-2 observations for our study are collected between 1200 and 1500 local time for all cities in the sample. This provides a consistent mid-day activity benchmark for comparing CO2 concentration anomalies across cities. OCO-2 has an observation repeat time of 16 days. We have downloaded georeferenced measures of XCO2 (the column-averaged dry air mole fraction of CO2) and computed daily mean values for each 10 km grid cell in our global database.
We use georeferenced facility-level global databases to obtain capacity measures and technology specifications for power plants (Byers et al. 2021), steel mills (GEM 2021), cement plants (McCaffrey et al. 2021), and oil refineries (Auch 2017). We use capacity estimates in the regressions because production estimates are both scanty and assigned low confidence by the database producers. Van der Werf et al. (2017) provide monthly estimates of carbon emissions from agricultural and forest burning at 25 km resolution.
The World Cities Database (2021) provides data on urban centroid locations and populations. Population density data are provided at 5 km resolution by CIESIN (2018). We use two sources to construct our georeferenced measure of income per capita. From the G-Econ database (Nordhaus et al. 2006), we obtain GDP per capita in 2005 purchasing power parity for a global grid with 100 km resolution. Each grid cell is assigned to its geographically-dominant country by G-Econ. For each cell in a country, we compute the ratio of cell GDP per capita to the national mean for all cells.
We merge the results with annual World Bank estimates of GDP per capita in constant $US 2015, and use the cell ratios to estimate annual GDP per capita for each cell. We introduce another proxy for economic activity by incorporating monthly observations for VIIRS global nighttime lights (EOG 2021) at 500 m resolution. Mistry (2019) has provided global estimates of monthly heating degree days at 25 km resolution.

CO2 Diffusion Effects
Space-based observations detect higher CO2 concentrations over significant emissions sources because atmospheric diffusion is not instantaneous. As emissions diffuse from their sources, deviations from background concentrations will persist for some distance. By implication, the effects of emissions sources in the model should not be constrained by arbitrarily-scaled grid structures. To allow for diffusion, we incorporate the assumption that deviations from background CO2 concentrations are inversely related to distances from emissions points. We search for best-fit inverse-distance functions in regression experiments with distance exponent values from 0.1 to 3.0. We obtain the best results with an exponent that is effectively 1.0. It does not differ significantly across model variables, which simply confirms that CO2 molecules behave identically in the atmosphere, whatever their source. For estimation, the effect of an emissions source on satellite-measured CO2 concentration has unitary weight at its location and inverse-distance weights at the centroids of neighboring cells. For tractability, we bound the weighting radius at 100 km (where the weight is 0.01).
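The weighting scheme described above can be illustrated with a minimal Python sketch. The function names, the planar kilometre coordinates, the treatment of distances under 1 km as "at the source", and the example plants are all assumptions made for illustration; they are not taken from the paper's data or code.

```python
import math

MAX_RADIUS_KM = 100.0  # weighting radius stated in the text


def inverse_distance_weight(distance_km, exponent=1.0):
    """Weight applied to a source located distance_km from a cell centroid."""
    if distance_km > MAX_RADIUS_KM:
        return 0.0  # sources beyond the bounding radius are ignored
    if distance_km <= 1.0:
        return 1.0  # assumption: unit weight at (or within 1 km of) the source
    return 1.0 / distance_km ** exponent


def weighted_capacity(centroid_xy_km, sources):
    """Sum inverse-distance-weighted capacities around one grid centroid.

    sources: iterable of (x_km, y_km, capacity) in a planar km coordinate
    system (a simplification; real data would need geodesic distances).
    """
    cx, cy = centroid_xy_km
    total = 0.0
    for x, y, capacity in sources:
        distance = math.hypot(x - cx, y - cy)
        total += inverse_distance_weight(distance) * capacity
    return total


# Hypothetical plants: one at the centroid, one 50 km away, one out of range.
plants = [(0.0, 0.0, 500.0), (30.0, 40.0, 1000.0), (200.0, 0.0, 800.0)]
example_total = weighted_capacity((0.0, 0.0), plants)
```

With an exponent of 1.0, the hypothetical plants contribute 500 at the centroid, 1000/50 from the plant 50 km away, and nothing from the plant beyond the 100 km radius.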

Wind Effects
Our modeling approach uses grid search experiments to determine distance decay functions for CO2 emissions from all geolocated sources (power plants, steel mills, cement plants, refineries, agricultural and forest burning). To illustrate the consequence, a capacity observation for coal-fired power plants at each 10-km grid centroid is the sum of inverse-distance-weighted capacities of coal-fired plants within a 100-km radius. At each point in time, the trajectory of emissions from each plant is potentially influenced by the wind direction at its location. However, we aggregate information from plants that may be widely separated, with a different wind direction at each plant.
In many urban areas (particularly in developing countries), the prevailing wind direction is regularly recorded at only one location (typically the airport). Taken together, the scarcity of wind direction information and aggregation of capacity measures across dispersed facilities in different microclimates eliminate any chance of incorporating meaningful wind effects. Of course, all locations in some urban areas may be subject to identical wind-direction effects during some periods, in which case radially-symmetric inverse-distance weighting would generate errors for some observations. However, our study calculates daily averages from hourly data, with consequent summation over random changes in wind directions for multiple, widely-scattered points over five annual cycles. Some measurement error probably remains, but we believe that it compares favorably with measurement errors for other variables in our model (e.g., facility capacities, local heating degree days, neighboring fire locations).

Urban Scale
How many grid cells should be assigned to each city in our sample? Comparative analysis must confront the difference between arbitrary administrative boundaries and the actual extent of urban economic regions. Any attempt to define the latter will also include an arbitrary element. For all cities in this exercise, we standardize by including grid cells that lie within the same radial distance from their centroids. Then we test the robustness of the model by varying the radial distance. Another possible approach involves identification of a functional urban area (FUA) for each city region, as in Schiavina et al. (2019). FUAs are identified from urban mobility data for the OECD and Colombia, and estimated for other cities using a machine learning algorithm trained from estimated travel times, population distributions and incomes. We provide a further robustness test by estimating the model for FUAs as well.

Annual CO2 Concentrations
The global reference standard is provided by CO2 measurement at Mauna Loa Observatory, Hawaii (Keeling et al. 1976; Thoning et al. 1989).

Hemispheric Seasonal Cycles
As previously noted, global CO2 concentrations fluctuate seasonally with CO2 absorption and respiration by vegetation over the annual cycle. Seasonal fluctuations are more pronounced in the Northern Hemisphere because it has more plant life. To measure the amplitude of CO2 cycles in the OCO-2 data, we regress measured CO2 on a time trend and compute monthly mean residuals for the Northern and Southern Hemispheres. Figure 1 displays mean residuals by month. The cycle is pronounced in the Northern Hemisphere, with the peak in April, the trough in August, and an annual amplitude of 7.5 ppm. The Southern Hemisphere cycle is much flatter, with the peak in July, the trough in March, and an annual amplitude of 1.5 ppm.
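The detrend-then-average procedure can be sketched in a few lines of Python. The sketch is a simplification: it handles one hemisphere's series at a time, uses a plain linear trend, and runs on a synthetic series rather than OCO-2 data. All function and variable names are assumptions made for illustration.

```python
import math
from collections import defaultdict


def detrend(times, values):
    """Residuals from a least-squares linear trend of values on times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in zip(times, values)]


def monthly_mean_residuals(months, residuals):
    """Average the residuals by calendar month (1-12)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, residual in zip(months, residuals):
        sums[month] += residual
        counts[month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


def seasonal_amplitude(monthly_means):
    """Peak-to-trough difference of the monthly mean residuals."""
    return max(monthly_means.values()) - min(monthly_means.values())


# Synthetic single-hemisphere series (not OCO-2 data): four years of monthly
# values with a linear trend plus a cycle built to peak in month 4.
times = list(range(48))
months = [t % 12 + 1 for t in times]
values = [400.0 + 0.2 * t + 3.0 * math.cos(2 * math.pi * (m - 4) / 12)
          for t, m in zip(times, months)]

means = monthly_mean_residuals(months, detrend(times, values))
peak_month = max(means, key=means.get)
amplitude = seasonal_amplitude(means)
```

Applying the same functions separately to Northern and Southern Hemisphere observations would yield the two monthly profiles that the text compares.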

City Concentration Anomalies
In our model, differences in cities' economic, demographic and geographic conditions affect residual variations in measured CO2 once global trend growth and seasonal fluctuations are accounted for. We assess the scale of these concentration anomalies by computing residuals from a regression of CO2 on a time trend and hemispheric dummy variables for months. Figure 2 displays the distribution of mean anomalies for the cities in our sample, with a standard deviation of 1.0 ppm, 94.8% of cities in the range [-2,2] ppm and 98.5% in the range [-3,3] ppm. For comparison, human activity currently generates about 40 gigatons of CO2 emissions each year, increasing the atmospheric CO2 concentration by about 2 ppm. The city concentration anomalies in Figure 2 have the same order of magnitude, thus highlighting the global significance of inter-city variation.
Methods for direct conversion of city residuals to CO2 emissions are still in the research phase. Several recent studies (e.g., Ye et al. 2020; Wu et al. 2020) compare OCO-2-based city concentration anomalies (∆CO2OCO2) with anomalies (∆CO2E) estimated by combining atmospheric transport models with city-level data from global emissions inventories (principally ODIAC (Oda, Maksyutov and Andres 2018)). For the scaling factor [R=∆CO2OCO2/∆CO2E], Ye et al. (2020) find values of 1.6-1.9 for Riyadh, 2.4-2.9 for Cairo, and 2.9-3.2 for Los Angeles. In all three cases, city concentration anomalies calculated from standard emissions inventories significantly underestimate the anomalies in OCO-2 observations. These discrepancies may incorporate errors in sector-level activity data or emissions parameters employed by emissions inventories, as well as exclusion of some sectors from the inventories. For each city studied, a mid-range R-factor could be used to adjust its inventory-based emissions estimate.
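As a rough illustration of the adjustment suggested above, the following Python sketch takes the mid-point of each reported R range and applies it to an inventory figure. The proportional scaling rule and the function names are assumptions made for illustration, and any inventory value passed in is hypothetical; only the R ranges come from the text.

```python
# R-factor ranges reported by Ye et al. (2020), as quoted in the text.
R_RANGES = {
    "Riyadh": (1.6, 1.9),
    "Cairo": (2.4, 2.9),
    "Los Angeles": (2.9, 3.2),
}


def mid_range_r(city):
    """Mid-point of the reported R range for a city."""
    low, high = R_RANGES[city]
    return (low + high) / 2.0


def adjusted_emissions(city, inventory_estimate):
    """Scale an inventory-based estimate by the mid-range R-factor.

    Assumption: emissions scale in proportion to the concentration anomaly,
    so multiplying by R is a reasonable first-order correction.
    """
    return mid_range_r(city) * inventory_estimate
```

The mid-range factors are 1.75 for Riyadh, 2.65 for Cairo and 3.05 for Los Angeles.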
Over time, accurate estimation of R-factors for more cities may permit larger-scale adjustment of urban CO2 emissions inventory estimates. The research reported in this paper contributes by quantifying the incremental contributions of multiple sectors to city-level OCO-2 concentration anomalies. Follow-on research could construct sector-level R-factors for adjusting emissions inventories at the sector level. Longer-term, R-factor research may succeed in dropping its current dependence on emissions inventories and produce methods for direct estimation of CO2 emissions from satellite-measured CO2 concentration anomalies. At present, however, this domain remains largely unexplored.

Sources of CO2 Emissions Variation across Cities
Before the formal econometric analysis, we explore descriptive evidence on three potential sources of variation in urban CO2 emissions.

Population
As previously noted, CO2 emissions should increase with urban population because each resident accounts for some emissions. We divide our sample cities into three size ranges with lower bounds at 500,000, 2,000,000 and 5,000,000 and compute mean CO2 concentration anomalies for the cities in each range. As Figure 3 indicates, mean anomalies increase with population size range. Population variation is associated with city anomaly variation over a range of 0.7 ppm, which is 0.7 standard deviation for the overall urban CO2 anomalies displayed in Figure 2.

Industry Structure
For this exploration, we combine information on capacity in each city for industrial facilities in six categories: power plants fired by coal, gas and oil; non-electric steel mills; refineries; and cement plants. We normalize capacity in each category to the range [0-100] and compute total normalized capacity for the six sectors in each city. We compute mean CO2 anomalies for cities in five capacity ranges. Figure 4 shows a positive relationship between CO2 anomaly and aggregate capacity in CO2-intensive facilities, with capacity variation associated with anomaly variation over a range of 1.0 ppm, which is 1 standard deviation for the overall urban CO2 anomalies displayed in Figure 2.
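The normalization rule is not spelled out in the text. One plausible reading is min-max scaling of each category across cities, followed by a sum over categories for each city. The Python sketch below implements that reading as an assumption; the function names and the dictionary layout are hypothetical.

```python
def normalize_0_100(values):
    """Min-max scale a list of capacities to the range [0, 100]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # no variation in this category
    return [100.0 * (v - low) / (high - low) for v in values]


def total_normalized_capacity(capacity_by_category):
    """Sum normalized capacities across categories for each city.

    capacity_by_category: {category: [capacity per city, same city order]}
    Returns one total per city, in the same order.
    """
    columns = [normalize_0_100(v) for v in capacity_by_category.values()]
    return [sum(city_values) for city_values in zip(*columns)]
```

Summing the normalized categories gives each sector equal weight in the aggregate, whatever its actual emissions intensity.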

Economic Development
We divide cities into five ranges of income per capita ($US 2015) with upper bounds at $US [4,000 11,000 25,000 50,000 100,000]. Figure 5 shows that the association between mean CO2 anomaly and income per capita is consistent with an Environmental Kuznets Curve in which CO2 emissions increase to an upper bound in the range [$US 40,000-50,000] and then decrease. In Figure 5, sample variation in city per capita income is associated with variation of 1.6 ppm, or 1.6 standard deviations for the overall urban CO2 anomalies displayed in Figure 2. In addition, our aggregative measure of plant capacity imposes an implicit assumption of equal sectoral impact that may not be warranted. For more systematic evidence, we turn to the results of our econometric estimation.

Model Estimation Results
Table 2 reports our regression results for radial distances of 20, 40 and 60 kilometers, as well as cells within 60-km radii that also lie within Functional Urban Areas. We present results for OLS and HAC panel estimation, which adjusts standard errors for spatial autocorrelation. We have scaled the variables to yield easily-reportable parameter estimates (i.e., estimates with limited leading zeros after the decimal place); units are included for each variable.
Overall, we find that the model provides a good fit to the data. The results are highly robust to variations in radial distance and cell restriction to Functional Urban Areas, with the expected signs and generally-high levels of statistical significance for all model variables. At the same time, the cities in our sample display broad variation in regression residuals after the model variables are taken into account. This suggests a potentially-important domain for climate-related policies, and Section 6 explores one policy dimension with an analysis of the relationship between urban mass transit investments and the regression residuals.
Our comparative results for power facilities are consistent with prior expectations: Coal-fired power has the greatest impact on atmospheric CO2 concentration, followed successively by oil- and gas-fired power. Non-electric steel/iron complexes, cement plants and refineries have generally-high significance, along with agricultural and forest burning in neighboring areas. We have incorporated the VIIRS nighttime light illumination index as an additional proxy for economic development; all results have similar parameter estimates, the expected signs, and high significance. Heating degree days also has the expected sign and high significance in all cases.

Population Effects
Population has the expected positive, highly-significant effect on CO2 concentration. The results for population density and its interaction with inverse log income are consistent with sign-switching as income rises. Figure 6 displays the estimated relationship between income per capita and the composite density parameter [β10 + β11/log(y)]. It suggests that sign-switching occurs around $US 1000, where the composite density parameter is equal to 0. Figure 7 illustrates the implications by displaying the density/CO2 relationship as income increases. The relationship is strongly positive at $US 400 per capita and mildly so at $US 800. Then it switches sign and becomes progressively more negative for $US 1200, 2000 and 5000.
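Setting the composite density parameter to zero gives the switching income directly: log(y*) = -β11/β10. The Python sketch below works through this calculation. The coefficient values are hypothetical, chosen only so that the switch falls at $US 1000, and the use of the natural logarithm is an assumption; the estimates in Table 2 are not reproduced here.

```python
import math


def composite_density_parameter(beta10, beta11, income):
    """Marginal effect of density on CO2 at a given income per capita."""
    return beta10 + beta11 / math.log(income)


def sign_switch_income(beta10, beta11):
    """Income at which the composite parameter equals zero.

    Solves beta10 + beta11 / log(y) = 0, i.e. log(y) = -beta11 / beta10,
    assuming natural logarithms.
    """
    return math.exp(-beta11 / beta10)


# Hypothetical coefficients (not the paper's estimates), with beta10 < 0 and
# beta11 > 0 as in the sign-switching case described in the model section.
BETA10 = -1.0
BETA11 = math.log(1000.0)

switch_income = sign_switch_income(BETA10, BETA11)
effect_at_400 = composite_density_parameter(BETA10, BETA11, 400.0)
effect_at_5000 = composite_density_parameter(BETA10, BETA11, 5000.0)
```

With these illustrative values the composite parameter is positive at $US 400 and negative at $US 5000, matching the qualitative pattern described for Figure 7.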
These population density results may be of interest for the discussion of optimal timing in urban development strategy. In our interpretation, they do not imply that low-income cities should not exploit opportunities for higher-density development, because pursuing such opportunities may help them avoid locking into carbon-intensive residential and infrastructure patterns that are difficult to reverse as income increases. However, our results do imply that CO2 increases may accompany the first phase of densification for some low-income cities.

Income Effects
Our results for income per capita are highly significant and consistent with an Environmental Kuznets Curve (EKC) for urban CO2 emissions (β12 > 0, β13 < 0). In supplementary exercises, we have addressed issues related to (1) potential biases associated with inclusion of emissions-intensive facilities; (2) inclusion of an income-interactive term in the population density component; and (3) simultaneity in the relationship between CO2 emissions and non-industrial income factors. For case (1), we have estimated regressions that exclude either industrial emissions sources or income. When we exclude the income terms, we find a very small effect (median change of 3%) on parameter estimates for power plants, steel mills, cement plants and refineries. When we exclude the industrial facilities, we find that the income parameters change slightly. For case (2), we have estimated regressions that exclude all righthand variables except income, time trend and seasonal controls (by hemisphere). As expected, with all collinear variables dropped, we find an increase in the significance and size of the EKC parameters. Figure 8 displays the estimated EKC with industrial emissions sources, without them (case 1), and without any righthand variables except the EKC terms (and the trend and seasonal controls) (case 2). With industrial sources excluded, the EKC peaks sooner and declines more sharply at higher incomes. At present, we cannot say how much of this change reflects estimation bias and how much reflects the collinearity-related upward bias that would accompany exclusion of properly-instrumented facility variables. In any case, all three estimates carry the same basic message: The EKC reaches a peak in the range [$40,000-$50,000], which is above the 90th percentile internationally.
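The location of the EKC peak follows from the two income coefficients. As a sketch, assuming income per capita y enters the model in levels through the terms β12 y + β13 y² (an assumption about functional form, since the estimating equation is not shown here), the turning point y* is:

```latex
\frac{\partial}{\partial y}\left(\beta_{12}\, y + \beta_{13}\, y^{2}\right)
  = \beta_{12} + 2\beta_{13}\, y = 0
\quad\Longrightarrow\quad
y^{*} = -\frac{\beta_{12}}{2\beta_{13}}
```

With β12 > 0 and β13 < 0, y* is positive and the second derivative 2β13 is negative, so y* is a maximum. Under this form, a peak in the range [$40,000-$50,000] corresponds to a ratio -β12/β13 between $80,000 and $100,000 in the same income units.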
Case (3) warrants more detailed attention because of potential simultaneity bias in the relationship between income and CO2 emissions. A substantial literature has used inventory-based emissions estimates to study this relationship at the country level (e.g., Apergis and Payne 2014). Results have differed substantially by country and time period (Ben Youssef et al. 2016). The research presented in this paper is different from previous work in at least three relevant ways. First, it uses direct, satellite-based observations of CO2 rather than estimated emissions inventories. Previous studies have risked introducing some technical element of simultaneity by construction, because their emissions inventories derive entirely from measures of activities that provide components of income. Second, the present study uses local, spatially-referenced data rather than national or regional aggregates. Third, we take a different approach to energy as a link between income and CO2 emissions. Most previous studies of the simultaneity issue have focused on the energy sector as the key determinant of simultaneity between income and CO2 emissions, because CO2 emissions increase with energy use and energy use contributes to economic growth. Researchers' views have differed substantially on the potential importance of this problem. Csereklyei and Stern (2015) argue that the bias is fairly small, so estimated emissions-income elasticities will be close to effects for exogenous changes in income. In any case, the modeling exercise in this paper takes an entirely different approach to the energy sector, incorporating georeferenced data for the coal-, gas- and oil-fired power plants that create sectoral CO2 emissions while excluding other power facilities (e.g., nuclear, solar, wind, hydro) that do not. In the previous subsection, we have discussed and illustrated the implications of our approach to energy facilities.
As previously noted, the EKC component of our model relates to non-industrial, income-related activities (e.g., motor transport) that cannot be observed directly. The relationship between income and CO2 emissions from these activities may also include elements of simultaneity, such as the joint emission by some activities of CO2 and local air pollutants (NO2, SO2, CO) that affect health, productivity, and therefore income (Van Ewijk and Van Wijnbergen 1995).
For econometric estimation, instrumental variables (IV) provide the standard correction for such simultaneity problems. In recent work on EKC estimation for local pollutants, Lawell and Liscow (2013) have identified the age dependency ratio as a plausibly exogenous instrument that affects economic growth via the savings rate and overall labor productivity. We have adopted this approach for a first-stage regression that relates national income growth since 2010 to the age-dependency ratio. We substitute the regression prediction for income per capita in our EKC model that excludes all other right-hand variables except the time trend and seasonal variations by hemisphere. The resulting EKC estimate is included in Figure 8 (labeled "IV"). Its peak occurs at a lower CO2 anomaly than its OLS counterpart, and at an income that is substantially higher. We should emphasize that this IV result is far from the last word, and future research should use satellite-measured CO2 data to address the simultaneity issue in more depth.
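The two-step procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code behind Figure 8: the function names (`ols`, `two_stage_ekc`), the quadratic second stage, and the use of a single instrument are assumptions made for the example, and the trend and seasonal controls are omitted for brevity.

```python
import numpy as np


def ols(X, y):
    """Return least-squares coefficients b for y = X @ b."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_ekc(instrument, income, co2):
    """Instrument income, then fit a quadratic in fitted income.

    instrument : hypothetical stand-in for the age-dependency ratio
    income     : income measure to be instrumented
    co2        : CO2 concentration anomaly
    Returns (second-stage coefficients, fitted income).
    """
    instrument = np.asarray(instrument, dtype=float)
    income = np.asarray(income, dtype=float)
    co2 = np.asarray(co2, dtype=float)
    ones = np.ones(len(income))

    # First stage: regress income on a constant and the instrument.
    Z = np.column_stack([ones, instrument])
    income_hat = Z @ ols(Z, income)

    # Second stage: regress CO2 on fitted income and its square.
    X = np.column_stack([ones, income_hat, income_hat ** 2])
    return ols(X, co2), income_hat
```

Standard errors from a manually computed second stage are generally not valid without adjustment, so a sketch of this kind is suitable only for point estimates.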
We should also note that the income-related results in this study are only intended to provide a benchmark for judging urban performance in reducing CO2 emissions. The same would be true if our results had rejected the EKC (β12 > 0, β13 = 0), yielding a relationship in which CO2 emissions increase continuously with income per capita. To avoid any misunderstanding, we should emphasize that our EKC results have no normative or policy implications in themselves. They do not imply that additional public resources are not needed for reducing CO2 emissions because "the problem will take care of itself" with continued economic growth. In fact, the opposite is true. The most recent IPCC report (IPCC (2021), Figure SPM.10) affirms a near-linear relationship between cumulative CO2 emissions and the increase in global surface temperature. Our EKC results (Figure 8) show that even high-income countries have not approached zero CO2 emissions, and growing industrial giants like India and China are still on the rising portion of the curve. The clear implication is that waiting for the EKC to reduce emissions from all countries would produce an enormous increase in cumulative CO2 and a potentially-catastrophic global temperature increase. This conclusion will simply be compounded if future research finds insignificance for the EKC regression parameter (β13 = 0).
Finally, we should note that our EKC results only provide a descriptive "snapshot" of the income/emissions relationship for our sample urban areas during the period 2014-2019. Even if the EKC specification survives future econometric tests, policy changes and technology improvements may lower the EKC peak significantly while shifting it to a much lower income level. Indeed, it is possible that recent changes in policy and technology in wealthier economies have already steepened the post-peak decline in CO2 emissions at higher incomes. Tracking changes in the emissions/income relationship and its determinants should be an important topic for future research.

Influence of Regression Variables on CO2 Emissions
We use standardized (beta) coefficients 6 to provide measures of relative importance for model variables. Table 3 presents results from our 40-km radial estimates, standardized to weights that add to 100. Overall, we find influence weights of 34.3 for industrial sources, 34.5 for measures of economic development, 10.6 for population-related factors, and 20.8 for environmental variables. 7 Among industrial sources, the top three by influence are coal-fired power (13.2), cement (7.3) and steel (6.2). For economic development, population factors and environmental variables, the top variables are income per capita, population density and heating degree days, respectively.

6. Discussion

Global Comparisons
Our regression model incorporates a host of determinants, including the annual global trend, seasonal changes by hemisphere, industry structure, fires in neighboring areas, demography, the income component that is uncorrelated with industry structure, and climate.

Regression Predictions and Residuals
Figure 9 maps regression-predicted mean CO2 anomalies for the 1,236 cities in the sample. The predictions vary widely in all major regions. China is distinguished, both by the number of cities and the high proportion of CO2-intensive cities in its eastern coastal region. Overall, however, Chinese cities display the same broad variation as cities in other regions.
The fitted model can provide a pilot template for judging cities' emissions performance. However, we should emphasize some particular features of model structure. First, it includes heavily-emitting industrial facilities but does not include clean power sources and electric arc steel production. Urban areas where industry has switched to these less-CO2-intensive technologies will have lower actual and predicted emissions than cities with CO2-intensive technologies. In a related vein, although some urban areas may not have pollution-intensive power plants within the radial distances used for this study, they may import power from more distant pollution-intensive facilities. From a consumption perspective, such areas are not "cleaner" than areas with local power production. We provide these cautionary notes because our model is explicitly intended to "level the playing field" by treating facilities like pollution-intensive power plants as historical legacies that provide benchmarks for judging future performance. Our model estimates for industrial facilities establish fixed initial conditions, while automatically adjusting for the continuing global trend, seasonal fluctuations, changes in nearby fires, temperature changes, and changes in population and income.
Some insight into cities' current status can be gained by examining their mean regression residuals, which provide a measure of their deviations from regression predictions during the sample period. Figure 10 maps regression residuals for the 1,236 cities in the sample, while Figure 11 displays boxplots of residual distributions by region. 8 The map displays wide variation in all regions, but certain regional patterns suggest deviations from collective global experience. The number of large positive residuals in China is apparent, while large negative residuals are strongly evident in the former Comecon countries. 9 Figure 11 confirms the impression for China. Despite a wide dispersion of positive and negative residuals, China's median is about 0.5 ppm (or 0.5 standard deviation for cities' local component) above the global norm (a residual of 0). More detailed information about city distributions by region is provided in Appendix tables A1-A10, which tabulate the cities in each region that have 15 negative and 15 positive residuals with the largest absolute values. These are the cities whose measured CO2 emissions are notably smaller or larger than expectation, as measured by the regression predictions. For each city, the tables present measured CO2 for the sample period, model-predicted CO2, and the residual.

Source Decomposition of Predicted CO2
The Appendix tables also use the regression results to decompose predicted CO2 emissions into five source categories: Industry (power plants, steel mills, refineries, cement plants); Fires (carbon emissions from agricultural and forest burning); Income (non-industrial CO2 sources that are correlated with income); Population (population and population density); and Climate (heating degree days). Figure 12 maps illustrative decompositions for the sample cities. Particularly important roles are suggested for Fires in Sub-Saharan Africa and Climate in northern Asia, the Russian Federation and Eastern Europe.
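The decomposition described above can be sketched as a grouped sum of coefficient-times-value terms. This is a minimal illustration that assumes a linear additive model; the variable names in `CATEGORIES`, the assignment of both income terms to Income, and the helper names `decompose` and `shares` are hypothetical and are not taken from the paper's tables.

```python
# Hypothetical mapping from model variables to the five source categories.
CATEGORIES = {
    "Industry": ["power", "steel", "refinery", "cement"],
    "Fires": ["fires"],
    "Income": ["income", "income_sq"],
    "Population": ["population", "density"],
    "Climate": ["heating_degree_days"],
}


def decompose(coefs, values):
    """Sum coefficient * value within each category for one city.

    coefs  : dict of estimated coefficients keyed by variable name
    values : dict of the city's variable values keyed by variable name
    """
    return {
        category: sum(coefs[name] * values[name] for name in names)
        for category, names in CATEGORIES.items()
    }


def shares(contributions):
    """Express category contributions as percentages of their total."""
    total = sum(contributions.values())
    return {cat: 100.0 * c / total for cat, c in contributions.items()}
```

Because some terms can be negative (for example, a squared income term with a negative coefficient), percentage shares are meaningful only when the total predicted local component is positive.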

Implications for Non-Pigouvian Policies
Our results can also contribute to the discussion of strategies for achieving steep reductions in CO2 emissions. Most climate economists have argued for Pigouvian carbon pricing via emissions taxation or permit trading (Stiglitz and Stern 2021; Jacobs and van der Ploeg 2019; King et al. 2019; Klenert et al. 2018). Many policy analysts who support Pigouvian pricing also argue for a non-Pigouvian supplement: coordinated public investment in low-carbon land development, energy and transport that will accelerate the transition to low-carbon economies, particularly in lower-income countries that are not yet locked into high-carbon growth paths (van der Ploeg and Venables 2020). Economic growth is accompanied by urban development, which should avoid carbon-intensive land development, infrastructure and energy systems that are difficult to retrofit once they are locked in (Seto et al. 2014). The empirical literature suggests that lower-carbon residential, energy and transport development can have self-reinforcing effects on residents' preferences for low-carbon energy services (Carattini et al. 2018; Allcott and Rogers 2014) and transport modes (Weinberger and Goetzke 2010; Grinblatt et al. 2008; Bamberg et al. 2003).
While the non-Pigouvian argument is certainly plausible, the global resource implications of adopting it are huge and rigorous empirical support would be highly desirable. An attempt at rigorous testing would be premature for the current exercise, but an analysis of the residuals from our econometric model can provide suggestive evidence. Empirical leverage is provided by the diverse urban development paths followed by cities within and across countries. This is particularly true for investments in subway systems, undertaken by some cities but not by many others (Pasquale et al. 2016; Costa and Fernandes 2012; Jones 2008; Post 2007; Cudahy 1990). Subway installation has been motivated by the belief that it will shorten commuting times, reduce traffic congestion and vehicular emissions, and promote higher-density residential development near subway stations. Reduced vehicle emissions and energy efficiencies associated with higher-density development are frequently cited as carbon-saving advantages of mass transit systems.
From this perspective, subway cities should have had lower-carbon development paths, other things equal. We explore this proposition with our regression residuals, drawing on a recent global survey of subway systems by Turner and Gonzalez-Navarro (2018). Subways provide an excellent test of the non-Pigouvian supplement to carbon pricing for several reasons: They exemplify massive directed infrastructure investment; they are numerous but far from universal in most world regions; and their histories vary from over a century to less than a decade. If directed public investments can make a significant contribution to low-carbon development, the effect should register in a large sample that includes cities that have installed subway systems and cities that have not.
Our econometric results in Table 2 are very similar for all three urban radii, so we present the median case (40 km). We have residuals for 1,236 cities with populations greater than 500,000. The Turner/Gonzalez-Navarro survey identifies 132 of these as cities with subway systems. We divide the residuals into deciles, with the largest negative outliers in Decile 1. These are the cities whose measured CO2 concentrations are far lower than their predicted concentrations. In counterpoint, the cities in Decile 10 have measured concentrations far higher than their predictions. We count the number of cities with subways in each decile and present the results in Table 5. Figure 13 graphs the results with a regression line for easier interpretation. The results provide striking evidence of a negative relationship, with a decline of about 2 subway cities per decile. There are 23 subway cities among the negative outliers in Decile 1, and only 6 subway cities among the positive outliers in Decile 10.

Table 5 totals: 1,236 cities, 132 with subways.

Figure 13: Cities with subways by residuals decile
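The tabulation behind Table 5 and Figure 13 can be sketched as follows. This is a minimal illustration, assuming one mean residual and one subway indicator per city; the function names and the rank-based assignment of deciles are assumptions made for the example, not the paper's procedure.

```python
import numpy as np


def subway_counts_by_decile(residuals, has_subway):
    """Count subway cities in each residual decile.

    Decile 1 holds the most negative residuals, Decile 10 the most
    positive. Returns an array of 10 counts.
    """
    residuals = np.asarray(residuals, dtype=float)
    has_subway = np.asarray(has_subway, dtype=bool)
    n = len(residuals)

    # Rank cities from most negative (rank 0) to most positive residual.
    order = np.argsort(residuals, kind="stable")
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)

    # Map ranks to deciles numbered 1..10.
    decile = ranks * 10 // n + 1

    # Count subway cities per decile; drop the unused index 0.
    return np.bincount(decile[has_subway], minlength=11)[1:]


def trend_slope(counts):
    """Slope of a least-squares line through (decile, count) points."""
    deciles = np.arange(1, len(counts) + 1)
    slope, _intercept = np.polyfit(deciles, counts, 1)
    return slope
```

As a rough check on the reported trend, the endpoint counts in the text (23 in Decile 1 and 6 in Decile 10) imply an average change of (6 - 23) / 9, or about -1.9 subway cities per decile, which is in line with the decline of about 2 per decile described above.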
Our results certainly offer suggestive support for the view that large, directed public investments can make a significant contribution to low-carbon development. However, we cannot discount the potential role of endogeneity in this exercise. The implied relationship between mass transit investment and CO2 emissions has no explicit endogeneity, because reducing CO2 emissions has not been a goal of mass transit investments until very recently. However, traffic congestion has been a target, with local air pollution as a correlate. Some element of endogeneity may therefore be present, because CO2 emissions and local air pollutants have common sources in vehicle traffic, heavy industry, power generation, and residential heating. At the same time, the heterogeneity of global cities makes it unclear whether simultaneity is an important consideration in this case. Our database includes cities with and without subways in 138 countries, with some subway installations dating back more than a century, in political and economic regimes as varied as the former Soviet Union, other Comecon countries, social democratic regimes in Western Europe, military and populist authoritarian regimes in Latin America, relatively laissez-faire regimes in the United States and Australia, and regimes in Asia that range from authoritarian in the Democratic People's Republic of Korea and mainland China to relatively laissez-faire in Thailand and Taiwan, China. In light of these multiple, disparate factors, we believe that the exploratory results in Table 5 and Figure 13 offer reasonably strong evidence in favor of the public investment hypothesis. In future research, we hope to revisit this question in an econometric exercise with an explicit treatment of potential simultaneity.

Future Research 10
The advent of satellite-based CO2 data has opened many research lines that could not be explored until the requisite information became available. In this section, we summarize some of the topics that have been identified in the course of our own research.

Quantifying CO2 Emissions
Although satellite-based measures of CO2 concentration anomalies are extremely useful for comparative analyses, the policy community would undoubtedly benefit from conversion of concentration anomalies to physical estimates of CO2 emissions. As we note in the paper, recent research for a few cities has used OCO-2 observations to produce scaling factors (R) for rough adjustment of ODIAC-type emissions estimates (Oda et al. 2018). Future extensions of our econometric work could include the construction of sectoral R-factors that could be used to improve emissions inventories by identifying the sources of their discrepancies from satellite-measured concentration anomalies. Hopefully, empirical research will ultimately depart from its continued dependence on emissions inventories by developing methods for direct conversion of satellite-measured CO2 anomalies to physical emissions estimates.

Potential Endogeneity Problems
10 Our thanks to the reviewers of this paper for their useful thoughts about future research directions.
Two potentially-important problems with our current specification should be addressed in future work. The first relates to possible endogeneity in the relationship between CO2 emissions and mass transit investments. The second concerns joint determination of CO2 emissions and income per capita. Our results are consistent with the EKC hypothesis, but we recognize the need for more work on the problem. In a related vein, future research should explore the directions and sources of change in the emissions/income relationship. For example, the recent priority given to CO2 reduction in wealthier countries may already be changing the position or shape of the relationship at high income levels. Research on this issue should also investigate the intervening relationship between income and climate policies, which are now being registered in sources like the LSE's climate policy database (LSE/Grantham 2021).

Non-Pigouvian Policies
This paper has focused on subways because we have been able to use a large subway data set constructed by Turner and Gonzalez-Navarro (2018). However, we readily acknowledge that other mass transit systems may also have important effects on CO2 emissions (e.g., regional rail and bus rapid transit systems). As global data sets expand to include these systems, they should be incorporated into the analysis.
Explicit incorporation of mass transit investments into the econometric model will provide another avenue for future research. As previously noted, the possible joint determinacy of subway investments and CO2 emissions may warrant the use of instrumental variables in the expanded model. In addition, future research should investigate the contribution of subways and other mass transit investments to synergies between Pigouvian and non-Pigouvian policies. For example, a dynamic modeling exercise for the Paris urban area by Avner, Rentschler and Hallegatte (2014) suggests that the fuel price elasticity of carbon emissions can be much higher in cities with robust public transport options. An extension of the current econometric research would incorporate relevant price variables (e.g., fuel prices), as well as their interactions with transit investment variables to test the effects on price elasticities.

Summary and Conclusions
In this paper, we have estimated an urban CO2 emissions model using satellite-measured CO2 concentrations from 2014 to 2020, for 1,236 cities in 138 countries. The model incorporates the global trend in CO2 concentration, seasonal fluctuations by hemisphere, and a large set of georeferenced variables that incorporate CO2-intensive industry structure, emissions from agricultural and forest fires in neighboring areas, demography, the component of income that is uncorrelated with industry structure, and relevant geographic conditions. We resample all model variables to a 10 km global grid, and capture CO2 diffusion from discrete emissions sources via inverse-distance weighting to a grid cell centroid distance of 100 km.
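The inverse-distance weighting step summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the great-circle (haversine) distance, the weight of 1/distance, the 5 km distance floor (half the width of a 10 km cell, used to avoid division by zero), and the function names are all choices made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def idw_exposure(cell, sources, cutoff_km=100.0, floor_km=5.0):
    """Inverse-distance-weighted source strength at a grid cell centroid.

    cell    : (lat, lon) of the cell centroid
    sources : iterable of (lat, lon, strength) for discrete sources
    Sources farther than cutoff_km from the centroid contribute nothing.
    """
    total = 0.0
    for lat, lon, strength in sources:
        d = haversine_km(cell[0], cell[1], lat, lon)
        if d <= cutoff_km:
            # Floor the distance so a source in the same cell stays finite.
            total += strength / max(d, floor_km)
    return total
```

For example, a source of strength 10 located at the centroid itself contributes 10 / 5 = 2 under the assumed floor, while a source two degrees of longitude away on the equator (roughly 222 km) lies beyond the 100 km cutoff and contributes nothing.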
In four econometric estimation exercises, we assign grid cells to cities if they lie within 20, 40 and 60 km of city centroids, or within the boundaries of UN-defined Functional Urban Areas. The results are very similar and robust in all four cases, with the expected signs and generally high levels of significance. We find that economic development has a significant effect on the direction of the relationship between population density and CO2 emissions. The relationship is positive at very low incomes, but becomes negative at higher incomes. Our income results provide the first test of an Environmental Kuznets Curve relationship based on actual CO2 observations. With caveats about potential simultaneity problems, we find evidence for an EKC that reaches a peak in the range [$40,000-$50,000] per capita, which is above the 90th percentile internationally.
We should note that our EKC results are only intended to provide a benchmark for judging future urban performance in reducing CO2 emissions. The same would be true if we had found a linear emissions/income relationship. We should also emphasize that our EKC results have no normative or policy implications in themselves. They do not imply that additional public resources are not needed for reducing CO2 emissions because "the problem will take care of itself" with continued economic growth. In fact, the opposite is true. Waiting for the EKC to reduce emissions from all countries would produce an enormous increase in cumulative CO2 and a potentially-catastrophic global temperature increase.
We explore other implications of our estimates in a series of exercises. Model-based predictions provide expected CO2 concentration anomalies for cities, given their sectoral, demographic, economic and geographic characteristics. We map the expected concentration anomalies and regression residuals for the 1,236 sample cities, using the residuals to index CO2 emissions performance that exceeds or falls short of model-based expectations. We find wide variation in performance among cities within regions, as well as significant differences across regions. Our results can also inform the discussion of policy instruments for CO2 emissions reduction. Many policy analysts who support Pigouvian pricing also argue for a non-Pigouvian supplement: coordinated public investment in low-carbon land development, energy and transport that will accelerate the transition to low-carbon economies. We explore this proposition for subway investments, drawing on a recent global survey of subway systems. We divide our 1,236 regression residuals into deciles, with the largest negative residuals in the first decile, and identify the subset of 132 global subway cities in each decile. We find that subway cities are four times more numerous among first-decile cities than among tenth-decile cities. We also find that representation of subway cities declines steadily across deciles. While these results provide strong suggestive support for the non-Pigouvian view, they are subject to potential endogeneity that should be explored in future research.
To conclude, this paper offers a response to the World Bank's new climate change mandate, which requires new metrics for judging progress in CO2 emissions reduction. We demonstrate that satellite-based CO2 measures can contribute by enabling rigorous analysis and performance assessment for all global cities and regions. We have estimated a CO2 emissions model for 1,236 cities with populations greater than 500,000, but our 10 km grid covers all terrestrial areas of the globe. The same model could be used in other geographic domains, such as large and small cities within regions or countries, regions within countries, or specific project areas. In light of the World Bank's mandate for new metrics to track progress in greenhouse gas reduction, we hope to extend this pilot initiative to an open-source, regularly-updated CO2 database that will inform all global stakeholders.