Policy Research Working Paper 10297

Scalable Tracking of CO2 Emissions: A Global Analysis with Satellite Data

Susmita Dasgupta, Somik Lall, David Wheeler

Development Economics, Development Research Group, February 2023

Abstract

This paper extends recent research on satellite-based carbon dioxide measurement to an easily updated template for tracking changes in carbon dioxide concentrations at local and regional scales. Using data from the National Aeronautics and Space Administration's Orbiting Carbon Observatory-2 satellite platform and a large sample of urban areas, a comparison of trend estimation models suggests that the template can use a simple model that estimates trends directly from satellite data pre-filtered to isolate local concentration anomalies. Illustrative applications are developed for a long-period trend model and a short-period model focused on change in the most recent year. In addition, the paper estimates carbon dioxide emissions for thousands of urban areas and identifies cities whose emissions performance is above or below expectation. Although the tracking model is "simple," it requires software and hardware that are beyond the means of many interested stakeholders. For this reason, the World Bank's Development Economics Vice Presidency has established an open web facility that pre-filters data from the National Aeronautics and Space Administration's Orbiting Carbon Observatory-2 satellite and publishes monthly mean concentration anomalies for all terrestrial cells of a 25-kilometer global grid. The website will also publish annual carbon dioxide tracking reports for urban areas and provide information that links the 25-kilometer global grid cell IDs to IDs for urban areas and national administrative units (levels 0, 1, and 2).

This paper is a product of the Development Research Group, Development Economics.
It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at sdasgupta@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Scalable Tracking of CO2 Emissions: A Global Analysis with Satellite Data

Susmita Dasgupta*, Somik Lall, David Wheeler
World Bank

Keywords: CO2 emissions, OCO-2, Urban pollution, Emissions tracking
JEL Classification: Q53, Q54, Q58, R11, R40, R48, R58

* Authors' names in alphabetical order.

1. Introduction

The World Meteorological Organization forecasts that the current greenhouse gas (GHG) emissions trend will increase global temperature by 3–5 degrees C by 2100 (Reuters 2018).
This would far overshoot the 2-degree limit pledged by the 2015 Paris Climate Accords (COP 21) and might have a catastrophic impact (Steffen et al. 2018; World Bank 2012). In response, several industrial nations pledged very steep emissions reductions at the recent Leaders' Summit on Climate (April 22-23, 2021). Unfortunately, these pledges confront a striking information shortfall at the outset: the near-total absence of directly measured local and regional GHG data for problem diagnosis, program design and performance assessment.

Recently, the advent of satellite-based GHG measurement has greatly expanded the potential for empirical assessment. High-resolution observations of atmospheric GHG concentrations are now available from several platforms, including NASA's OCO-2 and OCO-3 instruments, the European Space Agency's METOP-A and TROPOMI (Sentinel-5P) platforms, China's TANSAT and the Japan Aerospace Exploration Agency's GOSAT and GOSAT-2. Detailed technical assessments of measures from these platforms have verified that they provide useful and comprehensive information for global carbon emissions analysis (Weir et al. 2021; Nassar et al. 2021; Pan et al. 2021; Wu et al. 2020; Hakkarainen et al. 2019; Labzovskii et al. 2019).

This paper extends recent research on satellite-based measurement to an easily updated template for tracking changes in atmospheric CO2 concentrations at local and regional scales. Using observations from NASA's OCO-2 platform, we develop the template from the data filtering techniques and econometric analysis employed by Dasgupta, Lall and Wheeler (2022). Our prior work estimates an econometric model that relates satellite-based CO2 measures to georeferenced emissions sources. In this paper, we develop and compare two versions of the CO2 tracking template. The first tracks changes in residuals after fitting the econometric model to satellite-based observations, while the second version simply tracks changes in the observations themselves.
Using data from thousands of urban areas, we find an extremely close correspondence between results for the two versions. We opt for simplicity and select the second version for template development. We also introduce a technique for identifying city-level changes that are distinct from regional changes induced by broader atmospheric circulation patterns. We provide several illustrative applications for urban areas, while noting that the same approach could be used for any other areas of interest.

In an additional exercise, we use our regression model results to compute expected emissions from urban areas. The regression residuals identify the directions and relative magnitudes of departures from expected values for individual areas. We convert the residuals to their emissions equivalents using high-resolution gridded information from the EDGAR global database (Crippa et al. 2020). For 1,306 urban areas with populations greater than 500,000, we find a rough balance between cities whose emissions are higher and lower than their expected values. Converting deviations to percentages of expected values, we find that percent deviations are typically greater in absolute value for cities with lower-than-expected emissions.

The remainder of the paper is organized as follows. Section 2 motivates the comparison of tracking models by reviewing the econometric model and supporting data from Dasgupta, Lall and Wheeler (2022). Section 3 develops the candidate tracking models and Section 4 compares their results for a large sample of urban areas. Section 5 applies the selected model to illustrative urban cases, while Section 6 computes expected emissions values and deviations from those values for a large number of urban areas. Section 7 summarizes and concludes the paper. The Appendix describes the World Bank's development of an online platform to support future work.

2. The CO2 Emissions Model

2.1 Model Specification

An extensive body of empirical literature has explored the determinants of growth in CO2 emissions. Attention has focused primarily on the drivers of CO2 emissions from fossil-fuel combustion and cement production (e.g., Raupach et al. 2007; Jotzo et al. 2012). Recent research has also estimated more precisely the CO2 emissions from fires associated with agriculture and land-use change (Gasser et al. 2020; Winkler et al. 2021). However, emissions drivers in this sector have received less attention than work on industrial determinants (Sanchez and Stern 2016). For the industrial sectors, most of the available estimates are inferred from survey-based activity measures that may be incomplete, particularly for developing countries.

This study takes a completely different approach, employing direct CO2 observations from satellites. The dependent variable in our model is the atmospheric CO2 concentration above a cell in a global grid with 25 km resolution. We employ 25-km grid cells defined by Van der Werf et al. (2017), who provide the data on CO2 emissions from fires for this analysis.

The dominant component of the atmospheric concentration is the global stock of CO2 molecules that have accumulated since the Industrial Revolution. The second component is seasonal CO2, reflecting differential absorption and release by vegetation over the annual cycle. The seasonal CO2 component is latitudinal, differing by hemisphere because the Northern Hemisphere has more plant life than the Southern Hemisphere. The third component is local CO2 emissions, reflecting the time lag between local emissions of CO2 molecules and their full dispersion into the global mix. We follow convention by terming this the "concentration anomaly" since it measures the local deviation from the global background CO2 concentration.

In our global emissions model, we classify the determinants of local concentration anomalies in three categories.
The first comprises activities in the most significant CO2-emitting industry sectors. The Intergovernmental Panel on Climate Change (IPCC) (Gale et al. 2005) has identified four dominant industrial sources of CO2 emissions: (i) power plants, (ii) steel mills, (iii) cement plants, and (iv) oil refineries. The second category includes CO2 emissions from agricultural and forest fires. The third category comprises population-related emissions other than those directly associated with CO2-intensive industrial activity. These include motor vehicle emissions, which are not measured reliably at the spatial resolution required for our analysis.1 We would expect traffic emissions per capita to increase with income per capita, all else being equal. Population-related factors also include CO2 emissions from household heating and cooling, which are not captured by data for central power plants.2 The subway component of the model affects population-related emissions by reducing the demand for motor vehicle transport and promoting denser settlements that are more easily served by utility-scale energy sources.

1 Recent research has used Google Traffic to infer vehicular emissions from high-resolution traffic congestion data for some cities (Heger et al. 2018; Dasgupta, Lall, and Wheeler 2021). However, no currently available technology enables direct estimation of global vehicular emissions at 25 km resolution.
2 Household air conditioning is powered by fossil-fired home generators in many hot low-income areas where utility-scale power is either nonexistent or unreliable. For a detailed assessment, see Lam et al. (2019).

We specify the econometric model as follows:

(1)  $\mathrm{CO2}_{it} = \beta_0 + \beta_1 I_{it} + \beta_2 DI_{it} + \beta_3 F_{it} + \beta_4 DF_{it} + (\beta_5 H_{it} + \beta_6 C_{it} + \beta_7 Y_{it})\, P_{it}\, e^{\beta_8 S_{it}(L_{it}, A_{it})} + \varepsilon_{it}$

Expected signs: β1, β2, … β7 > 0, β8 < 0

For grid cell i in period t, CO2it is the satellite-measured mean CO2 concentration anomaly.
Iit stands for CO2 emissions from industrial sources; DIit represents wind-displaced industrial CO2 emissions from other cells; Fit equals CO2 emissions from agricultural and forest fires; and DFit is wind-displaced fire CO2 emissions from other cells. Pit stands for population and Yit represents income per capita. Hit and Cit represent heating and cooling degree days, respectively. Sit is the subway impact index, which is a function of system scale (L) and age (A); ε is a random error term.

In this equation, the atmospheric CO2 anomaly is related to emissions from industrial sources, fires, and non-industrial population sources. Spatially referenced variables in the model are translated to consistent measures by resampling to centroids for our 25-km grid cells. The core model is additive because emissions from the three sources contribute separately to the accumulation of CO2 molecules in the atmosphere. The anomaly recorded for a grid cell by a satellite platform includes emissions from sources within the cell and the "spillover" emissions created by wind displacement from sources in neighboring cells. For industry and fires, the model includes both cell-specific emissions (Iit, Fit) and wind-displaced emissions (DIit, DFit).

In the population-related component of the model, the marginal impact of population (Pit) is a function of heating degree days (Hit), cooling degree days (Cit), and income per capita (Yit). For a subway city, their composite effect is conditioned by an exponential function of the scale (Lit) and age (Ait) of the subway system. The exponential constrains the multiplier to a range from 1 (no subway: Lit = 0, Ait = 0) to 0. The multiplier value should decline with both age and scale, the latter measured by the length of operating subway lines.
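As a rough illustration of the additive structure described above, the following Python sketch evaluates the predicted anomaly for one grid cell and month. It rests on assumptions that go beyond the text: the population term is taken to be scaled by exp(β8·S), and S is taken to be the sum of the inverse hyperbolic sine transforms of subway length and age, following the variable labels in the paper's results table. The function names and the coefficient values in the usage lines are hypothetical and serve only to show the mechanics.

```python
import math


def subway_multiplier(beta8, length_km, age_years):
    """Multiplier exp(beta8 * S) on population-related emissions.

    Assumption: S = asinh(length) + asinh(age). With beta8 < 0 the
    multiplier equals 1 when there is no subway and falls toward 0 as
    scale and age grow.
    """
    s_index = math.asinh(length_km) + math.asinh(age_years)
    return math.exp(beta8 * s_index)


def predicted_anomaly(beta, I, DI, F, DF, P, H, C, Y, L=0.0, A=0.0):
    """Predicted CO2 concentration anomaly for one cell-month.

    beta is a sequence of nine coefficients, beta[0] .. beta[8].
    """
    # Industry and fire emissions, own-cell and wind-displaced, enter additively.
    source_terms = beta[1] * I + beta[2] * DI + beta[3] * F + beta[4] * DF
    # Marginal impact of population depends on heating, cooling and income.
    per_capita = beta[5] * H + beta[6] * C + beta[7] * Y
    return beta[0] + source_terms + per_capita * P * subway_multiplier(beta[8], L, A)


# Hypothetical coefficients and inputs, chosen only for illustration.
example_beta = [0.0, 0.3, 0.3, 0.4, 0.6, 5.0, 0.4, 13.0, -0.2]
without_subway = predicted_anomaly(example_beta, 10, 5, 2, 1, P=1.0, H=0.5, C=0.2, Y=1.5)
with_subway = predicted_anomaly(example_beta, 10, 5, 2, 1, P=1.0, H=0.5, C=0.2, Y=1.5, L=80, A=40)
```

Under these assumptions, adding a subway leaves the industry and fire terms unchanged and shrinks only the population-related term, which is the behavior the text attributes to the multiplier.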
2.2 Data

Several satellite platforms provide CO2 measures, but their data have been collected by different instruments over different periods, with different resolutions, observation repeat cycles and widths of area coverage along orbital paths (Pan, Yuan, and Jieqi 2021). The data are also accessible in varying degrees. Combining observations from multiple sources could present difficulties that are as yet little explored. For this exercise, prudence has dictated the choice of one platform, NASA's OCO-2 (Orbiting Carbon Observatory-2), because it offers open access (JPL/NASA 2021); a long panel of consistently measured, daily observations (beginning on September 6, 2014); and the highest spatial resolution among the available sources (1.29 × 2.25 km).

The design of OCO-2 supports comparative exercises like our analysis. It follows a sun-synchronous near-polar orbit, crossing the equator in ascending mode around 1330 hours local time. This means that the OCO-2 observations for our study are collected between 1200 and 1500 local time for all cities in the sample, providing a consistent mid-day activity benchmark for comparing CO2 concentration anomalies.3 OCO-2 has an observation repeat time of 16 days.

We have downloaded georeferenced measures of XCO2 (the column-averaged dry air mole fraction of CO2). We filter the XCO2 data for local concentration anomalies, or differences between observed and background CO2 at each point. We calculate background CO2 using the methodology of Hakkarainen et al. (2019), which incorporates both temporal and geographic elements. As Hakkarainen et al. note, the available data are insufficient for estimating daily medians at resolutions higher than 10 degrees of latitude. We compute the daily median XCO2 for each 10-degree latitude band and linearly interpolate the result to each OCO-2 observation with 1-degree resolution. Following Hakkarainen et al., we use the median as the representative value because it is not skewed by extreme observations.
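The background filter described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' code: it assumes that band medians are attached to the centers of the 10-degree latitude bands and interpolated linearly between those centers, it clamps values beyond the outermost band centers, and it omits the 1-degree resolution step. All function names are hypothetical.

```python
import math
from statistics import median


def band_medians(day_obs, band_width=10.0):
    """Median XCO2 per latitude band for ONE day of observations.

    day_obs: list of (latitude, xco2) pairs.
    Returns a latitude-sorted list of (band_center_latitude, median_xco2).
    """
    n_bands = int(180.0 / band_width)
    bands = {}
    for lat, xco2 in day_obs:
        idx = min(math.floor((lat + 90.0) / band_width), n_bands - 1)
        bands.setdefault(idx, []).append(xco2)
    return sorted((-90.0 + (i + 0.5) * band_width, median(v)) for i, v in bands.items())


def background_at(lat, centers):
    """Linearly interpolate the band medians to a given latitude."""
    if lat <= centers[0][0]:
        return centers[0][1]
    if lat >= centers[-1][0]:
        return centers[-1][1]
    for (lat0, med0), (lat1, med1) in zip(centers, centers[1:]):
        if lat0 <= lat <= lat1:
            weight = (lat - lat0) / (lat1 - lat0)
            return med0 + weight * (med1 - med0)


def daily_anomalies(day_obs):
    """Observed XCO2 minus interpolated background, for each observation."""
    centers = band_medians(day_obs)
    return [xco2 - background_at(lat, centers) for lat, xco2 in day_obs]
```

Monthly means per 25-km grid cell would then be simple averages of these daily anomalies over the observations that fall in each cell.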
We subtract this background value to compute the local anomaly for each observation. Then we compute monthly mean values of concentration anomalies for the 25-km grid cells in our database.

We use georeferenced facility-level global databases to obtain capacity measures and technology specifications for power plants (Byers et al. 2021), steel mills (GEM 2021), cement plants (McCaffrey et al. 2021), and oil refineries (Auch 2017). We convert capacity measures to annual CO2 emissions using standard emissions factors for power production by fuel source (USEIA 2021), steel mills (World Steel Association 2021), cement (IEA 2020), and refineries (Jing et al. 2020). Van der Werf et al. (2017) provide monthly estimates of carbon emissions from agricultural and forest burning at 25 km resolution. Mistry (2019) has provided global estimates of monthly heating and cooling degree days at 25 km resolution.4

We compute population at 25 km resolution by aggregating data from CIESIN (2021) at 5 km resolution. Monthly estimates are interpolated from data provided for 2010, 2015, and 2020.

We use two sources to construct our georeferenced measure of income per capita. From the G-Econ database (Nordhaus et al. 2006), we obtain GDP per capita in 2005 purchasing power parity for a global grid with 100 km resolution. Each grid cell is assigned to its geographically dominant country by G-Econ. For each cell in a country, we compute the ratio of cell GDP per capita to the national mean for all cells. We merge the results with annual UN estimates of GDP per capita in constant $US 2015 (UN 2021), and use the cell ratios to estimate annual GDP per capita for each cell. We resample these cells to 25 km for compatibility with the rest of our database.

We draw our subway data from two sources.
The first is a global subway survey by Turner and Gonzalez-Navarro (2018), which includes 137 systems installed prior to 2011.5 The survey includes digital subway maps at five-year intervals from 1930 to 2010. The second source is our own survey of 55 subway systems installed since 2010. We have constructed digital maps for these systems using information from OpenStreetMap (OSM 2021). We overlay the digital subway maps on our 25 km grid and compute the total length of subway lines in each grid cell and year.

3 CO2 measurement during the full daily activity cycle will improve as systems like OCO-3 observe each area at more widely varying times.
4 Mistry's data terminate in December 2019. We extend the domain for regression analysis by computing monthly means for each 25 km cell using the data for 2014–19.
5 Gendron-Carrier et al. (2020) provide the following definition: "These data define a 'subway' as an electric powered urban rail system isolated from interactions with automobile traffic and pedestrians. This excludes most streetcars because they interact with vehicle and pedestrian traffic at stoplights and crossings, but underground streetcar segments are counted as subways. The data do not distinguish between surface, underground, or aboveground subway lines as long as the exclusive right of way condition is satisfied. To focus on intraurban subway transportation systems, the data exclude heavy rail commuter lines (which tend not to be electric powered). For the most part, these data describe public transit systems that would ordinarily be described as 'subways', e.g., the Paris metro and the New York city subway, and only such systems."
Both subway age and line length measures are highly right-skewed, so we apply the inverse hyperbolic sine transformation prior to estimation.6

2.3 Accounting for Wind Displacement of CO2 Emissions

Space-based observations detect higher CO2 concentrations over emissions sources because atmospheric diffusion is not instantaneous. As the prevailing winds displace emissions from their sources, deviations from background concentrations persist for some time. City-level or plant-level estimates have commonly employed measures of wind direction to model these effects (Nassar et al. 2017; Wu et al. 2020; Ye et al. 2020). We replicate this exercise at global scale, using ERA5 monthly wind direction data for all grid cells in the database (Hersbach et al. 2019).

For emissions from each grid cell, we determine the wind-directed path across neighboring cells. We compute monthly wind bearings at 0.25° resolution from 10-m u and v components and then resample to our 25 km grid.7 Wind paths are calculated in sequence. For each origin cell (A) in the sequence, the destination cell (B) is determined by the wind bearing in cell A. Using each grid cell as a source, we determine the sequential path across nearby cells through 20 iterations.

Theory provides no guidance on local atmospheric persistence as wind displacement proceeds, so we address the issue empirically. In preliminary regression experiments, we perform a grid search across two variables. The first is the duration decay function, modeled as the inverse of the iteration sequence number raised to a power that varies in increments of 0.1 between 0 and 2.0. The second is the number of iterations, which varies from 1 to 20. Our grid search yields best fits for decay and iteration parameters of 1.0 and 10, respectively. Using these parameters, we incorporate wind displacement effects as follows.
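The path-tracing and decay-weighting ideas can be illustrated with a short Python sketch. Several simplifications here are assumptions rather than the paper's method: the bearing is computed directly as atan2(u, v), meaning the direction toward which the air moves; cells sit on a regular (row, column) lattice with rows increasing northward; each step moves to one of the eight neighboring cells; and the bearing field is held fixed within a month. The function names are hypothetical. The defaults of 10 iterations and a decay power of 1.0 are the grid-search results reported above.

```python
import math

# Offsets (d_row, d_col) for the compass octants N, NE, E, SE, S, SW, W, NW.
OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def bearing_from_uv(u, v):
    """Bearing in degrees clockwise from north, from eastward (u) and northward (v) wind."""
    return math.degrees(math.atan2(u, v)) % 360.0


def next_cell(cell, bearing):
    """Neighboring cell that lies closest to the given bearing."""
    octant = int(((bearing + 22.5) % 360.0) // 45.0)
    d_row, d_col = OFFSETS[octant]
    return (cell[0] + d_row, cell[1] + d_col)


def trace_path(origin, bearings, n_iter=10):
    """Sequence of destination cells reached from origin, one per iteration."""
    path, cell = [], origin
    for _ in range(n_iter):
        if cell not in bearings:  # left the area covered by wind data
            break
        cell = next_cell(cell, bearings[cell])
        path.append(cell)
    return path


def displaced_totals(emissions, bearings, n_iter=10, power=1.0):
    """Decay-weighted wind-displaced emissions received by each cell.

    Emissions arriving at iteration step k are weighted by 1 / k**power.
    """
    totals = {}
    for origin, amount in emissions.items():
        for step, dest in enumerate(trace_path(origin, bearings, n_iter), start=1):
            totals[dest] = totals.get(dest, 0.0) + amount / step ** power
    return totals
```

Running `displaced_totals` separately on industrial and fire emissions would give illustrative counterparts of the DI and DF variables in equation (1).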
For each year and month, we use our industrial and fire emissions data to compute total CO2 emissions separately for industry and fires in each grid cell. We route these emissions across nearby cells through 10 sequential iterations, identifying the destination cells by iteration. Once this process is complete for all cells in the grid, we proceed cell-by-cell. For each cell, we add across observations for displaced CO2 from every other cell, with separate totals by iteration step. We weight these totals for industry and fire CO2 by the inverse iteration step number (which incorporates the decay function). Next, we add across the weighted totals to obtain the overall decay-weighted totals for wind-displaced CO2 in each cell. These are the variables DI and DF in the econometric model (equation [1]).

6 We use the IHST rather than the logarithmic transformation because most of the observations in the data set are zeros (see Burbidge, Magee and Robb (1988) and Layton (2001)).
7 The bearing calculation formula can be viewed at https://www.movable-type.co.uk/scripts/latlong.html.

2.4 Model Results

Our econometric results are presented in Table 1. The table includes results for alternative estimators that incorporate different assumptions about the structure of the stochastic error term (εit) in the model. These techniques produce the same point estimates for model parameters, but their differing estimates of standard errors (and the accompanying t-statistics) may lead to very different inferences about the statistical significance of model variables. We replicate the point estimates in columns (1) – (3) to aid interpretation of the t-statistics. We include results for standard nonlinear (NL) regression, NL with robust standard errors (SE) and NL with SE adjusted for 3,074 clusters defined by level-1 administrative units (states, provinces, etc.) for the 190 countries in the regression database. As the table shows, all results have the expected signs.
The mean anomaly for satellite-observed CO2 concentrations in a 25-km grid cell is positively related to direct emissions from industry and fires; wind-displaced emissions from industry and fires in neighboring areas; and population-related emissions. The marginal impact of population is positively related to heating needs (heating degree days), cooling needs (cooling degree days) and income per capita, and declines with the scale and age of the subway system. All variables meet classical significance tests in all three regressions, with the exception of cooling degree days in the cluster-adjusted regression.

3. Alternative Templates for CO2 Emissions Tracking

As noted in the introduction, recent research indicates that satellite-based observations can support an objective, spatially referenced system for tracking CO2 trends in local areas and regions. Pre-filtering by the method of Hakkarainen et al. (2019) or similar methods enables measurement of CO2 concentration anomalies, the locally determined components of atmospheric concentrations. Temporal considerations are also important because satellite-based measurements, like observations from ground-based monitors, include random components that hinder short-period trend identification. At the present state of the art, trend estimates for periods shorter than a year seem problematic.
Table 1: Determinants of CO2 concentration anomalies
Dependent Variable: XCO2 Anomaly (parts per billion)

Variable                                    NL            NL (Robust)   NL (Cluster)a
Industry CO2 Emissions                      0.277***      0.277***      0.277***
  ['000 Tons]                               (26.19)       (18.85)       (9.88)
Industry CO2 Wind-Displaced Emissions       0.275***      0.275***      0.275***
  ['000 Tons (Weighted)]                    (44.59)       (31.60)       (13.82)
Fires CO2 Emissions                         0.359***      0.359*        0.359*
  ['000 Tons]                               (24.79)       (2.35)        (2.03)
Fires CO2 Wind-Displaced Emissions          0.573***      0.573***      0.573***
  ['000 Tons (Weighted)]                    (61.85)       (8.64)        (4.46)
Population ['000]                           5.362***      5.362***      5.362***
  x Heating Degree Days                     (69.53)       (17.25)       (6.18)
Population ['000]                           0.398***      0.398***      0.398
  x Cooling Degree Days                     (5.48)        (5.29)        (1.32)
Population ['000]                           13.54***      13.54***      13.54***
  x Income Per Capita [$US '000]            (24.45)       (15.94)       (3.75)
IHSTb [Subway Scale] + IHSTb [Subway Age]   -0.193***     -0.193***     -0.193***
  [Scale: Track Length in km]               (-37.48)      (-22.29)      (-10.51)
Constant                                    -193.4***     -193.4***     -193.4***
                                            (-192.01)     (-153.66)     (-10.32)
Observations                                1,961,754     1,961,754     1,961,754

a GADM (2021) Level 1 Administrative Divisions (States, Provinces)
b IHST: Inverse hyperbolic sine transformation
t statistics in parentheses
* p<0.05 ** p<0.01 *** p<0.001

At temporal resolutions of one year or greater, it may be useful to augment the Hakkarainen technique with a second filter based on an econometric model like (1) above. The model relates local CO2 concentration anomalies to local emissions from industry, fires and non-industrial population sources. Model parameters provide estimates of characteristic marginal relationships between local emissions and satellite-recorded concentration anomalies, while model residuals measure the deviations of local anomalies from their expected values. In principle, model estimation could be viewed as a useful post-Hakkarainen filter, with trends in model residuals used for tracking changes in local CO2 emissions intensities.
However, the present case is complicated by the presence of both static and dynamic variables in the model. The industry components are basically fixed effects because they are derived from fixed plant capacities, not variable outputs. Among the non-industrial components, population, income per capita and the subway variables are interpolated from observations over intervals of a year or longer. The only variables with one-month periodicity are fires, heating degree days and cooling degree days. Under these conditions, it is not clear whether the econometric post-Hakkarainen filter adds significant value to an emissions tracking analysis.

We test the filtering utility of econometric model estimation with data for 6,142 Functional Urban Areas (FUAs) with populations greater than 100,000, as defined by Schiavina et al. (2019). Our exercise controls for potential biases introduced by limited sampling within FUAs. Although the OCO-2 satellite platform provides the best available database, its coverage for our 25-km grid cells is limited by its 16-day repeat cycle, relatively narrow observation track, and the frequent occurrence of cloud cover over some areas. Within an FUA, typical concentration anomalies may differ substantially across grid cells. To cite one possible consequence, a naive trend analysis could generate spuriously positive results in cases where early-period observations are more numerous in lower-anomaly cells and later observations are more concentrated in higher-anomaly cells. To test for this potential source of bias, our exercise includes regressions with dummy variable controls for grid cells.

For each FUA (j), we estimate the following tracking models:

CO2 Models
(2)  $\mathrm{CO2}_{it} = \gamma_0 + \gamma_1 t + \varepsilon_{it}$
(3)  $\mathrm{CO2}_{it} = \gamma_0 + \sum_{i=1}^{N} \delta_i D_i + \gamma_1 t + \varepsilon_{it}$

Residuals Models
(4)  $[\mathrm{CO2}_{it} - \widehat{\mathrm{CO2}}_{it}] = \beta_0 + \beta_1 t + \varepsilon_{it}$
(5)  $[\mathrm{CO2}_{it} - \widehat{\mathrm{CO2}}_{it}] = \beta_0 + \sum_{i=1}^{N} \delta_i D_i + \beta_1 t + \varepsilon_{it}$

where, for grid cell i in month t:
CO2it = Mean CO2 anomaly (after pre-filtering by the method of Hakkarainen et al.
(2019))
$\widehat{\mathrm{CO2}}_{it}$ = Prediction from model (1) above
Di = Dummy variable for grid cell i8
t = Time from initial period in months
εit = Random error term

For an FUA, changes in the CO2 concentration anomaly over the sample period are judged from the sign, size and statistical significance of $\hat{\gamma}_1$ and $\hat{\beta}_1$.

8 No subscript j is needed because grid cells are uniquely assigned to FUAs.

4. Results for Tracking Models

For model evaluation, we select the 507 urban areas with populations greater than 100,000 that have sufficient observations to yield 60 degrees of freedom after accounting for the number of dummy variable controls in equations (3) and (5).9 All models are estimated for the period September 2014 to October 2021. We are particularly interested in testing the efficacy of (2), the simplest possible tracking model, which is estimated directly from the pre-filtered data with no grid cell dummy variables or econometric-model-based controls for local emissions sources.

Our tests are performed for the change parameters $\hat{\gamma}_1$ and $\hat{\beta}_1$. Table 2 displays correlation coefficients between change parameters, ordered by structural "distance" from model (2). We focus on column (2), which tabulates correlation coefficients for model (2). They are all very high, declining slightly from 0.99 to 0.97 for (5), the most structurally distant model, which includes filtering with econometric residuals and dummy variable controls for grid cells. Figure 1 displays the accompanying point scatter for model (2) vs model (5), while Table 3 presents the associated regression results.

9 The familiar classical 95% significance criterion is t=2.00 with 60 degrees of freedom for estimation.

Table 2: Correlation coefficients for model parameters (N=507)

Model   Parameter   (2) γ1   (3) γ1   (4) β1   (5) β1
(2)     γ1          1.00
(3)     γ1          0.98     1.00
(4)     β1          0.98     0.97     1.00
(5)     β1          0.97     0.98     0.98     1.00

Table 3: Regression results

Dependent variable: β1 (Model 5)
γ1 (Model 2)    0.974***
                (88.10)
Const.          0.0906
                (1.05)
R2              0.94
N               507

t statistics in parentheses
* p<0.05 ** p<0.01 *** p<0.001

Figure 1: β1 (model 5) vs γ1 (model 2)

The results in Tables 2 and 3 and Figure 1 strongly suggest that the simplest model (2) is sufficient for tracking trends in local concentration anomalies. This is good news for interested global stakeholders, who can track areas of interest in two simple steps: (1) match the areas to grid cells in the 25 km Hakkarainen-filtered database of concentration anomalies; (2) estimate model (2) for each area.

5. Illustrative Applications

5.1 Long- and Short-Period Tracking Models

We illustrate the methodology with two tracking models for urban areas. The first is (6), reproducing (2) above, which provides trend estimates for an extended period. While these multi-year trends provide useful information, they may lack the immediacy needed to catalyze local action. The second model (7) contributes by estimating the size and significance of changes in the most recent year. Technically, (7) replaces the trend term in (2) with a dummy variable (DF) for observations in the final year.

(6)  $\mathrm{CO2}_{it} = \gamma_0 + \gamma_1 t + \varepsilon_{it}$
(7)  $\mathrm{CO2}_{it} = \gamma_0 + \gamma_1 DF_{t} + \varepsilon_{it}$

5.2 Notes on Interpretation

In the following section, we report urban trend results for September 2014 – December 2021 and dummy-variable results for two five-year periods: January 2015 – December 2019 and January 2017 – December 2021. Before presenting the results, we believe that some interpretive notes are warranted.

First, we offer a caveat about viewing long- and short-period differences across urban areas as indicators of differential performance, because "performance" implies intentionality on the part of public or private actors. Even highly significant changes in local concentration anomalies may reflect non-intentional factors such as changed agricultural practices, blight-induced forest degradation, or additional emissions from traffic congestion during periods when mass transit systems are installed.
Measurement anomalies may also intrude, particularly for model (7) because it focuses on relatively short-run changes.10 To summarize, it may be more useful to view tracking results as guides to detailed local assessments rather than as performance indicators.

A note about measurement units is also warranted. In Dasgupta, Lall and Wheeler (2022), we convert concentration anomalies to emissions estimates using the overall ratio of total global CO2 emissions (drawn from standard sources) to the total for all grid squares of concentration anomalies predicted from the parameters of our econometric model. This enables us to distribute predicted CO2 emissions across all terrestrial cells of the 25 km global grid. However, we have little confidence in our current ability to infer changes in emissions volumes from directly observed changes in local anomalies that are not linked to identifiable ground sources. For this reason, our trend estimation exercise operates solely with concentration anomalies. This has no practical consequence, since the results provide readily comparable change estimates.

5.3 Tracking Data

We estimate models (6) and (7) for 1,799 functional urban areas (FUAs) whose data provide at least 30 degrees of freedom for estimation.11 Table 4 enumerates the outlying observations that are removed prior to estimation. Observations with standardized z-values greater than 5.0 have been identified as outliers.12 As the table shows, 1,761 of 1,799 FUAs have no outliers. Single outliers have been removed for 35 FUAs, 2 outliers for 2 FUAs and 3 for 1 FUA.

10 See particularly Weir et al. (2021) for a useful discussion of potential effects from year-to-year variability caused by differences in atmospheric circulation. We address this issue in Section 5.5.
11 The sample is much larger than the sample used in Section 4 because models (6) and (7) do not absorb degrees of freedom with dummy variable controls for grid cells.
12 The z-value of an observation is its distance from the mean, measured in standard deviations.

Table 4: Outlier observations removed

Outliers (z>5)    FUAs
0                 1,761
1                 35
2                 2
3                 1
Total             1,799

Model estimation yields t-statistics, which we use to classify results by degree of significance. We group urban areas into four categories by significance level: p>.05 (no significant change at 95% confidence); p≤.05 (95% confidence); p≤.01 (99% confidence) and p≤.001 (99.9% confidence). Figure 2 illustrates this classification for selected urban areas with negative and positive trends at different levels of significance for the period 2014 – 2021.13 Each graph includes the regression line (equivalent to model (6)) through the observations. The graphs for the non-significant group have been chosen for FUAs with t-statistics near 1.0. The illustrations highlight the need for statistical analysis, given the role of random variation in satellite-based measurements. Of course, the same is true for ground-based measurements (e.g. Chakraborty et al. 2008).

5.4 Tracking Results: FUA Trends, 2014 - 2021

Table 5 summarizes our results for trend equation (6). Our sample comprises 1,799 urban areas with 30 degrees of freedom or more during the period 2014 - 2021.14 Of these, 272 have significant decreases in local concentration anomalies and 108 have significant increases. Overall, 380 of 1,799 cities (21.1%) have significant changes during the 8-year period. Table 6 distributes the changes across regions, showing Asia with a disproportionate share of increases and the other regions with disproportionate shares of decreases. Figure 3 maps the same information, showing that within Asia, substantial portions of the decreases and increases are accounted for by India and China, respectively.

13 Scaling on the y-axis varies across cases because observations vary over different ranges.
14 FUAs in the database have populations of 50,000 or more.
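The outlier screen of Section 5.3 and the trend classification of Section 5.4 can be sketched for a single urban area as follows. This is a minimal illustration, not the authors' code. It assumes a plain list of monthly anomalies, uses the month index as the trend variable, treats z-values in absolute terms with the sample standard deviation, and approximates p-values with the standard normal distribution (a simplification that is tolerable only because the sample requires at least 30 degrees of freedom). All function names and the synthetic series are hypothetical.

```python
import math
from statistics import mean, stdev


def drop_outliers(series, z_max=5.0):
    """Return (month_index, value) pairs whose |z-value| is at most z_max.

    Assumption: z-values use the sample standard deviation and are
    compared in absolute value; the paper only states z > 5.
    """
    mu, sd = mean(series), stdev(series)
    return [(t, v) for t, v in enumerate(series)
            if sd == 0 or abs(v - mu) / sd <= z_max]


def fit_trend(points):
    """OLS of anomaly on month index; returns (slope, t_statistic, dof)."""
    ts = [t for t, _ in points]
    ys = [v for _, v in points]
    t_bar, y_bar = mean(ts), mean(ys)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    sse = sum((y - intercept - slope * t) ** 2 for t, y in points)
    dof = len(points) - 2
    se = math.sqrt(sse / dof / sxx)
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, t_stat, dof


def significance_class(t_stat):
    """Two-sided p-value class, using a normal approximation to Student's t."""
    p = math.erfc(abs(t_stat) / math.sqrt(2))
    for label, cutoff in (("p<=.001", 0.001), ("p<=.01", 0.01), ("p<=.05", 0.05)):
        if p <= cutoff:
            return label
    return "p>.05"


# Synthetic series of 88 months (the length of Sep 2014 - Dec 2021),
# with a mild downward trend and one injected measurement spike.
series = [1.0 - 0.01 * t + 0.2 * math.sin(t) for t in range(88)]
series[40] = 50.0
points = drop_outliers(series)
slope, t_stat, dof = fit_trend(points)
print(len(points), round(slope, 4), significance_class(t_stat))
```

The month index is kept alongside each value so that removing an outlier does not shift the timing of later observations. A production version would replace the normal approximation with the exact Student's t distribution.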
Figure 2: Trends in monthly concentration anomalies, 2014 – 2021 (ppm). Columns: Negative Trends, Positive Trends. p≤.001: Vila Velha, Brazil (negative); Naples, Italy (positive).
To summarize, regional atmospheric circulation is rejected as the source of a significant change in an FUA concentration anomaly if the outside change parameter has the opposite sign from the inside parameter and/or the inside parameter is significant while the outside parameter is not. Table 7 provides evidence for 335 FUAs from our estimation sample that have populations greater than 100,000 and inside change parameters with p≤.05. Among the corresponding outside change parameters, 92 (27.5%) have sign reversal (the inside and outside parameters have opposite signs). Among the 243 outside parameters that do not have sign reversal, 136 (56.0%) are not significant at 95% confidence. In summary, the null hypothesis is rejected for 228 (92 + 136), or 68.1%, of the 335 urban areas in the sample.

Table 7: FUA concentration anomaly trends: inside/outside test results
(p(inside) ≤ .05; FUAs with populations > 100,000)
Conditions for rejection of H0: sign(inside) ≠ sign(outside), or p(inside) ≤ .05 and p(outside) > .05

Sign Reversal                           Count    Percent
Yes                                        92       27.5
No                                        243       72.5
Total                                     335

If No Sign Reversal: Outside Parameter Significant (95%)
Yes [p(outside) ≤ .05]                    107       44.0
No [p(outside) > .05]                     136       56.0
Total                                     243

Atmospheric Circulation Hypothesis
Accepted                                  107       31.9
Rejected                                  228       68.1
Total                                     335

Table 8 incorporates this factor, presenting percent changes in CO2 concentration anomalies for 83 urban areas with populations greater than 100,000, trend results significant at p≤.01, and no evidence of general circulation effects.15 The table reveals substantial geographic diversity, with 44 countries represented. The United States has 13 entries, followed by India (9), the Russian Federation (5), China (4) and South Africa (4). Table 9 shows that cities with decreasing concentration anomalies outnumber cities with increasing anomalies in all regions. At the same time, African and European cities have disproportionate decreases, American cities have disproportionate increases, and Asian cities are about equally represented. Overall, 71% of the cities have decreasing trends and 29% have increasing trends.

5.6 Short-Period FUA Changes

Model (7) estimates the size and significance of final-year deviations from typical concentrations in previous years. For this illustration, we employ rolling five-year time series for urban areas. The change indicator is $\beta_1$, the parameter for the final-year dummy variable in model (7). Figure 4 provides illustrations for six urban areas during the period 2017-2021.16 Warsaw and Orlando represent the highest confidence class (p≤.001), with Warsaw concentration anomalies in 2021 significantly below the five-year line and those of Orlando significantly above. The results for Hengyang and Mumbai (p≤.05) meet classical significance standards but are somewhat less robust, because the estimated final-year deviations are smaller and/or the observational variance is larger than in the most robust cases. Casablanca and Bukhara have decreases and increases, respectively, but neither is significant by classical standards.

Tables 10 and 11 summarize our results for five-year periods ending in 2019 and 2021, respectively.17 The patterns of statistical significance and negative/positive distributions are similar for the two periods; significant changes are identified for 225 urban areas in 2015-2019 and 235 areas in 2017-2021.

15 We calculate percent changes after subtracting each city’s minimum concentration anomaly from all of its observations. This permits percent calculations by transforming negative anomalies to positive values. For each city, we use its regression result for model (6) to predict concentrations in the first and last years. These predictions are used to calculate percent changes.
16 Again a cautionary note: y-axis scaling differs substantially across cases.
17 The results are for urban areas with at least 30 degrees of freedom for estimation.
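The short-period model (7) can be sketched in a few lines. With a single final-year dummy as the only regressor, the OLS estimate of the dummy parameter equals the difference between the final-year mean and the mean of the earlier years, and its standard error follows from the pooled residual variance. The code below is an illustration under those assumptions, not the authors' implementation; the function name and the synthetic five-year series are hypothetical.

```python
import math
from statistics import mean


def final_year_deviation(anomalies, years):
    """OLS of anomaly on a final-year dummy; returns (b1, t_statistic, dof).

    b0 is the mean anomaly in the earlier years and b1 is the
    final-year deviation from that mean.
    """
    final = max(years)
    last = [a for a, y in zip(anomalies, years) if y == final]
    prev = [a for a, y in zip(anomalies, years) if y != final]
    b0 = mean(prev)
    b1 = mean(last) - b0
    # Residual sum of squares around the two group means.
    sse = (sum((a - b0) ** 2 for a in prev)
           + sum((a - b0 - b1) ** 2 for a in last))
    dof = len(anomalies) - 2
    se = math.sqrt(sse / dof * (1 / len(last) + 1 / len(prev)))
    t_stat = b1 / se if se > 0 else math.copysign(math.inf, b1)
    return b1, t_stat, dof


# Synthetic monthly data for 2017-2021: a level of 1.0 ppm in the first
# four years and 0.6 ppm in 2021, with a fixed zero-mean monthly pattern.
pattern = [0.1, -0.1, 0.05, -0.05, 0.0, 0.0] * 2
years, anomalies = [], []
for year in range(2017, 2022):
    level = 0.6 if year == 2021 else 1.0
    for wiggle in pattern:
        years.append(year)
        anomalies.append(level + wiggle)

b1, t_stat, dof = final_year_deviation(anomalies, years)
print(round(b1, 3), round(t_stat, 1), dof)
```

Rolling the five-year window forward by one year and re-running the function gives the kind of annual update that Tables 10 and 11 summarize.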
Table 8: Trends in CO2 concentration anomalies, 2014 – 2021
(population ≥ 100,000; p ≤ .01; atmospheric circulation effects absent)

City                    Country             % Change    City              Country             % Change
Shivamogga              India                  -38.2    Matala            Angola                  61.0
Trier                   Germany                -32.7    Marrakesh         Morocco                 53.8
Lublin                  Poland                 -31.5    Haldwani          India                   49.5
Cherkasy                Ukraine                -31.5    Dimitrovgrad      Russian Federation      49.0
Maputo                  Mozambique             -30.8    Hachinohe         Japan                   46.7
Balurghat               India                  -30.6    Winston-Salem     United States           44.8
Vientiane               Lao PDR                -30.5    Eugene            United States           42.0
Bindi                   Pakistan               -30.1    Ho Chi Minh City  Vietnam                 42.0
Valence                 France                 -29.9    Huelva            Spain                   41.7
Izmir                   Türkiye                -29.4    San José          Costa Rica              39.0
Durban                  South Africa           -28.7    Sora-myeon        Korea, Rep.             32.5
Tiraspol                Moldova                -28.5    Orlando           United States           32.1
Le Tampon               Reunion                -28.5    Van               Türkiye                 28.9
Adama                   Ethiopia               -28.4    Bucharest         Romania                 26.2
Fayetteville            United States          -28.2    Ahwaz             Iran, Islamic Rep.      26.2
Odense                  Denmark                -27.8    Puerto Vallarta   Mexico                  22.8
Jabalpur                India                  -27.5    Lviv              Ukraine                 21.6
Río Piedras [San Juan]  Puerto Rico            -26.5    Portsmouth        United Kingdom          21.4
Ulyanovsk               Russian Federation     -25.6    Florianópolis     Brazil                  20.6
Cuernavaca              Mexico                 -25.5    Daegu             Korea, Rep.             18.9
Jerez                   Spain                  -24.8    Hangyulu          China                   14.8
Ciudad del Este         Paraguay               -24.5    Appleton          United States           14.5
Irkutsk                 Russian Federation     -24.5    Jacksonville      United States           14.4
Ogden                   United States          -24.4
Chenzhou                China                  -24.3
Yicheng                 China                  -23.7
Relizane                Algeria                -23.6
Mariupol                Ukraine                -23.3
Warangal                India                  -23.0
Balaghat                India                  -22.6
Sterlitamak             Russian Federation     -22.3
Janesville              United States          -22.2
Eskisehir               Türkiye                -22.1
Champaign               United States          -21.3
Cape Town               South Africa           -21.2
Middelburg              South Africa           -21.2
Maumere                 Indonesia              -21.1
Behbahan                Iran, Islamic Rep.     -21.0
Rustenburg              South Africa           -20.8
Katowice                Poland                 -20.5
Saint Petersburg        Russian Federation     -20.3
Iloilo City             Philippines            -19.1
Athens                  Greece                 -19.0
Melbourne               Australia              -18.9
Guadalajara             Mexico                 -18.8
Nicosia                 Cyprus                 -18.8
Rajgarh                 India                  -17.5
Sherbrooke              Canada                 -16.8
La Rochelle             France                 -16.7
St. Louis               United States          -16.7
Lafayette               United States          -16.4
Buenos Aires            Argentina              -16.1
Rajkot                  India                  -15.8
Muzaffarpur             India                  -15.0
Madrid                  Spain                  -14.8
Dhaka                   Bangladesh             -13.4

Table 9: Trends by region*

Region      Decrease      Increase      Total
Africa      8 (13.6)      2 (8.3)       10 (12.1)
Americas    14 (23.7)     8 (33.3)      22 (26.5)
Asia        20 (33.9)     9 (37.5)      29 (34.9)
Europe      16 (27.1)     5 (20.8)      21 (25.3)
Oceania     1 (1.7)       0 (0.0)       1 (1.2)
Total       59 (71.1%)    24 (28.9%)    83
* Column percents in parentheses

Table 10: FUAs, 2019 deviations from 5-year means, 2015-2019*
Dependent variable: Local CO2 Anomaly (ppm)

                      2019 Deviation
Significance Level    Negative    Positive    Total
Not Significant            446         401      847
.05                         71          56      127
.01                         42          29       71
.001                        16          11       27
Total Significant          129          96      225
Total                      575         497    1,072
* FUAs with degrees of freedom ≥ 30

Table 11: FUAs, 2021 deviations from 5-year means, 2017-2021*
Dependent variable: Local CO2 Anomaly (ppm)

                      2021 Deviation
Significance Level    Negative    Positive    Total
Not Significant            418         431      849
.05                         76          70      146
.01                         37          25       62
.001                        15          12       27
Total Significant          128         107      235
Total                      546         538    1,084
* FUAs with degrees of freedom ≥ 30

Figure 4: Monthly local concentration anomalies, 2017 - 2021 (ppm). Columns: Negative Deviation, Positive Deviation. p≤.001: Warsaw, Poland (negative); Orlando, United States (positive).
.05, ≤.05, ≤.01, ≤.001].
• Stata files that match G25 grid cell ID numbers for:
  • Functional urban areas (ID numbers from Schiavina et al. (2019));
  • National level 0, 1 and 2 administrative areas (World Bank GADM (2021)).
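As an illustration of how a crosswalk between grid cells and urban areas might be used, the sketch below joins hypothetical grid-cell records to functional urban areas and averages the monthly cell anomalies within each area. The record layout (cell ID, month, anomaly), the ID values, and the unweighted mean across cells are all assumptions made for this example. None of them is taken from the published files.

```python
from collections import defaultdict
from statistics import mean


def fua_monthly_means(cell_anomalies, cell_to_fua):
    """Average cell-level anomalies by (FUA, month).

    cell_anomalies: iterable of (cell_id, month, anomaly_ppm) records.
    cell_to_fua:    dict mapping a grid cell ID to an FUA ID (hypothetical
                    in-memory form of a cell-to-FUA crosswalk).
    """
    grouped = defaultdict(list)
    for cell_id, month, anomaly in cell_anomalies:
        fua_id = cell_to_fua.get(cell_id)
        if fua_id is not None:  # skip cells that belong to no FUA
            grouped[(fua_id, month)].append(anomaly)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical IDs and values, for illustration only.
crosswalk = {101: "FUA_A", 102: "FUA_A", 201: "FUA_B"}
records = [
    (101, "2021-01", 1.0),
    (102, "2021-01", 2.0),
    (201, "2021-01", -0.5),
    (999, "2021-01", 3.0),  # cell with no FUA match
]
monthly = fua_monthly_means(records, crosswalk)
print(monthly)
```

The resulting FUA-by-month series is the kind of input that the trend and final-year models operate on.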