Policy Research Working Paper 10957

Designing Air Quality Measurement Systems in Data-Scarce Settings

Bridget Hoffmann
Sveta Milusheva

Development Economics
Development Impact Group
October 2024

Abstract

While populations in low- and middle-income countries are exposed to some of the highest levels of air pollution and its consequences, the majority of economics research on the topic is focused on high-income settings where there is greater data availability. This paper compares and evaluates the three principal sources of air pollution data (regulatory-grade monitors, satellites, and low-cost monitors) in a Sub-Saharan African context in terms of the accuracy of measurements of inhalable fine particulate matter across spatial and temporal frequencies and their performance when studying policy impacts. Satellite data is closely aligned with data from the regulatory-grade monitors at lower temporal frequencies. The low-cost monitors underestimate the amount of fine particulate matter relative to the other data sources. Calibration, especially context-specific calibration, of the low-cost monitors’ data improves its alignment with other data sources. The paper uses each data source to evaluate the air pollution externality of mobility reduction policies using a difference-in-differences design and finds similar results, especially in terms of percent reduction. The paper considers policy makers’ constraints to air pollution monitoring in low-income settings and demonstrates that co-locating one regulatory-grade monitor in a network of low-cost monitors can capture the spatial variation of pollution across an urban area and achieve better accuracy than either of these data sources alone. This provides a framework for policy makers to generate the data needed to evaluate environmental policies and externalities.

This paper is a product of the Development Impact Group, Development Economics.
It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at smilusheva@worldbank.org and bridgeth@IADB.org. The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent. Produced by the Research Support Team Designing Air Quality Measurement Systems in Data-Scarce Settings Bridget Hoffmann and Sveta Milusheva ∗ Keywords: Air pollution, PM2.5 measurement, satellite, low-cost monitors, policy evaluation Codes: Q53, Q58, Q52, C81 ∗ We would like to thank Niall Maher and Ruiwen Zhang for research assistance support and Aram Gassama, Marion Sagot, Mame Diarra Bousso Sarr, Tiloux Soundja, Abdallah Cissé and Alice Crolard for project and field coordination assistance. We would like to thank the Directorate for Environment and Classified Facilities (DEEC) and its Center for Air Quality Management (CGQA) at the Ministry of Environment, Sustainable Development and Ecological Transition, and the Executive Council for Sustainable Transport (CETUD). 
We also wish to thank all of the institutions that allowed us to install air quality monitors at their locations and supported us in maintaining these monitors and ensuring our ability to collect data in these locations. Furthermore, we would like to thank the World Bank Senegal Country Management Unit and the BRT Transport team. We gratefully acknowledge funding from the UK Government through the ieConnect for Impact Program and from the European Union through the IntPact Program.

1 Introduction

Air pollution has causal impacts on mortality, physical and mental health, labor supply and productivity, migration, cognitive performance, human capital accumulation, and decision-making (Hoffmann and Rud, 2024; Aguilar-Gomez et al., 2022; Chen et al., 2022; Guidetti et al., 2021; Anderson, 2020; Deryugina et al., 2019; Zhang et al., 2017; Arceo et al., 2016; Graff Zivin and Neidell, 2013). The majority of the evidence on the impacts of air pollution comes from high-income settings where there is high-quality data available to study pollution and its impacts.1 However, individuals in low- and middle-income countries face greater exposure to air pollution and greater negative impacts from it than those in high-income countries. Approximately 80% of people exposed to unsafe levels of air pollution live in low- and middle-income countries (Rentschler and Leonova, 2022) and 93% of the mortality and morbidity attributed to air pollution globally occurs in developing countries (World Bank and Institute for Health Metrics and Evaluation, 2016). As there are important differences in the effects of air pollution across higher- and lower-income settings (Arceo et al., 2016), more research is needed in low-income countries to evaluate both the extent of the problem and the effectiveness of policies in tackling it. A necessary first step is to develop approaches that can accurately measure air pollution, taking into account the particular constraints faced by governments in low-income countries.
Recent studies have expressed enthusiasm about new frontiers in monitoring air quality, such as the advent of low-cost monitoring devices and the availability of satellite data (Snyder et al., 2013; Di et al., 2016), but it is critical to understand how well they measure air pollution and how useful they are for policy evaluation. Additionally, identifying how to combine different data sources into a cost-efficient air quality monitoring system could help low-income countries improve their air pollution measurement. In this paper, we compare and evaluate three sources of air pollution data in a lower middle-income context. We focus on particulate matter with a diameter of 2.5 micrometers or less (PM2.5), which is a mixture of fine inhalable particles of various chemical compositions including dust, dirt, soot, and smoke. PM2.5 is a widely regulated air pollutant and has stronger health effects across a broader range of health outcomes than coarser particles (Bell et al., 2004; Pope and Dockery, 2006). First, we compare the data sources in terms of accuracy of PM2.5 measurement across different spatial and temporal frequencies. Second, we use each of the data sources to evaluate a specific policy that reduced air pollution and compare the estimated treatment effects on PM2.5. Third, we consider policy makers’ constraints to air quality measurement in low-income settings to make policy recommendations about designing systems of air pollution monitoring. We compare and evaluate three principal sources of air pollution data. First, the gold standard is regulatory-grade monitors that are widely used by governments in high-income countries to create a network of ground monitoring stations that capture detailed, precise and accurate measurements of air pollutants. 
However, the high set-up and maintenance costs of regulatory-grade monitors preclude low- and middle-income countries (LMICs) from developing dense networks of regulatory-grade monitors and from replacing them at recommended intervals.

1 One exception is China, which is an upper-middle income country, and is the focus of a large number of papers in the literature (Chang et al., 2019; Fu et al., 2021; He et al., 2019; Zivin et al., 2020).

Second, satellite images can be used to approximate the level of air pollution at ground level. Satellites produce images that can be used to create gridded data sets of aerosol optical depth (AOD) as well as individual aerosols that are comparable across the world and can be used to estimate levels of particulate matter. However, satellite data does not directly measure particulate matter at ground level and provides data at a lower temporal and spatial frequency. Third, low-cost monitors that citizens can use to track their own air pollution exposure can be deployed by governments to create a network of air pollution monitors at low cost. However, low-cost monitors do not record pollution measurements at the same level of accuracy as regulatory-grade monitors. Because each of these data sources has advantages and disadvantages, evaluating their performance in measuring air pollution levels and in estimating the effects of policies to reduce air pollution is critical for researchers and policy makers. This is especially important for LMIC settings that can face additional challenges implementing these data sources, where there have been few studies comparing their performance, and where effective environmental policies are most needed.

Dakar, Senegal provides an ideal context for this study for three reasons. First, Dakar is a large and growing city in a lower middle-income country with large industrial plants and recurrent construction works.
This implies that our results are informative for other fast-growing Sub-Saharan African cities that are similarly experiencing high levels of congestion and pollution. Second, in addition to the air pollution generated by congestion and economic activity, Dakar experiences Saharan desert storms, especially during the first few months of the year. These environmental conditions differ from those typical of the high-income contexts where air pollution data is usually calibrated and analyzed. Third, Dakar currently has seven regulatory-grade monitors, which allows for comparison and calibration. But in contrast to high-income settings, the density of the monitoring network is low, and the data available is sparse and of uncertain accuracy due to challenges in maintaining the monitors. While this amplifies the challenges in comparing data sources and calibrating the data from low-cost monitors, this reflects the realities of air quality monitoring in lower-income settings.2 The dearth of high-quality data at fine intervals of time and space combined with high levels of air pollution highlight the importance of studying ways to utilize the existing data and increase the data available for rigorous policy analysis, which could then be applied to other settings in the region. To collect the data for this study, we set up a network of low-cost air pollution monitors in 28 locations across Dakar, and we worked closely with the local government agency in charge of air quality data to access the data available from their regulatory-grade monitors. We test several different sources of satellite air pollution measures to identify the one that generates the values most aligned with the ground-based measures. Armed with these three sets of air pollution data, we compare the level of PM2.5 at various temporal frequencies as measured by each data source. 
The satellite data is closely aligned with the data from the regulatory-grade monitors at the daily, weekly, and monthly levels but less so at the hourly level. There are substantial differences between the PM2.5 estimates from the regulatory-grade and the low-cost monitors in terms of level.

2 We use calibration to refer to the calibration of the data collected by low-cost monitors, not the calibration of the instrument (Jaffe et al., 2023). In engineering or science literature, this is often called "data correction".

Similarly, the low-cost monitors substantially underestimate PM2.5 levels relative to the satellite measures. This relative underestimation is largely due to the low-cost monitors’ under-measurement of dust. Adjusting the data from the low-cost monitors, especially using a context-specific formula, improves the alignment of the data with both the data from the regulatory-grade monitors and the estimates from the satellite. Next, we use the three data sets to evaluate pollution externalities of the set of mobility-reducing policies enacted during the COVID-19 pandemic. We use a difference-in-differences identification strategy to analyze the impact of the policies on the level of air pollution. Overall, we find the policies lead to a positive externality in the form of a reduction in air pollution, similar to research conducted in high-income settings (Brodeur et al., 2021). Importantly, we find that all three data sources display similar patterns of PM2.5 reductions due to the policy. Using PM2.5 data at the hourly level, the impact of the policy in levels is smaller when using the data from the low-cost monitors than the other data sources because the low-cost monitors tend to underestimate PM2.5 levels. However, the impact of the policy in percent is similar across all data sources. Finally, we demonstrate the value of co-locating at least one regulatory-grade monitor in a network of low-cost monitors.
We demonstrate that for a given day, relying on only a regulatory-grade monitor or only the satellite data will provide no spatial variation. In contrast, relying on only the network of low-cost monitors will provide spatial variation but severely underestimate the level of PM2.5. Using a co-located regulatory-grade monitor to develop a context-specific calibration formula to be applied to the data from the network of low-cost monitors improves the alignment in the level of PM2.5 while preserving spatial variation. Given the tight budget constraints that policy makers in LMICs face, investing in one well-maintained regulatory-grade monitor complemented with a dense network of low-cost monitors may strike the right balance. Barring any regulatory-grade monitors, calibration of PM2.5 measurements from low-cost monitors using satellite-based estimates of PM2.5 can serve as an alternative. Given the better performance of satellite data at a lower temporal frequency, we also show that this alternative calibration should be applied at a daily level (rather than hourly) to achieve results that are better aligned with what would be expected from regulatory measurements. We make three contributions to the literature. First, we contribute unique evidence comparing the three principal sources of air pollution data in an LMIC context. Fowlie et al. (2019) contrast air pollution levels from satellite data with those derived from regulatory-grade monitors in the U.S. and conclude that differences imply potential misclassifications of E.P.A. attainment status. In Greece, Stavroulas et al. (2020) find a moderate to strong correlation between PM2.5 levels measured with low-cost monitors and with regulatory-grade monitors. However, these studies do not compare all three data sources and the results of these studies may not translate directly to contexts in LMICs because the accuracy and precision of the air pollution data can vary across low- and high-income contexts.
The precision and accuracy of air pollution data from low-cost monitors are affected by environmental conditions (e.g., relative humidity) and by the type of particulate matter, for example during dust events (Jaffe et al., 2023). The availability, accuracy, and precision of air pollution data from regulatory-grade monitors depend on a dense network of monitors and regular investments in maintaining and replacing monitors.

Second, we extend this literature beyond a comparison of the air pollution levels captured by the three principal data sources to provide a novel comparison of the data sources in terms of policy evaluation. In addition to accurate information on air pollution levels, which documents the extent of the problem, policy makers need accurate estimates of the impacts of policies on air pollution to design effective solutions. We evaluate the performance of these different data sources in a rigorous analysis of a set of policies that have an important environmental externality.

Third, we contribute policy recommendations for designing air quality monitoring systems in LMICs. We illustrate the value of operating at least one regulatory-grade monitor in a network of low-cost monitors. Co-locating a regulatory-grade monitor with a low-cost monitor allows for context-specific calibration of the data from the low-cost monitor network that improves the accuracy of the data relative to using an off-the-shelf calibration formula. We also show that, in the absence of a regulatory-grade monitor, satellite PM2.5 estimates can be used to calibrate low-cost monitors to the local context. For a given budget, a combination of data sources including low-cost monitors can capture the spatial variation of pollution across an urban area and achieve better accuracy than a single data source alone.

The next section describes the context, and section 3 describes the three data sources in detail. Section 4 lays out the methods, section 5 provides the results, and section 6 concludes.
2 Context

Dakar is the capital and largest city of Senegal, which is located on the coast of West Africa. Similar to many other cities in LMICs, Dakar has experienced rapid population growth and urbanization. The Dakar metro area has a population of 3.9 million and a population density of 14,000 per km2, which is in line with cities such as New York City (Agence Nationale de la Statistique et de la Démographie, 2023). Dakar currently has 7 regulatory-grade air quality monitors or approximately one monitor per 580,000 residents, which is on the higher side of the typical range (one monitor per 100,000 to 600,000 residents) for urban areas of Europe and North America (Pinder et al., 2019).3 However, for long stretches of time only one or two regulatory-grade monitors, and for a period of approximately a year no regulatory-grade monitors, were reporting PM2.5 data (see Figure 2 for a visualization of all available data from the regulatory monitors from 2012 to 2024). In practice, this puts Dakar on par with other African urban areas, which typically have one monitor per 4.5 million residents (Pinder et al., 2019).

Dakar experiences high levels of ambient air pollution, typically exceeding World Health Organization standards. The main sources of particulate matter in Dakar are traffic (49%), mineral dust (16%–25%), sea salts (15%–20%), and industries (10%–11%) (Dou). While a large share of particulate matter is generated by anthropogenic sources such as traffic (vehicular, two-wheel two-strokes and resuspended road particles), biomass (wood, charcoal and fuel) combustion, waste burning, and industries, Saharan dust storms are another major source of particulate matter (Dou). Particularly during the first few months of the year, Saharan dust storms can lead to extremely high levels of particulate matter (Dou). Relative to other African cities, the air pollution in Senegal is reported to be moderate-to-severe.
Of 24 African countries where air quality data was available, Senegal has the 9th highest levels of air pollution (IQAir, 2024).4

3 The 2023 population of the Dakar region was reported at 3,896,563 according to the Agence Nationale de la Statistique et de la Démographie (2023).

4 Due to limited data, 30 African countries are not included in the IQAir report.

3 Data

We compare three sources of air pollution data in Dakar: regulatory-grade monitors, low-cost monitors, and satellites. We focus on particulate matter with a diameter of 2.5 micrometers or less (PM2.5). Particulate matter is the mixture of solid particles and liquid droplets present in the air, which originate from a range of both natural sources, such as dust and wildfires, and human activities like vehicle emissions and industrial processes.

Regulatory-Grade Monitors

Regulatory-grade monitors are one of the primary instruments used by governments to measure air pollution. Regulatory-grade monitors are designed to provide highly accurate measurements of air quality that are required for legislation (Castell et al., 2017). Regulatory-grade monitors typically cost approximately $100,000 per year to install and maintain and have an operational life expectancy of approximately 7 years, although this can be extended with the replacement of monitor components (Pinder et al., 2019; EPA, 2017).

Figure 1: Location of regulatory and low-cost monitors in Dakar, Senegal

In Dakar, the Center for Air Quality Management (CGQA) has been using regulatory-grade monitors to collect data for more than 10 years in 7 locations across the city (Figure 1).5 Five of the monitors were installed in 2010 and the other two monitors were installed in 2018 and 2022. PM2.5 data is collected by two types of monitors, the Met One BAM-1020 and the Envea MP101M, which both use a beta ray attenuation measurement technique. We are able to access data at the hourly or daily level (depending on the station) from 2012 to 2024.

5 CGQA is the authority in charge of air quality monitoring, information and cooperation within the Ministry of Environment, Sustainable Development and Ecological Transition in Senegal.

The PM2.5 data availability is sparse (Figure 2), both because some air quality stations do not collect PM2.5 data and because of challenges maintaining the regulatory-grade monitors. Beginning in 2012, there is relatively consistent coverage with at least one monitor active, except for 2021 and 2022, which have almost no data. This is important because the data from a regulatory-grade monitor can be used to calibrate other data sources. Additionally, when more than one monitor is reporting data concurrently, we can use readings from multiple monitors to validate the data. At the times when monitors overlap, the values are closely aligned. Importantly, on days when the values seem to be improbably high, the different active monitors all report very high values, which provides reassurance that these represent days of extremely high pollution rather than an issue with the measurement instruments.

The data was pre-processed by the CGQA for extreme values. In addition, we removed PM2.5 values greater than 500 µg/m3.6 To analyze the data, we take the values from whichever monitor is active, and in cases where there are multiple monitors, we use data from all active monitors.

Low-Cost Monitors

In the last decade there has been significant advancement in the technical specifications of low-cost monitors and substantial growth in the number of low-cost monitors on the market, leading to an explosion in their use (Kumar et al., 2015; Demanega et al., 2021).
Beyond their use by individuals and businesses, the recognition that air pollution in urban areas is hyperlocal has led to initiatives in high-income settings to complement networks of regulatory-grade monitors with low-cost monitors to provide air pollution data at a fine spatial scale. For instance, London set up a dense network of low-cost monitors across the city and several cities in California mapped pollution street by street using sensors attached to Google Street View cars (Gil-Alana et al., 2020; Mittal, 2020). The ability to capture variation in pollution at a granular spatial level is crucial for low-income settings where the prevalence of pollution is high and where the technology for capturing the origin of the pollution is limited (Hajat et al., 2015; Kamigauti et al., 2024).

Therefore, the team set up a network of low-cost monitors in Dakar to measure pollution at a higher spatial granularity. We chose the Purple Air PA-II-SD, which, at the time the study began in 2019, had one of the highest correlations with regulatory monitors (R-squared of 0.93 to 0.97) (South Coast Air Quality Management District, 2024).7 Additionally, its accuracy has been tested under atmospheric conditions when exposed to a variety of pollutants and PM2.5 concentrations and, unlike monitors from other providers with a similar technology, PM2.5 measures captured by PA-II monitors are not found to be affected by changes in temperature and relative humidity, making them more reliable (Ardon-Dryer et al., 2020).

From February 2020 until 2023, the team progressively deployed PA-II monitors across 28 locations in the Dakar region (Figure 1). The team also used data from an additional PA-II monitor set up by the Ecole Supérieure Polytechnique (ESP) University, with data available starting in January 2020.

6 Only 0.17% of PM2.5 values were above 500.

7 At the time when the decision was made, in 2019, only PA-II classic monitors were available on the market.

Figure 2: Daily PM2.5 from regulatory-grade monitors in Dakar, 2012-24
Note: Data is aggregated daily. Values greater than 500 are removed.

Figure 3: Daily monitor metrics, 2020-23. (A) Daily average PM2.5 across all monitors. (B) The number of monitors active each day.
Note: An active monitor is defined as one that provides PM2.5 values for a given day. The daily average PM2.5 values are derived through the aggregation of cleaned hourly data.

PA-II data is collected from two sensors every two minutes. Two values of PM2.5 are provided from each of the two sensors: one without applying any correction (CF=1) and one corresponding to adjusted mass concentrations derived after applying a proprietary algorithm developed by the company that produces the sensors (CF=ATM). We use the CF=1 value since we test several different adjustments applied to this unadjusted data.

We initially cleaned the raw PM2.5 data by removing outliers above 500.8 We further cleaned the data by removing low values below the 0.01 quantile. In addition, we identified and removed outlier values from individual sensors (values that exceed the mean difference between the two sensors plus two standard deviations). We average the clean values from the two sensors to provide a single estimate for each two-minute interval, and aggregate the data at the hourly level by averaging the two-minute values within the hour. We further aggregate the hourly data from the Purple Air monitors to the day level and focus on two daily measures, the daily average and the daily maximum of the hourly averages.

Satellite-based Measurement

The third source of air quality measurement data comes from satellites.
Instruments on satellites can measure atmospheric constituents such as gases and aerosols.9 These measurements can be used to generate estimates of pollution levels that can be compared to the measurements provided by ground-based instruments. The most commonly used satellite-based predictor of PM2.5 levels in the literature is aerosol optical depth (AOD). AOD is a measure of the quantity of aerosols in a vertical atmospheric column. Instruments on satellites, such as MODIS, MISR, and OMI, measure AOD by estimating the amount of sunlight that is blocked from reaching the Earth’s surface by particles in the atmosphere.

8 We also tested a threshold of 100 and determined it to be too low.

9 Aerosols are solid or liquid particles suspended in gas (Li et al., 2022).

While comparisons at the annual 10 km level show a relatively high correlation between ground-based PM2.5 and AOD with an R2 of 0.61 (Gendron-Carrier et al., 2022), others find a weak correlation between AOD and PM2.5 at the daily 1 km x 1 km level (Handschuh et al., 2023). In general, it is necessary to model the relationship between AOD and ground-based PM2.5 including other covariates to obtain good predictions.

In addition to AOD, other instruments on satellites can measure the levels of specific gases and aerosols. For example, the TROPOMI (TROPOspheric Monitoring Instrument) instrument on the Sentinel-5P satellite is capable of measuring levels of ozone, sulfur dioxide, nitrogen dioxide, carbon monoxide, formaldehyde and methane (Zeng et al., 2020). These raw data from different satellites and instruments have also been processed and integrated to generate assimilations, where data is integrated into atmospheric models.
Examples of assimilations include the Copernicus Atmosphere Monitoring Service (CAMS) and NASA’s MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, version 2), which provide usable atmospheric information in one system (Buchard et al., 2017; Flemming et al., 2017).10

In preliminary comparisons, AOD captured from MODIS does not correlate well with the data from regulatory-grade monitors in Dakar at the daily level, which aligns with findings in the literature for other contexts (Handschuh et al., 2023). This is likely due to several important limitations of the AOD measure. AOD has trouble distinguishing water vapor from other particles, while the ground-based monitors capture only dry particles. In a context like Dakar, which is surrounded by water, this can have important consequences (Gendron-Carrier et al., 2022). Additionally, particles that are present higher in the atmosphere, such as smoke or dust from long-range transport, can lead to no (or poor) correlation between satellite measures and PM from ground monitors, which measure particles only at the surface (Kumar et al., 2007).

Given this, we focus on an estimate of PM2.5 that we generate using data from the MERRA-2 assimilation. This assimilation provides processed data on individual aerosols at the surface rather than column level, helping to address the issue of water vapor and additional particles in the atmosphere. The MERRA-2 data is accessed via Google Earth Engine, and it is available hourly at the 55 km x 69 km spatial level (Gelaro et al., 2017). MERRA-2 provides assimilation data on individual aerosols (dust, organic carbon, black carbon, sulfate, and sea salt). To generate a measure of PM2.5, we combine these aerosols using a formula provided by NASA (2023):

$$\mathrm{PM}_{2.5} = \text{Dust} + \text{Organic Carbon} + \text{Black Carbon} + \text{Salt} + \left(\mathrm{SO}_4 \times \frac{132.14}{96.06}\right) \tag{1}$$

The MERRA-2 data has numerous benefits. It is freely available.
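As a worked illustration of Equation (1) above, the following minimal Python sketch combines the five aerosol components into a single PM2.5 value. The function name, argument names, and sample values are hypothetical and are not taken from the MERRA-2 product or from the authors' code. The sketch assumes all five inputs are surface mass concentrations in the same unit. The factor 132.14/96.06 is the ratio of the molar mass of ammonium sulfate to that of the sulfate ion.

```python
# Ratio of the molar mass of ammonium sulfate (132.14 g/mol) to that of the
# sulfate ion (96.06 g/mol), as used in Equation (1).
SULFATE_SCALING = 132.14 / 96.06


def merra2_pm25(dust, organic_carbon, black_carbon, sea_salt, so4):
    """Combine surface aerosol components into PM2.5 following Equation (1).

    Assumption: every argument is a mass concentration in the same unit
    (for example µg/m3); the result is returned in that unit.
    """
    return dust + organic_carbon + black_carbon + sea_salt + so4 * SULFATE_SCALING


# Hypothetical component values for a single hour, in µg/m3.
example = merra2_pm25(dust=10.0, organic_carbon=2.0, black_carbon=1.0,
                      sea_salt=3.0, so4=4.0)
print(round(example, 2))
```

Because the result is a simple weighted sum, the same function can be applied to each hourly record before aggregating to daily, weekly, or monthly averages.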
It provides a measure of pollution that can be compared across the world, and there is high temporal granularity, so we are able to produce PM2.5 estimates at the hourly level. Data are available from 1980 to the present day. However, an important limitation is that due to the low level of spatial granularity, the MERRA-2 data in effect provides only a single value of pollution for Dakar at any given time.

10 See Appendix Table A1 for a description of different satellite products.

4 Methods

4.1 Calibration

The raw Purple Air sensor data can have significant biases. Therefore, the data needs to be calibrated. Different calibration equations have been developed for this purpose, including one from the U.S. EPA (Barkjohn et al., 2023) and one from Barkjohn et al. (2021). More recently, Jaffe et al. (2023) find that while these calibrations perform well under certain conditions, such as urban aerosols and smoke aerosols, they largely underestimate PM under high dust pollution. They present an alternative calibration that involves identifying potential dust events and further adjusting the estimate by a factor of 5.6. We test both the Barkjohn et al. (2021) and Jaffe et al. (2023) calibrations. The Barkjohn et al. (2021) calibration applies the following equation:

$$\text{calibrated PM}_{2.5} = (\text{raw PM}_{2.5} \times 0.52) - (\text{Relative Humidity} \times 0.085) + 5.71 \tag{2}$$

The Jaffe et al. (2023) calibration uses a two-step method that applies either the Barkjohn et al. (2021) calibration or a variation of it depending on the value of the 0.3 µm/5 µm ratio. This is the ratio of the concentration of particles with a diameter of 0.3 micrometers to those with a diameter of 5 micrometers.

• If 0.3 µm/5 µm > 190, use the original Barkjohn et al. (2021) calibration
• If 0.3 µm/5 µm < 190, use the Barkjohn et al. (2021) calibration × 5.6

As one of our low-cost monitors lacks data on levels of particles of size 0.3 µm and 5 µm, a substitution is made.
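Equation (2) and the two-step rule above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, relative humidity is assumed to be expressed in percent, and the handling of two edge cases that the text does not specify (a ratio of exactly 190, and a zero count of 5 µm particles) is an assumption flagged in the comments.

```python
DUST_RATIO_CUTOFF = 190.0  # cutoff on the 0.3 µm / 5 µm particle-count ratio
DUST_MULTIPLIER = 5.6      # additional adjustment for suspected dust events


def barkjohn_2021(raw_pm25, relative_humidity):
    """Equation (2): correct an unadjusted (CF=1) PM2.5 reading.

    Assumption: relative_humidity is in percent (0 to 100).
    """
    return 0.52 * raw_pm25 - 0.085 * relative_humidity + 5.71


def jaffe_2023(raw_pm25, relative_humidity, count_03um, count_5um):
    """Two-step rule: scale the Barkjohn estimate by 5.6 during dust events."""
    base = barkjohn_2021(raw_pm25, relative_humidity)
    # Assumption: if no 5 µm particles were counted, the ratio is treated as
    # very large, so the reading is not flagged as a dust event.
    if count_5um <= 0:
        return base
    ratio = count_03um / count_5um
    # Assumption: a ratio of exactly 190 falls in the non-dust branch.
    if ratio < DUST_RATIO_CUTOFF:
        return base * DUST_MULTIPLIER
    return base


# Hypothetical hourly reading: raw PM2.5 of 50 µg/m3 at 60% relative humidity.
print(round(barkjohn_2021(50.0, 60.0), 2))              # non-dust estimate
print(round(jaffe_2023(50.0, 60.0, 1000.0, 10.0), 2))   # ratio 100, dust branch
```

Keeping the dust test separate from the base correction makes it straightforward to swap in a different dust criterion when the particle counts are unavailable.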
An alternative criterion for identifying dust events is when the PM1/10 ratio < 0.25 (Jaffe et al., 2023). Comparing the two Jaffe specifications, we find that approximately 86% of hourly values are the same. Since the monitor that lacks the 0.3 µm and 5 µm values is one of the only ones functioning in 2020, we use this alternative method for generating the Jaffe calibration in order to be able to conduct the analyses of the COVID-19 mobility policies implemented in 2020.

These calibrations both use data from the U.S., where conditions may differ substantially from Dakar, leading to a different relationship between the data from low-cost monitors and regulatory-grade monitors. For example, the lack of paved roads may affect the amount and type of dust particles in the air, along with different fuel standards, high levels of construction, and the proximity to the Sahara Desert. Therefore, we also produce a correction based on calibrating the Purple Air data using data from local regulatory-grade monitors. We co-located two of the PA monitors with two regulatory-grade monitors in the city. One was located in Bel Air and provided overlapping data from July 21, 2020, to April 17, 2021.[11] The second was located in Pikine and provided overlapping data from April 17, 2023, to December 15, 2023.[12]

[11] Some additional dates are missing within this period, most notably October 26-December 31, 2020.
[12] Again, there are some missing dates within this period, including all of August and October.

To generate the calibration formula, we use a lasso method with a linear model and five-fold cross-validation. We include the Purple Air PM2.5 value along with humidity and temperature averaged across Dakar as predictors.[13] The humidity and temperature data for Dakar are collected by Visual Crossing (2024). We run the model separately for each monitor as well as combining the data from both monitors.
We also run the model separately for hourly observations and for daily observations. We implement the calibrations at both the hourly and daily levels.

Table 1: Calibration Coefficients comparing Regulatory and nearby Purple Air monitors (Hourly)

                               Monitor at Bel-Air     Monitor at Pikine      Bel-Air & Pikine
                               (1)        (2)         (3)        (4)         (5)        (6)
                               Mass       Count       Mass       Count       Mass       Count
                               PM2.5      >2.5        PM2.5      >2.5        PM2.5      >2.5
                               (µg/m3)                (µg/m3)                (µg/m3)
Mass concentration PM2.5       2.378                  0.762                  1.588
Count concentration >2.5                  1.486                  1.780                  1.715
Humidity                       -0.606     -0.674      -1.766     -1.564      -1.178     -0.860
Temperature                    -1.072     -0.504      -2.423     -1.228      -1.583     0.819
Constant                       66.63      72.85       232.1      171.2       142.5      57.96
Out-of-Sample R2               0.615      0.799       0.365      0.515       0.449      0.565

Note: Lasso linear model with 5-fold cross-validation. Mass PM2.5 (µg/m3) refers to the mass concentration of fine particulates with a diameter of less than 2.5 microns. Count (particles/100 ml) refers to the count of all particles greater than or equal to 2.5 micrometers in diameter. Data is aggregated hourly before calibration.

To calibrate the low-cost monitor data to the local regulatory-grade monitors, we use the model that combines the data from the two locations and use the PA PM2.5 measure as the main predictor. Table 1 displays the coefficients and out-of-sample R2 for each calibration conducted at the hourly level.[14] We apply the following calibration equation, corresponding to the fifth column of Table 1, to generate a local calibration:

$$\text{Local Calibration PM}_{2.5} = (\text{Unadjusted PM}_{2.5} \times 1.59) - (\text{Humidity} \times 1.18) - (\text{Temperature} \times 1.58) + 142.2 \tag{3}$$

In the absence of data from regulatory-grade monitors, satellite data may provide an alternative for calibrating the data from low-cost monitors. Therefore, we also calibrate the Purple Air data with the satellite data using a similar calibration procedure. Using the MERRA-2 satellite data for the calibration, there is only a single value for Dakar.
Therefore, we use a Dakar-wide average of the Purple Air monitors to generate the satellite calibration model:

$$\text{Satellite Calibration PM}_{2.5} = (\text{Unadjusted PM}_{2.5} \times 1.66) - (\text{Humidity} \times 1.3) - (\text{Temperature} \times 4.84) + 249.6 \tag{4}$$

[13] Additionally, we test including the number of particles larger than 2.5 micrometers as a predictor instead of the PM2.5 value, because the number of larger particles has a very high correlation with the regulatory-grade monitor values.
[14] In a similar manner, we generate a daily local calibration using the same data aggregated daily (Table A2).

4.2 Comparison of Different Measures and Policy Analysis

We begin by comparing the three data sources visually and exploring the correlations between them. We then demonstrate how the data sources compare when used to measure the impact of a particular policy. The focus is on the set of policies that were implemented in 2020 as a response to the COVID-19 pandemic. As in many other countries, the Government of Senegal instituted a number of policies starting on March 14, 2020, with the goal of limiting mobility and thus curbing the spread of COVID-19.[15] Literature from other contexts has documented important externalities from these types of policies, including large decreases in pollution (Venter et al., 2020). Therefore, it is possible to use the timing of these policies in Senegal to study how the different sources of data measure this externality in Dakar. Additionally, this is a time for which we have data from all three data sources, which makes it possible to compare measured impact across all three.

To study the policy impact, a difference-in-differences style design is used. Air pollution data right before and right after the policy is compared to air pollution data during those same pre- and post-policy periods in a different year. This strategy makes it possible to control for seasonality, which plays an important role in PM2.5 levels.
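The intuition behind this design can be sketched as a simple 2×2 comparison of means. The snippet below is only an illustration with invented numbers and hypothetical function names: it omits the weather controls and fixed effects used in the regressions, and it reports the effect as a percentage of the pre-policy average in the policy year.

```python
from statistics import mean


def did_estimate(pre_policy_year, post_policy_year,
                 pre_comparison_year, post_comparison_year):
    """Change in the policy year minus the change in the comparison year."""
    change_policy_year = mean(post_policy_year) - mean(pre_policy_year)
    change_comparison_year = mean(post_comparison_year) - mean(pre_comparison_year)
    return change_policy_year - change_comparison_year


def percent_of_baseline(effect, pre_policy_year):
    """Effect expressed relative to the pre-policy mean of the policy year."""
    return 100.0 * effect / mean(pre_policy_year)


if __name__ == "__main__":
    # Invented PM2.5 values (ug/m3), purely for illustration.
    pre_2020, post_2020 = [100, 110, 90], [60, 70, 50]
    pre_2023, post_2023 = [80, 90, 70], [65, 75, 55]

    effect = did_estimate(pre_2020, post_2020, pre_2023, post_2023)
    print(effect)                                 # seasonal drop netted out
    print(percent_of_baseline(effect, pre_2020))  # effect as % of baseline
```

Without controls, this difference of differences equals the coefficient on the interaction of the year and post-period indicators in a regression that includes only those two indicators and their interaction.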
The specification estimated is:

$$PM_{yt} = \beta_0 + \beta_1 I^{2020}_{y} + \beta_2 Post_{t} + \gamma \, I^{2020}_{y} \times Post_{t} + X_{yt}'\delta + \epsilon_{yt} \tag{5}$$

where $PM_{yt}$ is PM2.5 at hour $t$ of year $y$. $I^{2020}_{y}$ is an indicator equal to one if the year is 2020 and zero otherwise. $Post_{t}$ is an indicator equal to one for all hours after the policies were put in place (March 14 to May 12) and zero before the COVID-19 policies (January 23 to March 13). $X_{yt}'$ is a vector of controls that includes weather variables (temperature and humidity) and indicator variables for hour of the day, day of the week and week of the year (i.e., an indicator of the week within a calendar year, starting from week 1), and $\epsilon_{yt}$ is the error term.[16] The coefficient of interest is $\gamma$, which measures the change in PM2.5 after March 14 in 2020 relative to the change over the same dates in the comparison year.

We conduct the analysis comparing 2020, when we expect reduced mobility and levels of PM2.5, to 2023, when we expect mobility to have normalized in the city.[17] There was consistent data (more than 80 days during the analysis period) available from two regulatory-grade monitors in 2020, Guediawaye and Republique. We include data from both monitors and control for monitor fixed effects. For 2023, there is only data from a third regulatory monitor that was recently installed in Pikine. We use the hourly MERRA-2 data for 2020 and 2023 for the city. There is only one Purple Air monitor for which data is available for several months before the policy in 2020 and also after the policy, but this monitor does not have data beyond 2020. Instead, for 2023 we include data from all monitors that have data on at least 80 different days within the 110-day analysis period. We control for monitor fixed effects. In addition to using the unadjusted Purple Air data, we also conduct the analysis using two off-the-shelf calibrations of the data from Barkjohn and Jaffe, as well as our local and satellite calibrations.

We conduct several robustness checks.
We conduct the analysis at the daily level, using both the daily average across hours as well as the maximum hourly value of the day as outcomes. We use data from 2020 and 2023, and we include controls for weather variables and indicator variables for day of the week and month of the year. We also study a longer time period, allowing us to better control for seasonality and overall high variance in the outcome variable. We use the satellite data and the data from regulatory-grade monitors for 2017-2020 and use the Purple Air data for 2020, 2022 and 2023.[18] We include results for these analyses using the longer time period at both the hourly and daily levels. For these robustness checks, we analyze both the unadjusted low-cost monitor data and the locally calibrated data, using the hourly calibration values for regressions at the hourly level and the daily calibration values for regressions at the daily level.

[15] These included banning public and private gatherings, a curfew from 20:00 to 6:00, and closing schools.
[16] Precipitation is not included as a weather control because the period of analysis includes only the dry period in Dakar and precipitation is consistently zero.
[17] There were still some COVID-19 policies in place in 2021 that could affect mobility, and therefore pollution, and we do not have data from any regulatory monitors in 2022 to allow for comparison of the policy results across data sources.

5 Results

5.1 Comparison of PM2.5 Levels

First, we compare the PM2.5 estimates generated using satellite data from MERRA-2 with the data from the regulatory-grade monitors. Figure 4 demonstrates that there is a strong alignment of values at the monthly level (Figure 4A), weekly level (Figure 4B), and daily level (Figures 4C and 4D), and that aggregating the data temporally improves alignment.
The satellite data is able to capture the peaks in pollution that occur in the first two months of each calendar year (January-February), but the satellite measures also record increases in pollution in May-June for 2013-2015 that are not reflected in the data from the regulatory-grade monitors (Figure 4A). Particularly in 2019-2020, the PM2.5 estimates from the satellite data at the monthly level are very closely aligned with the PM2.5 data from the regulatory-grade monitors (Figure 4A). This reflects the fact that in 2019-2020, the regulatory data comes primarily from a new monitor installed in 2018 in Guediawaye, which measures much higher levels of PM2.5 compared to the other regulatory monitors in the city, and thus aligns more closely with the higher estimates of pollution generated with the satellite data.[19] When aggregated to the weekly level, the two data sources still track each other closely for 2019-2020, but there is greater disparity in the levels than when aggregated to the monthly level. There are some discrepancies between the two measures at the daily level, with the satellite data recording greater levels of PM2.5 than the regulatory-grade data at some times and lower levels at other times (Figures 4C and 4D).

Next, we look at alignment between the two data sources at higher frequency. One of the advantages of the data from MERRA-2 over other sources of satellite data is that it can generate estimates of PM2.5 at the hourly level. The estimates from MERRA-2 and the data from the regulatory-grade monitor are not as closely aligned at this higher temporal frequency. Figure 5 uses data from March 2020 to illustrate that the measurements of PM2.5 at the hourly level from these two data sources frequently diverge by 50 µg/m3 or more and are often moving in the opposite direction (i.e.
one data series is not a level shift of the other). Together, Figures 4 and 5 suggest that, while the satellite-derived estimates are useful for understanding air pollution trends at lower temporal frequencies such as the weekly and especially the monthly level, they are not very useful for capturing temporal variation in pollution at high frequency such as hourly. Since air pollution has strong non-linear impacts (Hoffmann and Rud, 2024), policy makers may be particularly interested in capturing extreme values of air pollution that may be masked by lower-frequency averages.

[18] Because of the inconsistent availability of data from the regulatory-grade monitors across years, we combined data from different monitors, using data from Bel Air for 2017, data from Republique in 2018, from Guediawaye in 2019 and data for Republique and Guediawaye in 2020.
[19] The higher levels of pollution measured by the monitor at Guediawaye may reflect higher levels of pollution in this part of the city or may be due to the calibration of the instrument.

Figure 4: Regulatory and Satellite PM2.5 data comparison
Panels: (A) 2012-2020 data aggregated monthly; (B) 2019-2020 data aggregated weekly; (C) 2019 data aggregated daily; (D) 2020 data aggregated daily.
Note: The Regulatory data is from the CGQA monitors at Republique, Bel-air, Guediawaye, and Pikine, depending on data availability across time. Panel A is an average of all available regulatory monitors. In 2018 a new monitor was installed in Guediawaye. For Panels B, C, and D, 2019 data comes from Guediawaye, and 2020 data comes from an average of Republique and Guediawaye. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station.

Second, we compare the PM2.5 measurements from the low-cost monitors to those from the regulatory-grade monitors.
Figure 6 shows the values of PM2.5 over time for a regulatory-grade and a low-cost monitor co-located in Pikine, Dakar. There is a large difference between the level of PM2.5 from the regulatory-grade monitor and the unadjusted level of PM2.5 from the low-cost monitor, with the low-cost monitor underestimating PM2.5 relative to the regulatory-grade monitor. This pattern aligns with related findings that low-cost monitors underestimate PM2.5 during dust events (Barkjohn et al., 2021). The underestimation is due to design limitations in the Purple Air monitors, such as low-flow fans, making them inefficient at capturing and detecting dust particles (Barkjohn et al., 2021).

Figure 5: PM2.5 estimates for March 2020 at the hourly level
Panels: (A) Comparison between satellite and regulatory; (B) Comparison between low-cost monitors and regulatory.
Note: For the low-cost monitor PM2.5 values, the Local Calibration is used, which is generated following Equation (3). The Regulatory data comes from the CGQA monitors at Republique and Guediawaye. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station.

To address the Purple Air monitors’ underestimation of pollution from dust, we apply the existing two-step method proposed by Jaffe et al. (2023) that specifically aims to address the limited ability of PA monitors to capture dust. Separately, we also adjust for this underestimation using the local calibration that we generate following Equation (3). The local calibration shows a much closer alignment to the regulatory-grade monitor values, compared to the unadjusted values from the low-cost monitors as well as compared to the Jaffe calibration (Figure 6).
While the Jaffe calibration improves the correlation between the low-cost values and the regulatory monitor (Table 2 Panel A), the figure shows that there remains a large underestimation when using the low-cost monitor data adjusted using the Jaffe calibration, which is confirmed by a larger RMSE when comparing the regulatory and Jaffe-adjusted data (Table 2 Panel B).

We find that aggregating the data temporally improves alignment, as it did with the satellite and regulatory data comparison. This can be seen both visually in Figure 6 Panel D, where the data is aggregated weekly, and also in the higher correlation of 0.89 at the weekly level (Table 2 Panel A). Comparing at a higher temporal frequency at the hourly level leads to lower alignment (Figure 5). Nevertheless, the locally calibrated data performs much better at the hourly level (correlation of 0.74) as compared to the satellite data (correlation of 0.54) (Table 2 Panel A).

Third, we compare the PM2.5 values from the low-cost monitors to the PM2.5 estimates produced using the satellite MERRA-2 data. In a similar result to the comparison between the regulatory-grade and low-cost monitors, we find that the low-cost monitors record much lower levels of pollution compared to the satellite data (Figure 7). This underestimation was particularly severe in early 2020 and early 2021, when dust storms were frequent and the unadjusted Purple Air PM2.5 values did not increase substantially despite peaks in PM2.5 in the satellite data (Figure 7A).

To confirm that the substantial divergence between the PM2.5 measurements from the two data sources is due to an underestimation of dust by the Purple Air monitors in our context, we generate a modified PM2.5 estimate using all of the MERRA-2 components except dust.[20] Comparing the MERRA-2 estimates of PM2.5 with and without dust suggests that much of the PM2.5 pollution in Dakar is due to dust (Figure A2).
The PM2.5 estimate excluding dust from MERRA-2 and the unadjusted PM2.5 values from the low-cost monitors are more closely aligned (Figure 7B). However, the Purple Air values tend to be higher, especially during peaks in PM2.5, than the satellite-derived estimates that exclude dust, since some dust pollution is captured by the Purple Air monitors. Furthermore, while the closer alignment between these two pollution series provides evidence that the low-cost monitors underestimate PM2.5 in Dakar due to incomplete measurement of dust pollution, the closer alignment does not reflect a better measurement of PM2.5 in Dakar.

As with the comparison between the data from the regulatory-grade monitors and the low-cost monitors, we compare the satellite estimates of PM2.5 to two adjustments of the PM2.5 data from the low-cost monitors. Again, the adjustment using the Jaffe calibration does not correct the underestimation of values (Figure 7C), but there is an increase in the correlation between the adjusted and satellite data (Table 2). The local calibration that we develop improves the alignment of the data from the low-cost monitor with the satellite estimates, both in terms of levels and the correlation (Figure 7D and Table 2).

Considering alignment with both the PM2.5 measurements from the regulatory-grade monitors and the satellite estimates, calibrating the data from the low-cost monitor appears to partially compensate for its underestimation of PM2.5 due to dust. Furthermore, the results suggest that local calibration improves alignment relative to off-the-shelf adjustment of the data from the low-cost monitors.

[20] To generate the MERRA-2 PM2.5 estimate without dust, we use the same equation to combine the different components except dust is not included.
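The construction of the satellite-derived estimate with and without dust can be sketched as follows. This is a minimal illustration of Equation (1), not the code used for the analysis: the function and argument names are hypothetical, the inputs are assumed to be surface mass concentrations already expressed in the same units (for example µg/m3), and the example values are invented.

```python
# Ratio applied to sulfate in Equation (1).
SULFATE_FACTOR = 132.14 / 96.06


def merra2_pm25(dust: float, organic_carbon: float, black_carbon: float,
                sea_salt: float, so4: float, include_dust: bool = True) -> float:
    """Combine MERRA-2 aerosol components into a PM2.5 estimate (Equation 1).

    With include_dust=False the dust term is dropped and all other terms
    are combined exactly as before.
    """
    total = organic_carbon + black_carbon + sea_salt + so4 * SULFATE_FACTOR
    if include_dust:
        total += dust
    return total


if __name__ == "__main__":
    # Invented component values for a single hour, for illustration only.
    components = dict(dust=80.0, organic_carbon=6.0, black_carbon=1.5,
                      sea_salt=4.0, so4=3.0)
    with_dust = merra2_pm25(**components)
    without_dust = merra2_pm25(**components, include_dust=False)
    print(with_dust, without_dust, with_dust - without_dust)
```

By construction, the gap between the two series is exactly the dust component, which is why comparing them isolates the contribution of dust to the estimate.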
Figure 6: Low-cost and Regulatory PM2.5 comparison at Pikine station, May-December 2023
Panels: (A) Low-cost (Unadjusted) and Regulatory, Daily; (B) Low-cost (Jaffe Calibration) and Regulatory, Daily; (C) Low-cost (Local Calibration) and Regulatory, Daily; (D) Low-cost (Local Calibration) and Regulatory, Weekly.
Note: For Panels A, B, and C, data is aggregated daily. For Panel D, data is aggregated weekly. The low-cost estimate uses Purple Air data from a monitor co-located with a regulatory monitor in Pikine. The Jaffe Calibration comes from Jaffe et al. (2023) and uses the PM1/10 ratio to identify dust events. The Local Calibration is generated following Equation (3) and is based on calibrating with two regulatory-grade monitors. The Regulatory data comes from the CGQA monitor at Pikine.

Overall, the correlations and root mean square errors between the PM2.5 measurements from the low-cost monitors and the regulatory-grade monitors or satellite support the two main findings above (Table 2). First, the correlations increase as the data is aggregated temporally. At the weekly level, all of the Purple Air PM2.5 values (unadjusted, Jaffe calibration, local calibration, and satellite calibration) are well-aligned with the regulatory and satellite data. Second, correcting or calibrating the PM2.5 measurements from the low-cost monitors leads to a higher correlation and lower RMSE. The one exception is that, across all temporal aggregations, applying the Jaffe calibration increases the RMSE, even as the correlation improves. Furthermore, the co-located calibrations (the Local and Satellite Calibrations) generally represent an improvement over the Jaffe calibration. While the Jaffe calibration leads to higher correlations than the co-located calibrations for some temporal aggregations, all correlations are high and the differences are typically small.
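To see why the correlation and the RMSE can point in different directions, consider the following sketch. It uses invented numbers and hypothetical helper names: a series that tracks the reference perfectly but is biased low has a correlation of one and a large RMSE, while a series with small unsystematic errors has a lower correlation and a much smaller RMSE.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rmse(reference, estimate):
    """Root mean square error of an estimate against a reference series."""
    n = len(reference)
    return sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)


if __name__ == "__main__":
    # Invented daily PM2.5 values (ug/m3) from a reference monitor.
    reference = [20.0, 40.0, 80.0, 160.0]
    # Tracks the reference perfectly but captures only a quarter of the level.
    biased_low = [0.25 * r for r in reference]
    # Right level on average, with small errors of +/- 10.
    noisy = [30.0, 30.0, 90.0, 150.0]

    for name, series in [("biased low", biased_low), ("noisy", noisy)]:
        print(name, round(pearson(reference, series), 3),
              round(rmse(reference, series), 1))
```

This is why both statistics are reported: the correlation summarizes how well a source tracks movements in pollution, while the RMSE also penalizes errors in the level.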
At all temporal frequencies, the RMSEs using co-located calibrations are lower than those for the Jaffe calibration. At the hourly level, the co-located calibrations stand out as having the highest correlation with the regulatory and satellite measures and substantially lower RMSE. Comparing the satellite data to the unadjusted low-cost monitor data, the satellite data performs much better across all temporal aggregations when it comes to the RMSE, but especially so for the daily and weekly aggregations. Nevertheless, the correlation is better between the low-cost and regulatory data at the hourly and daily levels.

Figure 7: Low-cost monitor and Satellite PM2.5 values, 2020-21
Panels: (A) Low-cost (Unadjusted) and Satellite; (B) Low-cost (Unadjusted) and Satellite (excluding dust); (C) Low-cost (Jaffe Calibration) and Satellite; (D) Low-cost (Local Calibration) and Satellite.
Note: Data is aggregated daily. The Low-cost uses unadjusted Purple Air data aggregated across monitors for panels A and B. The Jaffe Calibration comes from Jaffe et al. (2023) and uses the PM1/10 ratio to identify dust events. The Local Calibration is generated following Equation (3) and is based on calibrating with two local regulatory-grade monitors. The Satellite estimate uses the aerosol values from the MERRA-2 estimates closest to the Bel-air regulatory-grade monitor station. To produce the Satellite value without dust, the dust aerosol is excluded from the construction of the PM2.5 value.

The goal of comparing the different data sources is to understand the data generated by each source and how the data sources can be combined to generate better air quality monitoring. To inform the design of air quality monitoring systems, we compare the spatial PM2.5 data from each data source and from multiple sources combined on a specific day (Figure 8).
Table 2: Comparison of PM2.5 Measures

Panel A: Correlations
                           Regulatory Monitors            Satellite Measures
                           Hourly    Daily     Weekly     Hourly    Daily     Weekly
                           (1)       (2)       (3)        (4)       (5)       (6)
Satellite Measures         0.54      0.75      0.87       .         .         .
Low-cost (unadjusted)      0.60      0.81      0.83       0.53      0.72      0.84
Jaffe Calibration          0.62      0.90      0.96       0.54      0.76      0.89
Local Calibration          0.74      0.87      0.89       0.63      0.80      0.91
Satellite Calibration      0.68      0.87      0.90       0.64      0.82      0.92

Panel B: Root Mean Square Error
                           Regulatory Monitors            Satellite Measures
                           Hourly    Daily     Weekly     Hourly    Daily     Weekly
                           (1)       (2)       (3)        (4)       (5)       (6)
Satellite Measures         55.4      34.8      20.6       .         .         .
Low-cost (unadjusted)      59.9      52.1      46.9       60.2      53.5      46.6
Jaffe Calibration          61.8      60        56.4       66.4      61.8      56.6
Local Calibration          39.5      26.9      21.3       42.7      29.8      18.8
Satellite Calibration      44.9      32.1      26.5       42.2      28.5      18

Note: The date range used is for the full year 2023. The regulatory monitor values come from the average of CGQA regulatory-grade monitors. Satellite measures are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station. The low-cost unadjusted and calibrated values use averages across all low-cost monitors active for a given time interval. Jaffe calibration uses PM1/10 ratio to identify dust events. The local calibration is the unadjusted Purple Air values calibrated to the local context using data from two local regulatory monitors. The satellite calibration uses the MERRA-2 measure to calibrate the unadjusted low-cost monitors.

The challenges of maintaining expensive regulatory monitors in a low-resource setting like Senegal lead to limited data, in most cases with only one regulatory-grade monitor functioning at any given time. Therefore, this data implies a uniform level of pollution across the city (Figure 8 Panel A).
The PM2.5 level from the regulatory-grade monitor aligns closely with that from the satellite data (Figure 8 Panel B), but both sources miss variation across space. A network of low-cost monitors captures significant spatial variation in pollution across the city, but the levels are substantially lower than those from the regulatory or satellite data due to the limitations in capturing dust (Figure 8 Panel C). The presence of at least one regulatory-grade monitor makes it possible to conduct a local calibration, which can then be used to adjust the PM2.5 data from the low-cost monitors to better align with the levels from the regulatory monitor, while preserving spatial variation across the city (Figure 8 Panel D). We can also calibrate the low-cost data using the satellite data, since there may not be high-quality regulatory data available for calibration in LMIC contexts. Calibrating the low-cost monitors using the satellite data potentially overestimates air pollution across the city, with the average value for the city higher than the regulatory one (Figure 8 Panel E).

For the two calibrations, we use the values generated based on the hourly calibration and average the calibrated hours across the day. Alternatively, we can average the hours across the day and apply the daily calibration. The daily calibration, both with the regulatory and the satellite data, has a much higher out-of-sample R2, which may suggest that it could lead to better estimates (Table A2). Applying the daily calibration for the same day, the daily local calibration looks quite similar to the one using the hourly calibration, but importantly the satellite calibration is now much more closely aligned with the local calibration and closer to the average regulatory value. These results suggest that investing in a combination of regulatory and low-cost monitors allows for a much richer dataset that can capture both magnitudes and variation in space.
In the case where regulatory data is not available, though, it may be possible to calibrate low-cost monitors using satellite data at a lower temporal frequency (daily or lower).

Figure 8: One day example of PM2.5 across Dakar measured using different data sources
Panels: (A) Regulatory Monitor; (B) Satellite; (C) Low-cost monitors (unadjusted data); (D) Low-cost monitors (local calibration applied); (E) Low-cost monitors (satellite calibration applied).
Note: Maps are created using different data sources on April 7, 2023. The data in panel (A) is from the regulatory-grade monitor at one location, Pikine in Dakar; the data in panel (B) is from the MERRA-2 satellite estimate for Dakar; the map in panel (C) is interpolated based on unadjusted data from Purple Air monitors across 19 different locations; the map in panel (D) is interpolated based on locally calibrated Purple Air data from 19 locations using data from two regulatory-grade monitors for the calibration; the map in panel (E) is interpolated based on calibrated Purple Air data from 19 locations using satellite data from MERRA-2 for the calibration.

5.2 Policy Analysis

A primary use of air quality data is to measure how different policies, projects and interventions affect air quality levels. It is therefore crucial to understand how well each of the data sources is able to capture the treatment effects of policies on air pollution levels. To this end, we examine the impact of the implementation of COVID-19 mobility reduction policies on air quality using the measurements of PM2.5 from the different data sources.

Figure 9 plots the PM2.5 data series from the three data sources at the daily level in a window around the implementation of the mobility restriction policies in 2020. All three data sources display a large drop in the pollution level after the policies were implemented on March 14, 2020.
Importantly, comparing across data sources, the proportional drop in average daily pollution is very similar across the different sources. As discussed in the earlier comparison section, the magnitudes are quite different between the Purple Air data and the satellite data, but the overall patterns are nevertheless very similar.[21]

[21] Note the different y-axes used to plot the different data sources in order to demonstrate the alignment in pattern, even as the magnitudes are quite different.

Figure 9: PM2.5 Measures before and after Covid mobility restrictions in 2020
Panels: (A) Comparison between low-cost monitors and satellite; (B) Comparison between regulatory monitors and satellite.
Note: Data is aggregated daily. Low-cost data uses the Local Calibration generated by adjusting the Purple Air data using a calibration based on two local regulatory-grade monitors. The Regulatory data comes from the CGQA monitors at Republique and Guediawaye. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station. The left y-axis in panel A corresponds to the low-cost data and the left y-axis in panel B corresponds to the regulatory data. The right y-axis in both panels corresponds to the satellite data.

It is important to control for the seasonality of pollution by comparing the drop in pollution in 2020 with changes in pollution over this time period in other years. This is especially important considering the timing of the implementation of the policy. In Dakar, pollution levels are higher in the first quarter of the year (January-March) than in the rest of the year, and the time period in which this seasonal drop in pollution occurs coincides with the implementation of the policies to limit mobility. For each of the data sources, we therefore compare the pollution values in the window around March 14 in 2020 and in 2023.
Visually, the data from the low-cost monitors demonstrates a very pronounced drop in air pollution after the policy in 2020 and no similar drop in 2023 (Figure 10 shows average weekly values in 2020 and 2023).22 There is also a large drop in air pollution in the regulatory data in 2020, which is not seen in 2023. The satellite estimates also show a drop in pollution in 2020, though they seem to show higher values before March 14 in 2020 compared to 2023, and then pollution drops after March 14 in 2020 but not in 2023. There are two reasons that the impact may be more pronounced using the data from the Purple Air monitors. First, for 2023, we are able to use data from 13 different monitors that had observations on at least 80 of the 110 days in the analysis, which helps to smooth the otherwise high variation in pollution when measured from only a single monitor. Second, the Purple Air monitors only partially capture dust pollution, which makes up a large portion of the pollution measured by the satellite and regulatory-grade monitors. Because dust is driven by environmental factors rather than human activities, the Purple Air data may focus on the elements of PM2.5 that are likely to be influenced by mobility policies.

22 Plots using daily aggregated data can be seen in Figure A4.

Figure 10: Comparison of air pollution in 2020 and 2023, Weekly
Panels: (A) Low-cost Monitors; (B) Satellite; (C) Regulatory monitors
Note: Black dotted line denotes when mobility-limiting policies were implemented in 2020. Data is aggregated weekly. Low-cost data uses the Local Calibration, which adjusts the Purple Air data using a calibration based on two local regulatory-grade monitors. The Regulatory data comes from the CGQA monitors at Republique, Guediawayé, and Pikine, depending on data availability. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station.

Next, we formally analyze the impact of the policy on air pollution levels using the three data sources in a difference-in-differences regression framework. We begin by analyzing data at the hourly level following regression equation 5. Controls include weather variables (temperature and humidity) and hour-of-day, day-of-week, and week-of-the-year fixed effects. Standard errors are heteroscedasticity robust.

Table 3: Impact of Covid Mobility Restrictions on Air Pollution Measures (Hourly Averages, 2020 & 2023)

| | (1) Regulatory | (2) Satellite | (3) Low-cost | (4) PA Local Calibration |
|---|---|---|---|---|
| Policy Period (Mar 14-May 12) | -37.58∗∗∗ | -29.41∗∗∗ | -4.18∗∗∗ | -6.32∗∗∗ |
| | (2.91) | (7.92) | (0.66) | (1.04) |
| 2020 | 20.54∗∗∗ | 31.90∗∗∗ | 2.61∗∗∗ | 4.21∗∗∗ |
| | (1.46) | (2.95) | (0.62) | (0.98) |
| Policy Period (March 14-May 12) × 2020 | -24.98∗∗∗ | -30.51∗∗∗ | -8.50∗∗∗ | -13.61∗∗∗ |
| | (1.71) | (3.67) | (0.50) | (0.78) |
| Average | 100.07 | 129.38 | 23.08 | 58.72 |
| % of Average | -24.97 | -23.58 | -36.85 | -23.18 |

Note: Controlling for temperature and humidity, with hourly, weekly, monthly, day-of-week and monitor fixed effects. Contains data from 22nd January to 12th May for 2020 and 2023. The 'Average' is based on the pre-Covid average in 2020. Regulatory data comes from monitors operated by the CGQA. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station. The Low-cost measurements in Column 3 are the unadjusted values from all Purple Air monitors active in 2020 or 2023. The Local calibration is based on calibrating the unadjusted Purple Air data using co-located regulatory data from two local monitors. Outliers are winsorized at the 99th percentile. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001.
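Regression equation 5 is defined earlier in the paper and is not reproduced in this section. As a reading aid only, a specification consistent with the three coefficient rows and the controls listed in the note to Table 3 might be written as below. The symbols are assumptions introduced here for illustration, and the paper's own equation may differ in notation or detail.

```latex
\begin{equation*}
PM_{it} = \alpha
  + \beta_1\,\mathrm{Policy}_t
  + \beta_2\,\mathrm{Y2020}_t
  + \beta_3\,\bigl(\mathrm{Policy}_t \times \mathrm{Y2020}_t\bigr)
  + X_t'\gamma + \mu_i + \tau_t + \varepsilon_{it}
\end{equation*}
```

Here $PM_{it}$ is the PM2.5 value for monitor $i$ in hour $t$, $\mathrm{Policy}_t$ indicates the March 14-May 12 window in either year, $\mathrm{Y2020}_t$ indicates the year 2020, $X_t$ collects temperature and humidity, $\mu_i$ are monitor fixed effects, and $\tau_t$ stands in for the time fixed effects. Under this reading, $\beta_3$ corresponds to the "Policy Period × 2020" row, which is the difference-in-differences estimate.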
Table 4: Comparison of Low-cost Calibration Regression Coefficients (Hourly, 2020 & 2023)

| Type of Correction | Regulatory | Local | Bel-air | Pikine |
|---|---|---|---|---|
| Policy Period (Mar 14-May 12) | -37.58∗∗∗ | -6.32∗∗∗ | -9.75∗∗∗ | -3.26∗∗∗ |
| | (2.91) | (1.04) | (1.57) | (0.53) |
| 2020 | 20.54∗∗∗ | 4.21∗∗∗ | 6.22∗∗∗ | 1.99∗∗∗ |
| | (1.46) | (0.98) | (1.47) | (0.47) |
| Policy Period (March 14-May 12) × 2020 | -24.98∗∗∗ | -13.61∗∗∗ | -20.27∗∗∗ | -6.53∗∗∗ |
| | (1.71) | (0.78) | (1.18) | (0.37) |
| Average | 100.07 | 58.72 | 53.31 | 67.86 |
| % of Average | -24.97 | -23.18 | -38.02 | -9.62 |

Note: Controlling for temperature and humidity, with hourly, weekly, monthly, day-of-week and monitor fixed effects. The 'Average' is based on the pre-Covid average in 2020. Jaffe correction uses PM1/10 ratio to identify dust events. Outliers are winsorized at the 99th percentile. Low-cost refers to raw Purple Air data using the cf=1 formula. Barkjohn and Jaffe are Purple Air corrections based on previous literature. Local and Satellite corrections are based on calibrating the Purple Air data on co-located regulatory and satellite data, respectively. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001.

Table 5: Comparison of Low-cost Correction Regression Coefficients (By different areas of the city, Hourly, 2020 & 2023)

| Type of Correction | Low-cost | Local | Treatment | Spillover | Control |
|---|---|---|---|---|---|
| Policy Period (Mar 14-May 12) | -5.08∗∗∗ | -7.90∗∗∗ | -3.15∗∗ | -6.35∗∗∗ | -7.18∗∗∗ |
| | (0.63) | (0.99) | (1.00) | (1.23) | (1.14) |
| 2020 | 2.74∗∗∗ | 4.38∗∗∗ | 4.14∗∗∗ | -0.53 | 0.40 |
| | (0.62) | (0.98) | (0.62) | (0.73) | (0.73) |
| Policy Period (March 14-May 12) × 2020 | -8.94∗∗∗ | -14.26∗∗∗ | -11.39∗∗∗ | -12.05∗∗∗ | -6.35∗∗∗ |
| | (0.50) | (0.78) | (0.72) | (0.80) | (0.69) |
| Average | 23.12 | 58.81 | 23.99 | 23.99 | 23.34 |
| % of Average | -38.67 | -24.24 | -47.48 | -50.25 | -27.22 |

Note: Controlling for temperature and humidity, with hourly, weekly, monthly, day-of-week and monitor fixed effects. The 'Average' is based on the pre-Covid average in 2020. Local correction based on calibrating the Purple Air data on co-located regulatory and satellite data, respectively. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001.
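Several table notes mention winsorizing outliers at the 99th percentile. The exact procedure is not described in this section, so the sketch below rests on two assumptions: only the upper tail is capped, and the percentile is computed with the default interpolation of Python's `statistics.quantiles`. The function name and the sample readings are hypothetical.

```python
from statistics import quantiles


def winsorize_upper(values, percentile=99):
    """Cap values above the given percentile at that percentile.

    Assumption: upper-tail winsorizing only; the paper may treat
    both tails or use a different percentile definition.
    """
    # quantiles(..., n=100) returns the 99 cut points between percentiles
    cap = quantiles(values, n=100)[percentile - 1]
    return [min(v, cap) for v in values]


# Hypothetical hourly PM2.5 readings with one extreme spike
readings = [float(v) for v in range(1, 200)] + [5000.0]
capped = winsorize_upper(readings)
print(max(readings), max(capped))
```

Capping rather than dropping extreme readings keeps the number of observations unchanged while limiting the influence of sensor spikes on the regression coefficients.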
The estimated treatment effect is a reduction in the concentration of PM2.5 of 8.50 using the unadjusted low-cost data, 30.51 using estimates from satellite data, and 24.99 using data from the regulatory-grade monitors (Table 3). It is unsurprising that the impact of the policy in levels is smaller when using the data from the low-cost monitors than the other data sources because the low-cost monitors tend to underestimate the PM2.5 level. Using the average PM2.5 values measured by each data source in the pre-policy period in 2020, we calculate the percent change in air pollution due to the policy. As a percent of the average value before the policy, the regulatory and satellite data provide closely aligned results showing a 24%-25% decrease in PM2.5, while the low-cost monitors show a much larger decrease of around 37%. When we use the locally calibrated PM2.5 measurements from the low-cost monitors, the coefficient is larger (though still smaller than the coefficient in the regression using regulatory data), and the percent decrease, at around 23%, falls in line with the regulatory and satellite data. These findings are encouraging because they demonstrate that it is possible to measure similar impacts of a policy on PM2.5 pollution at the city level using three different data sources.
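The percent changes discussed above are the interaction coefficient divided by the pre-policy 2020 average for the same data source. A short sketch of that arithmetic, using the rounded values printed in Table 3, is given below. The dictionary and function names are illustrative. Because the inputs are rounded to two decimals, the results can differ from the table's "% of Average" row in the second decimal place.

```python
# (interaction coefficient, pre-policy 2020 average), as printed in Table 3
table3 = {
    "Regulatory": (-24.98, 100.07),
    "Satellite": (-30.51, 129.38),
    "Low-cost (unadjusted)": (-8.50, 23.08),
    "Local calibration": (-13.61, 58.72),
}


def percent_of_average(coefficient, baseline):
    """Express a treatment effect in levels as a percent of the baseline mean."""
    return 100.0 * coefficient / baseline


for source, (coef, avg) in table3.items():
    print(f"{source}: {percent_of_average(coef, avg):.2f}% of pre-policy average")
```

Expressing the effect relative to each source's own baseline is what makes the sources comparable even though their PM2.5 levels differ by a large factor.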
Table 6: Comparison of Low-cost Correction Regression Coefficients (By different areas of the city, Hourly, 2020 & 2023)

| Type of Correction | Unadjusted | Local | Treatment | Spillover | Control |
|---|---|---|---|---|---|
| Policy Period (Mar 14-May 12) | -4.18∗∗∗ | -6.32∗∗∗ | -1.39 | -8.18∗∗∗ | -7.23∗∗∗ |
| | (0.66) | (1.04) | (1.01) | (1.55) | (1.14) |
| 2020 | 2.61∗∗∗ | 4.21∗∗∗ | 3.85∗∗∗ | -1.27 | 0.64 |
| | (0.62) | (0.98) | (0.62) | (0.78) | (0.73) |
| Policy Period (March 14-May 12) × 2020 | -8.50∗∗∗ | -13.61∗∗∗ | -10.87∗∗∗ | -10.44∗∗∗ | -6.58∗∗∗ |
| | (0.50) | (0.78) | (0.71) | (0.93) | (0.68) |
| Average | 23.08 | 58.72 | 23.99 | 23.99 | 23.34 |
| % of Average | -36.85 | -23.18 | -45.33 | -43.50 | -28.20 |

Note: Controlling for temperature and humidity, with hourly, weekly, monthly, day-of-week and monitor fixed effects. The 'Average' is based on the pre-Covid average in 2020. Jaffe correction uses PM1/10 ratio to identify dust events. Outliers are winsorized at the 99th percentile. Low-cost refers to raw Purple Air data using the cf=1 formula. Barkjohn and Jaffe are Purple Air corrections based on previous literature. Local and Satellite corrections are based on calibrating the Purple Air data on co-located regulatory and satellite data, respectively. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001.

Thus, if data are available from only one source, it is still possible to measure the impacts of a policy that affects pollution at a larger geographic level. Nevertheless, the ability to accurately measure the impact of a policy with any data source will depend on the type of policy and the spatial and temporal frequency of interest. For instance, if there is interest in evaluating the impact of an intervention that is spatially concentrated and is unlikely to affect air pollution across the city as a whole, the satellite data would not be adequate since it provides only one value for the city.
Similarly, if policy makers are interested in the impact of a policy to reduce short-term peaks in air pollution, the satellite data would not be a good choice since it is not well aligned with data from regulatory-grade monitors at the hourly level. In these cases, it would be necessary to use data from air pollution monitors on the ground that collect data across space (whether low-cost or regulatory-grade), and the accuracy would depend on the density of the network.

Table 7: Comparison of Low-cost Calibration Regression Coefficients (Hourly, 2020 & 2023)

| Type of Calibration | (1) Unadjusted | (2) Barkjohn | (3) Jaffe | (4) Local | (5) Satellite |
|---|---|---|---|---|---|
| Policy Period (Mar 14-May 12) | -4.18∗∗∗ | -2.57∗∗∗ | -2.88∗∗∗ | -6.32∗∗∗ | -6.80∗∗∗ |
| | (0.66) | (0.35) | (0.61) | (1.04) | (1.10) |
| 2020 | 2.61∗∗∗ | 1.91∗∗∗ | -6.36∗∗∗ | 4.21∗∗∗ | 4.35∗∗∗ |
| | (0.62) | (0.32) | (0.81) | (0.98) | (1.03) |
| Policy Period (March 14-May 12) × 2020 | -8.50∗∗∗ | -4.93∗∗∗ | -1.61∗∗ | -13.61∗∗∗ | -14.18∗∗∗ |
| | (0.50) | (0.26) | (0.59) | (0.78) | (0.82) |
| Average | 23.08 | 14.64 | 18.68 | 58.72 | 79.74 |
| % of Average | -36.85 | -33.66 | -8.64 | -23.18 | -17.78 |

Note: Data is aggregated hourly. Controlling for temperature and humidity, with hourly, weekly, monthly, day-of-week and monitor fixed effects. The 'Average' is based on the pre-Covid average in 2020. Barkjohn and Jaffe are Purple Air corrections based on previous literature. The Jaffe calibration uses the PM1/10 ratio to identify dust events. Local and Satellite corrections are based on calibrating the Purple Air data on co-located regulatory and satellite data, respectively. Outliers are winsorized at the 99th percentile. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001.

Finally, the results up to this point suggest that the unadjusted PM2.5 measurements from the low-cost monitors give much lower magnitudes (and a smaller coefficient) relative to the regulatory and satellite data, even as the effect of the policy measured in percent change is largely aligned across data sources.
Therefore, we also evaluate the impact of the policy at the hourly level using the four different calibrations for the Purple Air Data: the two off-the-shelf calibrations from the literature (Barkjohn and Jaffe), and our two co-located calibrations based on PM2.5 measurements from regulatory-grade monitors and satellite estimates (Table 7). Both off-the-shelf calibrations estimate much smaller coefficients (treatment effect in levels), even as the treatment effect as a percent change stays around the same value for the Barkjohn calibration. The co-located calibrations result in a larger coefficient than the unadjusted data and are more in line with the coefficient estimated with the data from the regulatory-grade monitors. For the local calibration, the impact of the policy as a percent change is estimated as 23%, which is very similar to the 25% estimated with the data from the regulatory-grade monitors and 24% with the satellite data. The coefficient (treatment effect in levels) estimated using the satellite calibration is very similar to the one using the regulatory monitor calibration, though the percent change is smaller at only 18%. We conduct several robustness checks. Using data from only 2020 and 2023, we compare the results across the three data sources at the daily level using both the maximum hourly PM2.5 value of the day and the average of the hourly values for the day (Table A3 Panels A and B). Since there is only one year of comparison and therefore a small number of observations when conducting the analysis at the daily level, the results are much less stable. For both outcome variables, the coefficient remains negative when using data from the regulatory-grade monitor, but it is no longer significant and the percent change is much smaller. Using the estimates from satellite data, the coefficient is positive and not significant. 
Using the low-cost data, the results remain negative and significant when the outcome variable is the daily average, though the percent change is smaller.

We conduct an additional robustness check, adding more years to the analysis.23 The analysis at the hourly level shows very similar results to what we find when only comparing 2020 and 2023 at the hourly level, increasing our confidence in those results (Table A3 Panel C). At the daily level, we now find the results are fully aligned with the results when using the hourly data, in contrast to what we find when using only one year for comparison (Table A3 Panel D). Additionally, the local calibration coefficient at the daily level is much more closely aligned with the coefficient using the regulatory data. This is in line with our findings that the calibration at the daily level provides a better fit (Table A2).

This highlights two points. First, it demonstrates the benefit of having higher-resolution hourly data, especially in cases where only a shorter time series is available. Second, it shows that in cases where a longer time series is available, calibrating the low-cost monitors at a lower temporal frequency (daily aggregation or higher) can help provide policy results that are better aligned with the regulatory data.

Data from a spatially dense network of low-cost monitors can be calibrated using at least one regulatory-grade monitor to improve the measurement of PM2.5 levels and of policy impacts. Investing in a combination of low-cost and regulatory-grade monitors can improve the air quality data used for policy evaluation. In the absence of data from regulatory-grade monitors, the results suggest that satellite data can be used to calibrate the PM2.5 measurements from a network of low-cost monitors, although the results may not be as accurate in terms of measurement of PM2.5 levels or policy impacts as when calibrated with regulatory-grade data.

23 For the satellite and regulatory data we analyze 2017-2020 due to data availability and for the low-cost data we analyze 2020, 2022 and 2023, removing 2021 since some policies were still in place at that time.

6 Conclusion

In this paper, we compare and evaluate the three principal sources of PM2.5 data in Dakar, Senegal. We find that the low-cost monitors report substantially lower PM2.5 levels due to incomplete measurement of dust and that, at higher levels of temporal aggregation, PM2.5 levels derived from satellite data are closely aligned with PM2.5 levels from regulatory-grade monitors. Using mobility restrictions during COVID-19, we show that all three data sources provide similar estimates of the impact of a policy on PM2.5 at the city level, particularly when measured in percent changes instead of levels.

Finally, we show the value of investing in at least one regulatory-grade monitor. Developing a context-specific calibration formula provides substantial improvement over off-the-shelf calibration formulas in aligning the data from low-cost monitors with data from regulatory-grade monitors. Faced with budget constraints, an air pollution monitoring system based on one well-maintained regulatory-grade monitor complemented with a dense network of low-cost monitors can provide PM2.5 data with high spatial variation that can also be used to accurately estimate the impact of policies. Barring any data from regulatory-grade monitors, calibration of PM2.5 measurements from low-cost monitors using satellite-based estimates of PM2.5 can improve the measurement of PM2.5 levels and policy impacts as long as the calibration is conducted at a lower temporal frequency.

Our results have important implications for policy makers in rapidly growing urban areas of LMICs. Our results also speak to rural or suburban areas of higher-income countries where the coverage of regulatory-grade monitors is also sparse.
Overall, the results are encouraging and point to the potential that low-cost monitors have as part of a system of air pollution monitoring. Increasing the data availability in these settings could help lead to an important expansion of the literature focused on air pollution, providing a better understanding of context-specific challenges and potential environmental policy solutions to address these.

References

(). Source apportionment of ambient particulate matter (PM) in two Western African urban sites (Dakar in Senegal and Bamako in Mali).
Agence Nationale de la Statistique et de la Démographie (2023). 5ème Recensement Général de la Population et de l’Habitat (RGPH-5) - Rapport préliminaire. Tech. rep., Agence Nationale de la Statistique et de la Démographie (ANSD).
Aguilar-Gomez, S., Dwyer, H., Graff Zivin, J. and Neidell, M. (2022). This is air: The “non-health” effects of air pollution. Annual Review of Resource Economics, 14, 403–425.
Anderson, M. L. (2020). As the wind blows: The effects of long-term exposure to air pollution on mortality. Journal of the European Economic Association, 18, 1886–1927.
Arceo, E., Hanna, R. and Oliva, P. (2016). Does the effect of pollution on infant mortality differ between developing and developed countries? Evidence from Mexico City. The Economic Journal, 126, 257–280.
Ardon-Dryer, K., Dryer, Y., Williams, J. N. and Moghimi, N. (2020). Measurements of PM2.5 with PurpleAir under atmospheric conditions. Atmospheric Measurement Techniques, 13 (10), 5441–5458.
Barkjohn, K., Holder, A., Clements, C., Frederick, S. and Evans, R. (2023). Sensor data cleaning and correction: Application on the AirNow fire and smoke map. U.S. EPA.
Barkjohn, K. K., Gantt, B. and Clements, A. L. (2021). Development and application of a United States-wide correction for PM2.5 data collected with the PurpleAir sensor. Atmospheric Measurement Techniques, 14, 4617–4637.
Bell, M., Samet, J. and Dominici, F. (2004).
Time-series studies of particulate matter. Annual Review of Public Health, 25, 247–280.
Brodeur, A., Cook, N. and Wright, T. (2021). On the effects of COVID-19 safer-at-home policies on social distancing, car crashes and pollution. Journal of Environmental Economics and Management, 106, 102427.
Buchard, V., Randles, C., Da Silva, A., Darmenov, A., Colarco, P., Govindaraju, R., Ferrare, R., Hair, J., Beyersdorf, A., Ziemba, L. et al. (2017). The MERRA-2 aerosol reanalysis, 1980 onward. Part II: Evaluation and case studies. Journal of Climate, 30 (17), 6851–6872.
Castell, N., Dauge, F. R., Schneider, P., Vogt, M., Lerner, U., Fishbain, B., Broday, D. and Bartonova, A. (2017). Can commercial low-cost sensor platforms contribute to air quality monitoring and exposure estimates? Environment International, 99, 293–302.
Chang, T. Y., Graff Zivin, J., Gross, T. and Neidell, M. (2019). The effect of pollution on worker productivity: Evidence from call center workers in China. American Economic Journal: Applied Economics, 11 (1), 151–172.
Chen, S., Oliva, P. and Zhang, P. (2022). The effect of air pollution on migration: Evidence from China. Journal of Development Economics, 156, 102833.
Demanega, I., Mujan, I., Singer, B. C., Anđelković, A. S., Babich, F. and Licina, D. (2021). Performance assessment of low-cost environmental monitors and single sensors under variable indoor air quality and thermal conditions. Building and Environment, 187, 107415.
Deryugina, T., Heutel, G., Miller, N. H., Molitor, D. and Reif, J. (2019). The mortality and medical costs of air pollution: Evidence from changes in wind direction. American Economic Review, 109 (12), 4178–4219.
Di, Q., Kloog, I., Koutrakis, P., Lyapustin, A., Wang, Y. and Schwartz, J. (2016). Assessing PM2.5 exposures with high spatiotemporal resolution across the continental United States. Environmental Science & Technology, 50 (9), 4712–4721.
EPA (2017).
Quality assurance handbook for air pollution measurement systems, volume II: Ambient air quality monitoring program. Section 11.0. Office of Air Quality Planning and Standards, U.S. Environmental Protection Agency.
Flemming, J., Benedetti, A., Inness, A., Engelen, R. J., Jones, L., Huijnen, V., Remy, S., Parrington, M., Suttie, M., Bozzo, A. et al. (2017). The CAMS interim reanalysis of carbon monoxide, ozone and aerosol for 2003–2015. Atmospheric Chemistry and Physics, 17 (3), 1945–1983.
Fowlie, M., Rubin, E. and Walker, R. (2019). Bringing satellite-based air quality estimates down to earth. AEA Papers and Proceedings, 109, 283–288.
Fu, S., Viard, V. B. and Zhang, P. (2021). Air pollution and manufacturing firm productivity: Nationwide estimates for China. The Economic Journal, 131 (640), 3241–3273.
Gelaro, R., McCarty, W., Suárez, M. J., Todling, R., Molod, A., Takacs, L., Randles, C. A., Darmenov, A., Bosilovich, M. G., Reichle, R. et al. (2017). The modern-era retrospective analysis for research and applications, version 2 (MERRA-2). Journal of Climate, 30 (14), 5419–5454.
Gendron-Carrier, N., Gonzalez-Navarro, M., Polloni, S. and Turner, M. A. (2022). Subways and urban air pollution. American Economic Journal: Applied Economics, 14 (1), 164–196.
Gil-Alana, L. A., Yaya, O. S. and Carmona-González, N. (2020). Air quality in London: Evidence of persistence, seasonality and trends. Theoretical and Applied Climatology, 142, 103–115.
Graff Zivin, J. and Neidell, M. (2013). Environment, health, and human capital. Journal of Economic Literature, 51, 689–730.
Guidetti, B., Pereda, P. and Severnini, E. (2021). “Placebo tests” for the impacts of air pollution on health: The challenge of limited health care infrastructure. AEA Papers and Proceedings, 111, 371–375.
Hajat, A., Hsia, C. and O’Neill, M. S. (2015). Socioeconomic disparities and air pollution exposure: A global review.
Current Environmental Health Reports, 2, 440–450.
Handschuh, J., Erbertseder, T. and Baier, F. (2023). Systematic evaluation of four satellite AOD datasets for estimating PM2.5 using a random forest approach. Remote Sensing, 15 (8), 2064.
He, J., Liu, H. and Salvo, A. (2019). Severe air pollution and labor productivity: Evidence from industrial towns in China. American Economic Journal: Applied Economics, 11 (1), 173–201.
Hoffmann, B. and Rud, J. P. (2024). The unequal effects of pollution on labor supply. Econometrica (Conditionally Accepted).
IQAir (2024). 2023 world air quality report.
Jaffe, D. A., Thompson, K., Finley, B., Nelson, M., Ouimette, J., Andrews, E. et al. (2023). An evaluation of the US EPA’s correction equation for PurpleAir sensor data in smoke, dust, and wintertime urban pollution events. Atmospheric Measurement Techniques, 16 (5), 1311–1322.
Kamigauti, L. Y., Perez, G. M., Martin, T. C., de Fatima Andrade, M. and Kumar, P. (2024). Enhancing spatial inference of air pollution using machine learning techniques with low-cost monitors in data-limited scenarios. Environmental Science: Atmospheres.
Kumar, N., Chu, A. and Foster, A. (2007). An empirical relationship between PM2.5 and aerosol optical depth in Delhi metropolitan. Atmospheric Environment, 41 (21), 4492–4503.
Kumar, P., Morawska, L., Martani, C., Biskos, G., Neophytou, M., Di Sabatino, S., Bell, M., Norford, L. and Britter, R. (2015). The rise of low-cost sensing for managing air pollution in cities. Environment International, 75, 199–205.
Li, J., Carlson, B. E., Yung, Y. L., Lv, D., Hansen, J., Penner, J. E., Liao, H., Ramaswamy, V., Kahn, R. A., Zhang, P. et al. (2022). Scattering and absorbing aerosols in the climate system. Nature Reviews Earth & Environment, 3 (6), 363–379.
Mittal, L. (2020). London air quality network summary report 2020. https://londonair.org.uk/london/reports/2020_LAQN_Report.pdf.
NASA (2023). Supplemental Documentation for GEOS Aerosol Products. Tech. Rep.
GMAO Office Note No. 22 (Version 1.1), Global Modeling and Assimilation Office, Earth Sciences Division, NASA Goddard Space Flight Center, Greenbelt, Maryland 20771, release date: 10/03/2023.
Pinder, R. W., Klopp, J. M., Kleiman, G., Hagler, G. S., Awe, Y. and Terry, S. (2019). Opportunities and challenges for filling the air quality data gap in low- and middle-income countries. Atmospheric Environment, 215, 116794.
Pope, C. A. and Dockery, D. (2006). Health effects of fine particulate air pollution: Lines that connect. Journal of the Air & Waste Management Association, 56, 709–742.
Rentschler, J. and Leonova, N. (2022). Air Pollution and Poverty. World Bank, Washington, DC.
Snyder, E. G., Watkins, T. H., Solomon, P. A., Thoma, E. D., Williams, R. W., Hagler, G. S., Shelow, D., Hindin, D. A., Kilaru, V. J. and Preuss, P. W. (2013). The changing paradigm of air pollution monitoring. Environmental Science & Technology, 47 (20), 11369–11377.
South Coast Air Quality Management District (2024). PM sensor evaluations - AQ-SPEC program. http://www.aqmd.gov/aq-spec/evaluations/criteria-pollutants/summary-pm, accessed: 20-Mar-2024.
Stavroulas, I., Grivas, G., Michalopoulos, P., Liakakou, E., Bougiatioti, A., Kalkavouras, P., Fameli, K. M., Hatzianastassiou, N., Mihalopoulos, N. and Gerasopoulos, E. (2020). Field evaluation of low-cost PM sensors (Purple Air PA-II) under variable urban air quality conditions, in Greece. Atmosphere, 11 (9), 926.
Venter, Z. S., Aunan, K., Chowdhury, S. and Lelieveld, J. (2020). COVID-19 lockdowns cause global air pollution declines. Proceedings of the National Academy of Sciences, 117 (32), 18984–18990.
Visual Crossing (2024). Visual Crossing weather. https://www.visualcrossing.com/, accessed: 2024-08-29.
World Bank and Institute for Health Metrics and Evaluation (2016). The cost of air pollution: Strengthening the economic case for action. License: Creative Commons Attribution CC BY 3.0 IGO.
Zeng, J., Gerasimov, I., Adams, J., Huwe, P., Wei, J. and Meyer, D. (2020). Exploration of atmospheric compositions by TROPOMI on Sentinel-5P. In EGU General Assembly Conference Abstracts, p. 4330.
Zhang, X., Zhang, X. and Chen, X. (2017). Happiness in the air: How does a dirty sky affect mental health and subjective well-being? Journal of Environmental Economics and Management, 85, 81–94.
Zivin, J. G., Liu, T., Song, Y., Tang, Q. and Zhang, P. (2020). The unintended impacts of agricultural fires: Human capital in China. Journal of Development Economics, 147, 102560.

7 Appendix

Figure A1: Dashboard showing PM2.5 values for low-cost monitors over time
Note: Data is aggregated daily. Low-cost data uses the unadjusted values.

Figure A2: Satellite PM2.5 estimates with and without dust, 2020-2021
Note: Data is aggregated daily. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station. The 'excluding dust' estimate removes the dust aerosol from the estimate of PM2.5 generated by Equation 1.

Figure A3: One day example of PM2.5 across Dakar measured using Low-cost monitors applying daily calibrations
Panels: (A) Low-cost monitors (local calibration applied); (B) Low-cost monitors (satellite calibration applied)
Note: Maps are created using calibrated Purple Air data on April 7, 2023. The map in panel (A) is interpolated based on locally calibrated Purple Air data from 19 locations using data from regulatory-grade monitors for the calibration; the map in panel (B) is interpolated based on calibrated Purple Air data from 19 locations using satellite data from MERRA-2 for the calibration.

Figure A4: Comparison of air pollution in 2020 and 2023, Daily
Panels: (A) Low-cost Monitors; (B) Satellite; (C) Regulatory monitors
Note: Dotted line denotes when mobility-limiting policies were implemented in 2020. Data is aggregated daily.
Low-cost data uses the Local Calibration that adjusts Purple Air values based on calibration with two local regulatory monitors. The Regulatory data comes from the CGQA monitors at Republique, Guediawayé, and Pikine, depending on data availability. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station.

Table A1: Air Pollution products from Satellites

| Type | Source | Products | Temporal | Spatial (g) | Availability |
|---|---|---|---|---|---|
| Assimilation from various sources | MERRA-2 | AOD, Gases, Aerosols (a) | Hourly | 55 x 69 km² | 1980 - Present |
| Assimilation from various sources | Copernicus Atmosphere Monitoring Service | AOD, Gases, Aerosols, PM (b) | Daily (c) | 44 x 44 km² | 2003 - Present |
| Assimilation from various sources | Van Donkelaar et al (2021) | PM2.5 (d) | Monthly | 10 x 10 km² | 1998 - 2022 |
| Instruments on satellite | MODIS on Terra/Aqua | AOD | Daily | 1 x 1 km² | 2000-02-26 - Present |
| Instruments on satellite | SLSTR on Sentinel-3 | AOD | Daily | 9.5 x 9.5 km² | 19-08-2020 - Present |
| Instruments on satellite | VIIRS on S-NPP | AOD | Daily | 6 x 6 km² | 2012-03-01 - Present |
| Instruments on satellite | OMI on Aura | Gases, Aerosols | Daily | 13 x 24 km² | 2004-10-01 - Present |
| Instruments on satellite | TROPOMI on Sentinel 5-P | Gases, Aerosols (e) | Daily | 1.1 x 1.1 km² | 2018-06-28 - Present |
| Instruments on satellite | SEVIRI on MSG | AOD | Daily | 3 x 3 km² | 01-02-2004 - 31-12-2012 |
| Instruments on satellite | AVHRR, GOME-2, and IASI on METOP | AOD, aerosol type (f) | Daily | 5/10 × 40 km² | 10-07-2007 - 31-08-2019 |

(a) MERRA-2 AOD, Gases, Aerosols data available at 0.5° x 0.625°.
(b) Copernicus Atmosphere Monitoring Service data has real-time resolution. Resolution changes to 80 x 80 km² for re-analysis. Prior to July 2021, only data on Total Aerosol Optical Depth at 550nm and PM2.5 surface estimates are available. Data updates approximately every 3 hours after July 2021.
(c) Daily data, approximately every 3 hours after July 2021.
(d) Van Donkelaar et al (2021) PM2.5 data available at 0.1° × 0.1°.
(e) TROPOMI on Sentinel 5-P Gases, Aerosols data available at 0.01° × 0.01°.
(f) AVHRR, GOME-2, and IASI on METOP aerosol types are categorized by 'fine, coarse, ash, and biomass'.
(g) Spatial distance is approximate, and can vary depending on the distance the measurement is from the equator.

Table A2: Calibration Coefficients comparing Regulatory and nearby Purple Air monitors (Daily)

| | (1) Bel-Air: Mass PM2.5 (µg/m3) | (2) Bel-Air: Count >2.5 | (3) Pikine: Mass PM2.5 (µg/m3) | (4) Pikine: Count >2.5 | (5) Bel-air & Pikine: Mass PM2.5 (µg/m3) | (6) Bel-air & Pikine: Count >2.5 |
|---|---|---|---|---|---|---|
| Mass Concentration PM2.5 | 2.479 | | 1.845 | | 2.396 | |
| Count concentration >2.5 | | 2.098 | | 2.144 | | 2.171 |
| Humidity | -0.219 | -0.415 | -1.774 | -2.067 | -0.693 | -0.930 |
| Temperature | -1.196 | 0.423 | -3.178 | -2.008 | -1.094 | 1.381 |
| Constant | 35.76 | 19.21 | 226.1 | 223.9 | 72.91 | 40.03 |
| Out-of-Sample R2 | 0.696 | 0.909 | 0.706 | 0.763 | 0.719 | 0.780 |

Note: Lasso linear model with 5-fold cross-validation. Mass PM2.5 (µg/m3) refers to the mass concentration of fine particulates with a diameter of less than 2.5 microns. Count (particles/100ml) refers to the count of all particles greater than or equal to 2.5 micrometers in diameter. Data is aggregated daily before calibration.

Table A3: Impact of Covid Mobility Restrictions on Air Pollution Measures

| | (1) Regulatory | (2) Satellite | (3) Low-cost | (4) Local Calibration |
|---|---|---|---|---|
| Panel A: 2020 & 2023 (Daily Maximum) | | | | |
| Policy Period (Mar 14-May 12) | 27.47∗ | -31.02 | -6.91∗ | -10.42∗ |
| | (11.07) | (22.89) | (2.85) | (4.33) |
| 2020 | -1.11 | -16.49 | -1.07 | -4.37 |
| | (11.45) | (18.50) | (5.81) | (8.87) |
| Policy Period (March 14-May 12) × 2020 | -7.73 | 23.66 | -5.54 | -7.85 |
| | (13.29) | (22.55) | (4.51) | (6.94) |
| Average | 159.59 | 210.25 | 47.56 | 107.02 |
| % of Average | -4.84 | 11.25 | -11.66 | -7.33 |
| Panel B: 2020 & 2023 (Daily Averages) | | | | |
| Policy Period (Mar 14-May 12) | 7.26 | -19.11 | -0.11 | -0.12 |
| | (6.15) | (11.83) | (1.04) | (2.50) |
| 2020 | 2.57 | 3.50 | 0.76 | 1.73 |
| | (5.94) | (9.86) | (1.80) | (4.26) |
| Policy Period (March 14-May 12) × 2020 | -4.10 | 1.96 | -4.08∗ | -9.81∗∗ |
| | (6.92) | (12.28) | (1.59) | (3.78) |
| Average | 102.83 | 131.44 | 23.41 | 54.03 |
| % of Average | -3.98 | 1.49 | -17.44 | -18.15 |
| Panel C: Over a 4-year period (Hourly Averages) | | | | |
| Policy Period (Mar 14-May 12) | -15.35∗∗∗ | -54.08∗∗∗ | -10.33∗∗∗ | -15.90∗∗∗ |
| | (2.70) | (4.73) | (0.57) | (0.89) |
| 2020 | 20.12∗∗∗ | 22.21∗∗∗ | -1.22∗ | -1.89∗ |
| | (1.20) | (2.13) | (0.58) | (0.91) |
| Policy Period (March 14-May 12) × 2020 | -26.20∗∗∗ | -24.55∗∗∗ | -5.82∗∗∗ | -9.34∗∗∗ |
| | (1.20) | (2.36) | (0.45) | (0.70) |
| Average | 100.07 | 129.38 | 23.08 | 58.72 |
| % of Average | -26.18 | -18.97 | -25.20 | -15.90 |
| Panel D: Over a 4-year period (Daily Averages) | | | | |
| Policy Period (Mar 14-May 12) | 20.99∗∗∗ | -14.98∗ | -5.05∗∗∗ | -11.94∗∗∗ |
| | (4.41) | (7.19) | (0.86) | (2.07) |
| 2020 | 11.41∗ | 13.83 | -0.55 | -1.47 |
| | (4.79) | (7.37) | (1.94) | (4.55) |
| Policy Period (March 14-May 12) × 2020 | -20.84∗∗∗ | -18.74∗ | -6.01∗∗∗ | -14.28∗∗∗ |
| | (4.89) | (8.32) | (1.66) | (3.86) |
| Average | 102.83 | 131.44 | 23.41 | 54.03 |
| % of Average | -20.27 | -14.26 | -25.68 | -26.44 |

Note: For every panel, the 'Average' is based on the pre-Covid average in 2020 (January 22-March 13). Outliers are winsorized at the 99th percentile. Regulatory data comes from monitors operated by the CGQA. Satellite estimates are generated from the MERRA-2 assimilation. The satellite value is the MERRA-2 PM2.5 estimate closest to the Bel-air regulatory-grade monitor station. Low-cost refers to unadjusted Purple Air data. The Local calibration is based on calibrating the Purple Air data using co-located regulatory data. All panels control for temperature and humidity, with monthly, day-of-week, and monitor fixed effects. Panels A and C additionally include hour of the day and week of the year fixed effects. Panels A and B contain data from 22nd January to 12th May for 2020 and 2023. In Panels C and D, regulatory and satellite data are from 2017-2020, while low-cost data are from 2020, 2022, and 2023 (2021 is not included as some COVID mitigation measures were ongoing during that time). Panel A represents the maximum hourly value for each day. Panels B and D represent values aggregated daily. Panel C represents data aggregated hourly. For Panels B and D, the daily local calibration is used, applying the Bel-Air and Pikine combined coefficients from Table A2. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001.
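The note to Table A3 states that the daily local calibration applies the combined Bel-Air and Pikine coefficients from Table A2. A minimal sketch of how such a linear calibration could be applied is shown below. It assumes the mass-concentration specification in column (5) is the one used, which this section does not state explicitly, and it assumes humidity and temperature are supplied in the same units as in the authors' calibration data, which are also not stated here. The function name and the input values in the example are hypothetical.

```python
# Coefficients as printed in Table A2, column (5):
# Bel-air & Pikine, mass-concentration model (assumed to be the one applied)
CONSTANT = 72.91
COEF_PM25 = 2.396          # unadjusted Purple Air mass PM2.5
COEF_HUMIDITY = -0.693
COEF_TEMPERATURE = -1.094


def calibrate_daily_pm25(pa_pm25, humidity, temperature):
    """Apply the linear calibration to one daily observation.

    Inputs must use the same units as the calibration data
    (an assumption; the units are not given in this section).
    """
    return (CONSTANT
            + COEF_PM25 * pa_pm25
            + COEF_HUMIDITY * humidity
            + COEF_TEMPERATURE * temperature)


# Hypothetical daily averages for one low-cost monitor
example = calibrate_daily_pm25(pa_pm25=20.0, humidity=70.0, temperature=25.0)
print(f"Calibrated PM2.5: {example:.2f}")
```

Because the coefficient on the Purple Air reading is well above one, the calibrated series scales up the unadjusted values, which is consistent with the finding that the low-cost monitors underestimate PM2.5 relative to the regulatory-grade monitors.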