POVERTY AND EQUITY
EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT

Enabling High-frequency and Real-time Poverty Monitoring in the Developing World with SWIFT
Survey of Wellbeing via Instant and Frequent Tracking

Nobuo Yoshida and Danielle Victoria Aron

© 2024 International Bank for Reconstruction and Development / The World Bank
1818 H Street NW, Washington DC 20433
Telephone: 202-473-1000; Internet: www.worldbank.org

This work is a product of the staff of The World Bank with external contributions. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of The World Bank, its Board of Executive Directors, or the governments they represent.

The World Bank does not guarantee the accuracy, completeness, or currency of the data included in this work and does not assume responsibility for any errors, omissions, or discrepancies in the information, or liability with respect to the use of or failure to use the information, methods, processes, or conclusions set forth. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries.

Nothing herein shall constitute or be construed or considered to be a limitation upon or waiver of the privileges and immunities of The World Bank, all of which are specifically reserved.

Rights and Permissions

The material in this work is subject to copyright. Because The World Bank encourages dissemination of its knowledge, this work may be reproduced, in whole or in part, for noncommercial purposes as long as full attribution to this work is given.

Any queries on rights and licenses, including subsidiary rights, should be addressed to World Bank Publications, The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA; fax: 202-522-2625; e-mail: pubrights@worldbank.org.
ACKNOWLEDGMENT

Our heartfelt thanks to the contributors for their essential insights on the country case studies in Section 6 of this report. Special mentions include Xueqi Li and Eliana Carolina Rubiano Matulevich for Paraguay; Carolina Diaz-Bonilla for Botswana; Paripoorna Baxi, with assistance from Shinya Takamatsu for the DRC; Anjali Kini, with inputs from Dhiraj Sharma, Shinya Takamatsu, and Rob Swinkels for Zimbabwe; Jeremy Schneider, with insights from Aziz Atamanov for Uganda; Anjali Kini and Xueqi Li, with Lydia Kim's contributions for Mongolia; Maria Gabriela Farfan Betran and Sebastian Patrick Alexandre Silva Leander for Zambia; and Paripoorna Baxi, with Marta Schoch's inputs for Nigeria. Their expertise significantly enriched this report, highlighting the diverse applications and impacts of the SWIFT methodology worldwide.

Special appreciation goes to Jeremy Evan Schneider for his dedicated assistance in drafting this report. We're also grateful for the insightful and constructive feedback provided by Tara Vishwanath, Gabriel Lara Ibarra, Maria Eugenia Genoni, Carolina Diaz-Bonilla, Nandini Krishnan, Maria Fernanda Gonzalez Icaza, Laura Liliana Moreno Herrera, Roy Van der Weide, Hai-Anh H. Dang, David Newhouse, Dean Mitchell Jolliffe, Talip Kilic, Ksenia Abanokova, Kimberly Blair Bolch, Henry Stemmler, and all participants in the review meeting at the World Bank. Our appreciation also extends to Carlos Sabatino for his editorial review, and to Luis-Felipe Lopez-Calva and Benu Bidani for their leadership and support in the preparation of this report.

Table of Contents

1. Introduction
2. Framework for SWIFT Poverty Estimation
   A. Model development
   B. Collection or harmonization of the target data
   C. Imputation
   D. Improved model stability with SWIFT Plus and SWIFT 2.0
   E. Reliability of SWIFT-based poverty estimates
   F. Advantages of the SWIFT framework and its caveats
3. Poverty Estimation with SWIFT: A Flexible Framework for Diverse Use Cases
   A. Increasing the frequency of poverty statistics using available frequent surveys
   B. Producing poverty statistics when an appropriate training dataset is not already available (SWIFT 2.0)
   C. Rapid poverty monitoring with non-traditional data collection
   D. Restoring comparability to reestablish a poverty trend
4. Cost Implications of the SWIFT Framework
   (i) Frequent Household Survey Already Available
   (ii) No Appropriate Data for Training
   (iii) No Frequent Household Survey Available, but Plans for a Frequent Phone Survey
   (iv) No Frequent Household Survey Available, but Plans for Community-Based Data Collection
5. Brief Summary of the SWIFT Framework
6. Country Examples
   (i) Paraguay: utilizing quarterly surveys to provide poverty statistics
   (ii) Botswana: utilizing quarterly labor force surveys for poverty statistics
   (iii) Democratic Republic of Congo: providing a poverty trend despite a lack of comparable household surveys
   (iv) Zimbabwe: using SWIFT to provide poverty statistics after unexpected economic changes
   (v) Malawi: providing timely poverty data in a climate crisis context
   (vi) Uganda: providing timely poverty data in a refugee crisis context
   (vii) Mongolia: restoring poverty trends after improvements to the household survey
   (viii) Zambia: restoring poverty trends due to differences in household surveys
   (ix) Nigeria: using available past surveys to uncover a poverty trend
7. Conclusion
Annex 1. Metadata for country cases
Annex 2. Simulation process

Figures

Figure 1 Steps of the SWIFT framework
Figure 2 Paraguay quarterly poverty projection results
Figure 3 Botswana poverty projection results
Figure 4 Rural southern Malawi poverty estimates using RFMS
Figure 5 Uganda poverty estimates for refugees
Figure 6 Mongolia validation exercise – comparison of 2016 actual and imputed poverty headcount by aimag group
Figure 7 Zambia poverty and inequality trends

Tables

Table 1 Comparison of poverty rates – actual expenditures and SWIFT imputed expenditures (%)
Table 2 Classification and activities of SWIFT analysis
Table 3 Paraguay model validation results
Table 4 DRC poverty estimates
Table 5 Zimbabwe poverty rates by upper, lower, and food poverty lines
Table 6 Mongolia actual and imputed Gini coefficients in 2016, 2018, and 2020
Table 7 Zambia model performance results in 2015 data
Table 8 Nigeria poverty estimates at international poverty lines, by GHS wave

[Photo: Zambia. Credit: World Bank]

1. Introduction

The Survey of Wellbeing via Instant and Frequent Tracking (SWIFT) is an innovative framework for producing poverty statistics with significantly reduced costs and time. Lanjouw and Yoshida (2022) reveal that official poverty data in low-income countries are only available every seven years on average, which leads to an inability to fully understand yearly fluctuations in the poverty rate. Prolonged gaps between official poverty rates are primarily due to the slow and laborious nature of traditional methods of poverty estimation. Traditional methods require detailed consumption or income data, which are collected through carefully designed Computer Assisted Personal Interview (CAPI) questionnaires, involving extensive interviews conducted by well-trained enumerators.
After data collection, poverty measurement experts spend a minimum of three (and sometimes up to twelve) months to produce poverty estimates. Overall, collecting and analyzing data using traditional methods can take 1.5 to 3 years, costing millions of dollars. The high costs and time required result in the infrequent availability of poverty data.

In contrast, SWIFT enables the production of more frequent and timely poverty estimates with remarkable efficiency. By leveraging machine-learning models trained on the latest household surveys, SWIFT computes poverty estimates from just 10 to 15 poverty correlates. When these correlates are present in existing datasets, the estimation cost is minimal. Even when these inputs are not readily available, a brief 2- to 3-minute interview per household is all that is needed to gather the data, a considerable time saving compared to traditional household surveys.

Despite the substantial reduction in costs and time, the accuracy of SWIFT-based poverty estimates is high. Research by Yoshida et al. (2022) indicates that SWIFT's estimates usually deviate by no more than +/- 2 percentage points from actual poverty rates. This level of accuracy, coupled with the ease of implementation, has led to the widespread adoption of SWIFT in over 200 projects across 70 countries.

[Photo: DRC, capturing health data. Credit: Vincent Tremeau / World Bank]

This report aims to familiarize those involved in estimating official poverty statistics with the SWIFT framework to enhance the frequency and quality of poverty data. It presents how SWIFT works, discusses the advantages and caveats of the methodology, and provides examples of country-specific applications, covering cases such as:

1. Enhancing the frequency of poverty statistics using existing frequent household surveys,
2. Producing poverty statistics when an existing training dataset is not already available,
3. Exploring the integration of new data collection approaches, such as phone surveys and community-based data collection, into the SWIFT framework, and
4. Restoring comparability of poverty data over time.

The first three types of cases highlight the utilization of SWIFT to increase the frequency of poverty data, while the fourth addresses how SWIFT can improve the quality of poverty data by ensuring comparability over time, a crucial obstacle to official poverty monitoring.

The report is organized as follows: Section 2 describes the SWIFT framework, Section 3 lists the applications of SWIFT included in this report, Section 4 describes the cost expectations, Section 5 provides decision-making guidance on how and when to apply SWIFT, Section 6 shows specific country examples of how SWIFT has been utilized, Section 7 concludes the report, and Section 8 lists those who contributed to the writing of this report.

2. Framework for SWIFT Poverty Estimation

The SWIFT framework for estimating poverty contains three primary components. The first is the development of an imputation model from a household survey (we will refer to this dataset as the training dataset). The training dataset can be any sufficiently large dataset that is nationally representative and contains both a welfare aggregate and data on poverty correlates, such as household demographics, dwelling conditions, asset ownership, labor market statistics, food consumption, and food security. The second component of the SWIFT framework is creating the secondary dataset into which household expenditure[1] will be imputed (we will refer to this second dataset as the target dataset).
The target dataset can be any dataset that contains data on the poverty correlates that are included in the model. The dataset can be created by either collecting the necessary data on poverty correlates in a new survey or by harmonizing an existing survey to be comparable to the training dataset.[2] Finally, the third component is the application of the developed model(s) to the target dataset to impute household expenditure and estimate poverty rates. The order of the three components of the SWIFT framework will depend on the existing availability of the training and target data.

[Photo: Paraguay, Capiibary producers fair. Credit: Farrah Frick / World Bank]

[1] The welfare variable that is imputed into the target dataset is determined by what is available in the training dataset. This could be household expenditure, consumption, or income. For simplicity of language in this report, we will refer to only "household expenditure" as the household welfare variable in the Methodology section.
[2] In some instances, there may be multiple target datasets, which could be a mix of existing surveys and newly collected surveys.

A. MODEL DEVELOPMENT

In the SWIFT framework it is usually assumed that the natural logarithm of household expenditure per capita (or per adult equivalence) of household $h$ follows the linear model (1), where $x_h$ is a $(k \times 1)$ vector of poverty correlates, $\beta$ is a $(k \times 1)$ vector of coefficients of the poverty correlates, and $u_h$ is an error term which follows a normal distribution, $N(0, \sigma)$.[3]

$$\ln y_h = x_h' \beta + u_h, \qquad u_h \sim N(0, \sigma) \tag{1}$$

An imputation model is trained within the training data using a stepwise OLS regression.[4] The stepwise regression starts with a large pool of variables within the training data and uses an iterative process to pare down the number of variables to a small set with independent explanatory power for the dependent variable, the natural log of household expenditure. Variables from the larger pool are dropped as candidates for poverty correlates in the model if their relationship to log household expenditure is insignificant at a given threshold.

This significance threshold is chosen through a Cross-Validation (CV) exercise. Cross-Validation is frequently used in machine learning applications to avoid the "over-fitting" problem that can occur when developing and testing a model within the same dataset. Over-fitting occurs when a model performs very well within the training data but may perform poorly in outside datasets, which can be the result of a model including too many variables relative to the sample size of the training data. CV randomly splits the training data into subsamples called folds. In the SWIFT modeling exercise, ten folds are used. A series of models are estimated using a stepwise regression with varying significance thresholds from nine folds and then tested in the one remaining fold. Because the testing fold is outside the sample of observations used to train the model, the results are not subject to the over-fitting problem. This process is repeated so that each fold is used as the testing fold. From this exercise, two performance indicators are produced: (1) the absolute value of the difference between the actual and projected poverty rates and (2) the mean squared error (MSE). The selected significance level is the lowest p-value where either the absolute difference is minimized or where the MSE is minimized.

Once the model has been created with the optimal significance level, it is tested for performance within the training data by imputing household expenditure twenty times for each household.[5] The mean imputed household expenditure and associated poverty rates can then be compared with the training data's actual household expenditure and poverty rates. Though this is an "in-sample" test, the results are guarded against the over-fitting problem because of the significance level selected by the CV exercise.

The modeling process is an iterative approach, with the performance tests allowing an analyst to see when a model should be re-done to produce better results. Analysts should also pay special attention to the signs of the coefficients of the model variables, checking that each variable is affecting log household expenditure in an expected way. When variables are not affecting expenditure in a predictable way, the training data should be re-evaluated and/or dialogue should be opened with a country expert before proceeding to better understand the underlying data.

Lastly, the number of models required for the most accurate poverty estimates depends on the project context. It is often the case that multiple models will be required, disaggregated by regions or population types. For instance, if consumption patterns are known to differ vastly between the northern province and the southern province of a country, it is advisable to create a model specific to the northern province population and a model specific to the southern province population. Similarly, it is often the case that separate models should be created for urban and rural populations.

[3] The SWIFT modeling approach usually uses the natural logarithm of household expenditure per capita because many household expenditure data follow log-normality. However, there are some cases where household expenditure data do not follow log-normality. Yoshida et al. (2022) examine some approaches to address this issue. The use of the Box-Cox transformation is one of them.
[4] The stepwise regression approach is employed because during the imputation phase, coefficients are derived from the distribution of the estimated coefficients.
Large standard errors in these coefficients can result in a change in the coefficient's sign when drawn from its distribution. Moreover, these large standard errors suggest that the range of coefficients drawn from their distribution is wide, potentially amplifying the variance of the imputed household expenditures. It's worth noting that the stepwise regression approach can be susceptible to multicollinearity. To address this, the SWIFT team implements strategies to mitigate the issue. These include identifying highly correlated variables and either eliminating some or employing principal component analysis.
[5] The details of the imputation process are available in Annex 2 and Yoshida et al. (2022).

B. COLLECTION OR HARMONIZATION OF THE TARGET DATA

This subsection discusses another key component of the SWIFT framework: the target dataset. There are two possible cases regarding the availability of the target dataset: (1) the target dataset is yet to be collected or (2) the target dataset comes from an existing survey.

In the first case, a SWIFT questionnaire module is created by compiling all questions relevant to the poverty correlates in the model, ensuring that question wording and answer options stay as close as possible to the questionnaire of the training data. Additionally, it is important to evaluate whether there are any differences in seasonality and data collection months between the training data and the target data. The target dataset including the SWIFT questionnaire module is then collected. Basing the SWIFT questionnaire module on the questionnaire of the training dataset will increase the likelihood of achieving comparability between the training and target datasets, minimizing bias in poverty estimation.

In the second case, unlike the first, the target dataset is likely not designed to be comparable to the training dataset. In this case, the two datasets need to be harmonized before the modeling exercise. The harmonization results in the formation of a variable set consisting of relevant information that is comparable between the training and target datasets. The modeling process outlined above is then conducted using only the harmonized variables.

When harmonizing variables between two surveys, there are several important factors to be cognizant of. First and foremost, it is important to ensure that the way questions were asked in each survey is comparable, both in the question phrasing and in the available response options. While it is nearly impossible to find identical question phrasing and response options, they should be sufficiently aligned to confidently deem the data from both variables as comparable. For example, for any variables involving recall periods, such as household consumption of food items or food security indicators, it is important to have the same exact recall period.

After creating a set of harmonized variables, it is advisable to carefully review the summary statistics of all variables in both surveys to look for large discrepancies that cannot be explained by the difference in survey period. For instance, if the mean of bicycle ownership is 15 percent in one survey and 75 percent in another survey that was collected only 2 years later, it is reasonable to assume that this variable is not comparable between the two surveys. Closely checking the data, especially with those who are familiar with the country and data, is needed to ensure reliable results from the SWIFT estimation. It is also worth noting that such a comparability assessment should also be applied in the case where the target data is collected based on the model variables. Even when the target dataset is collected with the SWIFT questionnaire module, the data might not align perfectly with the training dataset. This can arise from differences in the multiple facets of data collection between the two surveys.

The SWIFT framework is compatible with target datasets collected by less traditional data collection methods, like phone and web surveys. However, when collecting data from these surveys, it is important to incorporate a reweighting exercise prior to using the data to ensure that results are not prone to sampling bias. For example, data from phone surveys tend to have large sampling biases towards wealthier households since there is no universal phone ownership in many developing countries. The sampling bias can be corrected by the reweighting technique outlined in Zhang et al. (2023). In addition, the responses to phone interviews can be different from those of in-person interviews. Therefore, it is important to review the comparability of the training and phone survey data. The application of SWIFT for phone surveys has been tested and thoroughly evaluated by Yoshida et al. (2023a), who conclude that SWIFT can reliably estimate the prevalence of poverty with phone survey data if the data are properly reweighted and scrutinized before the imputation.

C. IMPUTATION

Once the target dataset has been prepared so that all necessary variables are comparable to those in the training dataset, the model(s) can be used to impute household expenditure into the target dataset. The target dataset is appended to the training dataset, and multiple imputation methods are used to estimate household expenditure 20 times or more for each household. The newly imputed household expenditure can then be used to estimate poverty and inequality statistics. More details are available in Annex 2.

D. IMPROVED MODEL STABILITY WITH SWIFT PLUS AND SWIFT 2.0

In the SWIFT framework, imputation models are usually trained using the most recent household survey, which is often three to five years old. Nonetheless, post-survey events like climate shocks or financial crises can disrupt the established link between poverty indicators and household spending, leading to a change in model coefficients over time. Sticking to a model calibrated on pre-crisis data can result in biased poverty estimates, an issue known as "model instability." Consider a scenario where a significant economic downturn occurs, reducing household income and expenditure, but households have yet to adjust assets such as radios or roofing materials. If the model continues to apply pre-crisis coefficients for these variables, it will likely overstate post-crisis household spending and, consequently, underestimate poverty levels.

Yoshida et al. (2022) have shown that this model instability risk can be mitigated by incorporating rapidly changing variables into the models, such as whether a household consumed meat or experienced forms of food insecurity in the past week. This improved modeling approach, which incorporates these dynamic variables, is referred to as the "SWIFT Plus" approach, distinguishing it from the "traditional SWIFT" modeling introduced in Yoshida et al. (2015). The inclusion of fast-changing variables into the imputation models mitigates the model instability issue and enables us to produce accurate poverty rates even when there has been a sudden economic downturn (see Table 1 in the next subsection). The SWIFT Plus approach is now the recommended approach in all circumstances, as it can accommodate situations where there are sudden changes in economic conditions and situations with more gradual changes.

However, if exceptionally large climate shocks or economic crises occur, the inclusion of quickly changing variables in imputation models might not be enough to eliminate bias in the poverty estimates. To address this, Yoshida et al. (2022) suggest a more fundamental change in approach to address the model instability, referred to as SWIFT 2.0. This approach involves gathering a "mini-HBS" (a small sample size Household Budget Survey) to train imputation models that accurately reflect the present relationship between household expenditure and poverty correlates. The mini-HBS needs to use a full household expenditure module to collect household expenditure data directly. Though collecting expenditure data is resource intensive, the sample size of the mini-HBS is set to be small to save on data collection costs. The mini-HBS can then be used to train the models to impute household expenditures into a much bigger target dataset so that the sample size of imputed household expenditure is large enough to produce reliable poverty estimates. SWIFT 2.0 can be used not only when large climate shocks or economic crises occur but also when the latest household survey is so outdated that the SWIFT Plus approach cannot fully eliminate bias due to model instability.

However, the downside of this SWIFT 2.0 approach is the substantial cost and time of collecting a mini-HBS. Even though the sample size is small, collecting high quality household expenditure or income data requires a carefully designed consumption or income module and extensive training of highly qualified enumerators. Collecting high quality consumption or income data is far more complex and resource-intensive than collecting poverty correlates in the target dataset. Unless the team is familiar with the collection of full consumption or income data, SWIFT 2.0 is not recommended.

E. RELIABILITY OF SWIFT-BASED POVERTY ESTIMATES

Many machine learning models and techniques are performance tested only with within-sample tests, meaning that the data used to train the models is also the data used to test the performance of the models. Within-sample tests are prone to miss problems of over-fitting and model instability. Yoshida et al. (2022) go beyond within-sample tests and test multiple applications of SWIFT with out-of-sample tests, providing empirical evidence that SWIFT is a reliable methodology for poverty estimation. Unlike a within-sample test, an out-of-sample test shows the performance of a model within a dataset that was not used to train the model. These out-of-sample tests help to identify whether models are vulnerable to model instability.[6]

Yoshida et al. (2022) conducted an out-of-sample test introduced by Christiaensen et al. (2012), hereafter referred to as the "Christiaensen test." The Christiaensen test uses the last two rounds of a comparable household survey, both of which include household expenditure and poverty correlates. Models are trained using the first round of the survey and then used to impute household expenditure into the second round. Since there are actual household expenditure data in the second round of the survey, poverty rates can be produced based on the actual and the imputed household expenditure and then compared to see the model's performance. Table 1 shows the results of this performance test for 12 country examples.
T A B L E 1 - Comparison of poverty rates – actual expenditures and SWIFT imputed expenditures (%) ACTUAL CONSUMPTION SWIFT IMPUTATION COUNTRIES & YEARS FIRST-YEAR SECOND YEAR (SECOND YEAR) Vietnam (1992/93–97/98)* 60.6 37.4 36.7 Inner Mongolia (2000–2004)* 19.0 6.2 7.8 Kenya (1997–2005/6)* 50.8 46.6 45.5 Russia (1994–2003)* 11.4 11.1 9.2 Morocco (2001–2007)** 15.3 8.9 8.4 Afghanistan (2011–2016) 38.3 54.5 53.5 Albania (2005–2008) 17.7 12.1 13.0 Malawi (2005–2011) 51.6 50.2 49.7 Romania (2011–2012) 22.6 21.7 22.3 Rwanda (2005–2008) 56.7 44.9 43.3 Sri Lanka (2009–2012) 8.7 6.5 7.0 Uganda (2009–2012) 24.5 19.5 23.3 Source: Christiaensen et al. (2012), Douiditch et al. (2013), and Yoshida et al. (2022). For this test, it is necessary that the two survey rounds be fully Except for Uganda (2012), all poverty rates based on comparable. Unfortunately, in many developing countries, it household expenditures imputed by SWIFT models are within is not uncommon to find that two household surveys are not +/- 2 percentage points of those from actual consumption comparable due to government changes to the questionnaire, data. This comparison includes three countries (Vietnam, survey logistics, or enumerator training for newer surveys. Afghanistan, and Rwanda) with large changes in the poverty Because of this, Yoshida et al. (2022) selected surveys for rate between the first and second survey round (+/-10 countries and years where two subsequent household surveys percentage points), showing that SWIFT models can estimate are confirmed to be comparable by the World Bank’s country poverty rates well even when there is a large change in the poverty economists. poverty rate. 6 As mentioned above, the SWIFT modeling exercise incorporates a Cross-Validation exercise to prevent the over-fitting problem during each application of SWIFT; out-of- sample performance tests are instead meant to address proof of model-stability. 
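The comparison reported in Table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the second-year actual and SWIFT-imputed poverty rates from Table 1 and flags any survey pair whose gap exceeds the 2 percentage point band quoted in the text. The function and variable names are hypothetical and are not part of the SWIFT toolkit.

```python
# Second-year poverty rates (%) copied from Table 1:
# (rate from actual consumption, rate from SWIFT-imputed consumption).
TABLE_1 = {
    "Vietnam (1992/93-97/98)": (37.4, 36.7),
    "Inner Mongolia (2000-2004)": (6.2, 7.8),
    "Kenya (1997-2005/6)": (46.6, 45.5),
    "Russia (1994-2003)": (11.1, 9.2),
    "Morocco (2001-2007)": (8.9, 8.4),
    "Afghanistan (2011-2016)": (54.5, 53.5),
    "Albania (2005-2008)": (12.1, 13.0),
    "Malawi (2005-2011)": (50.2, 49.7),
    "Romania (2011-2012)": (21.7, 22.3),
    "Rwanda (2005-2008)": (44.9, 43.3),
    "Sri Lanka (2009-2012)": (6.5, 7.0),
    "Uganda (2009-2012)": (19.5, 23.3),
}

TOLERANCE_PP = 2.0  # the +/- 2 percentage point band quoted in the text


def imputation_gaps(table):
    """Absolute gap, in percentage points, between actual and imputed rates."""
    return {
        name: round(abs(actual - imputed), 1)
        for name, (actual, imputed) in table.items()
    }


def outside_tolerance(table, tolerance=TOLERANCE_PP):
    """Names of survey pairs whose gap exceeds the tolerance."""
    return [name for name, gap in imputation_gaps(table).items() if gap > tolerance]


gaps = imputation_gaps(TABLE_1)
flagged = outside_tolerance(TABLE_1)

for name, gap in gaps.items():
    print(f"{name:28s} gap = {gap:.1f} pp")
print("Outside tolerance:", flagged)
```

Under these inputs only the Uganda pair falls outside the band, which matches the statement in the text.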
F. ADVANTAGES OF THE SWIFT FRAMEWORK AND ITS CAVEATS

The SWIFT framework has been rigorously tested and refined based on a range of performance evaluations (see Yoshida et al., 2022). As highlighted previously, the reliability of SWIFT-based poverty estimates has been assessed using out-of-sample tests, including Cross-Validation and the Christiaensen test. These out-of-sample tests, which distinguish between training and testing data, are crucial for assessing the performance of model-based approaches, especially where there are concerns about overfitting and model instability.

While some advocate for traditional poverty estimation using household survey consumption data over SWIFT-based alternatives, the advantages of the traditional approach are not always clear-cut. Gathering robust expenditure data is a complex and lengthy process. For example, obtaining trustworthy household expenditure figures requires interviews exceeding an hour, during which interviewees must perform calculations to provide accurate expenditure responses. Additionally, estimating poverty statistics involves multifaceted procedures: computing consumption aggregates, adjusting for prices, and determining the poverty line. Typically, a National Statistics Office takes six months to a year to complete the entire process.

Aside from reducing the time and resources required to produce poverty estimates compared to traditional methods, the SWIFT framework also offers advantages with regard to the quality of poverty statistics. SWIFT streamlines the data collection process and requires households to answer only 10-15 straightforward questions, usually with a ‘yes’ or ‘no’ answer. Interview questions that are simpler than those in traditional expenditure questionnaire modules reduce potential errors in the reporting and compilation of data. The resulting data is also much easier to check. Even when accounting for the time needed for meticulous quality control of the target data, the SWIFT poverty estimation process can be finalized within a week.

However, it is important to note some of the SWIFT framework’s shortcomings. Despite robust out-of-sample testing, the danger of overfitting and model instability cannot be fully dismissed. Moreover, while the variables in the SWIFT model are simple, data collection demands diligent execution. Poverty statistics from SWIFT estimations will only be as good as the training and target datasets. Good data depends on several factors, including sampling, questionnaire design, enumerator training, supervision, and data processing. Special care must be directed towards questionnaire design to ensure that the phrasing and layout of the questions in the SWIFT questionnaire module match those of the training dataset. Without this attention to detail, differences in questionnaires could introduce reporting biases, thereby distorting poverty estimation. Yoshida et al. (2014) offer comprehensive guidelines for appropriate data collection and processing.

The results of the SWIFT framework should be scrutinized by experts familiar with the country’s socio-economic conditions. They can examine whether all model variables look reasonable and whether the poverty estimates are consistent with other socio-economic conditions of the country. Therefore, at the World Bank, the SWIFT framework is typically utilized by SWIFT experts in conjunction with country poverty economists.

Lastly, it should be emphasized that the SWIFT framework is intended to complement, rather than replace, the traditional household budget or income survey. The success of poverty estimation using SWIFT is contingent upon the availability of high-quality household budget or income survey data. With models trained on a quality household budget survey, SWIFT can produce annual or even quarterly poverty estimates during the periods when household budget survey data is unavailable.

3 Poverty Estimation with SWIFT: A Flexible Framework for Diverse Use Cases

This report focuses on how the SWIFT framework has been successfully used to enhance both the frequency and quality of official poverty data, notwithstanding its application in a variety of other contexts. To do so, we present a comprehensive exploration of four distinct use cases, accompanied by in-depth illustrations of nine country cases.

A. INCREASING THE FREQUENCY OF POVERTY STATISTICS USING AVAILABLE FREQUENT SURVEYS

If a country already has a frequent survey in place, frequent poverty estimates can be produced with minimal additional cost and time. This report will showcase three examples where SWIFT was effectively deployed with existing surveys.

In Paraguay, SWIFT has been utilized to calculate poverty rates for six quarterly surveys from 2019 to 2022. The successful outcome has motivated the continuation of this approach into the foreseeable future, promising even more timely and accurate poverty statistics.

Similarly, in Botswana, SWIFT has been used to generate poverty rates for six quarterly surveys conducted between 2019 Q3 and 2022 Q4. These estimations were carried out using the Quarterly Multi-Topic Survey (QMTS). Close collaboration with the National Statistics Office in the country has encouraged the continuation of this initiative to ensure up-to-date poverty statistics.

In the Democratic Republic of Congo (DRC), SWIFT was used to impute official poverty statistics into the Multiple Indicator Cluster Survey (MICS) of 2017/18.
This was a significant achievement, as it filled a substantial gap in the DRC’s poverty data, which had not been updated since 2012. Implementing SWIFT has thus facilitated the availability of crucial poverty-related insights, providing valuable support for policymakers and researchers in the region.

B. PRODUCING POVERTY STATISTICS WHEN AN APPROPRIATE TRAINING DATASET IS NOT ALREADY AVAILABLE (SWIFT 2.0)

SWIFT models cannot yield reliable poverty rates in circumstances where there are no appropriate training datasets, particularly if the datasets are dated or if they were collected before a significant economic crisis or natural disaster. Either scenario may alter the relationship between the poverty correlates and household expenditures, undermining the model’s accuracy. In response to these challenges, a viable solution is to collect a mini-Household Budget Survey (mini-HBS) specifically for use as training data (i.e., SWIFT 2.0).

This report highlights a compelling case study from Zimbabwe, where a subset of a 2019 national survey was successfully used as a mini-HBS to train the SWIFT models. The Zimbabwe Statistics Office and the World Bank collaborated on how best to produce poverty statistics for the 2019 survey following the SWIFT approach, opting to collect new data to train the models rather than using the most recent household survey data from 2017. This decision stemmed from the significant economic crisis in Zimbabwe in 2019, which likely meant that the 2017 data would not accurately capture the relationship between household expenditure and poverty proxies in 2019.

Moreover, the SWIFT 2.0 approach provides options to mitigate some of the costs associated with generating official poverty statistics, as evidenced by an application of SWIFT in South Sudan. The lack of a comprehensive household survey with complete expenditure data over an extended period posed a significant challenge for poverty estimation in the country. This was primarily due to the prohibitive costs associated with implementing such surveys. To circumvent this issue and provide more current poverty statistics, the World Bank, in collaboration with the National Statistical Office of South Sudan, opted to collect a mini-HBS to use as the training data and then impute household expenditure into data collected by a third party, resulting in state-level poverty estimates. This initiative is still underway at the time of this report’s preparation and, as such, is not included in the Country Example section provided below.

C. RAPID POVERTY MONITORING WITH NON-TRADITIONAL DATA COLLECTION

Not all countries have the luxury of conducting frequent household surveys, which can hinder their ability to monitor poverty, even with SWIFT. Frequent data collection can be expensive, making it impractical to produce frequent poverty estimates. To address this challenge, multiple initiatives have begun to collect frequent data using innovative approaches such as phone interviews and community-based data collection systems. This report will focus on two prominent examples of how SWIFT has effectively utilized data from these approaches to address poverty monitoring needs.

In Malawi, SWIFT has been an indispensable component of the Rapid Feedback Monitoring System (RFMS). RFMS is a community-based data collection system in which the enumerators live within the local communities they survey, reducing travel costs and allowing data collection in the immediate aftermath of shocks. By utilizing data from RFMS, SWIFT effectively monitored the impact of climate events, such as droughts and floods, on poverty. The community-based data collection in conjunction with SWIFT proved to be a cost-effective, rapid, and agile way to track poverty dynamics in response to changing environmental conditions. The success of this approach is especially significant given the lack of frequent poverty data in low-income countries like Malawi.

Likewise, in Uganda, SWIFT played a pivotal role in monitoring the poverty status of refugees by harnessing data from High-Frequency Phone Surveys. In contrast to traditional data collection methods that entail extensive face-to-face interactions, phone surveys offer a more cost-effective alternative. This cost reduction is particularly valuable when it is too challenging or too costly to conduct traditional surveys. The integration of phone surveys into the SWIFT framework can enhance the feasibility and cost-effectiveness of poverty monitoring efforts.

D. RESTORING COMPARABILITY TO REESTABLISH A POVERTY TREND

It is a common challenge that the expenditure data produced from a new household survey is no longer directly comparable to that of past surveys. This lack of comparability makes it difficult to assess whether the prevalence of poverty has increased or decreased over time. However, the SWIFT framework provides a way to restore comparability between two household surveys.

This is achieved by harmonizing a set of comparable poverty correlates between both surveys. One survey is used to train a SWIFT model, which is then used to impute comparable household expenditure into other surveys. This imputed data can then be used to estimate poverty trends over time. The survey used to train the SWIFT model can be either the most recently collected household survey or an older survey, allowing for two types of imputation:

● Forward imputation: An older household survey is used to train the SWIFT model. The model is then used to impute household expenditures forward into the newer survey(s). This approach is valuable when the goal is to estimate poverty trends based on the older data.

● Backward imputation: The newest household survey is used to train the SWIFT model. Household expenditures are then imputed backward into the older survey(s), making the data from these older surveys comparable to the most recent one.

This report presents three examples illustrating the effective application of SWIFT in these imputation scenarios. The cases of Mongolia and Zambia will highlight forward imputation, where a previously collected household survey is used to train the SWIFT model to impute expenditures into the most recent household survey. In contrast, the case of Nigeria will demonstrate backward imputation, where SWIFT models were trained on the most recent household survey and household expenditures were imputed into older surveys.

4 Cost Implications of the SWIFT Framework

This section explores the cost implications of the above-mentioned applications of SWIFT and other implementation challenges. The actual savings in poverty estimation costs vary depending on the data environment of each country.

Costs associated with SWIFT are incurred from data collection, model creation, and the imputation process required to produce poverty estimates. The cost of data collection for SWIFT varies widely since SWIFT questions can often be added to an existing survey at very low marginal cost. Costs for modeling and estimation are more standard. Based on the World Bank’s experience, the cost of creating SWIFT models is relatively fixed, ranging from USD 3,000 to 5,000.
Similarly, the costs associated with imputation work for each round of target data hover around USD 3,000 to 5,000. Considerations for implementing SWIFT should be framed within the following contexts:

(I) FREQUENT HOUSEHOLD SURVEY ALREADY AVAILABLE

When a frequent household survey is already in place, like in Paraguay, Botswana, and DRC, poverty rates can be estimated at a very low marginal cost. The incorporation of 10-15 questions into an existing survey minimally impacts interview time or survey logistics, keeping implementation costs largely unchanged. The primary SWIFT-related costs arise from preparing the SWIFT model, formulating relevant questions, and estimating poverty rates once the target data becomes available. As outlined above, this translates to approximately USD 3,000 to 5,000 to create the SWIFT models and an additional USD 3,000 to 5,000 for poverty estimates per round of target data.

(II) NO APPROPRIATE DATA FOR TRAINING

When there is no appropriate data to train the SWIFT model, like in Zimbabwe and South Sudan, the best course of action is to collect a mini-HBS as the training dataset (i.e., a dataset that contains household expenditures or incomes and key poverty correlates). The cost in this scenario depends on the size of the mini-HBS. For example, in South Sudan, a mini-HBS of around 1,000 households used as training data incurred a cost of USD 200,000. At the same time, other development partners collected a household survey with a much bigger sample size (around 15,000 households) and agreed to incorporate the SWIFT questions into their questionnaire; this data was used as the target data. As a result, the only additional cost for the World Bank team was to train the SWIFT models and impute household expenditures, which cost around USD 3,000 to 5,000, as mentioned above. When implementing SWIFT, it is extremely valuable to take advantage of other large multi-topic surveys conducted by the government or other development partners.7

(III) NO FREQUENT HOUSEHOLD SURVEY AVAILABLE, BUT PLANS FOR A FREQUENT PHONE SURVEY

In countries lacking a frequent survey, implementing SWIFT requires establishing a new frequent survey that incorporates the SWIFT questionnaire module. If this survey is conducted via phone interviews, the cost of data collection based on the World Bank’s experience ranges from USD 15 to 20 per household. The phone survey should cover at least 1,000 households to achieve nationally representative poverty statistics, resulting in a data collection cost of approximately USD 15,000 to 20,000 per round. Alongside data collection expenses, there are the usual costs of modeling and estimation, as well as designing the SWIFT questionnaire module to be incorporated into the phone survey. All factors considered, the estimated total cost of the SWIFT framework per round falls within the range of USD 20,000 to 25,000. It is important to note the potential bias in phone surveys due to lower phone ownership among poor households. Zhang et al. (2023) show how to correct this sampling bias.

7 Fujii and Van der Weide (2020) offer a framework for assessing the cost implications associated with achieving a specified statistical accuracy in poverty estimates under SWIFT 2.0, considering various data availability scenarios.

(IV) NO FREQUENT HOUSEHOLD SURVEY AVAILABLE, BUT PLANS FOR COMMUNITY-BASED DATA COLLECTION

This subsection introduces a novel data collection initiative: community-based data collection. In a community-based data collection system, the enumerator lives in the same village where sample households reside. Recent cost analysis shows that community-based data collection substantially reduces the expenses associated with estimating poverty statistics when incorporating SWIFT-based poverty estimates. With a cost of USD 2 per household and an estimated sample of 1,000 households, the data collection expense is approximately USD 2,000 per round. Even when accounting for the cost of designing the SWIFT models and questionnaire and estimating poverty statistics, the total cost of implementing the SWIFT framework remains between USD 5,000 and 7,000 per round.

Moreover, community-level monitoring has a broader scope than merely monitoring poverty with immediacy and frequency. In Malawi, this approach has been pivotal in overseeing food security and climate adaptation and mitigation. It can also be used to assess the performance of policies and projects spearheaded by government and development collaborators.

To cater to these multifaceted roles, Malawi has set up a governing entity for its community-based data collection mechanism. This body streamlines tasks like responding to requests, hiring and training enumerators, creating and overseeing questionnaires, and cleaning and processing data. A governing body is integral to community-based data collection systems and simplifies processes significantly, but in Malawi, establishing the governing body with monthly data collections cost around USD 3 million for a five-year term.

To make the community-based data collection approach genuinely cost-effective, specific conditions should be met:

● High Frequency: Regular utilization of the community-based data collection system will enhance its cost-effectiveness. Spreading the initial fixed cost over multiple data collection rounds diminishes the cost per round. In Malawi, community-based data collection has been implemented every month since August 2020 and will span five years, bringing the establishment cost per survey round to about USD 50,000.

● Large Sample Size: Achieving optimal cost-effectiveness demands a significant number of households in each data collection round, which spreads the governing body’s establishment cost over the sample size and the survey round. Currently, with a sample size of about 4,000 households surveyed monthly, the average cost of the governing body per household is USD 12.5. If the sample size were to increase to 8,000, this cost would drop to USD 6.3. Including the survey’s operational costs, the average cost per household would be USD 8.3, notably less than a phone survey’s unit cost.

These conditions can be met if many development partners and government agencies use the community-based data collection system together. This is possible and mutually beneficial due to the system’s flexibility in incorporating different questionnaire modules from round to round.

That said, the cost of establishing the governing body could be reduced by leveraging existing infrastructure capable of handling a wide range of requests for community-based data collection. For instance, National Statistics Offices (NSOs) are often adept at accommodating multiple data inquiries and at designing and conducting household surveys. Beyond an NSO, utilizing the skills and assets of existing institutions can further enhance the economic viability of community-based data collection.
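The per-round and per-household figures quoted for phone surveys and for Malawi's community-based system follow from simple arithmetic. The sketch below reproduces them under stated assumptions: the five-year term is taken as 60 monthly rounds, and the operational cost is assumed to be the USD 2 per household quoted for community-based data collection. All names are hypothetical and chosen for illustration.

```python
# Figures quoted in the text for Malawi's community-based data collection system.
ESTABLISHMENT_COST_USD = 3_000_000  # governing body, five-year term
TERM_YEARS = 5
ROUNDS_PER_YEAR = 12                # monthly data collection
# Assumption: operational cost equals the USD 2 per household quoted in the text.
OPERATIONAL_COST_PER_HH_USD = 2.0


def establishment_cost_per_round(total=ESTABLISHMENT_COST_USD,
                                 years=TERM_YEARS,
                                 rounds_per_year=ROUNDS_PER_YEAR):
    """Fixed establishment cost spread evenly over all survey rounds."""
    return total / (years * rounds_per_year)


def cost_per_household(sample_size, operational=0.0):
    """Establishment cost per household in one round, plus optional operational cost."""
    return establishment_cost_per_round() / sample_size + operational


def phone_survey_cost_per_round(households=1_000, unit_cost_range=(15, 20)):
    """Low and high data collection cost of one phone survey round (USD)."""
    low, high = unit_cost_range
    return households * low, households * high


per_round = establishment_cost_per_round()
print(f"Establishment cost per round: USD {per_round:,.0f}")
for n in (4_000, 8_000):
    print(f"Governing body cost per household (n={n}): USD {cost_per_household(n):.2f}")
total_8000 = cost_per_household(8_000, OPERATIONAL_COST_PER_HH_USD)
print(f"Including operations (n=8000): USD {total_8000:.2f}")
print("Phone survey cost per round (USD):", phone_survey_cost_per_round())
```

The unrounded values for a sample of 8,000 households are USD 6.25 and USD 8.25, which the text reports as USD 6.3 and USD 8.3.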
5 Brief Summary of the SWIFT Framework

As described above, the SWIFT framework includes many steps to produce reliable poverty estimates. This section offers a walkthrough on how to obtain reliable poverty estimates with SWIFT, illustrated through a framework outlined in Figure 1 and detailed in Table 2.

First, potential users of SWIFT must check whether there is an appropriate training dataset. An appropriate training dataset will vary by context, but in all cases, the primary consideration is whether the relationship between the poverty correlates and household expenditure will be similar between the training data and the target data. If there is an extremely long period of time between the training and the target data, the assumption that this relationship remains stable becomes implausible. This happens frequently, as in many countries the latest available HBS is 10 or more years old. Similarly, this assumption is more difficult to justify if a country experiences a significant climate shock or economic crisis between the time when the training and target data were collected. Under these circumstances, even if fast-changing variables are included in the models, there remains a significant risk of model instability, which means the SWIFT models may fail to estimate poverty rates accurately.

If there is no appropriate training dataset, the best course of action is to collect a mini-HBS to use as the training dataset. This approach, called SWIFT 2.0, was adopted in Zimbabwe and South Sudan. By limiting the sample size of the mini-HBS, the cost of data collection can be reduced, though it can still cost upwards of USD 200,000.

After identifying (or collecting) an appropriate training dataset, the target dataset needs to be identified. The target dataset will vary by the goal of the project and the availability of data within a country. If the goal is to increase the frequency of poverty statistics, it is best to first look at any existing frequent surveys that can be used to impute poverty rates. In some countries, there are frequent household surveys which do not contain household expenditure or income data, such as labor force surveys. These types of surveys can be used as the target datasets after harmonizing enough variables with the training data. If the goal is to restore poverty trends, the target dataset will be another HBS. Similarly, SWIFT models can be used to impute poverty rates after harmonizing the two datasets.

If there is no target dataset, it will first need to be collected. It is important to ensure the target dataset includes many questions that are the same or similar to those of the training dataset so that they can be easily harmonized. SWIFT modeling has proven to work effectively with three types of data collection. First, data can be collected via a traditional survey, where the survey team travels around the country to conduct interviews.8 Second, the data can be collected via phone interviews or web surveys. Phone interview or web survey data have possible sampling and reporting biases, but these can be addressed by reweighting and careful review of the data. Third, the data can be gathered via a community-based data collection system where enumerators live within the same village or district as the sample households. The cost implications of each of these data collection methods are outlined in the section above.

Once both the training and target datasets are available, it is essential to harmonize the variables and assess their comparability. The alignment of variables is crucial; even if they are identically defined across both datasets, differences in the data collection process might render them incomparable. To evaluate this comparability, one should engage experts familiar with the country and context and consider other data sources. When variables are confirmed to be comparable, SWIFT models, trained using the training dataset, are employed to estimate poverty statistics in the target dataset. However, if there are insufficient comparable variables to accurately impute household expenditures, relying on SWIFT for poverty estimation may not be advisable.

FIGURE 1 - Steps of the SWIFT framework

1. Is there an appropriate* training dataset?
   Yes: proceed to step 2.
   No: implement a mini-HBS** to use as training data, then proceed to step 2.
2. Is there an existing target dataset?
   Yes: proceed to step 3.
   No: collect target data that includes poverty correlates to use as target data, then proceed to step 3.
3. Harmonization and comparability test of variables.
4. Conduct imputations using the SWIFT model.

Notes:
* “Appropriate” training data will be relative to the context, as explained above.
** “Mini-HBS” is used as shorthand to refer to a relatively small, nationally representative survey that includes poverty correlates and full consumption data.

8 Yoshida et al. (2015) describes details of how to collect data.

TABLE 2 - Classification and activities of SWIFT analysis

| | Training data exists: YES | Training data exists: NO |
| Target data exists: YES | Conduct a harmonization exercise between the training and target data prior to using SWIFT to impute poverty. Examples: Paraguay, Botswana, DRC, Mongolia, Zambia, Nigeria | Collect a mini-Household Budget Survey to use as the training data, ensuring that the collected poverty correlates are comparable to the existing target data. Example: South Sudan (results not yet available) |
| Target data exists: NO | Develop SWIFT models from the training data, produce SWIFT questionnaire modules, and collect the necessary poverty correlates in the target data prior to using SWIFT to impute poverty. Examples: Malawi, Uganda | Collect a mini-Household Budget Survey to use as the training data and collect target data that contains the same poverty correlates. Example: Zimbabwe |

6 Country Examples

Since its debut in 2014, SWIFT-based poverty estimation has continued to improve and expand, both with regard to the modeling methodology and to data collection recommendations. SWIFT has proven to be user-friendly and flexible enough to cover a variety of use cases.

The following section covers examples of SWIFT application in nine different countries in a variety of contexts. The first three examples (Paraguay, Botswana, and DRC) show how SWIFT has been used in conjunction with existing frequent surveys to provide poverty statistics. Example four (Zimbabwe) shows how SWIFT was adapted to provide poverty estimates where there were neither time nor funds to conduct a full household survey during a sharp economic downturn. Country cases five and six (Malawi and Uganda) cover SWIFT in use with non-traditional forms of data collection: community-based data collection and phone surveys, respectively. The final three examples (Mongolia, Zambia, and Nigeria) showcase how SWIFT can be used to restore comparability between household surveys to reestablish a poverty trend.
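The decision steps of Figure 1 and the classification of country examples in Table 2 can be written down as a small lookup. This is an illustrative sketch only: the step wording paraphrases the figure, and the function name and dictionary are hypothetical rather than part of any SWIFT software.

```python
from typing import Dict, List, Tuple


def swift_steps(training_data_exists: bool, target_data_exists: bool) -> List[str]:
    """Ordered activities implied by Figure 1 for a given data situation."""
    steps = []
    if not training_data_exists:
        steps.append("Implement a mini-HBS to use as training data")
    if not target_data_exists:
        steps.append("Collect target data that includes poverty correlates")
    steps.append("Harmonize variables and test their comparability")
    steps.append("Conduct imputations using the SWIFT model")
    return steps


# Country examples by (training data exists, target data exists), as listed in Table 2.
TABLE_2_EXAMPLES: Dict[Tuple[bool, bool], List[str]] = {
    (True, True): ["Paraguay", "Botswana", "DRC", "Mongolia", "Zambia", "Nigeria"],
    (False, True): ["South Sudan"],
    (True, False): ["Malawi", "Uganda"],
    (False, False): ["Zimbabwe"],
}

for (has_training, has_target), countries in TABLE_2_EXAMPLES.items():
    print(f"training={has_training}, target={has_target}: {', '.join(countries)}")
    for number, step in enumerate(swift_steps(has_training, has_target), start=1):
        print(f"  {number}. {step}")
```

Every path ends with the same two steps, harmonization followed by imputation; the cells of Table 2 differ only in which datasets must be collected first.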
(I) PARAGUAY: UTILIZING QUARTERLY SURVEYS TO PROVIDE POVERTY STATISTICS [9]

TYPE OF APPLICATION: INCREASING THE FREQUENCY OF POVERTY STATISTICS BY USING AVAILABLE FREQUENT SURVEYS
Training data: Permanent Household Survey (EPHC) 2022, Quarter 4
Target data: Permanent Household Survey (EPHC) 2019, 2020, 2021, and 2022, Quarters 1, 2, and 3
Version of SWIFT: SWIFT Plus
Types of model variables: Household demographics, income & employment variables
Model disaggregation: Urban, Rural
Cost: Modeling: USD 5,000. Data collection: 0 (both training and target datasets were available when the SWIFT-based poverty estimation was conducted). Estimation per round: USD 3,000.

Unlike many of the other country case examples in this report, Paraguay does not suffer from data deprivation as defined by the World Bank (Serajuddin et al. 2015). That said, more frequent poverty statistics remain important for addressing the impacts of economic shocks in the country, such as recent droughts and high inflation (Rubiano, 2023).

The Permanent Household Survey (EPHC) is conducted quarterly in Paraguay, with the first three quarters targeting employment conditions and the fourth quarter comprehensively measuring household well-being. Using SWIFT, poverty estimates have been made available not just for the fourth quarter but also for the first three quarters, where income data was limited.

Throughout the project, the training data for the SWIFT model has been updated to the most recent fourth-quarter survey of EPHC, which at the time of publication of this report is EPHC 2022 quarter four. Separate models were created for urban and rural areas to capture differences in the relationship between the poverty correlates and household income in these two areas. National estimates were produced as the aggregate of rural and urban imputations.

Because poverty estimates are produced quarterly, the inclusion of fast-changing variables in the models is important for capturing possible sudden changes in economic conditions. Income-related variables [10] were included in the models for this purpose. As sub-components of aggregated household income, these variables are directly associated with household poverty status and have shown high comparability (in terms of mean and standard deviation) between the first three quarters and the fourth quarter, enhancing the model's potential predictive power.

The model was performance tested and validated by applying the model (which was trained on the fourth quarter of 2022) to the fourth quarters of 2021, 2020, and 2019. Results show that the model can produce poverty rate estimates close to the actual values: the absolute difference is below two percentage points in every case except rural areas in 2019 Q4, where it is 2.3 percentage points. Additionally, the model tracks changes in the poverty rate over time well, especially when comparing the results for urban areas between the fourth quarter of 2019 and the fourth quarter of 2021. This is an instance where the SWIFT model could be performance tested not just with a within-sample test (within the training data) but also with an out-of-sample test (within data not used to train the model), bolstering confidence in the reliability of its estimates. The results of the model validation are shown in Table 3.

TABLE 3 - Paraguay model validation results

            2019 Q4            2020 Q4            2021 Q4            2022 Q4
            ACTUAL   IMPUTED   ACTUAL   IMPUTED   ACTUAL   IMPUTED   ACTUAL   IMPUTED
National    23.50%   23.90%    26.90%   27.70%    26.90%   25.90%    24.70%   24.50%
Urban       17.50%   18.90%    22.70%   24.10%    22.40%   21.80%    19.50%   20.20%
Rural       33.40%   31.10%    34.00%   34.00%    34.60%   32.80%    33.80%   32.50%

Source: Authors' estimation using EPHC 2019, 2020, 2021, and 2022.

After the model was validated in the 2019, 2020, and 2021 quarter-four surveys, it was applied to all quarter one, two, and three datasets to impute household income and produce poverty estimates. Figure 2 shows the estimated quarterly poverty rates from 2019 to 2022. Estimates were not produced for all quarters due to variable availability and comparability. For instance, income-related variables were not available for the first, second, and third quarters of 2020.

FIGURE 2 - Paraguay quarterly poverty projection results
[Figure: poverty rates (percent) for the National, Urban, and Rural series at 19Q1, 19Q2, 19Q3, 19Q4, 20Q4, 21Q1, 21Q3, 21Q4, 22Q1, and 22Q4.]
Note: Poverty rates for the fourth quarter (Q4) correspond to the data published by the INE. The figures for the other quarters are based on the model estimate.
Source: Authors' estimation using EPHC 2019, 2020, 2021, and 2022.

The NSO's recognition of the value of more frequent poverty statistics prompted close collaboration between the NSO and the SWIFT team at the World Bank. At the time of publication of this report, the SWIFT team and the government of Paraguay are working on a report to collect and share lessons from the program.

[9] This section was prepared by Xueqi Li, Eliana Carolina Rubiano Matulevich, and Nobuo Yoshida.
[10] Income-related variables were transformed into logarithmic form before being included in the variable pool. First, all income variables were adjusted for inflation based on the Consumer Price Index (CPI) published by the Central Bank of Paraguay. Second, the logarithm was taken of the inflation-adjusted values. To avoid creating missing values for any variable with a zero value, the logarithm was taken after adding 1 to the inflation-adjusted values of all income-related variables. This transformation prevented missing values from reducing the sample size for the regression.
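The validation described above comes down to comparing actual and imputed poverty rates quarter by quarter. The following is a minimal sketch of that arithmetic using the figures reported in Table 3; the dictionary layout and the function name `absolute_gaps` are illustrative choices, not part of the SWIFT toolkit.

```python
# Actual and imputed poverty rates (percent), copied from Table 3.
# 2022 Q4 is the training quarter (within-sample); the others are out-of-sample.
TABLE_3 = {
    "2019 Q4": {"National": (23.5, 23.9), "Urban": (17.5, 18.9), "Rural": (33.4, 31.1)},
    "2020 Q4": {"National": (26.9, 27.7), "Urban": (22.7, 24.1), "Rural": (34.0, 34.0)},
    "2021 Q4": {"National": (26.9, 25.9), "Urban": (22.4, 21.8), "Rural": (34.6, 32.8)},
    "2022 Q4": {"National": (24.7, 24.5), "Urban": (19.5, 20.2), "Rural": (33.8, 32.5)},
}


def absolute_gaps(table):
    """Return {(quarter, area): |imputed - actual|} in percentage points."""
    return {
        (quarter, area): round(abs(imputed - actual), 1)
        for quarter, areas in table.items()
        for area, (actual, imputed) in areas.items()
    }


gaps = absolute_gaps(TABLE_3)
largest = max(gaps, key=gaps.get)

for (quarter, area), gap in sorted(gaps.items()):
    print(f"{quarter} {area:<8} gap = {gap:.1f} pp")
print("Largest gap:", largest, gaps[largest], "pp")
```

Listing the gaps this way makes it easy to see which area and quarter drive the largest deviation before the model is applied to the first three quarters.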
(II) BOTSWANA: UTILIZING QUARTERLY LABOR FORCE SURVEYS FOR POVERTY STATISTICS [11]

TYPE OF APPLICATION: INCREASING THE FREQUENCY OF POVERTY STATISTICS BY USING AVAILABLE FREQUENT SURVEYS
Training data: Botswana Multi-Topic Household Survey (BMTHS) 2015/16
Target data: Quarterly Multi-Topic Survey (QMTS), Q3 2019 to Q4 2022
Version of SWIFT: Standard SWIFT [12]
Types of model variables: Districts and regions, household demographics, education, employment, sources of household income, dwelling conditions, food security
Model disaggregation: Cities & towns, Urban villages, Rural villages
Cost: Data harmonization & modeling: USD 20,000. Data collection: 0 (both training and target datasets were available when the SWIFT-based poverty estimation was conducted). Estimation per round: USD 3,000.

Poverty in Botswana has declined significantly since 2002, reaching 16 percent in 2015, the year of the most recently available income/expenditure household survey. That said, Botswana remains one of the most unequal countries in the world, with rural poverty far exceeding the national average. The Botswana Poverty Assessment (PA) in 2015 highlights the strong reduction in poverty over this period but notes that much of the population remains poor or vulnerable to falling below the poverty line, especially in rural areas. The PA's recommendations for poverty reduction include improving survey data for evidence-based policymaking, particularly by increasing the frequency of poverty statistics. Like most countries in the region, Botswana does not conduct frequent surveys to track poverty. In fact, in the last 20 years, only three such surveys have been conducted: the Household Income and Expenditure Survey (HIES) 2002/03, the Botswana Core Welfare Indicators Survey (BCWIS) 2009/10, and the Botswana Multi-Topic Household Survey (BMTHS) 2015/16.

To fill these data gaps, SWIFT is being used to impute poverty into Botswana's labor force surveys, the Quarterly Multi-Topic Survey (QMTS), which has been collected six times between 2019 Q3 and 2022 Q4. The models were trained using the entire BMTHS 2015/16 data, including Q1-Q4 data.

The variable set used for the modeling was limited to what could be harmonized between BMTHS 2015/16 and QMTS. This included household demographics, dwelling characteristics, sources of household income, and education and employment information. However, the harmonized variable set was further limited for QMTS 2020 Q4 and QMTS 2021 Q4, as these two surveys did not collect information on household dwelling conditions or sources of household income. Additionally, food security information comparable to BMTHS 2015/16 was collected in 2022 Q4.

Due to the difference in poverty levels across regions in Botswana, separate models were created for cities and towns, urban villages, and rural villages. In addition, three sets of models were created with different variable sets to cover the differing variable availability across the quarterly surveys: (1) full, (2) limited, and (3) full + FIES (Food Insecurity Experience Scale). The full model utilizes the larger variable set that was harmonized between BMTHS 2015/16 and QMTS, including dwelling conditions and sources of household income. The limited model utilizes the smaller harmonized variable set, excluding dwelling conditions and sources of household income, so that poverty projections could be produced for QMTS 2020 Q4 and QMTS 2021 Q4. Finally, the full + FIES model utilizes the full variable set with the addition of a food security variable; it is only used to produce estimates for 2022 Q4.

For quarterly monitoring of poverty, it would be more appropriate to incorporate fast-changing poverty correlates, such as food and non-food consumption or food security indicators, into the models. However, because QMTS is focused primarily on labor conditions, there are no available consumption variables to harmonize. For food security, the reference period of the FIES questions differs between BMTHS 2015/16 and QMTS 2019 Q3 through 2021 Q4, making the data not comparable. After feedback and collaboration with Statistics Botswana, a comparable reference period for the food security questions was added to QMTS 2022 Q4, allowing for the creation of new models that include this data.

For QMTS 2019 Q3 through 2020 Q1, where both the full and limited models could be applied, the imputed poverty rates are similar: all differ by less than two percentage points, and most by less than one percentage point. The full + FIES models also show near-identical results to the regular full model in QMTS 2022 Q4. This supports the accuracy of both the limited and full models, but it should be noted that the full + FIES model still has a significantly lower proportion of fast-changing poverty correlates (the food security indicator) than slow-changing poverty correlates.

Results show that poverty at the national level has remained relatively stagnant, with a slight increase in 2021 Q4 and then a decrease in 2022 Q4. The trend in urban areas closely follows that of the national level. In cities and towns, the poverty rate has remained essentially the same over the entire period, while rural areas show fluctuation: a decline in poverty in 2020 Q1, an increase through 2021 Q4, and another decrease in 2022 Q4.

FIGURE 3 - Botswana poverty projection results
[Figure: poverty rates (percent, axis from 0% to 30%) for BMTHS 2015/16 and QMTS 2019 Q3, 2019 Q4, 2020 Q1, 2020 Q4, 2021 Q4, and 2022 Q4. Series: rural villages (RV), national (NAT), urban villages (UV), and cities and towns (CT), each under the FULL, LIM, and FIES models.]
Note: For visual clarity, the data labels in the figure are only for the limited models.
Source: Prepared by Danielle Victoria Aron and Carolina Diaz-Bonilla using BMTHS 2015/16 and QMTS.

This work is one of the first attempts in Africa to monitor official, comparable poverty statistics quarterly. Furthermore, the SWIFT team at the World Bank has worked closely with Statistics Botswana during this project, paving the way for future efforts to improve the models and projections. As a result of this collaboration, Statistics Botswana has already made some amendments to the QMTS (starting with 2022 Q4) that will allow for comparability on some indicators of household food security. Including these types of fast-changing indicators will make the models better suited to capturing sudden changes in economic conditions. Further efforts like this will be important steps towards routinely producing reliable poverty estimates for Botswana.

[11] This section was prepared by Danielle Aron with inputs from Carolina Diaz-Bonilla and Nobuo Yoshida, based on Statistics Botswana and World Bank 2024 (forthcoming).
[12] The type of SWIFT for Botswana is still primarily considered to be a standard SWIFT approach, despite the inclusion of food security, a fast-changing variable, in one version of the models. Collaboration with Statistics Botswana is moving the modeling for the country towards SWIFT Plus, but currently the proportion of fast-changing variables is too small to consider the models to be SWIFT Plus.

(III) DEMOCRATIC REPUBLIC OF CONGO: PROVIDING A POVERTY TREND DESPITE A LACK OF COMPARABLE HOUSEHOLD SURVEYS [13]

TYPE OF APPLICATION: INCREASING THE FREQUENCY OF POVERTY STATISTICS BY USING AVAILABLE FREQUENT SURVEYS
Training data: Household Budget Survey (HBS) 2012
Target data: Multiple Indicator Cluster Survey (MICS) 2017/18
Version of SWIFT: Standard SWIFT
Types of model variables: Provinces, household demographics, education, asset ownership, dwelling characteristics
Model disaggregation: Urban and rural models, plus province-specific models for Kinshasa, Kongo Central, Nord-Ubangi, Mongala, Tshopo, Haut-Uele, Ituri, Sud-Kivu, Tanganyika, and Sankuru
Cost: Data collection: 0 (both training and target datasets were available when the SWIFT-based poverty estimation was conducted). Data harmonization, modeling, & estimation: USD 10,000.

The Democratic Republic of Congo (DRC) is one of the poorest countries in the world. In the past several decades, only two Household Budget Surveys (HBS) have been conducted to calculate official poverty statistics, in 2005 and 2012, making the DRC extremely data-deprived. Over this period, the national poverty rate declined marginally from 69.3 to 64 percent. Beyond 2012, there have been no national surveys collecting information on monetary welfare that could be used to produce a poverty trend (Batana, 2023).

To fill this data gap, SWIFT was used to impute poverty into a more recent survey in DRC to extend the poverty trend. In 2017/18, a Multiple Indicator Cluster Survey (MICS) was conducted within the country. However, this survey did not collect complete consumption information for households. To produce poverty rates for MICS 2018, SWIFT was used to impute household consumption.

Using 2012 HBS data, two separate SWIFT models were created for rural and urban households to capture fundamental differences in the relationship between the poverty correlates and household consumption in the two areas. The models were tested within single provinces to ensure that poverty rates could be reliably disaggregated at lower levels. A province-specific model was developed for the provinces that showed significant differences between actual and imputed consumption and poverty. This was the case for eleven provinces: Kinshasa, Kongo Central, Nord-Ubangi, Mongala, Tshopo, Haut-Uele, Ituri, Sud-Kivu, Tanganyika, Sankuru, and Kasai-Central. All the province-specific models showed an improvement in performance for these areas compared to the overall urban and rural models, except for Kasai-Central, where the urban and rural models were used instead.

Unlike a standard SWIFT approach that uses 20 rounds of imputation, 100 rounds of imputation were performed for each household to estimate consumption. The average of the 100 resulting poverty rates was taken as the final poverty headcount.

Variables used in the models were limited to what could be harmonized between the 2012 HBS and the 2018 MICS. That said, concerns remain over the final comparability of the harmonized datasets. Different institutions collected the two surveys, and the most recent population census was published in 1985, making any population information very outdated.

Another significant difference between the two surveys was the urban population shares. Unfortunately, due to a lack of data at the province level, reweighting exercises could not be used to correct this issue widely. The urban share was adjusted for one province (Haut-Uele) that showed seemingly unreliable results, from 64 to 13 percent based on HBS 2012 data.

Results of the SWIFT work show that poverty headcount rates likely declined from 2012 to 2018 at all three sub-national levels, with a marginal decline nationally from 63.9 to 59.6 percent. The largest decline in poverty occurred in urban areas outside Kinshasa, with a nine-percentage-point decrease. Rural areas and Kinshasa saw decreases in poverty of about two percentage points, with rates remaining twelve percentage points higher in rural areas than in Kinshasa. Table 4 shows the poverty rates for 2012 and 2018 for all SWIFT models.

TABLE 4 - DRC poverty estimates

                      2012    2018
National              63.9    59.6
Kinshasa              52.8    50.7
Other urban           66.8    57.8
Rural                 64.9    62.7

Kinshasa              52.8    50.7
Bandundu              77.3    74.2
Bas-Congo             48.9    47.8
Katanga               62.4    60.2
Kasaï Oriental        79.4    68.4
Kasaï Occidental      67.7    70.5
Equateur              76.6    67.6
Nord-Kivu             49.7    43.5
Sud-Kivu              63.6    54.7
Maniema               63      64.1
Province Orientale    55.3    58.2*

* Note: The urban population share in Haut-Uele is adjusted using 2012 shares due to data quality.

[13] This section was prepared by Paripoorna Baxi based on Paci et al. (2023) and inputs from Shinya Takamatsu.

Source: Authors' compilation based on Paci et al.
(2023)

(IV) ZIMBABWE: USING SWIFT TO PROVIDE POVERTY STATISTICS AFTER UNEXPECTED ECONOMIC CHANGES [14]

TYPE OF APPLICATION: PRODUCING POVERTY STATISTICS WITH SWIFT 2.0 (CONCURRENT IMPUTATION) WHEN AN APPROPRIATE TRAINING DATASET IS NOT AVAILABLE
Training data: "Mini-PICES" (Poverty, Income, Consumption, and Expenditure Survey) 2019, subsample with full consumption (478 households)
Target data: "Mini-PICES" (Poverty, Income, Consumption, and Expenditure Survey) 2019, subsample without full consumption
Version of SWIFT: SWIFT 2.0
Types of model variables: Household demographics, education, asset ownership, dwelling conditions
Model disaggregation: Urban, Rural
Cost: Data collection: USD 262,000. Modeling & estimation: USD 5,000.

Zimbabwe has produced only two official poverty statistics in the last three decades using international poverty lines. In 2017, Zimbabwe launched the Poverty, Income, Consumption, and Expenditure Survey (PICES) to produce official poverty statistics. While there were no plans to implement another PICES in 2018 or 2019, significant disruptive economic and social shocks during these years spurred urgency in collecting data to monitor the changing economic conditions, especially concerning poverty. In 2018-2019, rapid food price inflation and poor rainfall during the agricultural season impacted poverty levels and increased the proportion of food-insecure Zimbabweans, particularly during the lean season when households rely on food stocks. Furthermore, rising costs of imports due to the decline in the exchange rate of the Zimbabwean dollar, along with increasing transport costs, negatively impacted households.

Given the extent of the shocks and the lack of funds for a full PICES, a different approach was required for both the collection and the production of poverty statistics. To overcome these challenges, the SWIFT team at the World Bank worked closely with the Zimbabwe NSO to develop a plan to produce poverty statistics by implementing a "mini-PICES", using SWIFT to concurrently impute poverty rates into the collected data. The 2019 mini-PICES was a hybrid survey that collected the full consumption module only for a subsample of the overall survey sample, but still collected non-consumption information for the entire sample. In total, 2,201 households were successfully surveyed, of which 478 were in the subsample (230 urban, 248 rural). The subsample was used to train two models, one for urban households and one for rural households.

In this unique application of SWIFT (SWIFT 2.0), the training data is collected concurrently with the target data, so the relationship between the poverty correlates and consumption cannot have shifted between the two, and there is no risk of the model instability problem. Therefore, there was no need to include fast-changing variables.

The final poverty projections for Zimbabwe are shown in Table 5. Poverty rates were produced for the country's upper, lower, and food poverty lines for rural and urban areas, and aggregated at the national level. Imputed poverty rates for all regions and across all poverty lines are close to those of the subsample, with the largest gap (4.6 percentage points) at the upper poverty line in urban areas. At the national level, about 72 percent of the population falls below the upper poverty line, with a higher percentage in rural than in urban areas (84.8 and 42.2 percent, respectively). At the most extreme poverty line (the food poverty line), only 10 percent of the urban population is considered food-poor, while over 50 percent of the rural population is considered food-poor. These results show that poverty, especially extreme poverty, remains a more pressing issue for rural populations.

TABLE 5 - Zimbabwe poverty rates by upper, lower, and food poverty lines

            UPPER POVERTY LINE           LOWER POVERTY LINE           FOOD POVERTY LINE
            SUBSAMPLE  ENTIRE SAMPLE     SUBSAMPLE  ENTIRE SAMPLE     SUBSAMPLE  ENTIRE SAMPLE
National    0.716      0.717             0.573      0.573             0.367      0.383
Rural       0.843      0.848             0.733      0.721             0.499      0.509
Urban       0.468      0.422             0.259      0.243             0.106      0.101

Note: Subsample in the table refers to the subsample of 478 households (230 urban, 248 rural) for which the full consumption questionnaire module was collected. The subsample's poverty rates are directly produced from the consumption data. The entire sample in the table refers to the full sample of 2,201 households surveyed. Poverty rates for the entire sample are partially imputed using the SWIFT model developed from the subsample.
Source: Authors' compilation based on ZIMSTAT and the World Bank (2022).

[14] This section was prepared by Anjali Kini and Danielle Aron based on ZIMSTAT and the World Bank (2022) with inputs from Dhiraj Sharma, Shinya Takamatsu, and Rob Swinkels.

(V) MALAWI: PROVIDING TIMELY POVERTY DATA IN A CLIMATE CRISIS CONTEXT [15]

TYPE OF APPLICATION: RAPID POVERTY MONITORING WITH COMMUNITY-BASED DATA COLLECTION
Training data: Integrated Household Survey (IHS) 2019/20
Target data: Rapid and Frequent Monitoring System (RFMS) in 10 districts in southern rural Malawi, collected monthly from August 2020 to March 2023
Version of SWIFT: SWIFT Plus
Types of model variables: Household demographics, asset ownership, dwelling conditions, food consumption (past 7 days), food security
Model disaggregation: None, one model for all of rural southern Malawi
Cost: Modeling: USD 5,000. Data collection per round: USD 2,000 (not including fixed costs of setting up the community-based data collection system). Estimation per round: USD 3,000.

Since 2010, the national poverty rate in Malawi has been both stagnant and high, sitting at 50.7 percent in 2019, the year of the most recent official poverty data. Moreover, most of the population relies on agriculture as their primary source of household income, leaving them extremely susceptible to climate shocks. The effects of climate change have caused an increase in droughts, floods, and irregular rains in Malawi, putting vulnerable households at an even greater risk of falling into poverty and experiencing more extreme forms of poverty.

Malawi produces official poverty statistics from the Integrated Household Survey (IHS), a large national survey collected every three years. While the IHS is thorough and informative on a wide variety of socio-economic conditions, its frequency does not allow for monthly or quarterly monitoring, which makes the IHS unable to provide insights on how climate shocks affect poverty. To better understand the extent and timing of how weather disasters affect poverty, SWIFT is being used in conjunction with a Rapid and Frequent Monitoring System (RFMS) in southern rural Malawi. RFMS is a community-based data collection system where enumerators live within the local communities they survey. Thanks to the community-based enumerators, there is a high retention rate of 95 percent among the 4,500 rural households that cover ten districts. This system allows for extremely timely data collection after climate shocks that may otherwise make it difficult for outside enumerators to travel into the community to collect data.

The SWIFT models were trained using the 2019/20 IHS and used to impute poverty rates into selected rounds of the RFMS. While an RFMS survey is collected every month, the questionnaire module used to collect the information SWIFT needs to impute poverty rates has been collected biannually since August 2020, with two additional rounds when cyclones hit the country in February 2022 and March 2023. To make RFMS data comparable to IHS data, sampling weight adjustments are made following Zhang et al. (2023). The model includes fast-changing poverty correlates to ensure responsiveness to sudden negative shocks, such as the COVID-19 pandemic and extreme weather events.

Results of the poverty estimates for rural southern Malawi are shown in Figure 4. In August 2020, the poverty rate was 57.5 percent, slightly higher than the official poverty rate for the region in 2019/20 (56.7 percent), likely due to the impact of the COVID-19 pandemic and associated social distancing policies. It should be noted that because RFMS is not an official household survey, the produced poverty rates are not strictly comparable to official country poverty statistics. That said, the imputed poverty rates for August 2020 indicate some poverty increase in southern rural Malawi.

Fluctuations in poverty also appear to align with the harvest seasons in the region. In December 2020, the middle of the lean season, there is an increase in poverty; in July 2021, just after the harvest season, there is a decrease in poverty. However, from July 2021 onwards, we see an increase in poverty. This is likely due to the following lean season in December 2021 and the effects of tropical storm Ana. The slight decline in poverty in July 2022 implies that although the end of the harvest season helped to improve household welfare, the medium-term impacts of tropical storm Ana and surging inflation kept poverty rates higher than in the previous year's harvest season. As with tropical storm Ana, the poverty rate did not immediately increase after cyclone Freddy hit the country in March 2023. These limited immediate impacts are likely because farmers were living off the stock from the previous year's harvest. The loss of crops to cyclones is not felt immediately; rather, it affects income and the poverty rate in the upcoming harvest season. Indeed, the poverty rate increased slightly to 60 percent in July 2023, rather than declining as it did in the 2021 harvest season.

FIGURE 4 - Rural southern Malawi poverty estimates using RFMS
[Figure: estimated poverty rate (percent, axis from 54% to 61%) for Aug 20, Dec 20, Jul 21, Dec 21, Feb 22, Jul 22, Dec 22, Jan 23, Mar 23, and Jul 23.]
Notes: Grey areas represent the harvest season (roughly from April to June) and red lines represent when cyclones hit Malawi: Tropical Storm Ana in February 2022 and Cyclone Freddy in March 2023.
Source: Yoshida et al. (2023b).

[15] This section was prepared by Danielle Aron and Nobuo Yoshida based on Yoshida et al. (2023b).

(VI) UGANDA: PROVIDING TIMELY POVERTY DATA IN A REFUGEE CRISIS CONTEXT [16]

TYPE OF APPLICATION: RAPID POVERTY MONITORING WITH PHONE SURVEYS
Training data: Uganda Refugee Household Survey (URHS) 2018
Target data: COVID-19 High Frequency Phone Survey (HFPS), 3 rounds: Oct/Nov 2020, Dec 2020, Feb/Mar 2021
Version of SWIFT: Standard SWIFT and SWIFT Plus
Types of model variables: Region, household demographics, asset ownership, characteristics of refugee migration, remittances, food security
Model disaggregation: None, models applied to all refugee population in survey
Cost: Modeling: USD 5,000. Data collection per round: USD 50,000 to 60,000. Estimation per round: USD 5,000.

Poverty in Uganda has been relatively stable over the past decade. The national poverty rate declined from 30.8 percent in 2012 to 27.0 percent in 2019 before the COVID-19 crisis erased this progress and sharply increased the poverty rate to 33.2 percent in 2020. Throughout the same period, there was significant heterogeneity in both levels and trends of poverty rates across various sub-populations. Before COVID-19, urban poverty increased from 16 percent in 2012 to 19.8 percent in 2019, while rural poverty decreased from 35.1 percent to 33.8 percent over the same period. One sub-population that experiences comparatively high poverty in Uganda is refugees. Prior to the pandemic, refugees had a significantly higher poverty rate (44 percent in 2019) than Ugandans overall. However, tracking the evolution of poverty rates for refugees and other sub-populations is difficult during crises such as the COVID-19 pandemic, as household income is prone to large shocks over short periods. While Uganda conducted three national household surveys between 2012 and 2019 to track poverty rates, more frequent monitoring is necessary during a volatile period. To this point, one of the primary objectives outlined in the Uganda Poverty Assessment (PA) in 2022 was to fill gaps in knowledge by measuring vulnerability to poverty.

After the onset of the COVID-19 crisis, High Frequency Phone Surveys (HFPS) were implemented in Uganda to understand the impacts of the pandemic. SWIFT was used to impute household expenditure and track the rapidly changing levels of refugee poverty. Data from the 2018 Uganda Refugee Household Survey (URHS) were used as training data to construct the SWIFT models, which were then used to impute household consumption in three rounds of HFPS. Since phone ownership is not universal among refugees in Uganda, the phone survey sample was biased toward better-off households. To make the HFPS sample comparable to URHS, sampling weights were adjusted following Zhang et al. (2023).

To distinguish trends in economic well-being between the pre-COVID and COVID-19 periods, two SWIFT models were created for this analysis. One model, denoted the "pre-COVID" model, was developed without any fast-changing poverty correlates to impute poverty as it likely was just prior to the onset of the pandemic. The second model was developed with fast-changing poverty correlates to track the rapidly changing conditions among refugees during the pandemic.

The pre-COVID model was applied only to the first round of the HFPS (October/November 2020) to estimate the poverty rate among refugees as it would have been just prior to the onset of the COVID-19 crisis in March 2020. The lack of fast-changing indicators means that, although the data was collected in October/November 2020, it is unlikely that the means of the model variables would have changed between March 2020 and then, even in response to the shock of the pandemic. For example, it is unlikely that a household would have been able to sell assets in response to a negative economic shock due to the lack and inconsistency of second-hand markets. This highlights a unique advantage and application of the standard SWIFT methodology: by using only slow-changing poverty correlates, such as demographics and asset variables, the simulated poverty rate is likely to change less over time and thus can be used for a date prior to a large shock.

The second model is a typical SWIFT Plus model, incorporating food security as the fast-changing poverty correlate. This model is employed to impute household consumption for every round of HFPS, and the resulting poverty estimates are assumed to be representative of the period when the data was collected. This assumption is reasonable because the included food security indicator is likely to respond to negative economic shocks within a month (or even a week). Asset variables, which may understate changes in poverty rates due to their slow-changing nature, were removed from this model.

Results of the analysis show that the poverty rate among refugees in Uganda spiked in the first round of HFPS, jumping from 44 percent in March 2020 to 51 percent just 7-8 months later in October/November 2020. The rate fell to 49.2 percent in December 2020, followed by another small decrease to 48.6 percent in February/March 2021. It is worth noting that the jump in the poverty rate over the first 8-month period is not only larger than the changes experienced by refugees over the subsequent survey rounds, but also far greater than any change recorded by the Ugandan population in the last decade.

FIGURE 5 - Uganda poverty estimates for refugees
[Figure: refugee poverty rates of 44.0% (Pre-COVID, Mar 2020), 51.0% (Round 1, Oct/Nov 2020), 49.2% (Round 2, Dec 2020), and 48.6% (Round 3, Feb/Mar 2021).]
Source: Atamanov et al. (2022).

The results outlined above are some of the first to demonstrate the usefulness of tracking poverty rates in sub-Saharan Africa throughout a crisis. Indeed, the jump in the poverty rate among refugees from early to late 2020 was a far greater change than any fluctuations recorded in the eight years prior. Future work along these lines will rely on ensuring strong data quality and collecting additional fast-changing consumption variables in frequent surveys, allowing SWIFT to provide the most accurate and timely poverty estimates.

[16] This section was prepared by Jeremy Schneider based on Atamanov et al. (2022) with inputs from Aziz Atamanov.
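The sampling-weight adjustment mentioned above for the phone survey sample can be illustrated with a generic post-stratification sketch. This is not the procedure of Zhang et al. (2023), whose details are not described in this section; it only shows the general idea of rescaling weights so that an over-represented group counts for its share in a reference survey. The group labels, sample sizes, target shares, and function names are all hypothetical.

```python
from collections import defaultdict


def weighted_shares(groups, weights):
    """Return each group's share of the total weight."""
    totals = defaultdict(float)
    for group, weight in zip(groups, weights):
        totals[group] += weight
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


def poststratify(groups, weights, target_shares):
    """Rescale weights so each group's weighted share equals its target share."""
    current = weighted_shares(groups, weights)
    factors = {group: target_shares[group] / share for group, share in current.items()}
    return [weight * factors[group] for group, weight in zip(groups, weights)]


# Hypothetical phone-survey sample in which better-off households are
# over-represented (7 of 10 respondents) relative to a reference survey
# where they make up 40 percent of the population.
groups = ["better_off"] * 7 + ["worse_off"] * 3
weights = [1.0] * 10
target = {"better_off": 0.4, "worse_off": 0.6}

adjusted = poststratify(groups, weights, target)
shares = weighted_shares(groups, adjusted)
print(shares)
```

After the adjustment, each worse-off respondent carries more weight and each better-off respondent less, while the total weight is unchanged. A real application would choose the stratifying variables from characteristics observed in both surveys.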
(VII) MONGOLIA: RESTORING POVERTY TRENDS AFTER IMPROVEMENTS TO THE HOUSEHOLD SURVEY 17

TYPE OF APPLICATION: RESTORING COMPARABILITY OF POVERTY DATA
Training data: Household Socio-Economic Survey (HSES) 2018
Target data: Household Socio-Economic Survey (HSES) 2020
Version of SWIFT: SWIFT Plus
Types of model variables: Aimag,18 household demographics, education, employment, consumption of non-food items
Model disaggregation: 14 models by aimag group19 and urban/rural populations (8 urban groups, 6 rural groups)
Cost: Data collection: 0 (Both training and target datasets were available when the SWIFT based poverty estimation was conducted.) Data harmonization, modeling, & estimation: USD 3,000 – 5,000

Mongolia has implemented a Household Socio-Economic Survey (HSES) biennially since 2012 to produce official poverty statistics. However, the 2020 HSES was updated to incorporate international standards for the design of consumption surveys. Modifications were also made to the reference periods and the number of food and non-food items included in the consumption modules to better reflect the evolving consumption patterns in Mongolia. While these adjustments improved the collected data, the resulting poverty statistics from the 2020 HSES were no longer comparable to the previous trend.

The SWIFT team at the World Bank worked closely with the Mongolian NSO to use SWIFT to restore comparability with the poverty trend from 2010-2018. The 2018 HSES was used to train fourteen models for different aimag groups and rural/urban populations, and household expenditure was imputed into the 2020 HSES to produce comparable poverty rates. Aggregate poverty rates for the overall urban, rural, and national levels were produced from the individual estimates from the fourteen models.

Before the models were applied to the 2020 data, they were validated in the 2016 HSES data. Using the models, poverty rates were imputed into the 2016 HSES data and then compared to the actual poverty rates as an out-of-sample test of performance. Additionally, because there was a severe economic recession and extreme weather shocks in the country during that year, testing the models in the 2016 data also showed their ability to accurately produce poverty rates during times of large negative economic shocks, as would be present in the 2020 HSES due to the onset of the COVID-19 pandemic. The models' ability to respond quickly to changes in economic conditions is due to the addition of fast-changing poverty correlates, such as food and non-food consumption and household head employment. Figure 6 shows the results of the validation exercise within the 2016 HSES, disaggregated by subnational unit (aimag group). The differences between imputed and actual poverty rates are small and statistically insignificant for all aimag groups, with a few exceptions. In addition to validating imputed poverty rates in the 2016 data, Gini coefficients were calculated using the imputed household consumption and compared to the Gini coefficients produced from the actual measures of household consumption. This additional test showed that Gini coefficients produced from imputed household consumption perform very well, matching the national and urban Gini coefficients to three decimal places and differing from the rural Gini coefficient by only 0.006 in the 2016 data.

17 This section prepared by Anjali Kini and Xueqi Li based on Uochi and Kim (2022) with inputs from Lydia Kim.
18 An aimag refers to one of the 21 provinces in Mongolia.
19 An aimag group consists of more than one aimag.
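The validation exercise described above compares imputed and actual poverty headcounts and Gini coefficients. The following is a minimal Python sketch of those three checks, not the code used for the report. It assumes household-level arrays of consumption and survey weights; the function names (`headcount`, `gini`, `within_interval`) are hypothetical, and the confidence interval bounds are taken as given rather than estimated from the survey design.

```python
import numpy as np


def headcount(consumption, poverty_line, weights):
    """Weighted share of households with consumption below the poverty line."""
    consumption = np.asarray(consumption, dtype=float)
    return float(np.average(consumption < poverty_line, weights=weights))


def gini(consumption, weights):
    """Weighted Gini coefficient from the area under the Lorenz curve."""
    order = np.argsort(consumption)
    y = np.asarray(consumption, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum_w = np.cumsum(w)
    cum_yw = np.cumsum(y * w)
    # Lorenz curve points, starting at the origin.
    p = np.concatenate(([0.0], cum_w / cum_w[-1]))
    lorenz = np.concatenate(([0.0], cum_yw / cum_yw[-1]))
    # Trapezoid rule for the area under the Lorenz curve.
    area = np.sum((p[1:] - p[:-1]) * (lorenz[1:] + lorenz[:-1]) / 2.0)
    return float(1.0 - 2.0 * area)


def within_interval(imputed_rate, lower, upper):
    """True if the imputed rate lies inside the actual rate's confidence interval."""
    return lower <= imputed_rate <= upper
```

In practice, each function would be applied once to actual consumption and once to imputed consumption, for each model group, and the two results compared.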
FIGURE 6 - Mongolia validation exercise: comparison of 2016 actual and imputed poverty headcount by aimag group
[Chart: 2016 poverty rate (actual, actual 95% CI, and imputed) for Urban G1 to Urban G8 and Rural G1 to Rural G6.]
Source: Uochi and Kim (2022)

The final results show that the national poverty rate declined slightly from 2018 to 2020, from 28.4 percent to 27.8 percent. Similar changes occurred in both urban and rural areas. Inequality statistics were also produced after imputing household consumption into the 2020 data. From 2016 to 2020, there is only minor change in the Gini coefficient at the national, urban, and rural levels, indicating no real change in inequality throughout Mongolia.

TABLE 6 - Mongolia actual and imputed Gini coefficients in 2016, 2018, and 2020
Area | 2016 actual | 2016 imputed | 2018 actual | 2018 imputed | 2020 imputed
National | 0.323 | 0.323 | 0.323 | 0.323 | 0.323
Urban | 0.331 | 0.331 | 0.331 | 0.334 | 0.334
Rural | 0.296 | 0.290 | 0.296 | 0.290 | 0.286
Source: Uochi and Kim (2022)

In addition to producing poverty and inequality statistics to restore a comparable trend from 2010 to 2020, SWIFT was also used to produce poverty rates representative of the time just prior to the onset of the COVID-19 pandemic. New models were created following the same process outlined above but excluding all fast-changing poverty correlates. Because fast-changing poverty correlates allow the models to respond to sudden economic shocks, removing them from the models in this exercise resulted in poverty estimates that are likely representative of what poverty was before the sudden negative impacts of the pandemic. Results show that at the national level, the COVID-19 pandemic caused a 3.5 percentage point increase in poverty in 2020.

(VIII) ZAMBIA: RESTORING POVERTY TRENDS DUE TO DIFFERENCES IN HOUSEHOLD SURVEYS 20

TYPE OF APPLICATION: RESTORING COMPARABILITY OF POVERTY DATA
Training data: Living Conditions Monitoring Surveys (LCMS) 2015
Target data: Living Conditions Monitoring Surveys (LCMS) 2022
Version of SWIFT: SWIFT Plus
Types of model variables: Province, household demographics, livestock and asset ownership, own-production of food items, purchase of food items, food security
Model disaggregation: Urban, Rural
Cost: Data collection: 0 (Both training and target datasets were available when the SWIFT based poverty estimation was conducted.) Data harmonization, modeling, & estimation: USD 3,000 – 5,000

Given the 7-year lapse between the two latest Living Conditions Monitoring Surveys (LCMS) in Zambia, there were bound to be differences across the two that would compromise their comparability. In general, some differences are driven by changes in survey design (e.g., recall period), while others are driven by updates in the methodological recommendations to compute the consumption aggregate (e.g., how to estimate the user value of durable goods). While the differences driven by a revised methodology can usually be addressed by re-estimating the previous numbers following the new approach, survey design changes often require statistical methods to restore a comparable trend.

20 Maria Gabriela Farfan Betran and Sebastian Patrick Alexandre Silva Leander based on Zambia Statistics Agency and World Bank (2023).

There are two main differences between the 2015 and 2022 LCMS. First, the food module had a different recall period. In 2015, the recall period was fixed, while in 2022, the reference period of the initial incidence question (i.e., did you purchase/consume/receive…?) was fixed, but respondents were then allowed to select the relevant reference period for the follow-up questions on quantities and value. Second, due to a typo in the programming of the CAPI questionnaire, the reference period for frequent non-food items changed from 4 weeks in the 2015 survey to 12 months in the 2022 survey.

In this context, SWIFT was implemented to estimate the official poverty and inequality trends for 2015-2022. To address the high expectations that the government and civil society have regarding the new poverty numbers, the Zambia Statistics Agency (ZamStats) decided to use 2015 as the reference survey to train the SWIFT models. Choosing the 2015 survey as the training data allowed for quicker publication of results and promoted continuity in discussions with stakeholders, as the 2022 estimates could be discussed in comparison to the widely cited 2015 numbers.

Two models were generated, one for urban areas and one for rural areas. Explanatory variables include household composition, education levels, assets, labor market status of household members, food consumption dummies, and the log of the comparable non-food component. Food items were grouped into 13 food types (meat, vegetables, etc.), and dummies were aggregated for each type, thus indicating how many times the household consumed a specific food type. Food items were restricted to those where at least 70 percent of the 2022 sample selected the 4-week recall period, matching the recall period from 2015. The comparable portion of non-food expenditure includes health, clothing, financial services, durables, and housing. Non-food expenditures exclude education because public secondary school fees were eliminated in 2022. On average, the comparable portion of non-food expenditure accounted for 33.7 percent of total consumption in 2015.

Table 7 shows the results of in-sample tests of the SWIFT models for both poverty incidence and inequality (as measured by the Gini coefficient).

TABLE 7 - Zambia model performance results in 2015 data
Area | Poverty incidence, official (actual) | Poverty incidence, imputed (SWIFT) | Gini coefficient, official (actual) | Gini coefficient, imputed (SWIFT)
National | 54.4 | 54.8 | 0.546 | 0.555
Rural | 76.6 | 76.5 | 0.434 | 0.455
Urban | 23.4 | 24.3 | 0.476 | 0.484
Source: Prepared by Maria Gabriela Farfan Betran and Sebastian Patrick Alexandre Silva Leander using LCMS 2015

Results suggest that poverty in Zambia increased by close to 6 percentage points between 2015 and 2022. The increase is driven primarily by a significant increase in urban poverty of about nine percentage points. The slight increase in rural areas is not statistically significant. At the same time, inequality declined, with the Gini coefficient falling from 0.55 to 0.51. The decrease in inequality is driven largely by a decrease in the rural/urban gap due to the rapid deterioration in urban living standards. There was also a statistically significant decrease in intra-urban inequality, while within-rural inequality remained stagnant.

FIGURE 7 - Zambia poverty and inequality trends
Poverty Incidence panel (2015 to 2022): National 54.4 to 60.0; Rural 76.6 to 78.8; Urban 23.4 to 31.9
Gini Coefficient panel (2015 and 2022; National, Rural, Urban): data labels 0.55, 0.50, 0.48, 0.45, 0.44, 0.43
Source: Prepared by Maria Gabriela Farfan Betran and Sebastian Patrick Alexandre Silva Leander using LCMS 2015 and 2022.

Further analysis comparing the SWIFT distribution and the raw 2022 distribution will be done to provide indicative evidence on the implications of adopting the self-selected recall period in 2022. The data show a systematic relationship between the length of the recall period and the socio-economic status of the household. Poorer households, proxied by the education of the household head, are more likely to select shorter recall periods than better-off households. This means that, relative to 2015, food consumption was likely to be overestimated for poorer households and underestimated for richer households in the raw 2022 data. The subsequent implications for inequality estimates are substantial. The change in the Gini coefficient based on raw 2022 data is twice as large as the change estimated from household expenditure imputed with SWIFT.

(IX) NIGERIA: USING AVAILABLE PAST SURVEYS TO UNCOVER A POVERTY TREND 21

TYPE OF APPLICATION: RESTORING COMPARABILITY OF POVERTY DATA
Training data: Nigerian Living Standards Survey (NLSS) 2018/19
Target data: General Household Surveys (GHS) 2010/11, 2012/13, 2015/16, and 2018/19
Version of SWIFT: SWIFT Plus
Types of model variables: Geographic zone, household characteristics, employment, dwelling conditions, asset ownership, consumption of food and non-food items
Model disaggregation: None
Cost: Data collection: 0 (Both training and target datasets were available when the SWIFT based poverty estimation was conducted.) Data harmonization, modeling, & estimation: USD 10,000

Nigeria contains the largest number of people living below the extreme international poverty line ($2.15/day 2017 PPP) in Sub-Saharan Africa. Prior to COVID-19, the country faced multiple negative shocks, including a recession after a drop in global oil prices in 2016. However, frequent poverty statistics have been unavailable to show the effects of these economic shocks on poverty. The first official poverty statistics in a decade were produced using the 2018/19 Nigerian Living Standards Survey (NLSS). Prior to that, the 2009 Harmonized Nigerian Living Standards Survey was used for official estimates. However, crucial differences between the two surveys and issues of data quality in the 2009 survey prevented the production of comparable statistics from which a poverty trend could be seen.

To uncover the poverty trends in the past decade, SWIFT was used to impute poverty into four General Household Surveys (GHS) for 2010/11, 2012/13, 2015/16, and 2018/19. Five thousand households were surveyed in each wave of the GHS, and the data is representative at the national, zonal, and urban-rural levels. The GHS contains a variety of non-monetary variables that are collected in ways similar to the NLSS, making the GHS a viable candidate for survey-to-survey imputations with NLSS. The 2018/19 NLSS was used to train the SWIFT models.

Prior to developing the models, an exercise was conducted to ensure comparability between NLSS and GHS despite a difference in data collection months. The NLSS was surveyed over 12 months, whereas the GHS was only collected in the post-planting and post-harvest seasons. However, one wave of the 2018/19 GHS took place at the same time as parts of the 2018/19 NLSS, which allowed for comparison and confirmation that applying a SWIFT model developed in NLSS to GHS data was a viable solution. During the modeling process, additional checks were incorporated to tackle the seasonality issue in some of the poverty correlates, such as the share of household heads employed in non-farm activities or wage employment and food consumption of items like imported rice, beef, and fresh fish.

The means of most of the candidate variables appear comparable over time. A few variables, like the household head's employment status and consumption frequency of food items, displayed higher variability, which may be due to seasonal variation. Several variables also show an improvement in household welfare in the first three time periods and then show a reduction, which may be due to the 2016 Nigerian oil crisis.

Five different models were developed, each omitting a different set of variables. For example, model 5 does not contain employment or food consumption variables. As the imputed estimates were stable across the five models, it was concluded that potentially seasonal variables had little effect on the results.

Results of the analysis (Table 8) show that poverty decreased in the first three rounds of the GHS and then increased after 2016, which is attributed to the collapse of oil prices in Nigeria in that year. This trend is visible in the changes in the means of the model variables over this period.

21 This section prepared by Paripoorna Baxi based on Lain et al. (2022) with inputs from Marta Schoch.

TABLE 8 - Nigeria poverty estimates at international poverty lines, by GHS wave
Poverty line US$1.90
GHS wave | Poverty rate | 95% confidence interval
2010/11 | 43.54% | 40.97% to 46.11%
2012/13 | 42.49% | 39.77% to 45.20%
2015/16 | 40.75% | 37.30% to 44.20%
2018/19 | 41.88% | 38.31% to 45.44%
Poverty line US$3.20
GHS wave | Poverty rate | 95% confidence interval
2010/11 | 72.88% | 70.45% to 75.32%
2012/13 | 72.12% | 69.58% to 74.67%
2015/16 | 70.44% | 67.00% to 73.89%
2018/19 | 72.28% | 69.22% to 75.34%
Source: Authors' compilation based on Lain et al. (2022)

The household consumption vector was assumed to be lognormally distributed. Robustness checks were carried out to test this assumption. Zero-skewness and Box-Cox transformations were applied to the imputed 2018/19 GHS vector. The aim was to see whether the transformed imputed distribution is closer to the 2018/19 NLSS distribution than the original imputed consumption vector. The original imputed vector produced a poverty headcount rate of 41.88 percent (Table 8) at the $1.90 poverty line. When using the zero-skewness transformation, the rate was 42.4 percent, and using the Box-Cox transformation, it was 42.6 percent. These estimates confirm that the imputation process was robust. The 2018/19 NLSS had a slightly lower poverty headcount rate (39.1 percent), making a lognormal distribution suitable for the analysis.

A major limitation of the survey-to-survey imputation approach in Nigeria's case was the difference in data collection schedules between the NLSS and GHS. As the GHS was only conducted in the post-planting and post-harvest seasons, biased estimates could be produced. This was addressed by conducting sensitivity analyses and robustness checks, though the GHS could still have overestimated the poverty rates due to this variation. If true, this would imply that the actual poverty rate in 2010/11 was even lower.

7. Conclusion

This report illustrates nine successful applications of the SWIFT framework to enhance the frequency and quality of poverty data across different countries. In the Democratic Republic of Congo (DRC), Paraguay, and Botswana, SWIFT was integrated with existing household surveys to augment the frequency of poverty data. In Mongolia, Zambia, and Nigeria, SWIFT was employed to restore the comparability of poverty statistics over time. In Zimbabwe, a modified approach (SWIFT 2.0) was implemented during an economic crisis to estimate poverty statistics for 2019.
In Malawi and Uganda, SWIFT was utilized in tandem with alternative data collection methods: community-based data collection and a phone survey, respectively.

The potential for increasing the frequency of poverty data with SWIFT is on the rise. Although the frequency of household budget surveys remains limited, other forms of data collection and surveys are becoming more common, propelled by a worldwide demand for more frequent and timely data. For instance, an increasing number of countries are now undertaking labor force surveys on a quarterly basis. Moreover, the adoption of phone and web surveys has become more prevalent and straightforward in many developing countries. During the COVID-19 pandemic, the World Bank rolled out high-frequency phone surveys in over 90 countries. UNICEF is now gearing up for a fresh round of Multiple Indicator Cluster Surveys (MICS), which will be followed by high-frequency phone surveys. Notably, Malawi's approach to community-based data collection has shown itself to be more cost-effective and to produce better quality data than phone surveys. By weaving a concise set of 10-15 SWIFT questions into these data collection initiatives, it becomes possible to substantially improve the frequency and timeliness of poverty statistics with minimal additional cost and time. However, it is also important to continue investing in improving the quality of data using digital technologies and rigorous supervision.

Alongside this enhanced data availability, there have been many recent breakthroughs in Artificial Intelligence, Machine Learning, and statistical projection techniques, paving the way for further refinements of the SWIFT modeling methods. The landscape of poverty estimation is seeing a surge in innovative projection models, with their reliability progressively strengthening. Seizing this momentum, the World Bank has conducted research aimed at honing the accuracy and reliability of poverty estimation techniques. For instance, Dang and Lanjouw (2022) recently shed light on advancements in regression-based imputation for poverty estimation in areas where data is limited. Yoshida et al. (2022) evaluated and adopted some machine learning techniques for poverty projections with SWIFT. Since the SWIFT modeling methods still face some technical challenges, it is critical for the SWIFT team to remain proactive in integrating these cutting-edge techniques, drawing from both their own experience and the collective wisdom of the wider machine learning and poverty measurement research community.

As worldwide data efforts ramp up and the methodology continues to improve, SWIFT becomes an increasingly valuable framework to empower even low-income nations to track poverty on a monthly or annual basis. More frequent and timely poverty data enable governments and development partners to understand the immediate, mid-term, and long-term impacts of factors like climate change, economic recessions, conflicts, and natural calamities on poverty and overall living conditions. Consequently, this insight aids in the design of effective policy instruments and ensures the timely delivery of social assistance and humanitarian support to those most in need.

List of contributors for country examples

1. Paraguay – Xueqi Li, Eliana Carolina Rubiano Matulevich, and Nobuo Yoshida (forthcoming)
2.
Botswana – This section was prepared by Danielle Aron with inputs from Carolina Diaz-Bonilla and Nobuo Yoshida, based on Statistics Botswana and World Bank 2024 (forthcoming).
3. DRC – Paripoorna Baxi based on Paci et al. (2023) and inputs from Shinya Takamatsu
4. Zimbabwe – Anjali Kini and Danielle Aron based on ZIMSTAT and the World Bank (2022) with inputs from Dhiraj Sharma, Shinya Takamatsu, and Rob Swinkels
5. Malawi – Danielle Aron and Nobuo Yoshida based on Yoshida et al. (2023b)
6. Uganda – Jeremy Schneider based on Atamanov et al. (2022) with inputs from Aziz Atamanov
7. Mongolia – Anjali Kini and Xueqi Li based on Uochi and Kim (2022) with inputs from Lydia Kim
8. Zambia – Maria Gabriela Farfan Betran and Sebastian Patrick Alexandre Silva Leander based on Zambia Statistics Agency and World Bank (2023)
9. Nigeria – Paripoorna Baxi based on Lain et al. (2022) with inputs from Marta Schoch

References

Atamanov, Aziz; Malasquez Carbonel, Eduardo Alonso; Masaki, Takaaki; Myers, Cara Ann; Granguillhome Ochoa, Rogelio; Sinha, Nistha. 2022. Uganda Poverty Assessment: Strengthening Resilience to Accelerate Poverty Reduction (English). Washington, D.C.: World Bank Group. http://documents.worldbank.org/curated/en/099135006292235162/P17761605286900b10899b0798dcd703d85
Atamanov, Aziz; Yoshida, Nobuo; Alemi, Charles; Beltramo, Theresa Parrish; Ilukor, John; Rios Rivera, Laura Abril; Sarr, Ibrahima; Said, Ally Hamud; Waita, Peter; Yoshimura, Kazusa. Monitoring Social and Economic Impacts of COVID-19 on Refugees in Uganda: Results from the High-Frequency Phone - Third Round (English). Washington, D.C.: World Bank Group. http://documents.worldbank.org/curated/en/473751621359136592/Monitoring-Social-and-Economic-Impacts-of-COVID-19-on-Refugees-in-Uganda-Results-from-the-High-Frequency-Phone-Third-Round
Brunckhorst, B., Kim, Y.S. and Cojocaru, A. 2023. Tracing Pandemic Impacts in the Absence of Regular Survey Data. Washington, D.C.: World Bank Group. https://openknowledge.worldbank.org/server/api/core/bitstreams/e98156a4-3d85-4b9c-ba69-8210a017446f/content
Batana, Yele Maweki. 2023. Poverty & Equity Brief: Democratic Republic of Congo. Washington, D.C.: World Bank Group. https://databankfiles.worldbank.org/public/ddpext_download/poverty/987B9C90-CB9F-4D93-AE8C-750588BF00QA/current/Global_POVEQ_COD.pdf
Campbell, J. 2022. Rapid Feedback Monitoring System (RFMS): Bridging the Gap: Using SWIFT to rapidly monitor poverty and welfare in a time of crises. https://thedocs.worldbank.org/en/doc/4d610ef0c96943deb704d750d0bbf54c-0350012022/original/Rapid-Feedback-Monitoring-System-RFMS-James-Campbell.pdf
Christiaensen, L., P. Lanjouw, J. Luoto, and D. Stifel. 2012. “Small Area Estimation-Based Prediction Methods to Track Poverty: Validation and Applications.” Journal of Economic Inequality 10 (2): 267–97.
Dang, H. and P. Lanjouw. 2022. “Regression-based Imputation for Poverty Measurement in Data Scarce Settings.” Working Papers 611. ECINEQ, Society for the Study of Economic Inequality. http://www.ecineq.org/milano/WP/ECINEQ2022-611.pdf
Douidich, M., A. Ezzrari, R. Van der Weide, and P. Verme. 2013. “Estimating Quarterly Poverty Rates Using Labor Force Surveys: A Primer.” Policy Research Working Paper Series No. 6466, World Bank, Washington, DC.
Fujii, T. and R. Van der Weide. 2020. “Is Predicted Data a Viable Alternative to Real Data?” The World Bank Economic Review 34 (2): 485–508. https://doi.org/10.1093/wber/lhz007
Lain, Jonathan; Schoch, Marta; Vishwanath, Tara. Making data count: Estimating a poverty trend for Nigeria between 2009 and 2019. World Bank Economic Review, forthcoming. http://documents.worldbank.org/curated/en/631201647459995911/Estimating-a-Poverty-Trend-for-Nigeria-between-2009-and-2019
Lanjouw, P. and N. Yoshida. 2021. “Poverty Monitoring Under Acute Data Constraints: A Role for Imputation Methods?” LIS Newsletter Issue, No. 19. https://www.lisdatacenter.org/newsletter/nl-2021-19-im-2/
Paci, P., Y. Batana, E. Skoufias, T. Masaki, N. Yoshida, K. Vinha, S. Takamatsu. 2023. DRC Poverty Assessment: Priorities for Better-Targeted Poverty Reduction Policies in a Large, Fragile, Conflict-Affected Country. The World Bank.
Rubiano, Eliana. 2023. Poverty & Equity Brief: Paraguay. Washington, D.C.: World Bank Group. https://databankfiles.worldbank.org/public/ddpext_download/poverty/987B9C90-CB9F-4D93-AE8C-750588BF00QA/current/Global_POVEQ_PRY.pdf
Serajuddin, U., H. Uematsu, C. Wieser, N. Yoshida, and A. Dabalen. 2015. “Data Deprivation. Another Deprivation to End.” Policy Research Working Paper Series No. 7252. The World Bank. http://hdl.handle.net/10986/21867
Statistics Botswana and the World Bank. 2024. Advancing Botswana Poverty Estimates: Introducing the Survey of Well-being via Instant and Frequent Tracking (SWIFT).
Uochi, I. and L. S. Kim. 2022. Mongolia - 2020 Poverty Report: A Decade of Progress and Stagnation in Poverty Reduction. Washington, D.C.: World Bank Group. http://documents.worldbank.org/curated/en/099300009112214272/P1744290156e5d04f0b2bd09439956741f6
World Bank. 2020. COVID-19 Impact Monitoring: Uganda Round 3. © World Bank, Washington, DC. http://hdl.handle.net/10986/34993 License: CC BY 3.0 IGO.
World Bank. 2022a. Mongolia - 2020 Poverty Report: A Decade of Progress and Stagnation in Poverty Reduction (English). The World Bank. https://documents.worldbank.org/en/publication/documents-reports/documentdetail/099300009112214272/p1744290156e5d04f0b2bd09439956741f6
World Bank. 2023. Poverty & Equity Brief: Paraguay. The World Bank. https://databankfiles.worldbank.org/public/ddpext_download/poverty/987B9C90-CB9F-4D93-AE8C-750588BF00QA/current/Global_POVEQ_PRY.pdf
Yoshida, N., R. Munoz, A. Skinner, C. Kyung-eun Lee, M. Brataj, and D. Sharma. 2015. SWIFT Data Collection Guidelines version 2. The World Bank. https://documents1.worldbank.org/curated/en/591711545170814297/pdf/97499-WP-P149557-OUO-9-Box391480B-ACS.pdf
Yoshida, N., S. Takamatsu, K. Yoshimura, D. Aron, X. Chen, S. Malgioglio, S. Shivakumaran, and K. Zhang. 2022. The Concept and Empirical Evidence of SWIFT Methodology. Washington, D.C.: World Bank Group. https://elibrary.worldbank.org/doi/abs/10.1596/38095
Yoshida, N., S. Takamatsu, S. Shivakumaran, K. Zhang, and D. Aron. 2023a. Poverty projections and profiling using a new SWIFT package during the COVID-19 pandemic. Mimeo. https://iariw.org/wp-content/uploads/2022/10/Yoshida-et-al-IARIW-TNBS-2022.pdf
Yoshida, N., L. Marcela Cardona, and K. Yoshimura. 2023b. “Creation of a new real-time and continuous poverty monitoring tool to monitor the immediate and lagged impact of cyclones on poverty.” Mimeo.
Zambia Statistics Agency and World Bank. 2023. “Estimating a consistent Poverty and Inequality trend in Zambia. 2015-2022 Poverty and Inequality Trends Methodological Note.” Mimeo.
Zhang, K., S. Takamatsu, and N. Yoshida. 2023. “Correcting Sampling and Nonresponse Bias in Phone Survey Poverty Estimation Using Reweighting and Poverty Projection Models.” Mimeo.
ZIMSTAT and the World Bank. 2022. Zimbabwe Poverty Update 2017-2019. https://www.zimstat.co.zw/wp-content/uploads/publications/Income/Finance/Zimbabwe_Poverty_Updat_2017_19_Final.pdf

Annex 1.
Metadata for country cases

Botswana
SWIFT version: SWIFT
Training survey and sample size(s): BMTHS 2015/16. 3 models per region, exact sample size varies: Rural villages: ~2600; Urban villages: ~2600; Cities Towns: ~1700
Target survey(s) and sample size(s): QMTS 2019 Q3 - 2022 Q4. 19Q3: 1140 (R), 1306 (U), 790 (CT); 19Q4: 1133 (R), 1302 (U), 769 (CT); 20Q1: 1134 (R), 1298 (U), 757 (CT); 20Q4: 1139 (R), 1307 (U), 791 (CT); 21Q4: 1140 (R), 1308 (U), 792 (CT); 22Q4: 1199 (R), 1693 (U), 849 (CT)
Potentially small sample size used for CV: Yes, in Cities & Towns models
Model validated using different survey rounds/data: No, not feasible
Validation criteria used: N/A
Publicly available document: Botswana Poverty Assessment (forthcoming)

DRC
SWIFT version: SWIFT
Training survey and sample size(s): HBS 2012. Urban: 9,190; Rural: 11,165
Target survey(s) and sample size(s): MICS 2018. 5,879 Urban; 14,886 Rural
Potentially small sample size used for CV: Yes, in models for provinces
Model validated using different survey rounds/data: No, not feasible
Validation criteria used: N/A
Publicly available document: https://documentsinternal.worldbank.org/search/33852769

Malawi
SWIFT version: SWIFT Plus
Training survey and sample size(s): HIS 2019/20. 4400
Target survey(s) and sample size(s): RFMS August 2020 - 2023. 1760 - 4390
Potentially small sample size used for CV: No (large sample size)
Model validated using different survey rounds/data: No, not feasible
Validation criteria used: N/A
Publicly available document: https://thedocs.worldbank.org/en/doc/15b6a023be65db6869e1f7710bfb709a-0360012023/original/A-real-time-monitoring-of-poverty-in-an-age-of-climate-change.pdf

Mongolia
SWIFT version: SWIFT Plus
Training survey and sample size(s): HSES 2018. Total: 16,454
Target survey(s) and sample size(s): HSES 2020. Total: 16,460
Sample sizes by model group: Urban Group 1: 3573; Urban Group 2: 839; Urban Group 3: 960; Urban Group 4: 720; Urban Group 5: 599; Urban Group 6: 720; Urban Group 7: 479; Urban Group 8: 1079; Rural Group 1: 958; Rural Group 2: 1247; Rural Group 3: 1152; Rural Group 4: 1344; Rural Group 5: 1623; Rural Group 6: 1152
Potentially small sample size used for CV: Yes, in models for urban/rural groups
Model validated using different survey rounds/data: Yes, model is validated using previous round
Validation criteria used: lies within 95% or 90% confidence interval of the actual rate
Publicly available document: https://documents1.worldbank.org/curated/en/099300009112214272/pdf/P1744290156e5d04f0b2bd09439956741f6.pdf

Nigeria
SWIFT version: SWIFT Plus
Training survey and sample size(s): NLSS 2018/19. Total: 21,580
Target survey(s) and sample size(s): GHS. 2010/11: 9438; 2012/13: 8905; 2015/16: 8951
Potentially small sample size used for CV: No (large sample size)
Model validated using different survey rounds/data: No, not feasible
Validation criteria used: N/A
Publicly available document: https://documents1.worldbank.org/curated/en/631201647459995911/pdf/Estimating-a-Poverty-Trend-for-Nigeria-between-2009-and-2019.pdf

Paraguay
SWIFT version: SWIFT Plus
Training survey and sample size(s): EPHC 2022 Q4. Urban: 2,609; Rural: 2,409
Target survey(s) and sample size(s): EPHC Q1-3. 2019 Q1: 1993 (U), 1358 (R); 2019 Q2: 1825 (U), 1509 (R); 2019 Q3: 2058 (U), 1385 (R); 2019 Q4: 2764 (U), 2335 (R); 2020 Q4: 2573 (U), 2269 (R); 2021 Q4: 2357 (U), 1760 (R); 2021 Q3: 2396 (U), 1713 (R); 2021 Q4: 2406 (U), 2240 (R); 2022 Q1: 2418 (U), 1798 (R)
Potentially small sample size used for CV: No (large sample size)
Model validated using different survey rounds/data: Yes, model is validated using previous rounds
Validation criteria used: absolute difference of less than two percentage points
Publicly available document: Xueqi Li, Juan José Galeano, Eliana Rubiano-Matulevich, and Nobuo Yoshida. 2022. Paraguay: Using quarterly surveys to increase the frequency of poverty statistics. Mimeo.

Uganda
SWIFT version: SWIFT, SWIFT Plus
Training survey and sample size(s): URHS 2018. 806
Target survey(s) and sample size(s): COVID-19 HFPS. R1 (Oct/Nov 2020): 2010; R2 (Dec 2020): 1852; R3 (Feb/Mar 2021): 1985
Potentially small sample size used for CV: Yes
Model validated using different survey rounds/data: No, not feasible
Validation criteria used: N/A
Publicly available document: https://documents.worldbank.org/en/publication/documents-reports/documentdetail/473751621359136592/monitoring-social-and-economic-impacts-of-covid-19-on-refugees-in-uganda-results-from-the-high-frequency-phone-third-round

Zambia
SWIFT version: SWIFT Plus
Training survey and sample size(s): LCMS 2015. Rural: 6,524; Urban: 5,621
Target survey(s) and sample size(s): LCMS 2022. Urban: 3,313; Rural: 5,157
Potentially small sample size used for CV: No (large sample size)
Model validated using different survey rounds/data: No, not feasible
Validation criteria used: N/A
Publicly available document: Zambia Statistics Agency and World Bank. 2023. Estimating a consistent Poverty and Inequality trend in Zambia 2015-2022 Poverty and Inequality Trend Methodological Note. Forthcoming.

Zimbabwe
SWIFT version: SWIFT 2.0
Training survey and sample size(s): mini-PICES 2019 (Training dataset). Total: 478 households; Urban: 230 households; Rural: 248
Target survey(s) and sample size(s): mini-PICES 2019 (Target dataset). Total: 2201; Urban: 577; Rural: 1624
Potentially small sample size used for CV: Yes
Model validated using different survey rounds/data: Not needed (SWIFT 2.0)
Validation criteria used: N/A
Publicly available document: https://www.zimstat.co.zw/wp-content/uploads/publications/Income/Finance/Zimbabwe_Poverty_Updat_2017_19_Final.pdf

Annex 2. Simulation process

SWIFT follows the imputation process implemented in Stata. A SWIFT model assumes that the natural logarithm of household expenditure of household $h$ follows the linear model (1), where $x_h$ is a $(k \times 1)$ vector of poverty correlates of household $h$, $\beta$ is a $(k \times 1)$ vector of coefficients of the poverty correlates, and $u_h$ is an error term that follows a normal distribution $N(0, \sigma^2)$.

\[ \ln y_h = x_h'\beta + u_h, \qquad u_h \sim N(0, \sigma^2) \tag{1} \]

The imputation process is as follows:

1. We first run an ordinary regression for model (1) in the training dataset to obtain estimates $\hat{\beta}$ and $\hat{\sigma}^2$ of the model parameters.
2. We simulate new parameters $\sigma_*^2$, $\beta_*$, and $u_*$ from the following distributions:

\[ \sigma_*^2 \sim \hat{\sigma}^2 (n_0 - k) / \chi^2_{n_0 - k} \]
\[ \beta_* \sim N\left(\hat{\beta},\ \sigma_*^2 (X_0' X_0)^{-1}\right) \]
\[ u_* \sim N(0, \sigma_*^2) \]

where $\chi^2_{n_0 - k}$ refers to a chi-squared distribution with $(n_0 - k)$ degrees of freedom, $n_0$ refers to the number of observations in the training dataset, and $X_0$ is the matrix of poverty correlates in the training dataset.
3. We impute the natural logarithm of household expenditures using model (1) with the new parameters in the target dataset.
4. We repeat the imputation 20 times or more.

The model and distributional assumptions can change depending on the datasets. More details are available in Yoshida et al. (2022).
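The four steps of the simulation process in Annex 2 can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the SWIFT team's implementation (which, per the annex, follows Stata's imputation process). The function names `swift_impute` and `poverty_rate`, the argument names, and the fixed random seed are all hypothetical, and the sketch assumes the plain linear model (1) with normally distributed errors.

```python
import numpy as np


def swift_impute(X0, lny0, X1, n_imputations=20, seed=0):
    """Simulate log expenditure in the target data.

    X0, lny0: poverty correlates and log expenditure in the training data.
    X1: poverty correlates in the target data.
    Returns an array of shape (n_imputations, number of target households).
    """
    rng = np.random.default_rng(seed)
    n0, k = X0.shape

    # Step 1: ordinary least squares on the training data.
    xtx_inv = np.linalg.inv(X0.T @ X0)
    beta_hat = xtx_inv @ X0.T @ lny0
    resid = lny0 - X0 @ beta_hat
    sigma2_hat = resid @ resid / (n0 - k)

    # Cholesky factor, used to draw beta* ~ N(beta_hat, sigma*^2 (X0'X0)^-1).
    chol = np.linalg.cholesky(xtx_inv)

    draws = np.empty((n_imputations, X1.shape[0]))
    for m in range(n_imputations):
        # Step 2: simulate new parameters sigma*^2, beta*, and u*.
        sigma2_star = sigma2_hat * (n0 - k) / rng.chisquare(n0 - k)
        beta_star = beta_hat + np.sqrt(sigma2_star) * (chol @ rng.standard_normal(k))
        u_star = rng.normal(0.0, np.sqrt(sigma2_star), size=X1.shape[0])
        # Step 3: impute log expenditure in the target data.
        draws[m] = X1 @ beta_star + u_star
    # Step 4: each row is one repetition of the imputation.
    return draws


def poverty_rate(lny_draws, poverty_line, weights=None):
    """Average the (optionally weighted) headcount rate over all repetitions."""
    poor = lny_draws < np.log(poverty_line)
    per_draw = np.average(poor, axis=1, weights=weights)
    return float(per_draw.mean())
```

Averaging the headcount over repetitions is one common way to combine multiply imputed estimates; the spread of the per-repetition rates could additionally be used to gauge imputation uncertainty.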