Policy Research Working Paper 10474

Learning during the Pandemic: Evidence from Uzbekistan

Syedah Aroob Iqbal and Harry Anthony Patrinos

Education Global Practice
June 2023

Abstract

School closures induced by the COVID-19 pandemic led to concerns about student learning. This paper evaluates the effect of school closures on student learning in Uzbekistan, using a unique dataset that allows assessing change in learning over time. The findings show that test scores in math for grade 5 students improved over time by 0.29 standard deviation despite school closures. The outcomes among students who were assessed in 2019 improved by an average of 0.72 standard deviation over the next two years, slightly lower than the expected growth of 0.80 standard deviation. The paper explores the reasons for no learning loss.

This paper is a product of the Education Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at hpatrinos@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Learning during the Pandemic: Evidence from Uzbekistan

Syedah Aroob Iqbal (a) and Harry Anthony Patrinos (b)*

JEL Classification: I21, I24
Keywords: COVID-19, learning loss, school closures, social inequality, digital divide

* Harry Anthony Patrinos, (Mail stop: MC 7-711) 1818 H Street NW, Washington, DC 20433, Tel: +1 240-899-6882, hpatrinos@worldbank.org; (a) Research Analyst, Education Global Practice, Europe and Central Asia, World Bank, Washington, DC 20433; (b) Adviser, Office of the Chief Economist, World Bank, Washington, DC 20433. Useful comments were received from Rita Almeida, Hiroshi Saeki, Victoriya Babakhodjaeva and Ayesha Vawda. We thank Ulugbek Tashkenbaev and the State Inspectorate for Supervision of Quality in Education for cooperation and access to information.

1. Introduction

During the COVID-19 pandemic, most countries closed schools for several months. School closures threaten children's schooling as in-person teaching is replaced by distance education, which is likely less effective and denies students peer interactions (Agostinelli et al. 2022). The closures could lead to learning loss – declines in student knowledge and skills – and future earnings losses (Azevedo et al. 2021; Psacharopoulos et al. 2021). Since the closures, researchers have analyzed the impact on student learning. Most studies observed learning loss and increases in inequality, with certain groups of students experiencing greater learning loss than others.
However, there are also countries that managed to limit the amount of loss, such as Denmark, through policy (Birkelund and Karlson 2022), and Sweden, by not closing schools (Hallin et al. 2022). Robust studies from more than 20 countries find average learning losses of 0.17 standard deviation (SD), equivalent to roughly one-half of a school year's worth of learning (Patrinos et al. 2022). Most research comes from Western European and high-income countries (Engzell et al. 2021; Jack et al. 2023; Maldonado and De Witte 2022). Yet more data are becoming available from middle-income countries such as Brazil (Lichand et al. 2022), China (Clark et al. 2021), Ghana (Wolf et al. 2022) and Türkiye (Coskun and Kara 2022) – showing large losses on average. Existing studies report declining achievement and greater educational losses for disadvantaged children. However, the impact may vary across societies, school systems, and measures adopted to contain the pandemic. For instance, schools were closed in Spain for 12 weeks, yet learning losses were much smaller (Arenas and Gortazar 2022) than in, say, the Netherlands (Haelermans et al. 2022) or Germany (Ludewig et al. 2022), which closed schools for only 10 weeks. But in general, the longer the closures, the greater the losses, everything else constant (Patrinos 2023). There are also differences in the application of distance education. In some countries it was a failure (Agostinelli et al. 2022), while other countries, such as France, managed to support online education with parental resources (Thorn and Vincent-Lancrin 2021).

We contribute to this literature by leveraging a unique individual student-level dataset that assesses student learning outcomes in 2019 and 2021. As the pandemic started, the Government of Uzbekistan announced the closure of all educational institutions from March 18, 2020, initially for three weeks (UNESCO 2020). However, the closure was extended, and the education system operated on a hybrid model for the entire 2020/21 academic year. By March 2021, 81 percent of school principals reported school closures of four months or longer (UNESCO 2022). With the rise in Omicron cases, the education system extended the school break in January 2022 by an additional three weeks. This is further confirmed by the Global Monitoring of School Closures by the UNESCO Institute for Statistics, which reports that schools in Uzbekistan were partially or fully closed for a total of 14 weeks. These school closures and hybrid learning affected around 6 million students in Uzbekistan (UNESCO 2020). We evaluate the impact of these pandemic-induced school closures on the learning outcomes of school pupils. Individual-level data for 2019 and 2021 allow us to account for possible pre-existing differences across cohorts, mitigating bias concerns.

2. Data Description and Empirical Strategy

We use data from nationally representative student assessments conducted for mathematics in 2019 and 2021. In 2019, a nationally representative sample of grade five students sat for the assessment. In 2021, a nationally representative sample of grade five students from the same schools sat for the 2021 assessment, which was specifically designed to ensure comparability with the 2019 assessment. Additionally, the 2021 assessment traced and assessed the students who had participated in the 2019 assessment.
Of the 3,922 students who participated in the mathematics assessment in 2019, 3,411 participated in the 2021 assessment, implying an attrition rate of 13 percent. In 2021, the assessment was complemented with student and teacher questionnaires, providing a rich dataset to analyze variables affecting learning during COVID-19-induced school closures. (For more details on the data, see the Appendix.)

Using the data from the student assessments and the student and teacher questionnaires, we test five hypotheses based on previous literature on learning loss due to temporary school closures or summer recess:

H1: Student learning declined due to COVID-19-induced school closures.
H2: Learning loss is greater in schools with longer durations of COVID and non-COVID school closures.
H3: Learning loss is greater among students with less-educated parents/guardians.
H4: Learning loss is greater among students with lower access to and usage of digital devices.
H5: Learning loss is greater among boys than girls.

To test the first hypothesis (H1), we employ two identification strategies to analyze learning trends during COVID-19. The first strategy compares average scores from a nationally representative sample of grade 5 students in 2019 to average scores from the same nationally representative sample of grade 5 students in 2021. The sample of schools was drawn in 2019 from the list of all schools in the country with students in grade 5. The same schools were surveyed in 2021. The sampling frame was not renewed because it saw no major changes over the two years. This identification strategy allows us to compare the performance of the grade five cohorts in 2019 and 2021 and assess trends in system-level student outcomes. The caveat with this identification strategy is that it ignores changes in student composition or other factors that change from one year to the next and may thereby affect cohort-level learning outcomes.

Our second identification strategy controls for the student population by using data from students who sat the mathematics assessment both in 2019, when they were in grade five, and in 2021, when they were in grade seven. This allows us to control for time-invariant student-level characteristics and analyze the change in student-level learning outcomes. Though attrition occurred, we compare the 2019 learning outcomes of students who appeared for the 2021 assessment and those who did not and find that the difference between the two groups is not statistically significant (see the Appendix for more details). We focus on this panel dataset to assess the change in learning from 2019 to 2021 and to identify student and teacher characteristics associated with the change in learning outcomes.

The effects of the COVID-19 pandemic on the mathematics achievement of children were estimated using a simple difference in average learning outcomes between 2019 (pre-COVID) and 2021. We compare this average change to the benchmark for annual progress obtained from the literature review to assess the extent to which students learned over time during COVID as compared to non-COVID years. As different schools were closed for different periods due to COVID-19 and for other non-pandemic-related reasons, we add a variable for the duration of school closure to analyze its effect on learning trends and test our second hypothesis (H2).
Furthermore, we add a set of student characteristics to test their effect on learning progression and test H3 to H5. The overall empirical equation is represented by:

$\Delta y_{ij} = \alpha + \beta C_j + \gamma' X_{ij} + \varepsilon_{ij}$

where $\Delta y_{ij} = y_{ij,2021} - y_{ij,2019}$ is the change in learning outcomes of an individual student (i) in school (j) from 2019 to 2021, $C_j$ is the duration of school closure in school (j), $X_{ij}$ is a vector of student characteristics, and $\varepsilon_{ij}$ is an error term clustered at the school level. The constant $\alpha$ captures the average change in learning due to the pandemic. In this setup, we add one variable at a time to assess heterogeneity in the trend in learning outcomes by each student characteristic; $X_{ij}$ includes student sex, parental education, an index of student access to and usage of digital resources including the internet, an index of student-level information on learning continuity during COVID-19-induced school closures, and indices of student-reported teachers' pedagogical and classroom management skills and family support in learning. Finally, we include all variables together to evaluate the relative strength of the different factors in explaining trends in learning outcomes.

Benchmarks for Annual Progress

As we have data for only two time periods, we compare the learning trend observed from 2019 to 2021 for students in Uzbekistan to the average expected rate of learning per year. The World Bank's simulations of COVID-19-induced learning loss assume a progress rate of 0.40 SD per year (Azevedo et al. 2021), based on student outcomes data in the Programme for International Student Assessment (PISA). In the United States, Hill et al. (2008) report annual gains for the age range of 8-11 years based on nationally normed tests in math and reading; their reported annual gains in math range from 0.89 SD at age 8 to 0.41 SD at age 11. Analysis of growth trajectories of reading and math achievement drawing upon multiple sources of national assessment data in the United States shows annual gains of 0.40 standard deviation in math for grades 5 to 8 during the past two decades (Lee 2010). Similarly, in the lower-middle-income setting of Pakistan, Bau et al. (2021) estimate annual learning gains of approximately 0.39 SD when controlling for students appearing in consecutive assessments and for family characteristics. Based on this review, we use a benchmark average annual learning gain of 0.40 SD against which to compare the learning trend observed for students in Uzbekistan.

3. Results and Discussion

A simple comparison of averages and distributions shows that learning outcomes in mathematics for grade five students in Uzbekistan improved despite COVID-19-induced school closures. This improvement is statistically significant and substantial, falling at the high end (around the 85th percentile) of the distribution of effect sizes from randomized controlled trials of education interventions with standardized achievement outcomes (Kraft 2020). Similarly, a comparison of the learning outcomes of students who participated in the assessments in both 2019 and 2021 shows that students' learning outcomes improved over time, albeit to a lesser extent than the expected growth of 0.80 SD over two years based on the international literature. This shows that at a system level, student learning in Uzbekistan did not decline: grade five students in 2021 performed better than grade five students in 2019. However, over the course of two years, students learned less than they would be expected to learn based on the international literature.
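To make the estimation approach in Section 2 concrete, the following is a minimal sketch of how the first-difference specification with school-clustered standard errors could be run. The file name and variable names (score_2019, closure_weeks, and so on) are hypothetical illustrations, not the study's actual data or code.

```python
# Minimal sketch of the first-difference estimation in Section 2.
# Assumes a hypothetical panel file with one row per tracked student;
# all column names are illustrative, not the study's actual variables.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("uzbekistan_panel.csv")  # hypothetical file

# Change in standardized math score from grade 5 (2019) to grade 7 (2021)
panel["delta"] = panel["score_2021"] - panel["score_2019"]
panel = panel.dropna()  # keep rows with complete data for both models

# Baseline model: the constant estimates the average two-year gain,
# to be compared with the 0.80 SD benchmark (0.40 SD per year x 2).
base = smf.ols("delta ~ 1", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["school_id"]}
)

# Heterogeneity: closure duration (H2) plus student characteristics
# (H3-H5), with standard errors clustered at the school level.
full = smf.ols(
    "delta ~ closure_weeks + female + parent_college + digital_index",
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["school_id"]})

print(base.params["Intercept"])  # average change, in SD units
print(full.summary())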
Uzbekistan implemented nationally representative assessments for the first time in 2019 and 2021, and we therefore lack country-specific data on pre-COVID learning trends against which to compare the learning growth from 2019 to 2021. The distributions of student scores on the mathematics assessment for 2019 and 2021 (grade five and grade seven) are laid out in Figure 1. The student cohort in grade five in 2021 performs significantly better, by 0.29 SD, than the student cohort in grade five in 2019. Similarly, students in grade seven in 2021 perform significantly better than the average student in grade five in 2021 and in grade five in 2019 (by 0.43 and 0.72 SD, respectively). This difference of 0.72 standard deviation is maintained when limiting the sample to the 3,411 students appearing for the mathematics assessment in both 2019 and 2021, thereby limiting any possible bias due to differences in cohort composition (Figure 2).

It is important to understand the possible reasons for this improvement despite COVID-induced school closures and to ensure that these improvements can be sustained over time. The results stand in contrast to evidence from other lower- and middle-income countries such as Brazil (Lichand et al. 2022), South Africa (Ardington et al. 2021) and Türkiye (Coskun and Kara 2022), despite the fact that Uzbekistan has a much lower national income per capita than most other countries for which we have robust data. Only Kenya and Ghana have slightly lower income levels than Uzbekistan, and both of those countries had significant learning losses (Whizz Education 2021; Wolf et al. 2022). Uzbekistan's results, rather, are closer to those observed in Denmark and France, much higher-income countries with much shorter durations (8 weeks) of school closures (Birkelund and Karlson 2022; Thorn and Vincent-Lancrin 2021). Additionally, the results presented here are for mathematics, and it is not certain whether similar learning gains would be observed for other subjects.

Figure 1: Distribution of student scores in mathematics assessment in 2019 (grade 5) and 2021 (grades 5 and 7)

Figure 2: Distribution of student scores in mathematics assessment for students appearing both in 2019 (grade 5) and 2021 (grade 7)

One possible reason for this improvement in student outcomes in the mathematics assessment could be that the Ministry of Public Education (MOPE) undertook several initiatives to ensure continuity of learning during COVID-19. As per the statistics of the Republican Education Center (REC), a total of 4,492 video lessons were created and broadcast during March 31-May 25, 2020, covering all core subjects for grades 1-11 and the curricular areas meant for the last term of the academic year, in three languages – Uzbek, Russian and Karakalpak. These lessons were broadcast through four government-owned television channels. Every day, four lessons per grade, each of 15-20 minutes' duration, were telecast (thus, a total of one to one and a half hours of lessons for each grade), and the broadcasting timetable was announced well in advance for the week. In addition, the video lessons were also made available for students to access at any time through the education portal Online Maktab via its Telegram (instant messenger service) channel, which has 84,000 subscribers and an estimated 2.6 million views daily, as well as through the Telegram channel of the MOPE, with 80,000 subscribers and 250,000 views daily. Initial surveys also showed engagement with distance learning platforms.
Four surveys (the UNICEF Rapid Assessment, a telephone survey; the SISQE-UNICEF online poll, conducted through the SISQE web portal; the U-Report poll, conducted through SMS, Facebook, Telegram, and smartphone apps; and Listening to the Citizens of Uzbekistan, L2CU) were conducted in the initial months after the school closures of March 2020¹ and allow us to assess the reach and use of distance learning programs in the initial months of COVID-induced school closures. The four surveys show that most students (96 percent across the four surveys) engaged in some form of study during the initial months of school closures (April-May 2020). Of these, a majority (82 percent) continued education through distance learning, while some students (15 percent) engaged in self-study using textbooks, teacher support, or by watching YouTube and other websites.² Among the students who did not continue education in the initial months (3 percent), the major reasons cited were irregular power supply, unavailability of TV or internet at home, and unavailability of the TV channels that broadcast lessons.

Furthermore, the Government of Uzbekistan provided strong support to schools and teachers to ensure continuity in learning. According to the Responses to Education Disruption Survey (REDS) conducted by the IEA in 2021, a large majority of teachers (around 95 percent) felt supported by national and provincial education authorities (UNESCO 2022). This percentage is the highest among all countries that participated in REDS, which included upper-middle- and high-income countries such as Denmark, the Russian Federation and Slovenia. Additionally, 91 percent of teachers engaged in remote teaching reported that their school provided the office infrastructure to assist with teaching from home (the highest percentage among participating countries). The majority of schools reported providing internet access for some or all students (85 percent), digital devices for some or all students (61 percent), and a virtual learning environment or learning management system (79 percent). These percentages are significantly higher than those of other lower-middle-income countries and closer to the results of upper-middle- and high-income countries. Similarly, more than 80 percent of students reported receiving individual or group feedback from their teachers on all or almost all of their schoolwork (the highest percentage among countries participating in REDS), and a majority communicated with school staff and teachers regularly. Apart from feedback, more than 90 percent of students reported receiving support from teachers in terms of interest, encouragement and adaptation of schoolwork to individual needs (the highest percentage among participating countries). The results from REDS support the hypothesis that the national and local governments in Uzbekistan provided substantial support to teachers and students during COVID-induced school closures. This support could be one of the reasons why students in Uzbekistan did not experience substantial learning losses, at least in mathematics in basic education, in contrast to the learning losses observed in other countries.

¹ UNICEF Rapid Assessment (telephone survey): June-July 2020; SISQE-UNICEF online poll (through the SISQE web portal): June-July 2020; U-Report poll (SMS, Facebook, Telegram, app for smartphones): May 2020; Listening to the Citizens of Uzbekistan: May 2020.
² There are slight variations in results by survey and by student grade.
Besides government efforts, parents in Uzbekistan engaged significantly with their children to ensure continuity in learning. The recent PIRLS 2021 results (Mullis et al. 2023) provide additional information about the role of parents during the pandemic. Interestingly, 57 percent of parents declared that they engaged their children in reading activities even before the children started school – one of the highest shares among participating countries. Parents also provided other support, for example, books (83 percent, compared to an international average of 71 percent) and online instruction or tutoring (65 percent of parents versus an international average of 50 percent). However, authorities in Uzbekistan were right to rely on television broadcasts for instruction, since the country lags far behind others in providing digital devices and digitally-based learning activities.

Success with the distance education modality in Uzbekistan seems to be supported by international experience. Remote instruction through phone call tutorials, SMS messages, and other measures was found to be effective in reducing learning loss during the pandemic in a five-country study of multiple scalable models (Angrist et al. 2023). Uzbekistan's model is more like blended learning, since local teachers also participated. Using technology to individualize content for students or to bolster teacher capacity to deliver lessons is promising. Blended teaching programs combine modern online teaching and traditional offline instruction by partially replacing or supplementing in-person instruction with remote lectures, such as live broadcast lessons, pre-recorded lecture videos, and TV shows. Studies provide evidence suggesting a positive role of hybrid teaching programs in improving educational outcomes in underserved regions where local teachers may not fully master the subject matter they are expected to teach or may not have sufficient teaching skills to deliver effective lectures (Beg et al. 2019; Bettinger et al. 2020; Bianchi et al. 2022; Borghesan and Vasey 2021; Borzekowski 2018; Borzekowski and Henry 2011; Borzekowski et al. 2019; Näslund-Hadley et al. 2014; Navarro-Sola 2021; Wennersten et al. 2015). The blended learning approach may help make up for the lack of quality teachers, especially in rural areas. A recent evaluation of a hybrid learning model shows significant learning gains (Li et al. 2023). China's Dual-Teacher program, a computer-assisted instruction program, makes lecture videos and other teaching resources from an elite urban middle school available through the internet to schools in poor and remote areas. The evaluation shows a significant improvement in student performance in math of 0.98 standard deviation over the three years of middle school. This is an effective means to improve education outcomes in underserved areas at low cost, owing to its low implementation costs.

In addition to supporting teachers and students in response to COVID-induced school closures, the Government of Uzbekistan has demonstrated increased commitment to improving education outcomes in recent years. During 2021 and 2022, Uzbekistan participated for the first time in all major international assessments, including the Progress in International Reading Literacy Study (PIRLS), the Trends in International Mathematics and Science Study (TIMSS), and the Programme for International Student Assessment (PISA). The improvement in learning outcomes observed for grade 5 may reflect the results of these government efforts.
Further investigation using rigorous impact evaluations of the different government measures would be extremely useful to inform future steps in the country.

Factors Associated with Change in Learning Outcomes

Different student groups were affected differently by COVID-19-induced school closures. Figure 3 visually presents the student characteristics that show a statistically significant relationship with the change in learning outcomes when other factors are held constant. Interestingly, we do not find a statistically significant relationship between the change in learning outcomes and the duration of school closures. Even though the duration of school closures shows a statistically significant negative relation with learning outcomes in 2021, the coefficient loses statistical significance when other student characteristics are added to the regression. This indicates that certain school characteristics (based on student characteristics) increased the likelihood that schools would be closed for longer durations, and these school characteristics were also negatively related to student outcomes. For the change in learning outcomes, the duration of school closures does not show a statistically significant relation in any regression model. Therefore, we do not find support for H2 in the specific situation of Uzbekistan. The fact that the negative effect of the duration of school closures is not statistically significant may also indicate the effectiveness of the government's response to COVID in ensuring learning continuity during school closures.

The growth in learning from 2019 to 2021 is positively correlated with parental education, as expected in H3. It is established that students with more educated parents perform better than their peers with less educated parents (Buchmann 2002; Schady 2011). This is consistently supported in Uzbekistan, too, as students with better-educated parents (at least one parent with a college or higher degree) perform better than the average of their peers, and these results remain statistically significant after controlling for all other observed student characteristics. Furthermore, our results show that students with better-educated parents also learned more over the same time period than their peers with less educated parents.

Our results do not support H4. We do not find statistically significant differences in learning progression for students with lower access to and usage of digital devices as compared to their peers with high access and usage. Usage of devices shows a positive and statistically significant relationship with student outcomes in 2021, specifically for grade 7. However, the coefficient loses statistical significance as more student characteristics are added to the regression, indicating that the positive association observed for the usage of digital devices represents the effect of other factors that improved student outcomes and were also related to greater usage of digital devices – for example, students accessing learning resources, most likely including resources available online. In fact, we observe that students who used digital devices to access course material regularly performed much better than students who almost never used ICT devices to access course materials. Similarly, students who used digital devices to access learning assignments online at least once a week performed better than their peers who did not.

Similarly, our results do not support H5.
We observe that girls performed worse than boys in 2019, after controlling for other student characteristics. However, this disadvantage disappears in 2021, for both grades five and seven. Girls seem to have experienced higher learning growth over time. However, the positive coefficient for girls loses statistical significance as more student characteristics are introduced to the regression. Therefore, controlling for all observed student characteristics, girls do not enjoy an advantage over boys in learning growth.

Besides the hypotheses tested, our results emphasize the importance of teachers' pedagogical and classroom management skills. Students who reported that their teachers have strong pedagogical skills improved their scores by 0.09 SD more than their peers. Among pedagogical practices, providing clear explanations, using different techniques for explanation, and providing time for students to solve problems on their own are positively associated with improved student outcomes. Similarly, students who reported disorderly classroom management improved their scores by 0.11 SD less than their peers with orderly classes. These are substantial effects, and similar effects are consistently observed for the learning outcomes of students in 2021 (both grades five and seven) and in 2019. Students with teachers with strong pedagogical skills performed between 0.07 and 0.09 SD higher than their peers. Similarly, students with disorderly classes performed between 0.11 and 0.17 SD lower than their peers.

Figure 3: Regression results of student-level improvement in learning outcomes on student characteristics. Only statistically significant results are presented (change in learning, in standard deviations):
At least one parent has college education: +0.12
Student faced challenges in learning during COVID: -0.07
Student-reported disorderly classroom management: -0.11
Student-reported teachers' pedagogical skills: +0.09
Growth in learning outcomes from 2019 to 2021: +0.68
Note: Benchmark for annual progress = 0.4 standard deviation. Expected progress over two years from 2019 to 2021 = 0.8 standard deviation.

4. Conclusions

Uzbek children were out of the classroom for more than four months on average, a very long period of school closures due to the COVID-19 pandemic. There are many benefits to in-person teaching, but the good news is that students did not face learning losses in mathematics due to the school closures. In fact, Uzbekistan is one of the few countries that experienced the opposite during the pandemic. However, there are some differences across student groups. Students with less educated parents, students who faced challenges in learning during COVID, and students whose teachers had weak pedagogical and classroom management skills learned significantly less than their peers. There is some suggestive evidence that a longer duration of closures reduced learning, but those results were not significant. Still, there is an equity concern, since children of more educated parents increased their scores by a much greater margin than children of less educated parents.

These findings suggest that policy action matters. Assessing students not only gives the country information it can use to build on, but also allows it to focus on areas of need. The substantial effort in creating online resources that could be accessed by most students and their families seems to have paid off in terms of learning continuity and avoiding losses.
The effort to train teachers is commendable and worthwhile. The education system has built resiliency through its investments in assessment, online resources, and teacher training. This will serve it well as it adjusts to the post-COVID schooling system. The assessments have also given the authorities information on whom to target. The online system has given the country tools to face possible future crises. Nevertheless, Uzbekistan's scores on comparable international assessments are still below international benchmarks. The school closure experience gives the country the tools to build upon. One of the limitations of this study is that we do not have a series of pre-COVID test score trends; we have only one point in time before the pandemic. Future research should focus on the reasons for the limited losses in Uzbekistan, preferably through impact evaluations of the different government initiatives targeted at improving student outcomes. Such impact evaluations can provide Uzbekistan, as well as other countries, valuable lessons for education policies and programs.

References

Agostinelli, F., M. Doepke, G. Sorrenti and F. Zilibotti. 2022. "When the Great Equalizer Shut Down: Schools, Peers, and Parents in Pandemic Times." Journal of Public Economics 206: 104574.

Angrist, N., M. Ainomugisha, S.P. Bathena, P. Bergman, C. Crossley, C. Cullen, T. Letsomo, M. Matsheng, R.M. Panti, S. Sabarwal and T. Sullivan. 2023. "Building Resilient Education Systems: Evidence from Large-Scale Randomized Trials in Five Countries." NBER Working Paper.

Ardington, C., G. Willis and J. Kotze. 2021. "COVID-19 learning losses: early grade reading in South Africa." International Journal of Educational Development 86.

Azevedo, J.P., A. Hasan, D. Goldemberg, K. Geven and S.A. Iqbal. 2021. "Simulating the potential impacts of COVID-19 school closures on schooling and learning outcomes: A set of global estimates." World Bank Research Observer 36(1): 1-40.

Bau, N., J. Das and A.Y. Chang. 2021. "New evidence on learning trajectories in a low-income setting." International Journal of Educational Development 84: 102430.

Beg, S., A. Lucas, W. Halim and U. Saif. 2019. "Beyond the Basics: Improving Post-Primary Content Delivery through Classroom Technology." NBER Working Paper No. 25704.

Bettinger, E., R. Fairlie, A. Kapuza, E. Kardanova, P. Loyalka and A. Zakharov. 2020. "Does EdTech Substitute for Traditional Learning? Experimental Estimates of the Educational Production Function." NBER Working Paper No. 26967.

Bianchi, N., Y. Lu and H. Song. 2022. "The Effect of Computer-Assisted Learning on Students' Long-Term Development." Journal of Development Economics 158.

Birkelund, J.F. and K.B. Karlson. 2022. "No Evidence of a Major Learning Slide 14 Months into the COVID-19 Pandemic in Denmark." European Societies 1-21.

Borghesan, E. and G. Vasey. 2021. "The Marginal Returns to Distance Education: Evidence from Mexico's Telesecundarias." Working Paper.

Borzekowski, D. 2018. "A Quasi-Experiment Examining the Impact of Educational Cartoons on Tanzanian Children." Journal of Applied Developmental Psychology 54: 53-59.

Borzekowski, D. and H. Henry. 2011. "The Impact of Jalan Sesama on the Educational and Healthy Development of Indonesian Preschool Children: An Experimental Study." International Journal of Behavioral Development 35(2): 169-179.

Borzekowski, D., A. Lando, S. Olsen and L. Giffen. 2019. "The Impact of an Educational Media Intervention to Support Children's Early Learning in Rwanda."
International Journal of Early Childhood 51(1): 109-126.

Buchmann, C. 2002. "Measuring family background in international studies of education: Conceptual issues and methodological challenges." In Methodological Advances in Cross-National Surveys of Educational Achievement, 150-197.

Clark, A.E., H. Nong, H. Zhu and R. Zhu. 2021. "Compensating for academic loss: Online learning and student performance during the COVID-19 pandemic." China Economic Review 68.

Coskun, K. and C. Kara. 2022. "The impact of the COVID-19 pandemic on primary school students' mathematical reasoning skills: a mediation analysis." London Review of Education 20(1): 19.

Engzell, P., A. Frey and M.D. Verhagen. 2021. "Learning Loss Due to School Closures during the COVID-19 Pandemic." Proceedings of the National Academy of Sciences 118(17): 1-7.

Haelermans, C., M. Jacobs, R. van der Velden, L. van Vugt and S. van Wetten. 2022. "Inequality in the Effects of Primary School Closures Due to the COVID-19 Pandemic: Evidence from the Netherlands." AEA Papers and Proceedings 112: 303-307.

Hallin, A.E., H. Danielsson, T. Nordström and L. Fälth. 2022. "No learning loss in Sweden during the pandemic: evidence from primary school reading assessments." International Journal of Educational Research 114: 102011.

Hill, C.J., H.S. Bloom, A.R. Black and M.W. Lipsey. 2008. "Empirical benchmarks for interpreting effect sizes in research." Child Development Perspectives 2: 172-177.

Jack, R., C. Halloran, J. Okun and E. Oster. 2023. "Pandemic schooling mode and student test scores: evidence from US school districts." American Economic Review: Insights.

Kraft, M.A. 2020. "Interpreting effect sizes of education interventions." Educational Researcher 49(4): 241-253.

Lee, J. 2010. "Tripartite growth trajectories of reading and math achievement: Tracking national academic progress at primary, middle, and high school levels." American Educational Research Journal 47(4): 800-832.

Li, H., Z. Liu, F. Yang and L. Yu. 2023. "The Impact of Computer-Assisted Instruction on Student Performance: Evidence from the Dual-Teacher Program." IZA Discussion Paper No. 15944.

Lichand, G., C.A. Doria, O. Leal-Neto and J.P. Cossi Fernandes. 2022. "The impacts of remote learning in secondary education during the pandemic in Brazil." Nature Human Behaviour 6: 1079-1086.

Maldonado, J.E. and K. De Witte. 2022. "The Effect of School Closures on Standardised Student Test Outcomes." British Educational Research Journal 48: 49-94.

Mullis, I.V.S., M. von Davier, P. Foy, B. Fishbein, K.A. Reynolds and E. Wry. 2023. PIRLS 2021 International Results in Reading. Boston College, TIMSS & PIRLS International Study Center. https://doi.org/10.6017/lse.tpisc.tr2103.kb5342

Näslund-Hadley, E., S. Parker and J. Hernandez-Agramonte. 2014. "Fostering Early Math Comprehension: Experimental Evidence from Paraguay." Global Education Review 1(4): 135-154.

Navarro-Sola, L. 2021. "Secondary Schools with Televised Lessons: The Labor Market Returns of the Mexican Telesecundaria." Working Paper.

Patrinos, H.A. 2023. "The longer students were out of school, the less they learned." Journal of School Choice 1-15.

Patrinos, H.A., E. Vegas and R. Carter-Rau. 2022. "An Analysis of COVID-19 Student Learning Loss." World Bank Policy Research Working Paper No. 10033.

Psacharopoulos, G., V. Collis, H.A. Patrinos and E. Vegas. 2021.
"The COVID-19 Cost of School Closures in Earnings and Income across the World." Comparative Education Review 65(2): 271-287.

Schady, N. 2011. "Parents' Education, Mothers' Vocabulary, and Cognitive Development in Early Childhood: Longitudinal Evidence from Ecuador." American Journal of Public Health 101(12).

UNESCO. 2020. Education continuity in COVID-19 Pandemic times: Impressions on introducing distance learning in basic education in Uzbekistan.

UNESCO. 2022. The impact of the COVID-19 pandemic on education. Accessed at: https://www.iea.nl/sites/default/files/202201/UNESCO%20IEA%20REDS%20International%20Report%2021.01.2022-FINAL%20for%20digital.pdf

UNESCO map on school closures (https://en.unesco.org/covid19/educationresponse) and UNESCO Institute for Statistics (UIS), March 2022 (http://data.uis.unesco.org).

Wennersten, M., Z. Quraishy and M. Velamuri. 2015. "Improving Student Learning via Mobile Phone Video Content: Evidence from the BridgeIT India Project." International Review of Education 61(4): 503-528.

Whizz Education. 2021. Measuring the Impact of COVID-19 on Learning in Rural Kenya. https://www.whizzeducation.com/wp-content/uploads/Kenya-Covid-Impact-SCREEN.pdf

Wolf, S., E. Aurino, N.M. Suntheimer, E.A. Avornyo, E. Tsinigo, J. Jordan, S. Samanhiya, J.L. Aber and J.R. Behrman. 2022. "Remote learning engagement and learning outcomes during school closures in Ghana." International Journal of Educational Research 115: 102055.

Appendix

Data Sources

We use data from nationally representative student assessments conducted for mathematics in 2019 (November) and 2021 (December). In 2019, a nationally representative sample of grade five students sat for the assessment. In 2021, a representative sample of grade five students from the same schools sat for the 2021 assessment. Additionally, the students who sat for the assessment in 2019 were tracked and participated in the 2021 assessment. Of the 3,922 students who participated in the mathematics assessment in November 2019, 3,411 participated in the 2021 assessment (attrition rate = 13 percent). In 2021, the assessment was complemented with student and teacher questionnaires, providing a rich set of data to analyze variables affecting learning during COVID-19-induced school closures.

Student Assessment Data

Both the 2019 and 2021 assessments were developed based on the TIMSS assessment framework and used released items from TIMSS with the permission of the International Association for the Evaluation of Educational Achievement (IEA). The assessment form for 2021 was therefore designed to be "parallel" to the assessment form used in 2019. Table A1 compares the assessment frameworks for the 2019 and 2021 assessments by content and cognitive domains, as well as by the distribution of items across the different difficulty thresholds specified by the IEA. Both assessment forms had 37 items, and students were given the same time to complete the assessment. Additionally, the assessment forms for 2019 and 2021 were tested for equivalence using a single-group design with counterbalancing.³ A sample of 264 students sat for both assessments in one sitting in October 2021. A randomly selected half of the sample sat for the 2019 assessment form first, while the other half took the 2021 assessment form first.

³ Ryan, J. and F. Brockmann. 2009. "A Practitioner's Introduction to Equating with Primers on Classical Test Theory and Item Response Theory." Council of Chief State School Officers.
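To illustrate the single-group counterbalanced equivalence check just described, the following is a minimal sketch, assuming a hypothetical file of raw scores for the students who sat both forms; the file and column names are illustrative, not the study's actual data.

```python
# Sketch of the form-equivalence check for the single-group
# counterbalanced design (students sat both the 2019 and 2021 forms).
# File and column names are illustrative, not the study's actual data.
import pandas as pd
from scipy import stats

both = pd.read_csv("equating_sample.csv")  # hypothetical file

# Means and standard deviations of raw scores on each form.
for form in ("raw_form_2019", "raw_form_2021"):
    print(form, round(both[form].mean(), 1), round(both[form].std(), 1))

# Paired t-test: the same students took both forms, so scores are matched.
t, p = stats.ttest_rel(both["raw_form_2019"], both["raw_form_2021"])
print(f"paired t = {t:.2f}, p = {p:.3f}")

# Order-effect check under counterbalancing: compare each form's mean
# between students who took it first and those who took it second.
print(both.groupby("order_group")[["raw_form_2019", "raw_form_2021"]].mean())
```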
Table A1: Assessment Frameworks for Assessment 2019 and Assessment 2021 – Percentage of overall assessment

                                    Assessment 2019   Assessment 2021
Content Domain
  Number                                  46                50
  Geometric Shapes and Measures           31                29
  Data Display                            23                21
Cognitive Domain
  Knowing                                 46                42
  Applying                                43                45
  Reasoning                               11                13
International Benchmarks
  Low                                     20                16
  Intermediate                            34                29
  High                                    43                45
  Advanced                                 9                11

Using data from the students who sat for both assessment forms, we find the forms to be equivalent. The raw mean scores of students on the assessment forms for 2019 and 2021 are not statistically different (21.1 for 2019 and 21.3 for 2021) and have similar standard deviations (8.0 for 2019 and 8.4 for 2021) [Figure A-1]. Comparing percentile ranks of raw scores, we find a high degree of equivalence across the assessment forms for 2019 and 2021 [Figure A-2]. Using a three-parameter item response model, we see that the test characteristic curves for the assessment forms for 2019 and 2021 are very similar; that is, a student with a given ability level is expected to have very similar raw scores on the two forms [Figure A-3]. We calculate IRT scaled scores of the students appearing in both assessments and find that the mean IRT scaled scores are also not statistically different for the student population appearing in both assessments [Figure A-4]. As we establish near-equivalence of the assessment forms and the difference in difficulty is less than 0.1, we do not adopt a secondary step to equate the two assessment forms, because using equating methods can lead to higher errors if the difference in difficulty between test forms is less than 0.4.⁴

⁴ Aşiret, S. and S.O. Sünbül. 2016. "Investigating test equating methods in small samples through various factors." Educational Sciences: Theory & Practice 16: 647-668.

Figure A-1: High degree of equivalence in mean and standard deviation of raw scores in assessment forms for 2019 and 2021

Figure A-2: High degree of equivalence in percentile ranks of raw scores of assessment form for 2019 and assessment form for 2021

Figure A-3: Following a three-parameter IRT model, the two assessment forms show a very similar test characteristic curve

Figure A-4: High degree of equivalence in percentile ranks of IRT scaled scores of assessment forms for 2019 and 2021

Student Background Data

Data on student background were collected by SISQE along with the assessment data for all participating students in 2021.⁵ The student background questionnaire obtained information on students' socioeconomic status, students' access to and use of digital devices, students' feedback on the quality of math instruction, and how COVID-19 affected their learning. The questionnaire was adapted from student questionnaires conducted by the IEA and the OECD.

⁵ Student background information was not collected in 2019.

The student information on socioeconomic status and students' access to and use of digital resources is used to understand the differential impact of COVID-19 on student learning. Additionally, an understanding of how severely students were affected by COVID-19, in terms of the length of school closures and the continuation of learning, helps us better understand the mediating effects of learning losses during the pandemic.
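Returning to the form-equivalence analysis above, the following minimal sketch shows how test characteristic curves under a three-parameter logistic (3PL) IRT model can be compared across two forms (cf. Figure A-3). The item parameters below are randomly generated placeholders, not the study's actual values.

```python
# Sketch of a test-characteristic-curve comparison under a 3PL model.
# Item parameters are placeholders; the real analysis would use the
# estimated parameters for the 37 items on each assessment form.
import numpy as np

def tcc(theta, a, b, c):
    """Expected raw score at ability theta: sum of 3PL item probabilities."""
    theta = np.asarray(theta)[:, None]            # abilities down the rows
    p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))
    return p.sum(axis=1)                          # expected number correct

theta_grid = np.linspace(-4, 4, 81)

# Hypothetical 37-item discrimination (a), difficulty (b), and
# guessing (c) parameter vectors for each form.
rng = np.random.default_rng(0)
a19, b19, c19 = rng.uniform(0.8, 2, 37), rng.normal(0, 1, 37), rng.uniform(0.1, 0.25, 37)
a21, b21, c21 = rng.uniform(0.8, 2, 37), rng.normal(0, 1, 37), rng.uniform(0.1, 0.25, 37)

# Near-identical curves mean a student of given ability is expected
# to earn a similar raw score on either form.
gap = np.abs(tcc(theta_grid, a19, b19, c19) - tcc(theta_grid, a21, b21, c21))
print("max expected-score gap across abilities:", round(float(gap.max()), 2))
```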
Teacher Questionnaire Data

SISQE also collected data from the students' mathematics teachers – a total of 379 teachers [154 teachers for 5th grade students and 225 teachers for 7th grade students]. For students in grade 5, the Primary Lead Teacher (from when the students were in the fourth grade) was surveyed unless the Lead Teacher was not available (because of leaving the school or moving to another school). The Primary Lead Teacher was preferred because the students were close to, but not yet at, the middle of 5th grade, and their accumulated knowledge in mathematics was likely to be more dependent on the primary lead teacher who had taught them math for the four years before they joined 5th grade. In cases where the previous Primary Lead Teacher was not available, the 5th grade math teacher was surveyed. Where the 5th grade math teacher was also not available, 5th grade teachers of other subjects were surveyed. In total, 155 teachers participated from 150 schools. Of these, 64 teachers were not primary lead teachers – 35 were 5th grade math teachers and 29 were 5th grade teachers of other subjects.

For 7th grade students, the 7th grade math teachers were surveyed (a total of 226 teachers from 149 schools, with no teachers participating from one school). For some schools, more than one math teacher was surveyed (two teachers in 64 schools, three teachers in 5 schools, and four teachers in 1 school). As the purpose of the grade 7 assessment was to track grade 5 students, who had progressed to different sections of grade 7, math teachers of all grade 7 sections were surveyed to ensure coverage of all math teachers of the surveyed students. However, given the way the data were collected, it is not possible to match students to their specific teachers. Therefore, we combine student and teacher data on the basis of school ID and grade. Where we have more than one teacher survey per school per grade, we average the results per school per grade before combining them with the student-level data.

The teacher information allows us to understand teachers' access to and use of digital devices, how their teaching was affected by COVID-19, and the general pedagogical techniques teachers employ in their classrooms for effective instruction. The questionnaire also collects information on school-level factors limiting the quality of instruction. We employ teacher information in our analysis to assess the differential effect of COVID-19 on students' learning losses by teacher characteristics.

Outcomes – Curricular Tests

Student achievement is measured using student responses on the mathematics assessment forms adapted from TIMSS. As discussed above, the assessments followed the TIMSS assessment framework and had questions on three content domains (numbers, geometric shapes and measures, and data display) and three cognitive domains (knowing, applying, and reasoning). TIMSS also classifies items into difficulty benchmarks: Low, Intermediate, High, Advanced and Above Advanced. As all items are adapted from TIMSS, the items are psychometrically validated and have been implemented internationally. These items have also been successfully fielded in neighboring countries like Kazakhstan and Pakistan. We first calculate the percentage correct for the 2019 and 2021 assessments and compare the average percentage correct for Uzbekistan with the international average percentage correct on these items (as reported by the IEA).
The average percentage correct in 2021 for grade five students is slightly higher than the average percentage correct in 2019 for grade five [Figure A-5]. Additionally, the average percentage correct for grade seven students in 2021 is higher than that for grade five students in 2019 and 2021 [Figure A-6 and Figure A-7].

Figure A-5: Percentage of correct responses in mathematics assessment: Uzbekistan 2019 (grade 5) [n=3922] and 2021 (grade 5) [n=3876] vis-à-vis respective international averages

Figure A-6: Percentage of correct responses in mathematics assessment: Uzbekistan 2019 (grade 5) [n=3922] and 2021 (grade 7) [n=3463] vis-à-vis respective international averages

Figure A-7: Percentage of correct responses in mathematics assessment in grade 7 [n=3463] vis-à-vis grade 5 [n=3922] in 2021

To compute student assessment scores on the scale used by the IEA, we use the fixed item parameters provided by the IEA. A central assumption of most IRT models is conditional independence (sometimes referred to as local independence). Under this assumption, item response probabilities depend only on the latent ability and the specified item parameters – there is no dependence on any demographic characteristics of the students, responses to any other items presented in a test, or the survey administration conditions. This central assumption allows us to use the fixed item parameters developed for TIMSS 2011 to estimate student abilities in Uzbekistan.⁶ The same assumption is used in other similar efforts.⁷ There are limitations to the assumption of conditional independence; for example, caution is raised against treating scores from different language versions of a test as equivalent. However, equating different language versions of an assessment would require bilingual students to sit the assessment in both languages so that their results could be used to equate the test forms in the different languages. This is difficult to achieve practically. First, it is difficult to ensure that bilingual students have a similar level of facility in both languages; second, it is difficult to find a large enough sample of bilingual students to allow for the equating process. The latter is true for several countries, including Uzbekistan. The IEA also does not currently use this approach to equate test forms across different test languages.

The distributions of student scores for the 2019 and 2021 mathematics assessments (grade five and grade seven) are laid out in Figure A-8. In student scores developed using the fixed item parameters from the IEA on a standardized scale with a mean of 0, the student cohort in grade five in 2021 performs significantly better (+0.29 standard deviation, a statistically significant difference) than the student cohort in grade five in 2019. Similarly, students in grade seven in 2021 perform significantly better than the average of students in grade five in 2021 and grade five in 2019 (0.43 and 0.72 standard deviation, respectively). This difference of 0.72 standard deviation is maintained when limiting the sample to the 3,411 students appearing for the mathematics assessment in both 2019 and 2021 [Figure A-9].

⁶ Most large-scale international assessments, including TIMSS, PIRLS, and PISA, use this assumption of conditional independence to estimate learning outcomes in countries while considering the item parameters as fixed.
⁷ Das, J. and T. Zajonc. 2010.
"India shining and Bharat drowning: Comparing two Indian states to the worldwide distribution in mathematics achievement." Journal of Development Economics 92(2): 175-187.

Figure A-8: Distribution of student scores in mathematics assessment in 2019 (grade 5) and 2021 (grades 5 and 7)

Figure A-9: Distribution of student scores in mathematics assessment for students appearing both in 2019 (grade 5) and 2021 (grade 7)

Quality Control

The State Inspectorate for Supervision of Quality in Education (SISQE) provided the national sample frame in 2019 – a list of all public schools with students in grade 5 where the language of instruction is Uzbek, Russian, or both. A two-stage cluster sampling approach was then employed. First, from the exhaustive list of schools with grade 5 in the country, 150 schools were selected using proportional probability sampling⁸ from the universe of schools with grade 5 in Uzbekistan. The selected schools were contacted, and a list of sections of grade 5 was developed for each school. Second, one complete section of grade 5 was selected from each school based on proportional probability sampling. In each school, an average of 27 students were assessed. In four schools, fewer than 15 students were assessed, with a minimum of 10 students in one school. To ensure fidelity to the drawn sample, replacements were minimized and only three schools were replaced. Replacement schools were randomly drawn together with the initial sample.

In 2021, the same selected sample of schools and students was assessed, except for two schools that were unable to participate due to the pandemic; two replacement schools participated in the assessment instead. In total, students in 148 of the 150 schools from the 2019 sample were assessed. For grade 5, one complete section of grade 5 was selected from each school based on proportional probability sampling, and all students in the selected section appeared for the assessment. The average number of students in the sections assessed for grade 5 was the same as in 2019 (27 students per section). In five schools, fewer than 15 students were assessed, with a minimum of 7 students in one school. Replacement schools randomly drawn alongside the initial sample selection were used to replace schools, ensuring national representativeness of the grade 5 assessment results.

Furthermore, students who appeared for the assessment in 2019, when they were in grade five, were tracked and re-assessed in 2021, when they were in grade seven. The State Inspectorate was able to track 3,411 of the 3,922 students assessed in 2019 [375 of the students who participated in the 2019 study transferred to another school, 18 moved to another country, 5 died, and 56 could not participate in the study due to health conditions]. This implies an attrition rate of 13.03 percent. However, we compare the initial performance of untracked students with the performance of tracked students and find that the two groups are not statistically different in terms of their performance on the 2019 assessment, as also illustrated in Figure A-10. Student information on parental education and household assets was not collected in 2019, which limits our ability to compare the tracked and untracked students on socioeconomic status.
It is possible that some of these students belonged to socioeconomically disadvantaged families, performed similarly to their peers before COVID-19, and dropped out of school due to COVID-19-related economic challenges; had we been able to assess them, we might have observed a wider gap in learning.

⁸ Proportional probability sampling is random sampling weighted by the population of interest.

Figure A-10: Students tracked from 2019 to 2021 are statistically similar to students not tracked in terms of initial performance in mathematics assessment in 2019.
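As an illustration of the attrition check described above (cf. Figure A-10), the following minimal sketch compares the 2019 scores of tracked and untracked students; the file and column names are hypothetical, not the study's actual data.

```python
# Sketch of the attrition check: compare 2019 math scores of students
# tracked into the 2021 round against those not tracked.
# File and column names are illustrative.
import pandas as pd
from scipy import stats

s2019 = pd.read_csv("scores_2019.csv")  # hypothetical file
tracked = s2019.loc[s2019["tracked_2021"] == 1, "score_2019"]
lost = s2019.loc[s2019["tracked_2021"] == 0, "score_2019"]

# Welch's t-test (unequal variances); a non-significant difference
# supports treating the tracked panel as representative on baseline scores.
t, p = stats.ttest_ind(tracked, lost, equal_var=False)
print(f"mean tracked = {tracked.mean():.2f}, mean lost = {lost.mean():.2f}")
print(f"Welch t = {t:.2f}, p = {p:.3f}  (n = {len(tracked)} vs {len(lost)})")
```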