Review and Guidance on ECD Assessment Tools in FCV Contexts

Tamara Arnold, Elizabeth Lauren Hentschel, Diego Luna-Bazaldua, Juliana Chen Peraza and Fatine Guedira

January 2025

Acronyms

ACE  Adverse childhood experiences
AIM-ECD-CR  Anchor Items for the Measurement of Early Childhood Development - Caregiver Report
AIM-ECD-DA  Anchor Items for the Measurement of Early Childhood Development - Direct Assessment
ARCH  Academic Readiness of Children
ASQ  Ages and Stages
BITSEA  Brief Infant-Toddler Social-Emotional Assessment
CBPCS  Child Behavioral Problem and Competence Scale
CFSQ-SL  Child-Friendly School Questionnaire for Syrian Children in Lebanon
COVID-19  Coronavirus disease
CPBI  Child Positive Behavioral Index
CREDI  Caregiver Reported Early Development Index
ECBQ-S  Early Child Behavior Questionnaire-Short Form
ECD  Early childhood development
ECDI2030  Early Childhood Development Index
EDI  Early Development Instrument
FCS  Fragility and conflict situations
FCV  Fragility, conflict, and violence
IDELA  International Development and Early Learning Assessment
IDELA-E  International Development and Early Learning Assessment-Extended
IDPs  Internally displaced people
INEE  Inter-agency Network for Education in Emergency
IPV  Intimate partner violence
ISELA  International Social and Emotional Learning Assessment
LMIC  Low- and middle-income countries
MFQ  Mood and Feelings Questionnaire
MODEL-CR  Measure of Development and Early Learning - Caregiver Report
MODEL-DA  Measure of Development and Early Learning - Direct Assessment
NYU  New York University
PSS  Parent Stress Scale
PTSD  Post-traumatic stress disorder
RACER  Rapid Assessment of Cognitive and Emotional Regulation
RSQ  Response to Stress Questionnaire
SDQ  Strengths and Difficulties Questionnaire
SERAIS  Social-Emotional Response and Information Scenarios
SGBV  Sexual and gender-based violence
SRA-AR  Self-Regulation Assessment-Assessor Report
TOOLSEL  Teacher Observation of Learners' Social Emotional Learning
VAC  Violence against children
Introduction

By 2030, an estimated two-thirds of the world's extreme poor could be concentrated in countries and contexts characterized by fragility, conflict, and violence (FCV) (World Bank Group, 2020). The humanitarian crises, prolonged emergencies, and armed conflicts that affect FCV contexts are major hindrances to poverty reduction and sustainable development. The cycle of instability and violence in these contexts often destroys infrastructure and strains resources, making it difficult for communities to lift themselves out of poverty and achieve sustainable development goals. The list of FCV contexts includes the World Bank list of countries and territories affected by fragility and conflict situations (FCS),[1] as well as countries suffering from violence and those with large forcibly displaced populations that are not included in the FCS list. Refer to Appendix 1 for the main FCV-related definitions used throughout this document.

The objective of this review is to describe which early childhood development (ECD) measurement tools have been used in FCV contexts and to serve as a guide for tool selection in these settings. This guidance is intended to assist country teams in identifying appropriate tools for ECD measurement activities, provided that such activities are already recognized as priorities. While parents and caregivers are an essential part of ECD, particularly in FCV environments, this review does not cover tools that measure adults' well-being or parenting-related outcomes. Instead, it focuses on measuring children's developmental outcomes and other child-related constructs that are relevant in FCV contexts.

We briefly describe the situation of children living in FCV contexts and how it can affect their development. We then make a case for the importance of ECD measurement, the lack of data on ECD in FCV contexts, and the challenges of data collection in these contexts.
After that, we present the findings of the desk review and provide a framework for tool selection. Finally, we present policy recommendations.

Uniquely Vulnerable: Children in FCV settings

Starting in the womb, poor maternal health, nutrition, and stress can hinder the development of the fetus (UNICEF, 2014). Settings with fragility, conflict, and violence often put additional strain on the healthcare system and negatively impact women's access to adequate health services, with only 60 percent of births attended by skilled health staff in FCS[2] (World Development Indicators, 2019) compared to 81 percent in low- and middle-income countries (LMICs). Likewise, maternal mortality is 2.3 times higher in FCS than in LMICs (World Development Indicators, 2018). In FCS, newborns might arrive in a setting lacking adequate hygiene and shelter and might not survive complications from preterm birth, low birth weight, and asphyxia, which could be prevented with adequate medical care (UNICEF, 2014). For every 1,000 live births in FCS, 28 will die before reaching 28 days of age, compared to 18 in LMICs on average (World Development Indicators, 2020).

[1] Please visit this website to access the most recent list of FCS: https://www.worldbank.org/en/topic/fragilityconflictviolence/brief/harmonized-list-of-fragile-situations
[2] The differences between FCV contexts and LMICs might be even larger than the differences between FCS and LMICs, since the set of countries classified as FCS in the World Development Indicators database is less comprehensive than the full FCV list.

Malnutrition, poor health, and lack of stimulation affect children's physical growth and early brain development. Children growing up in FCV contexts often face food insecurity with periods of chronic or recurrent undernutrition that can lead to poor physical growth (i.e., stunting) and increased morbidity and mortality.
Stunting can irreversibly impair cognitive capabilities (World Bank, Forthcoming). Moreover, disrupted or destroyed health services in conflict situations often lead to children being left unattended and more vulnerable to poor health outcomes (UNICEF, 2014).

In addition to the challenges presented by FCV, climate change increasingly exacerbates vulnerabilities for children living in these contexts. Rising temperatures, extreme weather events, and environmental degradation intensify food and water insecurity, disrupt livelihoods, and increase displacement (UNICEF, 2021). Children are particularly susceptible to climate-related risks, including malnutrition, waterborne diseases, and respiratory illnesses caused by deteriorating air quality (UNICEF, 2023). Climate-induced displacement further compounds the challenges of accessing healthcare, education, and essential resources, disproportionately affecting children in FCV contexts. This dual burden of FCV and climate change necessitates comprehensive approaches to protect children's well-being and promote resilience.

Forcibly displaced populations, including children, are particularly vulnerable to the effects of conflict and instability (World Bank, 2017). In 2020, around 24 million children younger than 12 years old were forcibly displaced from their homes due to conflict, violence, and other crises, and almost 10 million of them were below the age of four (UNHCR, 2020).[3] The UNHCR estimates that 1.5 million children were born into refugee life between 2018 and 2022 (UNHCR, 2021). Forcibly displaced populations often lack access to basic services such as healthcare, education, and clean water, which can have serious health consequences, especially for children, who are more vulnerable to malnutrition and illness (UNHCR, 2020).
Separation from parents or caregivers has a negative impact on the social-emotional development, well-being, and mental health of children and can increase their risk of exploitation, abuse, and neglect (Waddoups, Yoshikawa, & Strouf, 2019). Additionally, exposure to war, trauma, and violence can have long-lasting effects on children's psychological well-being, leading to issues such as anxiety, depression, and post-traumatic stress disorder (PTSD) (Samara, Hammuda, Vostanis, El-Khodary, & Al-Dewik, 2020). Moreover, repeated exposure to such stressful and traumatic experiences can lead to toxic stress, which can have profound and lasting effects on the well-being of young children. Children who live in areas affected by war and conflict are at a higher risk of developing mental disorders than their peers in more stable regions. Additionally, children in conflict zones often exhibit psychosomatic symptoms, and their play behavior may display more aggressive or withdrawn tendencies (Samara, Hammuda, Vostanis, El-Khodary, & Al-Dewik, 2020; Shonkoff & Garner, 2012).

[3] This estimate might be underreported due to the pandemic.

From a gender perspective, women and girls in FCV contexts have a higher likelihood of being exposed to intimate partner violence (IPV) and sexual and gender-based violence (SGBV). IPV and SGBV can lead to physical injuries, psychological trauma, and even death. Children who witness IPV or SGBV may suffer from emotional and behavioral problems such as PTSD, depression, and aggression, which can affect their ability to learn and form healthy relationships (Rosser-Limiñana, Suriá-Martínez, & Mateo Pérez, 2020; Holt, Buckley, & Whelan, 2008). Women and girls who are victims of IPV and SGBV can suffer long-term physical and mental health consequences, social exclusion, and economic marginalization (Ward & Vann, 2002; Rosser-Limiñana, Suriá-Martínez, & Mateo Pérez, 2020).
In addition, exposure to conflict increases tolerance toward domestic violence. Experiencing violence during childhood correlates with long-term changes in attitudes and behaviors around violence, perpetuating the cycle of violence (Weaver, Borkowski, & Whitman, 2008). Specifically, women who were exposed to armed conflict in their childhood reported increased tolerance toward domestic violence and were also more likely to have experienced domestic violence in adulthood than those who had not been exposed to conflict. Men who experienced conflict during childhood were also more prone to exhibit domestic violence in adulthood (Mattina & Shemyakina, 2017).

The long-term cycles of exposure to FCV contexts create intergenerational trauma that might be passed on from survivors to their children. Although there is not much research available on the lasting consequences of conflict, parental or intergenerational trauma and stress during periods of conflict have been linked to a higher likelihood of children experiencing adverse childhood experiences (ACEs) (Devakumar, Birch, Osrin, Sondorp, & Wells, 2014). This risk can come from both biological transmission and exposure to the challenging social environments that exist in these contexts. For example, Vietnam War veterans' children were three times more likely to die by suicide than the general community. Adults who were exposed to conflict during infancy in Nepal have been found to have a shorter stature than those who were not exposed to such circumstances (Phadera, 2019). Moreover, even when the conflict is resolved, the negative effects of FCV can remain for decades and even generations (Corral, Irwin, Krishnan, Mahler, & Vishwanath, 2020). Research suggests that the longer children are exposed to violence, the greater the risk that it becomes ingrained as a norm in their behavior.
The urgent case for ECD Measurement in FCV settings

Despite these trends and the urgency of supporting young children in FCV contexts, most of the information collected to monitor their status has focused on aspects other than their development and well-being. Data collection efforts in FCV contexts tend to focus on assessing family access to basic services and shelters, as well as children's overall health and nutrition status. More information is needed on children's cognitive and socio-emotional development and on how they cope with challenges and toxic stress. Without data on the status of children, it is less likely that policymakers and other stakeholders will make informed decisions and implement effective interventions to strengthen early childhood development systems in FCV settings.

Measurement efforts to assess ECD in FCV settings provide evidence to make the case for investment in ECD. Data from ECD measurement efforts can be used to inform the design of effective programs to support children and their families during their early years. Investing in ECD has been shown to help break poverty cycles, support sustainable development, and promote equity, which becomes even more relevant in FCV contexts.

Lack of information on the measurement and monitoring of ECD in FCV

Assessing child development has not been a priority in FCV contexts. According to a review on ECD and education in emergencies by Ponguta et al. (2022), there is scant evidence on ECD in emergency settings such as FCV contexts. Only 20 out of 62 ECD and education organizations in FCV settings reported having collected ECD outcome data, and only 14 had a research agenda related to young children. ECD programs in FCV settings are underprioritized and underfunded. A recent analysis of international aid levels for ECD services in crisis contexts finds that only 3.3 percent of all development aid is allotted to ECD services (Moving Minds Alliance & SEEk Development, 2020).
The lack of evidence on ECD is holding back the ability of governments, humanitarian workers, and those working with young children to advocate on behalf of the ECD field for more resources and programs (Ponguta, Moore, Varghese, Hein, & Angela Ng, 2022).

Data collection efforts in FCV contexts have focused more on short-term outcomes related to addressing basic needs. Most existing studies of ECD in FCV contexts and other humanitarian settings have focused on acute malnutrition or education (Aurino & Giunti, 2021). Evidence on other key outcomes, such as learning, psychosocial outcomes, and child protection, is limited (Aurino & Giunti, 2021). Children who have lost their homes, fled from their home country, been separated from their families, or been exposed to violence and conflict can suffer long-lasting developmental consequences that are not currently being captured in research in FCV contexts. Assessing their developmental status can help policymakers allocate resources and design effective ECD policies, interventions, and remedial programs in FCV contexts.

Assessment review

Tools used to measure early childhood outcomes in FCV.

This guide draws on a desk review of documentation and on information provided by experts to identify ECD measurement tools that have been designed, adapted, or used in FCV contexts. The Inter-agency Network for Education in Emergency (INEE) measurement library was used to identify measurement tools that assess children's learning and development in contexts of crisis. This network also provided guiding principles that were used to select ECD measurement tools based on their technical properties. In addition, international experts from universities and organizations supporting children in FCV were consulted and recommended additional measurement tools used in FCV contexts.
This guide identified a total of 23 tools that have been either specially designed or adapted for an FCV context, or previously used in an FCV context to measure ECD outcomes and relevant ECD constructs.[4] A summary of the main characteristics of these tools can be found in Table 1. All tools were used in at least one country on the FCS list and/or targeted displaced or refugee children younger than 10 years old. Although we acknowledge that parents' experiences of stress have a strong influence on child development, this report is focused specifically on measuring children's developmental outcomes and ECD-related constructs (such as toxic stress) relevant to FCV contexts.

[4] An individual summary of each reviewed ECD measurement tool can be found in the Complementary Resources: ECD tools used in FCV context. An Excel file with more detailed information about each tool is available in the Complementary Resources: Landscape Review.

Table 1. Characteristics of ECD measurement tools created, adapted, or used in FCV contexts.

FCV specific: These tools were originally developed for children living in FCV contexts.

| Tool | Purpose | Target age groups | Main domain | EF, SE, Stress & Mental Health | Time of application | Target respondent | FCV targeted groups | Reliability & Validation (*) | Webpage |
|---|---|---|---|---|---|---|---|---|---|
| Child Behavioral Problem and Competence Scale (CBPCS) | Not specified | 3-4 years | Social and emotional skills/learning | Stress | No information | Caregiver | Rohingya children in Bangladesh | Reliability (0/3); Validation (2/3) | Webpage |
| International Social and Emotional Learning Assessment (ISELA) | Program evaluation | 6-12 years | Social and emotional skills/learning | Stress | 30 minutes | Child | Kurdish-speaking Syrian refugee children in Iraq | Reliability (3/3); Validation (3/3) | Webpage |
| Rapid Assessment of Cognitive and Emotional Regulation (RACER) | Not specified; it's an "assessment" tool | 6-12 years | Executive functioning | Executive functioning | 22 minutes | Child | Displaced Syrian children living in Lebanon. Children in Diffa, Niger, affected by Boko Haram attacks. | Reliability (0/3); Validation (1/3) | Webpage (original); Webpage |
| Social-Emotional Response and Information Scenarios (SERAIS) | Program evaluation | 5-16 years | Social and emotional skills/learning | Socio-emotional | 20 minutes | Child | Syrian refugee children in Lebanon | Reliability (2/3); Validation (2/3) | Webpage |
| Teacher Observation of Learners' Social Emotional Learning (TOOLSEL) | Program evaluation | 5-16 years | Social and emotional skills/learning | Socio-emotional | | Teacher | Syrian refugee children in Lebanon | Reliability (1/3); Validation (3/3) | |
| Remote Assessment of Learning (ReAL) | Program evaluation, Population monitoring | 4-14 years | Early Childhood Development/School readiness | Socio-emotional | 30 minutes | Caregiver | | Reliability (3/3); Validation (3/3) | Webpage |

Adapted for FCV: One or more teams have adapted and validated these tools for use in a specific FCV context.

| Tool | Purpose | Target age groups | Main domain | EF, SE, Stress & Mental Health | Time of application | Target respondent | FCV targeted groups | Reliability & Validation (*) | Webpage |
|---|---|---|---|---|---|---|---|---|---|
| Academic Readiness of Children in Arabic (ARCH-A) and in Turkish (ARCH-T) | Program evaluation | 60 months and over | School readiness | Socio-emotional | 20 minutes | Child & Parent/Teacher | Turkish children (ARCH-T) and Syrian refugee children (ARCH-A) in Turkey | Reliability (2/3); Validation (3/3) | Webpage |
| Brief Infant-Toddler Social-Emotional Assessment (BITSEA) | Screening | 12-36 months | Social and emotional skills/learning | Socio-emotional | 7-10 minutes | Caregiver | | Reliability (0/3); Validation (2/3) | Webpage; Webpage |
| Child-Friendly School Questionnaire for Syrian Children in Lebanon (CFSQ-SL) | Program evaluation | 5-15 years | Quality of learning environment (students' perceptions) | | No info | Child | Syrian refugee children in Lebanon | Reliability (1/3); Validation (2/3) | |
| Child Positive Behavioral Index (CPBI) | Not specified | 3-4 years | Social and emotional skills/learning | Socio-emotional | | Caregiver | | Reliability (0/3); Validation (2/3) | Webpage |
| Strengths and Difficulties Questionnaire (SDQ) | Screening, Program evaluation | 2-4 years or 4-17 years | Social and emotional skills/learning | Socio-emotional | No information | Caregiver or teacher | | Reliability (2/3); Validation (3/3) | Webpage |
| International Development and Early Learning Assessment-Extended (IDELA-E) | Program evaluation | 3-5 years | Early Childhood Development/School readiness | Executive functioning, Socio-emotional | 20 minutes | Child | | Reliability (2/3); Validation (1/3) | Webpage |
| Response to Stress Questionnaire (RSQ) | Program evaluation | 5-15 years | Stress experiences and response | Stress | No info | Child | Syrian refugee children in Lebanon. Nigerian refugee children in Niger | Reliability (2/3); Validation (3/3) | Webpage |
| Mood and Feelings Questionnaire (MFQ) | Not specified | 5-15 years | Internalizing behaviors | Socio-emotional | | Child | Syrian refugee children in Lebanon | Reliability (1/3); Validation (1/3) | Webpage |
| Self-Regulation Assessment-Assessor Report (SRA-AR) | Population evaluation | 5-15 years | Emotional and behavioral self-regulation | Executive functioning | 60 minutes | Enumerator | Nigerian refugee and Nigerien children & Syrian refugee children in Lebanon | Reliability (2/3); Validation (3/3) | Webpage |
| Early Child Behavior Questionnaire-Short Form (ECBQ-S) | Program evaluation | 18-36 months | Emotional and behavioral self-regulation (executive functioning) | Stress | 60 minutes | Caregiver | | Reliability (1/3); Validation (2/3) | Webpage |

Used in FCV without adaptation: These tools have been used in FCV settings but, to the best of our knowledge, have not been specifically adapted for those contexts.[5]

| Tool | Purpose | Target age groups | Main domain | EF, SE, Stress & Mental Health | Time of application | Target respondent | FCV targeted groups | Reliability & Validation (*) | Webpage |
|---|---|---|---|---|---|---|---|---|---|
| AIM-ECD Direct Assessment (DA) | Program evaluation | 4-6 years | School readiness | Executive functioning, Socio-emotional | 25 minutes | Child | Syrian refugees in Jordan | Reliability (2/3); Validation (2/3) | AIM-ECD webpage |
| AIM-ECD Caregiver Report (CR) | Impact evaluation (with DA) and population monitoring | 4-6 years | School readiness | Executive functioning, Socio-emotional | 5 minutes | Caregiver | | Reliability (2/3); Validation (2/3) | AIM-ECD webpage |
| Ages and Stages (ASQ) | Screening, Program evaluation | 0-5 years and 6 months | Early Childhood Development | Socio-emotional | | Caregiver | | Reliability (0/3); Validation (0/3) | |
| CREDI long form | Impact evaluation | 0-3 years | Early childhood development | Socio-emotional, Mental health | 15 minutes | Caregiver | | Reliability (3/3); Validation (3/3) | Webpage |
| CREDI short form | Population monitoring | 0-3 years | Early childhood development | Socio-emotional, Mental health | <5 minutes | Caregiver | | Reliability (3/3); Validation (3/3) | Webpage |
| Early Childhood Development Index (ECDI2030)[6] | Population monitoring | 2-4 years | Early Childhood Development | | 3 minutes | Caregiver | | Reliability (N/A); Validation (N/A) | Webpage |
| Early Development Instrument (EDI) | Population monitoring, Program evaluation | Preschool age children | Early Childhood Development/School readiness | Socio-emotional | | Teacher | | Reliability (N/A); Validation (N/A) | Webpage |
| Global Scales for Early Development (GSED) | Population monitoring | 0-3 years | Early Childhood Development/School readiness | Socio-emotional | 30 minutes (short form); 60-120 minutes (long form) | Caregiver (short form); Child (long form) | | Reliability (0/3); Validation (0/3) | Webpage |
| Measure of Development and Early Learning Direct Assessment (MODEL-DA) | Program evaluation | 4-6 years | Early Childhood Development/School readiness | Executive functioning, Socio-emotional | 25-35 minutes | Child | | Reliability (N/A); Validation (N/A) | Webpage |
| Measure of Development and Early Learning Caregiver Report (MODEL-CR) | Program evaluation | 4-6 years | Early Childhood Development/School readiness | Executive functioning, Socio-emotional | 15-25 minutes | Caregiver or teacher | | Reliability (N/A); Validation (N/A) | Webpage |

Source: Authors' elaboration

(*) The number in parentheses reflects the number of indicators that fulfill the following criteria.
Reliability Indicator 1: Evidence of high internal consistency (e.g., with reported Cronbach's alpha statistics above 0.70).
Reliability Indicator 2: Evidence of consistency between trained enumerators and master enumerators (e.g., with inter-rater score agreement above 80 percent).
Reliability Indicator 3: Publication of reporting procedures (e.g., enumerator training checks, guidelines and training tests) to ensure consistency (recommended).
Validation Indicator 1: There is reported information on the alignment of the item/question/task content with the construct it intends to measure.
Validation Indicator 2: Evidence on inter-item correlations and/or factor analysis results.
Validation Indicator 3: Evidence on score correlations with other tools or measures of ECD outcomes and other relevant constructs.

[5] As of March 2023, the team did not find any validation work of these tools in FCV settings.
[6] The ECDI2030 tool deserves special attention for its intended purpose and alignment with measurement initiatives in FCV settings. UNICEF developed this measurement tool to monitor ECD outcomes globally and support countries in reporting on specific Sustainable Development Goal (SDG) indicators linked to early childhood development. Accordingly, ECDI2030 has been implemented in FCV and non-FCV settings following guidelines suggested by UNICEF for SDG monitoring and reporting purposes. We invite those interested in using the ECDI2030 to review the supplemental materials developed by UNICEF and contact UNICEF to ensure that this tool meets their measurement objectives.

Most of the tools designed specifically for FCV contexts measure social-emotional learning and behaviors as well as cognitive functions. Most of these tools target preschool and primary school-age children and have been designed for evaluating programs to support refugee children (e.g., Rohingya children in Bangladesh; Syrian refugee children in Lebanon, Iraq, and Turkey). The majority rely on information from the caregiver, but a few require the direct assessment of the child.

Tools that were specially adapted for the FCV context are usually shorter than the original tool.
During the tool adaptation process, teams usually kept only the most relevant items or scales of the original tools or added items to measure domains that were not included previously. In addition, implementation guidelines were adapted to the context (e.g., conducting the assessment on the floor instead of at a table, or using little stones as manipulatives). The adapted tools usually measure social-emotional skills and executive functions.

Mainstream tools that have been used in FCV settings but not specially adapted or validated for those contexts usually assess early childhood development and school readiness. Child development assessments usually measure domains of early numeracy, early literacy, executive functioning, motor development, language development, cognitive development, social-emotional skills, and health. Most of these tools are meant for population monitoring and, with some exceptions, they tend to target a younger group of children. These mainstream tools have robust evidence on reliability and validity at large, have been used in multiple countries, and are available in a variety of languages.

How to choose an ECD tool for an FCV context?

Based on previous work focused on choosing the right tools to measure ECD outcomes (Pushparatnam, Seiden, & Luna Bazaldua, 2022), the tool selection process can be guided by asking the following about candidate tools:
• The purpose for which they were designed (why)
• The relevant populations and age ranges with whom they are appropriate to use (who)
• The information about child development they produce, including the developmental domains, behaviors, skills, or other constructs that they assess (what)
• The manner in which they are administered to respondents (how)

STEP 1. Clarify the purpose of measurement: the "why"

Before choosing a tool, you will need to state the rationale for the collection of information and the intended use of the measurement tool. Are you planning to evaluate a specific program or policy?
Do you want to describe trends at the population level? Do you want to screen and identify individual children who may be at risk of developmental delays? Measurement tools are designed for different purposes. As shown in Box 1, ECD tools are typically used for population monitoring, program or impact evaluations, research, formative assessment, and screening, among others.

The intended purpose of the tool needs to be aligned with your measurement goals to draw meaningful and valid conclusions (Pushparatnam, Seiden, & Luna Bazaldua, 2022). For example, tools that have been developed for monitoring populations might not provide enough information or include guidelines to identify individual children at risk. Likewise, tools developed to analyze the impact of ECD interventions on specific developmental domains may not be suitable for teachers implementing formative assessment of learning outcomes.

Among the ECD tools identified in this review, most were designed for program evaluation and population monitoring (see Figure 1). Three tools can, according to their guidelines and previous uses, be used for more than one purpose. CREDI has a short form for population monitoring and a long form for impact evaluation. AIM-ECD includes a direct assessment for impact evaluation and a caregiver report version for population monitoring. EDI is commonly used for population monitoring and program evaluation. Tools initially developed for screening purposes have also been used for impact evaluations in FCV settings, including BITSEA, ASQ, and SDQ. Finally, several tools did not specify their measurement purpose (for example, CBPCS, CPBI, RACER, and MFQ).

Figure 1. Purpose of ECD tools used, adapted, or created for FCV settings.
[Bar chart. Vertical axis: number of tools. Categories: population monitoring, program evaluation, child screening, not specified.]
Source: Authors' elaboration

Box 1: Common purposes of ECD outcome measurement

1. Population monitoring consists of measuring ECD outcomes of a large representative sample of a given population of young children. In population monitoring studies, the focus is on aggregated information to describe trends at the population level rather than the individual scores on ECD outcomes of every child (for instance, a yearly survey of children aged 3 to 5 in a given context). ECD outcomes tend to be just one of many aspects being measured as part of the data collection effort; thus, measurement efforts with this purpose typically require brief, holistic ECD measurement tools.
2. Program/impact evaluation tests how a policy or intervention affects ECD outcomes. Typically, data is collected at multiple timepoints (with at least baseline and endline measures) and may attempt to follow a sample of children in treatment and control groups longitudinally. The developmental domain coverage of the tools will depend on the focus of the intervention and the specific developmental outcomes it targets (e.g., foundational reading skills or social-emotional development).
3. Formative assessments are most often used within classrooms by teachers to adjust teaching practice, to provide constructive feedback to children, and to offer additional opportunities to promote development and learning. Formative assessments are usually repeated frequently to determine learning progress and are usually not used for high-stakes decision making.
4. Screening for further evaluation or diagnosis is conducted to identify children who may be at risk of developmental delays and to help children access further needed services. The results of screening tools alone are usually insufficient to diagnose children, and instead are used to refer children to professionals for further evaluation and support.
5. Research to explore relationships or test hypotheses is often conducted by academics and research centers studying how children develop and what factors influence their development. Researchers often require much more intense measurement, often using multiple assessments with the same children on multiple occasions, but typically with smaller sample sizes than in impact evaluation or population monitoring.

Source: Pushparatnam, Seiden, & Luna Bazaldua (2022)

STEP 2. Identify the population of interest: the "who"

ECD tools need to be selected with the context in which they will be used in mind. The population of interest for FCV settings might be refugees, displaced children, or children exposed to violence and conflicts. In these contexts, the measurement approach might be more focused on assessing socioemotional behaviors and stress than in other global settings. Most global ECD measurement tools attempt to capture universal aspects of development (for instance, physical, motor, cognitive, and socioemotional development) but may miss important context-specific factors that are relevant for tool selection in FCV contexts, such as socioemotional development and coping behaviors. Hence, ECD population monitoring tools designed to be used with broader populations may require additional modules to properly capture relevant aspects of early childhood and the living conditions of children in FCV contexts (Pushparatnam, Seiden, & Luna Bazaldua, 2022).

The tools reviewed target either specific affected groups of children (for instance, refugees, children living in conflict or fragility) or children living in countries on the FCS list.
Tools targeting FCV-affected groups were mostly designed or adapted for those contexts, while those used in the general population were not necessarily adapted for use in FCV settings. Adaptation may begin with a conflict analysis to understand the local FCV context and the needs of children and families. For example, during the adaptation of IDELA to produce IDELA-E for the assessment of Rohingya children, the team added tasks to measure executive functioning and modified the implementation guidelines to reflect the local cultural and multilingual context. They also adapted the illustrations used in the assessment so that faces, clothing, and hairstyles resembled those of Rohingya children.
Most of the tools originally developed to assess affected children in FCV contexts have only been tested in one specific country or context. Only ISELA has been used more broadly around the world. ISELA is a scenario- and performance-based measure of social and emotional learning for low-resource and emergency contexts. It has been used in at least 12 countries, including the Democratic Republic of Congo, Egypt, Haiti, Iraq, Jordan, Kenya, Mexico, South Sudan, Tanzania, Syria, Thailand, and Uganda. As shown in Figure 2, among the FCS-listed countries, Lebanon and Niger are where most of the reviewed ECD tools have been applied.
Regarding the targeted group of children, nearly ten tools have been used to assess Syrian refugee children attending preschools in different countries (e.g., Lebanon, Jordan, Turkey, and Iraq). Some tools, such as RSQ and SRA-AR, have been used to assess conflict-affected Nigerian refugee children. Other tools, like IDELA-E, CBPCS, and CPBI, have been used to measure ECD outcomes in Rohingya children.

Figure 2. Use of ECD measurement tools in FCS-listed countries.
[Bar chart: number of tools used (vertical axis) by country.]
Source: Authors' elaboration

Age range.
The tools in this report cover the age range from 0 to 10 years old (and above). Most of the tools reviewed focus on older children. A few tools are designed for caregivers of children younger than three years old, such as ECBQ-S, BITSEA, ASQ, and CREDI. The first two are social-emotional and behavior assessments, with the ECBQ-S measuring executive functioning and the BITSEA measuring social-emotional development. CREDI measures socio-emotional development, mental health, motor, cognitive, and language skills. ASQ measures child development more broadly (including communication, gross motor, fine motor, and problem-solving) and includes a personal-social domain. The ECBQ-S has been adapted and validated for FCV contexts, while the other tools have been used in FCV settings without any reported psychometric validation for those contexts.
Another group of tools covers children aged 3 to 5 years. Some of these tools are caregiver-reported and others are direct child assessments. Most of them, including ECDI2030, AIM-ECD, IDELA-E, MODEL, EDI, and ASQ, assess overall child development and school readiness. SDQ, CPBI, and CBPCS focus on social-emotional behaviors.
Tools for children aged 5 or above are usually answered directly by the child. Those tools focus on responses to stress (RSQ), student-perceived quality of learning environments (CFSQ-SL), school readiness (ARCH-A), socio-emotional skills (SERAIS, ISELA, TOOLSEL), executive function in school settings (SRA-AR, RACER), and internalizing behaviors (MFQ).

Figure 3. Target age range of ECD measurement tools created, adapted, or used in FCV contexts.
Source: Authors' elaboration

Is there evidence of validity and reliability for these tools?
When selecting a tool for an FCV setting, it is important to check whether it has been validated in that setting or whether it has only been shown to work in a specific population.
FCV settings present a series of challenges in collecting high-quality data, making it especially important to ensure that tools have been adapted and tested for reliability and validity evidence before assuming that they work as expected. Three main types of reliability and validity are useful to consider in making this decision.
Reliability means that the child will be measured consistently, independently of the enumerator or other factors involved in the measurement process. The key features of reliability include internal consistency reliability and inter-rater reliability. Our team also includes the reliability of reporting procedures as a criterion for tool selection. Internal consistency reliability indicates how well the tool's items or tasks work together to measure a given developmental domain. Inter-rater reliability indicates how closely independent enumerators agree when scoring the same tool while assessing a child or interviewing a caregiver. Reliability of reporting procedures focuses on whether there are assessor training checks or published guidelines on using the tool that can help maximize accuracy in the scoring process.
The validity of a tool indicates the extent to which the tool measures what it is intended to measure and the extent to which its scores can be used to make valid inferences about individuals. Some tools are valid in one setting but, when adapted to another, may require additional evidence for their intended use. Some key features of validity include content validity, structural validity, and concurrent validity.
• Content validity refers to the extent to which the items and tasks are linked to the construct the tool intends to measure. For example, subject-matter experts can review the tool and provide feedback on the appropriateness and relevance of the items for the measurement of a developmental domain.
• Structural validity indicates that the tool's items reflect the underlying constructs that the tool is intended to measure.
For example, suppose a tool measures gross motor skills and language skills. In this example, statistical analyses would show that the items tend to cluster into two different groups, each aligned to its corresponding construct.
• Concurrent validity represents the extent to which the tool correlates with other tools that measure a similar construct. It is assessed by checking how strongly scores on the tool being validated correlate with scores on another tool that is known to measure the same construct or a similar one.
Some tools also report on measurement invariance. Similar to structural validity, this technical property indicates whether the tool measures a specific construct equivalently across different groups of interest, such as boys and girls, children in urban or rural schools, or children in different age ranges or developmental stages. For example, if a tool is intended to measure socioemotional development in the same way for preschool-aged children living in FCV contexts and those not living in them, empirical data should show whether there is any measurement bias in the construct between these two subgroups.
In terms of reliability, about half of all tools show evidence of high internal consistency, with a reported Cronbach's alpha above 0.7. Several tools report internal consistency but have one or more scales with suboptimal Cronbach's alpha (less than 0.7), such as the CFSQ-SL and ARCH-A. Other measurement papers do not report this information, such as those for the ECBQ-S and RACER, or report it in the initial measurement paper but not for an FCV setting, such as those for the CREDI, IDELA-E, BITSEA, MODEL, EDI, and ASQ. Three of the selected tools (ARCH-A, ISELA, and IDELA-E) provide evidence of inter-rater reliability between trained enumerators and master trainers (for example, inter-rater score agreement above 80 percent).
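The two reliability statistics cited above can be illustrated with a minimal sketch. It assumes item scores are stored as one list per child and that two raters scored the same children on the same item; the function names and the example data are hypothetical and are not drawn from any of the reviewed tools. Cronbach's alpha is computed with the standard formula, k/(k-1) × (1 − sum of item variances / variance of total scores), and inter-rater agreement is computed as the simple share of matching scores.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def percent_agreement(rater_a, rater_b):
    """Share of cases on which two raters gave exactly the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


if __name__ == "__main__":
    # Hypothetical data: 4 children scored on 3 pass/fail items.
    item_scores = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
    alpha = cronbach_alpha(item_scores)
    print(f"alpha = {alpha:.2f}", "(meets 0.7)" if alpha >= 0.7 else "(below 0.7)")

    # Hypothetical scores from an enumerator and a master trainer.
    agreement = percent_agreement([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
    print(f"agreement = {agreement:.0%}")
```

Percent agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferable when scores have few categories.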
Twelve tools have a publication or additional resources to support consistency and accuracy, such as enumerator training checklists, guidelines, and training tests.
Regarding validity, almost all the tools reported information on how the content of the items, questions, or tasks relates to the construct intended to be measured. Fourteen tools show evidence of inter-item correlations or factor analysis results, nine tools provide evidence of concurrent validity, and six conduct a measurement invariance analysis and show consistency in structure across key variables of interest such as age and gender.

Figure 4. Reliability and Validity
[Two panels, each grouping tools by the share of indicators fulfilled (1/3, 2/3, or 3/3). Panel "Fulfillment of reliability indicators" (see footnote 7) includes ECBQ-S, ARCH, CREDI, RSQ, SERAIS, ISELA, CFSQ-SL, SRA-AR, MFQ, AIM-ECD-DA, TOOLSEL, AIM-ECD-CR, IDELA-E, and SDQ. Panel "Fulfillment of validity indicators" includes IDELA-E, CFSQ-SL, CREDI, RACER, ECBQ-S, ISELA, MFQ, SERAIS, RSQ, AIM-ECD-DA, ARCH, AIM-ECD-CR, SRA-AR, CBPCS, SDQ, CPBI, TOOLSEL, and BITSEA.]
Source: Authors' elaboration
7 CREDI, ECDI2030, EDI, MODEL-DA, ASQ, and MODEL-CR have been used in FCV settings and validated in a variety of countries, but validity and reliability evidence from FCV settings specifically was not found. Therefore, they were not included in Figure 4. For the CPBI, RACER, BITSEA, and the CBPCS, no reliability evidence was found.

STEP 3. Map the relevant ECD domains or outcomes: the "what"
Mainstream measurement tools tend to focus on assessing cognitive, early literacy, early numeracy, motor, and socio-emotional development. However, in FCV contexts, the measurement of some additional constructs is relevant, including toxic stress, executive functioning (including emotional and behavioral self-regulation), internalizing behaviors, and externalizing behaviors. There is also a much stronger focus on social and emotional skills and their relationship with learning.
Tools that have been used in FCV settings, like ARCH-A, CREDI, ECDI2030, IDELA-E, AIM-ECD, MODEL, EDI, and ASQ, all assess overall child development and school readiness. They capture information on ECD outcomes across multiple developmental domains, such as motor functions, language, cognition, emergent literacy, numeracy, executive function, health, and social-emotional skills. CREDI (short form), AIM-ECD-CR, and ECDI2030 provide an overall score across domains, while ASQ, AIM-ECD-DA, and CREDI (long form) can also be used to estimate individual scores by domain.
Various other tools used in FCV contexts focus primarily on the social and emotional skills and learning domain. These tools include the SERAIS, ISELA, CBPCS, CPBI, BITSEA, SDQ, and TOOLSEL. All of these tools were created specifically for FCV settings, except for the CPBI, SDQ, and BITSEA, which were adapted. Other constructs of development that are measured in FCV settings include stress experiences and responses (RSQ), student perceptions of the quality of their learning environment (CFSQ-SL), executive functioning (ECBQ-S, SRA-AR, and RACER), and internalizing behaviors (MFQ).
Tools specifically created for FCV contexts tend to measure social-emotional skills and learning (SERAIS, ISELA, CBPCS, and TOOLSEL) or executive functioning (RACER). Tools adapted to FCV settings most often measure social-emotional skills and learning (CPBI, BITSEA, SDQ), early childhood development and school readiness (ARCH, IDELA-E), executive functioning (ECBQ-S, SRA-AR), or other constructs related to stress (RSQ) and quality of learning environments (CFSQ-SL).

Box 2: Capturing stress
Children living in FCV contexts tend to be more exposed to different sources of stress in their immediate environment. Experiencing toxic stress early in life disrupts brain development, hindering cognitive and socioemotional development (National Scientific Council on the Developing Child, 2010). Policies and interventions can address and reduce sources of toxic stress and help children to thrive in those settings.
Thus, it becomes necessary to assess toxic stress in a timely manner to inform policymakers about potential sources of stress, its impact on children and their families, and effective actions for stress reduction. Among the reviewed tools, RSQ, CBPCS, ECBQ-S, and ISELA measure stress directly. Even though CREDI's mental health domain contains some stress items, we do not include it here because it does not measure stress as a distinct construct.
The original Response to Stress Questionnaire (RSQ; Connor-Smith et al., 2000) was designed to capture the ways that individuals react to and cope with specific sources of stress, including parental depression, childhood cancer, family conflict, economic hardship, chronic pain, and academic problems, among others. Researchers have adapted the child self-report version of the RSQ-Academic Problems (RSQ-AP) to assess local and refugee children's stress experiences and stress responses in public school settings in Lebanon and Niger. This tool is the only one that measures stress explicitly, by asking children to rate their level of stress on a 4-point scale, with 4 indicating the most stress and 1 indicating no stress.
The International Social and Emotional Learning Assessment (ISELA) was developed by Save the Children to measure social and emotional learning (SEL) competencies in a way that could meet diverse programmatic and contextual needs, especially in emergencies. This tool is a scenario- and performance-based measure that helps to understand the development of self-concept, stress management, perseverance, empathy, and conflict resolution in children aged 6 to 12 years. It has been used with Kurdish-speaking Syrian refugee children in Iraq and in other countries, including DRC, Egypt, Haiti, Jordan, Kenya, Mexico, South Sudan, Tanzania, Syria, Thailand, and Uganda.
The Child Behavioral Problem and Competence Scale (CBPCS) was developed to assess behaviors of Rohingya children in Bangladesh related to competencies and social-emotional problems.
The items of this tool were derived primarily from the Early Development Instrument (EDI) and the Brief Infant-Toddler Social-Emotional Assessment (BITSEA). It provides scores for social, empathetic, anxious, and externalizing scales.
The original Early Child Behavior Questionnaire-Short Form (ECBQ-S; Putnam et al., 2006) is a 107-item questionnaire designed to measure temperament in toddlers. The adapted version of the ECBQ-S kept 75 items and has five factors: Effortful Control, Intensity, Emotional Regulation, Sociability, and Sensory Threshold. It does not have an explicit stress domain but includes items that capture stress, for example: "During everyday activities, how often did your child: Become distressed when his/her hands were dirty and/or sticky?" This tool was adapted for Syrian refugee children in Lebanon and for Niger.
In an FCV context, it might also be relevant to measure caregiver stress in addition to children's stress and coping mechanisms. The Parent Stress Scale (PSS) has been used for this purpose in several countries. We highlight this tool here, even though we do not include it in this review, because it focuses on parents and not on children. This tool assesses parental stress for both mothers and fathers and for parents of children with and without clinical problems. It contains various measures of stress, emotion, and role satisfaction, including perceived stress, work/family stress, loneliness, anxiety, guilt, marital satisfaction/commitment, job satisfaction, and social support. The PSS is a self-report measure containing 18 items that represent the positive themes of parenthood (emotional benefits, self-enrichment, personal development) and its negative components (demands on resources, opportunity costs, and restrictions). Parents are asked to agree or disagree with items in terms of their typical relationship with their child or children, and to rate each item on a five-point scale from strongly disagree to strongly agree.
The NYU team adapted the PSS for Rohingya caregivers in Bangladesh.

STEP 4. Consider logistical realities of data collection: the "how"
All the logistical aspects of how data will be collected weigh into the decision of which tool is a better fit for the data collection effort. Child development can be measured by assessing children directly or indirectly through their caregivers or teachers. Usually, direct assessment takes more time, requires more intensive training, and is more difficult to implement remotely than an indirect assessment method such as caregiver report. However, direct assessments usually provide a more accurate and nuanced understanding of the child's developmental status.
While all the caregiver reports reviewed could be implemented remotely if needed, only some direct assessments have that flexibility. This flexibility might be useful when it is difficult or unsafe to go into the field for data collection. Some tools, such as the RACER, a tablet-based measure of executive functioning, have already been implemented in FCV settings using technology that could potentially be deployed remotely. Other tools that rely solely on indirect assessment, such as the CREDI or ECDI2030, can easily be implemented remotely and have previously been implemented remotely in non-FCV settings. For example, a 2021 phone survey in Pakistan remotely assessed child development via the CREDI (Hentschel et al., 2023).
Direct assessment tools, such as the ARCH and ISELA, assess children directly through a series of tasks and games that the children must perform. Direct assessments that target older children could be implemented remotely if needed. For example, the CFSQ-SL asks children directly about their perception of security at school, and the SERAIS employs a scenario-based format in which children are asked to interpret the actions of others and report what they would do and feel in different social situations.
Direct assessment of younger children, however, cannot be implemented remotely, as it requires the assessor to observe the child's behaviors in real time. One example of this is the SRA-AR, which is based on enumerator observation and can only be implemented in person. To assess children's emotional regulation and attention/impulsivity, the enumerators observe the child during the course of another child assessment and rate how the child behaves throughout the entire assessment period.
As FCV contexts vary in how feasible in-person data collection is, given security and access issues, the ability to collect ECD data remotely provides a major opportunity. However, it is also worth noting that in-person direct assessment of ECD outcomes is considered the gold standard, and there are bias trade-offs in using an indirect assessment tool, especially when the tool is implemented remotely.
When deciding between an in-person and a remote tool, it is important to distinguish between ECD measurement activities carried out during crisis situations and those carried out in FCV settings outside of a crisis. For example, a country might be classified as FCV even though there has not been active conflict in several years. In that case, a remote tool might not be necessary, and data could be collected in person. However, it might still be useful to choose a tool that can also be administered remotely, so that if data collection is interrupted by the onset of conflict the team can easily adapt its data collection method while still capturing high-quality ECD data. This is especially important if the data collection efforts are longitudinal.

Figure 5. In-person or remote viability of implementation
Direct assessment
  In person: ARCH, ISELA, AIM-ECD-DA, IDELA-E, MODEL-DA
  Could be implemented remotely: RSQ, CFSQ-SL, SERAIS, RACER, MFQ
Caregiver report
  Could be implemented remotely: ECBQ-S, CREDI, ECDI2030, AIM-ECD-CR, BITSEA, SDQ, EDI, MODEL-CR
Enumerator observation
  In person: SRA-AR
Source: Authors' elaboration

Adapting and translating tools to the local context and languages might be time intensive. Selecting a tool that comes with adaptation guidance or is already available in the local languages can save resources and time. The tools used in a given FCV context might need to be translated and adapted into multiple languages. For example, the first language of children in refugee camps might be the language of the more established refugee groups, which might not necessarily be the language of the host country or of their home country. Among the reviewed tools, only CFSQ-SL, ISELA, CREDI, AIM-ECD, and ECDI2030 include guidelines for translation. All the mainstream tools that have been used in FCV settings have been translated into several languages (Arabic, Russian, Spanish, French, Portuguese, and others). Furthermore, CREDI and SDQ are available in more than 40 languages. Most of the tools created or adapted for FCV contexts are available in English and Arabic.
Almost all the tools include implementation support in the form of training slides, manuals, or implementation guidance. These resources are even more useful in FCV contexts, where local capacity is very scarce. Depending on the complexity of the tool, the training might need to be extended to make sure that enumerators reach the needed reliability. Among the reviewed tools, very few provide information regarding the length of the enumerator training required before implementation. This is a major limitation for two reasons.
First, in some of the highest-risk settings, enumerators often lack adequate training on how to measure ECD constructs, and the lack of guidance on the length of training required exacerbates this. Second, in these contexts, the availability of trained enumerators who are willing to travel to the high-risk setting is limited. As a result, clear guidelines are needed on the type of training, the length of training, and other guiding principles for quality assurance.
None of the reviewed tools report the estimated cost of implementation; however, all but two tools are open access. The cost of a given ECD tool tends to vary depending on the tool's complexity, which might call for more specialized enumerators or longer training, as well as on the time per child that it takes to implement the tool. Indirect assessments tend to be cheaper to implement because they are simpler and require less time and fewer resources to train enumerators. The cost of measuring ECD increases when there is a fee for using the tool, but only the ASQ and BITSEA charge for use; the rest of the identified tools are provided free of charge.
In summary, there are four key considerations to keep in mind when selecting a tool to assess child development in an FCV setting. These considerations are summarized in Figure 6.

Figure 6. Summary of Key Considerations

Key consideration: The manner in which they are administered to respondents (how)
Summary: In most contexts, the way in which a survey is administered might be considered last. However, in FCV settings with hard-to-reach populations, the "how" is often the most limiting factor. As such, the way the survey can be administered, remotely or in person, should be the first consideration. If it is not feasible to administer a survey in person, only tools that were developed with a remote option should be considered.

Key consideration: The purpose for which they were designed (why)
Summary: Decide if you need a tool for population monitoring, program evaluation, or child screening. Depending on the purpose of the data collection, select only tools that fit under the correct category.
• If the goal is to assess levels of child development across a population-level sample, only consider population-level assessments that are quick to administer and most often caregiver-reported.
• If the goal is to assess the impact of a program, a more granular child development assessment is needed, which might be caregiver-reported or a direct assessment and will provide more in-depth, domain-specific information about child development.
• If the goal is to screen children and identify at-risk children or another category of children at the individual level, a direct assessment tool is needed, which typically must be administered by an expert in the field of child development (e.g., a developmental psychologist or pediatrician).

Key consideration: Relevant populations and age ranges with whom they are appropriate to use (who)
Summary: Once you identify the purpose of the survey and how the survey will be administered, make sure that the tool was developed for the age range you intend to measure. For example, if the goal is to monitor child development among a population of 0-3-year-olds, make sure that the tool was developed and validated among 0-3-year-olds. At this phase, it is also recommended to select a tool that has been previously used and validated in the context of interest. This will not always be possible but should be one of the considerations when selecting a tool.

Key consideration: Information about child development they produce, including developmental domains, behaviors, skills, or other constructs that they assess (what)
Summary: Once the first three steps are taken (modality of survey administration is decided, level of survey identified, age and context considered), the type of child development to be measured should be selected. For example, an intervention might target specific types of child development such as cognitive skills or social emotional learning. Therefore, a tool must be selected that is granular enough to provide domain-specific child development scores. In other situations, only overall child development might be of interest.

Lastly, the reliability and validity of the tools should be considered. If the tool has been validated in the context of interest, with a population similar to the sample you are working with, that tool should be prioritized over one that has not been validated in an FCV setting before.

Challenges of measuring ECD in FCV settings
Measuring ECD outcomes can be especially challenging in FCV contexts. In addition to the multiple implementation issues that need to be considered for data collection in FCV settings, it is difficult to find reliable ECD measurement tools that have been validated in FCV contexts and that assess the relevant domains of development. This section summarizes some of the key challenges that could be faced.

How?
Data collection in FCV contexts differs in some respects from data collection in non-FCV contexts. FCV contexts present unique challenges due to the instability, insecurity, and lack of infrastructure that often characterize these environments. In such contexts, data collection efforts may be hampered by factors such as limited access to locations, lack of trust in enumerators, trauma that may prevent respondents from disclosing personal information, and a lack of resources (Puri, Aladysheva, Iversen, Ghorpade, & Bruck, 2016).
In FCV contexts, limited access can be caused by a variety of factors such as armed conflict, political instability, and/or natural disasters. These factors, in addition to poor infrastructure and inadequate transportation systems, often create physical barriers that restrict the movement of data collection teams. Due to security issues, the entry of outsiders into affected territories might be restricted by the government or by institutional safety guidelines.
If locals are allowed to move around safely, it is recommended to hire a local data collection team and train them remotely, if possible. In addition, remote data collection can be done via phone and text messages to reach children and families in remote or conflict-affected areas. However, this approach limits what can be collected in terms of child development outcomes, might increase the likelihood of social desirability bias among respondents, and relies on the availability of phones at home or in the community.
People living in FCV contexts may not trust enumerators. Asking a caregiver for their phone number or any other personal information might not be an issue in most non-FCV settings, but it can be a very sensitive issue among forcibly displaced populations, particularly if enumerators are perceived to be associated with specific stakeholders such as government agencies or international organizations. People may also worry that their personal information could be used for another purpose, such as surveillance, targeted violence, or other forms of repression. Therefore, it is important to carefully review all questions to be administered and, if possible, avoid including sensitive ones that may trigger mistrust and suspicion. If sensitive information is needed, allot an appropriate amount of time to explain how this information will be used.
Lack of human resources can be a significant challenge for data collection efforts in FCV contexts. FCV contexts usually have limited capacity to conduct data collection, with local organizations limited in their staff size and in the skills needed to carry out data collection activities. In addition, high levels of insecurity and instability make it difficult to recruit and retain qualified personnel who would need to work in high-risk areas.

Why?
Deciding what to measure and selecting appropriate tools in FCV contexts is particularly challenging due to the inherent instability and resource constraints of these settings.
While the purpose of data collection (population monitoring, program evaluation, or child screening) remains a universal consideration, the practical feasibility of implementing these tools is uniquely affected in FCV environments.
For population-level assessments, caregiver-reported tools are often the most practical due to their ease of administration. However, in FCV contexts, caregivers may themselves be experiencing significant stress or trauma, potentially affecting their ability to provide accurate information about their children's development. Additionally, cultural and linguistic diversity within displaced or refugee populations can make it difficult to adapt standardized tools appropriately for all groups.
Granular tools are essential for assessing program impacts, but their use in FCV settings is often constrained by a lack of trained professionals and logistical challenges in remote or crisis-affected areas. For example, some tools may require direct assessments conducted by trained experts, which are difficult to deploy when security concerns or political instability restrict access to affected communities. Furthermore, these tools often require considerable time, resources, and expertise, all of which may be in short supply in FCV settings.
Screening children to identify those at risk or in need of specialized interventions is particularly challenging in FCV contexts. Direct assessments, which often need to be administered by professionals such as developmental psychologists or pediatricians, are frequently unavailable due to the limited presence of specialized healthcare providers in such areas. This is compounded by the high turnover of humanitarian personnel and the difficulty in sustaining long-term capacity building for local professionals.
Additionally, the ethical considerations of identifying at-risk children in environments where follow-up interventions may not be consistently available add another layer of complexity to the use of screening tools in FCV contexts.
The overlapping challenges of security, resource constraints, and caregiver stress in FCV contexts make the selection and implementation of ECD measurement tools particularly complex. Tools must be adapted not only to meet the technical requirements of validity and reliability but also to address the unique cultural, linguistic, and logistical realities of crisis-affected populations. These challenges highlight the importance of ensuring that selected tools are feasible to administer in low-resource settings and that they provide actionable data that can inform immediate and longer-term interventions for children and their families.
Defining the purpose of data collection helps visualize key challenges and guide effective solutions in FCV contexts. By identifying resource gaps and tailoring interventions, stakeholders can improve outcomes and advocate for increased funding. Clear and purpose-driven data also strengthen the case for resource allocation and long-term investments in ECD initiatives in crisis-affected areas.

Who?
Children and caregivers in FCV settings are likely to have experienced trauma. Trauma can influence both learning in children and the care provided by caregivers. This needs to be taken into consideration in the instrument selected, the domains to be measured, the data collection effort, and the data analysis. For example, learning outcomes can be severely influenced by trauma, and hence it might be key to measure other constructs such as toxic stress, executive functioning, and behaviors. In addition, enumerators must be prepared to engage with children and caregivers in a sensitive, compassionate, and culturally appropriate manner.
It would be beneficial to train enumerators to recognize the signs of trauma and avoid triggers. Enumerators may also be trained to refer trauma-affected families to mental health and social services, or to provide them with information about these services.

What?

Socio-emotional development delays and stress are key issues to monitor in children living in FCV contexts. Mainstream ECD tools usually focus on cognitive development and less on socio-emotional development delays and stress, while tools created for FCV contexts tend to measure socio-emotional skills and learning or executive function. Given the close interrelation between socio-emotional development and learning, it is key to assess ECD in a more holistic manner in FCV settings and to use some of the instruments that have been used and validated in these settings.

Conclusions: Identification of gaps for future work

This review allowed our team to identify different early childhood measurement tools used in FCV settings. We hope that this summary of available tools will be useful for the stakeholders and decision-makers who need to conduct measurement efforts in FCV settings. We also hope that the key considerations regarding the why, what, who, and how of selecting ECD measurement instruments will help users identify the optimal tools for their projects. We expect the list of available measurement tools to grow as more countries and organizations scale up the implementation of measurement activities in FCV settings. Some final points for future work in this field include:

• Addressing gaps. Several gaps were identified in the analysis of ECD assessment tools used in FCV settings. For instance, only a handful of instruments have been tested in multiple FCV countries and contexts. In addition, there is a scarcity of instruments with high reliability and validity that could be implemented remotely.
Periodically updating the database of tools that have been used and adapted for FCV settings can help keep track of this evolving landscape. Future research could also address how to improve the reliability and validity of instruments that can be administered remotely.
• Translation and adaptation of tools. Most of the tools presented here are available in English and other international languages. In addition, some tools have no guidance for their translation and adaptation into local contexts. Future work, in collaboration with tool developers, should include the translation of tools and their supporting materials into more international languages and the development of more resources to guide the translation and adaptation into local languages and languages spoken by children affected by fragility, conflict, and violence.
• Expand the use of household surveys. Household surveys represent a unique opportunity to expand measurement efforts in contexts including FCV settings. However, many of the tools presented here have no information on their adaptation as household survey modules. Additional work with survey experts should include guidelines and considerations for adapting these tools as brief household survey modules that can help to increase the available information on children and their families.
• Ethical guidelines to measure children in FCV. While some tools included in this review were developed or adapted to monitor children in FCV contexts, little space is devoted to ethical considerations linked to measurement activities. Users of these tools and those responsible for any data collection should be aware of and adhere to ethical guidelines for any measurement activities, given the level of vulnerability of these populations.
It is recommended to work with international experts in ethics to expand the available guidelines and recommendations to ensure that children living in FCV contexts, their families, and other stakeholders are treated with respect and are ensured privacy and dignity before, during, and after any measurement activity.

Appendix 1. FCV-related definitions used in this note

Basic definitions: Fragile, conflict, and violent contexts and forcibly displaced population

Fragility and conflict situations (FCS). The World Bank classifies countries or territories affected by fragility and conflict situations (FCS) based on publicly available indicators and updates the list annually. The 2022 list included 37 countries and territories and was updated on July 1st.

Fragility, conflict, and violence (FCV) contexts include countries and territories that are affected by fragility, conflict, and/or violence. They also include countries or territories with large forcibly displaced populations.

Fragile countries are defined as those facing a high level of institutional crisis and social fragility, with poor transparency and government accountability or weak institutional capacity. Fragile situations tend to be characterized by deep grievances and/or high levels of exclusion, lack of capacity, and limited provision of basic services to foster the human development of the population. Fragile situations also tend to be characterized by the inability or unwillingness of the state to manage or mitigate risks, including those linked to social, economic, political, security, or environmental and climatic factors (World Bank Group, 2020).

Countries in conflict are defined based on a threshold number of conflict-related deaths relative to the population. Violent conflicts occur when organized groups, institutions, or even governments use violence to settle grievances or assert power (World Bank Group, 2020).
Countries with violence are identified based on the per capita level of intentional homicides. These countries have high levels of interpersonal and gang violence, including sexual and gender-based violence (SGBV), violence against children (VAC), and violence against minority groups (World Bank Group, 2020).

Forcibly displaced populations encompass internally displaced populations, refugees, and asylum seekers. The term does not include those displaced due to economic conditions; it includes only those who are displaced due to safety and security issues, which may lead them to settle in areas where there are limited employment opportunities (World Bank, 2017).

Internally displaced populations (IDPs). The Guiding Principles on Internal Displacement, established by the United Nations in 2004, define IDPs as "persons or groups of persons who have been forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of or in order to avoid the effects of armed conflict, situations of generalized violence, violations of human rights or natural or human-made disasters, and who have not crossed an internationally recognized State border" (UNCHR, 2004).

Refugees are individuals who have escaped from situations of war, violence, conflict, or oppression, and have fled their country or place of residence seeking a secure haven in a different country. The 1951 Refugee Convention defines a refugee as: "someone who is unable or unwilling to return to their country of origin owing to a well-founded fear of being persecuted for reasons of race, religion, nationality, membership of a particular social group, or political opinion" (UNHCR, 2010).

Asylum seekers are individuals who have requested international protection in another country due to experiencing danger, persecution, or other forms of harm in their home country. However, their appeal for protection is still being evaluated and has not yet been decided upon (World Bank, 2017).
Source: Based on UNCHR (2004, 2010) and World Bank Group (2017, 2020)

Bibliography

Aurino, E., & Giunti, S. (2021, August). Social protection for child development in crisis: A review of evidence and knowledge gaps. 37(2), 229–263. Retrieved from https://academic.oup.com/wbro/advance-article/doi/10.1093/wbro/lkab007/6305018
Corral, P., Irwin, A., Krishnan, N., Mahler, D. G., & Vishwanath, T. (2020). Fragility and Conflict. On the Front Lines of the Fight Against Poverty. Washington: World Bank. Retrieved from https://openknowledge.worldbank.org/bitstream/handle/10986/33324/9781464815409.pdf
Devakumar, D., Birch, M., Osrin, D., Sondorp, E., & Wells, J. C. (2014, April). The intergenerational effects of war on the health of children. BMC Medicine, 12(57). Retrieved from https://doi.org/10.1186/1741-7015-12-57
Fernald, L. C., Prado, E., Kariger, P., & Raikes, A. (2017). A Toolkit for Measuring Early Childhood Development in Low and Middle-Income Countries. Washington, DC: World Bank. Retrieved from https://openknowledge.worldbank.org/bitstream/handle/10986/29000/WB-SIEF-ECD-MEASUREMENT-TOOLKIT.pdf?sequence=1&isAllowed=y
Franke, H. A. (2014, November 3). Toxic Stress: Effects, Prevention and Treatment. Children, 1(3), 390-402. Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4928741/pdf/children-01-00390.pdf
Garrido, D., Siblini, K., Craig, S., Isabirye, P., Kemperman, M., Koehling, W., & Libresco, B. (2022). How to Improve Results in Situations of Fragility, Conflict & Violence: 12 Recommendations. Washington: World Bank. Retrieved from https://documents1.worldbank.org/curated/en/099752306082290620/pdf/IDU0ebb8a8210b053045560b63b0e292d1942e23.pdf
Hentschel, E., Tomlinson, H., Hasan, A., Yousafzai, A., Ansari, A., Tahir-Chowdhry, M., & Zamand, M. (2023, March). Risks to Child Development and School Readiness Among Children Under Six in Pakistan: Findings from a Nationally Representative Phone Survey. International Journal of Early Childhood. Retrieved from https://doi.org/10.1007/s13158-023-00353-2
Holt, S., Buckley, H., & Whelan, S. (2008, August). The impact of exposure to domestic violence on children and young people: A review of the literature. Child Abuse Negl, 32(8), 797–810. Retrieved from https://pubmed.ncbi.nlm.nih.gov/18752848/
Inter-agency Network for Education in Emergencies (INEE). (n.d.). Measurement Library | INEE. Retrieved September 2022 from https://inee.org/measurement-library
Mattina, G. L., & Shemyakina, O. N. (2017). Domestic Violence and Childhood Exposure to Armed Conflict: Attitudes and Experiences. University of Sussex. Brighton: Households in Conflict Network. Retrieved from http://www.hicn.org/wordpress/wp-content/uploads/2012/06/HiCN-WP255.pdf
Moving Minds Alliance; SEEk Development. (2020). Analysis of international aid levels for early childhood services in crisis contexts. Retrieved from https://movingmindsalliance.org/wp-content/uploads/2020/12/analysis-of-international-aid-levels-for-early-childhood-ser
National Scientific Council on the Developing Child. (2010). Persistent fear and anxiety can affect young children’s learning and development: Working paper no. 9. Boston, MA: NSCDC. Retrieved from https://developingchild.harvard.edu/resources/persistent-fear-and-anxiety-can-affect-young-childrens-learning-and-development/
Office of the High Commissioner for Human Rights. (2022, 07 15). United Nations. Retrieved September 2022 from Children affected by armed conflict and violence: https://www.ohchr.org/en/speeches/2022/07/children-affected-armed-conflict-and-violence
Phadera, L. (2019). Unfortunate Moms and Unfortunate Children: Impact of the Nepali Civil War on Women's Stature and Intergenerational Health. World Bank. Washington, DC: Policy Research Working Paper. Retrieved from https://openknowledge.worldbank.org/handle/10986/31999
Ponguta, L. A., Moore, K., Varghese, D., Hein, S., & Ng, A. (2022). Landscape Analysis of Early Childhood Development and Education in Emergencies. Journal on Education in Emergencies, 8(1).
Puri, J., Aladysheva, A., Iversen, V., Ghorpade, Y., & Bruck, T. (2016). Can rigorous impact evaluations improve humanitarian assistance? Journal of Development Effectiveness, 9(4), 519-542. Retrieved from https://www.tandfonline.com/doi/full/10.1080/19439342.2017.13882
Pushparatnam, A., Seiden, J. M., & Luna Bazaldua, D. A. (2022). Guiding Questions for Choosing the Right Tools to Measure Early Childhood Outcomes: Why, What, Who, and How. Washington, DC: World Bank. Retrieved from https://openknowledge.worldbank.org/handle/10986/37030
Rosser-Limiñana, A., Suriá-Martínez, R., & Mateo Pérez, M. (2020). Children Exposed to Intimate Partner Violence: Association Among Battered Mothers’ Parenting Competences and Children’s Behavior. International Journal of Environmental Research and Public Health, 17(4). Retrieved from https://doi.org/10.3390/ijerph17041134
Samara, M., Hammuda, S., Vostanis, P., El-Khodary, B., & Al-Dewik, N. (2020, November). Children’s prolonged exposure to the toxic stress of war trauma in the Middle East. Retrieved from https://www.bmj.com/content/371/bmj.m3155
Save the Children. (2019). Stop the War on Children. Retrieved from https://www.savethechildren.org/content/dam/usa/reports/ed-cp/stop-the-war-on-children-2019.pdf
Scharf, M. (2007, Spring). Long-term effects of trauma: psychosocial functioning of the second and third generation of Holocaust survivors. Dev Psychopathol, 19(2), 603-22. Retrieved from https://pubmed.ncbi.nlm.nih.gov/17459186/
Shonkoff, J. P., & Garner, A. S. (2012, January). The Lifelong Effects of Early Childhood Adversity and Toxic Stress. Pediatrics, 129(1), e232–e246. Retrieved from https://doi.org/10.1542/peds.2011-2663
UNCHR. (2004). Guiding Principles of Internal Displacement. UN. Retrieved from https://www.unhcr.org/43ce1cff2.pdf
UNHCR. (2010). Convention and Protocol Relating to the Status of Refugees. Geneva. Retrieved from https://www.unhcr.org/3b66c2aa10.html
UNHCR. (2020). Global Trends Forced Displacement in 2020. Retrieved from https://www.unhcr.org/statistics/unhcrstats/60b638e37/global-trends-forced-displacement-2020.html
UNHCR. (2021). Global Trends Forced Displacement in 2021. Retrieved from https://www.unhcr.org/62a9d1494/global-trends-report-2021
UNHCR. (n.d.). The UN Refugee agency. Retrieved March 2023 from What is a refugee?: https://www.unhcr.org/what-is-a-refugee.html
UNICEF. (2014). Early Childhood Development in Emergencies. Integrated Programme Guide. Rwanda. Retrieved from https://www.unicef.org/media/73736/file/Programme-Guide-ECDiE-2014.pdf.pdf
UNICEF. (2021). Children uprooted in a changing climate. New York. Retrieved from https://www.unicef.org/media/109421/file/Children%20uprooted%20in%20a%20changing%20climate.pdf
UNICEF. (2023). The climate-changed child: A children's climate risk index supplement. New York.
Waddoups, A. B., Yoshikawa, H., & Strouf, K. (2019, December). Developmental Effects of Parent–Child Separation. Annual Review of Developmental Psychology, 1(1), 387-410. Retrieved from https://doi.org/10.1146/annurev-devpsych-121318-085142
Ward, J., & Vann, B. (2002, December). Gender-based violence in refugee settings. The Lancet, 360(Special Issue), 13–14. Retrieved from https://doi.org/10.1016/S0140-6736(02)11802-2
Weaver, C. M., Borkowski, J. G., & Whitman, T. L. (2008, January). Violence Breeds Violence: Childhood Exposure and Adolescent Conduct Problems. J Community Psychol, 36(1), 96-112. Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3124247/
World Bank. (2017). Forcibly Displaced: Toward a Development Approach Supporting Refugees, the Internally Displaced, and Their Hosts. Washington, DC: World Bank. Retrieved from https://openknowledge.worldbank.org/handle/10986/25016 License: CC BY 3.0 IGO
World Bank. (2022, July 01). worldbank.org. Retrieved September 2022 from Classification of Fragile and Conflict-Affected Situations: https://www.worldbank.org/en/topic/fragilityconflictviolence/brief/harmonized-list-of-fragile-situations
World Bank. (Forthcoming). Policy Report: Strategies for Addressing Stunting and Learning in South Sudan. Washington, DC: World Bank.
World Bank Group. (2020). World Bank Group Strategy for Fragility, Conflict, and Violence 2020–2025. Washington, D.C. Retrieved from https://documents.worldbank.org/curated/en/844591582815510521/World-Bank-Group-Strategy-for-Fragility-Conflict-and-Violence-2020-2025
World Development Indicators. (2018). The World Bank. Retrieved September 2022 from World Development Indicators DataBank: https://databank.worldbank.org/indicator/SP.DYN.LE00.IN/1ff4a498/Popular-Indicators#
World Development Indicators. (2019). The World Bank. Retrieved September 2022 from World Development Indicators DataBank: https://databank.worldbank.org/indicator/SP.DYN.LE00.IN/1ff4a498/Popular-Indicators#
World Development Indicators. (2020). The World Bank. Retrieved September 2022 from World Development Indicators DataBank: https://databank.worldbank.org/indicator/SP.DYN.LE00.IN/1ff4a498/Popular-Indicators#

© 2025 The World Bank
1818 H Street NW, Washington DC 20433
Telephone: 202-473-1000; Internet: www.worldbank.org

Some rights reserved.

This work is a product of the staff of The World Bank. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of the Executive Directors of The World Bank or the governments they represent. The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries.
RIGHTS AND PERMISSIONS

The material in this work is subject to copyright. Because The World Bank encourages the dissemination of its knowledge, this work may be reproduced, in whole or in part, for noncommercial purposes as long as full attribution to this work is given.

Attribution: Please cite the work as follows: "Tamara Arnold, Elizabeth Lauren Hentschel, Diego Luna-Bazaldua, Juliana Chen Peraza and Fatine Guedira. 2025. Review and Guidance on ECD Assessment Tools in FCV Contexts. Washington, DC: The World Bank. Creative Commons Attribution CC BY 4.0 IGO."

All queries on rights and licenses, including subsidiary rights, should be addressed to World Bank Publications, The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA; fax: 202-522-2625; e-mail: pubrights@worldbank.org.

Graphics and layout design: Ana Ofelia Yanez Barragan

For more information on the World Bank's Early Childhood Development work: http://www.worldbank.org/en/topic/earlychildhooddevelopment#3
For more information on the World Bank's Early Learning Partnership: http://www.worldbank.org/en/topic/education/brief/early-learning-partnership

This guidance note was funded by the Early Learning Partnership (ELP). We would also like to acknowledge the participation of the following persons, who provided valuable information for this report: Anita Anastacio, Caroline Hiott, Mark Jordans, Ha Yeon Kim, Dana Charles McCoy, Jonathan Michael Seiden, Alice Jean Wuermli and Hirokazu Yoshikawa. Peer reviewers of this note included Adelle Pushparatnam, Ibrahima Samba, and Samira Nikaein Towfighian. We would like to acknowledge Aishwarya Khurana for her early contribution to this work. We would also like to thank Ella Victoria Humphry and Catalina Quintero for providing feedback on earlier versions of the note.
We would also like to acknowledge the support of the broader ELP team, led by Amanda Devercelli and Amer Hasan under the management of Halil Dundar, Practice Manager of the Global Knowledge and Innovation Team of the Education Global Practice at the World Bank.