PLANNING A LARGE-SCALE ASSESSMENT OF STUDENT ACHIEVEMENT: WHAT ARE ESSENTIAL ELEMENTS OF THE ASSESSMENT FRAMEWORK?

Diego Luna-Bazaldua and Marguerite Clarke

© 2022 International Bank for Reconstruction and Development / The World Bank
1818 H Street NW, Washington, DC 20433
Telephone: 202-473-1000; Internet: www.worldbank.org

Some rights reserved

This work is a product of the staff of The World Bank with external contributions. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of The World Bank, its Board of Executive Directors, or the governments they represent. The World Bank does not guarantee the accuracy, completeness, or currency of the data included in this work and does not assume responsibility for any errors, omissions, or discrepancies in the information, or liability with respect to the use of or failure to use the information, methods, processes, or conclusions set forth. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries. Nothing herein shall constitute or be construed or considered to be a limitation upon or waiver of the privileges and immunities of The World Bank, all of which are specifically reserved.

RIGHTS AND PERMISSIONS

This work is available under the Creative Commons Attribution 3.0 IGO license (CC BY 3.0 IGO) http://creativecommons.org/licenses/by/3.0/igo. Under the Creative Commons Attribution license, you are free to copy, distribute, transmit, and adapt this work, including for commercial purposes, under the following conditions:

Attribution—Please cite the work as follows: Luna-Bazaldua, Diego, and Marguerite Clarke. 2022. Planning a Large-Scale Assessment of Student Achievement: What Are Essential Elements of the Assessment Framework? Washington, DC: World Bank.
License: Creative Commons Attribution CC BY 3.0 IGO Translations—If you create a translation of this work, please add the following disclaimer along with the attribution: This translation was not created by The World Bank and should not be considered an official World Bank translation. The World Bank shall not be liable for any content or error in this translation. Adaptations—If you create an adaptation of this work, please add the following disclaimer along with the attribution: This is an adaptation of an original work by The World Bank. Views and opinions expressed in the adaptation are the sole responsibility of the author or authors of the adaptation and are not endorsed by The World Bank. Third-party content—The World Bank does not necessarily own each component of the content contained within the work. The World Bank therefore does not warrant that the use of any third-party-owned individual component or part contained in the work will not infringe on the rights of those third parties. The risk of claims resulting from such infringement rests solely with you. If you wish to reuse a component of the work, it is your responsibility to determine whether permission is needed for that reuse and to obtain permission from the copyright owner. Examples of components can include, but are not limited to, tables, figures, or images. All queries on rights and licenses should be addressed to World Bank Publications, The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA; e-mail: pubrights@worldbank.org. Cover and interior design: Danielle Willis, World Bank. Table of Contents 1. Introduction .......................................................................................................................................... 1 2. Overview and context for the assessment exercise ............................................................................. 3 3. 
Content and skills to be covered by the assessment ........................................................................... 4 4. Additional instruments to be used as part of the exercise.................................................................. 8 5. Assessment frameworks for different large-scale assessment activities............................................ 9 6. Conclusions ......................................................................................................................................... 15 References................................................................................................................................................ 16 Annex A. Checklist of assessment framework contents ....................................................................... 17 Acknowledgments Work on this guidance note was led by Diego Luna-Bazaldua and Marguerite Clarke. The team received support and inputs from Julia Liberman and Victoria Levin and worked under the overall guidance of Omar Arias (Practice Manager, Education Global Knowledge and Innovation Team). Peer reviewers included Juan Baron (Senior Economist, HSAED), Amira Kazem (Senior Operations Officer, HMNED), and Huma Kidwai (Senior Education Specialist, HAEE2). Additional valuable inputs were received from Laura Gregory, Victoria Levin, Julia Liberman, and Alonso Sanchez. The guidance note was designed by Danielle Willis. 1. INTRODUCTION An important part of planning a large-scale assessment of student achievement is developing the framework for the assessment. In many education projects financed by the World Bank and other international organizations, the development of an assessment framework is included as one of the results indicators for the project. This note addresses frequently asked questions about the contents of an assessment framework. Examples of assessment frameworks are included at the end. 
An assessment framework describes what is being assessed, why it is being assessed, how it is being assessed, and who is being assessed. A well thought-through assessment framework supports the development of appropriate assessment instruments, sample designs, implementation strategies, analytical approaches, and reporting structures (Box 1). Indeed, assessment frameworks are one of the building blocks for strengthening national assessment systems. At the same time, frameworks are only guiding documents and should be flexible enough to allow for necessary changes to the design, conduct, or analysis of the assessment in response to conditions on the ground. This guidance note discusses how to develop an assessment framework for a large-scale assessment exercise that will provide information on overall learning levels in an education system. Typically, such assessments are national in scope, but they may also be sub-national or international. BOX 1. Why is It Important to Create an Assessment Framework? An assessment framework plays a major role in ensuring the quality and credibility of a large-scale assessment exercise. Specifically, developing an assessment framework will help: • ensure that relevant content and skills are covered by the assessment; • guide the development of appropriate assessment tools (including test questions and background questionnaires); • strengthen transparency and public dissemination of information about the assessment process; and • facilitate the interpretation and use of the results. Source: INEE & CENTRO UC Medición MIDE (2019). As shown in Box 2, the development of an assessment framework is one of the initial steps in planning a large-scale assessment, lasting around three to six months depending on the number of grades and subjects involved and the resources allocated to the task. 
While the national assessment agency is commonly in charge of developing the framework document, the contents typically must be agreed with, and approved by, the Ministry of Education or the National Steering Committee that oversees the assessment activities. The assessment framework should include information that helps drive all subsequent steps in the process, such as the subjects to be assessed, language(s) of assessment, the target student population, assessment format and length, adaptations or accommodations for students with disabilities, and appropriate interpretations and uses of the results. More detailed information on the steps listed in Box 2 can be found in Anderson and Morgan (2008).

BOX 2. What are the Main Stages and Activities in a National Large-Scale Assessment (NLSA)?

Stage 1: Creating an enabling context for the large-scale assessment.
Step 1. Developing appropriate policies to support the NLSA program
Step 2. Forming a unit responsible for implementing NLSA activities
Step 3. Ensuring appropriate budget and staff for carrying out key assessment activities

Stage 2: Assessment preparation.
Step 4. Assessment frameworks: Establishing what will be assessed and how
Step 5. Sample design and selection of schools and students
Step 6. Informing stakeholders about the upcoming assessment
Step 7. Assessment items and background questionnaire development
Step 8. Printing, organizing, and distributing the assessment instruments

Stage 3: Assessment administration.
Step 9. Test administrator training
Step 10. Test administration

Stage 4: Data cleaning and analysis.
Step 11. Test scoring and data management
Step 12. Data analysis

Stage 5: Assessment results dissemination.
Step 13. Reporting of results

Adapted from Anderson & Morgan (2008) and Clarke & Luna-Bazaldua (2021).

An assessment framework specifies the body of knowledge and skills to be assessed.
The reference point for this body of knowledge and skills is usually the official learning standards or curriculum for the education system in which the assessment is to take place. Textbooks, policy documents, national learning targets, international learning standards, and frameworks of international large-scale assessments may also be used as reference points. During the development of the framework, assessment agencies may consult with expert advisory boards, policymakers, and other stakeholder groups that can provide inputs and review the draft document. The framework document should make the assessment process and the assumptions behind it transparent, not just for the test developers but also for a much larger audience, including teachers, curriculum personnel, and policymakers. According to Anderson and Morgan (2008), the assessment framework allows different stakeholders to share a common understanding of the assessment objectives; the knowledge, subject area, or skills to be assessed; the characteristics of the items, questions, or tasks that will be used to assess that knowledge, subject area, or skills; and how results will be interpreted and used. The framework can also become a reference point for generating quality assurance protocols and can facilitate a consistency of approach across multiple administrations of the assessment program, which in turn increases the comparability of results over time.

The key components of an assessment framework document are:
(a) Overview and context for the assessment exercise;
(b) Content and skills to be covered by the assessment; and
(c) Additional instruments to be used as part of the exercise.

2. OVERVIEW AND CONTEXT FOR THE ASSESSMENT EXERCISE

Assessment framework documents usually have an initial chapter that provides an overview of the assessment exercise and situates it within a specific context.

• Introduction. Provides a summary and overview of the contents of the document.
• Country context.
Commonly includes information about the broader education system; organizations responsible for the development and implementation of the large-scale assessment program; a review of learning assessments implemented in the past; previous and planned participation in international large-scale assessment studies; general alignment between the large-scale assessment program and the national curriculum or learning goals; and the role of the Ministry of Education in approving the assessment framework and the assessment activities linked to it. • Overview of the large-scale assessment. General information about the subjects or skills to be assessed; school grades or age of the target student population; relevance of the assessment for the cohort(s) of students to be assessed; data collection plan (e.g., sample or census-based approach, sampling design to be used for sample-based assessments); administration mode (e.g., paper-based, computer-based, online); links with previous assessment studies (e.g., the use of common items, questions, or tasks; capacity to capture learning trends over time); sources of information consulted to produce the assessment framework; and any background or contextual questionnaires to be used as part of the study. • Planned uses and reporting of the assessment results. This section describes how the assessment results will be interpreted (e.g., whether raw scores will be converted into scaled scores, proficiency levels, or percentiles) and used (e.g., monitoring learning at the system level, resource allocations to schools, revisions to the curriculum and learning materials, education policy reforms). It may also include information about the planned dissemination strategies for sharing the assessment results with different audiences. 
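As an illustration of the score-reporting choices described above, converting raw scores into proficiency levels usually amounts to comparing each student's score against a set of agreed cut scores. The sketch below is purely illustrative: the cut scores and level labels are hypothetical, not taken from any actual assessment; real programs set cut scores through formal standard-setting procedures (e.g., Bookmark or Angoff methods).

```python
# Hypothetical cut scores separating proficiency levels on a raw-score scale.
# These numbers and labels are invented for illustration only.
CUT_SCORES = [
    (0, "Below basic"),
    (20, "Basic"),
    (35, "Proficient"),
    (45, "Advanced"),
]

def proficiency_level(raw_score: int) -> str:
    """Return the label of the highest level whose cut score the raw score meets."""
    label = CUT_SCORES[0][1]
    for cut, name in CUT_SCORES:
        if raw_score >= cut:
            label = name
    return label

# A raw score of 38 meets the 35-point cut but not the 45-point cut.
print(proficiency_level(38))  # Proficient
```

Stating the intended conversion in the framework document, even at this level of simplicity, makes the later reporting of results transparent and reproducible across assessment rounds.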
It is good practice for the assessment framework document to include detailed information on:
• school and student sampling procedures;
• booklet design, in the case of assessments with more than one test version;
• assessment adaptations for students with disabilities;
• language(s) of assessment, particularly in multicultural contexts;
• permitted item types and scoring procedures;
• inclusion of items from previous assessment rounds to calculate learning trends over time;
• description of the score scale and proficiency levels to be used to describe student achievement; and
• approaches to reporting assessment results.

For an in-depth discussion of these methodological issues, please consult the National Assessments of Educational Achievement book series. Clarke and Luna-Bazaldua (2021) provide a summary of the key methodological considerations.

3. CONTENT AND SKILLS TO BE COVERED BY THE ASSESSMENT

There is usually a dedicated chapter for each subject or knowledge domain to be assessed, which should have been discussed with relevant stakeholder groups and approved by the Ministry of Education. Each of these chapters should include:

• Subject or knowledge domain definition. This is the theoretical definition of the subject or knowledge domain to be assessed. It usually includes a set of interrelated attributes (e.g., behaviors, knowledge, skills, attitudes, values) that will be assessed. This definition is developed after reviewing documentation related to the national curriculum and research on the subject or knowledge domain. Information on the alignment between the subject or knowledge domain to be assessed and the national curriculum should be included (for instance, specifying national curriculum topics or objectives that will be covered by the assessment). The theoretical description of the knowledge domain may be accompanied by information on its relevance to students' later success in school or life.
o Description of subdomains (or subskills).
In the case of domains that have several facets or components (e.g., each of the subskills that allow a student to read with comprehension, including phonemic awareness, letter recognition, and word recognition, among others), it is useful to define each of them conceptually. • Operationalization of the subject or knowledge domain. This involves specifying the types of items, questions, or tasks (e.g., multiple-choice items, open-ended short-answer items, essays scored using a rubric, performance tasks) that will be included in the assessment. Typically, this is done using examples of each question type. In many assessment frameworks, this section also specifies the cognitive taxonomy1 of the questions or tasks to be included in the assessment. • Subject or knowledge domain blueprint. The test blueprint (also known as the specifications table) arranges key information in tables that list the data that must be collected, the test length, and the proportion of items that will address various aspects of the targeted subject or knowledge domain (Anderson & Morgan, 2008). For national large-scale assessments, these domains are frequently linked to official national learning standards; for international large-scale assessments, these domains are determined by consultations with experts and analysis of the common learning standards of participating countries. Three key elements in the test blueprint are (Box 3 and Table 1 provide more information): o Subject or knowledge sub-domains. The test blueprint should cover all relevant sub-domains under the domain of interest. For instance, if a test is measuring arithmetic skills, specific sections of the table should be devoted to the sub-skills of addition, subtraction, multiplication, and division. o Cognitive level. For each subdomain, the test blueprint should indicate the proportion of items at different cognitive levels. 
Footnote 1: In educational psychology, cognitive taxonomies help specialists to understand and rank tasks from simpler (e.g., rote knowledge memorization) to more cognitively complex (e.g., applying knowledge to solve problems in new contexts). Subject matter experts (e.g., teachers) and assessment specialists can use these taxonomies to identify the appropriate cognitive level linked to specific learning goals, and to produce questions, items, or tasks that measure the desired cognitive level. Perhaps the best-known taxonomy in the context of schooling is Bloom's Taxonomy.

o Indicators. Indicators are written statements that describe students' observable actions to be measured using items, questions, or tasks. Indicators must be directly related to the subject or knowledge sub-domain they intend to measure; each knowledge sub-domain usually has one or more indicators linked to it to denote different types of tasks or items that can provide evidence of what a student knows and can do. Indicators must be clear enough to allow different subject matter experts to develop items aligned to the indicator description. For instance, an algebra indicator could be: "the student can solve an equation with one unknown located on both sides of the equation."

BOX 3. What are the Desired Characteristics of a Test Blueprint?

A good blueprint should indicate the following:
• The proportion of items in the final test form that address each subject or knowledge domain and subdomain (for example, mathematics, literacy and language, science, or social studies).
• The proportion of items within a subject or knowledge domain that assess different skills (for example, in mathematics—number, measurement, space, and pattern; in writing—ideas, content knowledge, structure, style, vocabulary, spelling, and grammar).
• The proportion of items that address different cognitive processing skills (such as knowledge facts or recall, interpretation, or reflection).
• The proportion of multiple-choice and open-ended items.

Adapted from Anderson & Morgan (2008).

TABLE 1. What Elements Should be Included in a Test Blueprint?

About the sub-dimensions or knowledge domains:
• Do the included sub-domains represent the entire knowledge domain? Are relevant content areas excluded?
• Are the sub-domains exclusive (in other words, there is no content overlap among sub-domains)?
• Are the sub-domains labeled according to their content?

About the indicators (e.g., learning objective specifications to produce items, questions, and tasks):
• Are the indicators coherent with the knowledge domain and sub-domain to which they belong?
• Do the indicators represent clear evidence of the presence of the knowledge domain?
• Do the indicators refer to observable behaviors, and are they therefore measurable through questions with a specific format?
• Are the indicators necessary and sufficient to fully cover the entire knowledge domain?
• Are the indicators clearly written, and can they be understood unequivocally by different users?
• Do the indicators elicit a single performance? Or do the indicators require the achievement of more than one skill during the assessment?

Test blueprint as a whole:
• As a whole, is the test blueprint aligned with the assessment objective and with the knowledge domain?
• Does the test blueprint reflect learning objectives included in the curriculum?
• Does the test blueprint adequately represent the complexity of the knowledge domain?
• Is the test blueprint proportionate to the planned length of the assessment instrument(s)?
• Does the test blueprint represent a clear and useful guide for the development of the items, questions, or tasks?

Source: INEE & CENTRO UC Medición MIDE (2019).

Table 2 provides an example of a test blueprint for an early grade reading assessment with 30 items.
The blueprint covers three main reading skills: oral language comprehension, language decoding, and reading comprehension. Each reading skill is broken into subskills, and these are mapped to indicators linked to specific learning standards that describe the types of tasks to be developed to assess the subskills. The blueprint specifies the number of items that will measure each indicator; summing these items gives the total assessment length. The proportionate representation of each reading skill is captured by a domain weight, computed as the proportion of items that measure that skill. Having this information included in an assessment framework can help assessment agencies and Ministries of Education better communicate with key stakeholders about what is being measured and how the assessment is aligned with key learning objectives in the curriculum.

TABLE 2. Example of an Early Grade Reading Assessment Blueprint

Reading skill: Comprehension of spoken language (6 items; domain weight: 20%)
• Retrieve information at word level
o Comprehend spoken language at the word or phrase level (1 item)
o Recognize the meaning of common grade-level words in a short, grade-level continuous text read to the learner (2 items)
• Retrieve information at sentence or text level
o Retrieve explicit information in a short grade-level continuous text read to the learner (1 item)
• Interpret information at sentence or text level
o Interpret information in a short grade-level continuous text read to or signed for the learner (2 items)

Reading skill: Decoding (8 items; domain weight: 27%)
• Precision
o Identify symbol-sound/fingerspelling and/or symbol-morpheme correspondences (3 items)
o Decode isolated words (3 items)
• Fluency
o Say or sign a grade-level continuous text at pace and with accuracy (2 items)

Reading skill: Reading comprehension (16 items; domain weight: 53%)
• Retrieve information
o Recognize the meaning of common grade-level words (3 items)
o Retrieve explicit information in a grade-level text by direct- or close-word matching (3 items)
o Retrieve explicit information in a grade-level text by synonymous word matching (2 items)
• Interpret information
o Identify the meaning of unknown words and expressions in a grade-level text (2 items)
o Make inferences in a grade-level text (3 items)
o Identify the main idea in a grade-level text (3 items)

Total number of items: 30 (100%)

4. ADDITIONAL INSTRUMENTS TO BE USED AS PART OF THE EXERCISE

Many large-scale assessment exercises involve the administration of background questionnaires to collect information on home and school factors that may be related to student achievement. There should be a dedicated chapter in the framework document that describes the topics to be covered by these questionnaires, specific indicators to be included (or created from the data collected), likely relationships between indicators and student learning outcomes, and potential uses of the data. For more information on the contents of background questionnaires and their role, please consult Clarke and Luna-Bazaldua (2021).

5. ASSESSMENT FRAMEWORKS FOR DIFFERENT LARGE-SCALE ASSESSMENT ACTIVITIES

Large-scale assessment activities differ in their objectives, intended uses of assessment results, and capacity to compare student achievement internationally. The principal large-scale assessment activities include national large-scale assessments (NLSA) for monitoring the overall status of education systems at particular grade or age levels, international large-scale assessments (ILSA) for providing comparative monitoring feedback on the status of education systems in a particular region or globally, and high-stakes examinations (HSE) for making selection or certification decisions about students as they progress through the education system. Clarke and Luna-Bazaldua (2021) discuss the similarities and differences among these large-scale assessment activities in detail. Despite their different objectives, all of these large-scale assessment activities commonly start with the development of an assessment framework.
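The domain weight column in the Table 2 blueprint above is simply each reading skill's share of the 30 items. A minimal sketch of that arithmetic (item counts taken from Table 2; the code itself is illustrative and not part of any assessment standard):

```python
# Item counts per reading skill, taken from the Table 2 blueprint.
ITEM_COUNTS = {
    "Comprehension of spoken language": 6,
    "Decoding": 8,
    "Reading comprehension": 16,
}

def domain_weights(counts):
    """Each skill's share of the total item count, rounded to whole percentages."""
    total = sum(counts.values())  # 30 items in this blueprint
    return {skill: round(100 * n / total) for skill, n in counts.items()}

# 6/30 = 20%, 8/30 ~ 27%, 16/30 ~ 53%, matching the domain weight column.
print(domain_weights(ITEM_COUNTS))
```

Keeping this calculation explicit in the framework makes it easy to check that any later change to the item counts still preserves the intended emphasis on each skill.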
Table 3 discusses some of the main features that differentiate the contents and uses of assessment frameworks for NLSAs, ILSAs, and HSEs. For instance, while typical sources of information for both NLSAs and HSEs would be the national curriculum, national learning standards, and available teaching and learning materials, NLSA frameworks might also draw on global learning frameworks, such as the Global Proficiency Framework (2020). Assessment frameworks for HSEs may also be defined based on consultations with stakeholders regarding the necessary skills that students should have for the next education level or their entry into the labor force. In the case of ILSAs, sources of information for the assessment framework generally include consultations with stakeholders from all participating countries, identification of common elements across national curricula, and aspects of global learning frameworks.

TABLE 3. Features of Assessment Frameworks for Different Large-Scale Assessment Activities

Source of information to develop the assessment framework:
• NLSA: national curriculum and learning standards; teaching and learning materials aligned to curriculum; global frameworks and learning standards.
• ILSA: consultations with policymakers and experts from participating countries; analysis of common learning standards from participating countries; global frameworks and learning standards.
• HSE: national curriculum and learning standards; teaching and learning materials aligned to curriculum; consultations with experts and employers regarding necessary skills for the next education level or the labor force.

Organization responsible for the development of the assessment framework:
• NLSA: national assessment agency.
• ILSA: international organization in charge of the international study.
• HSE: examinations board.

Dissemination of the assessment framework:
• NLSA: publicly available online; printed materials for different stakeholder groups; test blueprints shared with item writers and item reviewers.
• ILSA: publicly available online; printed materials for different stakeholder groups; test blueprints shared with item writers and item reviewers.
• HSE: a summary with publicly disclosable information is included online; test blueprints shared with item writers and item reviewers; a list of assessment topics is commonly included in study guides and preparation materials for exam takers.

Other information included:
• NLSA: topics to be covered by background questionnaires; sampling plans; temporal comparability of results; reporting approaches; accommodations for students with disabilities.
• ILSA: topics to be covered by background questionnaires; sampling plans; temporal comparability of results; reporting approaches; accommodations for students with disabilities.
• HSE: accommodations for students with disabilities; grading and results reporting.

Examples: Content structure of national and international assessment frameworks

A key difference between the frameworks for national and international large-scale assessments is that the former are typically based on a country's national curriculum and learning standards, while the latter tend to cover a broader set of skills or content that are considered relevant by stakeholders from most or all countries participating in the international study (Clarke & Luna-Bazaldua, 2021). National and international assessment framework documents usually cover the three broad categories discussed before. Examples of the tables of contents from these documents are included below.

I. Programme for International Student Assessment (PISA) 2018 Assessment and Analytical Framework

1. What is PISA? [Category: Overview and context for the assessment]
• What makes PISA unique
• The PISA 2018 test
• An overview of what is assessed in each domain
• The evolution of reporting student performance in PISA
• The context questionnaires
• A collaborative project

2. PISA 2018 Reading Framework [Category: Content and skills to be covered]
• Introduction
• Defining reading literacy
• Organizing the domain
• Assessing reading literacy
• Reporting proficiency in reading
(For the sake of brevity, only chapter 2 headings are listed. A similar approach is used for chapters 3 to 6.)

3. PISA 2018 Mathematics Framework
4. PISA 2018 Science Framework
5. PISA 2018 Financial Literacy Framework
6. PISA 2018 Global Competence Framework

7. PISA 2018 Questionnaire Framework [Category: Additional instruments to be used]
• Introduction
• Defining the questionnaire core in PISA 2018
• Coverage of policy issues in PISA 2018

8. PISA 2018 Well-being Framework [Category: Additional instruments to be used]
• Introduction
• Well-being as a multi-dimensional construct
• Addressing measurement challenges
• Suggested quality of life indicators

II. Progress in International Reading Literacy Study (PIRLS) 2016 Assessment Framework

Introduction [Category: Overview and context for the assessment]
• PIRLS 2016—Monitoring trends in reading literacy achievement
• History of the PIRLS, PIRLS Literacy, and ePIRLS international assessments
• Updating the PIRLS 2016 framework for assessing reading achievement
• Policy relevant data about the contexts for learning to read
• Using PIRLS data for educational improvement

1. PIRLS 2016 Reading Framework [Category: Content and skills to be covered]
• Definition of reading literacy
• Overview of the PIRLS framework for assessing reading achievement
• PIRLS framework emphases in PIRLS, PIRLS Literacy, and ePIRLS
• Purposes for reading
• Processes of comprehension
• Introducing ePIRLS—An assessment of online informational reading
• ePIRLS—Assessing the PIRLS comprehension processes in the context of online informational reading
• Selecting PIRLS and PIRLS Literacy passages and ePIRLS online texts

2. PIRLS 2016 Context Questionnaire Framework [Category: Additional instruments to be used]
• Home contexts
• School contexts
• Classroom contexts
• Student characteristics and attitudes toward learning

3. Assessment design for PIRLS, PIRLS Literacy, and ePIRLS in 2016 [Category: Methodological design of the assessment]
• Student population assessed
• Reporting reading achievement
• PIRLS and PIRLS Literacy booklet design
• Question types and scoring procedures
• Releasing assessment materials to the public
• ePIRLS 2016 design
• Context questionnaires and the PIRLS 2016 encyclopedia

III. National Assessment of Educational Progress (NAEP) 2019 Reading Framework

1. Overview [Category: Overview and context for the assessment]
• NAEP overview
• Overview of NAEP reading assessment

2. Content and Design of NAEP in Reading [Category: Content and skills to be covered]
• Texts to be included on the NAEP reading assessment
• Literary text
• Informational text
• Characteristics of texts selected for inclusion
• Vocabulary on the NAEP reading assessment
• Cognitive targets
• Item types

3. Reporting results [Category: Methodological design of the assessment]
• Legislative provisions for NAEP reporting
• Achievement levels
• Reporting NAEP results
• Reporting state NAEP results
• Reporting trend data

IV. Program for the Analysis of Educational Systems of the CONFEMEN (PASEC) 2014 Early Grade Assessment Framework

1. Introduction [Category: Overview, justification, and context for the early grade literacy and numeracy assessments]
• Context for the development of the early grade literacy and numeracy PASEC assessments
• PASEC's approach for early grade assessment development
• Relevance of foundational literacy and numeracy skills
• Importance of assessing students in early grades

2. PASEC 2014 Early Grade Literacy Assessment Framework [Category: Context, content, and skills to be covered in the early grade reading assessment]
• Scientific and international references for the early grade reading assessment framework
• Learning to read and related factors to improve reading outcomes
• School reading programs in participating countries
• Domains and skills assessed in the PASEC early grade literacy assessment
• Language of instruction
• Listening comprehension skills covered in PASEC 2014
• Writing skills covered in PASEC 2014
• Reading comprehension skills covered in PASEC 2014
(For the sake of brevity, only chapters 2 and 3 headings are listed.)

3. Assessment design for PASEC 2014 context questionnaires [Category: Methodological design of the assessment. The context questionnaire framework is a separate report in PASEC 2014.]
• PASEC's global approach for the study of educational context
• Framework for the context questionnaires
• Specifications of context questionnaires
• Questionnaire for students in grade 2
• Questionnaire for students in grade 6
• Questionnaire for teachers and principals in grades 2 and 6
• Country questionnaire

V. India's National Achievement Survey (NAS) 2021 Framework

1. Introduction [Category: Overview and context for the assessment]
• Description and overview of Assessment Framework for NAS 2021
• Overview of curricular areas and school grades assessed

2. Assessment frameworks for classes 3, 5, 8 and 10 [Category: Content domains and subdomains to be assessed in NAS 2021]
• Modern Indian Languages
• Reading Comprehension
• Mathematics
• Environmental Science / Science
• Social Science

3. Annexes [Category: List of measurable learning outcomes assessed in NAS over time]
• Learning outcomes measured in previous rounds of NAS
• Learning outcomes measured in NAS 2021

VI. United Kingdom's Key Stage 2: English Reading Test Framework

1. Overview [Category: Overview and context for the assessment]
• Purposes of statutory assessment
• Nature of the assessment
• Population to be assessed
• Test format

2. Contents of reading test [Category: Content domains and subdomains to be assessed in the reading assessment, cognitive complexity, test specifications, and marking criteria. Assessment results interpretation and intended uses of the results.]
• Content definition
• Cognitive domain
• Test specifications
• Breadth and emphasis of the test
• Format of test questions and responses
• Marking and mark schemes
• Reporting of results
• Desired psychometric properties
• Performance descriptors

3. Diversity and inclusion [Category: Assessment methodology and administration considerations to promote assessment fairness]
• Access arrangements
• Pupils with English as an additional language
• Glossary of terminology

6. CONCLUSIONS

The assessment framework is an important guiding document for any large-scale assessment activity. Therefore, ensuring its quality is critical to improving the design, implementation, analysis, and use of the assessment results. At the same time, the assessment framework contributes most when other enabling factors are in place. For instance, the leadership within the Ministry of Education and stakeholder groups in society must value the assessment process and understand how to use the assessment results. Likewise, sufficient financial, human, and time resources must be allocated for the successful implementation of all assessment activities. Moreover, large-scale assessments will be more impactful when they are coherently aligned with other elements of the education system, including the curriculum and learning standards, learning materials for teachers and students, and opportunities to strengthen teachers' assessment competencies. For more information about these enabling factors, readers are encouraged to review the World Bank's National Assessments of Educational Achievement book series.
REFERENCES
Anderson, P., & Morgan, G. (2008). National Assessments of Educational Achievement. Volume 2: Developing Tests and Questionnaires for a National Assessment of Educational Achievement. Washington, DC: World Bank.
Clarke, M., & Luna-Bazaldua, D. (2021). Primer on Large-Scale Assessments of Educational Achievement. Washington, DC: World Bank.
Government of the United Kingdom. (2022). National curriculum assessments: Test frameworks. https://www.gov.uk/government/publications/key-stage-2-english-reading-test-framework
INEE & CENTRO UC Medición MIDE. (2019). Cuadernillo técnico de evaluación educativa 3: Definición del referente de la evaluación y desarrollo del marco de especificaciones [Technical notebook on educational evaluation 3: Defining the assessment referent and developing the specifications framework]. México: INEE. https://www.inee.edu.mx/wp-content/uploads/2019/08/P2A353.pdf
Ministry of Education, Government of India. (2021). National Achievement Survey 2021: Technical note on assessment framework. New Delhi, India: Ministry of Education. https://nas.education.gov.in/reportAndResources
Mullis, I. V. S., & Martin, M. O. (Eds.). (2015). PIRLS 2016 Assessment Framework (2nd ed.). Boston College, TIMSS & PIRLS International Study Center. http://timssandpirls.bc.edu/pirls2016/framework.html
National Assessment of Educational Progress. (2020). Assessment frameworks. https://nces.ed.gov/nationsreportcard/assessments/frameworks.aspx
OECD. (2019). PISA 2018 Assessment and Analytical Framework. Paris, France: OECD Publishing. https://doi.org/10.1787/b25efab8-en
PASEC. (2014). Cadre de référence des tests PASEC2014 de langue et de mathématiques de début de scolarité primaire [Reference framework for the PASEC2014 language and mathematics tests at the start of primary schooling]. http://www.pasec.confemen.org/wp-content/uploads/2016/03/PASEC_2014_CADRE_REFERENCE_TEST_2A.pdf
USAID, UNESCO, UKAID, ACER, Bill & Melinda Gates Foundation, & World Bank. (2020). Global Proficiency Framework for Reading: Grades 1 to 9. https://www.edu-links.org/resources/global-proficiency-framework-reading-and-mathematics

ANNEX A.
CHECKLIST OF ASSESSMENT FRAMEWORK CONTENTS
The following checklist can be used to determine whether an assessment framework document includes the minimum necessary information.
1. Overview and context
1.1. Description of the local context and the broader education system. [ ]
1.2. Description of the organizations responsible for the development and implementation of the large-scale assessment program. [ ]
1.3. Information on the alignment between the large-scale assessment program and the national curriculum and learning goals. [ ]
1.4. Description of the sources of information consulted to produce the assessment framework. [ ]
1.5. Information on whether the large-scale assessment is a one-off activity or part of an ongoing program. [ ]
1.6. Guidelines regarding the interpretation of the assessment scores. [ ]
1.7. Information on the intended uses of the assessment scores. [ ]
1.8. Planned dissemination activities for the assessment results. [ ]
1.9. Methodological section indicating the constructs, subjects, or knowledge domains to be assessed, and the school grades or age of the target student population. [ ]
1.10. Methodological section indicating whether the large-scale assessment will be a sample- or census-based study. [ ]
1.11. Methodological section indicating whether the large-scale assessment will consist of a paper-and-pencil, computer-based, or mixed-format administration. [ ]
1.12. Methodological section on strategies to psychometrically link and compare results over time: use of common items, use of common students, and psychometric models to link scores. [ ]
1.13. Methodological section describing the number of test versions and the design to psychometrically link scores. [ ]
1.14. Methodological section describing the item types to be used in the assessment and the item scoring procedures. [ ]
2. Assessment domains
2.1. Theoretical definition of each subject or knowledge domain to be assessed. [ ]
2.2. Theoretical definition of the subdomains. [ ]
2.3. Alignment of the measured domains with the national curriculum. [ ]
2.4. Description and examples of the methods to be used to measure the chosen domains. [ ]
2.5. Test blueprint for each domain to be assessed. [ ]
3. Additional instruments
3.1. Topics to be covered and indicators to be included in the background/contextual questionnaires. [ ]
3.2. Target respondents (for instance, students, teachers, school principals) for the background/contextual questionnaires. [ ]
3.3. Student, teacher, and school identifiers that will be used to merge data at different levels of aggregation. [ ]