EDUCATION NOTES February 2012 67346

Measuring Learning: How Effective Student Assessment Systems Can Help Achieve Learning for All

Why Assess Student Learning?

It is not enough for children to be enrolled in school and sitting in classrooms. For the benefits of education to accrue, children must be learning. But how do we measure whether children are learning, and what do we do with that information? As governments and development partners strive to improve student learning outcomes, it is vital to develop strong systems for assessing student learning.

The importance of learning assessment is linked to growing evidence that learning drives prosperity. Research finds that a one standard deviation increase in scores on international assessments of reading and mathematics is associated with a 2 percentage point increase in annual growth rates of GDP per capita (Hanushek and Woessmann, 2007).

Assessment is the process of gathering and evaluating information on what students know, understand, and can do in order to make an informed decision about what to do next in the educational process.

An assessment system is a group of policies, structures, practices, and tools for generating and using information on student learning. This system can support a variety of purposes and uses, such as informing instruction, determining progress, and providing data on accountability.

Research on assessment reveals that the right kinds of assessment activities, and the right uses of data generated by those activities, can contribute to better outcomes, such as better learning and better informed policy decisions (Clarke, 2011). Evidence on assessment shows:

• A link between high-quality, formative classroom assessment activities and better learning outcomes, as measured by student performance on standardized tests.

• A link between countries that have exit examination policies and higher performance levels on international assessments.

• A link between the use of data from large-scale assessments to hold schools and educators accountable and improvements in student learning outcomes.

Challenges to Assessing Student Learning

Today, too few countries have in place the policies, structures, practices, and tools that would constitute an effective student assessment system. This is particularly the case for low-income countries, which stand to benefit most from systematic efforts to measure student learning outcomes, particularly as testing is among the least expensive innovations in education reform (Hoxby, 2002; Wolff, 2007).

While some low-income countries have experimented with standardized assessments of student learning, participation in such assessments has often been ad hoc: neither integrated into an education strategy nor sustained over time. One-time assessments may generate shock value and create an opening for wider discussions of educational quality, but it is sustained assessment systems that allow countries to monitor learning trends and gain a better understanding of the relative contribution of various inputs and educational practices to changes in student learning outcomes.

What Do Assessment Systems Look Like?

Assessment systems tend to be made up of three main kinds of assessment activities, corresponding to three main information needs or purposes:

• classroom assessments -- which provide real-time information to support teaching and learning in individual classrooms;

• examinations -- which are used to make decisions about the progress of individual students through the education system (such as certification or selection for university entry); and

• large-scale survey assessments -- which monitor learning trends and provide both policy- and practitioner-relevant information on overall performance levels in an education system, and contributing factors.

Classroom assessments are those carried out by teachers and students in the course of daily activity. They include a variety of activities, tools, and procedures for collecting and interpreting written, oral, and other forms of evidence on student learning. Research shows a strong link between effective classroom assessment activities and better student learning outcomes as measured by performance on standardized tests, with the largest gains being made by low achievers (Black and Wiliam, 1998).

Examinations provide information for crucial decisions about individual students: for example, whether they should be promoted to the next grade level, assigned to a particular type of school or academic program, graduate from high school, or gain admission to university (Greaney and Kellaghan 1995; Heubert and Hauser 1999). The high-stakes nature of most examinations means they exert a backwash effect on the education system in terms of what is taught and what is learned. This in turn affects the skills and knowledge profile of graduates. Because exams can have negative consequences for individual students, particularly those from disadvantaged groups, their uses and outcomes must be carefully monitored.

Large-scale survey assessments are designed to provide information on system performance levels and related or contributing factors (Greaney and Kellaghan 2008), typically in relation to an agreed set of standards or learning goals. Assessment results inform both educational policy and practice. Examples include international assessments of student achievement levels such as TIMSS, PIRLS, and PISA; regional assessments such as PASEC in Francophone Africa, SACMEQ in Anglophone Africa, and LLECE in Latin America; national-level assessments such as SIMCE in Chile;[1] and sub-national assessments such as state-level tests in the USA and Canada.

Education systems can have very different profiles in these three assessment areas. For example, Finland’s education system emphasizes classroom assessment as a key source of information on student learning and draws less on external examinations. On the other hand, China has traditionally placed considerable emphasis on examinations as a means to sort and select from its large student population, and less on classroom assessments (although this is changing).

A recent study (Darling-Hammond and Wentworth 2010) reviewed the practices of high-performing education systems around the world (e.g., Australia, Finland, Hong Kong, Singapore, Sweden, UK), and found that the assessment systems in these countries:

• Provide feedback to students, teachers, and schools about what has been learned and “feed-forward” information that can shape future learning, as well as guide college- and career-related decision making.

• Closely align curriculum expectations, subject and performance criteria, and desired learning outcomes.

• Engage both teachers and students.

• Prioritize quality over quantity with respect to standardized testing.

[1] TIMSS: Trends in International Mathematics and Science Study; PIRLS: Progress in International Reading Literacy Study; PISA: Programme for International Student Assessment; PASEC: Programme d’Analyse des Systèmes Educatifs (Program on the Analysis of Education Systems); SACMEQ: Southern and Eastern Africa Consortium for Monitoring Educational Quality; LLECE: Laboratorio Latinoamericano de Evaluación de la Calidad de la Educación (Latin American Laboratory for Assessment of the Quality of Education); SIMCE: Sistema de Medición de la Calidad de la Educación (Assessment System for Measuring the Quality of Education).

Effective Assessment Systems

The effectiveness of an assessment system depends on the quality of the information that it generates, particularly for decision making. Major drivers of information quality are (Clarke, 2011):

• the enabling context -- the broader context in which assessment activity takes place and the extent to which that context is supportive of assessment;

• system alignment -- the extent to which assessment activities are aligned with the rest of the education system; and

• assessment quality -- the technical quality of the instruments, processes, and procedures used for assessment activity.

To be effective, comprehensive systems for learning assessment must feed into education practice and policies that improve student learning.

The enabling context refers to the broader legislative or policy framework for assessment activities; the institutional and organizational structures for carrying out and using the results from assessment activities; the availability of sufficient and stable sources of funding; and the presence of trained human resources. The enabling context is important to get right because it is a key driver of the long-term quality and effectiveness of an assessment system, and no assessment system is sustainable in its absence.

System alignment includes the connection between assessment activities and system learning goals, standards, and the curriculum (Fuhrman and Elmore 1994). Alignment involves more than a simple match between what is tested and what is in the curriculum. For example, while the correspondence between a given country’s curriculum and what is tested on international assessments such as PISA and TIMSS may be low, an assessment might still be aligned with (and useful for informing) the overall goals of its education system or related reforms underway. Indeed, the use of data from TIMSS, PIRLS, and PISA to identify drivers of performance, and to monitor the impact of reforms on performance over time, has been key to the improvement of achievement levels in countries as diverse as Brazil, Jordan, and Poland.

The need for good assessment quality applies not only to large-scale survey assessments, but to any kind of assessment activity (AERA, APA, and NCME 1999). If an assessment is not sound in terms of its design, implementation, analysis, reporting, or use, it may contribute to poor decision making with respect to student learning and system quality. Two overarching technical issues for any assessment activity are reliability and validity. Reliability is whether an assessment activity produces precise data, a particularly important consideration for high-stakes examinations and for monitoring trends over time. Validity is whether test scores represent intended values and are used in intended ways. Test score validity can, for example, be threatened by a difference between the language of instruction and the language of testing, which may make it difficult for a child to show what he or she knows and can do. Validity considerations include careful consideration of the consequences of the uses of test scores, including the social, economic, and other impacts on different groups in the population.

Assessment systems, and the activities that comprise them, may be characterized according to four levels of development:

• latent -- at the beginning stages,

• emerging -- on the way to meeting an acceptable minimum standard,

• established -- acceptable minimum standard, and

• advanced -- best practice.

Systems that make the shift from emerging to established (acceptable minimum standard) are distinguished by a concerted focus on reforms, inputs, and practices that strengthen the enabling context for assessment.

The framework shown in Table 1 illustrates these development levels (with the exception of latent, which basically represents the absence of any assessment activity), based on specific recommended indicators for the enabling context, system alignment, and assessment quality. The indicators are most relevant to examinations and large-scale survey assessment activities but, with some modifications, can be applied to classroom assessment.

Table 1. Levels of Assessment System Development

Enabling context
  Emerging:
    • No or limited policy framework
    • Few trained staff; high turnover
    • Unreliable funding
    • Unclear institutional structures/arrangement
  Established:
    • Presence of policy framework
    • Training programs/trained staff with low turnover
    • Stable funding
    • Clear institutional structures/arrangement

System alignment
  Emerging:
    • Assessments not fully aligned with learning goals, standards, and curriculum
    • Assessments out of sync with reforms in other areas
  Established:
    • Assessments aligned with learning goals, standards, curriculum
    • Assessments in sync with reforms in other areas

Assessment quality
  Emerging:
    • Limited awareness or application of technical or professional standards
  Established:
    • Awareness and application of technical or professional standards

Advanced (all three dimensions)
  The same as for Established + strong focus on:
    • Assessment for learning
    • School-based and classroom assessment
    • Role of teachers
    • Innovation and research-based practices

Source: M. Clarke, 2011.

References

AERA (American Educational Research Association), APA (American Psychological Association), and NCME (National Council on Measurement in Education). 1999. Standards for Educational and Psychological Testing. Washington, DC: AERA.

Black, P., and D. Wiliam. 1998. “Assessment and Classroom Learning.” Assessment in Education: Principles, Policy and Practice, 5(1), 7-73.

Clarke, M. 2011. “Framework for Building an Effective Student Assessment System.” READ/SABER Working Paper. Washington, DC: World Bank.

Darling-Hammond, L., and L. Wentworth. 2010. “Benchmarking Learning Systems: Student Performance Assessment in International Context.” Stanford Center for Opportunity Policy in Education, Stanford University, CA.

Fuhrman, S., and R. Elmore, eds. 1994. The Governance of Curriculum. Alexandria, VA: ASCD.

Greaney, V., and T. Kellaghan. 2008. Assessing National Achievement Levels in Education. Washington, DC: World Bank.

Hanushek, E., and L. Woessmann. 2007. Education Quality and Economic Growth. Washington, DC: World Bank.

Heubert, J., and R. Hauser. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: National Academy Press.

Hoxby, C. 2002. “The Cost of Accountability.” NBER Working Paper Series, Vol. w8855. Available at SSRN: http://ssrn.com/abstract=305599.

Wolff, L. 2007. “The Costs of Student Assessment in Latin America.” PREAL Working Paper 38. Partnership for Educational Revitalization in the Americas (PREAL), Washington, DC.

Education Notes is a series produced by the World Bank to share lessons learned from innovative approaches to improving education practice and policy around the globe. Background work for this piece was done in partnership, with support from the Russia Education Aid for Development (READ) program. For additional information or hard copies, please go to www.worldbank.org/education or contact the Education Advisory Service: eservice@worldbank.org.

Author: Marguerite Clarke
Photo Credit: Liang Qiang/World Bank