Health Policy Evaluation Guideline
Saudi Health Council
Kingdom of Saudi Arabia
2025

Acknowledgment

This guideline was jointly developed by the Saudi Health Council (SHC) and the World Bank (WB) under the Reimbursable Advisory Services (RAS) program between the Kingdom of Saudi Arabia (KSA) and the WB. It was prepared under the strategic direction of Khalid Almoteiry (SHC) and Volkan Cetinkaya (WB). Core contributions were provided by the SHC team—Shahad Al-Homidi, Ahlam Alshehri, Ali Alqaysi, Rana Saber, Khalid Farhan, Abdulaziz Alrabiah, and Mushabab Al-Asiri—and the WB team—Parviz Ahmadov and Isa Aliyev. Richard Crabbe provided editorial services and Dania Kibbi handled design and production.

We are grateful to the National Advisory Team for their technical guidance and oversight throughout the process. The team included Bader Alamer (Health in All Policies General Secretariat); Ahmed Almokharshum (Ministry of Health); Faisal Sunaid (Saudi Food and Drug Authority); Muhannad Almughthim (Saudi Commission for Health Specialties); Dalia Alghamdi (Ministry of Defense); Thamir Alsaiary (Ministry of National Guard – Health Affairs); Haitham Bader (Ministry of Interior); Ziyad Rayes (King Faisal Specialist Hospital and Research Centre); Ziyad Almalki (Ministry of Education); and Razan Aljaser (Council of Health Insurance).

We also thank H.E. Dr. Nahar Al-Azemi (Secretary General, SHC) and Michele Gragnolati (Practice Manager for Health, Nutrition, and Population, Middle East and North Africa Region, WB) for their overall support and leadership.

This guideline is a technical product of the SHC–World Bank RAS program. It is intended to support institutional capacity-building and the standardization of health policy evaluation practices in KSA. As a living document, it may be further refined over time through implementation experience and stakeholder feedback.

Table of Contents

Glossary of key terms
Foreword
Executive Summary
Introduction
What is health policy evaluation?
Why evaluate health policies?
When to evaluate health policies?
What are the types of health policy evaluation?
  Formative Evaluation
    What is Formative Evaluation in Health Policy?
    Why is Formative Evaluation Important in Health Policy?
  Process Evaluation
    What is Process Evaluation in Health Policy?
    Why is Process Evaluation Important in Health Policy?
  Outcome evaluation
    What is Outcome Evaluation in Health Policy?
    Why is Outcome Evaluation Important?
  Impact evaluation
    What is Impact Evaluation?
    Why is Impact Evaluation Important?
    When Should an Impact Evaluation Be Conducted?
  Economic evaluation
    What is an Economic Evaluation?
    Why is Economic Evaluation Important?
How to Conduct Health Policy Evaluations
  Which Steps to Follow in Conducting Health Policy Evaluations?
  Step 1: Identify and Engage Stakeholders
  Step 2: Describe the Health Policy
  Step 3: Design Evaluation
  Step 4: Collect Data
  Step 5: Justify Conclusions
  Step 6: Use and Disseminate Evaluation Findings
References
Annexes
  Annex 1: Regulatory Impact Analysis
    What is Regulatory Impact Analysis (RIA)?
    Why is Regulatory Impact Analysis Important?
    When is a Regulatory Impact Analysis Performed?
  Annex 2: Rapid Evidence Assessment
  Annex 3: Key Questions for Each Evaluation Type (Non-Exhaustive)
  Annex 4: Sample Data Collection Plan
  Annex 5: Prioritized Health Status and Health System Indicators
  Annex 6: Key Assessment Points to Validate Indicators
  Annex 7: A Sample Informed Consent Template for Health Policy Evaluations
  Annex 8: Consolidated Evaluator Checklist for the Six-Step Evaluation Process
    Step 1: Identify and Engage Stakeholders
    Step 2: Describe the Health Policy
    Step 3: Design Evaluation
    Step 4: Data Collection and Analysis
    Step 5: Justify Conclusions
    Step 6: Use and Disseminate Evaluation Findings

List of Tables

Table 1: Key Differences Between Policy Monitoring and Evaluation
Table 2: Common Health Policy Evaluation Types
Table 3: Stakeholder Groups under the policy on "Mandatory Medical Malpractice Insurance for Other Health Practitioners" (sample)
Table 4: Roles of Stakeholders in the Evaluation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (sample)
Table 5: Stakeholder Involvement Plan in the Evaluation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (sample)
Table 6: Key Health Policy Components
Table 7: Commonly Used Research Methods
Table 8: Types of Theory-Based Evaluation Methods
Table 9: Detailed Overview of Experimental and Quasi-Experimental Methods
Table 10: Overview of Commonly Used Value-for-Money Evaluation Methods
Table 11: Overview of Commonly Used Evidence Synthesis Methods
Table 12: Illustrative Indicators Developed from the Evaluation Findings of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners
Table 13: Acceptability Check of Evaluation Findings
Table 14: Dissemination Plan (Sample case from the Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy)

List of Figures

Figure 1: Types of Evaluation and their Interactions with Policy Components
Figure 2: Overview of the Steps and Standards of Health Policy Evaluation
Figure 3: Stakeholder mapping for the policy on "Mandatory Medical Malpractice Insurance for Other Health Practitioners" (sample)
Figure 4: Logical Framework Template
Figure 5: Illustrative Logical Framework Developed from the Evaluation Findings of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners
Figure 6: Choosing Experimental and Quasi-Experimental Methods
Figure 7: Selecting Evidence Synthesis Methods
Figure 8: Illustrative Data Flow and Mapping Visualization Developed from the Evaluation Findings of the Mandatory Medical Malpractice Insurance Policy for Health Practitioners
Figure 9: Steps in Ensuring Data Quality

Glossary of key terms

This section presents key terms and definitions used throughout this Health Policy Evaluation Guideline. It aims to establish a shared understanding among policy makers, evaluators, researchers, and practitioners involved in designing, implementing, or using health policy evaluations in the Kingdom of Saudi Arabia.

Adaptive Evaluation: A flexible evaluation approach that evolves in response to real-time implementation challenges and emerging findings.
Assumptions: The conditions or factors believed to hold true in the Theory of Change or logic model, which, if incorrect, can affect the success or validity of the evaluation.
Attribution vs. Contribution: Attribution assigns direct causal responsibility to a policy for observed changes. Contribution recognizes the policy as one of several influencing factors in a complex system, without exclusive causal claims.
Baseline Data: Data collected before policy implementation, serving as a reference point to assess changes and policy impact.
Benchmarking: Comparing policy performance against standards, past performance, or peer jurisdictions to assess effectiveness and identify improvement opportunities.
Causal Attribution Methods: Techniques used to determine whether observed effects can be causally linked to a policy, including randomized control trials, difference-in-differences, and instrumental variables.
Causal Chain: A logical sequence showing how policy inputs and activities lead to outputs, outcomes, and impacts. Often visualized through logic models.
Cost-Effectiveness Analysis (CEA): A form of economic evaluation that compares the relative costs and outcomes of different interventions to assess value for money.
Counterfactual: An estimate of what would have occurred in the absence of the policy, used in causal attribution and impact evaluation designs.
Data Governance: The framework of rules, procedures, and ethical principles that guide the collection, management, protection, and use of data in health evaluations.
Disability-Adjusted Life Year (DALY): A composite measure of disease burden that combines years of life lost due to premature death (YLL) and years lived with disability (YLD). One DALY represents one lost year of healthy life.
Disaggregation: The breakdown of evaluation data by subgroups (for example, age, gender, region) to assess differential effects and promote equity.
Equity Assessment: The analysis of how a policy affects different population groups, particularly vulnerable or underserved populations, to promote fairness.
Ethical Clearance: Approval from an ethics review body that ensures that evaluations meet national and international human subject protection standards.
Evaluation: A systematic, objective, and time-bound process of assessing a health policy's design, implementation, and results. It provides evidence on whether the policy was appropriately formulated, implemented as intended, and effective in achieving its intended outcomes or impacts.
Evaluation Methods: The tools and techniques used to gather evidence in evaluations, including qualitative methods such as interviews; quantitative methods, for example, surveys; or mixed methods.
Evaluation Type: The specific focus of an evaluation, including regulatory impact analysis, process, outcome, impact, or economic evaluation, aligned with different stages of the policy lifecycle.
Evaluation Utilization: The process of applying evaluation findings to improve policy design, implementation, scaling, or de-implementation decisions.
External Factors: Influences outside the policy – for example, economic shocks, pandemics, or parallel reforms that may affect its outcomes or impacts – which must be considered in evaluation design.
Field Testing (of Indicators): Piloting indicators in actual settings to evaluate their clarity, feasibility, and reliability before full implementation.
Health Policy: Decisions, plans, or actions undertaken by governments or institutions to achieve public health objectives. These may be preventive, curative, or systemic and operate at various levels of the health system.
Health Policy Cycle: The cyclical process of initiating, designing, implementing, evaluating, and revising policies to adapt to emerging health system needs.
Health Policy Evaluation: A systematic and objective assessment of a health policy's design, implementation, and outcomes, aimed at determining its relevance, effectiveness, efficiency, impact, and sustainability.
Health Technology Assessment (HTA): A comprehensive evaluation of the clinical, economic, and social aspects of health technologies to guide policy decisions.
Impact: The long-term, systemic changes a policy aims to produce, such as reduced mortality, improved health equity, or increased life expectancy. Impact is influenced by multiple external factors.
Implementation Barriers: Contextual, organizational, or systemic obstacles that prevent a policy from being delivered as intended – for example, lack of staff, legal constraints, or technology failures.
Indicators (Process, Output, Outcome, Impact): Measurable variables used to monitor and evaluate different aspects of a policy's performance across the results chain.
Informed Consent: A formal process ensuring that evaluation participants understand and voluntarily agree to participate in line with ethical standards.
Inputs: The financial, human, and material resources mobilized to support policy implementation – for example, funding, staff, and equipment. Inputs are the starting point of the logic model.
Logic Model: A visual or narrative tool that maps the relationships between inputs, activities, outputs, outcomes, and impacts, guiding evaluation design.
Logical Framework Approach (LFA): A structured planning and evaluation methodology using a hierarchy of objectives, indicators, assumptions, and means of verification.
Mixed-Methods Evaluation: An approach that integrates both qualitative and quantitative data collection and analysis to provide a more complete understanding of policy performance and impact.
Monitoring: A continuous and routine process of tracking a policy's inputs, activities, and outputs using predefined indicators. It supports operational oversight, identifies implementation issues, and enables timely adjustments.
Outcomes: The short- to medium-term effects of the policy, such as increased coverage, improved behaviors, or better quality of care. These result from outputs and precede long-term impact.
Outputs: The immediate results of policy activities, such as services delivered, training conducted, or materials produced. Outputs are tangible and within the control of the implementing agency.
Performance Standards: Established benchmarks or thresholds against which actual policy results are assessed.
Policy Impact Assessment: A type of evaluation that measures long-term, system-level, and population-wide effects of a policy, including unintended consequences.
Policy Implementation Fidelity: The extent to which a policy is delivered as intended, critical for understanding deviations that may affect results.
Power Calculation: A statistical process used to determine the appropriate sample size needed to detect a policy effect with a given level of confidence.
Process vs. Structural Indicators: Process indicators measure actions taken, such as services delivered; structural indicators assess system capacity, for example, staff and infrastructure.
Quality-Adjusted Life Year (QALY): A metric that combines life expectancy and quality of life to assess the value of health interventions in economic evaluations.
Quasi-Experimental Design: An impact evaluation approach that compares intervention and non-intervention groups without randomization, used when Randomized Controlled Trials (RCTs) are not feasible.
Real-World Evidence (RWE): Evidence derived from routine clinical or administrative data (rather than trials), increasingly used for evaluation in applied settings.
Realist Synthesis: An evaluation approach that explores how and why policies work in specific contexts, focusing on context–mechanism–outcome (CMO) configurations.
Reliability: The degree to which evaluation methods yield consistent and replicable results across repeated applications or evaluations.
Sensitivity Analysis: A technique used to test how the results of an evaluation change when key assumptions, inputs, or parameters are varied. Common in economic evaluation.
Stakeholder Engagement: The process of identifying and involving individuals or groups affected by, or influential to, the policy and its evaluation to ensure relevance, trust, and use of findings.
Target Groups: The individuals, populations, or health system actors for whom the health policy is intended—whether as direct beneficiaries (e.g., patients, communities) or as implementing agents (e.g., healthcare providers, facilities, or administrative bodies).
Theory of Change (ToC): A structured explanation of how and why a policy is expected to achieve its goals, outlining assumptions and causal linkages.
Triangulation: A method of validation that uses multiple sources, methods, or perspectives to increase the credibility and reliability of evaluation findings.
Type I and Type II Errors: Statistical errors in hypothesis testing. A Type I error occurs when an evaluation incorrectly concludes that a policy had an effect (false positive). A Type II error occurs when it fails to detect a real policy effect (false negative).
Use Case (in evaluation): A defined policy scenario or decision-making context in which a specific evaluation method, tool, or framework is applied to address an operational or strategic need.
Validity (Internal and External): Internal validity refers to the extent to which the observed outcomes can be attributed to the policy, free from confounding factors. External validity refers to the degree to which results can be generalized to other settings, populations, or contexts.

Foreword

In line with the Kingdom of Saudi Arabia's Vision 2030 and the national commitment to building a high-performing, transparent, and people-centered health system, the Saudi Health Council is proud to present this Health Policy Evaluation Guideline. As the health sector undergoes transformative reforms, the ability to assess the design, implementation, and outcomes of health policies with rigor and consistency becomes more critical than ever.

This guideline serves as a foundational tool to support evidence-informed policymaking. It offers a practical and structured approach for evaluating health policies across their lifecycle—from early planning and implementation to long-term impact assessment. By embedding evaluation within the health policy process, we can ensure that decisions are data-driven, resources are used efficiently, and programs are continuously refined to deliver the best possible outcomes for the Saudi population.

The guideline draws on global best practices while remaining firmly grounded in the Saudi context. It provides clarity on the types of evaluations that can be conducted, the steps to follow, and the standards that must be upheld to ensure transparency, accuracy, and accountability. It also highlights the importance of inclusive stakeholder engagement and the use of logic models to clarify assumptions and causal pathways.

We encourage all policy makers, public health professionals, healthcare institutions, and partners to use this guideline as a standard reference when designing, implementing, or evaluating health policies. By doing so, we move one step closer to realizing a health system that is not only efficient and sustainable but also equitable and responsive to the evolving needs of our people.

On behalf of the Saudi Health Council, I commend all those who contributed to this important work and look forward to its widespread adoption across the health sector.

Dr. Nahar Al-Azemi, MD
Secretary General
Saudi Health Council

Executive Summary

The Health Policy Evaluation Guideline for the Kingdom of Saudi Arabia (KSA) is a national reference developed by the Saudi Health Council (SHC) to institutionalize rigorous, context-specific, and systematic evaluation practices across the health sector. Developed in alignment with Saudi Arabia's Vision 2030 and its commitment to transparent, evidence-informed governance, the guideline is designed to support policy makers, regulators, evaluators, and implementing partners in optimizing the design, implementation, and impact of health policies.
It offers a unified approach that integrates international best practices with local realities, aiming to enhance accountability, resource efficiency, and the effectiveness of policy interventions.

This guideline defines health policy evaluation as the systematic and objective assessment of a policy's design, implementation, and results. It recognizes evaluation not as a stand-alone technical activity, but as a strategic function that generates evidence for learning, strengthens public accountability, informs iterative improvements, and enhances the legitimacy and performance of health policy decisions. By embedding evaluation throughout the policy life cycle, the guideline enables stakeholders to assess what works, under what conditions, and for whom.

The guideline identifies five core criteria for evaluating health policies:
1. Relevance: Assesses alignment with current health needs and priorities, ensuring that the policy addresses pressing challenges within KSA's health, social, economic, and cultural context.
2. Effectiveness: Measures the achievement of intended goals, analyzing outcomes and identifying success factors or barriers.
3. Efficiency: Evaluates use of resources – for example, financial, human, time – against outcomes, ensuring cost-effectiveness and value for money.
4. Sustainability: Examines the policy's capacity to maintain benefits over time, adapting to changing conditions and resource availability.
5. Impact: Assesses broader effects—intended and unintended—on population health, equity, and health system performance, as well as associated social and economic outcomes, including changes in social determinants of health and economic burden or productivity.

To operationalize evaluation across the health policy life cycle, the guideline presents five evaluation approaches:
1. Formative evaluation: Conducted during policy design or pre-implementation, often before large-scale rollout, to assess feasibility, acceptability, and appropriateness.
2. Process evaluation: Examines how a policy is implemented and whether it is being delivered as intended.
3. Outcome evaluation: Measures whether the policy has achieved its short- to medium-term objectives and delivered intended services to the target population.
4. Impact evaluation: Assesses whether the policy produced long-term, systemic changes in health status or equity, using rigorous causal attribution methods.
5. Economic evaluation: Evaluates cost-effectiveness, cost-benefit, or cost-utility to ensure that resources are used efficiently and equitably.

The guideline introduces a six-step process for conducting health policy evaluations, adapted from international frameworks and best practices, and tailored to the context in KSA:
1. Identify and engage stakeholders: Mapping and involving actors who are affected by, implement, or will use the evaluation results, ensuring relevance and ownership.
2. Describe the health policy: Articulating the policy's goals, target groups,1 resources, activities, and intended outcomes using logic models.
3. Design the evaluation: Selecting appropriate evaluation types, formulating questions, and choosing methodological approaches—quantitative, qualitative, or mixed.
4. Collect data: Gathering credible and context-relevant data through routine systems, surveys, or administrative records.
5. Analyze and justify conclusions: Synthesizing findings through triangulation to draw reliable, evidence-based conclusions.
6. Use and disseminate findings: Ensuring that evaluation results inform policy adjustments, public accountability, and institutional learning.

Throughout this process, the guideline emphasizes adherence to four evaluation standards:
1. Utility: Serving the information needs of decision-makers and stakeholders.
2. Feasibility: Being realistic given time, budget, and data availability.
3. Propriety: Upholding ethical standards and stakeholder rights.
4. Accuracy: Ensuring findings are technically sound, valid, and trustworthy.

To ensure that the guideline is grounded in real-world experience, the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners was used as a sample. The data and findings presented in this guideline are illustrative and based on assumptions made by the SHC team, rather than actual data. This policy, which expanded malpractice insurance coverage to 18 non-physician specialties, served not merely as a case study but as a foundational reference throughout the guideline. The evaluation findings were used to structure and validate the six-step process itself—informing the development of stakeholder engagement typologies, logic model construction, indicator design, data collection procedures, and approaches for analyzing and using findings.

By codifying a unified, evidence-based approach to health policy evaluation and demonstrating its application through a nationally relevant policy, this guideline equips KSA's health institutions to assess what works, under what conditions, and for whom. It establishes evaluation as a routine and strategic component of good governance—contributing to more effective health policies, better outcomes for the population, and a resilient, more accountable health system.

1 In this guideline, "target groups" refer to the individuals, populations, or health system actors for whom the health policy is intended, whether as direct beneficiaries – for example, patients and communities – or as implementing agents, such as healthcare providers, facilities, or administrative bodies.

Introduction

Health policy evaluation is essential for ensuring that the Kingdom of Saudi Arabia's (KSA) health policies are well-designed, effectively implemented, and aligned with national priorities, including Vision 2030 and international commitments. Vision 2030 outlines an ambitious national transformation agenda, under which the Kingdom is undertaking major healthcare reforms that emphasize a data-driven approach—leveraging performance indicators and measurable outcomes to support continuous monitoring, evaluation, and accountability.2,3 This strong foundation enables evidence-based decision-making and fosters sustained policy improvement.

This guideline aims to provide a structured framework for the systematic assessment of health policies and to equip policy makers, evaluators, and stakeholders with practical tools to enhance policy impact and societal well-being. It is designed to support KSA in addressing its unique health system challenges while promoting efficient resource use and accountability in the delivery of intended health outcomes. This guideline applies to a broad spectrum of health policies, including public health initiatives, healthcare regulations, and interventions aimed at improving population health.
It offers practical strategies to evaluate health policies across their life cycle—from initial design and implementation to mid- or long-term outcomes—and is organized into the following sections:
1. What is a health policy evaluation? Defines the concept and underscores its significance in the KSA context.
2. Why evaluate health policies? Explores the dual purposes of learning and accountability that drive the evaluation process.
3. When to evaluate health policies? Provides guidance on optimal timing across the policy lifecycle.
4. What are the main types of health policy evaluation? Introduces evaluation types and their applications.
5. How to conduct health policy evaluations in practice? Details step-by-step methodologies tailored to KSA.

Designed for policy makers, public health professionals, evaluators, and academics in KSA, this guideline delivers actionable insights and methodologies. It contributes to the development of robust, evidence-informed health policies that advance KSA's national health objectives.

2 KSA Vision 2030, Health Sector Transformation Program. https://www.vision2030.gov.sa/en/explore/programs/health-sector-transformation-program.
3 Ministry of Health. Healthcare Transformation Strategy. Retrieved from https://www.moh.gov.sa/en/Ministry/vro/Documents/Healthcare-Transformation-Strategy.pdf.

What is health policy evaluation?

In this guideline, "health policy" broadly encompasses organized actions, initiatives, or interventions—whether preventive, curative, or systemic—aimed at improving public health or healthcare at local or national levels in KSA. To ensure clarity and consistency, this guideline establishes a standardized vocabulary for evaluators, healthcare professionals, decision-makers, academicians, and the public, fostering transparent communication throughout the evaluation process.

Based on international best practices, this guideline defines health policy evaluation as "a structured and objective assessment of an ongoing or completed health policy, focusing on its design, implementation, and results. The aim is to evaluate its relevance, efficiency, effectiveness, impact, and sustainability, as well as its overall significance."4

As defined above, this guideline emphasizes evaluating five key aspects of a health policy:5
1. Relevance: Assesses alignment with current health needs and priorities, ensuring the policy addresses pressing challenges within KSA's health, social, economic, and cultural context.
2. Effectiveness: Measures the achievement of intended goals, analyzing outcomes and identifying success factors or barriers.
3. Efficiency: Evaluates resource use (e.g., financial, human, time) against outcomes, ensuring cost-effectiveness and value for money.
4. Sustainability: Examines the policy's capacity to maintain benefits over time, adapting to changing conditions and resource availability.
5. Impact: Assesses broader effects—intended and unintended—on population health, equity, and health system performance, as well as associated social and economic outcomes, including changes in social determinants of health and economic burden or productivity.

To ensure clarity for evaluators and other stakeholders, this guideline also differentiates the concepts of monitoring and evaluation, which are two complementary yet distinct activities:
1. Monitoring: An ongoing process that tracks policy performance using predefined indicators. It provides real-time data for project managers and stakeholders to identify challenges, adjust the health policy, and ensure accountability in resource use.
2. Evaluation: An episodic, systematic assessment of a policy's design, implementation, and outcomes. It analyzes relevance, efficiency, effectiveness, impact, and sustainability to inform strategic planning and policy refinement.

To guide evaluators in applying monitoring and evaluation, Table 1 summarizes the differences.

Table 1: Key Differences Between Policy Monitoring and Evaluation
- Nature: Monitoring is ongoing, operation-focused oversight; evaluation is an episodic, strategy-focused assessment.
- System suitability: Monitoring tracks broad, anticipated issues; evaluation addresses specific policy questions.
- Data collection: Monitoring uses routine, predefined measures; evaluation uses customized methods for evaluation goals.
- Attribution: Monitoring links actions to effects via indicators; evaluation analyzes causal links systematically (direct links to effects).
- Resource allocation: Monitoring uses existing infrastructure; evaluation requires dedicated resources.
- Information use: Monitoring provides real-time data for management; evaluation provides findings for future policy design.
Source: Adapted from OECD 2020, and World Bank 2009.6

4 OECD. 2020. Improving Governance with Policy Evaluation: Lessons From Country Experiences. OECD Public Governance Reviews. Paris: OECD Publishing. https://doi.org/10.1787/89b1577d-en.
5 OECD. 2020. Improving Governance with Policy Evaluation: Lessons From Country Experiences.
6 Görgens, M., and Jody Zall Kusek. 2009. Making Monitoring and Evaluation Systems Work: A Capacity Development Toolkit. Washington, DC: World Bank. https://hdl.handle.net/10986/2702.

Why evaluate health policies?

Health policy evaluation serves two core purposes – learning and accountability7 – which are essential to ensure a policy's effectiveness and efficiency, as well as its alignment with the KSA strategic health goals set out in the Vision 2030 document.

Under the "learning" function, an evaluation generates evidence to refine policies, drives continuous improvement, and guides future initiatives. The learning function contributes to the following key aspects of health policy design and implementation:
• Managing risks: assesses whether health policies perform as intended and reduces implementation uncertainties.
• Enhancing performance: identifies opportunities to optimize health policies by using early findings to boost outcomes.
• Guiding decisions: informs whether to sustain, adjust, or end policies, and improves resource allocation.
• Shaping future policies: reveals what works, for whom, and why, preventing past errors and strengthening program design.

Under the "accountability" function, evaluations ensure that those who implement policies do so responsibly, deliver value, maintain trust, and give legitimacy to policies. The accountability function specifically contributes to the following key aspects of health policy design and implementation:
• Transparency in spending: verifies whether public funds are used efficiently and benefits reach citizens.
• Regulatory compliance: an evaluation confirms a policy's adherence to national laws and regulations, thereby reinforcing public confidence.
• Public reporting: openly shares findings, demonstrating the success or failure of a policy.
In practice, evaluations are driven by the following main reasons:
• Formal requirements: Legal mandates, as in the case of the Medical Malpractice Insurance Policy in KSA, or regulatory roles, as in the case of the Saudi Health Council's mandate to review policies.
• Cabinet oversight: Guidelines and decisions mandate assessing the financial, economic, and social impacts of policies.
• Government priorities: Policies identified as government priorities in national plans or programs – for example, Health Sector Transformation Program (HSTP) initiatives.
• International commitments: International agreements or commitments such as the Sustainable Development Goals.
• Selective evaluations: Because of time and cost constraints, only certain policies can be evaluated – for example, the Medical Malpractice Insurance Policy.

The following key criteria could guide decision-makers or evaluators in prioritizing the policies to be evaluated (a simple, illustrative scoring sketch based on these criteria follows the list):
» Relevance to strategic goals: Policies that contribute directly to overarching frameworks—such as the KSA Vision 2030 or the Health Sector Transformation Program—should be prioritized because evaluating them can provide critical insights into progress toward these high-level goals. Evaluating such policies ensures that resources are directed toward what matters most at the strategic level.
» Magnitude of impact: The extent to which a policy affects the population is a key consideration. Policies with wide-reaching effects, or those targeting high-need or vulnerable groups, are more likely to generate meaningful outcomes. Evaluating such policies provides information on how they are improving lives and whether they are achieving their intended impact.
» Level of investment: High-cost or resource-intensive policies warrant evaluation to ensure that investments are yielding expected results. This includes financial investments, human resources, infrastructure, or administrative commitment. Evaluation helps to assess cost-effectiveness and informs decisions on continuing or scaling back such policies.
» Stage of implementation: Evaluations are most valuable when the policy is at a stage where results can influence implementation—such as during a pilot phase, at mid-term, or when preparing for scale-up. Evaluating too early may not yield results, while evaluating too late may limit the potential for improvement.
» Feasibility of evaluation: Even important policies may be deprioritized if evaluation is not feasible. Factors such as the availability of data, access to stakeholders, and appropriateness of methods must be considered. Evaluation should only proceed when it can be conducted rigorously and within the available timeframe and resources.
» Risks and controversy: Policies that are politically sensitive, under scrutiny, or potentially associated with adverse outcomes should be prioritized for evaluation. These evaluations can enhance transparency, build public trust, and provide clarity around the policy's effects.
» Timing and windows of opportunity: Evaluation should be aligned with key decision-making moments, such as budget planning, policy renewal, or strategic reviews. Prioritizing evaluations that can inform upcoming decisions ensures that findings are timely and actionable.

7 HM Treasury. 2020. Magenta Book: Guidance for Evaluation, pages 9–10. https://assets.publishing.service.gov.uk/media/5e96cab9d3bf7f412b2264b1/HMT_Magenta_Book.pdf.
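The prioritization criteria above are qualitative, and this guideline does not prescribe a formula for combining them. Where an evaluation team nonetheless wants a transparent way to compare candidate policies, one option is a simple weighted scoring exercise. The sketch below is purely illustrative: the 0–3 rating scale, the weights, and the two hypothetical policies are assumptions introduced here for demonstration, not values taken from this guideline.

```python
# Illustrative sketch only: the guideline does not prescribe a scoring formula.
# Criterion names follow the prioritization list above; the weights and the
# 0-3 rating scale are assumptions chosen for demonstration.

CRITERIA_WEIGHTS = {
    "relevance_to_strategic_goals": 3,
    "magnitude_of_impact": 3,
    "level_of_investment": 2,
    "stage_of_implementation": 2,
    "feasibility_of_evaluation": 2,
    "risks_and_controversy": 1,
    "timing_and_windows_of_opportunity": 1,
}

def priority_score(ratings: dict) -> int:
    """Weighted sum of 0-3 ratings; a higher score suggests higher evaluation priority."""
    return sum(CRITERIA_WEIGHTS[c] * ratings.get(c, 0) for c in CRITERIA_WEIGHTS)

# Hypothetical ratings for two candidate policies (not real assessments).
candidates = {
    "Policy A": {"relevance_to_strategic_goals": 3, "magnitude_of_impact": 2,
                 "level_of_investment": 3, "stage_of_implementation": 2,
                 "feasibility_of_evaluation": 1, "risks_and_controversy": 2,
                 "timing_and_windows_of_opportunity": 3},
    "Policy B": {"relevance_to_strategic_goals": 1, "magnitude_of_impact": 1,
                 "level_of_investment": 2, "stage_of_implementation": 3,
                 "feasibility_of_evaluation": 3, "risks_and_controversy": 1,
                 "timing_and_windows_of_opportunity": 1},
}

# Rank candidates from highest to lowest priority score.
for name, ratings in sorted(candidates.items(), key=lambda kv: -priority_score(kv[1])):
    print(f"{name}: priority score = {priority_score(ratings)}")
```

In practice, any such scores would be agreed with stakeholders and used as a discussion aid for prioritization, not as a mechanical decision rule.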
When to evaluate health policies?

Evaluation occurs across the following stages, each targeting a distinct component of the health policy lifecycle:

1. Before a policy is fully formed, it is important to use evaluation to help shape its design and how it will be implemented, drawing on existing evaluation evidence or working through the Theory of Change of the health policy, typically developed during the policy design stage.8
1.1. Input level: Assesses whether a policy is feasible, appropriate, and acceptable before it is fully implemented. It can include process and outcome measures.

2. During implementation, evaluation has the greatest opportunity to influence decisions and to help ensure a policy can realize its intended benefits. During implementation, evaluations will typically look at gaining evidence about the efficacy of the policy's design, its implementation, and emerging outcomes.
2.1. Process level: Assesses the actions and activities undertaken during the implementation of a policy. This involves monitoring the efficiency of healthcare delivery processes and the type or timeliness of a policy.
2.2. Output level: Measures the initial results directly associated with the healthcare services delivered by a policy. This includes evaluating patient volumes, the healthcare benefits provided, public health actions, or other service delivery outputs that the policy aims to produce.

3. After a policy has been implemented, the entire policy can be examined, looking at outcomes and impacts.
3.1. Outcome level: Analyzes a policy's medium-term effects on health outcomes. This could include assessing improvements in access to health services, reductions in disease incidence, changes in health behaviors, or enhancements in public health knowledge attributable to the policy.
3.2. Impact level: Examines the long-term impact of the policy on population health. This includes evaluating the overall effectiveness of the policy in achieving public health goals—such as reducing mortality rates, improving quality of life, or promoting health equity across population groups—as well as its broader social and economic effects, including changes in social determinants of health, economic burden, and productivity.

8 For more details on the Theory of Change in policy development, see The Health Policy Maker's Manual: Integrating Data and Evidence. 2024. https://shc.gov.sa/Arabic/Documents/The%20Health%20Policy%20Makers%20Manual%20-%20KSA%20-2024.pdf.

What are the types of health policy evaluation?

A type of health policy evaluation refers to a specific approach used to assess various aspects of a health policy. The types of evaluation are tailored to address targeted questions9 about the policy's design, implementation, outcomes, or overall impact. The selection of an evaluation type depends on the level of the policy evaluated and the specific objectives of the evaluation.

Each type of evaluation provides unique insights: it enables policy makers and evaluators to determine how effectively a policy is functioning, assesses its effects, and identifies areas for improvement. These insights are crucial for informed decision-making and ensuring that health policies achieve their intended goals.
In some cases, particularly during policy formulation, Regulatory Impact Assessment (RIA) may be used as an ex-ante evaluation tool to assess the potential health, social, and economic impacts of proposed policies. RIA supports evidence-informed policy making by systematically analyzing policy options and their likely consequences—see Annex 1 for more on RIA.

The common types of health policy evaluation, along with what they show, when to use them, their utility, and illustrative examples, are summarized in Table 2:

Table 2: Common Health Policy Evaluation Types

Formative Evaluation
- What it shows: Whether the proposed health policy is feasible, appropriate, and acceptable to the target population before it is fully implemented.
- When to use: During the development of a new health policy. When an existing health policy is being adapted for a different setting. When the focus is on the who, what, when, where, and how.
- Why it is useful: Allows adjustments to be made before full implementation begins.
- Examples: Assessing if the health policy can be implemented as planned. Assessing if the health policy will be accepted by the target groups.

Process Evaluation
- What it shows: The extent to which the health policy is being implemented as intended.
- When to use: Beginning of health policy implementation. During implementation of an existing health policy.
- Why it is useful: Provides early identification of implementation gaps or challenges. Allows for real-time adjustments to improve policy execution fidelity.
- Examples: Assessing reach of activities to target groups. Assessing resource mobilization/allocation. Assessing threshold level of participation or exposure to the health policy. Assessing process – for example, procedures, roles, or timeliness for implementing the health policy.

Outcome Evaluation
- What it shows: The degree to which the health policy has achieved its intended outcomes or results.
- When to use: After the health policy has been implemented with the target groups.
- Why it is useful: Identifies whether the health policy is achieving its stated objectives. Does not determine causality, only whether outcomes have occurred.
- Examples: Assessing change in knowledge, attitudes, and behaviors among target groups. Assessing change in policies, regulations, or social norms in target groups. Assessing change in incidence, mortality, and morbidity.

Economic Evaluation
- What it shows: Examines health policy effects relative to the costs of the health policy.
- When to use: At the beginning of health policy implementation. During implementation of an existing health policy. After completion of the health policy intervention.
- Why it is useful: Helps understand the cost of implementing a health policy and can assess policy effects relative to the cost to produce them.
- Examples: Assessing if the value of the health policy's outcomes exceeds the cost.

Impact Evaluation
- What it shows: Compares the outcomes of a health policy to estimates of what the outcomes would have been without it. Usually seeks to determine whether activities caused the observed outcomes.
- When to use: At the end of a health policy. At defined intervals during health policy implementation.
- Why it is useful: Provides evidence of causal attribution to inform policy and investment decisions.
- Examples: Assessing the extent to which the outcomes can be related to the health policy as opposed to other external factors (attribution/causality).

Source: Adapted from CDC's Program Evaluation Framework Action Guide, CDC 2024.

9 See Annex 2 for details.
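To make the "Economic Evaluation" row of Table 2 concrete, the short sketch below computes an incremental cost-effectiveness ratio (ICER), the standard comparison of extra cost per extra unit of health effect used in cost-effectiveness analysis (see the glossary entries for CEA and QALY). It is a minimal illustration only: the cost figures, QALY estimates, and willingness-to-pay threshold are hypothetical assumptions, not data from any KSA policy.

```python
# Minimal illustration of the cost-effectiveness logic referenced in the
# "Economic Evaluation" row of Table 2. All figures are hypothetical.

def icer(cost_policy: float, cost_comparator: float,
         effect_policy: float, effect_comparator: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect
    (for example, cost per QALY gained or per case averted)."""
    delta_cost = cost_policy - cost_comparator
    delta_effect = effect_policy - effect_comparator
    if delta_effect == 0:
        raise ValueError("No incremental effect; the ICER is undefined.")
    return delta_cost / delta_effect

# Hypothetical example: the policy costs SAR 12 million versus SAR 9 million for
# the status quo, and yields 800 versus 500 QALYs in the covered population.
ratio = icer(cost_policy=12_000_000, cost_comparator=9_000_000,
             effect_policy=800, effect_comparator=500)
print(f"ICER = SAR {ratio:,.0f} per QALY gained")  # prints: ICER = SAR 10,000 per QALY gained

# The value judgement ("does the value of the outcomes exceed the cost?") then
# depends on comparing the ICER with a locally agreed willingness-to-pay threshold.
threshold = 50_000  # hypothetical SAR per QALY
print("Potentially cost-effective" if ratio <= threshold else "Above threshold")
```

The design choice here is deliberate: the calculation itself is trivial, and the substantive work of an economic evaluation lies in estimating the cost and effect inputs credibly and in testing them with sensitivity analysis.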
Figure 1 visualizes the types of evaluations and their interconnection with the stages of health policy components.

Figure 1: Types of Evaluation and their Interactions with Policy Components.10 [Figure: the diagram relates the evaluation types (formative, process, outcome, impact, and economic evaluation as cost-effectiveness and cost-benefit) to the policy components needs/objectives, inputs/resources, processes/activities, outputs, outcomes, and impacts.] Source: Adapted from Improving Governance with Policy Evaluation, OECD 2020.

10 This diagram illustrates key evaluation levels (Input, Process, Output, Outcome, Impact) but is not exhaustive; additional evaluation types and nuances may apply to specific health policies in KSA.

Formative Evaluation

What is Formative Evaluation in Health Policy?

Formative evaluation assesses a health policy during its design or early development phase to determine whether it is feasible, appropriate, and acceptable for the intended target groups and context.11 Unlike process evaluation, which focuses on how a policy is implemented, formative evaluation is conducted before full-scale implementation begins. It aims to strengthen the policy by identifying design flaws, contextual barriers, and stakeholder concerns to enable refinement before nationwide rollout. This type of evaluation is especially valuable during pilot testing, adaptation to new settings, or early-stage policy development.

Why is Formative Evaluation Important in Health Policy?

Formative evaluation improves the design and contextual fit of a health policy before full-scale implementation. It enables stakeholders to:
1. Test feasibility: Determine whether the policy can be implemented given current infrastructure, governance, and workforce capacity.
2. Assess acceptability: Evaluate whether the target groups and implementers consider the policy relevant and appropriate.
3. Identify risks and gaps: Detect design weaknesses, contextual limitations, or stakeholder concerns in advance.
4. Support adaptation: Allow timely refinement to improve alignment with system needs and increase effectiveness.
5. Increase implementation success: By resolving issues early, formative evaluation enhances the likelihood of a smooth and effective rollout.

Case Study: Formative Evaluation of Integrating Routine Screening for Opioid Use Disorder into Primary Care Settings

Overview: This case study presents a formative evaluation of routine opioid use disorder (OUD) screening implementation across ten primary care clinics in the United States. Conducted from July 2020 to July 2021, the evaluation took place within a larger multi-site randomized controlled trial (RCT) assessing the effectiveness of the Collaborative Care Model (CoCM) for patients with co-occurring OUD and mental health conditions. While the broader trial used randomization to evaluate clinical outcomes, the formative evaluation was qualitative and non-randomized, aiming to capture real-world implementation experiences, adaptations, and challenges during early rollout.

Policy Context: Despite updated recommendations from the U.S. Preventive Services Task Force supporting universal screening for unhealthy drug use, routine OUD screening remains rare in primary care. Challenges include concerns about staff capacity, stigma, unclear workflows, and lack of comfort addressing positive screens.
At the same time, fewer than 21% of individuals diagnosed with OUD receive medications for opioid use disorder (MOUD), underscoring the urgent need to expand access through primary care. This initiative sought to address these issues by integrating OUD screening into CoCM and evaluating its early implementation. Objectives of the Formative Evaluation: The formative evaluation aimed to: • Document how population-based OUD screening was implemented across 10 primary care clinics. • Identify common challenges and contextual barriers encountered during early implementation. • Explore emerging strategies and adaptations that supported feasibility and workflow integration. • Inform future implementation support efforts by analyzing real-world experiences and fidelity data. 11National Academies of Sciences, Engineering, and Medicine. 2023. Review of four CARA programs and preparing for future evaluations. Washington, DC: The National Academies Press. https://doi.org/10.17226/26831 16 What are the types of health policy evaluation? Health Policy Evaluation Guideline Methodology: A qualitative formative evaluation was conducted using a Rapid Assessment Process, guided by the Consolidated Framework for Implementation Research (CFIR). Data Sources included: • 90 structured observation summaries from meetings between clinics and AIMS Center practice facilitators • 59 summaries from internal facilitator debriefings • 10 structured fidelity assessments with clinic teams • The research team used matrixed data displays and iterative team discussions to identify patterns across clinics, guided by the CFIR’s five domains (intervention characteristics, inner setting, outer setting, individual characteristics, and implementation process). Key Findings: The following cross-cutting barriers were identified: • Clinics faced uncertainty about who should be screened and at what frequency, especially across different visit types and patient populations. • Staff and patients found the screening tool difficult to use and interpret, leading to inconsistent implementation. • Many staff were unsure how to introduce the screening or respond to positive results, especially in the absence of a warm handoff. • Screening was often missed due to lack of clarity on roles, especially when staff shortages disrupted communication between medical assistants and providers. • Frequent staff changes and competing demands reduced screening consistency and implementation momentum. • Few patients screened positive, leading some staff to question the value of the screening effort. • Internalized and structural stigma toward opioid use reduced patient engagement and staff confidence in addressing substance use. Promising Implementation Strategies: Through fidelity assessments and observation of implementation meetings, the formative evaluation identified several strategies clinics developed to improve screening feasibility: • Implementing screening for all patients at every visit to simplify staff decision-making. • Using patient portals and Electronic Health Records alerts to standardize and track screening. • Equipping staff with language and confidence to approach patients sensitively. • Adapting workflows with continuous feedback from facilitators and internal teams. • Updating clinic policies and publicly signaling that effective medications are available to normalize treatment and reduce patient hesitation. 
Conclusion and Implications: Formative evaluation was crucial for identifying real-time barriers and informing adaptive policy strategies during early implementation experiences. The formative evaluation demonstrated that routine OUD screening in primary care is feasible but requires structured support, clear workflows, and continuous adaptation. Key barriers included tool complexity, staff uncertainty, workflow inconsistencies, and stigma. Clinics that employed universal screening, digital tools, and team-based planning saw improved consistency and acceptability. Source: Austin, Elizabeth J. et al. 2023. “Integrating Routine Screening for Opioid Use Disorder into Primary Care Settings: Experiences from a National Cohort of Clinics.” Journal of General Internal Medicine 38 (2): 332-340. https://doi.org/10.1007/s11606-022-07675-2 Health Policy Evaluation Guideline What are the types of health policy evaluation? 17 Process Evaluation What is Process Evaluation in Health Policy? Process evaluation examines how a health policy is implemented, focusing on its delivery dynamics rather than its outcomes.12 It analyzes the execution process, the interaction of policy components, and operational strengths or weaknesses such as funding delays or staff training gaps to ensure effective functioning. By offering actionable insights, it refines implementation and aligns policies with their intended design. Why is Process Evaluation Important in Health Policy? Process evaluation helps to ensure that policies are implemented as intended. It provides insights that allow stakeholders to: 1. Assess implementation fidelity Ensure that the policy is delivered as planned, preserving its core objectives. 2. Identify barriers and facilitators Understand what supports or hinders implementation, such as resource constraints or stakeholder dynamics. 3. Enhance policy quality Use feedback to improve implementation over time. 4. Inform decision-making Provide real-time data to guide course corrections and effective resource use.  rocess Evaluation of the Implementation and Delivery of Nurse-Family Partnership in British Columbia, Case Study P Canada Overview: The Nurse-Family Partnership (NFP) program, delivered across five regional health authorities in British Columbia (BC), Canada (2013–2018), provided structured home visits by public health nurses to socioeconomically disadvantaged, first-time pregnant and parenting girls and young women. As part of the British Columbia Healthy Connections Project (BCHCP), a comprehensive process evaluation was conducted to explore how NFP was implemented, adapted, and experienced in real-world public health systems. This evaluation illustrates how process evaluation can assess fidelity, reach, supervision, acceptability, and contextual dynamics in complex interventions. Policy Context: Initiated in response to provincial child development and health equity goals, NFP was integrated into BC’s public health services and aligned with the Ministries of Health and Children & Family Development. With variation across urban and remote geographies, the policy required tailored delivery by experienced nurses, supported by interministerial and academic collaboration. NFP aimed to improve pregnancy outcomes, child development, and maternal life-course. Objectives of the Process Evaluation: The evaluation aimed to: • Assess fidelity to NFP core model elements. • Measure reach, dose delivered/received, and program participation. 
• Explore program acceptability among public health nurses (PHNs), supervisors, and managers. • Examine implementation barriers, workforce dynamics, and retention. • Analyze reflective supervision practices and team structures. • Document adaptations in rural, remote, and high-adversity contexts. • Describe how nurses supported clients facing intimate partner violence (IPV), mental health, or child protection challenges. 12Grant, A., Carol Bugge, and Mary Wells. 2020. “Designing process evaluations using case study to explore the context of complex interventions evaluated in trials.” Trials 21: 982. https://doi.org/10.1186/s13063-020-04880-4. 18 What are the types of health policy evaluation? Health Policy Evaluation Guideline Methodology: A mixed-methods study (2013–2018) used qualitative and quantitative data: • Qualitative: 343 transcripts from 82 PHNs, 19 supervisors, and 23 decision-makers, including 38 exit interviews. • Quantitative: NFP fidelity reports (visit timing, dose), encounter logs (n=14,000+), supervision records, and session data. Key Metrics: Home visits completed (78.9%), average visits per client (37), mean visit duration (75–80 min), travel distance (average 14.3 km), and reflective supervision coverage (62.4%). Key Findings: • Fidelity: NFP was implemented in line with core model elements. Clients received an average of 37 visits; 79% of scheduled visits were completed. Most visits occurred in-home (70%) and lasted 75–80 minutes. • Reach Challenges: Only 27.7% enrolled by 16 weeks gestation, below the 60% benchmark. • Supervision: 62.4% of reflective supervision sessions occurred; average session length was 59 minutes. PHNs emphasized its essential role in clinical growth and emotional resilience. • Acceptability & Retention: Nurses found NFP deeply rewarding, citing full-scope practice, structured education, and strong team cohesion as motivators. Turnover was linked to geography, workload, or job insecurity. • Adaptations: In rural/remote regions, nurses used flexible visit locations, accommodated long travel (up to 120 km), and occasionally used telehealth. • Client Needs: Nurses supported clients with complex adversities—homelessness, IPV, mental illness— requiring tailored strategies and deep therapeutic alliances. • Organizational Factors: Management support, supervision quality, and reflective practice strongly influenced implementation success. Conclusion and Implications: The BCHCP process evaluation demonstrated that NFP can be delivered with fidelity and adaptability across varied BC settings. It provided critical insight into how implementation structures, nurse experience, and contextual realities shape service quality. This case exemplifies process evaluation’s role in strengthening complex public health interventions through real-time learning and system- responsive improvements. Source: Jack, S. M. et al. 2020. Implementation and delivery of Nurse-Family Partnership in British Columbia, Canada: A synthesis of selected findings from the British Columbia Healthy Connections Project Process Evaluation (2013–2018). https://phnprep.ca/wp-content/uploads/2021/09/BCHCP_ Process-Evaluation_Final-Report.pdf. Outcome evaluation What is Outcome Evaluation in Health Policy? 
Outcome evaluation systematically assesses a health policy's effects to determine if it meets its intended goals.13 It focuses on measuring changes such as improved health status, healthcare access, or system efficiency resulting from the policy, offering a clear picture of its impact on the target population. This approach is also vital for understanding whether a policy delivers its promised benefits.

13 Office for Health Improvement and Disparities (U.K.). 2018. "Outcome evaluation: evaluation in health and wellbeing." GOV.UK. https://www.gov.uk/guidance/evaluation-in-health-and-wellbeing-outcome.

Why is Outcome Evaluation Important?

Outcome evaluation shows whether a policy is achieving its intended results. It provides evidence that helps to:
Ensure accountability: Offer proof of effectiveness, enabling stakeholders—funders, policy makers, and the public—to hold implementers accountable.
Drive improvement: Identify gaps in performance, supporting adjustments to enhance results over time.
Facilitate learning: Uncover what works and why, informing future policy design and knowledge sharing.
Validate impact: Confirm the policy's contribution to intended changes, justifying its continuation or revision.

Case Study: Outcome Evaluation of the Ottawa Model for Smoking Cessation in Ontario, Canada

Overview: The Ottawa Model for Smoking Cessation (OMSC), implemented in primary care settings across Ontario, Canada, exemplifies outcome evaluation's role in assessing health policy impact. This evidence-based intervention, evaluated in 2007–2009 by the University of Ottawa Heart Institute, uses the 3 A's framework—Ask, Advise, Act—to boost cessation rates. It demonstrates how outcome evaluation measures effectiveness and informs policy refinement.

Policy Context: Rolled out in 32 primary care practices, OMSC addressed Canada's 19% smoking prevalence (2007) by training providers to identify smokers (Ask), counsel them on quitting (Advise), and offer support or medication (Act). It aimed to reduce smoking-related diseases like lung cancer and heart disease.

Objectives of the Outcome Evaluation: The evaluation aimed to:
• Measure increases in cessation intervention delivery.
• Assess patient quit rate improvements.
• Identify success factors (e.g., training, fidelity).
• Evaluate cost-effectiveness and scalability potential.

Methodology: A before-and-after design included:
• Data Collection: Surveyed 3,800 patients and 481 providers across 32 practices at baseline and six months post-implementation.
• Outcome Measures: Tracked Ask, Advise, and Act rates, plus patient quit status, via statistical analysis.
• Fidelity Assessment: Linked outcomes to adherence to 10 Best Practices.
• Cost Analysis: Estimated savings from reduced smoking-related healthcare use.

Key Findings:
• Intervention Delivery: Ask rose from 54.9% to 70.8%, Advise from 40.1% to 64.7%, and Act from 34.8% to 59.6% (all statistically significant).
• Quit Rates: Quit attempts increased by 15%; sustained quit rates were estimated to rise by approximately 10 percentage points from baseline (for example, from 5% to 15%).
• Influencing Factors: Practices implementing 8 or more of the 10 Best Practices saw higher quit rates, highlighting fidelity's role.
• Cost-Effectiveness: Reduced healthcare use, such as hospitalizations, suggested savings estimated at $500 per quitter annually.
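The before-and-after comparisons reported in the OMSC findings lend themselves to simple statistical checks. The sketch below is an illustrative aside rather than part of the original study: it uses hypothetical patient counts (not the study's data) to show how the significance of a change in a delivery rate such as "Ask" might be assessed with a two-proportion test in Python.

```python
# Illustrative only: hypothetical counts, not data from the OMSC evaluation.
# Tests whether a before-and-after change in the share of patients asked
# about their smoking status ("Ask") is statistically significant.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

n_before, n_after = 1900, 1900             # hypothetical patients surveyed per wave
ask_before = int(round(0.549 * n_before))  # roughly 54.9% asked at baseline
ask_after = int(round(0.708 * n_after))    # roughly 70.8% asked post-implementation

counts = np.array([ask_after, ask_before])
nobs = np.array([n_after, n_before])
z_stat, p_value = proportions_ztest(counts, nobs)

print(f"Ask rate: {ask_before / n_before:.1%} -> {ask_after / n_after:.1%}")
print(f"z = {z_stat:.2f}, p = {p_value:.4g}")
```

A test of this kind only indicates whether a change is statistically detectable; attributing the change to the policy itself requires the stronger designs discussed under impact evaluation.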
Conclusion and Implications: OMSC's evaluation confirmed its effectiveness in raising intervention delivery and quit rates, with cost-effective benefits tied to fidelity. It supported scalability, though long-term cessation and disease reduction need further study, underscoring outcome evaluation's role in evidence-based policy. Source: Papadakis, S. et al. 2016. Increasing rates of tobacco treatment delivery in primary care practice: Evaluation of the Ottawa Model for Smoking Cessation. https://doi.org/10.1370/afm.1909.

Impact evaluation

What is Impact Evaluation?

Impact evaluation examines the long-term, causal effects of a health policy on health outcomes, systems, or populations. It goes beyond immediate results—assessed in outcome evaluation—to measure the policy's broader contribution, including intended and unintended changes in societal well-being, equity, or systemic performance. By using rigorous methods to attribute effects to the policy, impact evaluation answers the question, "What difference did this make?" and reveals its transformative impact over time.

Why is Impact Evaluation Important?

Impact evaluation helps to determine a policy's long-term value and system-wide effects. It generates evidence that supports efforts to:
1. Assess long-term effectiveness: Evaluate sustained success—beyond short-term outcomes—such as reduced disease burden or improved equity, justifying continued support.
2. Improve future policy design: Highlight strengths and weaknesses, guiding refinements for greater impact in future iterations.
3. Uncover broader effects: Detect unintended outcomes—positive (e.g., economic gains) or negative (e.g., access disparities)—expanding the policy's impact profile.
4. Ensure accountability: Provide evidence of meaningful change, demonstrating transparency to funders, agencies, and the public.
5. Support informed decisions: Deliver data-driven insights, including causal attribution, to enable evidence-based resource allocation.

When Should an Impact Evaluation Be Conducted?

Impact evaluations occur post-implementation, once long-term effects emerge, depending on the policy's goals, such as chronic disease reduction, and data availability. Timing balances sufficient impact manifestation with actionable relevance, ensuring results inform current decision-making.

Case Study: Impact Evaluation of the Free ART Program in South Africa

Overview: South Africa's free antiretroviral therapy (ART) program, launched in 2004, exemplifies impact evaluation by assessing its long-term effects on HIV/AIDS outcomes nationwide. Targeting Black Africans aged 25–49—the group bearing two-thirds of HIV cases—this initiative expanded treatment access through public facilities. Evaluated using longitudinal survey data collected between 2006 and 2016, the program demonstrates how impact evaluation can quantify sustained health improvements.

Policy Context: Initiated in 2004, the ART program tackled South Africa's HIV epidemic (19% national prevalence in 2004, Stats SA), rolling out free treatment to curb mortality and enhance health. By 2018, 4.6 million of 7.7 million HIV-positive individuals received ART, reversing a crisis that killed over 300,000 people annually before the rollout.

Objectives of the Impact Evaluation: The evaluation aimed to:
• Measure reductions in HIV-related mortality among Black Africans aged 25–49.
• Assess improvements in self-reported health for this group.
• Estimate the causal impact of ART availability on population health outcomes.

Methodology: A difference-in-differences design was used:
• Data collection: Longitudinal data from the National Income Dynamics Study (NIDS) tracked over 28,000 individuals nationwide (2008–2016/7).
• Analysis: Compared communities with varying ART rollout timing, controlling for confounders such as age and wealth.
• Indicators: Included annual mortality rates and self-reported health scores.

Key Findings:
• Mortality reduction: ART availability reduced mortality by 27% among Black Africans aged 25–49 over 2006–2016.
• Health improvement: The likelihood of reporting poor health dropped by 36% in this group, reflecting treatment efficacy.
• Demographic impact: Within this high-prevalence group, annual mortality fell by 31%, and poor health reports decreased by 47%.

Conclusion and Implications: The evaluation confirmed ART's dramatic impact, sharply reducing mortality and boosting health among South Africa's most affected population. It highlighted the power of nationwide treatment access, offering lessons for global HIV strategies while emphasizing the need for ongoing monitoring of long-term effects. Source: Burger, C., Ronelle Burger, and Eddy van Doorslaer. 2022. The Health Impact of Free Access to Antiretroviral Therapy in South Africa. https://doi.org/10.1016/j.socscimed.2022.114832.

Economic evaluation

What is an Economic Evaluation?

Economic evaluation assesses the costs and health outcomes of a policy to determine its value-for-money. It compares resource use (funding, staff) with benefits achieved (improved health, lives saved), aiding policy makers in understanding how to allocate resources efficiently. Common methods include Cost-Effectiveness Analysis (CEA) and Cost-Benefit Analysis (CBA), each measuring costs against health benefits from different perspectives.14

Why is Economic Evaluation Important?

Economic evaluation is essential for making informed health policy decisions by identifying which health policy delivers the best value for money. It helps policy makers:
1. Assess efficiency: Compare costs and health outcomes to determine which policies provide the greatest benefit per unit of resource.
2. Guide resource allocation: Identify cost-effective strategies—such as vaccination programs or new technologies—especially in resource-constrained settings.
3. Ensure equity: Analyze how costs and benefits are distributed across population groups, informing policies that promote health equity.
4. Provide evidence: Supply robust data to support the prioritization of health policies that offer the highest health returns for investment.

14 Centers for Disease Control and Prevention. 2024. "Economic evaluation: Overview." POLARIS. https://www.cdc.gov/polaris/php/economics/index.html.

Case Study: Economic Evaluation of the HPV Vaccination Program in Australia

Overview: Australia's Human Papillomavirus (HPV) vaccination program, launched in 2007, exemplifies economic evaluation by assessing its costs and health outcomes. Targeting girls aged 12–13, it aimed to reduce cervical cancer through a national rollout. This evaluation, modeled from 2007 onwards, shows how economic evaluation informs efficient health policy.
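Before turning to the details of the Australian case, it may help to make the value-for-money logic concrete. Cost-effectiveness results are usually summarized as an incremental cost-effectiveness ratio (ICER); the formula and the worked numbers below are purely illustrative and are not figures from the evaluation described in this case study.

\[
\mathrm{ICER} = \frac{C_{\text{policy}} - C_{\text{comparator}}}{E_{\text{policy}} - E_{\text{comparator}}}
\]

Here \(C\) denotes total (discounted) cost and \(E\) the health effect, typically QALYs gained. With hypothetical figures, an incremental cost of AUD 90 million and 5,000 incremental QALYs would give an ICER of AUD 90,000,000 / 5,000 = AUD 18,000 per QALY gained, which would then be compared against the relevant willingness-to-pay threshold.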
Policy Context: Introduced in 2007, the HPV program addressed Australia's cervical cancer rate of 7 cases per 100,000 women annually (AIHW, pre-2007). Delivered free via schools, it cost approximately AUD 240 million initially, seeking to lower cancer incidence through widespread vaccination.

Objectives of the Economic Evaluation: The evaluation aimed to:
• Assess the cost-effectiveness of HPV vaccination in reducing cervical cancer.
• Compare program costs against projected health benefits.
• Examine resource use efficiency for long-term outcomes.

Methodology: A Cost-Effectiveness Analysis (CEA) was used, involving:
• Data collection: Combined vaccination costs (AUD 100–150 per dose) with modeled health outcomes (cancer cases averted).
• Analysis: A dynamic transmission and Markov model compared vaccination versus no-vaccination scenarios, projecting effects to 2035.
• Indicators: Measured costs per Quality-Adjusted Life Year (QALY) gained, discounted at 5%.

Key Findings:
• Cost-Effectiveness: The quadrivalent vaccine cost AUD 18,000 per QALY gained, below Australia's AUD 50,000 threshold.
• Health outcomes: Projected 50–60% reduction in cervical cancer incidence long-term with continued vaccination and screening.
• Efficiency: Modeled savings in treatment costs outweighed vaccination expenses over decades.

Conclusion and Implications: The evaluation confirmed the HPV program's value-for-money, providing evidence of substantial health benefits at an acceptable cost. It supports sustained vaccination efforts, offering a model for cost-effective cancer prevention strategies globally. Source: Simms, K. T. et al. 2016. "Cost-effectiveness of the next generation nonavalent human papillomavirus vaccine in the context of primary human papillomavirus screening in Australia: a comparative modelling analysis." Lancet Public Health 1 (2): E66–E75. https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(16)30019-6/fulltext. Note: Outcomes are modeled projections, not empirical 2017 data.

How to Conduct Health Policy Evaluations

This section describes the main standards and steps required to conduct health policy evaluations in practice, which help ensure that policies are assessed consistently, that findings are reliable, and that results contribute to improved health outcomes (see Figure 2 below). It presents the four core standards and an overview of the six key steps, followed by subsections that provide detailed guidance on applying these steps.

Health policy evaluations conducted according to this guideline follow four evaluation standards, which guide the overall evaluation process and ensure its quality:
1. Utility: This standard ensures that the evaluation meets the needs of its users, delivers relevant information to those who need the evaluation, and considers the interests of the stakeholders at each step of the evaluation process.
2. Feasibility: This standard keeps the evaluation practical within available resources and requires the evaluators to consider the available time, resources, and expertise to complete the evaluation.
3. Propriety: This standard maintains fairness and ethical conduct throughout the evaluation process and ensures respect for the rights and well-being of the individuals and stakeholders involved.
4. Accuracy: This standard requires the evaluation to provide reliable and precise information.

The steps described in the following subsections offer a clear framework to evaluate a health policy.
The health policy evaluation is conducted following six main steps:
1. Identify and engage stakeholders: Involve those affected by or interested in the policy.
2. Describe the health policy: Define the policy's purpose and expected health outcomes.
3. Design the policy evaluation: Plan the methods, scope, and data sources.
4. Collect data: Gather credible information to address evaluation questions.
5. Analyze and justify evaluation findings: Interpret data to draw evidence-based conclusions.
6. Use and disseminate the evaluation findings: Present findings and apply them to enhance the policy.

Figure 2: Overview of the Steps and Standards of Health Policy Evaluation. The figure depicts the six steps as a cycle (identify and engage stakeholders; describe the health policy; design policy evaluation; collect data; analyze data and justify evaluation findings; use and disseminate evaluation findings) arranged around the four standards of utility, feasibility, propriety, and accuracy. Source: Adapted from CDC's Program Evaluation Framework Action Guide, 2024.

Which Steps to Follow in Conducting Health Policy Evaluations?

Health policy evaluations follow a systematic, six-step process guided by core standards to ensure effective and structured assessment. These steps build upon each other sequentially and are described in detail in the following subsections. In line with this framework, this guideline presents findings from the evaluation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners in Saudi Arabia, structured according to the six-step evaluation process.

Step 1: Identify and Engage Stakeholders

Why is Stakeholder Identification and Engagement Important?

Involving and committing all key stakeholders at each step is crucial for the success of health policy evaluation. Engaging stakeholders leads to stronger relationships and better communication between the evaluators and the stakeholders, and a common understanding of the health policies being evaluated. It also creates a platform for stakeholders to raise any specific questions that may arise during the evaluation process. Engaging stakeholders also facilitates access to necessary data, acknowledgement of the judgments about the evidence gathered, and interpretation of the evaluation findings. Stakeholder engagement also increases the credibility of the analysis and the likelihood that the findings will be used; otherwise, the evaluation could be ignored, criticized, or resisted.

Who are the Potential Stakeholders in a Health Policy Evaluation?

Stakeholders cover a wide range of participants in health policies, including the general public, patients, healthcare providers, health system regulators, health financing organizations, academia, private firms, and many others. They are critical players in any evaluation process. Identifying and engaging stakeholders in an evaluation process requires transparency and predetermined criteria that align with evaluation standards.15 The selection of stakeholders is based on their role—to enhance the credibility of the evaluation, ensure effective day-to-day implementation, advocate for necessary changes, and secure funding or authorization for the continuation or expansion of a health policy. Stakeholders in health policy evaluations could be classified under the following groups: 1.
Stakeholders implementing Include the stakeholders who have responsibility to develop, approve, or health policy implement the health policy being evaluated and to perform health policy evaluation. 2. Stakeholders affected by Include the stakeholders who are responsible for scrutinizing government health policy decisions and spending as well as the beneficiaries of the health policy being evaluated. 3. Stakeholders utilizing the Include the stakeholders who are responsible for future policies and use evaluation findings evaluation results for further analysis. Case Study Mandatory Medical Malpractice Insurance for Other Health Practitioners in KSA After the successful launch and implementation of the mandatory medical malpractice insurance for physicians and dentists in 2022, KSA’s government decided to expand the coverage of the policy to other health practitioners in 18 specialties. Following the launch of the expansion, an assessment has been conducted to evaluate the awareness and opinion of other health practitioners about the policy. The stakeholders of the Mandatory Medical Malpractice Insurance for Other Health Practitioners are categorized into the groups described in Table 3. 15Centers for Disease Control and Prevention (CDC). 2024. CDC’s Program Evaluation Framework Action Guide, pg. 18. https://www.cdc.gov/evaluation/ media/pdfs/2024/12/FINAL-Action-Guide-for-DFE-12182024_1.pdf. Health Policy Evaluation Guideline What are the types of health policy evaluation? 25 takeholder Groups under the policy on “Mandatory Medical Malpractice Insurance for Other Health Table 3 S Practitioners” (sample) Stakeholder groups Stakeholders Stakeholders Stakeholders Stakeholders Stakeholders Participants / Stakeholders responsible for responsible for responsible for scrutinizing recipients of responsible for policy-making similar policies evaluation government the policy health policy in future analyses decisions and implementation spend Ministry of ☒ ☒ ☒ ☒ ☒ ☒ Health Saudi Health ☒ ☒ ☒ ☒ ☒ ☒ Council Patient Safety ☐ ☐ ☐ ☐ ☐ ☐ Center Saudi ☐ ☐ ☐ ☐ ☐ ☐ Commission for Health Specialties Ministry of ☐ ☐ ☐ ☐ ☒ ☒ Health Hospitals Ministry of ☐ ☐ ☐ ☐ ☒ ☒ Defense Ministry of ☐ ☐ ☐ ☐ ☒ ☒ Interior Ministry of ☐ ☐ ☐ ☐ ☒ ☒ National Guard King Faisal ☐ ☐ ☐ ☐ ☒ ☒ Specialist Hospital and Research Centre Ministry of ☐ ☐ ☐ ☐ ☒ ☒ Education (Health Affairs) Patients ☐ ☐ ☐ ☐ ☒ ☐ Private Sector ☐ ☐ ☐ ☐ ☒ ☒ Health care ☐ ☐ ☐ ☐ ☒ ☒ Practitioners Depending on the health policy, its coverage, benefits, and impacts as well as implementation arrangements, stakeholders are also mapped, based on their power and interests, to understand their position and plan the engagement activities. How to Engage Stakeholders? After identifying key stakeholders, the roles of the stakeholders in the evaluation process are assessed and a stakeholder engagement plan is developed to ensure their support and input in the subsequent steps of the health policy evaluation. Such a plan clarifies each stakeholder’s role and holds all involved parties accountable for the evaluation’s success (Table 4), including to increase the credibility of the evaluation, implement health policies, advocate changes as a result of the evaluation findings, and support the expansion or continuation of a policy. 26 What are the types of health policy evaluation? 
Health Policy Evaluation Guideline  oles of Stakeholders in the Evaluation of the Mandatory Medical Malpractice Insurance Policy for Other Table 4 R Health Practitioners (sample) Stakeholders’ role in policy evaluation process Stakeholders Increase credibility Implement health Advocate for Fund/authorize the of the evaluation policies that are changes to continuation or central to this institutionalize the expansion of the evaluation evaluation findings policy Ministry of Health ☐ ☒ ☒ ☒ Saudi Health Council ☒ ☐ ☒ ☒ Patient Safety Center ☐ ☐ ☒ ☐ Saudi Commission for ☐ ☐ ☒ ☐ Health Specialties Ministry of Health ☐ ☒ ☒ ☐ Hospitals Ministry of Defense ☐ ☒ ☒ ☐ Ministry of Interior ☐ ☒ ☒ ☐ Ministry of National Guard ☐ ☒ ☒ ☐ King Faisal Specialist ☐ ☒ ☒ ☐ Hospital and Research Centre Ministry of Education ☐ ☒ ☒ ☐ (Health Affairs) Patients ☐ ☐ ☒ ☐ Private Sector ☐ ☒ ☒ ☐ Health care Practitioners ☐ ☒ ☒ ☐ Following the stakeholder mapping and identifying the roles of the individual stakeholders, the evaluation team plans stakeholder engagement in the evaluation process (Table 5). Hence, stakeholders are involved in making their inputs to the evaluation process and take the ownership of the overall evaluation.  takeholder Involvement Plan in the Evaluation of the Mandatory Medical Malpractice Insurance Policy for Table 5 S Other Health Practitioners (Sample) Stakeholders’ inputs in policy evaluation process Stakeholders Step 2 - Step 3 - Step 4 - Data Step 5 - Justify Step 6 - Use Describing Designing collection conclusions and health policy policy disseminate evaluation findings Ministry of Health ☐ ☐ ☒ ☐ ☐ Saudi Health Council ☒ ☒ ☒ ☒ ☒ Patient Safety Center ☐ ☐ ☒ ☐ ☐ Saudi Commission for ☐ ☐ ☒ ☐ ☐ Health Specialties Ministry of Health ☐ ☐ ☒ ☐ ☐ Hospitals Ministry of Defense ☐ ☐ ☒ ☐ ☐ Ministry of Interior ☐ ☐ ☒ ☐ ☐ Ministry of National Guard ☐ ☐ ☒ ☐ ☐ King Faisal Specialist ☐ ☐ ☒ ☐ ☐ Hospital and Research Centre Ministry of Education ☐ ☐ ☒ ☐ ☐ (Health Affairs) Patients ☐ ☐ ☐ ☐ ☐ Private Sector ☐ ☐ ☒ ☐ ☐ Health care Practitioners ☐ ☐ ☒ ☐ ☐ Health Policy Evaluation Guideline What are the types of health policy evaluation? 27 Source: Adapted from Iskarpatyati et al 2011, and CDC’s Program Evaluation Framework Action Guide, 2024. In addition to identifying stakeholders’ roles and involvement, the evaluation team should develop a targeted engagement strategy based on a stakeholder power-position matrix. Stakeholders can be grouped into five categories—Influence, Leverage, Manage, Include, and Persuade—depending on their level of power and their stance toward the evaluation. Each group requires a tailored engagement approach: 1. Influence Parties: These are high-power actors whose opposition may hinder the evaluation process or the uptake of its findings. Evaluators should engage them early to address concerns about potential bias or the implications of evaluation results. Participatory approaches such as consultative workshops can help build mutual understanding and trust. 2. Leverage Parties: These high-power supporters can institutionalize evaluation practices, allocate resources, and promote the use of findings. Evaluators should actively engage them to secure political buy-in, ensure sufficient resources, and facilitate dissemination and application of results. 3. Manage Parties: These are low-power opponents who may have concerns or fears about the evaluation but limited ability to obstruct it. 
Evaluators should manage them through transparency, involvement in the evaluation design, consistent communication, and reassurance that findings will be used constructively, not punitively. 4. Include Parties: These low-power supporters can offer valuable insights and help to extend the reach of findings. Evaluators should involve them in data collection, analysis, and validation activities, and support their role as knowledge brokers in dissemination efforts. 5. Persuade Parties: These neutral or unengaged actors can become valuable allies if adequately informed. Evaluators should raise awareness of the evaluation’s objectives, tailor messaging to their interests, and engage them in learning and dissemination events to encourage support. This engagement typology helps to ensure that stakeholder participation is strategic, inclusive, and aligned with the overall evaluation goals. The matrix in Figure 3 provides a template to operationalize this typology by classifying stakeholders and identifying tailored engagement strategies. takeholder mapping for the policy on “Mandatory Medical Malpractice Insurance for Other Health Figure 3 S Practitioners”16 (sample) High HIGH POWER, OPPOSED HIGH POWER, IN FAVOR Influence Parties: NA Leverage Parties: Ministry of Health, Saudi Health Council, Saudi Commission for Health Specialties, and Patient Safety Center NEUTRAL Power Persuade Parties: Media and Other Industries OPPOSSED, LOW POWER IN FAVOR, LOW POWER Manage Parties: NA Include Parties: Ministry of Health Hospitals, Ministry of Defense Hospitals, Ministry of Internal Affairs Hospitals, Ministry of National Guard Hospitals, Ministry of Education (Health Affairs), Private Sector, Healthcare Low Low In Favor High Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council. 16The evaluation findings of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners, introduced by the Kingdom of Saudi Arabia (KSA), serve as the sample case presented in this guideline. 28 What are the types of health policy evaluation? Health Policy Evaluation Guideline Checklist for Step 1: Identify and Engage Stakeholders Identify and categorize stakeholders into three main groups: those impacted by the policy, those involved in its ☐ implementation, and those who will use the evaluation findings. Review the stakeholder list to pinpoint key stakeholders that are crucial for enhancing the evaluation’s ☐ credibility, ensuring policy implementation, advocating for institutionalization, or approving the continuation or expansion of the evaluation. Engage individual stakeholders or representatives from relevant organizations, understand their interest and ☐ powers. ☐ Develop a plan for stakeholder involvement and specify areas where their input is required. Ensure selected stakeholders regularly participate in crucial steps, such as describing the policy, developing ☐ and choosing evaluation questions, and disseminating the evaluation results. Step 2: Describe the Health Policy Understanding the Health Policy and Its Key Elements The next step is to clearly define the health policy being evaluated. This involves thoroughly analyzing the policy, exploring its underlying assumptions, and identifying the key questions that the evaluation aims to answer. 
Therefore, a comprehensive understanding of the policy is vital, because it informs the subsequent steps of the evaluation, including the development of the evaluation design, the selection of suitable methods, and the identification of relevant indicators and data sources.17 Table 6 presents elements to examine and detail at this phase of the evaluation.18 Table 6 Key Health Policy Components # Policy Description Sample from Mandatory Medical Malpractice components Insurance for Other Health Practitioners’ Policy Evaluation Findings (2024) 1 Need for The need for the health policy is defined by the The significant increase in the number of policy public health issue or other critical challenge(s) medical malpractice lawsuits from 1,097 in that the policy aims to address. This need 2016 to 1,379 in 2018 spurred action to should be articulated in terms of its enhance patient safety, minimize medical consequences for the population or a group, errors within the healthcare system, and reduce the overall scale of the issue and its prevalence, the financial burden on health practitioners in and significant changes or trends in its KSA. The increasing number of lawsuits incidence or prevalence. Clearly defining the signaled challenges in quality of care and risks need for the policy is essential for evaluating its of financial burdens for health practitioners. relevance and effectiveness in addressing the Therefore, the KSA government decided to identified public health challenges. extend the existing Mandatory Medical Malpractice Insurance policy to include other specified healthcare providers across 18 specialties. 2 Target Groups Target groups are the specific populations or The main target group of the policy is the stakeholders that the health policy seeks to health practitioners from the following 18 engage in addressing the public health issue. specialties: nurses, pharmacists, anesthesia Understanding and clearly defining the target specialists, midwifery specialists, diagnostic groups is crucial for evaluating how well the radiologists, diagnostic radiology technicians, health policy is reaching the intended audience emergency medical service providers, lab and whether it is effectively mobilizing these specialists, physiotherapists, speech and groups toward achieving the desired health language pathologists, respiratory therapists, outcomes. nutrition specialists, audiologists, phlebotomy specialists, ophthalmologists, operating room technicians, cardiac perfusion specialists and blood draw optometry specialists. 17 Note: In cases where evaluators have limited prior knowledge or documentation of the policy, it may be necessary to begin with a detailed description of the policy itself. In such instances, this step might precede stakeholder engagement to ensure clarity and alignment throughout the evaluation process. 18 CDC’s Program Evaluation Framework Action Guide. https://www.cdc.gov/evaluation/media/pdfs/2024/12/FINAL-Action-Guide-for-DFE-12182024_1.pdf. Health Policy Evaluation Guideline What are the types of health policy evaluation? 29 3 Activities Activities are the actions undertaken by the The main activities under the policy include the health policy and its implementing bodies to dissemination of the policy, standardization of achieve the desired outcomes within the target the health insurance guidelines, and population. 
Evaluating these activities involves improvement of the awareness among health assessing whether they are being implemented practitioners about the medical malpractice as planned and whether they are contributing insurance. to the intended outcomes. 4 Resources / Resources and inputs refer to the people, The main resources used for the policy Inputs funding, and information required to implement implementation are the human resources. health policy activities. In the evaluation context, it is important to assess whether the necessary resources are available. If the intended outcomes are not being achieved, the evaluation should consider whether resource constraints or misallocations are contributing factors. 5 Outputs Outputs are the direct products of the health An increasing number of insured healthcare policy activities, typically represented as practitioners, improved knowledge on tangible deliverables. Evaluating outputs malpractice insurance, and establishment of involves measuring whether the activities are clear insurance guidelines are the main producing the expected results in quantifiable expected outputs of the policy. terms. 6 Outcomes Outcomes represent the actual changes that Improved patient safety, reduced incidence of occur as a result of the health policy, directly medical errors, enhanced trust in healthcare related to its goals and objectives. Evaluating services, and reduced financial burden on the outcomes involves assessing short-term and health practitioners are the main expected long-term changes: outcomes of the Medical Malpractice • Short-term outcomes should be evaluated Insurance for Other Health Practitioners’ Policy. to determine whether the policy is beginning to have the desired effects, such as changes in attitudes, behavior, or health status, – for example, within the first one to three years of implementation. • Long-term outcomes involve evaluating the sustained effects of the policy such as broader changes in health behavior, practices, or status – for example, over four to six years – that build on short-term outcomes. 7 Impact Impact refers to the broader, after long-term Improved patient satisfaction, increased life results of health policy implementation. expectancy, better overall healthcare quality, Evaluating impact involves assessing the and strengthened healthcare system are the overall effect of the policy on public health, main expected impacts of the Medical including organizational-, community-, or Malpractice Insurance for Other Health system-level changes. This evaluation focuses Practitioners’ Policy. on whether the policy has led to improved public health conditions, enhanced system capacity, and significant changes in the policy environment, ultimately achieving its long-term goals. Source: Adapted from CDC’s Program Evaluation Framework Action Guide, 2024, and the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners, 2024. 30 What are the types of health policy evaluation? Health Policy Evaluation Guideline Beyond understanding the individual components of a health policy, it is crucial to examine how they interact with one another. These interrelationships can be effectively explored using a logic model which serves as a systematic approach to organizing and illustrating the relationships between planned activities and their measurable objectives, offering a detailed explanation of health policy. 
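Where evaluation teams keep policy documentation in electronic form, this chain of components can also be recorded in a simple machine-readable structure. The sketch below is a minimal illustration (not a prescribed format) that captures the main components of the sample malpractice insurance policy, as described in Table 6, so that each level can later be linked to indicators and data sources.

```python
# Minimal illustration: the logic-model chain for the sample policy,
# recorded as a plain Python dictionary (the structure itself is illustrative only).
logic_model = {
    "policy": "Mandatory Medical Malpractice Insurance for Other Health Practitioners",
    "inputs": ["Human resources"],
    "activities": [
        "Disseminate the policy",
        "Standardize the health insurance guidelines",
        "Conduct awareness activities among health practitioners",
    ],
    "outputs": [
        "Increased number of insured healthcare practitioners",
        "Improved knowledge on malpractice insurance",
        "Clear insurance guidelines established",
    ],
    "outcomes": [
        "Improved patient safety",
        "Reduced incidence of medical errors",
        "Enhanced trust in healthcare services",
        "Reduced financial burden on health practitioners",
    ],
    "impacts": [
        "Improved patient satisfaction",
        "Better overall healthcare quality",
        "Strengthened healthcare system",
    ],
}

# Walk the chain from inputs to impacts and check that no level is left empty.
for level in ["inputs", "activities", "outputs", "outcomes", "impacts"]:
    assert logic_model[level], f"Logic model is missing entries at the '{level}' level"
    print(f"{level}: {len(logic_model[level])} item(s)")
```

Keeping the components in this form is optional; its only purpose here is to show how each level of the logic model can be treated as a distinct, checkable element before indicators and means of verification are attached.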
The logical framework presented in this guideline helps evaluators clearly outline the key components of a policy and their logical interconnections. By mapping outputs, activities, and underlying assumptions, the framework demonstrates a clear progression from inputs and actions to the desired outcomes and impacts. It reflects the underlying Theory of Change of the health policy, typically developed during the policy design stage.19 The sample logical framework template below presents the core elements of the framework and illustrates the logical connections among policy components. Figure 4 Logical Framework Template Description of the health policy (title and brief description) Policy structure Indicators Means of verification Assumptions/risks Impact 1. Describe the policy 12. ...then policy impact should be achieved impacts (expected) Outcome 2. Describe the policy Assumptions for policy outcomes (expected) 11. If outcomes are achieved and the assumptions hold true... outcomes 10. ...then outcomes should be achieved Output 3. Describe the policy Assumptions for policy outputs 9 . If outputs are achieved and assumptions hold true... outputs 8 . ....then outputs should be achieved 7. If activities/processes happen and assumptions hold true... Activities / Processes 5. Describe inputs and resources utilized 6. Identify the costs 4. Describe the policy Assumptions for policy to perform activities / and sources thereof used activities/processes activities/processes processes for resources and inputs Source: Adapted from Log Framework Handbook, World Bank.20 The logical framework presented above is designed to support a structured evaluation of health policy by applying two complementary approaches: a vertical (top-down) analysis of policy goals, and a horizontal analysis of measurement components. Together, these approaches help evaluators to understand what a policy aims to achieve and how to verify whether those goals are being met. 19 The Health Policy Maker’s Manual: Integrating Data and Evidence (2024) provides more details on the Theory of Change in policy development. 20 World Bank. 2005. The Logical Framework (Logframe) Handbook: A Logical Framework Approach to Project Cycle Management. https://documents1. worldbank.org/curated/en/783001468134383368/pdf/31240b0LFhandbook.pdf. Health Policy Evaluation Guideline What are the types of health policy evaluation? 31 1. Top-Down Approach: Evaluating the Hierarchy of Policy Goals The vertical logic of the framework follows a top-down approach, enabling evaluators to trace how each level of the policy contributes to achieving the overarching goals. This mirrors the evaluation process, which typically begins by assessing whether the intended impact has been achieved, and then works backward through outcomes, outputs, and activities. Start at the top (impacts): The policy’s long-term objectives (impacts) are the ultimate changes or 1.1  benefits the policy aims to achieve. Move to outcomes: To understand whether the policy is moving towards its impact, evaluators assess 1.2  the outcomes—the medium-term effects or changes directly resulting from the policy’s outputs. Assess outputs: The next step is to look at the outputs, which are the immediate results of the policy’s 1.3  activities, such as training sessions completed, facilities built, or services delivered. Evaluate activities and processes: Finally, evaluators examine the activities and inputs—the specific 1.4  actions and resources used to implement the policy. 
This top-down sequence enables evaluators to trace the logic of the policy from its highest-level goals (impacts) to the specific actions taken (activities), determining whether the policy’s intended objectives have been met at every level. 2. Horizontal Approach: Measuring and Verifying Success at Each Level Once the vertical structure is established, the framework applies a horizontal logic to identify how success will be measured and verified across all levels. This includes three key components: Indicators: These define the metrics or criteria used to measure success at each level, from activities 2.1  to impacts. Means of verification: This explains how the data will be collected or verified to ensure the indicators 2.2  are accurate. Assumptions/risks: These outline any external factors that may influence the success of each level, 2.3  helping evaluators to anticipate challenges and risks. The following sample, based on Saudi Arabia’s Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners, demonstrates how the logical framework can be applied to visualize the interconnections among policy components (Figure 5). 32 What are the types of health policy evaluation? Health Policy Evaluation Guideline  llustrative Logical Framework Developed from the Evaluation Findings of the Mandatory Medical Malpractice Figure 5 I Insurance Policy for Other Health Practitioners Mandatory Medical Malpractice Insurance for Other Health Practitioners (sample) Policy structure Indicators Means of verification Assumptions/risks % of patients very highly or Survey 1 Improved patient highly satisfied with patient satisfaction safety in health services delivered by healthcare 2 Improved healthcare practitioners Impact quality Medical errors per 10.000 National statistics 3 Strengthened health persons reported as a result of system the negligence by healthcare practitioners ...then policy impact should be achieved % of healthcare practitioners Survey 1.1 Enhanced trust in satisfied with the malpractice healthcare services insurance benefits against costs Outcome 2.1 Improved patient safety Policy outcomes are % of allegating patients Survey assumed to be achieved 2.2 Reduced incidence of received malpractice medical errors compensation corresponding to the severity of injury made 3.1 Reduce the financial burden by healthcare practitioner on the health practitioners ...then outcomes should be achieved 1.1.1 Improved knowledge on medical malpractice insurance % of healthcare professionals obtained malpractice Output 2.1.1 Establishment of the insurance coverage National statistics Policy outputs are achieved insurance guidelines 3.1.1 Increasing the number of the insured healthcare practitioners ...then outputs should be achieved Activities / Processes 1.1.1.1 Disseminating the policy 2.1.1.1 Standardization of the health insurance Activities are performed and guidelines Human resources Costs to be determined the necessary resources are allocated 3.1.1.1 Conduct awareness activities about the policy among health practitioners Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council. When constructing a comprehensive logic model, understanding the context—such as the broader environment in which the policy operates, which may present both opportunities and challenges—is another critical consideration. 
This context may include political dynamics, funding, interagency cooperation, competing organizations and interests, social and economic conditions, and the history of the program or agency, or past collaborations. Checklist for Step 2: Describe the Health Policy ☐ Identify the key policy components, including, inputs, activities, outputs, outcomes, and impacts. ☐ Describe each policy component to explore the policy evaluated. ☐ Develop a logical map of the health policy components and determine their relationships. ☐ Create a visual representation of the logical map. ☐ Determine the current stage of policy implementation. ☐ Analyze the context (political, social, financial, etc.) in which the policy is being implemented. Health Policy Evaluation Guideline What are the types of health policy evaluation? 33 Step 3: Design Evaluation What types of methods are used to conduct health policy evaluation? In practice, most evaluation designs employ a combination of methods, integrating both qualitative and quantitative approaches to address questions related to impact, process, and economic (value-for-money) evaluations.21 The common policy evaluation methods are categorized into the following groups: 1. Commonly used research methods. 2. Theory-based evaluation methods. 3. Experimental and quasi-experimental evaluation methods. 4. Value-for-money methods. 5. Synthesis methods. The following subsections discuss in detail the various types of these methods and the considerations for their application. Commonly Used Research Methods When evaluating health policies, there are cases where the mechanism by which a policy induces change is so straightforward that its impact can be directly observed or assessed through process evaluation without the need to account for other influencing factors. For example, in a country where a new nationwide vaccination policy is introduced to combat a specific infectious disease, if the disease incidence sharply declines following the implementation of the policy, it can be directly attributed to the vaccination efforts. This scenario reflects clear causality, as the vaccination policy is the primary intervention, and there is strong confidence that no other significant changes would have occurred in the absence of the policy. Table 7 outlines specific methods that can be applied in such scenarios.22 Table 7 Commonly Used Research Methods Evaluation Description Analytical focus Strengths Limitations Methods Interviews Interviews are qualitative data Interviews enable Can be used to Can be resource- and Focus collection methods that involve in-depth exploration of gather detailed intensive and Groups direct, in-depth conversations with health policies with insights from time-consuming to individuals to explore their participants such as individuals directly conduct and knowledge, experiences, or policy makers, involved in or affected analyze. perceptions about a policy. healthcare providers, by the health policies. and community Does not provide Focus groups bring together members. Helps to uncover numerical multiple participants to discuss a underlying reasons estimates. topic, promoting interaction and a Focus groups are behind stakeholders’ range of views. 
useful for eliciting views and sheds light There may be a views from a diverse on patterns emerging risk of bias in the Both methods are used to gather group of stakeholders, from other pieces of views collected, nuanced, context-rich insights that providing a broad evidence, such as potentially may not emerge through perspective on the quantitative affecting the quantitative approaches.23 health policy being monitoring data. validity of the evaluated. findings. Capture diverse This method is often viewpoints and used to supplement support triangulation. quantitative data by revealing the rationale behind observed trends. 21 Magenta Book, page 41. 22 Magenta Book, page 42. 23 Patton, M.Q. 2002. Qualitative Research and Evaluation Methods. SAGE Publications. 34 What are the types of health policy evaluation? Health Policy Evaluation Guideline Evaluation Description Analytical focus Strengths Limitations Methods Case Studies Case studies are a method of In-depth investigation Captures real-life It is challenging to inquiry that involves an in-depth of specific health situations in depth generalize findings examination of a single or small policies within their and detail, aiding in to different number of cases within their real-world context. the understanding of contexts, real-life contexts. They are complex health policy situations, or particularly valuable in Subjects are often issues. health policies, understanding the implementation purposely selected to limiting the and effects of complex health represent unique or Works well in broader policies.24 typical cases, combination with or applicability of the revealing critical supplementing other insights gained information about the methods, such as from the case implementation and surveys. study. outcomes of health- related actions. Helps to Interpretation may communicate be influenced by Often uses multiple effective health researcher bias. sources of evidence, policies to Often time and such as interviews, stakeholders. resource- documents, and intensive. observations to gain a holistic understanding of policy processes and outcomes. Surveys Surveys use structured Commonly used to An effective method Less useful for questionnaires to collect data collect data from a for obtaining providing in-depth from a large population or large number of information from a insights into the sample.25 individuals, such as large number of nuances of health healthcare providers, participants, policy. patients, or members providing a broad of the public affected overview of the health Response-rate by a health policy. policy being issues can evaluated. decrease the Quantifies attitudes, quality and knowledge, and Provides data reliability of the behaviors. suitable for statistical findings, analysis that, if potentially leading Provides population- well-designed, can be to biased or level data for generalized to the incomplete data. monitoring or population of interest. comparison across demographic groups. 24 Yin, R.K. 2014. Case Study Research: Design and Methods. SAGE Publications. 25 Dillman, D.A., Smyth, J.D., and Christian, L.M. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. Wiley. Health Policy Evaluation Guideline What are the types of health policy evaluation? 
35 Evaluation Description Analytical focus Strengths Limitations Methods Observational Observational studies are Involves observing Allows for a deeper Participants may Studies qualitative or mixed-methods and noting the understanding of how alter their behavior (including approaches that involve behavior of if they know they individuals experience Ethnography) systematically watching and participants, including a health policy in are being observed recording behaviors, interactions, healthcare providers practice. (the ‘Hawthorne or events as they naturally occur in and policy effect’), affecting real-life settings, without implementers, within Observation can help data accuracy. interference by the researcher. In their usual improve the accuracy the context of health policy environments to of other data by Resource- evaluation, these studies are used understand how reducing biases that intensive, with to understand how policies are health policies impact arise from self- potential ethical implemented and experienced in day-to-day practices. reporting. implications and practice, especially by frontline practical barriers staff and affected populations. Often supplemented to implementation. with interviews to Ethnography, a more immersive contextualize the form of observational research, observations and build involves extended engagement in relevant theories. the field—clinics, community settings, or policy offices—where the researcher participates in and observes daily routines to understand the cultural, organizational, and social dynamics influencing policy outcomes. Ethnography may also involve informal conversations, field notes, and reflexive analysis to generate deep, contextualized insights. This method is particularly useful when evaluating complex interventions, uncovering discrepancies between policy as planned and policy as enacted, and identifying unanticipated outcomes.26 Source: Adapted from Magenta Book, pages 42–43. 26 Hammersley, M., and P. Atkinson. 2019. Ethnography: Principles in Practice. Routledge. 36 What are the types of health policy evaluation? Health Policy Evaluation Guideline  valuating the Implementation of the Mandatory Medical Malpractice Insurance for Other Health Sample Case E Practitioners’ Policy Policy context Following a three-year implementation of the Mandatory Medical Malpractice Insurance for Physicians, the government of KSA extended the coverage of the policy across 18 specialties of health practitioners. The Saudi Health Council conducted a survey among health practitioners to evaluate the implementation status of the policy. Focus of the survey The survey aimed to provide a comprehensive assessment of this policy among other health practitioners by measuring the compliance of healthcare providing entities, the number of insured other healthcare workers, and the level of awareness and understanding of the policy across the health practitioners. The survey objective includes the following: • Measuring the percentage of healthcare practitioners who possess the mandatory medical malpractice insurance coverage. • Measuring the level of awareness and understanding of the policy across the healthcare practitioner through a survey. • Assessing the compliance of public and private entities providing healthcare in requiring their healthcare professionals to be insured against malpractice. • Identify the challenges and limitations in implementing the Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy. 
• Provide actionable recommendations to enhance the designing of the implementation process for the Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy. Survey methodology A cross-sectional study was conducted using a random sample, with two specific questionnaires created— one for nonphysician healthcare practitioners and the other for healthcare facilities providing the service. This design aimed to collect comprehensive data within a specified period from June to September 2024, helping to understand the perspectives of healthcare practitioners and facilities towards the mandatory cooperative insurance policy against medical errors for nonphysician healthcare practitioners. The study includes several subobjectives: measuring the percentage of non-physician healthcare practitioners who have medical error insurance; assessing the level of awareness and understanding of the policy among nonphysician healthcare practitioners through a questionnaire; evaluating the compliance of public and private healthcare institutions and facilities with the mandatory cooperative insurance requirements for nonphysician healthcare practitioners against medical errors; and identifying challenges in implementing the mandatory medical insurance policy against errors for nonphysician healthcare practitioners. The study also contributed practical recommendations to enhance the design and implementation of the mandatory medical insurance policy against errors for nonphysician healthcare practitioners, providing valuable information for decision-makers. Additionally, data were analyzed using advanced statistical methods with the IBM® SPSS statistical software to ensure the accuracy and objectivity of the results. Various data analysis methods were used, including descriptive analysis that covered basic statistics such as means, frequencies, and percentages to understand the general distribution of the data. Inferential analysis was also conducted through hypothesis tests such as the T-test and chi-square test to determine if there were statistically significant differences between groups. Survey sample A total of 261 responses were collected from the healthcare facilities, which included a variety of institutions such as hospitals, clinics, primary care centers, polyclinics, and optical shops. Additionally, 1,112 healthcare practitioners responded to the questionnaire. After excluding physicians and dental assistants, the sample consisted of 1,090 healthcare practitioners. Following the removal of duplicate responses, the final sample size was 1,069 healthcare practitioners. Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council. Health Policy Evaluation Guideline What are the types of health policy evaluation? 37 Theory-based evaluation method Theory-based approaches offer a valuable method for evaluating health policies by focusing on the specific pathways through which a policy is expected to produce its effects.27 These approaches go beyond simply measuring outcomes; they aim to uncover the underlying mechanisms and contextual factors that drive change. Unlike methods that prioritize precise estimates of effect size, theory-based evaluations emphasize understanding the policy’s contribution to observed outcomes, making them particularly useful in complex, real-world settings where multiple factors influence results. 
These methods are particularly effective when evaluating complex health policies or those in complex policy environments. In such cases, where measuring exact impact is difficult, theory-based methods help to confirm whether the policy is moving in the intended direction. They provide insights into why a policy worked or did not work, making them valuable for adapting policies to different populations, settings, or time periods. While theory-based methods can incorporate various evaluation techniques, they are especially suited for those that explore the processes within the policy’s implementation. Theory-based impact evaluation methods are particularly appropriate when one or more of the following conditions apply:28 • The policy environment is complex, involving a mix of different policies. • The policy aims to create change within a complex system or in situations where there is adaptive management or modification of the policy. • Outcomes are emergent and cannot be predicted from the start. • It is impossible to develop a suitable counterfactual. • There is a need to understand if the same outcomes would occur in a different setting or context. The methods outlined below are closely aligned with this approach, offering a framework to investigate the effectiveness of health policies in their specific contexts. Table 8 provides an overview of the most common theory-based methods, their application, and challenges.29 Table 8 Types of Theory-Based Evaluation Methods Evaluation Description Analytical Focus Strengths Limitations Methods Realist Investigates how and why a Identifies and tests Helps to refine Requires significant Evaluation policy works (or fails), for causal mechanisms theoretical time, resources, and whom, and under what within different understanding specific expertise. circumstances. It contexts. to health policies and emphasizes identifying informs impact The complexity of the underlying mechanisms Emphasizes evaluations even when analysis can make and the contextual understanding how creating a results hard to conditions that activate stakeholders respond counterfactual is interpret or them, offering nuanced to policy under challenging. communicate. insight into complex specific policies.30 circumstances. Rarely provides quantitative effect sizes. Contribution A structured approach to Tests causal claims Strengthens the Less effective in Analysis evaluating the contribution by developing and argument about the situations where there a policy or intervention verifying a policy’s impact through is significant variation makes to observed contribution story. logical reasoning, in how the policy is outcomes by linking Uses Theory of particularly when direct implemented or in its activities and results, while Change as backbone. attribution is difficult. outcomes. ruling out alternative explanations.31 Enhances credibility of May oversimplify causal claims without complex causal requiring pathways. counterfactuals. 27 Magenta Book, page 37. 28 Magenta Book, page 37. 29 Magenta Book, page 45. 30 Pawson, R., and Tilley, N. 1997. Realistic Evaluation. SAGE Publications. 31 Mayne, J. 2001. “Addressing attribution through contribution analysis: Using performance measures sensibly.” Canadian Journal of Program Evaluation 16 (1): 1–24. 38 What are the types of health policy evaluation? 
Health Policy Evaluation Guideline Evaluation Description Analytical Focus Strengths Limitations Methods Process A qualitative method used Examines a single Can validate causal Evidence may not be Tracing to test hypotheses about case of policy assumptions after the available or reliable. causal mechanisms in a implementation to fact, provided the specific case. It involves test whether the method is applied Time-consuming. systematically collecting expected causal rigorously. and analyzing evidence to mechanisms actually May not generalize trace how an outcome lead to the observed Allows for the beyond a single case. came about, and whether outcomes, based on consideration of the sequence of events the logic map of the alternative matches the theory of policy. explanations. change or alternative causal pathways.32 Bayesian A formal reasoning method Enhances the rigor of Strengthens evidence Resource-intensive, Updating that uses probabilistic logic health policy assessment by using requiring skilled to revise the level of belief evaluations by probabilistic reasoning, facilitators and in a causal relationship as combining with other especially useful when extensive expertise. new evidence becomes methods to update other logical methods available. In policy the probability of are applied. evaluation, it is often contribution claims combined with qualitative based on new methods to assess the evidence. strength of contribution claims under uncertainty.33 Contribution A structured approach that A mixed-method Efficiently focuses on Requires time for the Tracing merges the logic of process approach that evidence that can policy’s effects to tracing with Bayesian combines enhance confidence in become evident. reasoning. It engages stakeholder a policy’s contribution, stakeholders in assessing participation with while minimizing bias Must also explore contribution claims by systematic criteria through critical peer other potential causes. defining expected for data collection. It review. outcomes, testing evidence includes a Not suitable for against competing “contribution trial” comparing different explanations, and where all policies. quantifying confidence stakeholders evaluate levels in the results.34 what evidence will support or refute a policy’s impact. Qualitative A cross-case analysis Compares multiple Well-suited to Requires consistent Comparative method based on set policy cases to complexity. data across cases and Analysis theory that identifies identify patterns and careful analysis to (QCA) combinations of causal combinations of Reveals multiple determine which conditions associated with factors associated pathways to success or factors are most an outcome. It is with desired or failure. successful in different particularly suited for undesired outcomes, contexts. analyzing complex, using qualitative Supports cross-case context-dependent policies insights. learning. Interpretation of by revealing multiple results can be pathways to success or complex. failure across several cases.35 32 Beach, D., & Pedersen, R. B. 2013. Process-Tracing Methods: Foundations and Guidelines. University of Michigan Press. 33Befani, B., and Stedman-Bryce, G. 2017. “Process Tracing and Bayesian Updating for Impact Evaluation.” Evaluation 23 (1): 42–60. https://doi. org/10.1177/1356389016654584. 34Mayne, J. 2012. “Making causal claims.” ILAC Brief 26. https://hdl.handle.net/10568/70211. See also Befani, B., D’Errico, S., Booker, F., and Giuliani, A. 2016. 
“Clearing the fog: new tools for improving the credibility of impact claims,” https://www.iied.org/17359iied. 35Ragin, C. C. 1987. The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. University of California Press; and Schneider, C. Q., and Wagemann, C. 2012. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis. Cambridge University Press. Health Policy Evaluation Guideline What are the types of health policy evaluation? 39 Evaluation Description Analytical Focus Strengths Limitations Methods Outcome A participatory and Gathers evidence of Encourages Resource-intensive Harvesting retrospective method that change and traces it stakeholder and may be difficult to identifies outcomes back to assess the participation, providing apply in large-scale or (intended or unintended) health policy’s real-time insights and highly complex health and works backward to contribution, with clarity during policy policies. determine how a policy or ongoing stakeholder implementation. program may have involvement for influenced them. It does not monitoring. Flexible and adaptive to rely on preset indicators complex environments. and is especially useful in dynamic or uncertain Useful when outcomes environments.36 are not predefined. Most A qualitative, participatory A participatory Useful for evaluating Time-consuming, Significant technique that gathers method that involves unpredictable resource-demanding, Change significant change stories collecting and outcomes or when and requires skilled (MSC) from stakeholders to selecting significant stakeholders need to facilitation to ensure understand the most change stories from agree on prioritized meaningful results. valued outcomes of the stakeholders. results. policy. It involves group Limited dialogue and selection of Fosters understanding generalizability. stories that reflect and engagement meaningful impacts, among stakeholders helping surface unintended and enhances learning. or less tangible results.37 Captures rich qualitative data. Source: Adapted from Magenta Book, page 45. Experimental and quasi-experimental evaluation methods Experimental and quasi-experimental methods provide robust frameworks for measuring the impact of health policies by comparing observed outcomes in both controlled and real-world settings. The fundamental principle behind the experimental and quasi-experimental methods is the use of a counterfactual—a comparison between observed outcomes in a group that received the health policy intervention and a control group that did not. • Experimental designs: The intervention and control groups are made effectively identical through randomization, ensuring that any differences in outcomes can be attributed to the policy itself – for example, Randomized Controlled Trials (RCTs). • Quasi-experimental designs: The groups may differ in known ways, but these differences are accounted for analytically during the evaluation process. Depending on the scale of the health policy evaluated, the groups in these evaluations can consist of individuals receiving health care services, healthcare facilities, communities, or even entire regions. It is crucial to minimize interaction or “mixing” between the groups to prevent bias, such as the intervention’s “contamination” of control groups. 
By collecting and analyzing comparable data from both the intervention and control groups, evaluators should be able to confidently determine the extent to which any observed changes in health outcomes are attributable to the policy. These methods are particularly useful when it is important to quantify the average impact or net benefit of a health policy intervention. The selection of either an experimental or quasi-experimental evaluation method depends on several key factors: • Randomization feasibility. For experimental methods like RCTs,38 it is crucial to determine if people receiving health care services can be randomly assigned to receive the health policy intervention. If randomization is not feasible, quasi-experimental methods, which do not rely on random assignment, should be considered. 36 Wilson-Grau, R., &and Britt, H. 2012. Outcome Harvesting. Ford Foundation. 37Davies, R., & Dart, J. 2005. The ‘Most Significant Change’ (MSC) Technique: A Guide to Its Use. https://www.mande.co.uk/wp-content/uploads/2005/ MSCGuide.pdf. 38There are several variations of RCTs to suit different needs in health policy evaluation: (1) Factorial RCTs independently randomize participants to multiple interventions; (2) Cluster RCTs randomize groups of participants, such as entire communities or healthcare facilities, rather than individuals; (3) Stepped-Wedge RCTs apply an intervention sequentially and at random to different groups of participants. 40 What are the types of health policy evaluation? Health Policy Evaluation Guideline • Expected effect size. Consider the anticipated magnitude of the policy’s impact on health outcomes. Larger expected effects might be easier to detect with quasi-experimental methods, whereas smaller effects might require the rigor of an RCT to be confidently identified. • Data availability. Evaluate the type and amount of data available for analysis. Experimental methods may require baseline data and follow-up data for both intervention and control groups, whereas quasi-experimental methods may need historical data or data on specific variables that should be used to create a comparable control group. • Control group availability. Determine whether suitable control groups can be identified. For experimental methods, this involves creating control groups through randomization. For quasi-experimental methods, control groups should be identified through natural experiments, historical comparisons, or matching techniques. To assist in selecting the most appropriate experimental or quasi-experimental method for evaluating a health policy, a set of guiding questions should be used. Figure 6 illustrates the decision-making process for choosing these methods. By following this flowchart, evaluators should systematically assess the conditions under which each method is appropriate, ensuring that the selected approach is well-suited to the specific evaluation context. If none of these methods are suitable, theory-based methods should be considered as an alternative approach. Figure 6 Choosing Experimental and Quasi-Experimental Methods Can you compare groups affected and not affected by the intervention? No Theory-based methods Yes Yes Can you assign participants randomly? No Explore experimental methods Explore quasi-experimental methods Consider these questions simultanesouly Do you have information on both Difference in Can you assign intervention and control groups before Yes difference may Individual RCT Yes individuals at random? and after the intervention? 
be an option Are individuals assigned to intervention Regression and control groups based on a cut-off in Discontinuity Do you need to assign to Yes Design may be Factorial RCT Yes multiple interventions? a pre-intervention measure? an option Do you have individual-level data on A matching known factors influencing participation Yes approach may Can you assign groups for both intervention and control groups? be an option Clustered RCT Yes at random? Can you use historical data to yes Synthetic construct a 'clone' of a group receiving Yes control may be an intervention? an option Stepped-wedge Can you yes assign Yes groups sequentially and RCT Instrumental at random? Is there an external factor that affects the likelihood of being affected, but not Yes variables may the outcomes of interest? be an option Trends before and after but no Interrupted time Yes series may be an concurrent control group? option Can the outcome affect the likelihood Timing of Yes events may be of the intervention? an option Source: Magenta Book, page 47. Building on this decision-making process, Table 9 provides a comprehensive comparison of the experimental and quasi-experimental methods commonly used in health policy evaluation. This table is designed to assist evaluators by detailing the key characteristics, strengths, and limitations of each method. By consulting this table, evaluators can make informed decisions that align with the specific objectives and context of their health policy assessments. Health Policy Evaluation Guideline What are the types of health policy evaluation? 41 Table 9 Detailed Overview of Experimental and Quasi-Experimental Methods Evaluation Description Analytical Focus Strengths Limitations Method Randomized Randomly assigns individuals, Estimates causal Gold standard for Often costly and Controlled communities, or institutions to impact by comparing internal validity; time-consuming; ethical Trial (RCT) treatment or control groups to outcomes between controls for both and logistical isolate the causal effect of a randomly assigned observed and constraints in real-world health policy.39 groups. unobserved health policy settings. confounders. Interrupted Analyzes outcome trends Detects immediate Useful when Requires clear Time Series before and after policy and long-term randomization is not intervention timing; (ITS) implementation to detect changes in outcomes feasible; controls for vulnerable to other whether a significant shift linked to policy timing. baseline trends and simultaneous events occurs post-intervention, seasonality. (“history bias”). assuming prepolicy trends would have continued unchanged.40 Difference- Compares changes in Isolates policy effects Controls for Requires strong parallel in- outcomes over time between by subtracting out time-invariant trends assumption; less Differences a policy-exposed group and a trends common to differences; useful reliable when groups (DiD) comparison group, assuming both treated and with observational differ in trend similar trends would have control groups. data and phased trajectories. occurred without the policy.41 policies. Regression Exploits a pre-determined Estimates local causal High internal validity Results apply only to Discontinuity eligibility cutoff (e.g., income, effects around the near the cutoff; observations near the Design age) for assigning a policy to cutoff point. avoids some ethical threshold; requires (RDD) estimate impact by comparing issues of dense data around the those just above and below randomization. 
cutoff. the threshold.42 Propensity Matches individuals who Adjusts for Controls for Does not address Score received the policy with similar confounding by observable unobserved Matching individuals who did not, based balancing covariates differences; intuitive confounding; quality (PSM) on observable characteristics, across policy and to communicate and depends on matching to estimate the treatment comparison groups. implement. quality and data effect.43 richness. Synthetic Constructs a weighted Compares observed Useful for policy Requires extensive Control combination of untreated units outcomes of treated evaluation when no preintervention data; Method to serve as a synthetic control unit to a single control group sensitive to selection of (SCM) group, against which the counterfactual is available; predictor variables. policy’s effects on the treated constructed from transparent and unit are compared.44 similar untreated reproducible. units. Instrumental Uses an external factor Estimates policy Enables causal Finding a valid Variables (instrument) that influences impact in situations inference without instrument is difficult; policy exposure but is where unmeasured randomization when interpretation limited to unrelated to the outcome, to confounding would a valid instrument compliers (local identify causal effects.45 bias other methods. exists. average treatment effect). 39Zabor, E. C., Kaizer, A. M., and Hobbs, B. P. 2020. “Randomized controlled trials.” Chest 158 (1 Suppl): S79–S87. https://doi.org/10.1016/j.chest.2020.03.013. 40Lopez Bernal, J., Cummins, S., and Gasparrini, A. 2017. “Interrupted time series regression for the evaluation of public health interventions: A tutorial.” International Journal of Epidemiology 46 (1): 348–355. https://doi.org/10.1093/ije/dyw098. 41Wing, C., Simon, K., and Bello-Gomez, R. A. 2018. “Designing difference in difference studies: Best practices for public health policy research.” Annual Review of Public Health 39: 453–469. https://doi.org/10.1146/annurev-publhealth-040617-013507. 42 Sasabuchi, Y. 2022. “Introduction to regression discontinuity design.” Annals of Clinical Epidemiology 4 (1): 1–5. https://doi.org/10.37737/ace.22001. 43Austin, P. C. 2011. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research 46 (3): 399–424. https://doi.org/10.1080/00273171.2011.568786. 44Bonander, C., Humphreys, D., & Degli Esposti, M. (2021). Synthetic control methods for the evaluation of single-unit interventions in epidemiology: A tutorial. American Journal of Epidemiology, 190 (12), 2700–2711. https://doi.org/10.1093/aje/kwab211. 45Iwashyna, T. J., & Kennedy, E. H. (2013). Instrumental variable analyses: Exploiting natural randomness to understand causal mechanisms. Annals of the American Thoracic Society, 10 (3), 255–260. https://doi.org/10.1513/AnnalsATS.201303-054FR. 42 What are the types of health policy evaluation? Health Policy Evaluation Guideline Evaluation Description Analytical Focus Strengths Limitations Method Timing of Models the relationship Examines temporal Captures Computationally Events between timing of policy patterns in outcomes anticipatory effects intensive; requires exposure and timing of relative to policy and delayed precise policy timing outcomes, capturing dynamic engagement. impacts; useful in and long follow-up effects over time while staggered rollouts. periods. accounting for observed and unobserved factors. 
Source: Adapted from Magenta Book, page 48.

Value-for-Money evaluation methods46
In a health policy evaluation, several methods are commonly used to assess value-for-money by comparing the costs and benefits of different policies. These methods guide policy makers in determining the most efficient allocation of resources to achieve desired health outcomes. To aid in selecting the most appropriate economic evaluation method, Table 10 provides a detailed overview of each method, including its key features, when to use it, pros, and cons.

Table 10: Overview of Commonly Used Value-for-Money Evaluation Methods

Cost-Effectiveness Analysis (CEA)
Description: Compares the costs and health outcomes of one or more health policies. Estimates the cost to achieve a unit of health outcome—such as a life-year gained, QALY, or DALY—relative to another policy or the status quo,47 for example, cost per life-year gained from a national cancer screening policy compared to no screening.
Analytical focus: Measures the cost per unit of health outcome – for example, cost per life-year gained – enabling comparison of interventions targeting the same health outcome but differing in costs.
Strengths: Useful for evaluating policies with common health outcomes; supports efficient resource allocation within the health sector; often based on clinical or epidemiological data; widely used in health technology assessments.
Limitations: Cannot compare across policies with different outcome types, such as life-years gained vs. cases prevented; does not express health benefits in monetary terms, limiting use in cross-sector decisions – for example, life-years gained from tobacco control vs. maternal deaths averted through skilled birth attendance.

Cost-Benefit Analysis (CBA)
Description: Evaluates whether the total benefits of a health policy, translated into monetary terms, exceed its costs, also expressed in monetary terms,48 such as estimating the net economic return of a nationwide HPV vaccination policy, including reduced cancer treatment costs and productivity gains.
Analytical focus: Examines the net economic value of a health policy by comparing total expected benefits and costs, allowing decision-makers to assess return on investment.
Strengths: Facilitates comparison of policies across diverse sectors by standardizing both costs and benefits in monetary terms; provides a single monetary metric for decision-making; supports broad resource allocation decisions.
Limitations: Monetizing health outcomes can be methodologically complex and ethically sensitive – for example, valuing the long-term benefits of preventing chronic diseases through early childhood nutrition policies.

Source: Adapted from Magenta Book, page 49.

46 See the Economic Evaluation Types subsection for more detailed information.
47 World Bank. Cost-effectiveness analysis. DIME Wiki. https://dimewiki.worldbank.org/Cost-effectiveness_Analysis.
48 Independent Evaluation Group. 2010. Cost-benefit analysis in World Bank projects. World Bank. https://ieg.worldbankgroup.org/sites/default/files/Data/Evaluation/files/cba_full_report1.pdf.
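To make the cost-effectiveness logic in Table 10 concrete, the short sketch below computes an incremental cost-effectiveness ratio (ICER) for a hypothetical screening policy relative to the status quo. The policy names, costs, and QALY figures are invented purely for illustration and are not drawn from any actual evaluation.

```python
# Illustrative sketch: incremental cost-effectiveness ratio (ICER) for a
# hypothetical policy versus the status quo. All figures are invented.

def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost per unit of health outcome (e.g., per QALY gained)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("No incremental health effect; the ICER is undefined.")
    return delta_cost / delta_effect

# Hypothetical inputs: total cost (SAR) and QALYs gained per 1,000 people.
screening_policy = {"cost": 4_500_000, "qalys": 120}
status_quo       = {"cost": 1_000_000, "qalys": 70}

ratio = icer(screening_policy["cost"], screening_policy["qalys"],
             status_quo["cost"], status_quo["qalys"])
print(f"ICER: {ratio:,.0f} SAR per QALY gained")  # 70,000 SAR per QALY
```

A cost-benefit analysis would instead monetize the health gains (for example, by applying a value per QALY) and report the result as a net benefit in monetary terms.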
Synthesis methods
Synthesis methods are vital in health policy evaluation as they allow for the integration of findings from various studies to form a comprehensive understanding of a policy's impact and its implementation. These methods help to address key evaluation questions by combining results from different methodologies, whether quantitative, qualitative, or mixed-methods. By synthesizing findings around specific evaluation questions, rather than merely presenting individual results, these methods aim to provide a more robust and cohesive evidence base—a process often referred to as "triangulation." This approach can enhance the reliability of the evidence by providing a consensus where multiple sources align. However, if the evidence is conflicting, evaluators must carefully analyze discrepancies, consider alternative explanations, and seek additional data, if necessary.

In addition to their role post-evaluation, synthesis methods can be employed at the pre-evaluation stage to integrate existing knowledge on a health policy topic. This helps to identify what is already known and highlights any gaps in the evidence base, guiding the design of new policies or interventions. Formal synthesis techniques, such as meta-evaluation and meta-synthesis, are particularly useful for bringing together the findings from multiple studies on the same policy issue. These methods involve systematically combining data to create a coherent narrative that addresses specific evaluation objectives. They rely on strict protocols and predefined criteria to assess the quality of included studies, ensuring that the synthesis is both comprehensive and credible.

A key challenge with synthesis methods is their reliance on existing data. The effectiveness of these methods is contingent on the availability and quality of prior studies, making rigorous inclusion criteria and thorough quality assessments essential. Systematic reviews and rapid evidence assessments are examples of synthesis methods that integrate existing literature to provide a narrative summary. While systematic reviews offer a thorough and methodical approach, they can be time-consuming, often requiring several months to complete. Rapid evidence assessments, on the other hand, provide quicker insights but may be less rigorous, making them suitable when decisions need to be made promptly—see Annex 2 for more information.

To assist evaluators in selecting the most appropriate synthesis method for their health policy evaluations, a set of guiding questions is provided below in Figure 7.

Figure 7: Selecting Evidence Synthesis Methods. The figure presents a set of guiding questions, each pointing to a synthesis method: integrating and summarizing lived experiences points to meta-ethnography; tight time pressure (likely a few weeks to a few months) points to a rapid evidence assessment; the need to explicitly account for different study designs and contexts points to realist synthesis; sufficient homogeneity of data for formal analysis points to meta-analysis; and the need for a robust, comprehensive thematic narrative review of the literature points to a systematic review. Source: Adapted from Magenta Book, page 51.

After considering the guiding questions outlined in the figure above, evaluators should use Table 11 below to further refine their choice of synthesis method. The table provides a comprehensive overview of commonly used evidence synthesis methods, detailing their key characteristics, advantages, and potential limitations. This comparison will help evaluators select the most appropriate approach to effectively integrate and summarize evidence in their health policy assessments.
Health Policy Evaluation Guideline Table 11 Overview of Commonly Used Evidence Synthesis Methods Evaluation Description Analytical Focus Strengths Limitations Method Rapid A streamlined literature review Rapidly scopes and Faster than May omit relevant Evidence method that rapidly assesses the synthesizes existing traditional reviews; studies due to time Assessment existing body of evidence on a evidence to inform supports rapid constraints; less (REA) focused policy question. Typically time-sensitive policy decision-making; methodological completed within 2–3 months decisions. can identify depth and using simplified protocols, and evidence gaps for transparency than may optionally incorporate expert future research. full systematic input.49 reviews. Systematic A rigorous method (mixed data) Provides a High Resource- and Review for identifying, appraising, and comprehensive and methodological time-intensive (often synthesizing all empirical unbiased synthesis rigor; minimizes ≥6 months); less evidence that meets prespecified of high-quality bias; supports feasible when criteria to answer a specific evidence to assess evidence-based evidence is sparse health policy question. It policy effectiveness. policymaking. or decisions are minimizes bias through urgent. structured and transparent protocols.50 Meta-Analysis A statistical technique Quantifies the Enhances precision; Quality depends on (quantitative data) that combines magnitude of policy increases statistical original studies; results from multiple eligible impact by power; reveals requires studies to produce a pooled statistically patterns or methodological estimate of the effect of a health aggregating inconsistencies in consistency; policy or intervention, improving comparable findings study results. sensitive to precision and identifying patterns from multiple publication bias and across data sets.51 studies. heterogeneity across studies. Meta- A qualitative synthesis method Interpretive Generates context- Dependent on Ethnography that interprets and translates synthesis of rich, theory- quality of source findings from multiple qualitative stakeholder informed insights; studies; subjectivity studies to generate new perspectives to captures lived in interpretation; conceptual insights into health develop conceptual experience; useful limited policy impacts, particularly from understanding of for complex policy generalizability. lived experiences.52 policy outcomes. issues. Realist An interpretive review method Explores the causal Unpacks Requires domain Synthesis that investigates how, why, and mechanisms by complexity; explains and methodological for whom health policies work (or which policies how and why expertise; time- do not) by analyzing context– produce outcomes policies succeed or consuming; not mechanism–outcome (CMO) in specific contexts fail; adaptable to easily replicable or relationships. Combines realist to build explanatory diverse standardized. evaluation principles with theory. implementation structured literature synthesis.53 settings. Source: Adapted from Magenta Book, page 52. Checklist for Step 3: Design Evaluation ☐ Identify the evaluation methods, considering the data needs, level of comprehensiveness, and resource needs ☐ Utilize the appropriate method(s) to perform the policy evaluation Step 4: Collect Data Overview of Data Collection Following the evaluation design, data collection becomes a critical component of the health policy evaluation 49Crawford, C., Boyd, C., Jain, S., Khorsan, R., and Jonas, W. 
2015. “Rapid Evidence Assessment of the Literature (REAL©): Streamlining the systematic review process and creating utility for evidence-based health care.” BMC Research Notes 8: 631. https://doi.org/10.1186/s13104-015-1604-z. 50Ahn, E., and Kang, H. 2018. “Introduction to systematic review and meta-analysis.” Korean Journal of Anesthesiology 71 (2): 103–112. https://doi. org/10.4097/kjae.2018.71.2.103. 51 Ahn, E., and Kang, H. 2018. “Introduction to systematic review and meta-analysis.” 52Sattar, R., Lawton, R., Panagioti, M., and Johnson, J. 2021. “Meta-ethnography in healthcare research: A guide to using a meta-ethnographic approach for literature synthesis.” BMC Health Services Research 21 (1): 50. https://doi.org/10.1186/s12913-020-06049-w. 53Schick-Makaroff, K., MacDonald, M., Plummer, M., Burgess, J., and Neander, W. (2016). What synthesis methodology should I use? A review and analysis of approaches to research synthesis. AIMS Public Health 3 (1): 172–215. https://doi.org/10.3934/publichealth.2016.1.172. Health Policy Evaluation Guideline What are the types of health policy evaluation? 45 process. It requires meticulous planning, as inadequate preparation or restricted access to data can render the evaluation infeasible, significantly constrained, or prohibitively costly. Furthermore, a poorly designed data collection strategy may result in the acquisition of inaccurate, incomplete, or irrelevant data, leading to flawed conclusions that compromise the validity and usefulness of the evaluation findings. A key step in the data collection process is the gathering of baseline data—information collected before the policy is implemented. Baseline data provides a reference point for measuring the policy’s impact over time. The data collection process begins with the identification of appropriate indicators, followed by determining data needs and sources, and ultimately gathering the necessary data. Effective data collection also requires managing logistics, conducting quality checks, and ensuring that data is collected in sufficient quantity and quality—see sample template in Annex 4. Each of these elements contributes to collecting the right data, in the right amount, at the right quality, in the most efficient and effective way—topics that are further discussed in the following subsections. How to Determine Indicators for Evaluation The focus of the evaluation—identified through the policy description, stakeholder engagement, and the key questions developed during the evaluation design phase—is translated into measurable indicators during the data collection stage. An indicator is a specific, observable, and measurable characteristic or change that reflects progress toward achieving a policy’s intended outcomes. Indicators are essential tools in the evaluation process: they determine what data will be collected, help to standardize assessment efforts, and ultimately inform evidence-based policy decisions. Indicators can be categorized based on their purpose—for example: process indicators, which track the implementation of activities; and outcome indicators, which measure the results or changes brought about by those activities. Evaluators should either select existing indicators or develop new ones to address the evaluation focus and stakeholder questions. When feasible, using existing indicators is advantageous, as they are often validated, come with defined data sources, and support comparability across policies and contexts. 
Pretested indicators also improve the efficiency and credibility of the evaluation process. However, if existing indicators do not fully capture the specific processes, outcomes, or impacts of the policy in question, evaluators may develop new indicators, tailored to the policy’s unique context and goals. Regardless of whether existing or new indicators are used, they should be aligned with or contribute to prioritized health indicators (Table 12). See also Annex 5. To ensure quality and relevance, evaluators should follow established criteria for indicator selection or development. These criteria are detailed in Annex 6.  Table 12 I llustrative Indicators Developed from the Evaluation Findings of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners List of indicators to be used for illustrative purposes Dimensions for Policy questions to be answered Indicators (illustrative) addressing the policy assessment questions Patient safety Does the Mandatory Medical Malpractice % of patients’ allegations against healthcare Insurance for Other Health Practitioners Policy practitioners for negligence contribute to improving patient safety? % of malpractice lawsuits per 1000 patients % of patients very highly or highly satisfied with patient safety in health services delivered by healthcare practitioners % of healthcare practitioners optimistic about the impact of the Malpractice Insurance Policy in reducing the number of medical errors Quality of care Does the Mandatory Medical Malpractice Medical errors per 10,000 persons reported as Insurance for Other Health Practitioners Policy a result of the negligence by healthcare contribute to improvement in healthcare practitioners quality? % of clinical guidelines actively used in service delivery by healthcare practitioners % of healthcare practitioners using standard clinical guidelines in service delivery 46 What are the types of health policy evaluation? Health Policy Evaluation Guideline List of indicators to be used for illustrative purposes Risk mitigation Does the Mandatory Medical Malpractice % of patients receiving malpractice Insurance for Other Health Practitioners Policy compensation against the allegations for reduce financial burden for healthcare negligence of healthcare practitioners practitioners? % of allegations patients received malpractice compensation corresponding to the severity of injury made by healthcare practitioner Amount of insurance payments per patient claiming for the malpractice compensation Does the Mandatory Medical Malpractice % of healthcare professionals obtained Insurance for Other Health Practitioners Policy malpractice insurance coverage reduce financial burden for healthcare practitioners? % of healthcare providers sharing malpractice insurance premiums with healthcare practitioners % of annual malpractice insurance premiums in the total income of healthcare practitioners % of healthcare practitioners satisfied with the malpractice insurance benefits against costs Profitability of MPI Is the Mandatory Medical Malpractice % of insurance payments in total malpractice for insurance Insurance for Other Health Practitioners Policy insurance premiums companies attractive for the insurance companies? Access to Does the Mandatory Medical Malpractice % of healthcare practitioners withdrawing from healthcare Insurance for Other Health Practitioners Policy the delivery of healthcare services with higher services affect access to healthcare services? 
risks Efficiency in health Does the Mandatory Medical Malpractice Number of healthcare services per patient service delivery Insurance for Other Health Practitioners Policy provided by healthcare practitioners affect the efficiency of health service delivery? Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council. What data sources should be considered, and how should they be collected for an evaluation? Once the activities and outcomes to be measured have been identified and the indicators for tracking progress determined, selecting the appropriate data collection methods and sources becomes essential. A critical decision is whether existing data sources—secondary data collection—are sufficient to measure the indicators, or if new data must be gathered through primary data collection. Whenever possible, leveraging secondary data sources helps minimize both the financial costs of the evaluation and the burden on respondents. Common secondary data sources in health policy evaluation include: • Existing administrative and monitoring data. Administrative data, such as healthcare resource information, services delivered, and beneficiary records, are often collected for operational purposes and can be repurposed for evaluation. Monitoring data, gathered throughout the policy cycle, help to track policy progress and can address policy questions related to inputs, processes, and outcomes. • National health surveys. These surveys provide large, representative samples, offering valuable statistical insights into health behaviors and outcomes across populations. • Disease-specific health registries. These registries contain detailed information about specific diseases, which can be useful for evaluating health policies targeting certain conditions. • Vital statistics. Data from birth and death records, among others, provides critical information for evaluating the long-term impacts of health policies on population health. • Health surveillance systems. These systems track disease outbreaks, health trends, and outcomes, making them crucial sources of information for health policy evaluations. • Hospital and clinic administrative data. Patient data from healthcare facilities can be repurposed to assess health policy implementation and outcomes. • National population censuses and household surveys. These broad datasets provide essential demographic and socio-economic information, relevant for evaluating the reach and equity of health policies. Health Policy Evaluation Guideline What are the types of health policy evaluation? 47 If secondary data sources do not adequately address the evaluation’s needs, primary data collection methods may be necessary. Primary data collection methods also fall into several broad categories, such as: • New sources of data designed specifically for the evaluation. These include tailored initiatives such as social media data collection to meet specific evaluation requirements. • New surveys conducted for evaluation purposes, including personal interviews, telephone interviews, and instruments completed by respondents through mail or email. • Group discussions or focus groups. These qualitative methods gather in-depth insights from specific populations or stakeholders. • Observation. 
Observing behaviors and processes in real time offers valuable data, especially for understanding how policies are being implemented.
• Document reviews. This involves reviewing records such as medical logs, diaries, or meeting minutes to gather relevant data for the evaluation.
Choosing the appropriate method from the available secondary and primary data collection options requires careful consideration of both context and content. Contextual factors include the budget available for data collection, the timeline for obtaining results, and any ethical considerations. Content-related questions must also be addressed, such as whether the topic involves sensitive issues, observable behaviors, or information the respondent is likely to know. While the primary focus is on selecting the most relevant data sources and methods, mapping out how the evaluation questions, indicators, and data collection methods relate to each other can offer additional clarity—see Figure 8. This type of visualization provides an optional tool to help evaluators organize the relationships between various data sources and formats, allowing for easier tracking and understanding of how each part of the evaluation fits together.

Figure 8: Illustrative Data Flow and Mapping Visualization Developed from the Evaluation Findings of the Mandatory Medical Malpractice Insurance Policy for Health Practitioners. The figure maps each evaluation question to its indicators, the data needed to compute them (for example, total allegations against healthcare practitioners, total patients served, numbers of malpractice lawsuits, and numbers of practitioners with insurance coverage), the data sources (the Saudi Health Council and other relevant entities, in digital format), and the collection frequency (annual for most indicators; every three years for the practitioner satisfaction survey).
Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council.
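For evaluators who prefer to keep this mapping in a machine-readable form alongside the data collection plan (see Annex 4), a minimal sketch is shown below. The structure and field names are illustrative assumptions, not a prescribed schema.

```python
# Minimal sketch of one entry in a question-to-indicator-to-source mapping,
# mirroring the structure of Figure 8. Field names and values are illustrative
# assumptions, not a required schema.
from dataclasses import dataclass
from typing import List

@dataclass
class IndicatorMapping:
    question: str            # evaluation question the indicator helps answer
    indicator: str           # measurable indicator
    data_needs: List[str]    # raw data items required to compute the indicator
    data_sources: List[str]  # where each data item will be obtained
    frequency: str           # how often the data are collected

plan = [
    IndicatorMapping(
        question="Does the Malpractice Insurance Policy contribute to improving patient safety?",
        indicator="% of patients alleging negligence against healthcare practitioners",
        data_needs=[
            "Total number of allegations against healthcare practitioners",
            "Total number of patients served by healthcare practitioners",
        ],
        data_sources=["Saudi Health Council", "Other relevant entities"],
        frequency="Annual",
    ),
]

for entry in plan:
    print(f"{entry.indicator} <- {', '.join(entry.data_sources)} ({entry.frequency})")
```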
Health Policy Evaluation Guideline Employing Multiple and Mixed Methods in Data Collection In many cases, a single data collection method may not adequately capture the complexity of a health policy, or the full range of outcomes being assessed. In such instances, employing multiple or mixed methods of data collection can enhance the depth and credibility of the evaluation. Using mixed methods—the deliberate integration of both qualitative and quantitative approaches—allows evaluators to explore not only what changes occurred, but also how and why they happened. This approach can improve the validity of findings and provide a more comprehensive understanding of the policy’s effects. One common benefit of mixed methods is triangulation, where findings from different methods are cross validated to increase the reliability and robustness of the conclusions. There are two typical designs for using mixed methods in evaluation: 1. Sequential mixed methods. One method is used first to inform or complement the next. For example, focus groups (qualitative) may be conducted to inform the design of a survey instrument (quantitative), followed by follow-up interviews (qualitative or mixed) to explore findings in greater depth. 2. Concurrent mixed methods. Both qualitative and quantitative methods are applied in parallel. For example, focus groups or open-ended interviews may be conducted alongside a quantitative survey to confirm or enrich the interpretation of survey responses. By employing mixed methods, evaluators can address complex evaluation questions more effectively, enhance stakeholder confidence, and strengthen the overall rigor of the health policy evaluation. This approach is especially valuable when the outcomes being evaluated are abstract or when no single high-quality data source exists, as it maximizes the strengths and mitigates the limitations of each method.54 How to Ensure Data Quality Ensuring data quality is essential to produce valid, reliable, and actionable evaluation findings. High-quality data allows stakeholders to trust the conclusions of the evaluation and use the results for informed decision-making. In a health policy evaluation, data quality encompasses several key factors: reliability, validity, accuracy, consistency, and completeness. Each of these components must be carefully addressed to maintain the credibility of the evaluation process and its findings. • Reliability: This refers to the consistency of data over time and across different data collectors or sources. High reliability ensures that repeated measurements under the same conditions produce the same results. For example, if different evaluators collect data from the same health policy intervention, the outcomes should remain consistent across time, methods, or personnel. • Validity: This determines whether the data accurately reflects what it is intended to measure. This factor is crucial in health policy evaluation as it ensures that data supports valid conclusions about the policy’s effects on health outcomes. • Accuracy: Data must be precise and error-free. Accurate data accurately reflects real-world conditions and policy impacts, avoiding bias or misrepresentation. • Consistency: This refers to ensuring that data remains stable across different contexts and time frames. For example, data collected in one phase of a health policy should align with data collected in other phases to allow for meaningful comparisons over time. 
• Completeness: Complete data ensures that no critical information is missing, which can impact the evaluation’s conclusions. Incomplete data can lead to biased results and misinterpretation of the policy’s effects. To achieve high-quality data, evaluators must follow a series of steps designed to enhance the factors influencing data quality. Figure 9 shows the essential steps that ensure that these factors are realized during data collection and analysis. 54 CDC’s Program Evaluation Framework Action Guide, page 63. Health Policy Evaluation Guideline What are the types of health policy evaluation? 49 Figure 9 Steps in Ensuring Data Quality Design of Data Collection Instruments Training of Data Collectors Ongoing Monitoring Ensuring and Review high data quality Data Management and Coding Error Checking and Pretesting Source: Authors. 1. Design data Design well-structured data collection instruments that align with the evaluation’s goals. collection Questions should be clearly worded to avoid ambiguity, and the instruments should be instruments pretested to identify any issues. 2. Train data Proper training is vital to ensure that data collectors understand the protocols and collectors methods for gathering data consistently. Training should cover both technical aspects (such as using data collection tools) and ethical considerations (such as confidentiality). 3. Data Proper data management practices should be implemented to ensure that data is management recorded, stored, and retrieved in an organized manner. This includes coding data and coding systematically to allow for easy analysis and retrieval. 4. Error checking Conduct error checks throughout the data collection and entry process to identify and pretesting inconsistencies or mistakes. Pretesting data collection tools allows evaluators to identify potential issues before large-scale data collection begins, improving data accuracy. 5. Ongoing Regularly review the data quality throughout the evaluation process. Continuous monitoring monitoring can help to identify and resolve any data quality issues before they affect the and review final analysis. By following these steps, evaluators can ensure that the data used in health policy evaluations is of high quality— reliable, valid, accurate, consistent, and complete—which, in turn, contributes to more credible, transparent, and actionable conclusions. What Ethical Considerations Need to Be Addressed in Data Collection and Quality Assurance? It is important to address the ethical considerations involved in data collection and analysis. Ethical data practices help to maintain trust with stakeholders and ensure that the evaluation process respects privacy, confidentiality, and integrity. Following are key considerations. 1. Confidentiality Protecting the privacy of individuals whose data are collected is paramount. Ensure that all personal and sensitive information is kept confidential, and only authorized personnel have access to it. Data anonymization techniques should be used where applicable. 2. Informed Participants in health policy evaluations must provide informed consent, understanding consent the purpose of data collection, how their data will be used, and their right to withdraw from the study at any time without penalty. Consent forms should be clear, accessible, and culturally appropriate—see Annex 7 for a sample. 50 What are the types of health policy evaluation? Health Policy Evaluation Guideline 3. 
Data security Strong data security measures should be implemented to protect against unauthorized access or data breaches. This includes encrypting data, restricting access, and ensuring that data are stored in secure systems. Evaluators should also be prepared to handle any potential ethical dilemmas that arise during data collection, such as breaches of confidentiality or unanticipated harms to participants. By consistently applying these standards and following these steps, evaluators deepen their understanding of a health policy context and significantly enhance the effectiveness and impact of their evaluations. How Much Data Should Be Collected? Determining the appropriate quantity of data is essential to ensuring accurate and reliable conclusions. While some evaluations require data of the highest validity and reliability, particularly when supplemented by research studies, there are instances where a smaller sample or convenience sampling may suffice. The following factors should guide the decision on the quantity of data to collect. 1. Evaluation questions: The data requirements depend on the evaluation’s scope. For example, evaluations focusing solely on policy-related questions may require fewer data points than those with broader research objectives. Additionally, quantitative evaluations generally require more data than qualitative evaluations. The sample size will vary based on the level of detail needed and the types of comparisons being made – for example, comparing different population groups or time periods. 2. Level of jurisdiction: The scale of the evaluation—national, regional, or local—significantly impacts the amount of data needed. Evaluations at higher jurisdictional levels, such as national or regional, typically require larger and more representative samples due to greater population heterogeneity. Probability sampling methods, such as stratified or multistage sampling, may be necessary to ensure that the data is representative of the broader population. These methods help to account for population diversity and improve the reliability of the findings. In contrast, local or community-level evaluations generally involve smaller samples, which are less costly and can still provide sufficient insights. In these cases, non-probability sampling methods like convenience or purposive sampling may be more appropriate when generalizability is less of a concern, but the limitations of such methods should be noted. 3. Size of change (effect size): Evaluations aiming to detect small changes in health outcomes or policy impacts typically require larger sample sizes. This is because small changes are harder to detect and require more data to ensure statistical power. Effect size refers to the anticipated magnitude of the policy’s impact, such as a 5% or 10% reduction in hospital readmissions. The smaller the expected effect size, the larger the sample size needed to detect it with confidence. 4. Statistical power and power calculations: To ensure that the sample size is adequate, power calculations should be used. These calculations help evaluators to determine how much data is needed to achieve a statistically sound evaluation. Key elements of power calculations include » Effect size: The anticipated impact of the policy – for example, reduction in hospital readmissions by 5%. » Significance level: Usually set at 0.05, this represents the likelihood of incorrectly concluding that the policy had an effect when it did not (Type I error). 
» Desired power level: Typically set at 80%, meaning there is an 80 percent chance of detecting a true effect if it exists, thus minimizing the risk of missing significant outcomes (Type II error). Power calculations balance data sufficiency with resource efficiency. For example, detecting smaller changes in health outcomes requires larger sample sizes, while more significant changes can be identified with fewer data. By conducting power analysis, evaluators can ensure that they collect enough data to avoid Type I errors (false positives) and Type II errors (false negatives).55 5. Adjusting for data loss: Evaluators should anticipate potential data loss, such as incomplete responses or dropouts, and adjust the sample size accordingly. This is often done by inflating the calculated sample size to maintain the study’s statistical validity. Alternatively, methods like data imputation can be used to address 55In the context of health policy evaluation, it’s important to recognize the potential for Type I and Type II errors during the analysis phase: Type I Error (False Positive): This occurs when an evaluation wrongly concludes that a health policy had an effect when it did not. For example, an - evaluation might mistakenly attribute a reduction in hospital readmissions to a newly implemented policy, when in fact the change was due to unrelated external factors. This can lead to the continuation of ineffective policies based on incorrect conclusions. Type II Error (False Negative): This happens when an evaluation fails to detect a true effect of the policy. For instance, if a public health campaign - significantly improved vaccination rates, but the evaluation did not have sufficient data or statistical power to detect the change, the policy’s effectiveness may be overlooked. This can result in the abandonment or underutilization of effective policies.  Evaluators must be mindful of these errors and use power calculations to ensure that they collect enough data to make accurate, well-supported conclusions about the policy’s true impact. Health Policy Evaluation Guideline What are the types of health policy evaluation? 51 missing data, ensuring that conclusions remain reliable despite incomplete datasets. How to handle data collection Appropriate data collection and handling are fundamental to ensuring the integrity and security of health policy evaluations. All data used in an evaluation, regardless of its source, must be collected, transferred, stored, processed, and deleted in accordance with national laws and relevant security processes. Data Security and Access Protocols It is critical that everyone handling data is properly trained to understand data security procedures. Data access protocols should clearly define who has authority to access the data, how remote access is managed, and the specific requirements for handling and securing sensitive information. Some key considerations include: • Authorizing data access for different user groups based on roles and responsibilities. • Implementing security measures for remote access, including the use of secure Wi-Fi. • Conducting mandatory data handling and security training. • Masking or encrypting personal data to ensure privacy. Data should be stored in secure formats, such as databases, spreadsheets, or data warehouses. Access to this data should be limited to those with legal entitlement and a work-related need, ensuring that only authorized personnel can read, write, or modify the information. 
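As one concrete illustration of the masking guidance above, the sketch below pseudonymizes a practitioner record before it is stored for analysis: the national ID is replaced with a salted one-way hash, the name is dropped, and the date of birth is generalized to year of birth. The field names and the hashing choice are assumptions made for illustration; actual handling must follow the Personal Data Protection Law and the evaluation's approved data protocols.

```python
# Illustrative sketch: masking direct identifiers before storing evaluation data.
# Field names and the salted-hash approach are assumptions for illustration only.
import hashlib

SECRET_SALT = "replace-with-a-securely-managed-secret"  # store outside the code in practice

def pseudonymize(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed or generalized."""
    safe = dict(record)
    # A salted one-way hash lets records be linked across datasets without exposing the ID.
    safe["participant_key"] = hashlib.sha256(
        (SECRET_SALT + record["national_id"]).encode("utf-8")
    ).hexdigest()[:16]
    del safe["national_id"], safe["full_name"]         # drop direct identifiers
    safe["birth_year"] = record["date_of_birth"][:4]   # generalize to year of birth
    del safe["date_of_birth"]
    safe.pop("city", None)                             # keep only the broader region field
    return safe

raw = {"national_id": "1234567890", "full_name": "Example Name",
       "date_of_birth": "1986-04-12", "city": "Riyadh", "region": "Riyadh Region",
       "insured": True}
print(pseudonymize(raw))
```

Even after such masking, the resulting dataset may still qualify as personal data, so the storage and access controls described above continue to apply.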
Anonymization and De-identification of Data To safeguard privacy, data should be anonymized as early as possible during the evaluation process. This involves removing direct identifiers, such as names or addresses, from the analytical dataset. If retaining this information is temporarily necessary, it should be stored separately with strictly controlled access. Even with anonymization, data may still qualify as personal data, necessitating careful handling under data protection laws. Evaluators should ensure full alignment with national efforts to promote cybersecurity and ethical data governance. In the Kingdom of Saudi Arabia, this includes adhering to the Personal Data Protection Law and relevant national bioethics frameworks. Key references include the National Committee of Bioethics (NCBE)56 and the Personal Data Protection Law.57 These frameworks guide the ethical handling of sensitive data and uphold the principles of privacy, confidentiality, and responsible data use in research and evaluation context. At the conclusion of an evaluation, every effort should be made to fully anonymize the data. This could involve replacing precise identifiers such as date of birth with broader categories like year of birth, or substituting exact locations with general geographic regions. If full anonymization is not feasible and personal data must be retained, appropriate safeguards must be established, including secure storage and restricted access. Risks of Improper Data Handling The risks associated with improper data handling are significant and include breaches of confidentiality or data security, harm to individuals or communities through privacy violations, and legal challenges or reputational damage to the department or government. Mitigating these risks requires adhering to strict data handling protocols and regularly reviewing compliance with legal and organizational standards. Agreements and Roles in Data Collection Agreements among the parties involved in the evaluation process are essential for clarifying roles, responsibilities, and procedures for managing data and conducting the evaluation. These agreements may take the form of legal contracts, memoranda of understanding, or detailed protocols. Elements of an agreement include: • Intended users and uses of the data. • The purpose of the evaluation and key evaluation questions. • Evaluation design and methods. • A summary of the deliverables, timeline, and budget. Creating these agreements helps establish a shared understanding of the evaluation activities and data handling procedures. It also provides a reference point for any necessary modifications during the evaluation process. 56National Committee of Bioethics (NCBE). “About the Committee.” King Abdulaziz City for Science and Technology (KACST). Retrieved from https://ncbe. kacst.edu.sa. 57Saudi Data and Artificial Intelligence Authority (SDAIA). 2022. Personal Data Protection Law. https://sdaia.gov.sa/ar/SDAIA/about/Files/ RegulationsAndPolicies02.pdf. 52 What are the types of health policy evaluation? Health Policy Evaluation Guideline Agreements ensure that all parties involved are aligned on the data security measures, access protocols, and other logistical aspects of the evaluation. Checklist for Step 4: Data collection and analysis ☐ Identify the indicators required to measure the identified evaluation focus and evaluation questions ☐ Decide to choose the indicators from the existing inventory or to develop new ones. 
☐ Consider the availability of data and choose the most appropriate data source(s).
☐ Review various data collection methods and select those that are best suited to the context and content.
☐ Pilot test newly developed indicators to identify and control potential sources of error.
☐ Consider adopting a mixed-method approach for data collection.
☐ Evaluate both quality and quantity issues related to the data collection process and adjust them to the available resources and time.
☐ Create a detailed protocol to guide the data collection process.

Step 5: Justify Conclusions
The next step in health policy evaluation is the justification of conclusions, which increases the likelihood that the evaluation results will be used for decision-making. At this stage, policy evaluators analyze the data, make assertions regarding a policy's outcomes and impacts, and justify these conclusions. Justifying conclusions involves analyzing and synthesizing data, establishing performance standards for the health policy, interpreting the results, and making well-supported claims that align with the expectations of the health system and its stakeholders. When stakeholders—who may have differing perspectives on what constitutes success of a policy—see that the conclusions reflect their values and the policy's intended goals, they are more inclined to use the findings to guide decisions.58

At this stage the indicators are reviewed, results are tabulated, and outcomes are compared and presented in a way that is easily understandable by policy makers, stakeholders, and evaluators. The goal is to present a clear narrative of the policy's effectiveness and impact, based on the evaluation findings. The process of analysis and justification typically goes through the following key stages:
1. Data entry and error checking. Enter the collected data into a secure system or database, ensuring that it is accurate and free of errors. Health policy evaluations often utilize existing systems like health information systems or administrative datasets, but if new data are collected – for example, surveys – it is essential to use appropriate software to enter, check, and accurately tabulate the data.
2. Tabulate data. For each indicator, calculate basic statistics to generate key insights, such as the total number of participants or health beneficiaries, the number of participants who experienced the intended health outcome, and the percentage of participants or regions achieving the policy's desired outcomes.
3. Stratify data by key variables. Analyze the data by breaking it down into relevant demographic or geographic categories, such as age, gender, socioeconomic status, or region. This stratification allows evaluators to assess whether the policy had different impacts across subgroups, ensuring that the policy's outcomes are equitable.
4. Make comparisons. In health policy evaluation, it is important to compare data across different groups, regions, or time periods. This could involve comparing intervention groups with control groups, pre-policy vs. post-policy outcomes, or regional variations in policy impact. Statistical tests may be used to show differences between these groups, helping to identify where the policy was most or least effective.
5. Present data in an understandable format. Use clear visual tools, such as bar charts, pie charts, line graphs, and maps to present the evaluation data. Policy makers and stakeholders need to be able to interpret the findings easily to make informed decisions about the health policy's future.
Visual data presentations help to highlight key trends and outcomes, making complex data more accessible.

58 CDC's Program Evaluation Framework Action Guide, page 74.

Sample Case: Survey to Assess the Implementation of the Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy

Survey overview
The Saudi Health Council conducted a survey aimed at providing a comprehensive assessment of this policy among other health practitioners by measuring the compliance of healthcare-providing entities, the number of insured other healthcare workers, and the level of awareness and understanding of the policy among health practitioners.

Data entry and error checking
The evaluation team collected the survey data based on standardized questionnaires designed separately for health practitioners and healthcare-providing facilities. To ensure the accuracy and objectivity of the results, the evaluation team analyzed the data using IBM® SPSS statistical software and employed advanced statistical methods.

Tabulate data
A total of 261 responses were collected from healthcare facilities, which included a variety of institutions such as hospitals, clinics, primary care centers, polyclinics, and optical shops. Additionally, 1,112 healthcare practitioners responded to the questionnaire. After excluding physicians and dental assistants, the sample consisted of 1,090 healthcare practitioners. Following the removal of duplicate responses, the final sample size was 1,069 healthcare practitioners. The survey provided basic statistics, including frequencies and percentages across the participants.

Stratify data by key variables
The health practitioners participating in the survey were stratified by gender and age group to assess the policy's impact across different demographic groups. Out of 1,069 participants, 556 were males, making up 52.0% of the sample, while the number of females was 513, representing 48.0%. Among the 1,069 participants, the 35–44 years age group constituted the largest group, with 437 participants, equivalent to 40.9% of the total. This was followed by the 25–34 years age group with 419 participants, representing 39.2%. The youngest age group (18–24 years) had 34 participants, accounting for 3.2%. The 55–64 years age group comprised 35 participants, making up 3.3%. There were only two participants aged 65 years or older, 0.2% of the total.

The participants were also stratified by geographic location across the Kingdom's 13 regions. Riyadh region had the largest number, 304 participants, representing 28.4% of the total. This was followed by Makkah region with 201 participants, accounting for 18.8%, and the Eastern region with 183 participants, making up 17.1%. Al-Baha region had only 12 participants, representing 1.1%, making it the region with the lowest percentage among the regions of the Kingdom.

Additionally, the survey designers broke down the participants into groups according to their level of education. The data showed that those with a bachelor's degree represented the largest proportion, at 57.25%. They were followed by those with a diploma, at 32.27%. Others were: participants with a master's degree, 6.74%; a doctoral degree, 2.06%; and a board certification, 1.68%.
Make comparisons
Various data analysis methods were used to understand the general distribution of the data, including descriptive analysis covering basic statistics such as means, frequencies, and percentages. Inferential analysis was also conducted through hypothesis tests such as the t-test and chi-square test to determine whether statistically significant differences existed between groups.

Present data in an understandable format
The survey results were presented in formats such as tables, bar charts, and pie charts to compare the results across the groups identified. The different formats helped to cross-check various data to produce indicators.

Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council.

In evaluations that use multiple methods (quantitative and qualitative), it is important to combine various sources of evidence to reach a comprehensive understanding of the policy's impact. This involves detecting patterns in the data, integrating qualitative insights, such as those from focus groups or interviews, and comparing these with quantitative findings such as health outcomes or cost savings. For example, if a policy aimed at improving access to healthcare shows a quantitative improvement in service utilization, qualitative data from healthcare providers might provide insight into barriers that remain or unintended effects of the policy. This synthesis helps evaluators not only to assess whether the policy met its objectives but also to understand why certain outcomes occurred, providing actionable recommendations for improvement.

In the evaluation process, performance benchmarks59 are also essential tools used to assess the effectiveness and impact of policies. These benchmarks reflect the values and expectations of stakeholders regarding the health policy and are fundamental to sound evaluation practices. Policy makers and stakeholders must work together to define the criteria that will be used to determine whether a policy is deemed successful, adequate, or in need of improvement. This process ensures that the evaluation aligns with national health priorities and the needs of the population.

When conflicting views arise about the quality, value, or significance of a policy, it often suggests that stakeholders are employing different standards or values. Such disagreements should encourage stakeholders to clarify their values and work toward a consensus on how to interpret the evaluation findings. To prevent such conflicts, it is critical to establish clear performance benchmarks for each indicator early in the evaluation process—see Step 1: Identify and Engage Stakeholders. These benchmarks are often based on expected improvements from baseline data and should be realistic yet aspirational, considering the policy's maturity, the capacity of the health system, and the priorities of the stakeholders involved. Early agreement on benchmarks helps to prevent disputes later in the process and contributes to a smoother, more effective evaluation.

Evaluators can apply the following best practices to accurately interpret and communicate evaluation findings:
• Interpret results in relation to the policy's original goals and objectives.
• Tailor the reporting to the intended audience, ensuring that findings are relevant and understandable to policy makers, implementers, and the public.
• Acknowledge limitations, including:
» Potential sources of bias
» Limitations in data validity or reliability
» Constraints in study design or context
• Consider alternative explanations for the observed results.
• Compare findings with those relating to similar policies or interventions in other contexts to identify trends or anomalies.
• Triangulate results across data collection methods to assess consistency and credibility.
• Assess alignment with established theories and previous research findings.
• Check whether the results align with expectations, based on prior assumptions, baseline data, or logic models of the policy.

The set of questions in Table 13 offers a practical tool to enhance the acceptability of judgments made based on the findings of the policy evaluation.

Table 13: Acceptability Check of Evaluation Findings (a response should be recorded against each question)
1. Who will analyze the data (and who will coordinate this effort)?
2. How will data be analyzed and displayed?
3. Against what values and benchmarks will the interpretations be compared in forming the judgments?
4. Who will be involved in making interpretations and judgments, and what process will be employed?
5. How will conflicting interpretations and judgments be dealt with?
6. Are the results similar to what was expected? If not, why are they different?
7. Are there alternative explanations for the results?
8. How do your results compare with those of similar programs?
9. What are the limitations of the data analysis and interpretation process – for example, potential biases, generalizability of results, reliability, and validity?
10. If multiple indicators were used to answer the same evaluation question, were the results similar?
11. How will the findings be communicated to ensure stakeholders interpret them accurately and effectively for decision-making?
Source: CDC's Program Evaluation Framework Action Guide, page 80.

59 A performance benchmark in health policy evaluation is a specific, measurable standard or target used to assess the effectiveness or efficiency of a health policy. It serves as a reference point to determine whether the policy has achieved its intended outcomes or to compare its performance against established norms or peers, often based on historical data, best practices, or policy goals.

Checklist for Step 5: Justify Conclusions
☐ Consider context when analyzing data and making judgments.
☐ Assess results against the available literature and those of similar policies.
☐ Consider alternative explanations of the findings.
☐ Use existing values and benchmarks – for example, KSA Vision 2030 and the Health Sector Transformation Program – as a starting point for comparisons.
☐ Compare the health policy outcomes with those of previous years.
☐ Compare actual findings with the intended outcomes of the health policy.
☐ Document potential biases.
☐ Examine the limitations of the evaluation.

Step 6: Use and Disseminate Evaluation Findings

What is the value of using and disseminating the evaluation findings?
The true value of a health policy evaluation is realized through its use in decision-making.
Since the primary goal of any policy evaluation is to apply the insights gained in enhancing or reforming health policies, this should be a priority from the planning stage of the evaluation and continuously revisited throughout the process. The purpose(s) identified early on must guide how the findings are applied to support evidence-based health policy decisions. Evaluations act as critical feedback mechanisms, helping to guide the development of new policies, refine existing ones, and justify resource allocation. Effective use of health policy evaluations means embedding insights and conclusions into the broader policy-making process, fostering accountability, and creating incentives for continuous improvement. If the results are not used, there is a disconnect between the evidence gathered and real-world policy decisions, resulting in missed opportunities for learning and for applying lessons to improve health outcomes.

More specifically, evaluation findings strengthen health policies by providing evidence of their effectiveness and of progress toward health goals. They also help to identify areas for improvement in policy design and implementation, ensuring that strategies remain adaptive and responsive to the evolving needs of the health sector. Moreover, evaluation results are valuable tools to justify existing funding and advocate for increased investments in health programs. They assist in informed budget planning and allocation, highlight urgent health priorities, and enhance communication with stakeholders, including the public, healthcare providers, and policy makers.

Users of health policy evaluation findings can be categorized into direct and indirect users.
• Direct users: These are the stakeholders directly involved in the design, implementation, and oversight of the health policy. They use the findings to refine policies, improve their implementation, and ensure that intended health outcomes are achieved. For example, health sector policy makers, public health agencies, and program managers directly apply evaluation results to make informed decisions.
• Indirect users: These include the external stakeholders who use the evaluation findings to inform future health policy design, improve similar programs, or ensure the responsible use of public funds. For instance, international organizations, healthcare funders, researchers, or advocacy groups may apply the lessons learned to new policies or campaigns in similar health domains.

How to maximize the use of evaluation findings
To maximize the use of the evaluation findings, the following five essential elements should be considered:60
1. Recommendations. These are suggested actions derived from the evaluation findings. They can enhance the evaluation's impact if they are aligned with stakeholders' concerns and supported by strong evidence. However, if recommendations lack sufficient backing or do not resonate with stakeholder values, they can undermine the evaluation's credibility. The relevance and usefulness of recommendations vary depending on the target audience and the evaluation's goals. Actions taken during earlier stages, such as stakeholder identification, policy description, and data collection, help to ensure that the recommendations are pertinent and beneficial to all parties involved.
2. Preparation. This involves the steps taken to facilitate the practical use of evaluation findings.
Through adequate preparation, stakeholders can:
» Improve their ability to apply new insights effectively.
» Consider the potential influence of the findings on decision-making processes.
» Analyze the possible positive and negative impacts of the findings and identify options for policy enhancement.
3. Feedback. This is the continuous exchange of information among all participants in the evaluation process. It is essential at all stages to build trust and maintain alignment with the evaluation's objectives. Early feedback helps keep the evaluation on track by keeping everyone informed about the policy implementation and the evaluation's progress. As preliminary findings become available, feedback allows stakeholders to provide input on important decisions. Effective feedback is typically gathered through discussions, sharing interim results, and reviewing draft reports.
4. Follow-up. This is a means to provide ongoing support to users after they receive the evaluation findings and start drawing conclusions. Active follow-up can remind users of the intended use of the findings, ensure that the results are applied to the core evaluation questions, prevent misapplication, and safeguard against the loss or neglect of important lessons during complex or politically charged decision-making.
5. Dissemination. This refers to the process of sharing the evaluation findings and methodologies with relevant audiences in a timely and impartial manner. The aim is to achieve full transparency and unbiased reporting. Effective dissemination requires early discussions with intended users and stakeholders about the reporting strategy and adapting the timing, style, tone, source, medium, and format of the information to suit the needs of the audience. Identifying and using the most appropriate communication channels and formats is essential to ensure that the evaluation findings reach and effectively impact all stakeholder groups.

Sample Case: Recommendations Based on the Study Results to Assess the Implementation of the Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy

Recommendations Based on Study Results
The mandatory insurance policy against medical errors is an important step aimed at enhancing patient safety and protecting healthcare practitioners. Based on the results to date, the following recommendations have been formulated to ensure the effective implementation of this policy.
• Mandate healthcare service providers to verify the insurance coverage of healthcare practitioners.
• Introduce supplemental policies to support implementation—such as subsidies for insurance premiums, a phased implementation timeline, and other financial relief mechanisms for specific practitioner categories.
• Review insurance prices in line with the specializations of healthcare practitioners, to avoid imposing additional financial burdens.
• Collaborate with insurance companies to review and standardize the content of malpractice insurance policies, ensuring clarity, consistency, and alignment with policy requirements.
• Study the possibility of exempting some healthcare practitioners, such as those not currently working in healthcare-providing facilities, from mandatory insurance requirements to ensure no unnecessary burdens are imposed.

60 CDC's Program Evaluation Framework Action Guide, page 83.
• Organize workshops and training courses to increase practitioners' awareness of the importance of insurance and its benefits, as well as to clarify the related policies and procedures.
• Regulate the establishment of new insurance companies focusing on the policy to increase access and to foster fair competition.
Implementing these recommendations in a rigorous and well-considered manner will enhance the effectiveness of the mandatory insurance policy against medical errors, contributing to better healthcare and protecting the rights of practitioners and patients alike.
Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council.

How to disseminate evaluation findings
To ensure that evaluation findings are effectively used, it is essential to communicate these findings to stakeholders and make them publicly accessible.61 Evaluations that clearly articulate evidence-based implications for future decision-making hold value only if decision-makers can access and use the findings promptly. Similarly, oversight bodies and the public should be able to easily access, comprehend, and apply these findings to evaluate government policy design, implementation, and outcomes.

Disseminating evaluation findings can be done through various channels and formats. The presentation of evidence should be strategic, guided by the evaluation's objectives, and tailored to the information needs of the intended users.62 When evaluation results are well-synthesized, customized for specific audiences, and directly delivered to them, their usability is enhanced. Tailored communication and dissemination strategies are crucial for ensuring that stakeholders have easy access to the findings, which greatly increases the likelihood that the findings will be used.

One common method is to prepare and distribute an evaluation report. The report should clearly and impartially convey all aspects of the evaluation, presented in a concise and understandable manner. The report does not need to be overly technical or lengthy. A typical structure for such a report might include:63
• Executive Summary
• Background and Purpose
» Policy background
» Evaluation rationale
» Stakeholder identification and engagement
» Policy description
» Key evaluation questions/focus
• Evaluation Methods
» Design
» Sampling procedures
» Measures or indicators
» Data collection procedures
» Data processing procedures
» Analysis
» Limitations
• Results
• Discussion and Recommendations

Effective communication of evaluation findings can also involve diverse methods such as one-page summaries, videos, infographics, data dissemination, newsletters, social media updates, and presentations at conferences and seminars. Engaging with stakeholders to understand their specific evidence needs can guide the selection of the most appropriate communication channels. While it is important not to discuss findings publicly before they have been formally published, involving key stakeholders early, especially with negative findings, can help to manage the messaging effectively before public dissemination.

61 OECD 2020, page 128.
62 OECD 2020, page 130.
63 CDC's Program Evaluation Framework Action Guide, page 86.
A comprehensive dissemination plan for evaluation findings should be developed and implemented to coordinate these efforts more effectively. Such a plan would detail:64
• The list of stakeholders and evaluation users.
• Specific information to be shared with identified stakeholders.
• The timing and purpose of the information dissemination, customized to the needs of different users.
• The format—business, scientific, briefs, single issue/topic or multiple; the choice of communication tools—print, online, social media; and the schedule for publishing information—quarterly, annually, or during seminars and conferences.
• Coverage of all outputs to be published, including reports, underlying data, and research methodologies.

The plan, as outlined in Table 14, should also encompass strategies for enhancing accessibility and engagement, such as using infographics, creating executive summaries, distributing "information nuggets" through social media, and organizing seminars to present findings. This approach ensures that all relevant stakeholders have timely and appropriate access to the evaluation insights, facilitating informed decision-making and policy refinement.

Table 14: Dissemination Plan (Sample case from the Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy)65
For each stakeholder or user group, the plan records the information they want, the format or communication tool, the timing, and who is responsible for sharing the information.
• Ministry of Health: Does the policy contribute to improving patient safety? Does the policy contribute to healthcare quality improvement? Does the policy affect access to healthcare services? Does the policy affect the efficiency of health service delivery? (Format: report; timing: annual; responsible: SHC.)
• Patient Safety Center: Does the policy contribute to improving patient safety? Does the policy contribute to healthcare quality improvement? (Format: report; timing: annual; responsible: SHC.)
• Saudi Commission for Health Specialties: Does the policy reduce the financial burden for healthcare practitioners? Does the policy reduce the financial burden for healthcare-providing facilities? (Format: report; timing: annual; responsible: SHC.)
• Patients: Does the policy contribute to improving patient safety? (Format: press release, social media, policy briefs, etc.; timing: regular; responsible: SHC.)
• Private Sector: Is the policy attractive for insurance companies? (Format: report; timing: annual; responsible: SHC.)
• Health Care Practitioners: Does the policy reduce the financial burden for healthcare practitioners? (Format: press release, social media, policy briefs, etc.; timing: regular; responsible: SHC.)
Source: Developed by the authors using evaluation findings from the implementation of the Mandatory Medical Malpractice Insurance Policy for Other Health Practitioners (2024), in consultation with the Saudi Health Council.

To effectively engage stakeholders and secure their buy-in, it is advisable to collaboratively develop a usage and dissemination plan. This collaborative approach not only ensures alignment with their expectations but also facilitates broader acceptance and application of the findings. Academic collaborators may wish to publish evaluation results in scholarly journals.66 Doing so not only elevates the visibility of the research but also enhances its credibility and disseminates the knowledge more broadly.

64 Magenta Book, page 81–82.
65 Iskarpatyoti, B. S., et al., page 48.
Encouraging academic publication early in the policy process can also attract a more diverse group of scholars to participate in evaluation activities, enriching the perspectives and expertise involved. Checklist for Step 6: Use and Disseminate Evaluation Findings ☐ Establish strategies to ensure that the health policy evaluation findings are effectively utilized. ☐ Maintain ongoing dialogue with the health program team to provide feedback. ☐ Ready stakeholders for the adoption and implementation of evaluation outcomes. ☐ Apply evaluation insights in formulating annual and strategic plans. ☐ Use the findings to advocate for and reinforce the health policy. ☐ Arrange periodic meetings with stakeholders to ensure that the conclusions from the evaluation are clearly communicated and understood ☐ Adapt the presentation of evaluation reports to suit diverse stakeholder needs. ☐ Present the findings clearly and promptly to ensure timely action. ☐ Avoid technical jargon to ensure clarity and accessibility in communications. ☐ Disseminate the evaluation findings through various methods to maximize reach and impact 66 Magenta Book, page 82. 60 References Health Policy Evaluation Guideline References Ahn, E., and Hyun Kang. 2018. “Introduction to systematic review and meta-analysis.” Korean Journal of Anesthesiology 71 (2): 103–112. https://doi.org/10.4097/kjae.2018.71.2.103. American Evaluation Association. 2018. “Guiding Principles for Evaluators.” Retrieved from https://www.eval.org/About/ Guiding-Principles. Austin, Elizabeth J., Elsa S. Briggs, Lori Ferro et al. 2023. “Integrating Routine Screening for Opioid Use Disorder into Primary Care Settings: Experiences from a National Cohort of Clinics.” Journal of General Internal Medicine 38 (2): 332–40. https://doi.org/10.1007/s11606-022-07675-2. Austin, P. C. 2011. “An introduction to propensity score methods for reducing the effects of confounding in observational studies.” Multivariate Behavioral Research 46 (3): 399–424. https://doi.org/10.1080/00273171.2011.568786. Beach, D., and Rasmus B. Pedersen. 2019 (2nd edition). Process-Tracing Methods: Foundations and Guidelines. Ann Arbor, USA: University of Michigan Press. Befani, B., and Gavin Stedman-Bryce. 2017. “Process Tracing and Bayesian Updating for Impact Evaluation.” Evaluation 23 (1): 42–60. https://doi.org/10.1177/1356389016654584. Befani, B., Stefano D’Errico, Francesca Booker, and Alessandra Giuliani. 2016. “Clearing the fog: new tools for improving the credibility of impact claims.” International Institute for Environment and Development (IIED) Briefing Papers. https://www.iied.org/17359iied. Bonander, C., David Humphreys, and Michelle Degli Esposti. 2021. “Synthetic control methods for the evaluation of single- unit interventions in epidemiology: A tutorial.” American Journal of Epidemiology 190 (12): 2700–2711. https://doi. org/10.1093/aje/kwab211. Breckon, J., Sandy Oliver, Cecilia Vindrola, and Thomas Moniz. 2023. Rapid Evidence Assessments: A guide for commissioners, funders, and policymakers. CAPE, University College London. https://www.cape.ac.uk/2023/10/31/ commissioning-rapid-evidence-assessments/. Burger, C., Ronelle Burger, and Eddy van Doorslaer. 2022. “The health impact of free access to antiretroviral therapy in South Africa.” Social Science & Medicine 299: 114832. https://doi.org/10.1016/j.socscimed.2022.114832. Centers for Disease Control and Prevention. 2013. Developing an Effective Evaluation Report: Setting the course for effective program evaluation. 
Retrieved from https://www.cdc.gov/tobacco/stateandcommunity/tobacco-control/pdfs/ developing_evaluation_report.pdf. Centers for Disease Control and Prevention. 2024. CDC’s Program Evaluation Framework Action Guide. Atlanta, GA: Centers for Disease Control and Prevention. https://www.cdc.gov/evaluation/media/pdfs/2024/12/FINAL-Action-Guide- for-DFE-12182024_1.pdf. Centers for Disease Control and Prevention. 2024. “Economic evaluation: Overview.” POLARIS. https://www.cdc.gov/ polaris/php/economics/index.html. Collins, A., Deborah Coughlin, James Miller, and Stuart Kirk. 2015. The Production of Quick Scoping Reviews and Rapid Evidence Assessments: A How to Guide. London: DEFRA. https://assets.publishing.service.gov.uk/ media/5a7f3a76ed915d74e33f5206/Production_of_quick_scoping_reviews_and_rapid_evidence_assessments. pdf. Crawford, C., Courtney Boyd, Shamini Jain, Raheleh Khorsan, and Wayne Jonas. 2015. “Rapid Evidence Assessment of the Literature (REAL©): Streamlining the systematic review process and creating utility for evidence-based health care.” BMC Research Notes 8: 631. https://doi.org/10.1186/s13104-015-1604-z. Davies, R., and Jess Dart. 2005. The ‘Most Significant Change’ (MSC) Technique: A Guide to Its Use. https://www.mande. co.uk/wp-content/uploads/2005/MSCGuide.pdf. DeSalvo, K. B., Y. Claire Wang, Andrea Harris, John Auerbach, Denise Koo, and Patrick O’Carroll. 2017. “Public Health 3.0: A Call to Action for Public Health to Meet the Challenges of the 21st century.” Preventing Chronic Disease 13: E86. https://doi.org/10.5888/pcd13.160017. https://dx.doi.org/10.5888/pcd14.170017. Dillman, D. A., Jolene D. Smyth, and Leah M. Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. Hoboken, N.J., USA: John Wiley & Sons, Inc. Ditkowsky, J., Khushal H. Shah, Margaret R. Hammerschlag, Stephan Kohlhoff, and Tamar A. Smith-Norowitz. 2017. “Cost- benefit analysis of Chlamydia trachomatis screening in pregnant women in a high burden setting in the United States.” BMC Infectious Diseases 17 (1): 155. https://doi.org/10.1186/s12879-017-2248-5. Görgens, M., and Jody Zall Kusek. 2009. Making Monitoring and Evaluation Systems Work: A Capacity Development Toolkit. Washington, DC: World Bank. https://hdl.handle.net/10986/2702. Grant, A., Carol Bugge, and Mary Wells. 2020. “Designing process evaluations using case study to explore the context of Health Policy Evaluation Guideline References 61 complex interventions evaluated in trials.” Trials 21: 982. https://doi.org/10.1186/s13063-020-04880-4. Hammersley, M., and Paul Atkinson. 2019 (4th edition). Ethnography: Principles in Practice. Routledge. HM Revenue & Customs. 2024. “Soft Drinks Industry Levy: Detailed Information.” London: GOV.UK. https://www.gov.uk/ government/collections/soft-drinks-industry-levy-detailed-information. HM Treasury. 2018. “Soft Drinks Industry Levy Comes into Effect.” London: GOV.UK. https://www.gov.uk/government/ news/soft-drinks-industry-levy-comes-into-effect. HM Treasury. 2020. Magenta Book: Central Government Guidance on Evaluation. Retrieved from https://assets.publishing. service.gov.uk/media/5e96cab9d3bf7f412b2264b1/HMT_Magenta_Book.pdf. HM Treasury and Department of Health. 2016. Soft Drinks Industry Levy: Consultation on the Design of the Levy. London: HM Treasury. https://assets.publishing.service.gov.uk/media/5a80a0b040f0b62302694998/Soft_Drinks_ Industry_Levy-consultation.pdf. Independent Evaluation Group. 2010. Cost-benefit analysis in World Bank projects. 
Washington, DC: World Bank. https:// hdl.handle.net/10986/2561. Iskarpatyati, B. S., Beth Sutherland, and Heidi W. Reynolds. 2017. Getting to an Evaluation Plan: A Six-Step Process from Engagement to Evidence. Retrieved from https://www.measureevaluation.org/resources/publications/ms-17-124/ at_download/document. Iwashyna, T. J., and Edward H. Kennedy. 2013. “Instrumental variable analyses: Exploiting natural randomness to understand causal mechanisms.” Annals of the American Thoracic Society 10 (3): 255–260. https://doi.org/10.1513/ AnnalsATS.201303-054FR. Jack, S. M., Andrea Gonzalez, Karen Campbell, et al. 2020. Implementation and delivery of Nurse-Family Partnership in British Columbia, Canada: A synthesis of selected findings from the British Columbia Healthy Connections Project Process Evaluation (2013–2018). Hamilton, ON, Canada: School of Nursing, McMaster University. https://phnprep. ca/wp-content/uploads/2021/09/BCHCP_Process-Evaluation_Final-Report.pdf. Joint Committee on Standards for Educational Evaluation. 2016. “The program evaluation standards” (3rd ed.). Ypsilanti, MI: Western Michigan University. Retrieved from https://files.wmich.edu/s3fs-public/attachments/u350/2021/ program-eval-standards-jc.pdf. Kim, D., and Kim, M. 2022. Impact of the COVID-19 Pandemic on Mental Health Services in Korea: Challenges and Opportunities. Journal of Preventive Medicine & Public Health, 55(4), 195–202. https://doi.org/10.3961/ jpmph.22.195. Kingdom of Saudi Arabia Vision 2030, Health Sector Transformation Program. https://www.vision2030.gov.sa/en/explore/ programs/health-sector-transformation-program. Kingdom of Saudi Arabia. 2020-2021. “Health Sector Transformation Program Delivery Plan.” https://www.vision2030.gov. sa/media/u5xapka3/2021-2025-health-sector-transformation-program-delivery-plan-en.pdf. Lopez Bernal, J., Steven Cummins, and Antonio Gasparrini. 2017. “Interrupted time series regression for the evaluation of public health interventions: A tutorial.” International Journal of Epidemiology 46 (1): 348–355. https://doi. org/10.1093/ije/dyw098. Mayne, J. 2001. “Addressing attribution through contribution analysis: Using performance measures sensibly.” Canadian Journal of Program Evaluation 16 (1): 1–24. https://doi.org/10.3138/cjpe.016.001. Mayne, J. 2012. “Making causal claims.” Institutional Learning and Change Initiative (ILAC) Brief 26. https://hdl.handle. net/10568/70211. National Academies of Sciences, Engineering, and Medicine. 2023. Review of four CARA programs and preparing for future evaluations. Washington, DC: The National Academies Press. https://doi. org/10.17226/26831. National Committee of Bioethics (NCBE). “About the NCBE.” King Abdulaziz City for Science and Technology (KACST). Retrieved from https://ncbe.kacst.edu.sa/en/about-us/who-we-are/. OECD (Organisation for Economic Co-operation and Development). 2019. “Better Criteria for Better Evaluation: Revised Evaluation Criteria Definitions and Principles for Use.” Paris: OECD Publishing. https://doi.org/10.1787/15a9c26b- en. OECD. 2020. Improving Governance with Policy Evaluation: Lessons From Country Experiences. Paris: OECD Publishing. Retrieved from https://www.oecd-ilibrary.org/governance/improving-governance-with-policy- evaluation_89b1577d-en. OECD 2020. Regulatory Impact Assessment, OECD Best Practice Principles for Regulatory Policy. Paris: OECD Publishing. https://doi.org/10.1787/7a9638cb-en. Office for Health Improvement and Disparities. 2018. 
“Outcome evaluation: evaluation in health and wellbeing.” GOV.UK. Department of Health and Social Care. https://www.gov.uk/guidance/evaluation-in-health-and-wellbeing-outcome. 62 References Health Policy Evaluation Guideline Pan American Health Organization (PAHO). 2013. Health Impact Assessment: Concepts and Guidelines for the Americas. Washington, DC: PAHO. Retrieved from https://www3.paho.org/hq/dmdocuments/2014/health-impact- assessment-concepts-and-guidelines-2013.pdf. Papadakis, S., Adam G. Cole, Robert D. Reid, et al. 2016. “Increasing rates of tobacco treatment delivery in primary care practice: Evaluation of the Ottawa Model for Smoking Cessation.” Annals of Family Medicine 14 (3): 235-43. https:// doi.org/10.1370/afm.1909. Patton, M. Q. 2015. Qualitative Research and Evaluation Methods. 4th ed. Thousand Oaks, CA, USA: SAGE Publications, Inc. Pawson, R., and Nicholas Tilley. 1997. Realistic Evaluation. SAGE Publications. Ragin, C. C. 1987. The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. Oakland, CA, USA: University of California Press. Sasabuchi, Y. 2022. “Introduction to regression discontinuity design.” Annals of Clinical Epidemiology 4 (1): 1–5. https://doi. org/10.37737/ace.22001. Sattar, R., Rebecca Lawton, Maria Panagioti, and Judith Johnson. 2021. “Meta-ethnography in healthcare research: A guide to using a meta-ethnographic approach for literature synthesis.” BMC Health Services Research 21 (1): 50. https://doi.org/10.1186/s12913-020-06049-w. Saudi Data and Artificial Intelligence Authority (SDAIA). 2021. Personal Data Protection Law. https://sdaia.gov.sa/en/ SDAIA/about/Documents/Personal%20Data%20English%20V2-23April2023-%20Reviewed-.pdf. Saudi Data and Artificial Intelligence Authority. 2022. Personal Data Protection Law. https://sdaia.gov.sa/ar/SDAIA/about/ Files/RegulationsAndPolicies02.pdf. Saudi Health Council. 2024. Evaluate the Implementation of Mandatory Medical Malpractice Insurance for Other Health Practitioners Policy. Saudi Health Council and World Bank. 2024. The Health Policy Maker’s Manual: Integrating Data and Evidence. https://shc. gov.sa/Arabic/Documents/The%20Health%20Policy%20Makers%20Manual%20-%20KSA%20-2024.pdf. Schick-Makaroff, K., Marjorie MacDonald, Marilyn Plummer, Judy Burgess, and Wendy Neander. 2016. “What synthesis methodology should I use? A review and analysis of approaches to research synthesis.” AIMS Public Health 3 (1): 172–215. https://doi.org/10.3934/publichealth.2016.1.172. Schneider, C. Q., and Claudius Wagemann. 2012. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis. Cambridge, U. K: Cambridge University Press. Sheingold, S. H., and Anupa U. Bir. (2019). Evaluation for Health Policy and Health Care: A Contemporary Data-Driven Approach. SAGE Publications. Retrieved from https://us.sagepub.com/hi/nam/evaluation-for-health-policy-and- health-care/book262472. Simms, K. T., Jean-François Laprise, Megan A. Smith et al. 2016. “Cost-effectiveness of the next generation nonavalent human papillomavirus vaccine in the context of primary human papillomavirus screening in Australia: a comparative modelling analysis.” Lancet Public Health 1 (2): E66–E75. https://www.thelancet.com/journals/lanpub/article/ PIIS2468-2667(16)30019-6/fulltext. Thomas, J., Mark Newman, and Sandy Oliver. 2013. “Rapid evidence assessments of research to inform social policy: taking stock and moving forward.” Evidence and Policy 9 (1): 5–27. https://doi.org/10.1332/174426413X662572. 
UK Government Social Research Service. 2014. Rapid Evidence Assessment Toolkit. https://webarchive.nationalarchives. gov.uk/ukgwa/20140402164155/http://www.civilservice.gov.uk/networks/gsr/resources-and-guidance/rapid- evidence-assessment. UNAIDS. n.d. Indicator Standards: Operational Guidelines for Selecting Indicators for the HIV Response. Retrieved from https://www.globalhivmeinfo.org/AgencySites/MERG%20Resources/MERG%20Indicator%20Standards_ Operational%20Guidelines.pdf. U.S. Department of Health and Human Services (HHS). 2016. Guideline for Regulatory Impact Analysis. https://aspe.hhs. gov/sites/default/files/migrated_legacy_files//171981/HHS_RIAGuidance.pdf. Virani, S. S., Alvaro Alonso, Hugo J. Aparicio et al. 2021. “Heart disease and stroke statistics—2021 update: A report from the American Heart Association.” Circulation 143 (8): e254-e743. https://doi.org/10.1161/CIR.0000000000000950. White, H., and David A. Raitzer. 2017. Impact Evaluation of Development Interventions: A Practical Guide. Metro Manila, Philippines: Asian Development Bank. Retrieved from https://www.adb.org/sites/default/files/publication/392376/ impact-evaluation-development-interventions-guide.pdf. Wilson-Grau, R., and Heather Britt, H. 2012. Outcome Harvesting. Ford Foundation. https://outcomeharvesting.net/wp- content/uploads/2016/07/Outcome-Harvesting-Brief-revised-Nov-2013.pdf. Wing, C., Kosali Simon, and Ricardo A. Bello-Gomez. 2018. “Designing difference in difference studies: Best practices for public health policy research.” Annual Review of Public Health 39: 453–469. https://doi.org/10.1146/annurev- publhealth-040617-013507. Health Policy Evaluation Guideline References 63 World Bank. Cost-effectiveness analysis. DIME Wiki. https://dimewiki.worldbank.org/Cost-effectiveness_Analysis. World Bank. 2004. Monitoring and Evaluation: Some Tools, Methods, and Approaches (Updated Edition). Washington, DC: World Bank. https://hdl.handle.net/10986/23975. World Bank. 2005. The Logical Framework (Logframe) Handbook: A Logical Framework Approach to Project Cycle Management. https://documents1.worldbank.org/curated/en/783001468134383368/pdf/31240b0LFhandbook. pdf. World Bank. 2019. “World Bank Group Evaluation Principles.” Retrieved from https://ieg.worldbankgroup.org/sites/default/ files/Data/reports/WorldBankEvaluationPrinciples.pdf. World Bank. 2020. World Development Report 2020: Trading for Development in the Age of Global Value Chains. Washington, DC: World Bank. https://hdl.handle.net/10986/32437. Yin, R. K. 2014. Case Study Research: Design and Methods. Thousand Oaks, CA, USA: SAGE Publications, Inc. Zabor, E. C., Alexander M. Kaizer, and Brian P. Hobbs. 2020. Randomized controlled trials. Chest 158 (1 Suppl): S79–S87. https://doi.org/10.1016/j.chest.2020.03.013. Zhou, F., Abigail Shefer, Jay Wenger, Mark Messonnier, Li Yan Wang, Adriana Lopez, Matthew Moore, Trudy V. Murphy, Margaret Cortese, and Lance Rodewald. 2014. “Economic Evaluation of the Routine Childhood Immunization Program in the United States, 2009.” Pediatrics 133 (4): 577-585. https://doi.org/10.1542/peds.2013-0698. 64 Annexes Health Policy Evaluation Guideline Annexes Health Policy Evaluation Guideline Annexes 65 Annex 1: Regulatory Impact Analysis What is Regulatory Impact Analysis (RIA)? Regulatory Impact Analysis (RIA)67 is a structured, ex-ante framework for assessing the anticipated effects of health policy or regulatory options. 
Its primary goal is to ensure that regulations are justified, necessary, and aligned with public health and social objectives, while being implemented efficiently, equitably, and cost-effectively. RIA evaluates expected benefits, costs, and trade-offs, considering alternatives and additional impacts as mandated by Saudi Cabinet decisions or Ministry of Health (MoH) guidelines. It promotes transparency for stakeholders, such as the SHC, healthcare providers, and the public, by quantifying and monetizing effects where possible. These effects include estimating the economic and health impacts, and analyzing their distribution across populations. For instance, the RIA of Saudi Arabia's 2017 sugar tax could assess its impact on reducing obesity rates – a key factor in diabetes prevalence – among adults versus children, ensuring equitable health outcomes and efficient resource use in line with Vision 2030 goals.

Why is Regulatory Impact Analysis Important?
Conducting an RIA in health policy evaluation offers significant benefits, though it also entails certain costs.

Key Benefits
1. Better health policy decisions. RIA provides a robust evidence base, documenting data, assumptions, and analyses to inform decisions. It offers insights into critical outcomes, even those hard to quantify—like the psychological benefits of expanding mental health services under Saudi Arabia's Vision 2030.
2. Thorough impact assessment. RIA identifies intended and unintended impacts – for example, that introducing mandatory health insurance in Saudi Arabia could potentially increase out-of-pocket costs for low-income families. It also clarifies stakeholder views, fostering transparency and trust.
3. Stronger public health outcomes. By analyzing benefits and costs, RIA ensures that regulations maximize health gains, enhance equity, and reduce burdens. For example, RIA could assess whether centralizing oncology services in major Saudi cities like Riyadh lowers costs while maintaining equitable access for rural and underserved communities.

Associated Costs
1. Resource demands. RIA requires time, expertise, and funding for thorough assessments—such as collecting nationwide data to evaluate the cost-effectiveness of Saudi Arabia's national influenza or HPV vaccination programs.
2. Tailored analysis. To manage resources, RIA must prioritize key health outcomes and actionable insights, focusing on critical metrics like disease incidence or hospitalization rates rather than secondary indicators like patient appointment punctuality.

When is a Regulatory Impact Analysis Performed?
RIA is critical when a proposed health policy or regulation is expected to have significant impacts on the economy, public health, or societal well-being. It supports thorough evaluation of these regulations to maximize benefits, minimize unintended consequences, and promote equitable and efficient outcomes. RIA is particularly important in the following scenarios:
1. Substantial economic impact. When a regulation has major financial implications for the healthcare sector, RIA assesses whether its anticipated benefits justify the costs.
» For example: A regulation requiring all hospitals to adopt electronic health records (EHR) within a short timeframe would impose significant costs for technology implementation. RIA would evaluate whether the benefits of improved patient care and data management outweigh these costs, exploring cost-effective alternatives.

67 Guideline for Regulatory Impact Analysis, US Department of Health and Human Services (HHS). https://aspe.hhs.gov/sites/default/files/migrated_legacy_files//171981/HHS_RIAGuidance.pdf.
2. Broad societal impact. When a regulation significantly affects public health, safety, or welfare, RIA evaluates its overall impact and feasibility.
» For example: A policy mandating nationwide vaccination for an emerging infectious disease could prevent outbreaks and save lives but might raise concerns about vaccine accessibility and public compliance. To ensure positive societal impact, RIA would analyze public health benefits, economic costs of vaccine procurement and distribution, and potential resistance.
3. Significant distributional effects. When a policy is likely to disproportionately affect specific groups, such as low-income populations or rural communities, RIA identifies and addresses equity concerns.
» For example: A policy to increase taxes on sugary beverages to reduce consumption and combat obesity might disproportionately burden lower-income populations who spend a larger share of their income on such products. RIA would examine these impacts to ensure the policy achieves health goals without unfairly affecting vulnerable groups.
4. Complex or uncertain outcomes. When a regulation involves complex interventions or has uncertain, far-reaching consequences, RIA provides a framework to evaluate trade-offs and navigate complexities.
» For example: New air quality standards in healthcare facilities could improve patient outcomes by reducing pollution-related illnesses but might increase operational costs. RIA would analyze health benefits alongside economic impacts on facilities, providing evidence to guide policy makers in weighing these trade-offs.

The scope and depth of RIA depend on the regulation's anticipated impact—comprehensive or limited—ensuring efficient resource use while delivering critical insights for decision-making.
• Comprehensive RIA for significant impact: For regulations with substantial effects, such as those profoundly impacting public health, the economy, or societal well-being, a detailed analysis is essential.
» For example: Introducing mandatory health insurance for citizens under a universal health coverage goal would require a comprehensive RIA. This would evaluate public health benefits such as improved access to essential healthcare; economic costs like funding and administration; and societal implications such as affordability for low-income groups and impacts on private providers; ensuring that the policy maximizes health outcomes while maintaining financial sustainability and equity.
• Streamlined RIA for limited impact: For regulations with minor impacts, such as adjustments to existing policies, a streamlined analysis focuses on key aspects, avoiding unnecessary complexity.
» For example: A regulation updating nutritional labeling requirements on food products would likely have a limited impact. A streamlined RIA could evaluate compliance costs for manufacturers and the potential public health benefits of clearer labeling, without requiring extensive analysis.
Whether comprehensive or streamlined, RIA enhances decision-making by providing a clear, evidence-based rationale, ensuring that regulatory actions align with health policy goals like improving public health, enhancing equity, or optimizing healthcare efficiency.
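Since much of an RIA's weight rests on quantifying and comparing monetized benefits and costs over time, a small worked example may help. The sketch below is purely illustrative: the upfront and recurring costs, the monetized annual benefit, the 10-year horizon, and the 3 percent discount rate are all assumed figures chosen for demonstration, not values drawn from this guideline or from any actual Saudi regulation.

```python
# Minimal, illustrative cost-benefit sketch for a hypothetical regulation.
# All monetary values, the time horizon, and the discount rate are assumed
# purely for illustration; a real RIA would derive them from evidence.

def present_value(amount: float, rate: float, year: int) -> float:
    """Discount a future amount back to year 0."""
    return amount / (1 + rate) ** year

discount_rate = 0.03          # assumed annual discount rate
horizon_years = 10            # assumed appraisal period

upfront_cost = 50_000_000     # hypothetical one-off implementation cost
annual_cost = 5_000_000       # hypothetical ongoing compliance cost
annual_benefit = 12_000_000   # hypothetical monetized health and efficiency gains

pv_costs = upfront_cost + sum(
    present_value(annual_cost, discount_rate, t) for t in range(1, horizon_years + 1)
)
pv_benefits = sum(
    present_value(annual_benefit, discount_rate, t) for t in range(1, horizon_years + 1)
)

net_present_value = pv_benefits - pv_costs
benefit_cost_ratio = pv_benefits / pv_costs

print(f"Present value of costs:    {pv_costs:,.0f}")
print(f"Present value of benefits: {pv_benefits:,.0f}")
print(f"Net present value:         {net_present_value:,.0f}")
print(f"Benefit-cost ratio:        {benefit_cost_ratio:.2f}")
```

In practice, an RIA would typically complement such point estimates with sensitivity analysis, varying the assumed costs, benefits, and discount rate to test how robust the conclusion is.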
Case Study: Regulatory Impact Analysis of the United Kingdom's Soft Drinks Industry Levy

Overview: The UK's Soft Drinks Industry Levy (SDIL), implemented in April 2018, exemplifies how Regulatory Impact Analysis (RIA) informs health policy decisions. Known as the sugar tax, the SDIL targets soft drinks with added sugar to reduce consumption, combat obesity, and address related issues like type 2 diabetes and dental caries. This case study highlights the RIA conducted in 2016 by HM Treasury and the Department of Health, demonstrating its methodology, findings, and initial impact, to provide a model for similar evaluations, including in Saudi Arabia under Vision 2030.

Policy Context: Announced in March 2016 and launched in April 2018, the SDIL levies a two-tiered tax on soft drinks: £0.24 per liter for drinks with 8g or more of sugar per 100ml (high tier); £0.18 per liter for drinks with 5–8g per 100ml (low tier); and exemptions for drinks with less than 5g sugar per 100ml, fruit juices, milk-based drinks, and small producers (less than 1 million liters annually). The policy aimed to encourage manufacturers to reduce sugar content, promote low-sugar alternatives, and decrease portion sizes, addressing the UK's obesity epidemic, where rates have nearly doubled over 30 years.

Objectives of the RIA: The 2016 RIA aimed to:
• Assess potential health benefits from reduced sugar consumption, particularly lowering obesity, type 2 diabetes, and dental caries.
• Evaluate economic impacts, including costs to the soft drinks industry – for example, reformulation and compliance – and government revenue generation.
• Examine distributional effects to ensure that the tax does not disproportionately burden specific income groups, promoting equity.
These objectives reflect RIA's role as an ex-ante, evidence-based tool to inform regulatory decisions, balancing health, economic, and social outcomes.

Methodology: The RIA used a mixed-methods approach, including:
• Health impact assessment: Epidemiological models predicted sugar consumption reductions and potential health outcomes, estimating fewer obesity and disease cases based on national health surveys and consumption data.
• Economic analysis: Assessed industry costs, such as one-off familiarization and ongoing compliance, and projected government revenue (approximately £520 million annually) using cost-benefit analysis and consumption forecasts, accounting for reformulation and price changes.
• Stakeholder consultation: Conducted a public consultation in 2016, engaging industry (manufacturers and retailers), public health bodies, and consumers to address concerns including costs, health priorities, and equity, ensuring transparency.
• Distributional analysis: Analyzed consumption patterns across income groups, finding minimal risk of disproportionate burden on lower-income households, supported by equity impact assessments.

Key Findings:
• Health impacts. Predicted significant reductions in sugar consumption, potentially preventing cases of obesity, type 2 diabetes, and dental caries over time. Post-implementation data (up to 2020) confirmed a 46% average sugar content reduction in levy-eligible drinks (2015–2020) and over 45,000 tons of sugar removed annually, aligning with initial projections.
• Economic impacts. Projected £520 million in annual revenue, with costs primarily borne by manufacturers through reformulation and compliance.
Industry faced one-off and ongoing costs, but the impact on the estimated 300 UK producers was expected to be negligible, with potential price pass-through to consumers noted.
• Distributional effects. Found that the tax would have a relatively even impact across income groups, minimizing equity concerns. Potential effects on specific groups, such as people with type 1 diabetes or lactose intolerance, were judged to be minimal and were acknowledged and mitigated through exemptions.
Note: The £520 million revenue projection highlighted the policy's potential to fund public health initiatives while improving health, though 2018 data revised this to £240 million annually due to reformulation, indicating initial overestimation.

Conclusion and implications: The RIA supported the SDIL's implementation, finding that health benefits likely outweighed costs. Initial post-implementation outcomes (2018–2020) validated sugar reductions and revenue generation, but uncertainties—consumer behavior shifts, long-term health impacts, industry responses—were noted, emphasizing ongoing evaluation. This demonstrates RIA's value in data-driven policymaking, ensuring regulations are justified, efficient, and equitable.

Sources: HM Treasury and Department of Health. 2016. Soft Drinks Industry Levy: Consultation on the Design of the Levy. https://assets.publishing.service.gov.uk/media/5a80a0b040f0b62302694998/Soft_Drinks_Industry_Levy-consultation.pdf; HM Revenue & Customs. 2024. Soft Drinks Industry Levy: Detailed Information. https://www.gov.uk/government/collections/soft-drinks-industry-levy-detailed-information; HM Revenue & Customs. 2018. Soft Drinks Industry Levy Comes into Effect. https://www.gov.uk/government/news/soft-drinks-industry-levy-comes-into-effect.

Annex 2: Rapid Evidence Assessment

Rapid Evidence Assessment: Quick review for health policy evaluation

Definition
A Rapid Evidence Assessment (REA) is a streamlined, yet systematic, method of reviewing existing evidence to inform health policy decisions within a limited timeframe. In the context of health policy evaluation, REAs are used to quickly assess what is already known about the effectiveness, implementation, cost-effectiveness, or unintended consequences of health interventions or programs. While not as exhaustive as full systematic reviews, REAs apply rigorous and transparent processes to ensure the evidence collected is relevant, credible, and useful for timely policy-making.

When to use REAs?
This approach is particularly valuable in dynamic healthcare environments where decisions must be informed by the best available evidence, but time and resources are constrained. By focusing on targeted research questions and using defined inclusion criteria, REAs support policy makers in making evidence-informed decisions.

Benefits of Rapid Evidence Assessment
• Timely: REAs are designed to deliver findings much faster than traditional systematic reviews—typically within weeks to a few months. This makes them especially useful in policy environments where decisions must be made quickly.
• Systematic: Although abbreviated, REAs still apply a transparent and replicable methodology, including predefined criteria for study selection and data synthesis. This ensures that the process is methodical and evidence-based.
• Cost-effective: By streamlining certain steps of the full systematic review – for example, limiting databases or focusing on recent studies – REAs reduce the time and labor involved, resulting in lower costs.
• Flexible: REAs can be tailored to address a wide range of policy questions. The scope, depth, and sources can be adjusted according to the specific needs and resources available.

Limitations of Rapid Evidence Assessment
• Reduced comprehensiveness: To meet time constraints, REAs often limit the number of databases searched or the years of publication reviewed. This may result in missing some relevant studies.
• Potential bias: Due to shortened timelines, REAs may exclude gray literature, increasing the risk of publication bias.
• Limited depth: REAs typically provide high-level summaries of study findings rather than detailed analysis or in-depth synthesis, which can limit their utility for complex decision-making.
• Less suitable for complex questions: REAs work best for relatively straightforward questions. They may not be appropriate for evaluating complex interventions, systems, or theories of change that require in-depth qualitative or mixed-method analysis.

Steps to Conduct a Rapid Evidence Assessment
1. Define the policy question and scope. Clearly articulate the specific health policy question the REA is intended to answer. This may relate to the effectiveness, efficiency, equity, or implementation of a health policy. The scope should focus on the evidence needed to inform timely policy decisions.
2. Develop a brief protocol. Prepare a short protocol outlining the objectives, inclusion and exclusion criteria, search strategy, health outcomes of interest, and policy relevance. The protocol should align with the needs of policy makers and ensure a consistent and transparent approach.
3. Conduct a targeted literature search. Search relevant health databases such as PubMed, Cochrane Library, and the Cumulative Index to Nursing and Allied Health Literature (CINAHL), along with gray literature from trusted sources—for example, the World Health Organization (WHO) and national health authorities. The search should be focused on recent and high-quality evidence directly relevant to the policy issue.
4. Screen and select relevant studies. Review titles, abstracts, and full texts based on predefined criteria. Select studies that are methodologically sound and policy-relevant, such as those evaluating real-world effectiveness or cost-effectiveness in healthcare settings. (A minimal illustration of applying such predefined criteria appears after this list.)
5. Extract and synthesize health policy-relevant data. Extract key findings such as intervention type, population characteristics, outcomes – for example, reduced morbidity or healthcare utilization – and policy implications. The synthesis is typically narrative but can include tables, rating systems, or logic models to visualize health impacts.
6. Report findings and implications for health policy. Provide a clear and concise summary of the evidence, emphasizing its relevance to current health policy decisions. Discuss the strength of the evidence, identify any limitations, and offer actionable recommendations where appropriate.

Sources: Adapted from UK Government Social Research Service. 2014. Rapid Evidence Assessment Toolkit; Thomas, J., Newman, M., & Oliver, S. 2017. Rapid Evidence Assessments of Research to Inform Social Policy: Taking Stock and Moving Forward; Collins, A., Coughlin, D., Miller, J., & Kirk, S. 2015. The Production of Quick Scoping Reviews and Rapid Evidence Assessments: A How to Guide. DEFRA; Breckon, J., Oliver, S., Vindrola, C., and Moniz, T. 2023. Rapid Evidence Assessments: A Guide for Commissioners, Funders, and Policymakers. https://www.cape.ac.uk/2023/10/31/commissioning-rapid-evidence-assessments/.
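The screening step above becomes more transparent when the inclusion criteria are written down in a structured, checkable form before screening begins. The sketch below is a minimal illustration of that idea, assuming a hypothetical set of criteria (publication year, study design, and setting) and invented study records; it is not a prescribed tool of this guideline.

```python
# Minimal sketch: applying predefined REA inclusion criteria to candidate
# study records. The criteria and the records are hypothetical examples.

from dataclasses import dataclass

@dataclass
class Study:
    title: str
    year: int
    design: str   # e.g., "RCT", "quasi-experimental", "cohort", "opinion"
    setting: str  # e.g., "primary care", "hospital", "community"

# Example protocol criteria (these would be fixed in Step 2 of the REA).
MIN_YEAR = 2015
ELIGIBLE_DESIGNS = {"RCT", "quasi-experimental", "cohort"}
ELIGIBLE_SETTINGS = {"primary care", "hospital", "community"}

def include(study: Study) -> bool:
    """Return True if the study meets all predefined inclusion criteria."""
    return (
        study.year >= MIN_YEAR
        and study.design in ELIGIBLE_DESIGNS
        and study.setting in ELIGIBLE_SETTINGS
    )

candidates = [
    Study("Sugar levy and dental caries", 2021, "quasi-experimental", "community"),
    Study("Expert commentary on levies", 2019, "opinion", "community"),
    Study("Early cohort study", 2012, "cohort", "primary care"),
]

included = [s.title for s in candidates if include(s)]
excluded = [s.title for s in candidates if not include(s)]
print("Included:", included)
print("Excluded:", excluded)
```

Recording the criteria this explicitly also makes it easier to report, in Step 6, why particular studies were excluded.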
Annex 3: Key Questions for Each Evaluation Type (Non-Exhaustive)

Formative Evaluation
• How is the health policy intended to be implemented across relevant settings – for example, facilities, regions, or sectors?
• What data are currently being collected, and what additional information is needed to assess the health policy's feasibility and readiness for implementation?
• To what extent are the identified target groups being considered and adequately addressed in the policy design?
• How well are the proposed policy components aligned with the needs, expectations, and capacities of the target groups and implementing actors?
• What factors are likely to facilitate or hinder the successful implementation of the health policy in its intended context?
• What modifications to the policy design or delivery approach could improve its relevance, acceptability, or operational feasibility before scale-up?

Process Evaluation
• How is the policy being delivered? Are there any challenges or deviations from the implementation as planned?
• What implementation methods and strategies are being used?
• What challenges or barriers are encountered in the implementation process?
• Have there been any deviations from the planned implementation approach, and what are the reasons?
• To what extent has the policy reached its intended target populations – for example, rural areas or low-income groups?
• How do external factors, such as pandemics or economic shifts, influence the implementation process?
• How do contextual factors, such as cultural, social, or environmental factors, affect the implementation of the policy?
• How have the attitudes and behaviors of stakeholders – for example, populations and providers – influenced the implementation process?
• What areas could be improved in the current implementation strategies to enhance effectiveness or efficiency?

Outcome Evaluation
• Did the target population receive the intended services or products?
• Did the target population, such as specific age groups or underserved communities, receive the intended health services or products, such as vaccinations or screenings?
• What specific health benefits or outcomes were achieved, such as increased vaccination rates or reduced hospital admissions?
• Was the policy efficient in delivering benefits – for example, in terms of speed, resource use, and reach – compared with expectations or alternatives?

Impact Evaluation
• Does the health policy work? What are its positive and negative effects, intended and unintended?
• Does the health policy achieve its long-term goals, such as improved population health or reduced disparities?
• What are the positive intended effects of the policy, such as better access to care or reduced chronic disease rates?
• What are the negative unintended effects, if any, such as increased healthcare disparities or unintended burdens on providers?
• To what extent can the observed effects be attributed to the policy rather than to external factors, such as economic conditions or other programs?
• What causal factors, such as policy design or community engagement, contributed to the policy's effects?
• Has the policy resulted in any unintended long-term health or social outcomes?
• Have the impacts been influenced by other external factors, such as technological advances?
• To what extent have different population groups – for example, by age, income, or region – been impacted differently, and why?
• Can the policy's health impacts be reproduced or scaled in other regions or contexts?

Economic Evaluation
• Is the health policy cost-effective? Does it provide good value for the resources invested?
• Is the health policy cost-effective in achieving its health outcomes compared with planned targets or benchmarks?
• Does the policy provide good value for the resources invested, such as financial, human, and time resources?
• What are the costs of achieving the policy's health outcomes, such as reduced morbidity or improved quality of life?
• What are the implementation costs associated with the policy – for example, training, infrastructure, and personnel?
• Has the policy been cost-effective compared with alternative interventions or no intervention?
• What is the value-for-money of the health policy, such as the cost-benefit ratio?
• What are the benefits of the policy, such as improved quality of life or reduced healthcare costs?
• What are the costs of implementing and sustaining the policy, such as financial, time, and workforce costs?
• Do the benefits outweigh the costs, and how does this compare with alternative health policies?
• What is the ratio of costs to benefits, and how does it compare with other health policy options?

Annex 4: Sample Data Collection Plan

For each indicator, the plan records the data needed, the data collection method or source, from whom the data will be collected, by whom and when the data will be collected, and the security or confidentiality steps. Sample entry (indicator 1); rows 2–5 of the template are left blank for additional indicators:
• Indicator: % of patients' allegations against healthcare practitioners for negligence
• Data needed: total number of allegations against practitioners; total number of patients served
• Data collection method/source: SHC legal reporting system and service statistics
• From whom will these data be collected: healthcare facilities and SHC records
• By whom and when will data be collected: SHC Legal Affairs Unit, annually
• Security or confidentiality steps: facility-level data, anonymized before analysis

Annex 5: Prioritized Health Status and Health System Indicators

Health Status Indicators – 51 indicators (with the corresponding Sustainable Development Goal [SDG] indicator, where applicable)

Awareness & Education
1. Completion rate (primary education, lower secondary education, upper secondary education) – SDG 4.1.2
2. Health literacy rate
Behavioral
3. Prevalence of smoking
4. Physical activity
5. Population (15 years and over), by number of servings of fruits and vegetables per day and by age group
Child Development
6. Proportion of children under 5 years of age who are developmentally on track in health, learning, and psychosocial well-being – SDG 4.2.1
Demographics
7. Percentage of people age 65 and older
8. Population by sex/age
9. Population growth
10. Total population (millions of people)
Environmental Factors
11. Air pollution level in cities (particulate matter [PM2.5])
12. Proportion of population using safely managed drinking-water services
13. Proportion of population using safely managed sanitation services – SDG 6.2.1
Function
14. Children under 5 years who are overweight – SDG 2.2.2
15. Incidence of low birth weight among newborns
16. Age-standardized prevalence of overweight and obesity in persons aged 18+ years
17. Age-standardized prevalence of raised
blood pressure among persons   aged 18+ years Socio-economic 18 Proportion of population living below the national poverty line, by sex SDG 1.2.1 and age 19 Unemployment rate, by sex, age and persons with disabilities SDG 8.5.2 72 What are the types of health policy evaluation? Health Policy Evaluation Guideline Health Status Indicators - 51 Sustainable Indicators Development Goal (SDG) Topic # Indicator Name Care Coverage 20 Percentage of women aged 15–49 who received four or more   antenatal care visits 21 Births attended by skilled healthcare personnel SDG 3.1.2. Immunization 22 Immunization coverage rate by vaccine for each vaccine in the SDG 3.b.1 national schedule 23 Percentage of targeted population vaccinated with seasonal influenza   vaccine 24 Percentage of specific communicable disease that achieved targeted   decrease Program 25 Number of National Prevention Programs for NCDs   26 Percentage improvement resulting from the National Prevention   Program Screening 27 % of Primary Health Care Center (PHCC) visitors (18 years and above)   screened for hypertension 28 % of PHCC visitors (2 years and above) screened for obesity and   overweight 29 % of PHCC visitors (40 years and above) screened for diabetes   30 % of PHCC visitors (40 years and above) screened for dyslipidemia   31 % of PHCC visitors aged 50 years and above screened for colorectal   cancer by fecal immunochemical test (FIT) 32 % of women aged 40-69 years screened for breast cancer using   mammogram 33 % of PHCC visitors (adults) screened for COPD (Chronic Obstructive   Pulmonary Disorder) Mental Well-being 34 Proportion of adults with psychological distress   Incidence 35 Cancer incidence rate, by type of cancer (per 100, 000 population)   36 Incidence of heart attacks (acute coronary events)   Incidence 37.1 Estimated number of new hepatitis B infections per 100,000 Hepatitis B: SDG (Incidence of population in a given year 3.3.4 Communicable 37.2 HIV incidence (per 1000 population) HIV: SDG 3.3.1 Diseases) 37.3 Incidence of meningococcal meningitis in KSA   37.4 Incidence rate of hepatitis C virus (HCV)   37.5 Incidence rate of measles   37.6 Incidence rate of rubella   37.7 Malaria incidence rate (per 1,000 population) Malaria: SDG 3.3.3 37.8 TB incidence (per 100 000 population) TB: SDG 3.3.2 Injuries 38 The number of serious injuries resulting from traffic accidents per   100,000 population Prevalence 39 Population (15 years and above) who suffer from A Chronic Disease   by Name of Diagnosed Disease Life Expectancy 40 Health expectancy: Healthy Life Years (HLY)   41 Life expectancy at birth   Mortality by Age 42 Mortality Rate   Health Policy Evaluation Guideline What are the types of health policy evaluation? 
73 Health Status Indicators - 51 Sustainable Indicators Development Goal (SDG) Topic # Indicator Name Mortality by 43 Mortality rate due to cardiovascular disease, cancer, diabetes, and   Cause - Chronic chronic respiratory diseases Condition Mortality by 44 Mortality attributable to joint effects of household and ambient air SDG 3.9.1 Cause – pollution Environmental: Air Quality Mortality by 45 Mortality rate attributed to unsafe water, unsafe sanitation, and lack of SDG 3.9.2 Cause – hygiene (exposure to unsafe water, sanitation, and hygiene for all Environmental: (WASH) services) Water Mortality by 46 Maternal mortality ratio (per 100,000 live births) SDG 3.1.1 Cause – Maternity Mortality by 47 Premature noncommunicable disease (NCD) mortality SDG 3.4.1 Cause – Premature NCDs Mortality by 48 Suicide rate (per 100,000 population) SDG 3.4.2 Cause – Suicide Mortality by 49 Death rate due to road traffic injuries SDG 3.6.1 Cause – Traffic Accidents Mortality by 50.1 Dengue mortality rate   Cause – 50.2 Malaria mortality rate (per 100,000 population)   Communicable Diseases 50.3 Tuberculosis (TB) mortality rate (per 100,000 population)   Mortality by 51 COVID-19 Cases: Deaths   Cause Health System Indicators - 58 Indicators Sustainable # Indicator Name Development Topic Goal (SDG) Functionality 1 Functional Health Outcomes Score: Inpatient and Outpatient   Mortality 2 Avoidable mortality (preventable and treatable)   Perception of Population (15 years and above) who assessed their own health 3 Health status as good or very good   Quality of Life 4 Quality of Life   5 Experienced a coordination problem in the past two years   Care Coordination 6 Patient assessment of level of integration in health care delivery   7 Post-care encounter   Communication 8 Care from clinicians (doctors and nurses)   9 % Timely resolution of complaints from patients and their families   Overall Rating 10 Overall hospital rating   11 Patient experience score   12 Access to own medical record   Shared Decision 13 Families feeling involved in the care of the patient   Making 14 Patients feeling involved in the decision making of their care   Workforce 15 Working life experience Experience   Care Coverage 16 Unmet health care needs   74 What are the types of health policy evaluation? 
Health Policy Evaluation Guideline Health System Indicators - 58 Indicators Sustainable # Indicator Name Development Topic Goal (SDG) Percentage of health centers implementing the electronic Health 17   Technology Information Systems (HIS) 18 Percentage of residents who have a unified digital medical record   19 Days to third next available appointment   Emergency Department (ED) Median Time from ED arrival to ED 20 departure for discharged ED patients for Adult Patients   Waiting times for elective surgery: proportion admitted within 21 Time to Care clinically recommended time   Outpatient Department (OPD) Appointment Waiting Time (Days) 22   (OPD including specialist) Average number of days a patient waits for an appointment (Urgent 23 referral)   Percentage of polyclinics in PHCCs that have three or more of the Proximity to Care 24 main specialties (medicine, obstetrics & gynecology, surgery, and pediatrics) of polyclinics per region   25 Number of Primary Health Care (PHC) visits per capita per year   Utilization 26 Outpatient service utilization   27 Patient seen in Virtual clinic   Evidence- based 28 Low value interventions   practice 29 Preventable hospitalization rate   30 Births by caesarean section (%)   Maternity Patients with elective vaginal deliveries or elective cesarean births at 31 ≥ 37 and < 39 weeks of gestation completed   Comprehensive Diabetes Care: Hemoglobin A1c (HbA1c) Poor Condition- specific 32   Control (>9/0%) Effectiveness 33 Thrombolytic Therapy   Percentage of hospitals that comply with Central Board for 34 Accreditation of Healthcare Institutions (CBAHI) Essential National Compliance Requirements (ESR) requirement   Percentage of hospitals that met US median of patient safety culture 35 survey   36 Rate of Adverse Events   Events 37 Rate of Sentinel Events   Infection 38 Hospital acquired infection rate   Readmission 39 Hospital Readmission Percentage   Readmission (by Percentage (%) of all patients who re-attend the emergency room 40   department) within 72 hours (3 days) of a previous visit to the ER Availability of 41 % Of the basic medicines available in the local market Medication   42 Bed Management / Bed Occupancy (%)   Beds 43 Hospital bed density (per 10,000 population)   Cost 44 Cost per weighted separation and total case weighted separations   45 Health facility density and distribution   Density 46 Health worker density and distribution SDG 3.c.1 Hospital Stay 47 Average length of hospital stay (in days)   Health Policy Evaluation Guideline What are the types of health policy evaluation? 
75 Health System Indicators - 58 Indicators Sustainable # Indicator Name Development Topic Goal (SDG) General government expenditure on health as a percentage of total 48 government expenditure (ten-year growth)   49 Healthcare expenditure per capita   Out-of-pocket expenditure as percentage of current health 50   expenditure (CHE) 51 Population covered by Health Insurance   Public domestic sources of current spending on health as % of 52 current health expenditure   Spend Private domestic sources of current spending on health as % of 53 current health expenditure   54 Spend of low value interventions   Total current expenditure on health as percentage of gross domestic 55 product   The definition and methodology of Total revenue generated from private health insurance and out-of- 56 this indicator was pocket spend for utilizing government health resources adjusted for simplification 57 Net growth in health workforce   Workforce 58 Saudization (%)   Annex 6: Key Assessment Points to Validate Indicators The newly developed indicators could be generated following the below key criteria: Criteria 1: The Indicator Should Be Essential and Practical Criteria 1: The indicator should be essential and practical Assessment Points Explanation for Each Assessment Point ☐ Justified The necessity of the indicator in evaluating the health policy should be justified. ☐ Valuable and Evaluators should identify which stakeholders will use the indicator. To ensure that the relevant indicator is valuable, it is essential that the data it produces is required and beneficial to the intended users. Effective indicators offer useful information to a broad spectrum of stakeholders. ☐ Actionable An evaluator should make sure that the data and knowledge received from the indicator are utilized and that such information influences planning and decision-making. Since measuring an indicator can demand significant time, resources, and expense, it is crucial that the results are actionable, potentially guiding policy development, decision-making, or resource distribution. ☐ Aligned Before deciding to measure a particular indicator, it is vital to confirm that the information it seeks is not already obtainable from another source. If similar indicators exist, it is important to coordinate with other organizations to ensure consistency in measurement practices, systems, and timelines. Source: Adapted from UNAIDS/Monitoring and Evaluation Reference Group (MERG), pages 11–12. Criteria 2: The Indicator Must Demonstrate Technical Merit An indicator with technical merit ensures that the data collected is reliable, accurate, and credible. Technical merit refers to the ability of an indicator to consistently provide high-quality data under different conditions. An indicator with strong technical merit is reliable, specific, and peer-reviewed to confirm its validity. 76 What are the types of health policy evaluation? Health Policy Evaluation Guideline Criteria 2: The indicator must demonstrate technical merit Assessment Explanation for Each Assessment Point Points ☐ Technically The indicator must have technical reliability and significance, measuring something of value reliable and within its domain. It should provide clear, focused data and allow for straightforward significant interpretation of changes. The indicator should also be sensitive enough to detect variations in performance. 
☐ Sensitive The indicator should demonstrate monitoring merit by being reliable and sensitive, consistently producing similar results even when different instruments, procedures, or observers are used. This ensures a low margin of error. ☐ Accurate and The indicator must measure exactly what it is intended to measure, with no ambiguity or specific overlap with other indicators. ☐ Clear The indicator should be clear and easy to interpret, leaving no room for multiple interpretations. ☐ High quality The indicator should undergo a thorough peer review process to ensure its quality. This review may involve experts in monitoring, evaluation, and technical fields, and can include the formation of review panels to validate the indicator’s technical merit. Source: Adapted from MERG, pages 12–13. Criteria 3: The Indicator Must Be Clearly Defined A well-defined indicator is critical for consistent application in the evaluation process. It ensures that everyone involved in data collection and analysis understands the indicator’s purpose, how it is measured, and how to interpret its results. Without clarity, data collection may lead to inconsistency and inaccurate conclusions. Criteria 3: The indicator must be clearly defined Assessment Explanation for Each Assessment Point Points ☐ Title and The title of the indicator should be concise yet descriptive, summarizing the essence of the Definition indicator. This facilitates easy identification and referencing during daily operational and monitoring activities. ☐ Purpose and Provide a detailed statement that explains the purpose of the indicator, including why it is Rationale necessary within the health policy context. This should highlight the importance of the indicator in tracking progress, informing decisions, or evaluating outcomes. ☐ Measurement Describe the approach used to measure the indicator, including any calculations involved. Method Clearly specify the numerators and denominators if applicable, as well as any formulas used, to ensure consistent application and accuracy in measurement. ☐ Data Collection Outline the process for gathering data needed for the indicator, including where the data will Method come from—for example, surveys, administrative records, electronic health records. Clarify who will be responsible for data collection and the tools or systems that will be used. ☐ Measurement Define how frequently the indicator will be measured—weekly, monthly, quarterly, or annually. Frequency Ensure that the frequency aligns with the data collection method and the needs of the health policy for timely decision-making. ☐ Level of Specify how data will be broken down to provide detailed insights—for example, by age, gender, Disaggregation geographic location. This disaggregation allows for targeted analysis, ensuring that the indicator can provide meaningful and actionable information. ☐ Interpretation Offer clear guidelines on how to interpret changes in the indicator values. For example, explain what an increase or decrease in the indicator signifies and provide context for understanding these shifts. If the indicator can be interpreted in multiple ways, clarify how to differentiate between them. ☐ Strengths and Identify common challenges associated with measuring this indicator, such as data quality Weaknesses issues, limited data availability, or potential biases. Provide practical advice or solutions to address these weaknesses to improve the indicator’s reliability. 
Health Policy Evaluation Guideline What are the types of health policy evaluation? 77 ☐ Additional Include relevant background information, examples of how the indicator has been applied in Sources of similar contexts, and references to related technical documentation. This section should Information provide stakeholders with additional resources for better understanding and application of the indicator. Source: Adapted from MERG, pp. 13–14. Criteria 4: The Indicator Must Be Practical to Collect and Analyze For an indicator to be effective, it must not only be technically sound but also feasible to collect and analyze using available resources and systems. Adequate data collection infrastructure, financial and human resources, and alignment with national monitoring systems are all necessary for the indicator to be practical in real-world settings. Criteria 4: The indicator must be practical to collect and analyze Assessment Explanation for Each Assessment Point Points ☐ Adequate To effectively measure an indicator, robust systems and mechanisms must be in place, including data management platforms, reporting tools, health information systems, and well-trained personnel. These components are essential for ensuring accurate data collection, analysis, and utilization. When considering new indicators, it is important to assess whether the current infrastructure can handle these needs or if it can be adapted without significant investments. For example, integrating a new question into an existing household survey can be a cost-effective way to collect data without creating a new system. This approach leverages existing resources, reducing the need for extensive additional investments. ☐ Aligned Indicators should be developed in alignment with existing national monitoring and evaluation systems to ensure a coherent approach. Designing indicators that harmonize with current national frameworks helps to prevent redundancy and overlap with other evaluation processes. This alignment ensures that the data collected is relevant and comparable across different systems, facilitating more efficient use of resources and avoiding conflicting or duplicative data efforts. Coordinated development of indicators also helps streamline data collection, making it more manageable and less resource-intensive. ☐ Resource Sufficient financial and human resources are crucial for the effective measurement of availability indicators. This includes funding for data collection, training personnel, and resources for data analysis and reporting. It is essential to weigh the costs against the expected benefits of collecting the indicator to ensure that resources are used efficiently. Cost-effectiveness assessments can help to determine whether the indicator justifies the investment required. This step is particularly important to avoid the common pitfall of recommending numerous indicators without considering the financial and operational burdens they may impose on the system. Source: Authors, adapted from MERG, pages 14–15. Criteria 5: The Indicator Should Be Field-Tested or Used in Practice Even well-designed indicators can face unforeseen challenges when implemented in real-world conditions. Therefore, it is essential that new indicators are thoroughly field-tested before being fully introduced. Additionally, all indicators should be reviewed regularly to ensure they remain relevant and effective as circumstances change. 
Criteria 5: The indicator should be field-tested or used in practice Assessment Explanation for Each Assessment Point Points Indicators, especially new ones, need to be field-tested to ensure their effectiveness and reliability in real-world settings. An indicator that seems robust in theory may encounter various challenges when applied in practice, such as difficulties in data collection, unforeseen external influences, or problems with measurement consistency. Field-testing helps to identify these ☐ Field-Tested potential issues early, allowing for adjustments before full-scale implementation. This testing phase ensures that the indicator can be practically applied, produces meaningful data, and aligns well with operational realities. For existing indicators, practical application in the field acts as an informal validation process, helping to confirm their continued relevance and reliability in changing environments. 78 What are the types of health policy evaluation? Health Policy Evaluation Guideline Indicators should be regularly reviewed to ensure that they remain relevant and effective as conditions change. Regular assessments can uncover problems with data collection methods, such as inaccuracies, biases, or data gaps, as well as challenges in how data is interpreted. Circumstantial changes like the introduction of new medical treatments, shifts in program Regularly ☐ priorities, or evolving health policies can impact the utility and accuracy of indicators. These reviewed reviews allow stakeholders to make necessary adjustments, such as updating definitions, refining measurement methods, or even removing indicators that are no longer applicable. Continuous review is crucial for maintaining the quality and usefulness of the indicators in the face of dynamic healthcare environments. Source: Authors, adapted from MERG, page 15. Criteria 6: The Indicator Set Should Be Coherent and Balanced Overall A coherent and balanced set of indicators is essential for providing a comprehensive understanding of the health policy being evaluated. A well-constructed indicator set should cover all critical aspects of the policy and include a range of indicators that assess inputs, outputs, outcomes, and impacts. The indicator set should not only reflect the policy’s goals but also be practical and appropriate for the specific context in which it is used. Criteria 6: The indicator set should be coherent and balanced overall Assessment Explanation for Each Assessment Point Points The set of indicators should provide a thorough and comprehensive assessment of the health policy’s performance. This means the indicators collectively should cover all necessary areas to give a complete picture of the policy’s effectiveness, strengths, and weaknesses. An adequate set ☐ Adequate captures all key facets of the policy, including implementation processes, service delivery, and final outcomes. Without a comprehensive set of indicators, the evaluation may miss critical insights, leading to an incomplete understanding of the policy’s true impact. Relevance means that each indicator in the set should directly relate to important aspects of the policy being assessed. The indicators should address the key questions of the evaluation, such as whether the policy is achieving its intended goals, addressing priority health issues, and meeting the needs of the population. 
A relevant set avoids the pitfall of focusing too much on ☐ Relevant certain areas (like clinical outcomes) while neglecting others such as access to services or quality of care. For instance, if a policy aims to improve healthcare access, but most indicators only measure treatment outcomes, the evaluation may overlook whether access issues have been effectively addressed. Appropriateness refers to ensuring that the indicators encompass various levels of monitoring and evaluation, including inputs (resources invested), processes (how the policy is implemented), outputs (services delivered), outcomes (changes in health status), and impacts (long-term effects on public health). This multilevel approach allows for a nuanced understanding of how ☐ Appropriate the policy functions at different stages. However, it is important that the number and type of indicators are practical and feasible given the available resources, including financial, human, and data capacity. An overly complex set with too many indicators can strain resources and lead to poor data quality, while a very minimal set might fail to capture the full scope of the policy’s impact. Source: Adapted from MERG, pp. 15–16. Health Policy Evaluation Guideline What are the types of health policy evaluation? 79 Annex 7: A Sample Informed Consent Template for Health Policy Evaluations Section Details Notes for Customization Title of the [Insert Evaluation Name, for example, “Evaluation Replace with specific policy or program name Evaluation of [Policy Name] Implementation”] for instance, “Saudi Vision 2030 Health Policy Assessment.” Purpose of the We are conducting this evaluation to assess the Specify policy – for instance, “Saudi National Evaluation effectiveness, efficiency, and impact of [Policy Health Strategy” – aligning with “assess” and Name] on health outcomes, ensuring it improves “health outcomes.” public health and meets stakeholder needs globally. Your participation will provide evidence to guide health policy decisions, in accordance with international standards. What Voluntary participation involving [describe Tailor activities, time, and location – for Participation activities, for example, “surveys, interviews, or example, “Arabic-language focus groups, 45 Involves observations”]. minutes, or community centers in Saudi Arabia.” Ensure cultural and linguistic accessibility for Approximately [time, for example, “30–60 diverse populations. minutes”] at [location/method, for example, “online, in-person, or community setting”]. Data collection on [for example, “demographics, health status, or policy implementation experiences”], used solely for this evaluation, per international ethical guidelines. Confidentiality Protect privacy by keeping personal and sensitive Add local or regional data protection laws, for and Data information confidential, using anonymization example, “Ensure compliance with Saudi health Security and de-identification techniques where regulations.” applicable, with access restricted to authorized personnel, National Committee of Bioethics (NCBE). Implement robust data security measures, including encryption, secure storage, and restricted access, to prevent breaches and maintain data integrity, in compliance with Saudi Personal Data Protection Law, 2021, if applicable. 
Risks and Benefits: Minimal risks, such as discomfort when discussing sensitive topics, mitigated by support resources – for instance, counseling and helplines – and ethical review processes. Potential benefits include contributing to improved global or local health policies, enhancing community well-being, and providing insights that may benefit your organization or population. (Notes for customization: Specify support resources, for example, "Local counseling at [number]," or regional benefits, such as "Saudi Vision 2030 health goals.")

Your Rights: Participation is entirely voluntary; you may withdraw at any time without penalty or impact on services, per the Declaration of Helsinki. You may refuse to answer any questions or stop participation at any point. You will receive a copy of this form, translated into your preferred language if needed, and may contact [Evaluation Coordinator] at [Email/Phone] for questions or concerns. For ethical issues, contact [Ethics Committee] at [Email/Phone]. (Notes for customization: Include local ethics contacts, for example, "Saudi Health Council Ethics Committee," and language options, such as "Available in Arabic, English, or local dialects." Ensure accessibility for low-literacy or disabled individuals.)

Informed Consent: I have read and understood the information above, or it has been explained to me in [language, for example, Arabic/English], and I voluntarily agree to participate. I understand my rights, the evaluation's purpose, data use, and privacy protections, per international ethical standards. Name of Participant: _______________________________ Signature or Mark: _______________________________ Date: _______________________________ (Notes for customization: Specify languages, for example, "Arabic or English," or alternative consent methods, such as verbal consent or a thumbprint for illiterate or disabled participants.)

For Participants Unable to Sign: If unable to sign but agreeing, a witness or authorized representative may sign: Witness/Representative Name: _______________________________ Witness/Representative Signature: _______________________________ Date: _______________________________ Relationship to Participant: _______________________________ (Notes for customization: Include only if targeting populations with literacy or physical barriers, such as rural Saudi Arabia.)

Contact for Questions or Concerns: For evaluation questions, contact [Evaluation Coordinator Name], [Title], at [Email Address] or [Phone Number]. For ethical concerns, contact [Ethics Committee Chair Name], [Title], at [Ethics Email Address] or [Ethics Phone Number]. (Notes for customization: Include contact details for a designated evaluation lead – for example, from the SHC or Ministry of Health – and an ethics contact, such as the NCBE.)

Annex 8: Consolidated Evaluator Checklist for the Six-Step Evaluation Process

Step 1: Identify and Engage Stakeholders
# Checklist Item | ✔ | If No, Specify
1.1 Have all relevant stakeholders been identified? ☐
1.2 Have stakeholders been categorized into the following main groups? ☐
• those impacted by the policy
• those implementing the policy
• those using the evaluation findings
1.3 Have key stakeholders who influence credibility, implementation, institutionalization, or decision-making been identified? ☐
1.4 Have the interests and influence of stakeholders been analyzed and mapped? ☐
1.5 Is there a stakeholder engagement plan outlining when and how to involve stakeholders? ☐
1.6 Has each stakeholder’s engagement in key policy evaluation ☐ steps—describing health policies, designing evaluation, collecting and analyzing data, justifying conclusions, using and disseminating the evaluation findings—been identified and agreed? Step 2: Describe the Health Policy # Checklist Item ✔ If No, Specify 2.1 Have all key policy components—inputs, activities, outputs, ☐ outcomes, and impacts—been identified? 2.2 Are the policy components clearly described and understood? ☐ 2.3 Has a logical map been developed and visualized to show the ☐ relationships among the policy components? 2.4 Has the political, social, and financial context of the policy been ☐ analyzed? Step 3: Design Evaluation # Checklist Item ✔ If No, Specify 3.1 Has the focus of evaluation been determined? ☐ 3.2 Has the type of evaluation required been identified? ☐ 3.3 Have the data needs been determined? ☐ 3.4 Have the most suitable evaluation methods been selected given ☐ the following key parameters? 3.4.1 Data needs 3.4.2 Available resources 3.4.3 Policy context 3.4.4 Evaluation comprehensiveness 3.4.5 Timing 82 What are the types of health policy evaluation? Health Policy Evaluation Guideline Step 4: Data Collection and Analysis # Checklist Item ✔ If No, Specify 4.1 Have the evaluation questions been identified? ☐ 4.2 Have the indicators needed to measure each evaluation question been ☐ identified? 4.3 Are the indicators aligned with the national health indicators? ☐ 4.4 Have the indicators been chosen from an existing inventory or newly ☐ developed? 4.5 If newly developed, have new indicators been pilot-tested to reduce ☐ measurement error? 4.6 Is the necessary data available and are data sources appropriate? ☐ 4.7 Which data collection methods best suit the evaluation context and objectives? ☐ 4.8 Have quality and quantity issues in data collection been evaluated considering ☐ the resource constraints? Step 5: Justify Conclusions # Checklist Item ✔ If No, Specify 5.1 Has the policy context been considered in data interpretation? ☐ 5.2 Are findings compared with literature and similar policies? ☐ 5.3 Have alternative explanations been explored and tested? ☐ Are benchmarks – Vision 2030 or transformation goals – used to assess 5.4 ☐ results? 5.5 Are results compared with previous years’ outcomes? ☐ 5.6 Are actual findings compared to intended policy goals? ☐ 5.7 Have potential biases (if any) in the analysis been documented? ☐ 5.8 Have the limitations (if any) of the evaluation been fully assessed? ☐ Step 6: Use and Disseminate Evaluation Findings # Checklist Item ✔ If No, Specify 6.1 Is there a strategy to promote use of the evaluation findings? ☐ 6.2 Are evaluation findings used to inform both short-term and strategic planning? ☐ 6.3 Are there regular meetings to communicate the evaluation conclusions? ☐ 6.4 Are evaluation reports tailored to suit different stakeholder audiences? ☐ 6.5 Are findings communicated clearly and in a timely manner? ☐ 6.6 Are findings disseminated through various accessible methods? ☐
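Because the consolidated checklist is applied repeatedly across evaluations, some teams may find it helpful to keep it in a structured, machine-readable form so that completion status and outstanding "If No, Specify" notes can be tracked consistently. The sketch below is one possible illustration in Python; the item texts are abbreviated from Step 1 above, and the tracking structure itself is an assumption of this example, not a format prescribed by this guideline.

```python
# Illustrative sketch: tracking consolidated-checklist completion for one
# evaluation. Item wording is abbreviated from the checklist above; the
# data structure is an example, not a prescribed format.

checklist = {
    "Step 1: Identify and Engage Stakeholders": {
        "1.1": "All relevant stakeholders identified",
        "1.2": "Stakeholders categorized (impacted / implementing / using findings)",
        "1.3": "Key stakeholders influencing credibility and decisions identified",
        "1.4": "Stakeholder interests and influence analyzed and mapped",
        "1.5": "Stakeholder engagement plan prepared",
        "1.6": "Stakeholder roles agreed for each evaluation step",
    },
    # Steps 2-6 would be added in the same way.
}

# Recorded answers; unanswered items default to False ("No").
answers = {"1.1": True, "1.2": True, "1.3": False}
if_no_notes = {"1.3": "Decision-makers in regional directorates not yet mapped."}

for step, items in checklist.items():
    done = sum(answers.get(code, False) for code in items)
    print(f"{step}: {done}/{len(items)} items complete")
    for code, text in items.items():
        if not answers.get(code, False):
            note = if_no_notes.get(code, "specify reason")
            print(f"  Outstanding {code}: {text} - {note}")
```

Kept alongside the evaluation workplan, such a record makes it straightforward to report which checklist items remain open at each step and why.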