GOVERNANCE
EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT

Artificial Intelligence in the Public Sector
Maximizing Opportunities, Managing Risks

Supported by the GovTech Global Partnership: www.worldbank.org/govtech
Republic of Korea

© 2020 International Bank for Reconstruction and Development / The World Bank
1818 H Street NW, Washington DC 20433
Telephone: 202-473-1000; Internet: www.worldbank.org

Some rights reserved.

This work is a product of the staff of The World Bank with external contributions. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of The World Bank, its Board of Executive Directors, or the governments they represent. The World Bank does not guarantee the accuracy of the data included in this work. The boundaries, colors, denominations, and other information shown on any map in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries. Nothing herein shall constitute or be considered to be a limitation upon or waiver of the privileges and immunities of The World Bank, all of which are specifically reserved.

Rights and Permissions

This work is available under the Creative Commons Attribution 3.0 IGO license (CC BY 3.0 IGO), http://creativecommons.org/licenses/by/3.0/igo. Under the Creative Commons Attribution license, you are free to copy, distribute, transmit, and adapt this work, including for commercial purposes, under the following conditions:

Attribution: Please cite the work as follows: 2020. Artificial Intelligence in the Public Sector | Maximizing Opportunities, Managing Risks. EFI Insight-Governance. Washington, DC: World Bank.

Translations: If you create a translation of this work, please add the following disclaimer along with the attribution: This translation was not created by The World Bank and should not be considered an official World Bank translation.
The World Bank shall not be liable for any content or error in this translation.

Adaptations: If you create an adaptation of this work, please add the following disclaimer along with the attribution: This is an adaptation of an original work by The World Bank. Views and opinions expressed in the adaptation are the sole responsibility of the author or authors of the adaptation and are not endorsed by The World Bank.

Third-party content: The World Bank does not necessarily own each component of the content contained within the work. The World Bank therefore does not warrant that the use of any third-party-owned individual component or part contained in the work will not infringe on the rights of those third parties. The risk of claims resulting from such infringement rests solely with you. If you wish to reuse a component of the work, it is your responsibility to determine whether permission is needed for that reuse and to obtain permission from the copyright owner. Examples of components can include, but are not limited to, tables, figures, or images.

All queries on rights and licenses should be addressed to World Bank Publications, The World Bank Group, 1818 H Street NW, Washington, DC 20433, USA; e-mail: pubrights@worldbank.org.

Cover design and layout: Diego Catto / www.diegocatto.com

>>> Contents

Foreword
Acknowledgments
Executive Summary
  Priorities Going Forward
Abbreviations
1. Introduction
  Methodology and Scope
2. AI Opportunities
  Use Cases
  AI in Corruption
  AI for Citizen Engagement
  AI in Customs
  AI in Health
  AI in the Judicial Sector
  AI in Procurement
  AI in Tax Compliance
  AI in Tax Policy
  AI in Audit
3. AI Risks
  Performance, Trust, and Bias
  Cybersecurity
  Control
  Privacy
4. AI Governance and Operations
  AI Ethical Principles
  Role of a Central Government Agency
  AI Operations Framework
  Innovative Procurement Examples
  Role of the Public Sector in Society
  AI Operationalization in World Bank Projects
5. Ethical Considerations
  Inequality
  Control
  Concentration
6. Government’s AI Building Blocks
  Whole-of-Government Architecture
  Interoperability Patterns
  Data Standards
7. Conclusions
  Priorities Going Forward
Appendix A. AI Technical Primer
Appendix B. AI and the Sectors
Glossary
References

Boxes
Box 1. Actionable Insight: Adopt Principles of AI and Issue an AI Governance Model
Box 2. Private Sector AI Principles
Box 3. Procurement: Important Steps to Consider
Box 4. Data Fabric in Brief
Box 5. Blockchain: Distributed Ledger Technology
Box 6. Actionable Insight: Data Fabrics Can Overcome Silos
Box 7. Actionable Insight: Governments Should Standardize Data

Figures
Figure 1. Fixed Broadband Subscriptions per 100 Inhabitants, 2001–2019
Figure 2. The Disparity in ICT Skills across the Regions
Figure 3. Wuhan Neural Network Model with Quarantine Control
Figure 4. Italy Neural Network Model with Quarantine Control
Figure 5. South Korea Neural Network Model with Quarantine Control
Figure 6. U.S. Neural Network Model with Quarantine Control
Figure 7. Results of COVID-19 Analysis by AI
Figure 8. An Optimal Tax Policy Optimizes a Balance between Equality and Productivity
Figure 9. Global Consensus on the Principles of AI
Figure 10. AI Business Case Assessment
Figure 11. Operationalizing AI
Figure 12. Singapore Procurement Model
Figure 13. General Data Fabric Architecture for Whole-of-Government Use
Figure 14. High-Level Data Fabric Architecture

Tables
Table 1. AI Readiness Index
Table 2. Role of Humans: Five Levels of AI Adoption
Table 3. AI Risk Mitigation Framework

>>> Foreword

Disruptive technologies like artificial intelligence (AI), mobile apps, Internet of Things, blockchain, cloud computing, and data analytics have the potential to transform governments by enhancing the personalized service delivery experience, improving back-end process efficiencies, and strengthening policy compliance. One of the most promising disruptive technologies, AI is already being adopted by digitally advanced governments to maximize its potential benefits. This trend is catching on with other governments as well. More than 50 governments have issued or are in the process of issuing AI strategies in recent years.

However, in many of our client countries, the public sector’s ability to adopt AI is hampered by low access to digital skills, insufficient foundational digital technologies, and inadequate digital data as well as a lack of awareness of the potential of AI. These differences in the pace of AI adoption in the public sector could further exacerbate inequalities between rich and poor countries.

To promote wider AI adoption in our client governments, this paper provides a preliminary synthesis of the existing opportunities, risks, and building blocks required for implementing and integrating AI in their operations. The paper also highlights the policy, governance, and people aspects necessary for AI implementation, as there are no shortcuts to technology adoption. The use of technology cannot be fast-tracked, as many of the analog complements needed for adoption are not yet in place (World Bank 2016).

To better understand the role AI can play in public sector transformation, the World Bank produced this paper in partnership with the Swiss State Secretariat for Economic Affairs. This paper aims to distill the existing knowledge on the use of AI in the public sector and summarize the lessons learned from early adopters.
It draws on the accumulated literature, case studies, and emerging trends to provide guidance to our teams working in this field. The World Bank’s technical team benefited from a panel of experts from inside the World Bank and from the industry who shared their insights and enriched the paper. The goal is to alert our staff and clients to the opportunities, risks, and the potential to foster AI for public sector transformation.

Edward Olowo-Okere
Global Director
Governance and Institutions

>>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN

>>> Acknowledgments

This paper was prepared by a World Bank team consisting of Khuram Farooq (Senior Governance Specialist in the Global Governance Practice (GGP)) and Bartosz Sołowiej (Consultant). The team received valuable guidance from Edward Olowo-Okere (Global Director, GGP), who is leading the GovTech agenda in the Bank; Tracey Marie Lane and Adenike Sherifat Oyeyiola (Practice Managers, GGP); Kimberly Johns (Senior Governance Specialist); and Cem Dener (Lead Governance Specialist, GGP).

The team benefited from the comments of external peer reviewers: Aaron Moffatt (Chief Technology Officer, Immersion Analytics) and Brittan Heller (Global Head of AI, Foley Hoag LLP, Harvard Tech, and Human Rights AI Fellow 2019). The team is also grateful for contributions from the World Bank’s reviewers: Aki Ilari Enkenberg (Senior Digital Development Specialist, IDD02), David Santos (Senior Public Sector Specialist, ELCG2), Jana Kunicova (Senior Public Sector Specialist, EA1G2), Parminder P.S. Brar (Lead Governance Specialist, GGP), and Trevor Monroe (Senior Operations Officer, DECAT).

The team also wishes to express its thanks to Barbara Joan Rice (Consultant, World Bank) and Mary A. Kent (Working Copy Editor) for their editorial support; to Jasmine N. Brown (Intern, Foley Hoag LLP) for her research support; and to Angela Hawkins (Team Assistant) for her formatting expertise.
Finally, our thanks to Richard Crabbe for his editorial work and communications advice.

This paper was financed by the State Secretariat for Economic Affairs of Switzerland (SECO). We gratefully acknowledge this excellent partnership with SECO to promote the GovTech agenda.

>>> Executive Summary

Disruptive technologies like Artificial Intelligence (AI) offer new opportunities to governments facing development challenges, especially now that fiscal stress is pushing many governments to seek new ways of improving services without increasing costs. Artificial Intelligence can be defined as the ability of software systems to carry out tasks that usually require human intelligence: vision, speech, language, knowledge, and search.

Many governments view AI as a strategic resource for competitiveness and growth and are embracing it with speed and priority. According to Bughin et al. (2018), AI can potentially contribute $13 trillion to the global economy by 2030.[1] At least 50 governments have developed or are in the process of developing an AI strategy.

However, the pace of AI adoption is uneven, and most countries are not ready for AI adoption. There is no country from Africa or Latin America in the list of the top 20 countries on the AI Readiness Index developed by Oxford Insights.[2] Except for four economies, Asia-Pacific is also one of the worst-performing regions on this list. Slower adoption of AI in our client countries may lead to further inequality between rich and poor nations.

To reduce these inequalities, opportunities should be explored through the initiation of AI projects in areas of strategic impact and priority. Chapter 2 on AI opportunities provides examples of AI use from around the world. Moreover, it provides operational guidance on AI implementation on fundamental questions relating to developing country contexts.
It broadens the perspective to explore opportunities for implementation. Government AI deployments exist in every sector. A common pattern of use cases includes citizen engagement, compliance and risk management, fraud and anti-corruption, business process automation, service delivery, asset management, and analytics for decision-making and policy design.

While AI should be explored to solve complex problems, associated adverse consequences in client contexts should also be fully understood and managed, as AI comes with additional risks that could exacerbate the problems facing the public sector. Chapter 3 summarizes these risks.

The ethical use of AI is fundamental to managing the adverse consequences of AI use in public policy. The ethical use of AI means that these systems should not harm humans. Rather, they should be used to enhance overall human wellbeing. For example, an AI system that renders people jobless on a wide scale, makes a biased decision against an ethnic minority applicant on eligibility for government welfare assistance, or is used to propagate fake news on social media would be unethical. On the other hand, an AI system that improves anti-fraud measures through the reconciliation of multiple large data sets, facilitates medical diagnosis through image recognition, or enhances learning outcomes through tailored access to learning material would be considered ethical and, therefore, human-centered AI.

A national-level public policy response is needed to address these ethical issues. Inequality could rise due to unemployment, the lowering of wages for low-skilled workers, and the vulnerability of some communities to bias in AI-based automatic decisions. Control could increase due to state surveillance of citizens, robot-induced propaganda and fake news on social media, and the use of AI-enabled weapons like drones. The concentration of wealth could accentuate monopolies, as a few firms with the AI resources could dominate the market and lead to net resource flows from the developing to the developed countries where these firms are based.

To manage the risks and maximize the opportunities of adopting AI in the public sector, governments should prepare AI policy and governance frameworks to help guide the ethical use of AI and to provide clarity about AI principles and priorities. Following the adoption of AI policy and the development of a roadmap, an operating framework will anchor the principles as the use of AI is rolled out. Chapter 4 on governance and operations provides more details of the models and AI compliance frameworks currently in use. The models are fundamentally important to help guide the government in protecting the sanctity of human life throughout the phases of AI adoption in the public sector. Governments also adopt basic principles to promote human-centered use of AI. These principles include personal data privacy, accountability, cybersecurity, transparency and explainability, fairness and non-discrimination, human control of technology, and human values.

A central innovation hub for AI could help pool scarce resources to support the initiatives of line ministries. In the use cases, most governments have set up a main hub for AI that serves as a central authority over decentralized projects among line agencies. The AI hub helps them in several ways. It centralizes talent that guides and supports the line agency, connects industry expertise to the line agency, promotes research, and builds alliances with academic institutions and the private sector. It also helps connect with AI organizations internationally to exchange knowledge and resources. Neighboring countries that have a forum for coordination at the political level can develop regional AI innovation hubs suitable to many of the World Bank’s client countries.

Innovation procurement frameworks provide agility for experimentation. In digitally advanced governments, a problem-driven request for proposal (RFP), rather than a tender with solution specifications, is developed and launched under innovative procurement methods. These methods allow an initial award of a small-scope proof-of-concept contract to more than one vendor to compare a range of solution options and decide the best option for further scale-up. World Bank task teams could adopt these approaches in consultation with procurement colleagues and other available technical resources in the Bank, such as the GovTech team, Innovation Lab, Digital Development teams, Innovations in Big Data and Analytics for Development Program, and other sector colleagues.

Most early adopters are embracing a design thinking framework and agile methodology. These include a staged iterative approach to implementation: ideation (problem definition), conceptualization, proposal, procurement, prototype, testing, deployment, and scaling up. A feedback learning loop is built into the design at every stage.

Adopting a government-wide data fabric architecture will help governments leverage cutting-edge technologies to address data silos in a cost-efficient manner. The initial focus should be on foundational technologies, interoperability, open data, and standardization of data across government. Chapter 6 on AI building blocks illustrates the technology foundations for this architecture. The data fabric architecture will serve as the common denominator for standardized data interchange among the multitude of subject-area-specific applications such as an integrated financial management information system, payroll, tax administration systems, e-procurement, health management system, population census, and geographical information systems, among others. This architecture should be built on agile principles, evolve organically, and engender trust. Cloud computing offers immense opportunities to harness the power of such an architecture with agility. Inadequate foundational digital technologies, quality of data, and digital skills are the major barriers to AI adoption in developing countries and constitute critical elements of the digital divide.

AI threats during implementation need to be carefully assessed and mitigation actions planned. Threats include performance and bias, cybersecurity, control, and privacy. These risks should be managed at the implementation agency level, while broader ethical issues need policy action at higher levels. Chapter 3 offers steps toward risk mitigation. Involving stakeholders is crucial to mitigating risk, especially among groups most vulnerable to bias. Additionally, transparency and explainability could strengthen accountability. Compliance with privacy and data protection regulations is critical.

Governments and world leaders are instrumental in guiding the transition to automation and AI. They can provide leadership to influence the trajectory of AI adoption among citizens at national and international levels. This will help avoid adverse consequences and reap productivity gains. National governments could choose global guiding principles that will inevitably shape the acceptance or rejection of AI. Since AI will have a profound influence on service delivery, citizen engagement, and core operations, it is imperative to help formulate a cohesive governance model that supports the process of ethical implementation.

1. Bughin et al., 2018.
2. Government AI Readiness Index 2019.
Priorities Going Forward

Based on the issues highlighted in the discussion, several priorities could be considered by policymakers.

• Governments must adopt policies and governance frameworks that promote human-centric AI while maximizing opportunities. A few aspects of the policy framework are mentioned below:

  » AI policy anchored in ethical principles would be essential. It could be tailored to specific settings but should be approved at the policy level to provide the authorizing environment. Governments in many settings have issued AI strategies approved by the parliament, president, prime minister, or the cabinet. These policies should be based on ethical principles. Governance and operational frameworks are essential to specify broad guidelines and institutional arrangements. An innovation hub could be established to pool talent, establish partnerships with academia and the private sector, promote research, and facilitate experimentation by line ministries. The innovation hub should source the best talent through adequate incentives. Innovative procurement approaches should be adopted to leverage private sector skills with agility to allow iterative, problem-driven approaches to the RFP. The implementation teams should also manage the risks associated with AI, including bias, security, and unintended consequences, among others.

  » Promote transparency and accountability through inclusion and multi-stakeholder engagement at every step of the AI policy design and implementation. Affected communities and populations should be informed and provided with avenues for contesting AI logic without delays and hurdles.

  » Adverse ethical implications of AI could be managed through broader economic policies. These could include industrial policy, tax policy, competition policy, and human capital policy, among others. These policies should aim to develop human capital, ensure fair competition, and incentivize human-enhancing AI solutions, among others.

  » These policies should also promote digital skills and broader education in science, technology, engineering, and mathematics (STEM) to support people as they adjust to the shifting nature of work in the coming decades. Unskilled people and disadvantaged groups should be given special attention.

  » The regulatory framework to fight online propaganda, misinformation, libel, and cybercrimes should be given priority. Also, governments could establish agency mandates to monitor policy compliance and track, prevent, and investigate disinformation to protect their citizens. Engagement with social media Big Tech (Facebook, Instagram, and Twitter) should aim at encouraging the deployment of AI tools and professional fact-check partnerships to take down content that is malicious, hateful, propagandist, and false.

  » Strengthen privacy, data protection, and civil liberties and monitor compliance, which is typically weak in most settings. Full disclosure of information being tracked by AI and robots should also be promoted through transparency frameworks. Civil liberties and privacy are at a particular risk of infringement, which should be addressed through these regulatory frameworks.

• Investments should be made in human capital and digital infrastructure. AI research, digital skills, AI entrepreneurship, and foundational digital technologies could be prioritized.

  » Investments should be directed to fund research, education, and digital skills development programs in general and in AI in particular. They could include scholarships, apprenticeships, and research funding in AI, computer science, STEM education, and AI-related disciplines such as data science. Special emphasis could be given to disadvantaged groups such as women, minorities, and those at risk of being left behind.

  » Innovative entrepreneurship could be promoted. This could be done through an innovation fund, loan programs through state development banks, income-contingent loans for students or others, and small business loan programs. Variations of these funding modalities are already used in Brazil, China, Denmark, the European Union, Finland, Germany, Israel, and the United States (Mazzucato, 2015). AI could be one of the areas to be incentivized through these programs.

  » The innovation hub should be staffed with the best talent on market-based salaries. These skills are in high demand and could easily drain overseas if not attracted and retained with appropriate incentives.

  » Data fabric architecture, including interoperability, should be considered for investments. This will overcome silos and leverage data assets for decision-making, compliance monitoring, and analytics. The initial focus should be on interoperability, open data, and data standardization. A hybrid cloud option, which combines on-premises data and cloud computing in a hybrid environment, should be explored to leverage computing power at much lower cost to pilot AI solutions.

  » Proof-of-concept and pilot AI projects could be the starting point for exploring opportunities. Many governments have deployed AI to solve problems. Key use cases include citizen engagement, service delivery, regulatory compliance, decision analytics, fraud, and anti-corruption. Hackathons promote emerging talents and start-ups, as seen in Austria, Estonia, India, Pakistan, Poland, and the United States.

• Risks should be identified and managed, rather than avoided. Good algorithm impact assessment framework models exist, which can be tailored to suit a country’s context. The details could vary from context to context, but fundamental principles of risk mitigation are common. These include self-assessments, peer reviews, inclusion, and transparency.

>>> Abbreviations

ACL       Access Control Layers
AI        Artificial Intelligence
ANN       Artificial Neural Network
API       Application Programming Interface
COTS      Commercial Off-The-Shelf
CPU       Central Processing Unit
DLT       Distributed Ledger Technology
FedRAMP   Federal Risk and Authorization Management Program
FMIS      Financial Management Information System
FOSS      Free Open Source Solutions
GAN       Generative Adversarial Network
ICT       Information and Communication Technology
IoT       Internet of Things
IPC       Inter-Process Communication
IRS       Internal Revenue Service
ITU       International Telecommunication Union
ML        Machine Learning
MoH       Ministry of Health
NGFM      New Generation Fiscal Machines
NGO       Nongovernmental Organization
NLP       Natural Language Processing
OECD      Organisation for Economic Co-operation and Development
RFP       Request for Proposal
RL        Reinforcement Learning
SRT       Solicitation Review Tool
STEM      Science, Technology, Engineering, and Mathematics
TCO       Total Cost of Ownership
UN        United Nations
UNCTAD    United Nations Conference on Trade and Development
UNESCO    United Nations Educational, Scientific, and Cultural Organization
VPC       Virtual Private Cloud

>>> 1. Introduction

The World Bank launched the GovTech [3] Global Partnership in 2019 to support the modernization of client governments through the use of technology. To promote this effort, the Swiss State Secretariat for Economic Affairs partnered with the World Bank to produce a series of papers. This paper on artificial intelligence (AI) in the public sector, one in this initial series, offers insights drawn from the existing uses of AI in the public sector.
The target audience is nontechnical staff and policymakers who are developing and supporting the implementation of digital strategies for the public sector and are drawn into conversations on the role of AI in modernizing the public sector. It refers to some fundamental technical concepts and provides more in-depth technical explanations in the appendices.

In recent years, governments have begun to investigate ways of leveraging artificial intelligence (AI) in public policy to better serve citizens, enhance compliance, and reduce fraud. The development of an appropriate policy and legal environment for AI could help countries stay ahead in commercial innovation, competitiveness, and international trade. The academic and professional research on AI ethics, policy, and regulatory reforms provides empirical and quantitative evidence on the opportunities and risks of AI adoption in the public sector. The objective of this paper is to help the World Bank’s client governments understand the ethical issues and policy options associated with AI, to promote ethical AI, and to elaborate on the opportunities for AI adoption in the public sector.

Advanced digital economies are increasingly adopting AI in both the private and the public sectors as part of their digital strategies. According to Bughin et al. (2018), AI can potentially contribute $13 trillion to the global economy by 2030. Use cases provide case studies for learning and also illustrate the potential for adopting AI in the public sector to enhance efficiency and quality of service delivery. The World Bank’s client governments frequently request support on how to design digital transformation programs that can increase efficiency and quality of service delivery, improve citizen engagement, and modernize core government operations. One of the important areas of support is AI. With careful execution, AI programs can help a government to deliver services faster and more tailored to the needs of beneficiaries and citizens and the public administration charged with delivering them.

Public administrations that lack data collection capabilities, technical skills in the civil service, and digital infrastructure are unlikely to be able to manage AI data requirements or benefit from the application of AI. But, generally, the volume of information produced and stored daily by people’s movements, activities, and transactions is increasing, and combined with more computing power, such data can be used for effective analysis and policymaking.

The speed of AI innovation and adoption has been fast; AI computation has been doubling every three months.[4] Governments could create readiness conditions to fully leverage the potential of AI as the speed of government digitalization, the store of data, and AI innovation evolve. The paper describes readiness conditions, such as governance arrangements, availability of digital data, local and international data source integrations, technical capacity, and infrastructure, for wider AI adoption and offers guidance on assessing these conditions. While this paper touches briefly on policy as it relates to AI, a more detailed paper on AI policy aspects is forthcoming and will elaborate on a comprehensive framework for policy domains. Also, this paper does not cover the re-engineering of business processes or project management aspects as they relate to AI adoption. Regulations and policies on data, privacy, security, transparency, and accountability, in addition to the business process review, must precede the actual implementation of AI.

The adoption of AI in government requires interagency oversight, coordination among interdisciplinary teams of policymakers, and the adoption of overarching policies to guide its use. In recent years, the public sector has made impressive headway in developing counsels and policies on AI applications, procurement, and adoption. The United Kingdom and Bahrain launched AI procurement guidelines across their governments (ANI 2019). The U.K. government published “A Guide to Using Artificial Intelligence in the Public Sector” (GDS and OAI 2019). Singapore issued the “Model AI Governance Framework” (PDPC 2020). The United Arab Emirates established the National Program for Artificial Intelligence.[5] The Organisation for Economic Co-operation and Development (OECD) published the “Recommendation of the Council on Artificial Intelligence” (OECD 2019).

The use of AI poses substantial risks, as models and data may be substandard or inaccurate, leading to bias. Data privacy and security and the ethical use of AI pose major concerns in all contexts, but this is likely to be even more of a concern where there is a lack of transparency more generally, concerns over human rights, or what might be considered a “poor governance” environment. AI software is a “black box” that is opaque to policymakers. This means that algorithm opacity (the inability to detect design bias in constructing the algorithm) poses a major challenge for policymakers and auditors.

The adoption of digital solutions in government will require an investment in digital skills. The shift in public sector needs from low-skilled to high-skilled workers will take place gradually over the long term, but it is a key consideration because building digital skills in the public sector and overcoming skills shortages more generally also takes time. The use of AI in the public sector may shift the characteristics of public sector employment and potentially result in job losses as more decision making becomes automated through the use of machine learning and models. However, the impact of adopting AI is likely to be less of a concern where the public sector wage bill is manageable and the cost of labor is low. In some cases, demand for lower-skilled labor will decrease, but wholesale substitution of professions with an AI program or machine is unlikely, as expert judgment will still be needed. The demand for high-skilled labor will likely increase. In some contexts, AI can automate systems of bureaucracy and create new job opportunities in, for example, policymaking, auditing, and resource management, jobs that require more analytical skills and judgment.

3. GovTech is a whole-of-government approach to public sector modernization that promotes simple, accessible, and efficient government. It aims to promote the use of technology to transform the public sector, improve service delivery to citizens and businesses, and increase efficiency, transparency and accountability.
4. This is six times faster than Moore’s Law, under which processor speed doubles every two years (now closer to 18 months). See Artificial Intelligence Index 2019 Annual Report, 2019 Stanford Report, produced in partnership with McKinsey & Company, Google, PwC, OpenAI, Genpact and AI21Labs (Artificial Intelligence Index 2019 Annual Report, 2019).
5. For more information, visit the website of the National Program for Artificial Intelligence at https://ai.gov.ae/about-us/.

Methodology and Scope

This paper aims to provide some indications of the opportunities and risks of AI adoption in the public sector. It distills knowledge and guidance on the use of AI in the public sector. The AI use cases discussed in the paper demonstrate the potential to improve government services and create new opportunities to strengthen engagement with citizens.

The paper curates knowledge residing in public documents and aims to distill lessons learned on how to adopt and use AI as part of a public sector modernization strategy. The paper’s primary scope is on governance-related aspects. Chapter 2 elaborates on the opportunities that governments around the world are pursuing through the use of AI. These opportunities should be pursued while managing associated risks, which are discussed in Chapter 3. For maximizing opportunities and managing risks, governments need to adopt AI ethical principles and institutional arrangements, discussed in Chapter 4. Chapter 5 discusses the ethical dimensions that need a broader policy response at the national level. Chapter 6 enumerates the building blocks necessary for a successful long-term AI strategy.

... and prediction within the disciplines. To fully comprehend the impact that AI might have on governments, it is necessary to develop a solid understanding of key AI concepts. The paper does not offer in-depth coverage of work in specific sectors. The findings in the paper were validated through interviews with industry experts. Special efforts have been made to ensure the architectural design approaches discussed in the paper incorporate the best industry knowledge. The paper goes to great lengths to maintain a practical approach, with “hands-on” examples of architectures and applications.

The paper has limitations, and AI adoption is not widespread. Actionable lessons in AI use are rare among client governments. Furthermore, there are limitations to the level of detailed, in-depth information and the availability of use cases from public resources.

Chapter 2 provides 14 use case examples of how AI has already been adopted in the public sector to address public sector issues such as how to control corruption. The associated risks of AI adoption are elaborated in Chapter 3.
However, to harness the opportunities from AI governments need to de- The appendices contain information for practitioners. Appen- velop the governance frameworks, address the ethical con- dix A provides technical information and additional resources siderations and develop the building blocks of a government- for further support, and Appendix B highlights solutions that wide AI architecture, issues discussed in Chapters 4, 5, and rely on AI for improvements in efficiency, scientific analysis, 6, respectively. 14 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN 2. >>> AI Opportunities The public sector in advanced digital economies is rapidly adopting AI, notably in Austria, Brazil, China, Estonia, Israel, Mexico, Republic of Korea, Singapore, the United States, and the United Kingdom, among others. Noteworthy examples are also surfacing in Bank cli- ent countries. In this chapter, several AI use cases are provided to demonstrate the opportunities of AI already being harnessed in the public sector. Developing governments can also harness these opportunities to address some of the complex developmental challenges. However, gov- ernments in these countries need to address some of these challenges to maximize opportuni- ties. The biggest bottlenecks in AI adoption are the availability of quality data, expertise, budget, and mindset for experimentation and problem-solving. Sectors or agencies that are more likely to adopt AI primarily have well-developed data infrastructures. These agencies are typically well resourced, experience compliance pressures, have a mission-critical need for analytical information for decision-making, or consider citizen engagement as an important element of the policy design. The role of leadership initiative is also important. Silos and closed systems with poor or inaccessible data impede AI development. 
Governments need to first evaluate the strengths and weaknesses of their data, procedures, and AI policy framework before embarking on AI solutions. Wider AI adoption in the public sector typically follows once prerequisites like sufficient digital infrastructure, adequate digital skills, enabling legal frameworks, and digital strategies are in place. The Oxford Insights Government AI Readiness Index scores the governments of 194 countries according to their preparedness to use AI in the delivery of public services. The overall score comprises 11 metrics grouped under governance, infrastructure and data, skills and education, and government and public services. The data are derived from a variety of resources including desk research and the UN eGovernment Development Index. As presented in Table 1, the 2019 AI Readiness Index shows that Singapore comes first, with the rest of the top 20 mostly Western European countries.[6]

[6] Government AI readiness indicators: https://www.oxfordinsights.com/ai-readiness2019.

Table 1 - Government AI Readiness Index 2019
Rank  Government
1     Singapore
2     United Kingdom
3     Germany
4     United States of America
5     Finland
6     Sweden
7     Canada
8     France
9     Denmark
10    Japan
11    Australia
12    Norway
13    New Zealand
14    Netherlands
15    Italy
16    Austria
17    India
18    Switzerland
19    United Arab Emirates
20    China
Source: Oxford Insights

Investments in data infrastructure, APIs, open standards, and data governance arrangements are all required for successful AI strategies in government, as discussed in the following chapters.

A digital divide exists across countries in terms of fulfilling the prerequisites for AI adoption.
Most World Bank client countries are still far behind the developed countries in terms of access to broadband, availability of digital skills, and adoption of relevant policies and legislation. Access to fixed broadband is significantly higher in more advanced economies, and the gap between the developed and developing countries has increased in the last 20 years according to the ITU (see Figure 1).

Figure 1 - Fixed Broadband Subscriptions per 100 Inhabitants, 2001–2019. Line chart; vertical axis: per 100 inhabitants; series in legend order: Developed, World, Developing; labelled values in the extracted figure: 33.6, 14.9, 11.2.

Only 14.9 percent of inhabitants in developing countries have access to fixed broadband compared to 33.6 percent in developed countries (ITU 2019). Internet usage is limited to only 19 percent of the population in least-developed countries, compared to 87 percent in developed countries. There are only 67 data centers in 13 countries in Africa (of which 21 are in South Africa) compared to 1,237 data centers in 23 Western European countries. Advanced digital skills, such as writing software using a programming language, are also concentrated in a few rich countries. The disparity in information and communication technology (ICT) skills around the world is shown in Figure 2. Europe is far ahead in terms of ICT skills, compared to Asia and the Pacific, Arab states, and Africa (ITU 2019). More generally, skills in data science and technology are scarce in low-income countries. Capacity constraint is an important issue. Governments that have already adopted AI have promoted additional AI capacity in government by sponsoring government officials to attend programs in academic institutions, introducing training programs in-country, or partnering with the private sector to provide expertise. Creating an innovation hub or a central AI unit as part of the centralized digital agency or as an independent agency helps maximize the use of scarce expertise.

Figure 2 - The Disparity in ICT Skills across the Regions. Map of the percentage of people with advanced IT skills, 2014-2018; legend bands: 0-5, 5-10, 10-15, 15-20, data not available. Source: ITU, 2019.

The legislative framework for data protection and privacy is relatively widely enacted, but policies that would allow accessibility to government-held data are not in place in most developing countries. According to UNCTAD 2020, 132 of 194 countries, including 50 percent of the African countries and 57 percent of countries in Asia-Pacific, have adopted data protection and privacy legislation. However, only seven governments out of 115 include a statement on open data by default in their current data management policies. Worldwide, only 7 percent of government-held data is fully open, and only one in every two datasets is machine-readable (Open Data Barometer 2020). There is also a significant lack of data sharing and interoperability within the government. Open, machine-readable, and interoperable data are some of the important preconditions for wider AI adoption in government.

AI use in government is therefore typically concentrated in a few advanced countries, and is being taken up by digitally more advanced World Bank client countries. Some countries have adopted an AI strategy as a signal of the government’s commitment to AI. At least 50 governments, in addition to the European Union, have developed or are in the process of developing a national AI strategy. Out of these, 37 have or plan to have either separate strategies in place for the public sector or a dedicated public sector focus embedded within a broader strategy (Berryhill et al. 2019). AI strategies are being adopted in some developing and emerging economies around the world including in India, Kenya, Malaysia, Mexico, Poland, Taiwan, and Tunisia (Dutton 2018).

AI patent applications (279,145) are also predominantly in the USA (55 percent), Europe, China, and Japan (Statistica, 2019). Similarly, AI research publications are dominated by developed countries (Microsoft Academic Graph, 2019).

Governments in some less advanced digital economies have started deploying AI to improve government effectiveness. While the scope and abundance of digital resources (talent, capital, infrastructure, and data) may be relatively limited, some developing governments have started piloting AI to address their development challenges.

The remainder of the chapter reviews several use cases where AI has been used in the public sector to address specific challenges: corruption, citizen engagement, customs compliance, health pandemic response, consistent judicial decisions, procurement compliance, taxation compliance and policy, and audit efficiencies. Regardless of the stage of development, countries can develop AI initiatives based on their most immediate needs, but it is also recommended that the approach to AI be part of the planning and accounting for future digital initiatives with a whole-of-government approach to infrastructure, standardization, governance, and execution.[7] The pattern of government adoption typically follows this typology of use cases:

• CITIZEN ENGAGEMENT. The introduction of AI tools such as chatbots that answer citizen queries. For example: Where is my ballot? Where is the nearest emergency department? How can I apply for social welfare benefits? Additionally, aggregation and pattern determination can be used to collect feedback from millions of citizens on a draft policy published online.
• COMPLIANCE AND RISK MANAGEMENT. AI systems are used to cross-reference and reconcile terabytes of data from multiple sources to create alerts for noncompliance. For example, financial intelligence units and central banks use AI to track illicit fund flow and beneficial ownership as well as terrorism financing to comply with the Financial Action Task Force. Tax authorities can use AI to track tax filers who use duplicate profiles to avoid taxation.
• FRAUD DETECTION, PREVENTION, AND INVESTIGATION. Closely related to compliance, AI can be used to detect and prevent fraud, for example by procurement agencies, anti-corruption units, or audit agencies.
• BUSINESS PROCESS AUTOMATION. AI automation tools can scan websites to get currency exchange rates and present information.
• PERSONALIZED SERVICE DELIVERY. Based on a profile, AI sends automatic alerts such as when to renew a driving license.
• ASSET MANAGEMENT. AI can be used to track asset movements across multitudes of systems, aggregating data from Internet of Things devices.
• ANALYTICS AND DECISION-MAKING. AI or machine learning helps aggregate and cross-reference data such as household survey data with information on school enrollment, address changes, satellite images of floods, mosquito swamps, and pandemics to produce policy insights and identify areas needing greatest attention for targeted policy actions.

Use Cases

The following use cases illustrate real-world applications and opportunities for AI in the public sector in a range of contexts. The use cases provide a summary of AI initiatives to tackle corruption in China and Brazil, to engage citizens in Nigeria and Uganda, to improve efficiency and compliance in the United States customs administration, to tackle the COVID-19 response in Singapore and China, to improve public procurement in South Korea and the United States, to improve the effectiveness of the justice sector in China and the UK, the tax administration in Armenia, Mexico, and the UK, and audit in Canada and the UK. The health pandemic (COVID-19) use case is developed in depth to elaborate the concepts and the details of AI logic. Using the typology developed by Oxford Insights, each use case brief states the role of humans on the level of AI adoption in each application area. Table 2 describes the five levels of AI adoption. For examples of additional AI use cases, see Appendix B.

Table 2 - Role of Humans - Five Levels of AI Adoption
Level    Description
Level 5  A fully automated system that never requires human intervention.
Level 4  Automation: A public service runs itself unless it hits an extreme case where it requires human intervention.
Level 3  Semi-autonomous: Computers monitoring and running (e.g., a regulatory system).
Level 2  Close supervision: Routine administration of systems (e.g., energy networks with difficult decisions referred to a human).
Level 1  Simple augmentation: Entering data, processing, identifying clusters of activity, and profiling, among others (e.g., in fraud detection).
Level 0  No automation: People-powered public services.
Source: Oxford Insights.

[7] Governments can adopt an incremental approach when making investments to avoid the huge costs of full-scale infrastructure before development begins. Cloud solutions can provide an opportunity to reduce the total cost of ownership (TCO) for nascent projects. Cloud solutions enable incremental growth because they offer on-demand services at scale, without upfront investments or buying any on-premise servers. More information about cloud solutions and infrastructure is available in Appendix A.
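The compliance use case in the typology above mentions tax filers who use duplicate profiles. As a minimal sketch of that idea, the Python snippet below groups filer records by a normalized name and address and raises an alert when one profile maps to several registration numbers. The record fields (`registration_id`, `name`, `address`), the sample data, and the exact-match rule are illustrative assumptions and are not taken from any tax authority's system; a production system would more likely rely on fuzzy matching or clustering.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace so that
    trivially different spellings compare equal."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(cleaned.split())


def find_duplicate_profiles(filers):
    """Return profiles (normalized name and address) that appear under
    more than one registration number. Field names are hypothetical."""
    groups = defaultdict(set)
    for filer in filers:
        key = (normalize(filer["name"]), normalize(filer["address"]))
        groups[key].add(filer["registration_id"])
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Made-up sample records for illustration only.
filers = [
    {"registration_id": "TX-001", "name": "Acme Trading Ltd.",
     "address": "12 Dock Road, Warehouse 4"},
    {"registration_id": "TX-417", "name": "ACME TRADING LTD",
     "address": "12 Dock Road Warehouse 4"},
    {"registration_id": "TX-230", "name": "Blue River Foods",
     "address": "3 Market Street"},
]

alerts = find_duplicate_profiles(filers)
for profile, registration_ids in alerts.items():
    print(profile, registration_ids)
```

Each alert is only a lead for a human reviewer, which is consistent with the lower levels of AI adoption in Table 2, where staff interpret what the system flags.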
In most instances, multiple AI methods and techniques, described below, are in use. A more detailed description of these techniques is provided in Appendix A on the AI Technical Primer.

Natural Language Processing (NLP): Processing of large amounts of natural language data by AI systems. For example, NLP refers to the ability of an AI algorithm to read a text, convert speech into text, or vice versa. A specific use of NLP is chatbots, applications that support online chat conversation using text or text-to-speech, typically used for customer support.

Data Mining: The ability of the AI algorithm to examine large amounts of raw data to determine patterns. For example, analyzing millions of comments from citizen feedback on an online policy document, and converting these comments into patterns of suggestions, approval, disapproval, etc. Common uses are:

• CLUSTER ANALYSIS: Clusters of similar objects or information are grouped to find patterns. For example, cluster analysis of tax filings to identify the same warehouse or the same names of employees used by the same firm but under different registration numbers and titles to avoid or evade taxes.
• FEATURE ENGINEERING: Features are extracted from raw data to recognize patterns and classify information. For example, drone pictures of community rooftops could be used by the AI to identify types of roofs (thatch, corrugated, cement) and determine patterns of poverty for more targeted policy interventions.

Artificial Neural Network (ANN): AI algorithms that recognize relationships between different data sets in a way similar to how the human brain analyzes such information. Examples include reconciliation of two or more data sets to detect fraud patterns, and medical image analysis that relates a specific feature in an image to a diagnosis and improves the diagnosis through adaptive learning. ANN techniques are also used in data mining.

Convolutional neural networks: A type of deep neural network, most commonly applied to visual imagery.

Generative Adversarial Network (GAN): Use of two or more neural networks to produce a realistic output. For example, fake videos can be made about some celebrity or popular figure by synthesizing two videos to misinform and manipulate public opinion.

AI in Corruption

Brazil
Governance Risk Assessment System

Use Case Brief: World Bank Artificial Intelligence Governance Risk Assessment System
Strategic context: It is estimated that Brazil might be losing between 3 and 5 percent of GDP annually due to corruption. Over 48,000 companies tendered in public bidding processes between 2016 and 2018 in the State of São Paulo alone. Brazilian Government agencies can systematically identify public expenditure risks at this scale only through advanced digital technologies.
Problem statement: Government agencies do not have the tools or capacity to conduct systematic fraud risk assessments. The current approach, which depends on manual input to a large extent, is time consuming, inefficient, and ineffective.
AI methods: Graph theory, clusterization, regression analysis, and supervised machine learning.
Role of humans: Level 3-4: Users must interpret the evidence concerning high-risk firms and agencies. The System analyzes complex networks of potential fraud with minimal effort.
Source: World Bank.

The World Bank Team in Brazil, with funding from the Disruptive Technologies for Development (DT4D) Trust Fund, developed an AI System that identifies 225 red flags of potential fraud in public procurement processes and can help improve expenditures. The World Bank partnered with the City of São Paulo, the States of Rio de Janeiro and Mato Grosso, and the Federal Ministry of Health to leverage the vast amounts of unused data to build a system to help improve their investigative and expenditure capabilities.

As part of the project, the World Bank created one of the world’s largest data lakes, which currently includes 27 datasets with over 250 million data points and more than R$500 billion in public expenditure (approximately US $100 billion). This includes numerous sources and types of data: expenditure databases; electoral databases; beneficiaries of social programs databases; blacklisted firms’ databases; and electronic invoices. Overall, the system builds on:

• Analysis of over R$500 billion in public procurement in Brazil from 12 States and Federal Level.
• Analysis of over 15 million electronic invoices.
• Analyzed and geo-referenced over 750,000 firms and a Public Registry Dataset containing details about 30 million firms (HQ address, partners, date of incorporation, economic sector).
• Incorporated over 30,000 news feeds about corruption.
• Data on 20 million social program beneficiaries.
• Data on 30,000+ blacklisted firms.
• Data from 20 million politicians and 800,000 political donations.

The system optimizes the process of detecting fraud in public expenditure substantially, saving valuable resources (time and money) and increasing the effectiveness of audits and investigations. The system has, so far, led to the exposure of numerous high-risk cases, including:

• Identified over 420 firms that won bids against companies that have a high likelihood of being shell companies, reflecting potential bid-rigging. The winning firms have more than R$ 600 million in public contracts.
• Identified 857 companies that won bidding processes against firms that share at least one partner in common. These firms have executed at least R$ 800 million in contracts.
• Identified 450 firms whose partners are beneficiaries of the conditional cash transfer program, Bolsa Família, which indicates that these individuals are potentially strawmen. These companies have more than R$ 600 million in contracts.
• Identified more than 500 firms owned by public servants working at the same government agency that has executed the contract. These cases amount to over R$ 4.5 billion in contracts.

The technology has a high potential for scalability across Brazil and beyond through the implementation of Scalable Data Unification, which drastically reduces the marginal cost of replicating the implementation of the algorithms and the system. This approach reduces the cost of replication by building a global public expenditure database schema upfront, based on identifying and converting local schemas (the State’s public procurement dataset) into that global schema. Therefore, instead of adapting all the complex algorithms necessary for extracting the 225+ red flags to match the schema of a single public expenditure dataset, the team did the opposite: every new public expenditure dataset is now converted to the global schema and the risk detection algorithms are applied directly.

China
Zero Trust

Use Case Brief: World Bank Artificial Intelligence Governance Risk Assessment System
Strategic context: President Xi Jinping’s policy of promoting technological innovations such as Big Data and AI in government reform. China has faced enormous challenges of controlling corruption and has 50 million employees on the government payroll.
Problem statement: The extent of operational corruption among public officials.
AI methods: Natural language processing; Big Data; data mining; anomaly detection.
Role of humans: Level 2-3
Source: World Bank.

Zero Trust was developed by the Chinese Academy of Sciences and the Chinese Communist Party’s internal control institutions to monitor, evaluate, and scrutinize the work and lives of public servants. Zero Trust can cross-reference more than 150 databases in central and local government systems. The system detects an individual’s property transfers, infrastructure, construction, land purchases, and house demolitions. Zero Trust also detects unusual increases in a civil servant’s bank savings, new car purchases, and whether an official is bidding for government contracts or is doing so under the name of family members or friends. The system then calculates a probability that those actions are corrupt and alerts officials to highly probable cases of corruption.

Zero Trust was rolled out in 30 counties and cities and identified 8,721 government officials suspected of engaging in embezzlement, abuse of power, misuse of government funds, and nepotism. Some of these cases resulted in a prison sentence; most officials were allowed to keep their jobs after receiving a warning or minor punishment (Chen 2019). The future of Zero Trust is uncertain; the system faces backlash from public officials, and it may be decommissioned (Chen 2019).

AI for Citizen Engagement

Nigeria
DataCrowd

Use Case Brief: Public Spending Observatory AI Apps
Strategic context: Citizen engagement and feedback are helpful tools to complement formal mechanisms of accountability as they offer compelling insights for monitoring and evaluation of policies, project designs, and implementation.
Problem description: The limited capacity of the agencies to receive, analyze, and respond to citizen feedback.
AI design: Natural language processing text matching.
Role of humans: Level 1-2
Source: World Bank.

A World Bank team is working in Edo State, Nigeria with Data Science Nigeria (DSN) to pilot an AI solution for citizen feedback to monitor project progress in sample locations. DSN has a mobile app called DataCrowd that is based on AI. The pilot was done over four weeks in May 2020. Its scope covered 77 locations in the state and collected citizens’ feedback for the State Employment and Expenditure for Results (SEEFOR) project. After initial positive results, the project is planned to scale up to cover three more states and about 350 locations. The AI solution has several features; the following were included in the Edo pilot:

AI-powered tag cloud. DataCrowd can summarize text and sentences, such as citizens’ feedback through mobile phones, and instantly shows the keywords and their relevance. This AI feature was used on citizen feedback received during the Edo pilot.[8]

AI-powered geofencing. This feature instantly rejects a submission made outside of a geofenced location. This feature was used during the Edo pilot.

AI-powered image classifier. This feature can classify the contents of a picture. For instance, if a picture of a person is taken, the AI model can tell if it is a male or a female. Unlike the tag cloud and sentiment analyzer features, the image classifier feature is custom-trained based on the image data collected for a particular project, which always requires a lot of images to train. In the case of SEEFOR, the Research and Development Team is working on training the image classifier model to classify some of the SEEFOR images, especially under the public works category. This feature is particularly useful for quality assurance checks and when many images are being collected.

AI-powered image matching. This feature will allow DataCrowd to instantly match an existing image with a new image and report if they are the same or not. This feature is in development. It is expected to be useful as first-level data verification and validation when many images are being submitted by data collectors.

AI-powered opinion mining and sentiment analyzer. The sentiment analyzer feature can measure the sentiment pulse of text data, such as citizen feedback, and categorize sentences into negative, neutral, and positive sentiments. Although this feature exists on DataCrowd, it was not included in the pilot. It is useful for understanding the sentiment expressed in all citizen feedback and could potentially be used for scale-up.

In the pilot, the project authorities were able to obtain citizen feedback on civil works and confirm various aspects of the project’s implementation progress, including location, performance, quality, and completion.

[8] The tag cloud is available at https://datasciencenigeria.github.io/DataCrowd/.

AI in Customs

United States
Northern Border Surveillance System

Use Case Brief: Northern Border Remote Video Surveillance System
Strategic context: The US Customs and Border Patrol is one of the world's largest law enforcement organizations and is charged with keeping terrorists and their weapons out of the U.S. while facilitating lawful international travel and trade. There are 300 ports of entry into the United States that need to be secured without disrupting trade and transit.
Problem statement: Concerns of illegal trade, including drug smuggling and human trafficking, and weapons entering the US under the mandate of the U.S. Customs and Border Protection Agency.
AI methods: Convolutional neural network, computer vision, pattern matching, anomaly detection, prediction.
Role of humans: Level 2
Source: World Bank.

Border patrols require vigilance to stem illicit trade including drug smuggling and human trafficking. The use of AI to combat illicit activities is on the rise. The U.S. Customs and Border Protection Agency uses the Northern Border Remote Video Surveillance System (NBRVSS). The NBRVSS can detect and monitor vessels from miles away and alert authorities when it recognizes unusual vessel movements. It commenced before 2016 and utilizes many radio towers equipped with computer vision that spot anomalies in vessel behavior and allow agents on the ground to intercept potential sources of contraband entering the United States from the Canadian border.

AI in Health

The unforeseen rise of unseen global threats to human health and safety has put AI on the frontlines of disaster response efforts. Furthermore, sudden changes in the behavior of the human population challenge existing models and stress predictive AI systems to the breaking point (World Economic Forum 2018). The speed of response to disaster events substantially impacts the extent of economic losses and human suffering. Delays occur due to a lack of information, analytics, and predictive modeling of the best course of action. Data silos exacerbate the issue, leaving data locked up and inaccessible to communities.

AI can sort through regional data and identify which aspects of the overarching infrastructure have the greatest impact on resilience. It can simulate various disaster events in a region to uncover vulnerabilities and assist with the formulation of disaster recovery plans. A data fabric can hold data from silos and enhance disaster preparation by coordinating emergency information exchange capabilities. During a disaster, predefined use cases can equip first responders with better tools for understanding the local context to take more precise action. Reinforcement learning (RL) is a strong candidate for this type of future simulation.

Use Case Brief: Contact Tracing and Temperature Detecting Camera Apps (SenseTime, Megvii, WeChat)
Strategic context: Contact tracing and screening to target policy response on quarantine for minimum disruption on economic life and contain the spread of COVID-19.
Problem statement: The economic shutdown to contain COVID-19 has impacted jobs and growth and has triggered an unprecedented economic recession in many economies. Smarter and targeted response on quarantine and social distancing policy could save economies from economic disasters.
AI methods: Artificial neural network, reinforcement learning, data mining, prediction.
Role of humans: Level 3
Source: World Bank.

On a more practical level, in light of the COVID-19 pandemic, AI methods are being employed in earnest to model potential effects of quarantine models and screen patients for potential infections using facial and thermal recognition models. Figures 4-7 demonstrate recent modeling of quarantine methods using artificial neural networks (ANNs) from several countries with varying degrees of quarantine policy. Predicted data are shown as solid lines, while observed data are shown as dots. Note the relative accuracy of the predictions for most sources and the detection of possible disparity of infection due to potential under-reporting (Dandekar and Barbastathis 2020).
>>> FIGURE 3 - Wuhan Neural Network Model with Quarantine Control
[Figure: three panels plotted against days post 500 infected: number of cases (data and prediction for infected and recovered), quarantine strength Q(t), and effective reproduction number Rt with the Rt = 1 reference line.]
Source: Dandekar and Barbastathis 2020.

>>> FIGURE 4 - Italy Neural Network Model with Quarantine Control
[Figure: three panels plotted against days post 500 infected: number of cases (data and prediction for infected and recovered), quarantine strength Q(t), and effective reproduction number Rt with the Rt = 1 reference line.]
Source: Dandekar and Barbastathis 2020.

>>> FIGURE 5 - South Korea Neural Network Model with Quarantine Control
[Figure: three panels plotted against days post 500 infected: number of cases (data and prediction for infected and recovered), quarantine strength Q(t), and effective reproduction number Rt with the Rt = 1 reference line.]
Source: Dandekar and Barbastathis 2020.

>>> FIGURE 6 - U.S. Neural Network Model with Quarantine Control
[Figure: three panels plotted against days post 500 infected: number of cases (data and prediction for infected and recovered), quarantine strength Q(t), and effective reproduction number Rt with the Rt = 1 reference line.]
Source: Dandekar and Barbastathis 2020.
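The quarantine-strength and reproduction-number panels in Figures 3-6 can be read through a compartmental epidemic model. The following is a minimal sketch, not the authors' implementation. It assumes a standard SIR model extended with a quarantined compartment T, in which infected people are removed into quarantine at rate Q(t), so that the effective reproduction number is Rt = beta / (gamma + Q(t)). In Dandekar and Barbastathis (2020) the quarantine strength is learned from case data by a neural network; here it is replaced by a hand-picked ramp. The parameter values, the population size, and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def ramp_quarantine(t, q_max=0.6, ramp_days=20.0):
    """Hypothetical quarantine strength Q(t): rises linearly, then stays at q_max."""
    return q_max * min(t / ramp_days, 1.0)


def simulate_qsir(beta, gamma, q_of_t, n_pop, i0, days, dt=0.1):
    """Forward-Euler integration of an SIR model with a quarantined compartment T.

    Assumed equations (a sketch, not fitted to any data):
        dS/dt = -beta * S * I / N
        dI/dt =  beta * S * I / N - (gamma + Q(t)) * I
        dR/dt =  gamma * I
        dT/dt =  Q(t) * I
    Returns an array with columns: t, S, I, R, T, Rt.
    """
    steps = int(round(days / dt))
    s, i, r, t_q = float(n_pop - i0), float(i0), 0.0, 0.0
    rows = []
    for k in range(steps):
        t = k * dt
        q = q_of_t(t)
        # Effective reproduction number under the assumed model.
        r_t = beta / (gamma + q)
        rows.append((t, s, i, r, t_q, r_t))
        new_infections = beta * s * i / n_pop
        ds = -new_infections
        di = new_infections - (gamma + q) * i
        dr = gamma * i
        dtq = q * i
        s, i, r, t_q = s + ds * dt, i + di * dt, r + dr * dt, t_q + dtq * dt
    return np.array(rows)


# Illustrative run: start from 500 infected, as on the figures' time axis.
traj = simulate_qsir(beta=0.5, gamma=0.1, q_of_t=ramp_quarantine,
                     n_pop=1_000_000, i0=500, days=40)
first_day_below_one = traj[traj[:, 5] < 1.0][0, 0]
print(f"Rt falls below 1 after about {first_day_below_one:.1f} days")
```

With these assumed values, Rt starts well above 1 and falls below 1 once the quarantine term has grown large enough, which is the qualitative pattern the right-hand panels of the figures display against the Rt = 1 line.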
The impact that AI can have on pandemic mitigation continues with additional AI methods that are in place, beginning with facial recognition, which uses face scans to detect symptoms. Upon entering the Tampa General Hospital, patients receive an automatic face scan that detects signs of fever, including sweating and elevated skin temperature, to within 0.3 degrees of variance in 1-3 seconds. In another modeling example, RL models learn to combat the illness using policies of quarantine and hospitalization to identify the most successful policy model (Chilamkurthy 2020). Figure 7 illustrates the results of the AI analysis, revealing the potential to thwart the progression of the pandemic within 50 days.

>>> FIGURE 7 - Results of COVID-19 Analysis by AI
[Figure: line chart with the horizontal axis running from 0 to 40 and the vertical axis from 0 to 400, showing four series: Contagious, Recognized, Hospitalized, Death.]
This is the best action policy, where the agent brings down the whole virus in 50 days. At some point, this agent also allows an increase in virus spread (25th to 29th day).
Distribution of action space: {7: 3, 11: 4, 5: 21, 15: 4, 13: 2, 12: 2, 0: 2, 3: 1, 1: 4, 10: 1, 14: 1, 9: 1, 4: 2, 2: 2}
Source: Chilamkurthy 2020.

Lastly, contact tracing applications are emerging on the front lines of halting the spread of infectious disease. One notable example taps into Bluetooth communication broadcasts from smartphone devices. In this system, data from a confirmed infected person’s cell phone can be extracted to list the Bluetooth broadcast “chirps” detected within the phone’s database. By uploading this information to an interoperable data platform, the signatures of the chirps can be cross-referenced with chirps from other reported infections. If the information is made available through an application interface on any smartphone, then the general public can determine whether they have come in contact with known sources of infection. They can then take measures to mitigate the risk of further exposure and potentially seek treatment if the potential of infection is high due to repeated or multiple contacts. While this concept is possible to implement on a local basis, and efforts to implement this technology are documented by both Google and Apple in partnership, no successful implementation exists for the general public at this time. The key problem is interoperability using a large-scale data fabric solution, though the two tech giants assure the public that a solution will exist in the coming months (Apple 2020).

Singapore
Bot MD
Use Case Brief: Hospital and health information app for doctors and front-line health workers
Strategic context: Doctors and front-line health workers need information on the latest health protocols, staff rosters, operational directives, and dosage to effectively manage the COVID-19 pandemic.
Problem statement: Health facilities are under immense pressure to respond to the unprecedentedly high volume of COVID-19 patients. An effective response needs timely information for a coordinated team effort.
AI methods: Natural language processing, data mining, chatbot, search.
Role of humans: Level 1
Source: Bot MD.

Bot MD is an AI chatbot mobile app that acts like ‘google’ for hospital and clinical information on COVID-19 for doctors and frontline health workers. The app was developed in Singapore, and more than 13,000 doctors in 52 countries now use it. Doctors, front-line health workers, and Ministry of Health (MoH) officials can type a question, and the app provides information on staff rosters, health protocols, drug formulary information, disease guidelines, operational directives, and the latest MoH circulars. The app was developed by Tan Tock Seng Hospital (TTSH) and the MoH’s IT team in 2018. The system uses AI to predict situations before they occur and to provide information for decision-making on resource allocation to deal with the pressures. These resources could include manpower, equipment, supplies, medicines, hospital beds, intake centers, etc. (The Straits Times, April 2020).

AI in the Judicial Sector

The inconsistent application of law and the long pendency of cases due to excessive workloads plague the judicial sector. AI has the potential to enhance consistency and efficiency in the judiciary.

China
Similar Cases Push System
Use Case Brief: Similar Cases Push System
Strategic context: China’s Supreme People’s Court is promoting the policy of “Similar Judgments in Similar Cases” to promote consistency in judicial decisions.
Problem statement: Inconsistent application of law during judicial decisions.
AI methods: Natural language processing, Big Data, data mining, and automation.
Role of humans: Level 1
Source: World Bank.

Before harnessing AI, the Chinese judiciary adopted policy measures that enforced the use of technology. China’s Supreme People’s Court (SPC) issued a policy of “Similar Judgments in Similar Cases” to promote consistency in judicial decisions. Initially, a system called “Review and Approval of Judgement System” was implemented, through which superior courts would review the judgments that lower courts submitted online. However, this led to inappropriate interference and delays. The use of this technology, therefore, was canceled. Under the new guidelines, the principle of self-accountability and independence was established, under which the final judgment is issued by the concerned judge without higher-level approval. However, this has led to the risk of inconsistent judgments across jurisdictions. SPC policies now require judges to research similar cases and cite these cases in judgments to ensure consistency.

To support this research, the Chinese judiciary is piloting AI in some provinces to improve consistency. Under this implementation, all prior judgments were digitized and stored in a database. Next, the SPC deployed NLP AI capabilities, through the Similar Cases Push System, to match key text relevant to pending cases using the database. The system presents relevant judgments to a judge in a pre-populated judgment template that the judge reviews and edits. The system reduces the time it takes to formulate a written judgment and all legal procedural documents by 70 percent and 90 percent, respectively (China Daily 2019).

Also, an AI pilot program records court proceedings. Some courts in China are now using AI speech recognition products to transcribe court hearing recordings into text in real time and convert these into written court proceedings using Speech-to-Text NLP methods.

United Kingdom
Legal AI Tools and Bots
Use Case Brief: Robot Lawyer: DoNotPay App
Strategic context: Legal document processing in cases of litigation.
Problem statement: An AI legal assistant is necessary for improvements in the analysis of legal contracts; support of private legal bureaucracy among citizens; and guided legal advice.
AI methods: Natural language processing, chatbots.
Role of humans: Level 4
Source: World Bank.

Automated AI legal assistants and lawyers have surpassed human-level accuracy. An AI bot performed better than human lawyers in competitions for accuracy and efficiency held in London and Tokyo. In London, human lawyers from prominent law firms in the United Kingdom predicted whether the Financial Ombudsman would allow an insurance claim. Of the 775 total predictions, the AI “Case Cruncher” emerged on top with an 86.6 percent accuracy rate compared to 66.3 percent among 100 human lawyers (BBC News 2017).

DoNotPay, touted as the world’s first robot lawyer, helps users dispute parking tickets. In one month, post-launch, DoNotPay helped people overturn 160,000 of 250,000 parking tickets, a success rate of 64 percent (King, n.d.). DoNotPay has now expanded its offerings to airline ticketing disputes and subscriptions. Other lawyer bots are also in operation. These include Ross (United States) for case research powered by Watson AI APIs; Billy Bot (United States), which takes the role of a junior clerk to guide users to free online resources and to find legal representation; and i-LIS, South Korea’s first intelligent legal assistant for legal research.

AI In Procurement

Central procurement agencies in governments face challenges when ensuring regulatory compliance of procurement among a large number of government entities. Central procurement agencies cannot manage the magnitude of procurement activities occurring across the government because the capacity of human agents is limited.

United States
Solicitation Review Tool
Use Case Brief: Solicitation Review Tool
Strategic context: Legal compliance in the tender documents with Section 508 of the Rehabilitation Act.
Problem statement: Reviewing hundreds of complex and voluminous bidding documents, issued by the federal agencies, to ensure compliance with regulations.
AI methods: Natural language processing, Big Data, data mining, feature engineering, and automation.
Role of humans: Level 3
Source: World Bank.

The U.S. government is harnessing the power of AI to strengthen procurement compliance. The U.S. General Services Administration (GSA) has an Office of Government-wide Policy, which developed a new pilot using AI for scanning bidding documents to determine regulatory compliance. The tool is known as the Solicitation Review Tool (SRT).

The SRT AI platform uses NLP, text mining, and machine learning (ML) algorithms to scan and review whether federal solicitations posted on fbo.gov are compliant with Section 508 of the Rehabilitation Act. It alerts responsible parties of noncompliance and flags the need for corrective actions. According to an independent review, the predictions have an accuracy of 95 percent.

This innovation substantially reduces the human resources needed to identify, audit, and enforce compliance. The SRT platform is innovative because it helps the GSA focus its limited resources on noncompliant solicitations. The SRT AI platform has expanded to predict whether solicitations comply with other federal regulatory requirements, such as cybersecurity or sustainability (GSA 2018).

Korea
Bid Rigging Indicator Analysis System
Use Case Brief: Korea’s Fair Trade Commission’s Bid Rigging Indicator Analysis System
Strategic context: The Fair Trade Commission ensures fair competition in procurement practices in the government.
Problem statement: Unfair practices in procurement, using bid-rigging, to beat the competition.
AI methods: Natural language processing, Big Data, data mining, feature engineering, automation.
Role of humans: Level 2-3
Source: World Bank.

Korea is cracking down on bid-rigging through the use of AI. Officials converted a manual process for detecting bid-rigging cases, in place since 2004, into one that uses AI. The introduction of the AI system greatly increased speed and effectiveness.

Bid rigging refers to collusion between procurement officials and a pre-ordained vendor to award a contract using corrupt practices. Bid rigging can take various forms, including short bid submission windows, split procurements to capture funds below detectable thresholds, significant change orders, and substitution of low-priced items with high-priced items after the award. Korea’s Fair Trade Commission (KFTC) is leveraging an AI and analytics platform, the Bid Rigging Indicator Analysis System (BRIAS), to combat corrupt practices.

Before the introduction of the automated AI solution, the KFTC collected and manually analyzed hard copies of bid-related documents from major public organizations such as the Public Procurement Service, Korea Expressway Corporation, and Korea Electric Power Corporation, which issue large-scale public projects. Presently, the KFTC collects and analyzes this information electronically and flags cases of suspicious bid-rigging activities.

In total, 322 public organizations must report their bids to the KFTC. Construction projects over ₩5 billion and tenders for procurement of goods and services over ₩500 million must be reported to the KFTC. The affected public organizations must report related data into BRIAS within 30 days of selecting a bidder. The organizations that use internal bidding systems may transmit bid data to the KFTC in real time using BRIAS APIs. The others must report bid information to the KFTC portal. The information submitted includes the following features:

• The organization’s information on the executive agency and issuing agency.
• Procurement information: types and methods of tenders, the date and contents of tender notices, and the estimated price set out by issuing organizations before tender notice, which serves as a benchmark to determine the tender amount for the successful bidder.
• Bid evaluation information: the ratio of bidding price to the estimated price, the number of bidders, bidder-based tender details, company information for successful bidders, and the number of unsuccessful bids.
• Contract execution information: the number of estimated price increments and alterations to bids.

The KFTC weights the features according to a preset formula and uses the data to analyze the probability of bid-rigging quantitatively. An automated system calculates and assigns a score between 0 and 100 to the procurement item or contract. The higher the score, the more likely the concerned bid is rigged. The KFTC sends flagged bids to external departments for further investigation. In one example involving 12 construction companies for the Seoul subway, the KFTC detected bid-rigging, and the government imposed a surcharge of ₩5.108 billion.

AI in Tax Compliance

Tax administration authorities in governments consistently grapple with the challenge of ensuring reasonable tax compliance. Tax authorities are better positioned to pilot AI tools to strengthen their mandate on compliance for several reasons.

• Generally, tax agencies have more data assets than other agencies. Their capacity is generally higher as well.
• Tax compliance and collections directly affect fiscal sustainability targets and the political agenda of most governments with an interest in funding capital and social projects as promised in their manifestos.
• Many tax agencies across the world deploy data warehouses, data analytics, and, lately, AI projects to leverage the power of technology to promote their mandate through a shift to risk-based auditing techniques for tax compliance.

More than 32 tax administrations worldwide have changed their strategies from a traditional data-oriented audit to a risk-based, cooperative compliance approach that relies heavily on analytics during the assessment process (Microsoft, PWC 2018). Their large data assets typically comprise historical records of tax payments, electronic value-added tax (VAT) invoices, income tax returns, and personal and company information. By deploying a Big Data architecture, capturing all the relevant structured and unstructured information in one database, and running AI and analytics tools, tax administrations may significantly improve their effectiveness. These solutions offer a complete picture of businesses or individuals using risk-based compliance assessments. The examples below highlight how tax authorities are deploying AI and analytics for common problems faced by most tax authorities.

Armenia
AI Use in Tax Administration
Use Case Brief: New Generation Fiscal Machines
Strategic context: Tax evasion among businesses and individuals.
Problem statement: Tax evasion remains undetected when fiscal records are not cross-referenced to reveal the correlations that would expose tax reporting anomalies.
AI methods: Natural language processing, Big Data, data mining, and cluster analysis.
Role of humans: Level 2
Source: World Bank.

Tax evasion is carried out in many ways. One of the most common practices among small businesses and individuals with lower incomes is to remain below revenue thresholds to benefit from lower tax rates. An existing business may open a new business when the existing firm reaches the threshold. Similarly, a business will split into several small businesses, often using the names of friends and relatives. The aim is to benefit from the lower rate instead of paying the applicable VAT. To combat this, tax administrators analyze tax data and identify the interconnectedness of split entities. Armenia’s tax authorities handled this issue using several techniques.

Single administrative document (SAD). Producing a SAD about importers of goods is one way of having a fuller view. Analytics detect whether a taxpayer is always importing the same goods from the same country and the same enterprises repeatedly. Moreover, electronic invoices help detect groups of taxpayers that use identical storage for imported goods. The tax administration investigates the anomalies.

Cross-matching of sales and invoices. Cross-referencing sales and invoice data provides important insights into various sellers’ revenues. Armenia’s tax authority collects data from the registration database, from new generation fiscal machines (NGFM), which are cash registers connected to the agency’s servers, and from invoice databases. The invoice database detects when a variety of entities are selling goods from the same warehouse. The NGFM data reveal when a group of taxpayers use a variety of fiscal machines at the same location. The registration database reveals when different enterprises have the same founders. Such suspicious anomalies are subject to detailed audits; they may not necessarily indicate tax fraud but need deeper scrutiny.

Taxpayers’ employees. AI and analytics can detect suspicious cases in which different employers declare identical groups of employees in income tax filings or when a closed and reopened business hires identical employees. Employee information is obtained by linking a social security number and person identification database using Big Data infrastructure.

AI and analytics on sales data from the cash registers. The Monitoring Center leverages information received from NGFM. Some patterns call for further scrutiny. For example, if a fiscal machine does not work throughout the day, and the taxpayer prints 100 or 200 receipts within one or two hours, this flags a falsified fiscal amount with false receipts without actual sales. Some taxpayers print a single receipt with an unrealistic amount at the end of the day. All such cases are under control since the Monitoring Center automatically sends notifications and requires explanations. If no reasonable explanation is given, the case goes to audit.

Comparison of data from utility providers. The data from water, electricity, and gas providers, for example, reveal enterprise expenses, which should correlate logically with the total amount of reported sales for a particular line of business. Again, this cross-referencing and correlation yields valuable insights.

The outcomes of Big Data analysis. Using targeted audits conducted during the first years of implementation of NGFM, the tax administration reduced the number of audit cases by a factor of about 2.5 over recent years. The effectiveness, measured by the average amount of additional tax per audit, grew constantly over recent years. Also, the agency achieved substantial cost savings due to the rapid reduction of the number of local tax administration offices. Armenia reduced the number of local tax offices from 52 in 2009 to only two offices in 2017 (IOTA 2018). These are the departments for large, medium, and small taxpayers.

United States
Palantir Gotham Platform
Use Case Brief: Palantir Gotham Platform
Strategic context: Tax refund fraud, identity theft, and compliance.
Problem statement: AI is necessary in detecting tax evasion and conducting criminal investigations in cases of tax fraud and identity theft.
AI methods: Cloud computing, Big Data, analytics, aggregation, and automation.
Role of humans: Level 1-2
Source: World Bank.

In 2011, the Internal Revenue Service (IRS) created the Office of Compliance Analytics (OCA) to construct analytics programs that could identify potential refund fraud, detect taxpayer identity theft, and handle noncompliance issues efficiently. OCA leverages an advanced analytics program that relies on the use of Big Data and predictive algorithms to reduce tax fraud. In 2016, significant organizational changes took place when the OCA and Research, Analysis, and Statistics merged to create the Research Applied Analytics and Statistics (RAAS) division. RAAS leads a data-driven culture through innovative and strategic research, analytics, statistics, and technology services in partnership with internal and external stakeholders. By combining AI and advanced analytics platforms, RAAS extracts value from the vast amounts of proprietary data stored within the IRS legacy computers.

The IRS uses the Palantir Gotham platform to run its Lead and Case Analytics (LCA) service. Special agents and investigative analysts in IRS Criminal Investigations use LCA to “generate leads, identify schemes, uncover tax fraud, and conduct money laundering and forfeiture investigative activities” (Federico and Thompson 2019).

The various divisions of the IRS have access to several data mining applications. These include the Investigative Data Examination Application (formerly known as Investigative Data Analytics); LCA; Return Review Program (RRP); Financial Crimes Enforcement Network Query; and Compliance Data Warehouse. In 2016, RRP generated more than 693,000 identity theft leads, with a 62 percent accuracy rate, and more than 103,000 other nonidentity fraud leads, with a 49 percent accuracy rate (U.S. Department of the Treasury 2017).

AI in Tax Policy

United States
AI Economist
Use Case Brief: AI Economist
Strategic context: Macro-fiscal policymakers need better data and analytical reporting to design data-driven policies.
Problem statement: Data-driven macro-economic policies are hampered by a lack of data, skills, and robust models.
AI methods: Artificial neural networks, cloud computing, Mechanical Turk, and automation.
Role of humans: Level 1
Source: World Bank.

Modeling data-driven tax policies in most developing countries is hampered by a lack of reliable data, forecasting skills, and robust models. These impediments could be overcome through the use of emerging AI tools if concomitant analog complements are in place. The challenge in most settings is devising a tax policy that optimizes equity and productivity. The AI Economist uses a two-level RL framework, composed of agents (workers) and a tax policy, to model and learn the effects of dynamic tax policies in principled, data-driven economic simulations. The framework does not use prior world knowledge or make any modeling assumptions. It can optimize for any socioeconomic objective. It learns from observable data alone. Though the framework is not yet deployed in government, results show that the AI Economist can improve the trade-off between equality and productivity by 16 percent when compared to a prominent tax framework proposed by Emmanuel Saez, professor of economics and Director of the Center for Equitable Growth at the University of California at Berkeley. The framework captures even larger gains over an adaptation of the U.S. federal income tax and over the free market (Zheng et al. 2020).

>>> FIGURE 8 - An Optimal Tax Policy Optimizes a Balance between Equality and Productivity
[Figure: panel a, equality-productivity trade-off, with equality (0% to 100%) plotted against productivity (low to high) and the Pareto boundary marked; panel b, wealth distribution by tax model for the top agent and a typical agent. Models compared in both panels: Free Market, US Federal, Saez Formula, AI Economist.]
Source: Zheng et al. (2020).
Notably, the AI Economist leveraged real-world human actors in the roles of workers competing with AI-driven policy models that evolved based on human interactions. Figure 8 compares the overall results of the study. The results take into account the Pareto boundary, the frontier beyond which further gains in equality come at the cost of reduced productivity. Note the parity in wealth distribution among sectors of society and the overall gain in productivity due to the tax policies enacted by the AI Economist model. The AI Economist is in active development, with plans for open-source distribution and government engagements shortly.

AI in Audit

Canada, UK
MindBridge for AI Auditor
Use Case Brief: MindBridge AI Auditor
Strategic context: Oversight, assessment of the effectiveness of risk management, controls, and governance through external and internal audit.
Problem statement: AI models help maximize the efficiency of document analysis for legal infraction detection and policy audits.
AI methods: Natural language processing, Big Data, data mining, anomaly detection.
Role of humans: Level 1-2
Source: World Bank.

Private sector audit and assurance firms are the primary adopters of AI in audit. Their goal is to maximize efficiency, minimize the costs of audit work, and enhance the coverage of audit procedures. Specifically, these procedures require two functions:

• Analyze contract documents (leases, rental agreements, etc.) for pre-identified keywords, such as key clauses, dates, persons, and relevant terms.
• Present potential anomalies for further human investigation.

Because these documents may be several thousand pages long, they are often reviewed on a sample basis due to limitations associated with manual labor.

However, AI allows document analysis at a fraction of the time cost. In some cases, it reduces the time cost by more than 90 percent. Furthermore, the quality of risk assessment is also vastly improved.

When detecting anomalies, AI produces a risk score using general ledger entries with financial features to meet compliance and assurance parameters. Some of these features are:

• Materiality levels
• All urgent payments
• Unbalanced debits and credits
• Rare flows
• Cash to bad-debt conversions
• All payments that went through multiple adjustments or reversals
• Journal entries beyond a threshold
• Open invoices beyond a period
• Sudden spikes in otherwise dormant vendors
• High-value transactions for a historically low-value vendor
• Duplicate entries
• End of the year or end of the period procedures
• Uncleared bank reconciliation entries
• Multiple changes to the bank account information of a vendor.

AI processes allow auditors to extract and load accounting and finance data directly from financial management information systems (FMIS) or underlying enterprise resource planning (ERP) systems. Human auditors use a dashboard to visualize the risk scores and investigate anomalies externally. Auditors can flag data and trigger ML algorithms to refine scores. In minutes, AI can do work that would otherwise take several auditors many weeks. Some tools, such as the MindBridge AI Auditor, an application developed by a private Canadian firm, are compliant with international audit standards like SAS 99, CAS 240, and ISA 240. The UK and Canadian federal governments are testing the tool for wider applicability and adoption (MindBridge).

3. >>> AI Risks

For all the potential benefits, there are also significant potential risks that will need to be mitigated for the adoption of AI as part of a government’s digital transformation.
The risks and their mitigating measures discussed here are primarily at the project level, while policy-level ethical issues for society at large are discussed in Chapter 5.

Performance, Trust, and Bias

Negative bias is an inherent problem in AI that arises from many factors, including incomplete, inaccurate, or corrupt data (statistical bias), which cause a predictive outcome that is in favor of or against one or more groups of people. There are well-known cases of how harmful such negative bias can be, leading, for example, to unfair access to public services such as housing and social benefits or to unfair incarceration. For example, analysis of the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) software used by U.S. courts and police to forecast which criminals are most likely to re-offend found it was biased against African Americans. The COMPAS algorithm provided information to police and judges to make decisions on defendants and convicts, for example setting bail amounts and sentences. The analysis found that the software was twice as likely to falsely label Black defendants as future criminals as it was white defendants.9

9. Venkateswaran 2020

Some bias is inherent in AI models because data are finite, even when made available at scale. AI systems need to be continuously refined and improved as datasets and tools evolve or weaknesses emerge. Even with considerable preparation, sources of bias can be difficult to identify preemptively. As a result, AI results can be deceptively rational, even when biased (Ntoutsi et al. 2020). Sometimes, the AI team of developers or data scientists carries some inherent bias (cognitive bias), which should also be carefully monitored. Also, AI firms may deliberately manipulate data and algorithms to maximize profits (economic bias), which should be addressed through policy action and public scrutiny.

To manage the risks of bias and the impact on access to services, a policy framework needs to address these issues. The full disclosure of the datasets and algorithms used in AI is the key to managing bias. Data and algorithm disclosure can aid in building trust and also aids the production, collection, and engineering of “good” data, which is defined as follows:

• Good data are available in abundance. The more data, the better.
• Good data have explainable features that relate to the problem statement. Raw unprocessed data contain simple, human-readable values.
• Good data are extensible. In other words, new features (data points or parameters to the layman) can be added to each record as models evolve. Feature engineering, which involves using existing features to derive additional information about each record of data, is possible with good data (see Annex A).
• Good data are normally distributed. A normally distributed sample of the population is easily derived using random selection methods during training and testing. The values of data are not random; however, the selected members of a broader population are random.
• Good data are complete. Data are not missing key features that are critical to the problem statement.
• Good data can be traced back to their origins. Sensitive personal information about people can be obfuscated. Data come from official government sources.

Nonetheless, bias can emerge throughout the AI project life cycle, often unconsciously, through selective data gathering, requiring additional policies to oversee data selection processes. For example, data scientists may choose to collect data from groups that are perceived to be relevant, but these groups may be selected as a matter of personal preference. This is a classical polling technique that yields favorable results from a population-based selection of data around information on gender, race, ethnic origin, zip code, color, and disability.

Bias is best mitigated by policies and processes that ensure inclusion, conscientious oversight, transparency, disclosure, and contestability. Where models may influence public policy or mission-critical outcomes, the publication of data collection criteria as well as the release of open-source code for the implemented frameworks may mitigate the risk of producing nefarious outcomes. Even more so, the democratization of data and policymaking can improve the practical outcomes of AI frameworks and enhance trust in AI infrastructure in government.

Additionally, governments should develop competing AI systems that focus on the same problem statement. By employing multiple solutions on one problem statement, a practice adopted in Singapore and Israel, governments can significantly improve the likelihood of a positive outcome. Two systems with varying degrees of bias help reduce the likelihood of unintended outcomes by converging on results in different ways. Also, AI systems could be developed to identify bias, using the same tool to fight bias that caused the bias in the first place.

Human oversight could provide an additional safeguard against machine-invoked bias. Introducing human oversight can help detect skewed results from influences such as training data manipulation, forgery, and intentional bias.

Implementing agencies could develop risk mitigation frameworks. Many governments have already developed model AI risk mitigation frameworks, which can be tailored to the local context. The Government of Canada developed an Algorithmic Impact Assessment10 for implementing agencies that consists of an online questionnaire and scoring scheme to assess the level of risk and mitigate the risk. The U.S. Department of Homeland Security has developed an AI risk assessment framework that is also useful in mitigating risks in AI performance. Key aspects of this framework are summarized below:

>>> TABLE 3 - AI Risk Mitigation Framework

Types of Standard | How Applicable to AI | Where Standards Are Applied | How It Can Reduce AI Risk from an Adversary
Analytics and research | Standards that evaluate the quality of analysis and scrutability of algorithms | Back end: explainability and transparency | Identify faulty logic or reasoning, increase the difficulty of deceiving and/or manipulating analysis from AI. Determine how much to trust system inputs and outputs.
Legal and regulatory | Standards based on governance and regulatory oversight into preserving privacy and consent | Front end: usability and personalization; back end: standardized architecture | Change understanding of liability for mistakes and enhance attribution. Transform the notion of the jury of peers and evolve crime and punishment.
Moral and ethical | Standards that prevent AI from performing actions that are contrary to a moral or ethical norm | Back end: fail-safes | Reduce the likelihood that AI will do the “wrong thing” (i.e., immoral or unethical behavior) if exploited or infiltrated by an adversary.
Technical and industry | Standards to measure the performance of an algorithm on relevant tasks | Front end: performance | Meet appropriate technical specifications (e.g., low number of false positives) to be robust against adversary denial and deception activities.
Data and information security | Standards for the protection, sharing, or use of data relevant to a task | Front end: training; back end: data integrity and availability | Limiting access to and information about how an AI system works to appropriate people could help prevent exploitation by an adversary. Preventing manipulation of training data.

Source: Oxford Insights

10.
https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html.

Cybersecurity

Hacking poses a serious risk in AI systems. Forged data and bad actors can impair training algorithms to cause harm. One of the most common hacking techniques to exploit security vulnerabilities in AI is phishing.

Spear-phishing tactics include the practice of delivering malicious code or gaining unauthorized access through socially engineered messages. The best-known example of a general phishing attack is that of a digital hustler: a foreign prince offering unclaimed money in a foreign bank account in exchange for a small cash advance or a bank account number. In this example, broad-stroke methods of AI message creation leverage socially desirable outcomes, which AI in spam filters has become adept at detecting. Propagation of malicious code that spreads itself across systems, networks, and even ‘networkless’ devices offers exponential reach in offensive cyber operations. The most notable example is Stuxnet, the infiltration of a country’s uranium enrichment program, where a targeted propagation attack led centrifuges to spoil their payload. Another is NotPetya, which relied on password theft and caused over $10 billion in damages across hundreds of thousands of computers in more than 100 countries. NotPetya also repurposed an exploit originally developed by the U.S. National Security Agency to rip through targeted networks in seconds or minutes, making it one of the fastest-spreading pieces of malicious code in history. The adage goes, “by the second you saw it, your data center was already gone.” NotPetya did not leverage even the slightest bit of AI.

Solid governance practices help mitigate the risks by imposing explainability, transparency, and validation in AI systems, in addition to the security best practices at the technical level. Governments can prevent adversarial attacks on data sources and computing resources with the use of security best practices, such as access-control lists (ACLs) and API tokens for inter-process communication (IPC) and human-facing endpoints. These practices are standard rules among corporations. Government systems are no exception to these rules of practice.

Prevent common patterns that kill critical processes. Proactive cybersecurity operations conceptualize the kill chain, the sequence of steps that hackers cycle through to achieve nefarious goals. Both hackers and defenders have a vested interest in finding vulnerabilities in AI systems: the former to exploit, the latter to remediate. AI is useful in vulnerability discovery.

Mitigate zero-day exploits (those with no patches) that are the targets of cyberattacks. Cybersecurity and AI teams also use tools known as fuzzers to discover errors and security loopholes by inputting massive amounts of data (called fuzz) to the system in an attempt to make it crash.

Implementing teams should also back up data with redundant systems and ensure there is no single point of failure (SPOF). Wiping attacks that erase or overwrite otherwise benevolent files on computing systems are difficult to detect because their effect is known only after they propagate and execute. However, through the use of learning-enabled AI, engineers can develop defenses against these types of propagation attacks, though no known examples of such AI systems exist in the public domain. Obfuscation and anti-forensics employ methods of detection avoidance. AI can be quite beneficial in detecting obfuscation attacks as well as creating them. Destructive attacks are unlikely candidates for AI prevention.

AI holds great promise in cybersecurity defense. However, because destructive propagation attacks can proliferate and remain dormant for months, even years, the detection of these attacks may be limited in scope. Still, the effort to detect security breaches remains a key focus of AI systems in cybersecurity.

It stands to reason that if AI can learn to detect threats, it can also alter them to further delay their effects if not block them altogether. Furthermore, attribution mechanisms that detect external sources of threats through AI clustering techniques are proving themselves in identifying sources for threats in geographic regions. In some cases, NLP can detect grammatical nuances in source code that allow defenders to home in on geographical regions for further investigation.

All told, the many methods of subterfuge and espionage employed by hackers and defenders are rife with theory and are unclear to the general public. Sometimes, tools designed at the hands of government defense departments are responsible for the greatest defenses and offenses. It is a government’s responsibility to determine the degree of impact that its defensive strategies have on the safety of the general population.

Remain proactive. Not all offenses stem from political cyberwarfare. Many still emerge from obscure corners of the internet, to “prove” that the vulnerabilities of government and commercial organizations are real. Although they are effective in advancing the evolution of cybersecurity best practices, they are more often than not isolated incidents that fall under the jurisdictions of international authorities and garner stern responses from enforcement officers and legislators.

Control

Because many AI systems operate autonomously and interact behind the scenes with one another using IPC, machine-centric feedback loops can cause unintended consequences. In 2010, stock exchanges that allow high-frequency trading experienced a flash crash caused by trading algorithms that went awry in competition with one another. This led to unintended, artificial financial market volatility. Moreover, chatbots interacting with one another can create their own language that humans cannot understand.

Proactive control, monitoring, testing, and validation are necessary to control the outcomes of rogue AI systems and prevent edge cases in software development from getting the best of humanity, if only on a rare occasion.

Privacy

The use of data fabrics and Big Data, growing reliance on automation and decision-making, and the gradual reduction of human involvement in human processes raise concerns about fairness, responsibility, and respect for human rights. Moreover, AI data policy raises concerns for privacy and individual identity. Group and community-driven AI has the potential to increase the risk of harm by what Carl Jung describes as the collective unconscious of humanity, a shadowy force or dark side of personality that collectively propels human digressions at a macro level. AI is no exception.

Protect privacy and human identity. Yet, despite all the foreboding ethical predictions, ethical influences begin with the protection of individual identity within large-scale datasets, access control, and policies. This prevents the arbitrary exploitation of identity recognition systems. In the United States, municipalities are enforcing policies to ban facial recognition technology altogether. The use of AI in some countries to detect fever from facial recognition software in cameras installed at public places carries the risk of human surveillance and infringement of privacy. In Singapore, the GovTech agency and the Ministry of Health (MoH) have co-developed an app, TraceTogether, that can trace individuals without infringing on privacy. Citizens download the app, turn on Bluetooth, and allow push notifications and location services. The app can exchange signals over a short distance of 2-5 meters with other app users, exchange anonymized identifications (IDs), and locally store anonymized data on all persons in the proximity of the app user. If the user gives permission in the app, the MoH will contact the user by sending a code. MoH will then be able to decrypt the random IDs of individuals with whom the user came into contact. The authorities comply with the privacy and data protection laws, as no personal details are collected except the phone number.

Furthermore, policies can enforce limitations on group inference models that lead to individual discrimination. For example, organizations are choosing to obfuscate individual identity to mitigate the risk of fraud due to unauthorized access to data. Rather than use names and ID numbers, data systems are using salted cryptographic hash functions to “scramble” identifiable information. Because a hash function with a fixed salt is deterministic (it always yields the same result for a given input), systems can protect exploitable data and retain uniqueness for algorithmic purposes.

Privacy legislation and regulatory frameworks provide a solid legal basis for mitigating privacy risks. Governance frameworks that promote self-assessment, peer review, and public inclusion could strengthen compliance with these legal frameworks. The details could be adopted based on the context and existing mechanisms of transparency, citizen engagement, and accountability. However, the value of public inclusion is critical in this process.
4. >>> AI Governance and Operations

Most advanced digital governments have issued governance frameworks, including ethical principles for the use of AI. An overview of these governance models is presented in this chapter, which discusses three aspects of governance models: ethical principles, the role of a central agency, and operational framework.

AI Ethical Principles

Mitigating AI risks requires adopting ethical principles that address the key ethical considerations. Several advanced digital economies are adopting AI governance models and policies developed by an interagency team of policymakers and AI experts. This chapter summarizes those principles, identifies good practices for the institutional design for adopting AI in the public sector, and shares innovative procurement practices for acquiring AI implementation services. The models of AI governance typically address bias, privacy, algorithm opacity, limited data access, security, citizen consent, and inadequate supervision. National governments, including Australia, Canada, China, Japan, Singapore, United Arab Emirates, and the United States, as well as international organizations including the European Commission (EC), the Institute of Electrical and Electronics Engineers (IEEE), International Organization for Standardization (ISO), UN, and World Economic Forum, are actively proposing governance models for AI that emphasize common principles:

• PRIVACY AND DATA PROTECTION. AI solutions should respect an individual’s right to privacy and civil liberties. Individuals should have control over their data. Individual consent is necessary for using and re-distributing their data. They should have the right to restrict the processing of their data and the right to rectification and erasure.

• ACCOUNTABILITY. Mechanisms must ensure accountable behavior during the life cycle of AI design and implementation. Impact assessments should be carried out to identify accountability at every step of the process. An agency or body should be responsible for monitoring accountability.

• SAFETY AND SECURITY. Cybersecurity is critical. AI solutions should have predictable behavior. Leaders must ensure the well-being of society at large and of private individuals.

• TRANSPARENCY AND EXPLAINABILITY. The algorithm, business case, data collection, design, and policy information must be transparent to stakeholders and those impacted. Open-source data and algorithms could enhance transparency. Individuals should be notified when interacting with AI or when AI makes decisions for them. There should be regular reporting requirements on transparency. The rights of citizens to information are important. Data should be of high quality and representative.

• FAIRNESS. AI solutions should minimize bias and identify and manage risk. Inclusiveness should be ensured in design and impact.

• HUMAN CONTROL OF TECHNOLOGY. AI should be under human control. People should review automated decisions. Individuals should be allowed to opt out of automated decisions.

• PROFESSIONAL RESPONSIBILITY. Multistakeholder collaboration, accuracy, and scientific integrity of the solution should be ensured.

• PROMOTION OF HUMAN VALUES. AI should be human-centric. It should promote human values and benefit society.

Some governance models and guidelines emphasize common program and project management practices like cost-benefit analysis, legal and regulatory compliance, risk management, flexibility, and the use of an agile approach.

Ensuring compliance with these principles would require a careful balance between oversight and agility.11

These principles are given different levels of emphasis in different settings. The Berkman Klein Center for Internet & Society at Harvard University tracks and maps the global consensus on ethical principles for AI. Figure 9 is adapted from their work, which shows the global adoption of these principles and the level of emphasis of each principle. Despite different levels of emphasis on different principles, there is a consensus that ultimate control of AI must remain with people. AI must not be a regulatory means unto itself.

11. To enforce policies, the European Union (EU) is considering establishing a standards body, similar in composition to the U.S. Food and Drug Administration, to assess the impact of algorithmic processes before release. There is a key problem here since algorithmic innovation occurs at such a speed that it outpaces the government’s ability to evaluate every potential outcome. The agency may even become a bottleneck that developers simply bypass due to capital constraints. Instead, some propose that such validation should be part of a certification process that is executed through peer review.

>>> FIGURE 9 - Global Consensus on the Principles of AI
Source: Fjeld et al. (2020).

Country Examples of AI Governance Systems

The Australian Government Department of Industry, Innovation and Science funded research into the ethical principles of AI usage in government in 2018 and published a white paper on it in 2019. The core principles in its AI governance framework are:12

1. GENERATES NET-BENEFITS. The AI system must generate benefits for people that are greater than the costs.

2. DO NO HARM. Civilian AI systems must not be designed to harm or deceive people and should be implemented in ways that minimize any negative outcomes.

3. REGULATORY AND LEGAL COMPLIANCE. The AI system must comply with all relevant international and Australian local, state, territory, and federal government obligations, regulations, and laws.

4. PRIVACY PROTECTION. Any system, including AI systems, must ensure people’s private data are protected and kept confidential and must prevent data breaches that could cause reputational, psychological, financial, professional, or other types of harm.

5. FAIRNESS. The development or use of the AI system must not result in unfair discrimination against individuals, communities, or groups. This requires particular attention to ensure the “training data” is free from bias or characteristics which may cause the algorithm to behave unfairly.

6. TRANSPARENCY AND EXPLAINABILITY. People must be informed when an algorithm is being used that impacts them, and they should be provided with information about what information the algorithm uses to make decisions.

7. CONTESTABILITY. When an algorithm impacts a person, there must be an efficient process to allow that person to challenge the use or output of the algorithm.

8. ACCOUNTABILITY. People and organizations responsible for the creation and implementation of AI algorithms should be identifiable and accountable for the impacts of that algorithm, even if the impacts are unintended.

The guiding principles of the Canadian government’s 2019 Directive on Automated Decision-Making13 for the ethical application of AI are:

• Understand and measure the impact of using AI by developing and sharing tools and approaches.
• Be transparent about how and when to use AI, starting with a clear user need and public benefit.
• Provide meaningful explanations about AI decision making, while also offering opportunities to review results and challenge these decisions.
• Be as open as possible by sharing source code, training data, and other relevant information, all while protecting personal information, system integration, and national security and defense.
• Provide sufficient training so that government employees developing and using AI solutions have the responsible design, function, and implementation skills needed to make AI-based public services better.

Furthermore, the Canadian government formulated a comprehensive analysis and exposition of the key government processes in play across the entire government. The document includes objectives and expected results, definitions, and rules for semi-annual re-evaluation, which is crucial in light of the rapid pace of AI development. The government also developed an Algorithmic Impact Assessment (AIA), which is a questionnaire designed to assist agencies in assessing and mitigating their risks.14

>>> BOX 1 - Actionable Insight: Adopt Principles of AI and Issue an AI Governance Model

The central digital agency should adopt the common principles of ethical AI and prepare a governance model. The model should formulate operational arrangements, including an innovation hub, data governance, data standards, collaboration with the private sector, skills development, adoption in the public sector, and partnership with nonprofits and academia to promote AI research, among others.

12. https://www.industry.gov.au/news-media/towards-an-artificial-intelligence-ethics-framework.
13. https://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=32592.
14. For more information, visit the “Responsible Use of Artificial Intelligence (AI)” on the website of the Government of Canada at https://www.canada.ca/en/government/system/digital-government/modern-emerging-technologies/responsible-use-ai.html#toc1.

Singapore

Most of the themes discussed have been incorporated into the Model Governance Framework, issued by the Government of Singapore (PDPC 2020).15 Singapore maintains an active leading role in the strategic development of integrated government AI systems around the world. Singapore is actively investing in AI policy and process standards among partner nations to support global AI development in trade and commerce. Transparency affords opportunities for the successful development of systems impacting key strategic international partners. The willful commitment to long-term execution provides a global foundation that extends far beyond Singapore’s borders. Singapore has also been experimenting with policy enforcement using AI-powered robotics and contact tracing since the start of the COVID-19 pandemic. The government’s governance model is driven by two fundamental guiding principles:

• EXPLAINABLE, TRANSPARENT, AND FAIR PROCESS. The organizations using AI should ensure the decision-making process is explainable, transparent, and fair.

• HUMAN-CENTRIC AI. AI solutions are human-centric. AI helps amplify human capabilities and protects human interests.

The model advocates that organizations should embrace four key measures in their quest for AI adoption:

1. INTERNAL GOVERNANCE STRUCTURE. The involvement of top officials and their sponsorship of AI initiatives is critical. This ensures ethical considerations are introduced in the decision-making process and monitored regularly at the highest levels.

2. DETERMINING THE LEVEL OF HUMAN INVOLVEMENT IN AI-AUGMENTED DECISION-MAKING. AI algorithms can support processes with or without the involvement of humans. Any process that affects human beings must involve humans “in-the-loop.”

3. OPERATIONAL MANAGEMENT. This aspect includes data management, talent, skills, and procurement. Organizations must ensure data governance arrangements are in place to ensure integrity, consistency, transparency, security, interoperability, and accountability for data. Also, organizations must strive to incorporate relevant talent through proactive partnerships between academia, private sector firms, and start-ups, which are fundamental pipelines leading to the success of AI initiatives. Procurement must provide flexible experimentation, produce proofs of concept over multiple iterations, and scale up with an acceptable risk of failure.

4. STAKEHOLDER INTERACTION AND COMMUNICATION. Strategies must ensure consistent and transparent communication with the key stakeholders and manage relationships with them. In the public sector, public scrutiny and transparency are critical aspects of AI initiatives.

The U.K. Office for Artificial Intelligence is responsible for overseeing the implementation of AI and has produced several reports.16 The agency is a joint effort of the Department for Business, Energy, and Industrial Strategy and the Department for Digital, Culture, Media, and Sport.

AI and the Multilaterals

The EC has formed a high-level expert group to prepare the ethics guidelines, which were circulated for comments, testing, and assessment in 2019 and are being vetted by many organizations (EC 2019). The EC envisions developing an AI ecosystem that brings benefits to citizens and businesses for improved service delivery, promotes new products and services, and emphasizes sustainability while ensuring safeguards, rights, and freedoms. The EC is promoting a common European approach to reach scale and avoid fragmentation of the single market. According to these guidelines, trustworthy AI should be:

• Lawful: comply with all applicable laws and regulations.
• Ethical: respect ethical principles and values.
• Robust: both from the technical and social perspective.

The guidelines put forward seven key requirements that AI systems should meet. They incorporate the ethical principles promoted by the EC and include human agency and oversight; technical robustness and safety; privacy and data governance; transparency; diversity, nondiscrimination, and fairness; societal and environmental well-being; and accountability.

In 2019, the United Nations (UN) launched its Centre on Artificial Intelligence and Robotics, under the UN Interregional Crime and Justice Research Institute (UNICRI), to monitor developments in AI and robotics, with the support of the government of the Netherlands. The center, based in The Hague, helps focus expertise on AI throughout the UN in a single agency. UNICRI initiated its program on AI and robotics in 2015. One of the leading agencies of the UN, the United Nations Educational, Scientific, and Cultural Organization (UNESCO), recently appointed an international expert group to draft internationally applicable global recommendations on the ethics of AI (UNESCO 2020). This action follows the decision by UNESCO’s 193 member states during its General Conference in November 2019.

15. https://www.pdpc.gov.sg/Help-and-Resources/2020/01/Model-AI-Governance-Framework.
16. Understanding artificial intelligence (GOV.UK): an introduction to using AI in the public sector. The Data Ethics Framework. A Guide to Using AI in the Public Sector: enables public bodies to adopt AI systems in a way that works for everyone in society (GDS and OAI 2019). Guidelines for AI procurement (GOV.UK): these procurement guidelines will inform and empower buyers in the public sector, helping them to evaluate suppliers, then confidently and responsibly procure AI technologies for the benefit of citizens.

>>> BOX 2 - Private Sector AI Principles

There is broad convergence on the adoption of AI principles in the public and the private sector. Several private organizations have adopted principles to enhance trust and transparency in the process of developing AI applications:

• IBM’s principles of trust and transparency state that AI should augment human intelligence rather than replace it, trust is key to adoption, and data policies should be transparent (Dignan 2017).
• Google’s principles on AI state that AI should protect the privacy of citizens and be socially beneficial, fair, safe, and accountable to people.
• The Asilomar AI Principles were outlined at the 2017 Conference on Beneficial AI organized by the Future of Life Institute and cover research, ethics, and values in AI. The 23 principles have been adopted and signed by 1,273 researchers and 2,541 other interested parties, including Elon Musk and the late Stephen Hawking.
• Organizations interested in joining the Partnership on AI must endeavor to uphold eight tenets and support the Partnership’s purpose. They include calls for an open and collaborative environment to discuss AI best practices, social responsibility on the part of companies delivering AI, explainability, and a culture of trust, cooperation, and openness among scientists and engineers.
• The AI4PEOPLE principles and recommendations offer concrete guidance for European policymakers to facilitate the advance of AI in Europe (Floridi et al. 2018).
• The World Economic Forum’s five principles for ethical AI cover the purpose of AI, its fairness and intelligibility, data protection, the right for all to exploit AI for their well-being, and the opposition to autonomous weapons (O’Brien et al. 2020).
• The IEEE’s set of principles places AI within a human rights framework with references to well-being, accountability, corporate responsibility, value by design, and ethical AI (IEEE 2019, 17–35).

The Institute for Ethical AI & Machine Learning adopted eight principles of responsible ML development to provide a practical framework to support technologists when designing, developing, or maintaining systems that learn from data.17

17. https://ethical.institute/principles.html.
Role of a Central Government Agency or AI Hub

The use of AI in many advanced digital governments is seen as a broader effort for the citizen-centric digital transformation of public services. A central coordinating agency is typically established and responsible for issuing the ethical principles and guidelines for trustworthy AI. It develops government-wide data strategies and policies to harness the power of AI. This is the essential first step in making advances in this domain to ensure commitment, governance, line-of-sight, and monitoring for the acceptable use of AI in the public sector. The AI policy should address key policy domains: research, talent, entrepreneur ecosystem, ethical standards, data access, AI in government, AI in sectors, and governance capabilities (World Bank 2020).

>>> TABLE 4 - The Role of a Central Agency in AI

Country | Agency or Program | Role
Canada | CIFAR (formerly the Canadian Institute for Advanced Research) | Leads the strategy in close partnership with the Canadian government and three new AI institutes: the Alberta Machine Intelligence Institute in Edmonton, the Vector Institute in Toronto, and Mila in Montreal. It is primarily a research and talent promoting institute, while the implementation of AI in the government is decentralized.
Finland | Aurora AI National AI Program | The program seeks to provide a holistic set of personalized AI-driven government services for citizens and businesses in a way that is human-centric and works toward their well-being as its ultimate goal, instead of being driven by the needs of the public authorities.
Finland | Finnish Center for AI | A joint partnership by Aalto and Helsinki Universities to promote AI research, talent, and industry collaboration. It also supports an AI accelerator pilot program and the integration of AI in the public service.
France | Joint Center of Excellence for AI | State-level agency to help recruit AI talent and to serve as an advisor and lab for public policy design.
France | Inter-ministerial AI coordinator | The coordinator’s role is to implement France’s AI strategy, including public sector transformation efforts, and serving as an interface between the public and private sectors.
Germany | German Research Center for AI | A major actor in this pursuit and provides funding for application-oriented research.
Germany | Plattform Lernende Systeme | Brings together experts from science, industry, politics, and civic organizations to develop practical recommendations for the government.
India | National Institution for Transforming India, Aayog program | Aayog adopted a three-pronged approach: (a) undertaking exploratory proof-of-concept AI projects in various areas; (b) crafting a national strategy for building a vibrant AI ecosystem in India; and (c) collaborating with various experts and stakeholders.
Saudi Arabia | Saudi Data and Artificial Intelligence Authority | Strategy approved in 2019 provides a core mandate to drive and own the national data and AI agenda to help achieve the government’s Vision 2030’s goals. To fulfill this mandate, the Authority and its sub-entities (National Information Center, National Data Management Office, and National Center for AI) will deliver on the promise to create a data-driven and AI-supported government.
Singapore | Digital Government Office | One of the leading agencies on AI, which also brings together research institutions and the private sector.
United States | The White House | Issued a memorandum to the agencies providing guidance on ethical principles and operating framework.
United States | U.S. Commerce Department’s National Technical Information Service | Delivers a fed-to-fed framework for data science innovation through partnerships with industry, universities, and nonprofits at the velocity of the government’s needs.

Source: World Bank.
The institutional arrangements for the implementation of AI could be centralized or decentralized. In several jurisdictions such as Canada and the USA, implementation of AI is delegated to the agency level, while the central agency issues the AI ethical principles, AI data strategy, and operating framework. The central agency may partner with the private sector and academia to bring in talent and do research. Table 4 presents an overview of the role of a central agency in several countries. Under centralized arrangements, governments create a hub within a central digitization agency to implement the AI strategy. The central hub pools scarce talent, partners with the line agencies, provides an AI lab, and develops alliances with academia, the private sector, and start-ups. Governments typically view themselves not only as service providers for citizens and businesses but also as orchestrators of public services through expanding public-private partnerships. This model is adopted by many economies such as Austria, Estonia, Israel, Saudi Arabia, Singapore, United Arab Emirates, and the United Kingdom.

In the United States, AI is both centralized under the federal government and decentralized among state governments. Centralization is enabled through the National Technical Information Service (NTIS) under the U.S. Commerce Department and the Federal Risk and Authorization Management Program (FedRAMP). The former is responsible for helping federal agencies rapidly analyze, manage, and implement scalable data solutions by leveraging an extensive NTIS network of technical talent from private industry, which is often difficult to locate in today’s competitive information technology landscape. FedRAMP’s mission is to promote the adoption of secure cloud services across the federal government by providing a standardized approach to security and risk assessment.

The central agency encourages and promotes agency-, ministry-, and department-level initiatives. U.S. agencies such as the IRS, Treasury, and General Services Administration (GSA) have their own centers of excellence focused on agency-specific AI solutions. The National Security Commission on AI Strategy focuses on defense, security, and war. Regardless, state and municipal levels aggressively pursue independent AI initiatives, primarily for land management, tax revenue management, and fraud detection.

The Canadian government tapped CIFAR (formerly the Canadian Institute for Advanced Research), a global research organization based in Canada, to lead the development of its Pan-Canadian AI Strategy. CIFAR is focused on ethics, research, and talent promotion, while implementation is done at the government agency level.

AI Operations Framework

The central agency responsible for leading the AI initiatives generally provides an operating framework. It guides agencies and departments through the steps for operationalizing AI: defining the idea with a problem statement; conceptualizing the problem with experts; proposing a solution to the problem; developing a proof of concept; and implementing the idea through iterative stages. The framework focuses on integrating AI into operations to produce efficiencies, enhance quality, or augment data-driven policy capabilities. It also accounts for ways in which the solution will augment human decision-making capabilities by increasing the breadth of data beyond human comprehension. A key example is using NLP to analyze millions of policy documents from citizen sources and public records. The operating framework typically guides key implementation steps. Governments may customize the framework contextually, but overall, it could include six components, as presented in Table 5.

>>> TABLE 5 - Operating Framework

Component | Description
Ideate | The problem statement is produced in detail. The statement is agnostic to technology.
Conceptualize | The project manager coordinates discussions between small and medium enterprises and AI experts.
Propose | A detailed proposal is prepared. It contains the problem statement, potential solution options, and a checklist with a brief description of each to ensure alignment with legal, policy, and ethics risks, mitigation action, and expected results. A separate section on data sources is critical. Management approves.
Develop a prototype | The project manager ensures technology teams work together with Small and Medium Enterprises (SMEs) seamlessly to develop a proof of concept. A prototype visualizes the solution with or without code.
Test | SMEs and technical teams test the system.
Develop and deploy | The system is developed full scale, tested again, and deployed for operational use. It is also integrated with the environment.

Source: World Bank.

The implementation steps are summarized as follows:

• IDEATE. The detailed problem statement involves leveraging subject matter experts. The problem statement should be technology agnostic. It captures sufficient detail and contextualizes the overall strategy and vision to maintain a clear line-of-sight.
• CONCEPTUALIZE. Domain, subject, and technology experts enter into discussion and conceptualize the technical components of the problem statement. These experts are either from the center of excellence in the government or the private sector. The output of this stage is a conceptual report that details how the solution will address the problem statement.
• PROPOSE. In this stage, the team formulates a proposal for the implementation. Typically, implementation partners are private sector firms, including start-ups, nongovernmental organizations (NGOs), and legislative and human rights experts with experience and knowledge of these solutions. The procurement framework engages these firms flexibly: not through detailed specifications, but on the basis of problem statements and a high-level solution concept that is amenable to change based on market response.
• DEVELOP A PROTOTYPE. The team selects an implementation partner and requests a working proof of concept. This software demonstrates how the solution will work as a vertical, without pursuing full-scale production deployment, customization, and data migration.
• TEST, DEVELOP, AND DEPLOY. The proof of concept typically goes through several iterations, leading up to implementation, based on working feedback from the subject matter and domain experts who have participated since the early planning stages. Upon maturation, the solution is ready for go-live production as a pilot capable of scaling horizontally. An important operational issue in procuring AI is experimenting at the big-data scale. Traditional approaches to linear solution silos require detailed specifications that interfere with AI innovation, which involves many iterations, much experimentation, optimization, and iterative learning from performance tuning, since the immense scale of AI modeling yields unprecedented results beyond the scope of human capabilities.

Justification at the Conceptualization Stage

AI is not the solution to every problem. How should an organization evaluate the scope and needs of a problem statement to determine whether AI fits the playbill or is little more than a theater act? The American Council for Technology and Industry Advisory Council (ACT-IAC) AI playbook for the U.S. government offers a questionnaire for assessing the necessity and fitness of AI solutions. The playbook consists of five phases. “Phase 1, Problem Assessment” stipulates that a government must “[d]evelop a vision and business objectives through various assessments to ensure the AI solution addresses a specific use case and delivers results that optimize services and operational delivery” (ACT-IAC 2020). The inputs and outputs of this assessment are shown in Figure 10.

>>> FIGURE 10 - AI Business Case Assessment
[Figure: Goal: determine if AI is the appropriate technology to solve my problem. Three panels: Inputs, Assessment, and Outputs. The Assessment panel lists key activities under Management, People, Process, Technology, and Acquisition, and key outcomes under Engaged, Defined, and Planned.]
Source: ACT-IAC (2020).

On a granular level, a 14-point questionnaire accompanies the assessment phase and is put to stakeholders and key decision-makers. Answers fall on a scale of zero (not at all) to five (critical). A score of 18 or less indicates limited applicability and low return on investment; 19 to 40 indicates that AI could be applicable, but not without more in-depth analysis; and over 41 represents compelling applicability and significant benefits from a potential AI solution. The questions are:

• Does the use case clearly and accurately describe the problem to be solved?
• Does the use case accurately outline current processes in place?
• Does the use case align the goals and objectives with desired outcomes?
• Does the use case identify what data are required and available, accessible, and accurate?
• Does the use case need greater insight from the data?
• Has sufficient data been identified for the use case?
• Are the data from the use case annotated and curated? (Does the data contain meta-information?)
• Does your use case largely need manual process automation? (That is, to determine if only RPA [robotic process automation] is needed)
• Is there a predictive element to the use case? (Assumptions and testing made based on prior data)
• Have other technologies successfully been applied to address elements of the use case? (Could you somewhat solve your use case with an existing solution?)
• Are the data fit for purpose (descriptive modeling), and are they operationally relevant (predictive modeling)?
• Are the authoritative data sources of the use case organized, structured, deconflicted, and matriculated?
• Could the result of the use case change how conformance requirements need to be applied, for example, for personally identifiable information (PII) or classified data?
• Does the use case contain ethical considerations, and is there a potential for bias, for example in the data, algorithms, or aggregation process?

The implementation agency should assess the high-level governance conditions in Figure 11.

>>> FIGURE 11 - Operationalizing AI

1. Leadership: Is leadership committed?
2. Problem Statement: What is the problem, and is it strategic enough to have impact?
3. Data: Is data available, complete, and of good quality; can it be shared/interoperable?
4. Coordination: Are there internal coordination mechanisms with institutions that share the data?
5. Expertise: How can I access resources to fine-tune the problem statement, develop the idea, and engage the AI expertise to test the idea through the proof-of-concept (PoC)?

Source: The World Bank.

The operating framework should also address the issues of organizational roles and responsibilities. Entities implementing AI must identify key roles and responsibilities when designing the internal organization for managing AI, suitable to their context. At a minimum, these roles include:

• EXECUTIVE SPONSOR. Depending on the context, this role is the head of an agency, chief information officer, or department director. This role ensures compliance and alignment with the broader legal framework, policy objectives, strategies, and ethical considerations for AI. Also, this role develops coordination mechanisms with involved agencies.
• WORKING GROUP. Stakeholders from different departments whose data will be used, or who will be impacted by AI or have a stake in the solution, should be consulted at every step.
• SUBJECT MATTER EXPERT. Someone who understands the business process and its data, the core nature of the qualitative objectives, and the key results required for the successful implementation of an AI solution. This person does not need a background in AI to fulfill this role.
• DEVELOPER, AI ENGINEER, AND DATA ARCHITECT. An engineer with a mind for understanding the practical implementation of the AI infrastructure and engineering requirements. This person needs a background in AI software systems engineering.
• DATA SCIENTIST. A quantitative engineer who understands the data requirements for the project based on both qualitative and quantitative best practices that leverage statistical methods for assessing inbound and outbound data for bias and qualitative excellence. This person needs a background in AI modeling and should be a champion for data interoperability.
• PROJECT MANAGER. A project manager who manages teams, resources, results, and procurement in project planning at all stages of the project life cycle. He or she needs to be versed in AI systems engineering at a level of competency that will allow for the proper scoping of team objectives and key results. This person must also take overall responsibility for aligning expectations from the subject matter expert, developer, and data scientist so the policy liaison can properly construct a policy plan that ensures on-time delivery and overall project integrity.

These roles and responsibilities can be tailored to the context, but essentially, they should cover the following areas of activity:

• Oversight of the various stages of AI planning, budgeting, design, development, legislation, and operations.
• Integration of roles and responsibilities defined by an internal risk management framework.
• Procedures for data governance, transparency, and disclosure.
• Policies for information governance, which enforce security, interoperability, and access control among stakeholders.
• Oversight of data science and AI modeling procedures that emphasize documentation and explainability to stakeholders.

Stages of Technical Solution Development

The following concepts are important components that the stages of AI implementation must address.

• DATABASE COLLECTION. Collected data must be cleaned and checked for bias.
• SOFTWARE AND ALGORITHM DEVELOPMENT. Multimodal data recognition must be implemented to reduce discrimination, bias, and unjust consequences. Algorithm transparency must disclose the steps taken to explain the results.
• MODEL TRAINING AND EXCHANGE. Standardization and consistency offer practitioners the opportunity to exchange trained models without revealing sensitive data, while still offering explainable disclosures for the practical purpose of understanding results.
• TESTING AND VALIDATION. Fairness and bias testing must be evaluated against standardized test sets created with oversight from representatives of affected populations and stakeholders.

Procurement

Most governments acquire expertise from the private sector through innovative procurement methods. The private sector, in particular start-ups, brings cutting-edge expertise to solve complex public sector problems through AI. The implementation team must produce a broad overview of how they will customize procurement to these initiatives by using the procurement framework. Governments should consider adopting a set of guidelines and principles published by the World Economic Forum (see Table 6).

>>> TABLE 6 - Innovative Procurement Guidelines

Guideline | Principles
1. Prescribe a procurement process that defines the scope of problems and opportunities while allowing room for iteration. | b. Allow innovative procurement processes for AI systems development. c. Develop a clear focus with a specific problem statement. d. Avoid putting any energy toward defining the details of the solution. e. Support an iterative approach to product development.
2. Produce an RFP that publicly defines the benefits and costs associated with an AI solution while assessing risks. | a. Assess why AI is relevant to the problem. Be open to alternative technical solutions. b. Explain which public benefits are the main drivers in the decision-making process when assessing proposals. Consult with external experts if needed. c. Conduct an initial AI risk and impact assessment before starting the procurement process. Ensure that interim findings inform the RFP and revisit the initial assessment at key decision points.
3. Align procurement with relevant existing governmental strategies and contribute to their further improvement. | a. Consult relevant government AI initiatives on national, innovation, or industrial strategies. Review any guidance documents informing public policy about emerging technologies. b. Collaborate with other relevant government bodies and institutions to share insights and knowledge.
4. Incorporate potentially relevant legislation, policies, and codes of practice in the RFP. | a. Conduct a review of relevant legislation, rights, administrative rules, and other relevant norms that govern the types of data and kinds of applications in scope for the project. b. Consider the appropriate confidentiality, trade-secret protection, and data privacy best practices that may be relevant to AI systems deployment.
5. Articulate the technical and administrative feasibility of accessing relevant data. | a. Implement the proper data governance mechanisms at the start of the procurement process. b. Assess whether relevant data will be readily available for the project. c. Define data sharing policies for the vendor(s) during the procurement initiative and subsequent project.
6. Highlight the technical and ethical limitations of intended data uses to minimize issues with bias. | a. Consider the susceptibility of data and if the usage of the data is fair. b. Highlight known limitations (e.g., quality) of the data by consulting domain experts and require bidder(s) to describe strategies for addressing these shortcomings. c. Have a plan for addressing relevant limitations as they arise.
7. Work with a diverse, multidisciplinary team. | a. Develop ideas and make decisions throughout the procurement process in a multidisciplinary team. b. Require the successful bidder(s) to assemble a team with the right skillset and consult with the established domain experts.
8. Focus on mechanisms of algorithmic accountability and of transparency norms throughout the procurement process. | a. Promote a culture of accountability across AI-powered solutions. b. Ensure that AI decision-making is as transparent as possible. c. Explore mechanisms to enable the interpretability of the algorithms internally and externally as a means of establishing accountability and contestability.
9. Implement a process for the continued engagement of the AI provider with the acquiring entity for knowledge transfer and long-term risk assessment. | a. Consider that acquiring a tool that includes AI is not a one-time decision. Testing the application over its lifespan, adapting to new models, and extending to new datasets is crucial to success. b. Ask the AI provider to ensure that knowledge transfer and training are part of the engagement. c. Ask the AI provider for insights on how to manage the appropriate use of the application by nonspecialists.
10. Create the conditions for a level and fair playing field among AI solution providers. | a. Discover a wide variety of AI solution providers. b. Engage vendors early and frequently throughout the process. c. Ensure interoperability of AI solutions and require open licensing terms to avoid vendor lock-in.

Source: WEF (2019).
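The scoring bands of the ACT-IAC questionnaire described earlier in this chapter (14 questions, each answered on a scale of zero to five) lend themselves to a short worked example. The sketch below is illustrative only: the function and constant names are hypothetical and are not part of the playbook, and because the text leaves a total of exactly 41 unassigned ("over 41"), the sketch assumes it belongs to the top band.

```python
"""Illustrative scoring helper for a 14-question, 0-5 assessment.

All names are hypothetical. Only the question count, the answer scale,
and the band thresholds are taken from the text.
"""
from typing import Sequence

NUM_QUESTIONS = 14  # questions in the assessment
MIN_ANSWER = 0      # "not at all"
MAX_ANSWER = 5      # "critical"


def total_score(answers: Sequence[int]) -> int:
    """Validate the answers and return their sum (between 0 and 70)."""
    if len(answers) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} answers, got {len(answers)}")
    for answer in answers:
        if not MIN_ANSWER <= answer <= MAX_ANSWER:
            raise ValueError(f"answer {answer} is outside {MIN_ANSWER}-{MAX_ANSWER}")
    return sum(answers)


def applicability_band(score: int) -> str:
    """Map a total score to one of the three bands described in the text."""
    max_score = NUM_QUESTIONS * MAX_ANSWER
    if not 0 <= score <= max_score:
        raise ValueError(f"score {score} is outside 0-{max_score}")
    if score <= 18:
        return "limited applicability, low return on investment"
    if score <= 40:
        return "possibly applicable, needs more in-depth analysis"
    # Assumption: a total of exactly 41 is placed in the top band.
    return "compelling applicability, significant benefits"


if __name__ == "__main__":
    # Made-up answers for demonstration; not data from any real assessment.
    answers = [3, 4, 2, 5, 3, 3, 1, 0, 4, 2, 3, 3, 4, 2]
    score = total_score(answers)
    print(score, "->", applicability_band(score))
```

Keeping validation separate from banding means an agency could change the thresholds, or resolve the boundary at 41 differently, without touching how answers are collected.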
The procurement of AI expertise should be done within the procurement framework of the government, exploring flexibilities within the framework to ensure the best value for money. Practitioners should adopt an iterative and agile approach to developing a solution.

Innovative Procurement Examples

The U.S. government has launched two programs to facilitate the procurement of innovative solutions: FASt Lane and Startup Springboard. The FASt Lane program aims to facilitate and streamline the process for younger, innovative companies and suppliers to do business with the government. Under this program, suppliers get shorter processing times for specified contract categories (e.g., IT Schedule 70 contracts), including a 48-hour turnaround for contract modifications and a turnaround as quick as 45 days for new contract offers.

Under the Startup Springboard program, if a start-up does not have the required experience, it can use the experience of its executives and key professionals as a substitute for two years of corporate experience. Startup Springboard has one primary objective: helping federal agencies quickly gain access to the latest innovative technologies from fresh, vibrant private sector firms (Nakasone 2018).

Smaller digital economies also offer similar flexibility in their procurements. In Israel, the government issues challenge tenders that outline the problem statements, without the solution specifications.

The Government of Singapore launched a process called Call for Solutions. It entails sourcing ICT innovations through the evaluation of working prototypes and awarding contracts by stages to one or more suppliers. These multiple solutions are assessed in parallel through a series of pilot trials when the preceding stage or pilot proves successful. Facilitated by the Infocomm Development Authority of Singapore, this process will allow government agencies to collaborate more closely with the industry on ICT innovation needs. The EC adopted a similarly innovative approach.18 Figure 12 depicts the process in Singapore.

>>> FIGURE 12 - Singapore Procurement Model
[Figure: a five-stage process conducted via an open contest for crowd-sourcing of ICT innovation proposals. Stage one: issue Request-for-Proposal (RFP) in GeBIZ. Stage two: shortlist proposals. Stage three: evaluate developed working prototypes. Stage four: pilot trials. Stage five: implementation. The figure also marks Phase 1 (typical duration: 6 months) and Phase 2. Notes in the figure: an open tender process where the procurement principles of Value-for-Money, Competition, and Transparency are adhered to (where applicable); a single process for contracting from prototyping to implementation.]
Source: Reproduced from Annex A: Innovation Procurement for Singapore Government, Infocomm Development Authority of Singapore, available at https://www.imda.gov.sg/-/media/Imda/Files/Inner/Archive/News-and-Events/News_and_Events_Level2/20120531094015/AnnexA.pdf.

18. For more information, visit Shaping Europe’s Digital Future on the website of the European Commission at https://ec.europa.eu/digital-single-market/en/innovation-procurement.

Role of the Public Sector in Society

The public sector can have a much wider role in governing AI outside the government for society at large. Its many facets include promoter of a science, technology, and innovation culture that serves as a source of talent for AI; promoter of research in academic institutions; regulatory body for AI developments in the private sector; and promoter of AI by opening up its administrative and sectoral data to the private sector in machine-readable and downloadable formats to promote innovative use of these data. In this manner, the public sector can set the direction for the development of technology and set the rules for its application.

For example, in the United States, the White House issued Executive Order 13859, Maintaining American Leadership in Artificial Intelligence, to federal agencies to guide them on regulatory and nonregulatory oversight of AI applications developed and deployed outside of the federal government. The memo encourages the agencies to avoid regulatory and nonregulatory actions that needlessly hamper AI growth. It also provides guidelines on new regulations to ensure the principles of AI, as described in this paper, are adhered to in the private sector as well. It calls on agencies to facilitate private sector innovation and growth by giving the public access to agency data. This access should be open, public, and electronic according to the Open, Public, Electronic, and Necessary (OPEN) Government Data Act.

AI Operationalization in World Bank Projects

Task teams sometimes support Bank clients in experimentation and proof of concept for AI within the scope of World Bank projects. These engagements may be development policy operations, investment project financing, or advisory services and analytics. The Bank’s New Procurement Framework is flexible enough to allow experimentation with agile approaches, customized to the context; see Box 3.
>>> BOX 3 - Procurement: Important Steps to Consider

A few steps for developing the RFP and designing the procurement process are given below:

• Outline the problem, not the solution specifications. The problem must be agnostic to technology. Special consideration should be given to sources of data and their quality.
• Define the benefits or results, which are of strategic importance and impact.
• Align with existing legal frameworks, public policies, and government strategies. Ethics and associated risks should be assessed together with mitigation strategies. Risks should be managed, since they are difficult to eliminate or avoid.
• Constitute a working group or multidisciplinary team.
• Establish mechanisms for transparency and accountability of AI systems.
• Ensure knowledge transfer from the AI vendor.
• Ensure value for money and fairness through competition, especially for scaling up AI that will involve large investments.
• Ensure code ownership. AI vendors could standardize the code, make it agnostic to client context, and resell the license, as with any technology, to create a win-win outcome. Consider opportunities for open-source code sharing.

Source: WEF (2020).

5. Ethical Considerations

Managing AI ethics is important and unavoidable for the productive use of AI in either the public or private sector. Failure to address ethical considerations, in government and private-sector AI solutions alike, leads to public mistrust and potential backlash. Most of the discourse on AI is dominated by the power of technology to process data faster, learn faster, and propose or take actions automatically to increase efficiency and effectiveness. However, the societal implications of wider AI adoption have ethical dimensions that need to be understood and addressed at the outset.
This chapter focuses on those ethical dimensions needing a national-level policy response, while the technical risks to be mitigated at the level of implementing agencies were discussed in Chapter 3 on AI risks.

AI harbors the inherent risks of automating poor decision-making and hiding complex decisions behind opaque algorithmic logic. AI can also do harm, for example, through AI-generated disinformation campaigns on social media. Malicious actors may leverage AI to further strengthen their influence over society. Policy-level concerns on the ethical use of AI can fall under the following three categories (note 19):
• INEQUALITY. Bias in the use of algorithms, or resulting from a biased data pool, may reinforce negative bias toward vulnerable and weak communities and exacerbate inequalities; AI could also raise demand for higher-skilled labor and widen the returns to education, which may not be equally accessible in the first place.
• CONTROL. AI could increase the misuse of information, surveillance, and use in defense systems.
• CONCENTRATION. The concentration of power and wealth in a few actors could be aggravated through the net flow of resources into a few firms, and through success in achieving singularity, when machines become equal to or better than human general intelligence.

A detailed discussion of policy-level ethical issues is given below.

Inequality

AI may lead to specific job losses in both public and private sectors, more likely among lower-skilled workers, which has implications for the government to skill-up its workforce and to introduce policies that manage this transition. It is estimated that as much as 30 percent of today’s jobs will be replaced by AI and automation by 2030, and up to 375 million workers in both the private and public sectors worldwide could be affected by emerging technologies (McKinsey Global Institute 2017). While the impact in the governments of the World Bank’s client countries is likely to take place over time, it will also take a longer period to prepare the workforce for the future. According to one study, 50 percent of the activities people do can be automated by adapting currently demonstrated technologies. This could have significant implications for the use of automation in the public sector. To manage this change, a distinction should be made between human-replacing AIs and human-assisting AIs. Government policies should promote human-assisting AIs, rather than human-replacing ones. To offset the effects of AI, unskilled labor should be progressively diverted to sectors needing personal attention and care, including the health, education, and hospitality sectors (note 20).

The potential threat to low-skilled jobs in the private sector from AI is also a potential issue for the World Bank’s client governments whose comparative economic advantage stems from a large unskilled and semi-skilled labor force. Unlike the innovations of the past, AI solutions could be more labor-replacing than human-enhancing. German robots have already begun replacing workers in garment factories in Bangladesh (note 21). Chatbots are increasingly taking over call center work. It is estimated that 80 percent of customer interaction will be managed without human interaction. Autonomous vehicles could soon become a reality, with potential erosion of jobs for taxi, bus, and Uber drivers in all countries. On the optimistic side, countries could potentially increase productivity in sectors like agriculture, health, education, and climate change through human-enhancing use of AI. For example, AI can improve diagnosis through image recognition, increase crop yield through monitoring soil and crop health using drone-generated data on farming, and strengthen the fight against fraud and corruption through the reconciliation of data from multiple data sources.

To manage the labor market transition, the policy framework needs to be developed to show how investments in human capital and skills will be deployed and how equality of access to skills enhancement opportunities will be managed. Priority should be given to research, education, and skill development programs. Investing in such skills now for the future use of AI in the public sector is also important. Special emphasis should be given to managing equality of access and reaching groups vulnerable to missing these opportunities. This could include scholarships, apprenticeships, and research funding in computer science, STEM education, and AI-related disciplines such as data science for skill development. Governments could also create an innovation fund, loan programs through state development banks, and income-contingent student loans. Variations are already used in China and Brazil, and examples can be drawn from the experiences of Denmark, the European Union, Finland, Germany, Israel, and the United States (Mazzucato 2015). Governments could also initiate hackathons to promote opportunities for emerging talent and start-ups, as is being done in many countries, including Austria, Estonia, India, Poland, Pakistan, and the United States.

Control

One of the potential risks introduced by AI is who has control over the information and how it can be manipulated for certain outcomes. Developing policies early on to deal with the use of AI to misinform or mislead groups is an important issue. The use of fake news and targeted but distorted newsfeeds can have several consequences leading to polarization of ideas and groups in society and influencing political choices. AI-enabled social media bots can analyze millions of personality profiles by using cookies to track websites that people visit and deliver tailored news, including fake news, suitable to the profile. Fake or selected news can be used as a tool for manipulating political outcomes and discrediting a political opponent. Developing policies and legislation that define what is and is not acceptable, while at the same time balancing rights to form an opinion, is a complex endeavor.

Governments should develop or strengthen policies and agencies that cover the treatment of online propaganda, misinformation, libel, and cybercrimes. Agencies are required to monitor policy compliance; to track, prevent, and investigate disinformation to protect citizens; and to enforce compliance and sanction policy violations. Governments need to regulate and influence social media Big Tech companies (such as Facebook, Instagram, and Twitter) to ensure the appropriate use of AI tools and to take down content that is malicious, hateful, propagandist, and false.

The government’s use of AI to provide citizens with information about access to services also needs to be covered by a policy that governs the use and re-use of this information. This will mitigate the risk of misuse of this information. Handled correctly, AI has enormous potential to ensure appropriate targeting of information (for example, about how to access certain government services) to the groups most likely to be beneficiaries of such programs.

AI can also be used as a tool to track and surveil people, something that may be very helpful, for example in managing public health outcomes or reducing traffic congestion, but which also carries risks of excessive government surveillance that could infringe on human rights. The opacity around governmental use of AI as a surveillance tool makes it very difficult to assess the magnitude of the problem. According to Feldstein 2019, at least 75 out of 176 countries were using AI technologies for surveillance. Typical platforms for surveillance include smart cameras under smart city initiatives, smart police projects, and facial recognition systems for contact tracing to quarantine COVID-19 carriers. AI can be used to track the movement of employees to monitor performance in the public sector (police rounds) and the private sector (pizza delivery). Therefore, policies governing the privacy and rights of employees need to be developed to avoid misuse of AI.

Data privacy laws, transparency, and citizens’ voice should be strengthened to ensure that AI used for surveillance serves the public interest. Europe has adopted the General Data Protection Regulation, and many governments have legislation covering personal rights to privacy, personal data protection, and civil liberties, but compliance and enforcement remain challenges. Disclosure of the information being tracked by AI and robots can be strengthened through existing transparency frameworks; together with managing the risks of misuse of such measures, this will pave the way for the productive use of AI in this domain for the public good, for example, to trace and identify those at risk from contact with a contagious disease.

Weaponized AI systems have the potential to increase the use of autonomous weapons in conflicts, requiring a specific policy to address the ethical use of AI in warfare. The control and use of autonomous weapons systems may in turn destabilize regions and increase potential conflicts as human costs may be reduced. Global military spending on autonomous weapons systems and AI is projected to reach $16 billion and $18 billion respectively by 2025 (note 22). The cost of drones advanced enough to defeat a U.S. Air Force fighter pilot in combat simulations is as little as $35 (note 23). AI principles of adoption emphasize human control and AI use for human benefit. The application of these principles to the use of autonomous weapons is an issue of global importance and coordination. Global governance through multilateral forums and international cooperation is needed to address these issues. The role of civil society in influencing the debate is also important.

Concentration

AI can also lead to increased concentration of wealth in the hands of a few individuals controlling the big firms. These big firms can finance expensive research and attract top talent through better financial incentives. They control not only AI research and talent but also the associated data center infrastructure through cloud computing. This concentration would put even more resources at the disposal of these individuals to influence public policy through campaign financing, lobbying, corruption, and influence peddling. It will also lead to a net outflow of resources from developing to developed countries, as most of these big firms are based in high-income countries.

19. See WDR World Bank 2016.
20. Stiglitz 2018.
21. Wall Street Journal.
22. Sander and Meldon, 2014; Research and Markets, 2018.
23. Cuthbertson, 2016.

6. Government’s AI Building Blocks

A whole-of-government, data fabric AI architecture is central to the technology vision of the government and forms the building blocks for the use of AI. The government’s approach needs to encompass interoperability, security, and continuity of architecture among AI systems designed for whole-of-government use.
A high-level understanding of the components and building blocks of AI systems provides common ground for exploring relevant entry points with technologists and for guiding the broad direction of possible solutions. Three key concepts that constitute the building blocks are (a) a whole-of-government architecture; (b) interoperability; and (c) data standardization.

Whole-of-Government Architecture

Most World Bank client countries are managing stand-alone legacy systems, often referred to as “silos.” These systems are not interoperable or have problems with interoperability. Since AI models need large amounts of data to work well, the “ideal” architecture needs the silo systems to feed data into a large distributed data storage repository, often referred to as a data lake. The data lake is then made accessible to various AI applications. A government aspiring to greater digital transformation should adopt a whole-of-government architecture, which is the de facto industry standard.

Siloed systems can be “stitched together” through a common data platform. A government has many ministries, departments, and divisions. Each one typically operates autonomously, but often reports to a central government agency. A data fabric is a similar concept (see Box 4 below). It has several data centers with many departmental computing resources. Each one operates autonomously, but each one reports to a central computing administration system using a standard set of rules or protocols for data storage, security, and processing. They are all “stitched together” using a common software platform that spans the whole of government. The data, though, remain separate and independent. This is a simple description of the kind of system that can lead to incredibly powerful capabilities in AI and data processing.
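As an illustration only, the idea of silo systems feeding a shared repository can be sketched in a few lines of Python. Everything named here is an assumption made for the example: the `records` table, the `DataLake` class, and the in-memory SQLite databases standing in for departmental systems are hypothetical and do not describe any particular government platform.

```python
import json
import sqlite3


def make_silo(rows):
    """Create an in-memory SQLite database standing in for one legacy silo (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO records (payload) VALUES (?)", [(r,) for r in rows])
    conn.commit()
    return conn


class DataLake:
    """Toy stand-in for the shared repository: an append-only list of JSON lines."""

    def __init__(self):
        self.lines = []
        self.last_seen = {}  # source name -> highest record id already ingested

    def ingest_new(self, source, conn):
        """Copy records not yet seen from one silo; the silo itself is left untouched."""
        since = self.last_seen.get(source, 0)
        cursor = conn.execute(
            "SELECT id, payload FROM records WHERE id > ? ORDER BY id", (since,)
        )
        count = 0
        for record_id, payload in cursor:
            line = {"source": source, "id": record_id, "payload": payload}
            self.lines.append(json.dumps(line))
            self.last_seen[source] = record_id
            count += 1
        return count


# Two independent departmental systems and one shared repository.
tax = make_silo(["return-2019", "return-2020"])
payroll = make_silo(["salary-jan"])
lake = DataLake()

first_pass = lake.ingest_new("tax", tax) + lake.ingest_new("payroll", payroll)
second_pass = lake.ingest_new("tax", tax)  # nothing new, so nothing is copied again
```

Each silo keeps its own data and continues to operate independently; the only shared convention in this sketch is the common record format written to the repository, which is the property the “stitching” description relies on.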
Box 4 - Data Fabric in Brief

The term “data fabric” refers to the very large-scale continuous Big Data architectures used by some of the largest organizations in the world. A data fabric provides storage, computation, and security for organizations with exceptionally large data pools, such as governments and multinational corporations. A data fabric also supports distributed computing between multiple data centers spanning entire countries. In 2019, Gartner identified the data fabric among the top 10 trends in data and analytics technology (Gartner 2019).

The difference between Big Data and data fabric. Big Data systems are more uniform and monolithic, while data fabrics offer a common computing layer across a variety of systems that include these characteristics:
• One or more databases containing data from various sources (Big Data). The database and file system layers make up the data lake, as explained later.
• Application Program Interfaces (APIs) to connect with external government systems such as financial management information systems, payroll, integrated tax administration systems, and e-procurement.
• Data and cluster management tools, including:
  » Storage APIs for real-time (or batch) data ingestion, updates, creation, and deletion.
  » Data tools such as streaming, machine learning, and preprocessing systems.
  » Administrative tools for data access control, monitoring, and provisioning.

The general purpose of data fabric architecture is to unify data storage and AI computation across many independent government departments while keeping data safe from loss and protected from unauthorized access. A data fabric does not replace an existing architecture in one iteration. A government can roll out a data fabric over time and incorporate all the existing data systems into the fabric architecture, slowly replacing “old” walled-off legacy systems with “new” interoperable systems at its discretion. Figure 13 presents an architecture built atop a data fabric.

Figure 13 - General Data Fabric Architecture for Whole-of-Government Use
[Figure: government systems (Tax, FMIS, Payroll, Procurement, eHealth), each behind a firewall, exchange data with a data fabric cluster made up of an application layer, a streams layer (Kafka), a database layer (NoSQL), and a filesystem layer, connected through the Event API, DB API, and File API.]
Source: The World Bank.

A data fabric architecture has two high-level layers: government systems and the data fabric cluster. Government systems are shown in the gray box at the top of Figure 13. Inside this layer are all the government’s applications, which belong to the various departments. Each white box represents a department or division. The two applications on the right, Procurement and eHealth, represent commercial-off-the-shelf (COTS) solutions. All applications send data to, and some receive data from, the data fabric cluster. Existing applications, such as tax, FMIS, and payroll, share data with an AI application layer inside the data fabric cluster, which is at the top of the data fabric cluster portion of Figure 13.

The data fabric cluster layer is subdivided into four layers: application, streams, database (DB), and filesystem. The first layer, the AI application layer, holds all the custom AI applications inside the data fabric cluster. Underneath that, a streams layer ensures that data flows from one place to another inside the cluster in real time. Beneath the streams layer, a database layer gives departmental AI applications a place to store their rapid-access data. Lastly, beneath the database layer, the filesystem layer stores archival data and even larger data structures for long-term storage in blockchains and flat file systems similar to the hard drive on a personal computer, but scaled to handle the data needs of an entire country and all its citizens. Appendix A provides a discussion of the potential role of blockchain technology for government systems.

The standardized Application Programming Interfaces (APIs) are the threads that stitch the fabric together. Figure 13 represents these connections with bold arrows. They are labeled DB API, File API, and Event API. They are the core of interoperability for this architecture. The most successful large-scale operations, including India’s system for issuing a unique digital ID (Aadhaar), use this design.

Figure 14 - High-Level Data Fabric Architecture
[Figure: government systems (Policy, IFMIS, Procurement, eHealth) send data to the data fabric cluster. The application and streams layers form the AI layer; the database and filesystem layers form the data lake. The streams layer is marked low storage, high compute, mid memory; the database layer high storage, mid compute, high memory; the filesystem layer high storage, low compute, low memory.]
Source: The World Bank.

Figure 14 illustrates the resource utilization and performance requirements in a mature whole-of-government architecture. Resource requirements and utilization are major factors in determining the total cost of ownership (TCO). Here, the AI application layer and the streams layer make up the whole of the primary AI layer. The database and file system layers make up the data lake. The general takeaway from Figure 14 is the distinction between the AI layer and the data lake within a data fabric cluster. Any external AI solution can leverage data APIs to access data within the cluster from anywhere within the dominion of the whole of government. Also, the relationship between resource consumption and broader layers of architecture is not uniform, which allows for lower TCO. Moreover, the data fabric cluster remains independent of top-level government systems. Each department may have its services built into silos. All the computers within the data fabric cluster can have different capabilities and distributed locations.

Sometimes referred to as a data lake, a data fabric (though it is more than a data lake, as mentioned in Box 4) is made up of commodity hardware systems at up to exabyte scale (10^18 bytes) across a wide variety of architectural patterns: some cloud-based, some on-premise, and some in a hybrid configuration of both.

Data storage for AI is broken into three tiers: ephemeral, persistent, and archival. Each tier favors a particular subset of structured data, accessible through a standardized interface and abstracted into an accessible format through programmatic and algorithmic convention, which may be open source or proprietary. Storage in 2020 can be localized to one machine or one drive, or spread across geographies in sophisticated and redundant data topologies that distribute exabytes globally while offering access-control lists (ACLs) for strict management.
The storage tiers are considered part of the storage layer in the AI technology “stack.”

Besides the core AI architecture layers, a few other considerations play an important part in the technical design of the data fabric AI stack.

Interoperability Patterns

A data silo is an architecture that is isolated due to the absence of a common API for interprocess communication (IPC). Data silos can emerge from vendor lock-in, proprietary systems design, or poor planning. A system of silos lacks a common denominator to effectively allow for interoperability. Data are trapped in the silo. Over time, the silo will bloat and stagnate with information that could otherwise be utilized by AI systems.

Box 5 - Blockchain: Distributed Ledger Technology

The data in the persistent file system tier may be grouped into blocks of information and hashed with an identifier linked to a previous block: a blockchain. This blockchain forms an archival decentralized ledger. To further the utility of blockchain technology, also known as distributed ledger technology (DLT), applications relying on the use of a distributed ledger may prevent the completion of a transaction or asset transfer until enough computing nodes in the infrastructure reach consensus through Application Program Interfaces in real time. By distributing and requiring consensus among participating compute nodes, DLT essentially offers an immutable solution to asset tracking and transactional audit. Modern DLT solutions even offer safeguards against Byzantine attacks in which malicious agents attempt to gain majority influence within a network of computing nodes that are running DLT applications. Appendix A has more detailed information about how the blockchain plays an integral role in the long-term use of whole-of-government data architecture.
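The block linking described in Box 5 can be sketched as follows. This is a minimal illustration of the hash-linking idea only, assuming SHA-256 as the hash function and a plain Python list as the ledger; consensus among computing nodes and protection against Byzantine attacks are not modeled, and the function names are hypothetical.

```python
import hashlib
import json


def block_hash(index, prev_hash, records):
    """Hash a block's contents together with the previous block's identifier."""
    body = json.dumps({"index": index, "prev": prev_hash, "records": records}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, records):
    """Group records into a new block linked to the last block in the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "records": records,
        "hash": block_hash(index, prev_hash, records),
    })


def is_valid(chain):
    """Recompute every hash and check each link back to the previous block."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["prev"], block["records"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, ["payment 101 approved"])
append_block(ledger, ["payment 102 approved", "payment 103 rejected"])
valid_before = is_valid(ledger)

ledger[0]["records"][0] = "payment 101 rejected"  # simulate tampering with history
valid_after = is_valid(ledger)
```

Because each block's identifier depends on its contents and on the identifier of the block before it, altering an old record no longer matches the stored hash, which is what makes such a ledger useful for transactional audit.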
Various agencies and departments tend to pursue entirely independent solutions to solve narrow problem statements specific to their short-term needs. This common practice creates complex, pervasive fragmentation. As a result, interdependent organizational units end up with entirely independent systems that are isolated from one another in the long run.

Siloed systems can potentially become bottlenecks for data sharing that prevent useful implementations of AI. As a result, to discover trends and patterns with AI, departments must export enormous volumes of data to a centralized storage location, which is extremely time-consuming and costly.

Data silos stifle whole-of-government AI development, although they are preventable. This pattern is a consequence of siloed systems and reflects turf sensitivities and a lack of interagency coordination mechanisms. Luckily, there are solutions to address the issue.

Data silos are the opposite of scalable, interchangeable, and interconnected computing systems. They are rigid, limited, and isolated from other systems. Imagine a government in which various departments, ministries, or divisions did not speak a native or common language and could not communicate. These are silos. Successful large-scale deployments rely on the following patterns to compensate for a lack of interoperability between silos:
• Data exchange standards and schemas.
• Secure APIs.
• Cohesively interconnected layers of services using IPC best practices.
• Geographically distributed data centers within the data dominion.
• Architectural redundancy and replication.
• ACLs.

Box 6 - Actionable Insight: Data Fabrics Can Overcome Silos

A data fabric architecture prevents and solves problems arising from data silos. A central agency, responsible for government-wide digitalization, could deploy data fabric architecture to overcome silos.
It deserves serious consideration by governments wanting to harness the power of data by streamlining operations with a large-scale AI-ready infrastructure.

Data Standards

Data are the lifeblood of any AI architecture; they are the “gold.” As a form of untold wealth, data are worth sharing among stakeholders within an organization. To overcome entrapment in silos, interoperability plays a crucial role in successful AI systems development. Although enterprise computing solutions, such as enterprise resource planning (ERP), data lakes, and databases, often use compressed binary streams internally, there is a high probability that their data storage systems open a pathway to external applications through an API. Standards make this possible.

Instituting data governance arrangements promotes the standardization of data necessary for interoperability. Modern governments, like Estonia, create data governance councils and appoint data stewards in each agency to coordinate data standardization and interoperability. These arrangements are part of the data governance strategy that defines the authority and control over the data assets and includes policies, processes, standards, definitions, and data exchange arrangements.

Data standardization across agencies could also follow good practices of standardization internationally. These are more common in the private sector, though some models also exist in the public sector. In the private sector, several standards evolved using these practices. Programming, internet, and network protocols rely intensively on standards established by the Internet Engineering Task Force, the International Organization for Standardization (ISO), and the Institute of Electrical and Electronics Engineers (IEEE). The processes and substructures within these organizations are oriented toward developing a very uniform agreement between ground-level engineers responsible for implementing the technologies that drive the internet’s evolution. When a fundamental technological agreement is needed, consensus is reached through a Request for Comments (RFC), which contains guidelines for the implementation and use of the technology needing standardization and is subject to peer review. A complete RFC must contain core tenets explaining and enumerating every behavior and function in technical detail and depth. RFC practice was used in several global standards: the World Wide Web, JavaScript Object Notation (JSON) (note 24), and the Portable Operating System Interface, a family of standards developed by IEEE that provides a standardized protocol for communications within and between computing file system layers worldwide (note 25).

Good models also exist in the public sector at the international level. The Open Contracting Data Standard targets contracts in general and enables disclosure of data and documents at all stages of the contracting process by defining a common data model.

Box 7 - Actionable Insight: Governments Should Standardize Data

The central agency may develop standards for data formats and interoperability through engagement with line ministries. The creation of data governance councils and nomination of ministry- and agency-wise data stewards help ensure standard quality in data sharing. Also, engagement with stakeholders to develop consensus using Requests for Comments is successful among international standards organizations.

24. See Request for Comments (8259), “The JavaScript Object Notation (JSON) Data Interchange Format,” Internet Engineering Task Force, at https://tools.ietf.org/html/rfc8259.
25. The Open Group 2018.
Open, consistent standards and methodologies are the ground-level blueprints for a successful whole-of-government implementation of AI technologies. A global governance standard for data will likely emerge over time. This notion of a data standard can extend further to include suggested best practices for developing interoperable data fabrics within and across governments. Governments’ practices may differ substantially from one another, though the technical processes for accountability and integrity share common standardized infrastructural patterns.

By enforcing standards, the international community of policymakers can achieve an intergovernmental vision of AI interoperability. By leveraging standards for document data storage, and APIs common to the databases supporting various existing silos, governments can deploy system integrations that evolve continuously with the trends and advancements in AI at a national and international scale.

Access to data is the key to managing governments at all levels of AI deployment. The software platforms and solutions that do the actual computation often provide APIs that access standardized databases. Developers and data scientists alike may be able to access a data fabric over network interfaces and conduct experimental research that helps determine the proper course in developing permanent AI solutions to common problems in government. This will also provide avenues for more effective data collection, aggregation, experimentation, policy management, and access control.

Data are more accessible than governments may realize. Leveraging data stored in existing silos should be the essential tenet of any digital transformation strategy. The majority of ERP, custom-developed, or open-source solutions these days provide some type of data access control through direct communication with the database layers that these systems utilize. Therefore, siloed solutions do not require forced obsolescence either. Governments may continue to utilize them while they transition to newer data fabric oriented architectures. Limitations do exist among mainframe systems developed before the turn of the millennium, which require custom programming to extract data from COBOL (or common business-oriented language) and other flat file systems.

An application within a data fabric can query existing databases for new records and feed the data to an ingestion layer, which routes information to the appropriate hardware resource controllers within the data fabric. While it is true that granular data access can lead to a “spaghetti” dependency structure, entirely independent distributed services can be tuned to execute any arbitrary set of applications, especially whole-of-government AI models over the long run. More granular information about advanced connectivity is available in Appendix A.

In conclusion, a data fabric offers an intrinsically resilient, adaptive, and decentralized architecture that has no single point of failure. Trends are moving toward AI as an operating system among developed governments. The FedRAMP marketplace established by the United States, which provides vendors with stipulated standards, requirements, and guidelines for being authorized to provide cloud services to federal agencies, indicates the direction of the emerging trends.

Within whole-of-government systems, standardized access to data enables many types of practitioners to experiment and design all kinds of use cases for AI. In reality, the AI application layer can contain tens of thousands of AI models for all types of purposes. Each application can easily leverage all types of information simultaneously stored within the architecture, such as text, audio, video, and biometrics. This allows for better solutions over the long run by enabling a fail-fast approach using data access as a baseline. Ultimately, governments can develop long-term strategies in AI innovation that count on standards. There is little doubt that a tidal shift is unfolding for governments that are serious about improving their long-term strategic advantages in AI.

To proceed and formulate a more in-depth view of AI systems, see Appendix A. It dives into the core concepts of AI in practical applications. The concepts are meant to inform the reader of the basic, advanced, and real-world AI applications. Again, understanding these foundational concepts demystifies much of the jargon and hype orbiting the topic of AI. The key topics that Appendix A depicts in greater detail include:
• Project development patterns.
• Cloud, hybrid, and on-premise architectures.
• AI connectivity.
• Microservices.
• AI models.
• AI workflows.
• Distributed ledger technology (DLT) in AI architectures.

7. Conclusions

AI is still a new area even for many of the advanced digital economies, but its rapid diffusion in every facet of private and public life is increasingly visible. The enormity of development challenges requires exploring modern approaches, tools, and techniques. AI offers immense opportunities to address some of these challenges. However, it has inherent risks that can have profound consequences for society. Governments have to lead the efforts to manage these risks while promoting the use of AI in the private and the public sectors. This paper distills existing knowledge on these aspects for client governments. Conclusions as well as priorities going forward are highlighted here.

Human-centric AI design is a key principle to guide the development and deployment of AI.
AI will not eliminate human oversight in decision-making. Also, entirely externalizing decision-making using AI is unrealistic due to bias, which is impossible to eliminate but reasonably controllable. Public sector AI technology must remain under the guidance of humans because it has the potential to affect trust, human health, safety, and overall well-being. Fortunately, the state of the art in AI demands it, and all mission-critical AI deployments keep humans "in the loop" to varying degrees.

Governance and government practices benefit from transparency and evidence-based decision-making. AI systems must operate with transparency, human oversight, and neutrality while attempting to manage and disclose bias, which humans will never fully eradicate from AI solutions. However, well-managed AI solutions yield a repeatable model that may provide fundamental services through an open-source consortium of international collaborators. Therefore, while this paper encourages collaboration, any general government AI solution must take security, privacy, and data protection into full account to protect the sanctity and privacy of people and their governments. Currently, close to 135 governments are implementing privacy and data protection in their legislation, which applies to AI for the benefit of stakeholders.

The process of AI implementation is a journey. It starts with the most critical basic foundation: the acquisition, aggregation, management, and storage of reliable data. With quality data in hand, policymakers, data scientists, and AI engineers can perform introspective and comprehensive iterative deployments to expose the possibilities for full-scale AI systems. The journey requires coordination and collaboration between teams of stakeholders at all levels of government. It also demands that the outcomes earn the citizens' trust through disclosure, explainability, and transparency wherever bias is a concern. Where necessary, administrators may provision AI algorithm audits, especially in cases requiring forensic investigations.

Governments need to adopt a large-scale data fabric architecture to serve as the common denominator for standardized data interchange among a fully digital whole-of-government infrastructure. This approach enables robust AI solutions to grow and evolve with changing needs. The fundamental shift in the mindset of developing countries involves an emphasis on interoperability and IPC through standardization and API enablement.

The promise of AI is riddled with commercial marketing hype, but the fundamental value of the introspection cannot be overstated. AI systems offer a mechanism for qualitative predictions using quantitative measures of information. The various patterns of AI analysis provide tools for attacking a multitude of problems that are emerging in the face of increasingly intricate governance systems. Regardless of the flavor of governance employed, one thing remains clear: AI has the potential to revolutionize human intelligence in unprecedented ways. Despite the hype associated with being at the forefront of innovation by being the first to deploy one or more cleverly marketed solutions, the real focus should be on solving problems for internal governance and citizens. Also, government agencies must be willing to adopt standards and practices that enable fast and agile delivery, with an acceptable degree of failure risk.

A myopic view of AI is counterproductive. Immediate problems are like individual fires in a forest ablaze. Governments must avoid this tendency and commit to building a whole-of-government infrastructure that allows line agencies to operate interdependently. Systems at this scale require the collective efforts of nearly everyone in the scope of government influence to learn, trust, and invest. By creating fabrics of information, governments can promote their missions of better governance, transparency, accountability, and efficiency.

Priorities Going Forward

Based on the issues highlighted in the paper, several priorities could be considered by policymakers.

• Governments must adopt policies and governance frameworks that promote human-centric AI while maximizing opportunities. A few aspects of the policy framework are mentioned below:

  » Ethical AI requires the adoption of an AI policy and strategy. It could be tailored to specific settings but should be approved at the policy level to provide the authorizing environment. Governments in many settings have issued AI strategies approved by the parliament, president, prime minister, or cabinet. These policies should be based on ethical principles. Governance and operational frameworks are essential to specify broad guidelines and institutional arrangements. An innovation hub could be established to pool talent, establish partnerships with academia and the private sector, promote research, and facilitate experimentation by line ministries. The innovation hub should source the best talent through adequate incentives. Innovative procurement approaches should be adopted to leverage private sector skills with agility to allow iterative, problem-driven approaches to the RFP. The implementation teams should also manage the risks associated with AI, including bias, security, and unintended consequences, among others.

  » Promote transparency and accountability through inclusion and multi-stakeholder engagement at every step of the AI policy design and implementation. Affected communities and populations should be informed and provided with avenues for contesting AI logic without delays and hurdles.

  » Adverse ethical implications of AI could be managed through broader economic policies. These could include industrial policy, tax policy, competition policy, and human capital policy, among others.

  » These policies should also promote digital skills, education, and redeployment efforts to support people as they adjust to the shifting nature of work in the coming decades. Unskilled people and disadvantaged groups should be given special attention.

  » A policy framework to fight online propaganda, misinformation, libel, and cybercrimes should be given priority. Also, governments could establish agency mandates to monitor policy compliance and track, prevent, and investigate disinformation to protect their citizens. Engagement with social media Big Tech (Facebook, Instagram, and Twitter) should aim at encouraging the deployment of AI tools and professional fact-check partnerships to take down content that is malicious, hateful, propagandist, and false.

  » Strengthen privacy, data protection, and civil liberties and monitor compliance, which is typically weak in most settings. Full disclosure of information being tracked by AI and robots should also be promoted through stronger transparency frameworks.

• Investments should be made in human capital and digital infrastructure. AI research, digital skills, AI entrepreneurship, and foundational digital technologies could be prioritized.

  » Investments should be directed to fund research, education, and digital skills development programs in general and in AI in particular. They could include scholarships, apprenticeships, and research funding in AI, computer science, STEM education, and AI-related disciplines such as data science. Special emphasis could be given to disadvantaged groups such as women, minorities, and those at risk of being left behind.

  » Innovative entrepreneurship could be promoted. This could be done through an innovation fund, loan programs through state development banks, income-contingent loans for students or others, and small business loan programs. Variations of these funding modalities are already used in China, Brazil, Denmark, the European Union, Finland, Germany, Israel, and the United States (Mazzucato 2015). AI could be one of the areas to be incentivized through these programs.

  » The innovation hub should be staffed with the appropriate talent on market-based salaries. These skills are in high demand and could easily drain overseas.

  » Data fabric architecture, including interoperability, should be considered for investments. This will overcome silos and leverage data assets for decision-making, compliance monitoring, and analytics. The initial focus should be on interoperability, open data, and data standardization. A hybrid cloud option should be explored to leverage computing power at much lower cost to pilot AI solutions.

  » Proof-of-concept and pilot AI projects could be the starting point for exploring opportunities. Many governments have deployed AI to solve specific problems. Key use cases include citizen engagement, service delivery, regulatory compliance, decision analytics, fraud, and anti-corruption. Hackathons promote emerging talents and start-ups as seen in Austria, Estonia, India, Pakistan, Poland, and the United States.

• Risks should be identified and managed, rather than avoided. They could be mitigated through self-assessments, peer reviews, and inclusion.

Appendix A. AI Technical Primer

Appendix A explains a large subset of technical and operational details about AI project management, architecture, types, methods, and models, as a self-contained primer. This is based on the best industry advice from practitioners. The consolidated information herein intends to benefit the reader.
For even more information, practical guidance is available in many books, blogs, articles, and other technical resources.

Project Development Patterns

Agile Development

Iterative, agile development is the key to the steady adoption of any technology. The agile methodology offers an adaptive model for breaking down complex projects into manageable stages with discrete goals, short-term development intervals, and continuous delivery. Agile teams convene regularly, often daily, in scrum meetings to disclose incremental progress and dependencies.

Agile methodology is a longstanding backbone among organizations of all sizes. Agile offers a method for execution that complements a goal-setting methodology consisting of objectives and key results (OKRs). OKRs keep all levels of the organization, especially individuals, holistically accountable to the project. Because OKRs are typically disclosed publicly to all stakeholders, the whole organization may audit the development process for measurable progress. Of course, process management is not a panacea for projects attempting to reinvent the wheel, or parts thereof. To that end, there are ample turn-key solutions, each of which ships with a unique set of caveats to consider.

Iteration times have been steadily decreasing in the decades since the 1980s. Early waterfall-based methodologies "iterated" through projects over months up to a year. The 1990s brought the adoption of the Rational Unified Process, an early precursor to Agile Development and eXtreme Programming (XP). These advances in management timelines reduced development iterations down to two to three weeks. The unit of code development has also decreased significantly since the 1980s with the advent of service-oriented architectures and microservices. Today, continuous deployment techniques allow high-performance organizations to release microservice applications to production several times a day. Figure A.1 illustrates the changes in iteration time and code volumes.
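As a rough, back-of-the-envelope illustration of the iteration times described above, the following Python sketch converts each iteration length into an approximate number of iterations per working year. The 250 working days per year and the figure of three releases per day (standing in for "several times a day") are assumptions made only for this example; they do not come from the report.

```python
# Back-of-the-envelope arithmetic only, not a measurement.
WORKING_DAYS_PER_YEAR = 250  # assumption for illustration, not from the report

# Working days per iteration. The first three entries follow the text
# ("up to a year", "two to three weeks"); the last one ASSUMES 3 releases
# per day as a stand-in for "several times a day".
ITERATION_DAYS = {
    "waterfall, up to a year": 250.0,
    "RUP/XP, three weeks": 15.0,
    "RUP/XP, two weeks": 10.0,
    "continuous deployment, assumed 3 per day": 1.0 / 3.0,
}


def releases_per_year(days_per_iteration: float) -> float:
    """Approximate number of iterations that fit into one working year."""
    return WORKING_DAYS_PER_YEAR / days_per_iteration


if __name__ == "__main__":
    for label, days in ITERATION_DAYS.items():
        print(f"{label}: about {releases_per_year(days):.0f} per year")
```

Under these assumptions, the gap between one yearly iteration and continuous deployment is roughly two to three orders of magnitude, which is the scale of change the paragraph above describes.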
Figure A.2 illustrates the change in unit of code over the 15 years prior to 2020.

Figure A.1. Iteration Times and Code Volumes versus Time
[Figure: timeline from 1970 to 2010. Curve labels: Deployable Increment Size, Iteration Duration, Integration Duration, #Tools, #Models/Methods. Scale labels: Very Large Systems, Multiple Years, Many Months, less than 1 day, Continuous. Methods shown: Spiral, V-Model, RAD, Waterfall, RUP, XP, Lean, RIPP, Scrum, DevOps, CSE (Dupont), other Agile methods, Microservices. Annotation: "Monolithic architectures are not well suited to rapid deployment."]
Source: Reproduced with permission from ©Paul Clarke; further permission required for reuse.
Note: Paul Clarke, "Computer Science Lecture Notes," Dublin City University and Lero, the Irish Software Research Center.

Figure A.2. Unit of Code Scale Change
[Figure: technology stack by year, listed from top layer to bottom layer.]
2005: Client; Application; Application Server; Runtime; OS; Hardware
2010: Client; Application; Application Server; Runtime; OS; Virtual Servers
2015: Client; Microservices; Runtime; Containers; Cloud Managed; Cloud Managed
2020: Client; Code (Function); Cloud Managed; Cloud Managed; Cloud Managed; Cloud Managed
Source: Reproduced with permission from ©Paul Clarke; further permission required for reuse.

Wherever possible, problems will require a consistent process for reducing complexity and establishing a manageable scope of execution. Project management plays a critical role in establishing a track record of success for software developments at the national level. The temptation to boil the ocean and apply a blanket solution to a broad scope of operating requirements is often irresistible within novel undertakings. Project managers help mitigate risk and counteract scope creep by coordinating and elucidating the requirements and steps necessary for projects during the planning phase. This practice does not necessarily assume end-to-end project development of every possible use of a fundamentally valuable data infrastructure. Rather, project managers (PMs) solidify first-level operational requirements for a data infrastructure, the application layer, and critical external dependencies. These requirements capture the general requirements necessary for the future development of siloed executable applications, each assigned a dedicated PM who coordinates with the central development team to ensure consistent standards capable of supporting the various permutations of core and secondary systems. With proper coordination, standardization, and management among government stakeholders, government efforts will secure a development process that ensures that projects reach completion on timelines that span changes in elected officials and survive the varied political landscape.

Before stepping into the domain of applied solutions in other sections, this appendix to the paper will step through some processes for reducing and scoping problems into solutions. These are by no means a comprehensive list of project management best practices, but they facilitate understanding of the intricacies of software development planning and execution. These processes are also not intended to replace the acumen of an experienced project management professional. Each project has a specific set of requirements that accompany project-specific nuances. Furthermore, the number of variables involved during execution necessitates careful coordination, investigation, and execution by professional planners and managers. There is no one-size-fits-all solution in AI project management, just processes based on the type, scope, and timeline required for execution.

Project Management

Avoid solutions looking for problems. All too often, technologists invent a groundbreaking solution in a theoretical environment and apply the solution to problems that simply do not exist. As unimaginable as this may seem, the promise of technology may outweigh the actual benefit when practical solutions fail to emerge from concrete and battle-tested best practices, even if those practices are manual or fragmented in nature. Governments tend to silo operations within the scope of an agency or department due to budgetary firewalls. Solutions developed by localized operations successfully focus on the problem at hand, incorporating turnkey solutions and consulting opportunities for highly specialized services deemed vital to deliverables needed within a budgetary window of opportunity. This practice leads to fragmentation and a lack of interoperability, often solving problems many steps ahead of the current set of requirements.

Conversely, as experimental technologies emerge from academic organizations and professional firms alike, their applied implementations may be directed toward advanced problems that ignore the scope of current operation. This leads to the presentation of a technology that is inconsistent with the immediate requirements of any organization, much less those of the entire group of organizations comprising government. In brief, no one has engineered anything resembling a "government-in-a-box," but many consultancies come close to selling solutions as a panacea for the most mission-critical problems. They do so in a manner that opens a dialogue that often demands full-scale adoption of a technological product that requires immense customization or otherwise a total replacement of the existing infrastructure that is incompatible with the fragmented, siloed solutions described previously.

AI technologies and automations offer myriad possibilities for enhancing the decision-making process used within manual systems. Replacing a manual system from day zero is not necessarily the best solution because of the lack of introspection. By establishing proper ground-level data infrastructure, solution architects can coordinate with data scientists to study the quality of information produced by the government and carefully scrutinize prospective product solutions with a knowledge of the internal workings of data. Thus, teams can subsequently derive quick wins before the need for advanced analytics or sophisticated software emerges in actuality. Therefore, it is a wise strategy to organize data early on through the implementation of policies that standardize data within a very large distributed data fabric that supports further development and the integration of proprietary software tools by providing APIs for filesystem data access.

"Entities should not be multiplied without necessity." (William of Ockham)

Resource Management

Among human resources, create a stakeholder hierarchy that is largely decentralized across areas impacted by the infrastructure. This means that organizational decision-makers need representation from inception. Eligible representatives include IT directors, executives, and project leaders: people who are central to planning, policy-making, standards, and execution. Choose only the key members required to communicate and address project requirements. These members will take responsibility for communication with supporting members of their respective teams. Only including the necessary individuals occupying central roles prevents paralysis by analysis. Supporting team members will have the power to comment on policies and standards that emerge by issuing documents in RFC format. Teams review comments as they funnel into the project and conduct discourse to evaluate and settle on a final specification for project requirements. Ultimately, the top-level representatives ensure that the needs of their organizations are met. Standards may evolve over time to reflect changing architectural requirements.

Concerning AI systems infrastructure, prepare to manage tens, hundreds, even thousands of experimental projects of varying scale. There is no one-size-fits-all infrastructure that picks all the locks and opens all the doors. There is no panacea, no completely turn-key solution. AI requires work in layers of application infrastructure built on large volumes of data. There are no fewer than hundreds of tools available for AI engineering across a variety of free and open source and paid licensed solutions from private software firms. Whatever the path taken to develop a formal production model, supporting members of the AI engineering team will analyze data many ways during the process. Once in production, the compounding effect of various deep learning applications will continue to extend the infrastructure. By using commodity hardware systems on-premises and considering a hybrid cloud infrastructure for experimentation, administrators can attain considerable flexibility and adapt to the changing demands of uncertain futures. Capacity with headroom will ensure that new models and new data are able to proliferate. A safe general guideline is to maintain a minimum of 20 percent additional capacity, whether on-premise or in-cloud, in order to have burst capabilities for new initiatives in machine learning and artificial intelligence.

Continuous Deployment and Automation

The final key concept in understanding the rapid development of any technology, especially those in the space of AI microservices development within a data fabric, is continuous development and service deployment automation. The actual programming portion of large-scale systems development and deployment is a fraction of the entire delivery pipeline. This is important to note in light of the possible solutions that exist. Even COTS systems require continual development and releases to address bug fixes and the deployment of new features. It is therefore quite useful to understand the continuous deployment pipeline.

Figure A.3. Unit of Code Scale Change
Source: The World Bank

Essentially, any application within a larger architecture contains source code. Engineers specify criteria for the proper function of that source code in the form of various tests. Engineers store the code in source control systems. They then edit the code and commit changes to the central repository for their preferred source control system. The tests must run to validate the functionality of the application before new changes propagate to production. Every change requires that all tests pass in order for the change to be validated for release into the wild. Breaking changes prevent release. Passing changes merge into the master branch of the source code tree, and the product evolves. This is the continuous development pipeline.

Continuous integration (CI) systems automate this process so that engineers can work on the core of the product and continuously deploy code to production. The early days of manual testing are long gone for all modern software enterprises. Thus, the expectation today is that developers make new features and fixes in real time and deploy these to production without hesitation, sometimes several times a day among various teams managing various projects tied to the many applications supporting a microservices architecture.

To further the example, application containers offer engineers dependency management at unprecedented scale. Applications with dependencies for particular software modules with specific versions get packaged into small images called containers that are encapsulated to run in an isolated instance alongside many other application containers without overlapping dependencies. Thus, a small number of computing nodes can run a large number of application instances independently. This further reduces the complexity of dependency management among commodity infrastructures and lowers TCO over time. The initial investment of setting up this application deployment environment pays off handsomely for teams with limited resources such as those of government agencies, their respective contracting consultants, and in-house technical management teams. There is little to deny the virtue of pursuing this course for any government wishing to develop a long-term plan for successful AI infrastructure.

The following diagram illustrates the continuous integration and continuous deployment of an application using containers within Kubernetes, which is widely regarded as the gold standard in application container management among world-class software engineers and architects.

Figure A.4. Continuous Integration and Continuous Deployment Pipeline Workflow with Kubernetes
[Figure: "CI/CD Pipeline Workflow with Kubernetes," a flow diagram with three lanes. Developer: commit code, push to git (Git repo). CI Server: notices new code in the Git repo and starts running through its pipeline; build new Docker image; run tests; push new Docker image (Docker repository); update Kubernetes deployment. Kubernetes: receives request to use new image; pull new Docker image; create new pod; check pod health. New pod is healthy: delete old pod. New pod is not healthy: restart new pod; let old pod continue running.]
Source: ReactiveOps

Computing Architectures

Brief History of Computing Architecture

Computing history is a history of the levels of logical abstraction. Early concerns with physical disk sectors gave rise to operating systems. Low-level languages gave rise to dynamic modern interpreted languages, thanks to layers of abstraction. The same applies to architectures leading to the boom in artificial intelligence. Before diving into types of architecture, it is worth rewinding through a brief history to better understand how technology evolved.

As early computers and their programming languages gave rise to operating systems, a key principle of computer system architecture emerged; it was called the Single Responsibility Principle (SRP). SRP dictates that programs do one thing and do it well, work together, and handle text streams as a universal interface. In short, single-focus programs, when strung together, may perform a varied and complex assortment of tasks. A more detailed explanation may be found in The Unix Programming Environment, the book by Brian Kernighan and Rob Pike.

The engineering community largely forgot the SRP pattern in favor of the object-oriented paradigm during the late 1980s and early 1990s, as the languages C++ and Java gained popularity. The promised vision of object orientation and code reuse was never fully realized due to ironic problems with polymorphism. This period in history also gave rise to monolithic systems that often had millions of lines of code buried in one executable.

The 2000s brought the introduction of network-aware applications and the application server model, which was characterized by large monolithic code bases, massive relational databases (with stored procedures for query optimization), and Common Object Request Broker Architecture (CORBA) and Component Object Model (COM) for distributed communication and application interoperability.[26]

The revolutionary 2000s also gave rise to XML[27] as a means to configure and communicate. The open standards community developed the Simple Object Access Protocol (SOAP) as a "superior" alternative to CORBA and COM. Because SOAP is text-based, it had better interoperability, although it was still cumbersome compared to modern gRPC,[28] JSON,[29] and RESTful APIs.[30] Hence, the Software-as-a-Service (SaaS) model gained traction among Web companies, and the industry began its shift toward the Web as the primary application service delivery mechanism.

Increasing pressure to deliver rapid iterations of both customer and internal-facing software systems led to the shift toward open-source software systems, led by organizations such as Apache and GNU/Linux.[31] This shift was an irreversible bifurcation that took power away from enterprise leaders and democratized innovation at the hands of hobbyists, start-ups, and academics at an unprecedented rate. Where once everyone waited on centralized policy-making, now the community offered unparalleled power and agility that led to the advent of cloud computing.

Amazon Web Services launched its Elastic Compute Cloud (EC2) in 2006, Google followed suit with App Engine in 2008, and Microsoft Azure followed in 2010. In 2019, Amazon Web Services (AWS) reported revenues of $35 billion, indicating the extent of the seismic shift in the software industry, which sought out the most innovative software tooling developers for leadership and not the enterprise software vendors. As cost models shifted away from large up-front capital expenditures to lower ongoing operating costs, scaling and resources could be used and paid for on demand, and the entire deployment stack transformed into a DevOps[32] infrastructure as code with the advent of CI and continuous deployment services.

Open source software and operational expenditure fueled a resurgence of the Unix Philosophy and gave rise to the microservices architecture: many small, fine-grained services that perform a single function, all trying to achieve the goal of distributed networked components. Microservices gave rise to an engineering culture that embraces automated testing and deployment and embraces failure with unprecedented levels of fault tolerance. Microservices teams have the power to work on independent, deployable units of application code that are elastic, resilient, minimal, and complete. These applications scale individually and horizontally.

The underlying idea of microservices has existed since the 1970s. Distributed systems are a permanent and enduring realization of the power of decentralized, democratized computing as a process and superset of products. Modern cloud infrastructures are built on microservices. Rapid, continuous integration and deployment pipelines are the reason for the overwhelming success of cloud computing platforms, where new features can move directly to production without human intervention after passing a stringent series of automated tests.

To conclude this brief excursion through computer architecture and summarize the experience, it is evident that the main driver for the evolution of computing architecture is speed of deployment. The demand to get code into production is unrelenting. Early results lead to rapid innovations regardless of the product. Cloud-native computing services open the clearest path to achieving the goal of rapid engineering and deployment at blistering rates.

26. The Common Object Request Broker Architecture (CORBA) is a legacy binary communication protocol that was popularized in the early 2000s. The Component Object Model (COM) was a Microsoft specification and alternative to CORBA. RESTful APIs and gRPC replaced both technologies.
27. XML: Extensible Markup Language.
28. gRPC is a modern, open source remote procedure call (RPC) framework that can run anywhere. It enables client and server applications to communicate transparently and makes it easier to build connected systems: https://grpc.io/.
29. JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate: https://www.json.org/.
30. RESTful API design (Representational State Transfer) is designed to take advantage of existing protocols: https://restfulapi.net/.
31. The GNU/Linux operating system is free software that is an alternative to Microsoft Windows and macOS: https://www.gnu.org/.
32. DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality. DevOps is complementary with Agile software development; several DevOps aspects came from Agile methodology.

Basic Components of AI Architecture

A nation's production of AI infrastructure requires that data centers be built on commoditized goods and services. The key to infrastructure cost savings and operating strategy is the commoditization of goods and services. By definition, commoditized hardware is limited to components that are readily available at economies of scale for general purposes in computing: drives, racks, switches, routers, coolers, power supplies, etc. Highly successful data centers use commoditized hardware to ease the procurement of parts for repairing equipment failures and performing system maintenance. Much effort goes into minimizing failure rates across large-scale computing infrastructure. Failure is unavoidable. Generally, data centers frown upon specialized computing infrastructure. Research shows that the type and utilization of commodity hardware can reduce failure by orders of magnitude. Commoditized hardware minimizes TCO, a key metric for financial viability of any data center. Analysts often use TCO for cost-benefit analysis when deciding whether to pursue cloud software systems such as AWS, Microsoft Azure, Google Cloud Platform, or others.

interface. The downside, however, is the lack of ownership and geographical disadvantage that the location of data centers presents. A lack of data centers is not uncommon in nations with nascent computing industries, and likewise limited domestic control of cloud-based data infrastructure. Nations with data infrastructure located in foreign nations face the real potential for disruptions to corporate agreements due to unanticipated geopolitical tensions spurred by sanctions during periods of conflict. Although this is uncommon, strategic vulnerability remains a key reason for the slow adoption of cloud computing in government infrastructure among developing nations.

Consider a hybrid-native cloud services approach in order to maximize redundancy and protect data. Cloud services offer undeniable benefits and minimize TCO by offering comprehensive lists of commoditized services on demand. One particularly valuable benefit is the ability to extend on-premise infrastructure with dedicated cloud infrastructure. The reasons for developing the hybrid infrastructure model are primarily centered around redundancy and specialization. Data redundancy is essential to successful operations. Systems fail in all environments, without exception, and data are always at risk for total loss. The operating cost associated with archival storage may not be equitable in the long run as data volume increases. Similarly, databases and data processing systems often require redundant nodes to guarantee serviceability and failover and eliminate any SPOF. Scalability relies on the mitigation of these factors. A hybrid model offers effective strategic reserves for growing infrastructural demands by providing burst capabilities for adjusting to unanticipated demand during phases of growth. Furthermore, growth requires investment in innovation, and innovation requires specialization of novel services. Rather than develop new service infrastructure in an on-premise environment, organizations can turn to cloud service providers, which offer a wide gamut of specialized artificial intelligence infrastructure suitable for experimentation using on-demand billing agreements. The incurred expense is limited to only what is used by the organization. In either redundant or specialized use cases, on-premise infrastructure gets extended over the network and organizations have the power to control the security and topology entirely.

General AI Architectures

Cloud computing originated with the need to run virtual machines on standardized hardware inside remote data cen-
Consider using cloud services wherever possible.
On the ters—what is commonly known as Infrastructure-as-a-Service upside, cloud services minimize TCO by multiples for opera- (IaaS). In the present day, cloud computing services span tions of all sizes, especially during early phases of develop- a vast array of on-demand services that address all sizes ment. Cloud systems offer a variety of commoditized hardware of computing tasks. Three major competitors dominate the and services. Cloud offerings range in complexity and com- global market for cloud-native services: Amazon, Google, and putational power; customers may purchase bare, dedicated Microsoft. These provide customers with similar service offer- systems and turn-key AI solutions alike through an all-in-one ings, which are listed in Table A.1. 73 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN > > > T A B L E A . 1 - Cloud Service Counts Services Service type AWS Google Azure Compute 14 8 17 Data and storage 13 12 12 Network 6 8 13 Developer 9 13 9 AI and Machine Learning 11 15 35 Other (e.g. IoT) 56 33 24 Total 109 89 110 Source: World Bank. The service landscape continues to transform as new technol- Overall, achieving the most effective AI IaaS model relies ogies enter the market, and ground-breaking work from pub- on understanding four pillars: Architecture, Development, licly traded technology companies continues to evolve in the Operations, and AI. To understand is to pursue the following open source communities, especially the work in AI and ML. respective inquiries with the goal of depth and breadth, un- Therefore, the goal for executive leadership is to understand derscoring a clear strategy of experimentation and execution. architectural principles and compose services into systems designed to achieve specific business goals. Systems may • ARCHITECTURE: What are the architectural patterns for target general solutions, such as storage for AI experimenta- adopting AI computing infrastructure? 
tion, or specific siloed solutions, that detect a specific form • DEVELOPMENT: What are the best development tools, of fraudulent activity within data streaming from a discrete frameworks, and best practices? source. By maintaining a general inventory of service types, • OPERATIONS: What are the best practices to deploy and practitioners can zero in on desirable results. manage services in production? • AI: What are the available ML/Data Services? How can problems be best solved with these tools? > > > F I G U R E A . 5 . - Pillars of Effective AI Architecture Architecture Operations AI Development Microservices CI/CD Data Science Frameworks RPA Logging Deep Learning Tooling Protocols Monitoring Chat Bots Debudding Messaging Performance GANs IDEs Queueing Analytics NLP Technologies Events Databases Text-to-Speech Cloud Service Data Models Security Speech-to-Text APIs Cloud Machine Learning Engineering AI Services Preprocessing Computation Prediction Networking ERPs COTS Source: The World Bank. 74 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN Function-as-a-Service Architecture Serverless software is the state of the art in software ser- tecture. It is important not to be misled by the term serverless. vice development today. There is no “official” definition of Data stores and persistent layers are available to FaaS appli- the term “serverless,” therefore the following is adopted as a cations. The point is that practitioners no longer concentrate working definition: on the infrastructure because the cloud vendor provides a layer of abstraction over all underlying infrastructure. “Serverless computing is a form of cloud utility computing where the cloud provider dynamically manages the underly- The principles of FaaS (serverless) computing architecture ing resources for the user of the service. 
It provides a level can be summarized as follows: of abstraction over the underlying infrastructure, removing the burden of management from the end user.” • On-Demand cloud functions replace all servers and con- tainers. Serverless software avoids the explicit creation and manage- • Preference goes to managed services and third-party ment of infrastructure, such as servers and containers. In- APIs over custom-built resources. stead, functions that are managed and run by the cloud ser- • Architectures are event driven and distributed. vice provider replace these traditional computer resources. • Engineers focus on developing the core product, not the This category of cloud computing services is called Function- low-level infrastructure. as-a-Service (FaaS), the overarching pattern serverless archi- > > > F I G U R E A . 6 . - Evolution of Architectures: A History of Computing Concepts Leading to Serverless Computing Serverless Function-as-a-Service Event-Driven Workflows Container orchestration Microservices 2010s Containers IaaS, PaaS, SaaS (Docker, 2013) (NIST Cloud Ref. Archi, 2010) Event-Driven Arch. Linux Containers Google App Engine 2000s AWS Cloud (2006) Workflow orchestration cgroups REST OGSA FreeBSD Jails VMWare ESX Grid Computing URI SOA 1990s (RFC 1630, 1994) (Pasik, 1994) Virtual Private CGI LDAP Server (NCSA, 1993) (RFC 1487, 1993) CORBA, Stored Procedures DCOM, OSF Workflow, BoTs 1980s DNS RPC Implementation 1970s Remote Procedure Call (RFC707, 1976) Virtualization Requirements Actor Model (Popek, 1974) (Hewitt et al., 1973) IBM CICS Transactions 1960s IMB VMs Func. Programming Concurrency Event Sourcing (CP-40/CMS, 1968) (McCarthy, (Djikstra, (McCarthy, 1960) Hoare, 1960s) 1963) Resources Code functions Naming/registry Functions Execution flows Events (Where to (How to (How to find the (What to (How to model (When to execute?) execute?) executable?) execute?) the program?) execute?) Source: van Eyk et al. (2018). 
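The FaaS principles summarized in this section (on-demand functions, event-driven design, no server management by the engineer) can be illustrated with a minimal Python sketch. It is not tied to any cloud provider: the `on` decorator, the `HANDLERS` registry, the `dispatch` function, and the `document.uploaded` event are hypothetical names introduced here for illustration, and `dispatch` merely stands in for the routing that a provider's runtime would perform.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]

# Hypothetical registry mapping event types to functions. On a real FaaS
# platform the provider keeps this mapping and performs the routing.
HANDLERS: Dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Register a function to run whenever an event of this type arrives."""
    def register(fn: Handler) -> Handler:
        HANDLERS[event_type] = fn
        return fn
    return register


@on("document.uploaded")
def summarize_upload(event: Dict[str, Any]) -> Dict[str, Any]:
    # Stateless: everything the function needs arrives inside the event.
    size_kb = event["size_bytes"] / 1024
    return {"document_id": event["document_id"], "size_kb": round(size_kb, 1)}


def dispatch(raw_event: str) -> Dict[str, Any]:
    """Stand-in for the provider runtime: decode the event, invoke the function."""
    event = json.loads(raw_event)
    handler = HANDLERS.get(event["type"])
    if handler is None:
        return {"status": "ignored", "type": event["type"]}
    return {"status": "ok", "result": handler(event)}


if __name__ == "__main__":
    message = json.dumps(
        {"type": "document.uploaded", "document_id": "doc-1", "size_bytes": 2048}
    )
    print(dispatch(message))
```

In a managed service, the team would write only the body of a function such as `summarize_upload`; registration, scaling, and invocation would be handled by the provider.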
Infrastructure-as-a-Service Architecture

Also known as cloud-native solutions, Infrastructure-as-a-Service (IaaS) is common to AWS, Azure, and Google Compute Engine. AWS offers a dedicated government infrastructure for qualifying customers. IaaS is based on the premise that infrastructure costs are significantly reduced when multiple tenants share and maximize rack resources in a data center. Tenancy does not affect security. Individual systems operate on virtualized machines referred to as Virtual Private Clouds (VPCs). VPCs are isolated from one another; the end user “sees” them as physical machines when really the resources are constrained according to a virtualization policy specific to the customer’s individual requirements and cost selections.

COTS AI Architecture

Most commercially available AI toolkits abstract the learning process with models developed for specific uses in a siloed environment. These pre-trained models require a specific dataset with custom features. Data inputs must have the prescribed features in order to realize accurate predictions. The end user simply needs to make known data available to a specific COTS product, and activities take off. This absolves the end user from possessing an in-depth knowledge of the underlying AI models, but still requires the end user to collect, clean, and provide as much data with relevant features as possible to the COTS solution.

Training is an important part of the AI process. Intelligence is the result of an emergence of outcomes that are trained and tested repeatedly over countless cycles of iteration, depending on the model and methods employed. Most of the computational cost, and the biggest barrier to entry overall, lies in the fact that training requires a large volume of data processing on compute-intensive resources. Therefore, it is beneficial to approach new problems with an understanding of the pre-trained AI models available from COTS and cloud service providers.

The cloud service providers discussed here have several applications and services available to attack common AI problems. These span a wide gamut of topics including document analysis, speech recognition, sentiment analysis, object detection, recommendation, and forecasting. Table A.2 lists common cloud AI services. This section of the Appendix discusses how to employ several of the services listed in Table A.2 to address notable problems in the final chapters, which contain practical examples of AI systems.

TABLE A.2 - AI Applications and Services

Application                    Use                                              Service
Natural Language Processing    Machine Translation                              AWS Translate
                               Document Analysis                                AWS Textract
                               Key Phrases                                      AWS Comprehend
                               Sentiment Analysis                               AWS Comprehend
                               Topic Modelling                                  AWS Comprehend
                               Document Classification                          AWS Comprehend
                               Entity Extraction                                AWS Comprehend
Conversational Interfaces      Chatbots                                         AWS Lex
Speech                         Speech-To-Text                                   AWS Transcribe
                               Text-To-Speech                                   AWS Polly
Machine Vision                 Object, scene, and activity detection            AWS Rekognition
                               Facial recognition                               AWS Rekognition
                               Facial analysis                                  AWS Rekognition
                               Text in images                                   AWS Rekognition
Others                         Time Series Forecasting                          AWS Forecast
                               Real-time personalization and recommendation     AWS Personalize

Source: Amazon Web Services

Cloud-Native Hybrid Architecture

Redundancy, persistence, and service access underscore the value proposition when considering whether to extend a native architecture into the cloud. All the major cloud service providers offer an assortment of general ML and AI services using an on-demand commodity, software-as-a-service (SaaS) model. The services include both software and hardware solutions for common problems in ML and AI such as image object detection, natural language processing, computer vision, and speech recognition. Although the on-demand price of services may appear to be a significant expense for any “24/7” solution, customers may pay only for the time and computation they utilize for a specific task, experiment, or stage within the project lifecycle.

A cloud-native hybrid solution allows discrete access control and infrastructure integration. Overall, the physical data center can extend its topology with a virtual topology in the cloud called a virtual private cloud (VPC). A VPC is nearly identical to a Virtual Private Network and is governed by access and security policies controlled by the government’s Development Operations (DevOps) Manager. The extension behaves as though it is on-premises. Whitelisted infrastructure communicates internally, using private DNS network addresses. DevOps administrators may enforce one or more firewall proxies to grant access to vetted external components and services.

Implementation occurs in several phases depending on the desired objectives and key results. First, the team determines the purpose of external cloud services. If there is a need for bespoke AI compute services, then managers may derive estimates from the computational shortcomings of existing local development processes. From these shortcomings, and per the information in the previous section on cloud architecture costs, DevOps Administrators may estimate the hourly TCO of cloud services based on the anticipated service requirements. Similarly, if there is a need for storage redundancy, then managers will estimate storage durability (level of redundancy) and availability (time-to-access), based on the current footprint and operating requirements. With anticipated estimates in hand, budgets may be appropriated and resources deployed according to the prescribed needs of the project. DevOps will manage the deployed resources and ensure that operating requirements are adequately resourced, a task that admins may choose to automate using cloud management tools for the long run.

By using cloud management tools, the size of cloud-native hybrid service architectures can expand and contract automatically, on demand. Although it requires an investment of time to develop a formalized topology of services and storage nodes, the benefits of investment pay off in spades over the long run. Once services are operating as planned, depending upon their intended use among citizens (public) or agencies (private), a configuration and template system will generally manage the scale of the infrastructure operation. Typically, a YAML (“YAML Ain’t Markup Language”) document (note 32) will contain the topology requirements for the entire system and, individually, for the containers that make up the constituent services.

Several containerization and instance management solutions are available through the major cloud service providers, but many of the best solutions are Free and Open-Source Software (FOSS). In particular, Kubernetes (K8s) paired with Docker containers is among the most well-regarded solutions for full-scale application infrastructure management. Docker containers are lightweight packages, built from disk images, that contain an application and all its software dependencies configured in a fully operational state. Containers will deploy on any computer (node) running Docker software. Docker reduces deployment time, eliminates system dependency management, and allows nodes with different operating systems to run “dockerized” applications stored in Docker containers. Capital allocated to DevOps stretches much farther with a managed cloud application cluster. K8s deployments are clusters of nodes running dockerized service applications that employ easy-to-use configurations to scale with minimal human intervention.

Cloud-Native COTS Hybrid Architecture

Starting small is adequate for long-term proliferation of successful solutions, yet there are hybrid alternatives with commercial off-the-shelf (COTS) solutions that may extend AI capabilities. Several commercial large-scale systems for transactional accounting and financial audit are available. Many rely on proprietary cloud infrastructures in foreign data centers. This places significant barriers to entry for government teams facing long-term goals of developing a convergent data infrastructure on-premises. Governments that consider making an investment in small-scale development once a broad data infrastructure strategy is in place may have a higher likelihood of long-term success. By establishing a formal iterative agile process, small-scale projects can spiral upward through versions. The key to iterative development is failing fast and often: projects that invest long time-spans to realize products at any scale become burdensome and fail to garner enough momentum to endure or provide value in the face of a changing economic and political landscape. Thus, it is important to start small and scale with experimentation through iteration in order to prove the effectiveness of novel solutions, especially those in artificial intelligence.

Turn-key solutions may present quick and easy wins, but need careful vetting to prevent unnecessary technical debt from sneaking up. More nuanced than the mere costs associated with cutting corners, technical debt is a hidden intrinsic cost of technological development, which emerges as bugs, unfinished tasks, improvements, features, and upgrades that accompany the process of engineering any software or hardware product. Technical debt is inherently unavoidable and impossible to eliminate. Yet, there are process management best practices for mitigating the risk of extensive technical debt. During initial stages, careful planning and scoping are the best measures for maximizing productivity without incurring debt. But some things cannot be anticipated, so it is commonly acceptable to commence in a small scope that addresses the key concepts that underpin a full-scale long-term solution.

Measure the costs associated with a cloud computing architecture in terms of TCO, which takes into account more than the cost of engineering and implementation. TCO factors in the cost of personnel, heating, ventilation, and air conditioning (HVAC), maintenance, monitoring, hardware, software, land, facilities, electricity, and innovation. By leveraging on-demand Infrastructure-as-a-Service (IaaS), projects forgo the cost of brick-and-mortar data center underutilization. Planning calls for expected server capacities that are guaranteed to fluctuate due to regular daily cycles of use. Moreover, although early projects demand fewer resources, planning for lateral growth to accommodate new deployments places a burden on resources that may naturally underwhelm the overall server infrastructure, leading to idle systems that demand step-wise investments for anticipated future demands. The overall efficiency of cloud-native architecture exceeds that of on-premise systems significantly, as illustrated by the graphs and statistics below, provided by AWS.

32. YAML (“YAML Ain’t Markup Language”) is a human-readable data-serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted.

FIGURE A.7
- Overall Efficiency of Cloud-Native Architecture
[Figure: four before/after AWS comparisons. 27.4%: reduction in overall time spend per user. 57.9%: increase in VMs managed per admin. 56.7%: decrease in downtime. 37.1%: decrease in time to market for new features/applications.]
Source: Amazon Web Services

FIGURE A.8 - Optimizing Cost of Providing IT Services and AWS Value
[Figure: three groups of statistics.
Cost: 62% more efficient IT infrastructure staff; 51% lower 5-year cost of operations; 6 months to payback.
Improved IT and Business Agility: nearly 3x more new features delivered; 25% more productive application development teams; 90% less staff time to deploy new storage.
Business Operations Impact: 94% less time lost to unplanned downtime; $36.5M additional revenue per year per organization; 14% increase in business user productivity.]
Source: Amazon Web Services

Cloud-native costs fall into one of four contract categories: reserved, partially-reserved, spot, and on-demand. Reserved instances require up-front payment for a period of one to three years with no additional monthly costs for central processing unit (CPU) and memory utilization. Partially-reserved instances require a partial payment for one to three years and then carry a reduced monthly service cost for CPU and memory. Both reserved options are contract-based solutions. Contracts can be sold at a rate prorated to the remaining duration of the contract period if a customer should deem the contract unusable. Contract options are also specific to machine types, which often ship with immutable memory and CPU configurations. Spot instances offer significant savings similar to reserved instances, but their availability is not guaranteed. Spot rates allow the customer to specify the maximum allowable cost per hour of VPC use for a given VPC configuration. Customers pay a reduced variable cost for the instance, but should the market cost exceed the customer’s maximum, the instance can terminate, causing potential loss of data. Setting a sufficiently high cost threshold keeps the VPC protected. Additional configuration can allow for persistent disk mounts that protect volumes of information from loss in the event of an unexpected termination. Spot instances are best for experimental development and skunkworks usage in which there are no risks to the general public. Lastly, on-demand instances offer no savings compared to reserved instances, but still remain very competitive with the TCO of on-premise deployments. The market determines the on-demand rate.

It is also important to note that disk drives (volumes), network input-output (I/O), monitoring, and dedicated VPCs are additional costs on top of the VPC cost in a cloud-native infrastructure. These are metered in fractions of a unit of computational payload (in bytes) and sold as add-ons, which still offer significant savings and added efficiency over the on-premise model.

The overall reduction in costs is compounded by the increase in efficiency of AI algorithms, which is outpacing the predictions of Moore’s Law. Moore’s Law states that the number of transistors on a microchip doubles about every two years, while the cost of computers is halved. This leads to exponential increases in computational power. Because research in AI algorithms is also increasing their efficiency, AI is outpacing Moore’s Law faster than expected. Figures A.9 and A.10 below illustrate the fact; the first shows efficiency, and the second shows compute, according to a study conducted by researchers at OpenAI.

FIGURE A.9
- Less Compute Required to Get to AlexNet Performance 7 Years Later - Efficiency Level
[Figure: training efficiency factor (linear scale, 0 to 50) by year, 2013 to 2020; 44x less compute required to get to AlexNet performance 7 years later. Models plotted include AlexNet, VGG-11, GoogLeNet, ResNet-18, DenseNet121, SqueezeNet_v1_1, MobileNet_v2, ShuffleNet_v1_1x, ShuffleNet_v2_1x, ShuffleNet_v2_1_5x, and EfficientNet-b0.]
Total amount of compute in teraflops/s-days used to train to AlexNet-level performance. Lowest compute points at any given time shown in blue, all points measured shown in gray.
Source: OpenAI

FIGURE A.10 - 44x Less Compute Required to Get to AlexNet Performance 7 Years Later - Compute (log scale)
[Figure: training compute in teraflops/s-days (log scale, 0.05 to 10) by year, 2013 to 2020. Models plotted include VGG-11, AlexNet, Wide_ResNet_50, ResNet-50, ResNext_50, DenseNet121, GoogLeNet, SqueezeNet_v1_1, MobileNet_v1, MobileNet_v2, ShuffleNet_v1_1x, ShuffleNet_v2_1x, ShuffleNet_v2_1_5x, and EfficientNet-b0.]
Total amount of compute in teraflops/s-days used to train to AlexNet-level performance. Lowest compute points at any given time shown in blue, all points measured shown in gray.
Source: OpenAI

Overall, the TCO for cloud-native and hybrid infrastructure makes a strong case for consideration in government systems, if only during planning and research phases of new initiatives, even after a government deploys the core on-premise infrastructure.
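The comparison with Moore's Law can be checked with simple arithmetic. The Python sketch below uses only figures quoted in this appendix (a 44x reduction in training compute over seven years, and a doubling every two years for Moore's Law). The function names are illustrative assumptions, and the calculation assumes steady exponential improvement over the whole period.

```python
import math


def doubling_time_months(improvement_factor: float, years: float) -> float:
    """Months needed to double, given a total improvement over a period of years."""
    return years * 12 / math.log2(improvement_factor)


def growth_factor(years: float, doubling_years: float) -> float:
    """Total improvement after `years` if performance doubles every `doubling_years`."""
    return 2 ** (years / doubling_years)


if __name__ == "__main__":
    algorithmic = doubling_time_months(44, 7)  # 44x less compute in 7 years
    moore = growth_factor(7, 2)                # doubling every 2 years, over 7 years

    print(f"Algorithmic efficiency doubles about every {algorithmic:.1f} months")
    print(f"Moore's Law alone would give about {moore:.1f}x over the same 7 years")
    print(f"Algorithmic gain relative to Moore's Law: about {44 / moore:.1f}x")
```

Under these assumptions, algorithmic efficiency doubles roughly every 15 months, against 24 months for Moore's Law, so the 44x gain is close to four times what hardware scaling alone would have delivered over the same period.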
Advanced AI Connectivity

The necessity of a common data interchange standard, as well as compact and accessible interfaces, in the design of an AI-capable infrastructure cannot be overstated when pursuing access to clean data processing pipelines.

Global Data Interchange

Plan globally for data interchange in the long run and act locally for data processing in the short term. The coordination and management of a digital infrastructure requires conscious effort to identify existing requirements and plan the digitization of the most mission-critical systems. These systems may not have a national scope. They may be localized operations devoted to managing reporting under an integrated financial management information system, procurement, or asset management and exchange. Whatever the entry point, stakeholders must carefully consider the design of the underlying architecture early during the planning phase, with a focus on accommodating long-term use of data to satisfy secondary operational requirements. Health, education, transportation, and public safety systems are examples of secondary operational requirements in a digital government environment.

Efficient Communication with Compact Data

In the early days following the shift from mainframe to distributed systems of computing, engineers began to address IPC between applications, computers, and data centers with often creative solutions. At the smallest scale, data communication between applications using XML and SOAP protocols allowed independently specialized applications to share information somewhat effectively. At a very large scale, immense data transfers required the physical transport of magnetic reels and hard drives over land to mitigate the total cost of network transfer. Today’s standards may require the occasional physical transport, but among emerging data technologies at massive scale, IPC is managed efficiently in real time using compact data standards and communication protocols. Two standards in structured data stand out above others: JSON and protocol buffers, also called protobuffers or protobufs.

JavaScript Object Notation

JSON is an object definition “language” standard that gives the practitioner the ability to define key-value relationships between any number of values, which may be primitive types such as strings, numbers, and Boolean values or complex types such as arrays and nested JSON objects. Engineers and data scientists refer to one of these comprehensive and completely self-contained units of information as a document. The attributes and values within a document are iterable (programs can “walk” the document to retrieve values) and mutable (programs can alter the values of the attributes).

JSON documents are the primary structure of document storage in several of the most successful databases and big-data storage solutions on the market. JSON is also the preferred format for data exchange among web services architectures throughout the world of software development. There are international standards for the structure and definition of JSON documents. More information about JSON is available at www.json.org, and many other resources exist that cover the subject matter exhaustively. The discussion of JSON in AI Architecture continues in the section on Leveraging Microservices.

Suffice it to say that JSON provides a very efficient IPC standard for virtually any application. It is fast, compact, semantically endowed for human consumption, and provides a low barrier to entry for practitioners in need of rapid interchange of data between specialized applications. In some instances, internal applications require even more performance and can accept a less readable payload, provided semantic interoperability is maintained. This leads engineers to consider protobuffers.

Protocol Buffers

When speed of interchange and consistent structure are crucial for mission-critical applications, such as those in finance or infrastructure management, protobuffers provide a valuable alternative to JSON. Protobufs are platform-independent, language-independent, extensible mechanisms for serializing structured data. Developers define the structure of the data once in a schema, and any application wishing to communicate with that data can simply use an API that is automatically generated for any supported programming language and platform.

A specialized remote procedure call framework called gRPC further extends the power of protobuf’s compact data interchange format with structured programmatic function definitions. With gRPC in place, IPC occurs over any network topology by leveraging exposed functions capable of ingesting and outputting protobufs. This means that highly specialized and compact application services can be built to communicate in real time and process large volumes of information for large-scale implementations of AI services. This is the technology at the heart of Google’s global infrastructure. The standards and software supporting this technological breakthrough are FOSS.

Other Data Formats

Other standards in data do exist. Many are proprietary to the systems that leverage that data. One notable data format is Parquet, a FOSS project of the Apache Software Foundation. The purpose of Parquet was to provide a columnar data access format that interoperates well with Hadoop software systems. Hadoop came to prominence in the 2010s after Google published a MapReduce white paper describing in detail the design and development of the original architecture that powered PageRank algorithms to prominence. The design described in the paper was reverse-engineered, and an open source MapReduce solution hit the market, leading to a trend in Big Data technology that never fully panned out for Hadoop and its consortium of supporters.

As a modern relic, Hadoop (and Parquet) technology proves primarily that industry hype can mislead practitioners into adopting solutions in search of a problem. In contrast, simpler, more streamlined, and effective long-tail solutions and patterns of application and data architectures continue to satisfy the requirements for modern Big Data best practices.

Leveraging Microservices

As mentioned, monolithic systems have critical faults that lead to eventual collapse, for reasons of obsolescence stemming from stifling complexity. Microservices, conversely, are a methodology of designing, architecting, and developing a wildly scalable infrastructure of highly specialized applications. Engineering teams focus on each application independently, while inter-process communication, especially when leveraging the power of gRPC, remains versioned through API standards. Thus, project management dependencies are limited to the scope of each independent component within the microservices application infrastructure. Teams can and often do work independently to achieve incredibly rapid results for very large-scale systems. Thus, the future of AI systems engineering and data fabric infrastructure rests on this fundamentally advanced pattern of application architecture development regardless of the course of physical deployment, be it in the cloud, on-premise, or a hybrid of both.

A large volume of information exists on the subject of microservices development. This appendix does not delve into the subject further, but rather uses it as a talking point to introduce the jargon that is essential to understanding the factors that allow solutions to be developed.

Advanced AI Models

Artificial Neural Networks

Neural networks are at the heart of advanced concepts in AI. Neural networks perform computations that derive potentially vast sets of self-selected features. Deep Learning relies on artificial neural networks (ANNs). First studied in the 1950s, ANNs have re-emerged today after several cycles of dormancy, due in large part to the copious amount of raw computing power available in the cloud. At their core, ANNs are organized layers of decision nodes called perceptrons. Numbers enter an input layer and exit through an output layer. Hidden layers exist between the two. The goal of ANNs is to iteratively learn weights for each perceptron layer and produce an approximation of the desired result in the output layer. “Deep” refers to the number of hidden layers in an ANN, which may be as few as seven or eight and can run to hundreds. Figure A.11 represents the basic ANN structure.

FIGURE A.11 - Basic Deep Neural Network Structure
[Figure: four successive layers labeled Input Perceptron, Hidden Perceptron, Hidden Perceptron, and Output Perceptron, each made up of neurons.]
Source: The World Bank.

FIGURE A.12
- AI and Machine Learning Algorithms and Applications
[Figure: taxonomy of artificial intelligence. Labels grouped under artificial intelligence: problem solving (search, constraint satisfaction); knowledge, reasoning and planning (logic agents, first-order logic, planning and acting, knowledge representation, probabilistic reasoning, decision making); learning (learning from examples, knowledge in learning, learning probabilistic models, reinforcement learning); communication, perceiving and acting (natural language processing, perception, robotics). Labels grouped under machine learning: supervised learning, unsupervised learning, reinforcement learning, deep learning, deep reinforcement learning, artificial neural networks, association rule learning, Bayesian networks, clustering, decision tree learning, genetic algorithms, inductive logic programming, representation learning, rule-based machine learning, similarity and metric learning, sparse dictionary learning, support vector machines.]
Source: Peter Elger and Eoin Shanaghy, AI as a Service, Manning 2020.

Natural Language Processing
Language is the most powerful and potent human mechanism. With great access to language data comes great responsibility. Today's commercial email, word processing, and voice communication tools are constantly scanning and interpreting human language with the goal of suggesting grammatical corrections, serving advertisements, and translating our conversations into written language. Smartphones and smart homes alike respond to words, sometimes when a passive conversation is "overheard." The annals of news reporting are at the mercy of suggestions catered to individual indulgences. At the heart of all this is NLP.

"Language is the foundation upon which we build our shared sense of humanity."
Dr. Arwen Griffioen, Senior Data Scientist - Research, Zendesk

Since around 2013, NLP and chatbots have gained presence nearly everywhere in society at large. Google search became smarter and more capable of interpreting more human-like inquiries. Smartphone auto-correct and auto-complete followed suit, and personalized phone assistants began to gain traction.

In government, NLP began to emerge as a tool for combating corruption and giving a voice to citizens. One project called Hack Oregon applied natural language processing to campaign finance data to find connections between political donors, because it seemed that politicians were hiding their donors' identities behind obfuscating language in their campaign finance filings.

Basic NLP systems weight term frequency by inverse document frequency (TF-IDF). These evolved to "chain" clusters of word frequencies in order so that predictions could be made about the best "next" word, an approach based on Markov chains. These conditional, probabilistic distributions have since evolved into very sophisticated systems of interpreting, "understanding," and formulating language into topics with semantic meaning using math alone. Fascinating barely begins to describe the power of NLP.

In government and beyond, the necessity of beneficial machines with prosocial behavior that leads to greater cooperation among actors remains a key focus of ongoing NLP research. Governments are able to leverage NLP for interfacing with citizens for many purposes, not the least of which is gathering information about the quality of service within the government. As news sources become increasingly aligned with the indulgences and personal preferences detected among patrons of various internet service providers, the quality of information revealed to citizens in relation to the government also falls into question. A simplified process of NLP operations in the AI algorithm is depicted below in Figure A.13.

>>> FIGURE A.13. - Chat Box Recirculating (Recurrent) Pipeline
[Figure: a four-stage loop that takes text in and returns a response string, connected to a database. Stage 1, Parse: tokenizers, regular expressions, tag, NER, extract information, reduce dimensions; produces structured data (feature vector). Stage 2, Analyze: check spelling, check grammar, analyze sentiment, analyze humanness, analyze style, CNN, GAN; produces scored statements. Stage 3, Generate: search, templates, FSM, MCMC, RBM, RNN; produces possible responses. Stage 4, Execute: generalize and classify, update models, update objective, update dialog plan, select response; works on scored responses. The database holds statements, responses, scores, and the user profile.]
Source: Lane, Howard, and Hapke (2019).

To highlight the many use cases, NLP makes it possible to review contract submissions, resumes, proposals, campaign advertisements, published documents, and financial transactions for authenticity with minimal bias. The mathematical models that enable these technologies to perform such important tasks fall outside the scope of this paper. Instead, a quick exploration of how an NLP architecture operates precedes later sections that enumerate the examples of NLP in action within government. NLP is among the most interesting topics in AI and will make a lasting impact on the way in which human beings interact with computers, organizations, the environment, and each other for decades to come.

Generative Adversarial Networks
Generative adversarial networks (GANs), introduced to the AI ecosystem in 2014 by Ian Goodfellow, enable computers to generate realistic data by using two separate neural networks. Although these were not the first computer programs used to generate data, their results and versatility set them apart from all the rest. GANs achieve remarkable and often alarmingly convincing results that were previously considered virtually impossible for artificial systems, such as the ability to generate fake images (and videos) with real-world quality. GANs can turn a scribble into a photographic image or turn video footage of a horse into a zebra, all without the need for incredibly large, painstakingly labeled data. A staggering example of how far machine data generation is able to advance because of GANs is the synthesis of human faces (see Figure A.14).

>>> FIGURE A.14. - Progress in Synthetic Human Face Generation, 2014–2017
[Figure: GAN human face generation, with sample faces from 2014, 2015, 2016, and 2017.]
Source: Brundage et al. (2018).

By 2017, GANs enabled computers to synthesize fake faces rivaling high-resolution photographs. Most notably, GANs produced fake videos of notable celebrities and political figures whose speech and countenance are virtually indistinguishable from real life recordings simply by "mutating" the face of any recorded individual to appear as the synthesized individual, as shown in Figure A.15. This is of particular interest to government policymakers because fake-news videos can be produced and proliferated by anyone with access to GAN modeling toolkits in order to misinform and manipulate the public with practically any video content imaginable.

>>> FIGURE A.15. - GAN Transformation of One Politician into Another (note 33)
[Figure: GAN transformation, with panels labeled INPUT, CYCLE-GAN, and RECYCLE-GAN.]
Source: Bansal (2018).
33. Watch the video: https://www.youtube.com/watch?v=F51RCdDIuUw.

GANs are a class of machine learning techniques that consist of two simultaneously trained models competing with one another as adversaries: one (the Generator) trained to generate fake data, and the other (the Discriminator) trained to discern the fake data from real-world examples.

The term "generative" refers to the overall purpose of the model: to create "new" data. The data that GANs learn to generate depends on the choice of the training set. In the example mentioned above, if a practitioner wants a GAN to generate images that look like the president of any country, they will use a training dataset of the president's face.

The term "adversarial" refers to the game-like competitive dynamic between the two models that constitute the GAN framework. The Generator aims to create examples that are indistinguishable from the real-world data in the training set: fake images of the public figure. The Discriminator judges the authenticity of the images believed to be the president. The two networks are continually trying to outwit each other: the better the output of the Generator, the better the Discriminator needs to be at distinguishing real examples from the fake ones. The term "network" indicates the class of machine learning models most commonly used to represent the Generator and Discriminator: deep neural networks. The complexity of the artificial neural networks employed varies from simple to extreme, and the results are deeply concerning for policymakers interested in preserving public trust and ensuring the safety of representatives of government charged with protecting national security.

GANs are explored further in the section about AI in policy. Before progressing to the topic of general artificial intelligence, it is worth noting that technological advancements in AI also raise concerns about voice synthesis in addition to image synthesis. Present day AI technologies allow the mimicry of human speech with relative ease. Thus, with moderate effort, AI models can produce human speech samples that are practically indistinguishable from actual human voice, furthering the concern over influence due to intentional disinformation spread through social media and the news. All hope is not lost, however, with the introduction of authentication mechanisms that practitioners can implement to prevent the loss of integrity for state-sponsored messaging using asset encryption and cryptographic signing, which produces a digital watermark on state-sponsored media. Despite this possible solution, as AI methods continue to improve, the need for more robust authentication and prevention mechanisms will accelerate in order to keep pace with more advanced methods of image and audio forgery.

General Artificial Intelligence
The ultimate goal of artificial intelligence is to emulate the intelligence of humans and animals by modeling the behavior of neural pathways and the brain. General artificial intelligence takes that goal one step further by pursuing the ability to learn how to learn. The mention of General AI conjures visions of a singularity and the domination of mankind by sentient machines. This is fodder for science fiction and cinema. Learning to learn (sentience) is beyond AI's current capabilities. Presently, all AI practitioners operate within the confines of artificial methods of guided learning and modeling based mostly on advanced statistical models built to process vast amounts of information in order to assist with specific decision-making goals. They cultivate interpretive data models that train computers to provide sound decision-making similarly to humans, by emulating the physiological design of the brain. General AI is a proverbial mecca on the AI horizon that aims to eliminate the need for human influence over the learning process. There are no known instances of this phenomenon in current employment among AI practitioners, although researchers have made significant contributions toward this ultimate goal. There is no doubt that current AI resources will be instrumental in the emergence of General AI, but the timeline for the realization of the singularity is uncertain. Therefore, it is outside the scope of the paper to explore this topic any further; many resources exist for those interested in learning more about General AI.
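The authentication mechanism mentioned at the close of the GAN discussion above (cryptographic signing of official media so that forgeries can be detected) can be illustrated with a minimal sketch. It is not a description of any government's actual system. It uses Python's standard-library HMAC as a simplified stand-in: HMAC relies on a shared secret, whereas a production deployment would more likely use public-key signatures so that anyone can verify media without holding the signing key. The key value and the function names `sign_media` and `verify_media` are hypothetical, chosen only for this example.

```python
import hashlib
import hmac

# Hypothetical secret held by the publishing agency (an assumption for this sketch).
SIGNING_KEY = b"example-agency-signing-key"


def sign_media(media_bytes: bytes, key: bytes = SIGNING_KEY) -> str:
    """Return a hex tag that binds the media content to the holder of the key."""
    return hmac.new(key, media_bytes, hashlib.sha256).hexdigest()


def verify_media(media_bytes: bytes, tag: str, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the tag for the received media and compare in constant time."""
    expected = sign_media(media_bytes, key)
    return hmac.compare_digest(expected, tag)


# The agency signs the original media once, at publication time.
original = b"official video frame data"
tag = sign_media(original)

# A forged or altered copy no longer matches the published tag.
tampered = b"forged video frame data"
print(verify_media(original, tag))   # authentic copy
print(verify_media(tampered, tag))   # altered copy
```

Any change to the media bytes, however small, produces a different tag, so a synthesized or edited copy fails verification unless the forger also holds the key.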
Real World AI Workflows
The real world AI workflow has five main components: data preparation, model building, evaluation, optimization, and predictions on new data. Although applying these steps has an inherent order, most real-world applications revisit each step multiple times using an iterative process. Practitioners first build a model from historical input data using a particular ML algorithm. Next, they iteratively evaluate model performance and optimize for accuracy and scalability to fit the project requirements. Last, they use the final model to make predictions on new data inputs to the system. Historic data helps build the model, and new data flows into the resulting AI model to create predicted data. Predicted data flows into data streams that may be useful in applications for additional computational workloads and eventual archival storage in distributed ledger technologies (DLT). This appendix touches upon DLT, a useful tool in combating long-term data tampering, in later sections.

Human-Out-of-the-Loop Workflow
The basic AI workflow, absent human oversight, is also referred to as an Out-of-the-Loop workflow. This simply refers to the fact that humans do not evaluate the predicted outcomes before applications take additional action. It is worth noting that this is a simplified representation of an AI system, and the following human-out-of-the-loop figure does not account for the steps one must take to optimize the AI model building process, such as feature engineering and model tuning. Overall, an Out-of-the-Loop approach is a useful way to approach non-critical decision-making systems such as recommendation engines and general classification engines (see Figure A.16). For mission-critical applications that result in consequential collateral actions, such as the detection of fraud and other criminal acts, practitioners must employ advanced AI workflows with human intervention built into the AI loop.

>>> FIGURE A.16. - Basic Out-of-the-Loop AI Workflow
[Figure: Basic AI Workflow. Historical data feeds model building, model evaluation, and model optimization to produce the AI model. New data arrives on an event stream and passes through the AI model to yield predicted data, which flows to a result event stream and, through an archival algorithm, to an archival blockchain.]
Source: The World Bank.

Human-in-the-Loop Workflows
Of the many forms of advanced AI workflows, the simplified (yet advanced) flow depicted in Figure A.17 has the same components as the basic workflow, but with an additional human element that improves model performance and prevents unintended consequences in mission-critical systems. Workflows with human intervention loops are referred to as human-in-the-loop and human-over-the-loop workflows.

>>> FIGURE A.17. - Advanced AI Workflow
[Figure: Advanced AI Workflow. The components of the basic workflow, plus a feature engineering step ahead of model building and a "~100% match" decision gate on the predicted data with YES and NO branches, a human intervention step, and vetted data.]
Source: The World Bank.

The Human-in-the-Loop workflow incorporates an additional logical step in the overall workflow that presents all predicted data to a human for further intervention. This logical gate may enrich the result data with additional information related to the outcome. There, a human being can review the information and select the proper classification. This is useful when there must be no doubt in the accuracy of predicted results. The Human-over-the-Loop workflow gates results selectively: results that demonstrate a high probability of accuracy pass through, and only the remainder are routed to a human for intervention. This allows humans to concentrate on other mission-critical tasks when the certainty of predicted results is high, and it allows for intervention when the quality of predicted results falls below a threshold of desired probability.

Both variants of the advanced AI workflow may feed human-vetted predicted results back to the historical data stream for model retraining before continuing into the data stream that captures and distributes results to subsequent applications. Vetted data are particularly useful for improving the quality of decision-making over time, because human intervention lessens the gap of uncertainty and provides the model with increasing accuracy.

Feature engineering is also present in the advanced AI workflow. All problem domains require specific knowledge when deciding what data to collect. This valuable domain knowledge can also be used to extract value from collected data. Creating new data from existing data is called feature engineering. This phase occurs prior to model building. Once the AI loop is functioning adequately, practitioners often find the majority of their time going into this part of the optimization process. This is the more creative part of developing AI solutions, since it requires imagination and knowledge to invent ways to improve the model by extracting hidden value from standard data. Common examples of feature extraction include converting dates and times to times of day, week, or year; location wrangling in conjunction with census data; and object detection in land use imaging data that is useful in classification.

The mention of data streams deserves some attention when discussing real-world AI. Data streams are an important component in hybrid AI architectures. During development, data scientists may load data from comma-separated or tab-delimited data files for cleaning and processing. In practice, data flows from input to output in a constant stream. Thus, the term "data stream" is applicable. It is fair to wonder "what" exactly streams the data. Data streams are usually event-driven applications that "listen" for specific events within the system architecture. These are powered by open source technologies, particularly Kafka, that consume data from various sources through a standardized API, such as those offered by a relational database or ERP system. Data streams are especially useful because they offer data to one or more authorized applications that are capable of "listening" to the data stream producers using APIs over the network.

Distributed Ledger Technology

Any mention of DLT among the general public stirs the topic of cryptocurrency and alternative currency markets, particularly Bitcoin. However, while Bitcoin is a particularly popular example of DLT, when put to novel use, the subject of DLT spans a much broader variety of topics. At the core, DLT is a distributed network of computing nodes that manage identical ledgers containing blocks of data. Each ledger contains blocks linked together in a manner that prevents tampering by enforcing consensus requirements across the distributed network. At a minimum, each block contains an index that references the block order, a timestamp that references creation time, a cryptographic hash that is a signature of the data contents "salted" with the previous hash, a hash referencing the preceding block, and the rows of data, which may be individually encrypted for added security.

DLT Architecture
A block is defined simply as a collection of batched data. Block size is determined by the fault tolerance of the blockchain network with respect to distributed denial-of-service attacks and other factors related to network capacity. The typical block size is 1 MB, but that can be tuned to the needs of a particular application. The data stored in blocks is typically metadata on the order of kilobytes in scale. Storing large files in blocks runs counter to the functionality of the standard blockchain. Typically, large files are stored in a filesystem while the information describing their contents, such as location, author, and perhaps a hashed checksum that serves as a signature for the integrity of the file, is written to the batched data buffer that eventually becomes part of the block. Data stored in a block can be encrypted and later deciphered upon retrieval. When batched data reaches the block size limit, the block is hashed using a cryptographic algorithm, and the blockchain algorithm places a request to add the block to the distributed ledgers throughout the network. Multiple requests may be placed from different nodes in the network; these are handled in sequential order, and the consensus mechanism aids in orderly propagation of data.

Because a block hash is unique to the data contained within the block, and the blockchain links blocks using hashes generated from the previous hash and the current data, any mutation to the data within a prior block will change the reference in subsequent blocks, thereby breaking the blockchain. Figure A.18 illustrates the design of a single computing node within the DLT network.

>>> FIGURE A.18.
- Basic Blockchain Node
[Figure: Basic Blockchain. Data flows into a blockchain application on a blockchain node, which maintains the blockchain ledger. Each block carries Index, Timestamp, Hash, Previous Hash, and Data fields, with the following sample values.]

Field           Block 0 [Genesis Block]     Block 1                     Block 2
Hash            dfd265ca281e96659f09954e    eae7274960de395b72a87959    426017c20ea19a03689c8e40
Previous Hash   null                        dfd265ca281e96659f09954e    eae7274960de395b72a87959
Data            ec805923                    8c9b5c82                    c3e83f08
                b3121395                    22ea779b                    a8e7d68f
                a3c96d6a                    a43ab504                    ae2be25c
                15d86fde                    10dfdb40                    8ab65a28

Source: The World Bank.

Blockchain security is compelling for archival data storage in government systems. Yet, a single node is insufficient for establishing a proper DLT network. The owner of one centralized node can simply alter any block arbitrarily and rewrite the hashes for all the subsequent blocks. Thus, the power lies in decentralization. DLT architectures distribute the entire ledger to nodes qualified to participate in the DLT network and require consensus before new blocks may be appended to the blockchain. DLT requires consensus to address the Byzantine Generals Problem, which arises when actors attempt conflicting actions such as overwriting or altering the blockchain with nefarious intent.

Consensus
There are several mechanisms to achieve consensus: proof-of-work, proof-of-stake, proof-of-bid, and the list goes on. The goal of each consensus algorithm is to validate the blockchain integrity before the block append operation is distributed to all the remaining nodes in the network. Energy consumption is the reason so many forms of consensus exist. A network consensus, particularly under proof-of-work, generally consumes a tremendous amount of computational power (and thus electricity) and utilizes a large amount of network resources. Therefore, it is imperative for the DLT network architecture to implement an efficient consensus mechanism. Figure A.19 illustrates the architecture of a DLT network. When one of the nodes in the network captures enough data to write a block to its local blockchain, it issues a consensus request to other nodes in the network according to the rules of the consensus mechanism. When the network reaches a consensus, the source node writes the block, and the block propagates throughout the remaining nodes in the network.

>>> FIGURE A.19. - DLT Network Architecture
[Figure: multiple interconnected blockchain nodes forming a DLT network.]
Source: The World Bank.

Government information systems can benefit significantly from the use of DLT for maintaining the integrity of mission-critical data. Procurement, FMIS, and other systems generating transactions are the primary beneficiaries. When a transaction is generated, AI systems can send data to archival DLT nodes for archiving. Archived data stored in a DLT architecture helps maintain integrity throughout the network of participating nodes, because tampering with any block breaks the hash links described in the preceding sections. Overall, a network of government agencies, or even departments within an agency, may become a stand-alone DLT network that is capable of maintaining, authenticating, and honoring long-term commitments to data integrity within the government. Should any participating node attempt to sabotage the integrity of the transactional blockchain archive, a mechanism can be established to alert overseers of the transgression so that the problem can be investigated and appropriate action taken to prevent fraud or corruption.

The most prolific government projects leveraging DLT for the purposes of distributed trust are those of central banks. Banks leverage DLT for Treasury Single Accounts that are host to foreign exchange transactions between central banks to speed the settlement of international exchange. Several experimental models are under consideration by the Monetary Authority of Singapore in ongoing research conducted through Project Ubin (https://www.mas.gov.sg/schemes-and-initiatives/Project-Ubin).

Additionally, governments can benefit significantly from implementing DLT along with AI processes in procurement and logistics. By tracing the procurement process with a distributed ledger, equipment, raw materials, and various critical resources can be transferred between parties with granular control.

>>> Appendix B. AI and the Sectors

AI is gaining traction as an invaluable tool in urban planning, resource utilization, energy management, and climate change. Several practical applications are in development due in part to successful academic research funded by private enterprise. This trend will continue as humans occupy more densely populated urban areas that make use of natural resources in all manner of ways. The scope of development and land use is enormous considering that human management of resources touches nearly every aspect of society in some form. Appendix B attempts to highlight many solutions that rely on AI for improvements in efficiency, scientific analysis, and prediction within the disciplines mentioned above. Figure B.1 illustrates the timeline of AI innovation in the environment and potential impact over the next 20 years.

>>> FIGURE B.1.
- Timeline of AI innovation in the Environment and Impact Over the Next 20 Years
[Figure: timeline marked 2020, 2030, and 2040 showing AI innovations for the environment and their expected impact. The individual milestone labels are not recoverable from the extracted text.]
Source: http://www3.weforum.org/docs/Harnessing_Artificial_Intelligence_for_the_Earth_report_2018.pdf.

The list of sectors covers energy, agriculture, materials science, transportation, climate management, and urban planning. The overall effort is toward a more effective feedback loop that mitigates risks brought on by overpopulation and resource scarcity in all of these sectors.

Agriculture

Agricultural innovators are currently using AI to model several interdependent factors in an effort to maximize food production yields. By consuming vast amounts of data on weather conditions, satellite and drone imaging, temperature, water use, soil conditions, crop rotation, and annual yields, AI systems are able to suggest optimal planting patterns that guide heavy equipment using geospatial precision. AI monitoring assists with managing water distribution during the growing season. As harvest approaches, AI leverages hundreds of thousands of data points on the ground from Internet of Things (IoT) devices combined with satellite or drone imagery to determine optimal harvest quality and accuracy, which minimizes food spoilage in the post-harvest supply chain. In parts of Africa and Asia, AI helps maximize food production given the increasing dearth of annual rainfall, which forces farmers to become more precise in their forecasting and planning. The use of computer vision in combination with deep learning methods can detect potential fluctuations in pests, disease, water shortage, and harvestability. This is all a part of an emerging discipline called precision agriculture.

More specifically, a project called Ag-Analytics is collecting farmland data in the cloud and making it available to farmers for precision agriculture. Ag-Analytics uses sensors to collect soil, tillage, and yield data for specific plots of farmland (https://analytics.ag/Home/HowItWorks). Microsoft Azure stores the data and shares the information with farmers through user-friendly APIs to lower costs, improve yields, and minimize the environmental cost of agriculture.

AI is also assisting with labor shortages in agriculture. As society becomes more urbanized, the supply of labor continues to move toward urban centers. Seasonal agricultural demand is faced with consistent shortages. Companies like Root.AI are developing robotic harvest systems to bridge the gap between supply and demand of labor during harvest. Advanced methods in autonomous robotics, computer vision, botany, and biotechnology form the basis for large scale operations capable of detecting ripeness and harvesting continuously at the peak of efficiency.

Chatbots also enable farmers to share information and resolve problems in the supply chain. The proliferation of chatbots in AI is made possible through the use of advanced NLP frameworks. Farmers can turn to chatbots for difficulties in production planning and resource management that are common to agriculture.

Agricultural monitoring by whole-of-government systems, using a data fabric, can leverage resource production and prevent state capture events from occurring in underrepresented regions. Many of the methods in agriculture are also relevant to mineral resources and energy production, so investment in these technologies is worth considering for the advancement of digital government systems.

Ecology, Climate, and Conservation

Deforestation and land degradation are major problems for ecosystems. Governments and NGOs are using AI to monitor the steady decline of forests worldwide. By using multi-agent AI systems (MAS) to model resource utilization scenarios, they can better understand the impact that agricultural expansion has on forest decline. MAS has the ability to manage complex systems with several stakeholders to allow the exploration of alternative forest and land management systems. Moreover, MAS serves as a tool for learning and understanding, rather than predictive analysis. Reinforcement learning (RL) methods using computer vision and transfer learning are most suitable for forest management and conservation.

Climate change stemming from deforestation also requires a comprehensive understanding of additional factors in the overall health of both local and global ecosystems. Several AI subdomains are necessary for the comprehensive analysis of such a monumental topic. Table B.1 illustrates the various subdomains relating to AI that are currently employed for climate impact mitigation.

>>> TABLE B.
1 - AI for climate impact mitigation Unsurpevised Interpretable qualification Uncertainty Time-series Computer inference Transfer analysis learning learning Control Casual Vision Other RL & NLP ML 1 1.1 1.1 1 1.1 1.1 1.3 1.1 1.1 Electricity Systems 1.2 1.2 2.1 2 2.1 2 2.1 2 2.1 2 Transportation 2.2 2.4 2.4 2.4 2.4 Building & Cities 3.2 3.3 3 3 3.1 3.1 3.3 3 4.1 4.3 4.3 4 4.2 4.2 4.3 Industry 4.3 4.3 4.3 5.1 5.2 5.4 Farms & Forests 5.3 5.4 CO2 Removal 6.3 6.3 6.3 6.2 Climate Prediction 7.1 7 7.3 7 8.1 8.4 8.2 8.2 8.3 8.2 8.1 8.3 Societal Impacts 8.4 8.3 9.3 9.4 9.3 9.2 Solar Geoengineering 9.4 Tools for Individuals 10.1 10.1 10.2 10.3 10.2 10.1 10.2 10.2 11.1 11.2 11.3 11.2 11.1 11.1 11 11.1 11.1 Tools for Society 11.1 11.1 11.3 11.3 Education 12.2 12.1 Finance 13.2 13 13.2 Source: https://miro.medium.com/max/1400/0*7_Ilv_JRbf85ClQj. 94 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN The world’s oceans are under increasing threat due to Security (PAWS). PAWS uses AI to aid conservationists in the human overpopulation. A project called OceanMind is using fight against poaching by utilizing AI for learning, planning, satellites and AI to preserve biodiversity, protect the livelihoods and behavior modeling. PAWS collects information from previ- of fishermen, and prevent slavery in the fishing industry. It col- ous poaching activities and then generates predictions about laborates with governments to prevent illegal, unreported, and poaching locations and optimal patrol routes, resulting in more unregulated fishing by analyzing vessel movements in real effective patrols and better use of resources in the fight against time. AI algorithms detect anomalous behavior that Ocean- poaching endangered animal species (Fang 2013). Mind shares with regulatory agencies to direct ocean patrols more efficiently. 
More technical information about the goal of tackling climate change with AI is available from a technical report published In forest management and conservation, SilviaTerra is trans- by a consortium of researchers from many prominent universi- forming how conservationists and landowners measure and ties worldwide (Rolnick et. al. 2019). monitor forests (https://www.silviaterra.com). The system tracks an inventory of forest resources for the protection and management of ecological, social, and economic health. Sil- Urban Planning viaTerra uses AI frameworks on Microsoft Azure to study the effects of climate change and improve habitats using high-res- olution satellite imagery, U.S. Forest Service inventory, and In one prominent example, researchers leverage advanced field data to train AI models to measure forest values. methods in predictive analysis using AI for urban planning. By using cellular automata in conjunction with evolutionary al- In species conservation to fight extinction, Wild Me is leverag- gorithms and AI, a mathematical model for predicting evolving ing computer vision, citizen science, and deep learning algo- spatial patterns examines the impact of policy and geography rithms to power Wildbook (http://www.wildbook.org/doku.php). on the outcomes of various urban planning scenarios (Yang et. Wildbook scans and identifies individual animals and species. al. 2019). In plain English, this means they are using math to Wildbook is notably an open source platform. It provides scal- model the evolution of any urban environment over time. This able and collaborative wildlife data storage and management, framework optimizes Urban Development Demand by leverag- extensible easy-to-use software tools, API support, data expo- ing a model to synthesize changes in urban growth boundar- sure to external biodiversity resources, and animal biometrics ies (UGB). The model uses historical observations of different that support easy data access. 
This robust design for data in- time intervals and per-capita land requirements. Next, a patch- terchange using APIs makes it a stellar example of a system based cellular automata (CA) model simulates urban growth by that will integrate well with a whole-government data fabric ar- estimating urban development probability using a random for- chitecture. (Wildbook, Software to Combat Extinction) Another est machine learning algorithm (Figure B.2). project in the same domain is Protection Assistant for Wildlife > > > F I G U R E A . 1 8 . - Basic Blockchain Node Datasets Random subset 1 Random subset 2 Random subset 3 Random subset 4 Tree 1 Tree 1 Tree 3 Tree 4 Class 2 Class 2 Class 1 Class 1 Voting Outcome: Class 1 Source: Yang et. al. 2019. 95 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN The “patches” represent plots of land. Then, genetic algorithms optimize key model parameters, and finally the system aggregates land maps from multiple model runs to generate UGB alternatives. The random forest (RF) algorithm models a classification hier- archy using a strategy that creates a “forest” of individual decision trees. RF is hardly the only model in AI, but it is the most useful in this case. Each “tree” in the RF model makes independent decisions based on the feature variables and a random selection of observations derived from training data. Final outputs are the resulting averages of the decisions of the individual trees, which is considered a “voting strategy” that generates the resulting outcome. The RF method is insensitive to outliers, noise, and overfitting. Figure B.3 illustrates the workflow of modules within this predictive UGB framework. > > > F I G U R E B . 3 . 
- Workflow of Modules within Predictive UGB Framework Population Driving Land use/ Protected factors cover areas Trend extrapolations Random Forest Projections Probability of urban Ecological constraints of urban demand development for urban development PAT C H - B A S E D C E L L U L A R A U T O M ATA M O D E L Optimized using Genetic Algorithm Organic Proportion Organic urban Normal distributions Spontaneous urban growth procedure of patch size growth procedure Isometry D E L I N E AT I O N O F U R B A N G R O W T H B O U N D A R I O E S Simulated urban Morphology UGBs under land maps Operators different scenarios Source: Yang et. al. 2019. 96 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN Figure B.4 shows the CA patch generation function with a size of three cells. Think of the cells as patches of land with land use probabilities. Note again that the cells represent plots of land area. > > > F I G U R E B . 4 . - Cellular Automata Patch Generation Function with a Size of Three Cells 0.2 0.4 0.2 0.3 0.2 0.2 The seeding procedure 0.3 0.2 0.7 0.5 0.6 0.4 0.2 0.3 0.8 0.7 0.6 0.6 0.3 0.1 0.8 0.6 0.8 0.3 0.7 0.8 0.6 0.7 0.2 0.6 0.6 0.5 0.7 0.4 0.5 0.3 The urban development The pruning operation to Survival test to find seed A cell survives to be the probability eclude impossible cells from the candicate cells seed to initialize a patch The self-growing procedure Scaling to probability of The selected cell survives and Add the seed to the patch and Centrally placing a scanning overlapped neighbors using the is added to the patch, and its randomly select another cell to window on the seed and add its isometry parameter neighbors is added to the pool take part in the survival test neighbors to the candidate pool Randomly select another cell The selected cell don’t survive Randomly select another cell The selected cell survives and is from the candidate pool to take the test and is excluded from the from the 
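The seeding and self-growing procedures summarized for Figure B.4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by Yang et al. (2019): the function name `grow_patch`, the grid values, the 3x3 scanning window, and the way the `isometry` parameter scales neighbor probabilities are all hypothetical choices made for the example.

```python
import random


def grow_patch(prob, target_size, rng, isometry=1.0, max_seed_tries=1000):
    """Grow one patch of cells on a grid of development probabilities.

    prob        -- 2-D list of floats in [0, 1]; 0 marks a pruned (impossible) cell
    target_size -- maximum number of cells in the patch
    rng         -- random.Random instance, passed in for reproducibility
    isometry    -- assumed multiplier applied to neighbor probabilities
    Returns a set of (row, col) cells; empty if no seed could be found.
    """
    rows, cols = len(prob), len(prob[0])

    # Pruning: cells with zero probability can never be selected.
    candidates = [(r, c) for r in range(rows) for c in range(cols) if prob[r][c] > 0]
    if not candidates:
        return set()

    # Seeding: draw random candidates until one passes the survival test.
    seed = None
    for _ in range(max_seed_tries):
        r, c = rng.choice(candidates)
        if rng.random() < prob[r][c]:
            seed = (r, c)
            break
    if seed is None:
        return set()

    patch, pool, rejected = {seed}, set(), set()

    def add_neighbors(cell):
        # Scanning window: the 3x3 neighborhood centered on the cell.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                n = (cell[0] + dr, cell[1] + dc)
                inside = 0 <= n[0] < rows and 0 <= n[1] < cols
                if inside and prob[n[0]][n[1]] > 0 and n not in patch and n not in rejected:
                    pool.add(n)

    add_neighbors(seed)

    # Self-growing: test random pool cells until the patch is full or the pool is empty.
    while pool and len(patch) < target_size:
        cell = rng.choice(sorted(pool))
        pool.discard(cell)
        if rng.random() < min(1.0, prob[cell[0]][cell[1]] * isometry):
            patch.add(cell)        # survivor joins the patch
            add_neighbors(cell)    # and contributes its neighbors to the pool
        else:
            rejected.add(cell)     # failed cells are excluded from this patch
    return patch


# Illustrative probabilities only; they are not taken from the study data.
grid = [
    [0.2, 0.4, 0.2, 0.3],
    [0.3, 0.7, 0.5, 0.6],
    [0.2, 0.8, 0.7, 0.6],
    [0.0, 0.1, 0.8, 0.3],
]
patch = grow_patch(grid, target_size=3, rng=random.Random(0))
```

Passing the random number generator in as an argument keeps runs reproducible, which matters when a genetic algorithm compares many simulated land maps against the same observed map.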
Source: Yang et al. 2019.

Lastly, Figure B.5 illustrates the simulated and observed land maps in 2009 and 2016. Map “a” is the observed land map, while map “b” is the simulated solution with the highest fitness score, meaning the best fitting result of the genetic algorithm. Note the accuracy of the predictions. In the real world, this model was put to use in an undisclosed rapidly growing city in China and revealed high reliability in the simulation of urban growth and the delineation of UGBs.

>>> FIGURE B.5 - Simulated and Observed Land Maps in 2009 and 2016
[Figure: Part 1 shows 2009; Part 2 shows 2016.]
Note: Map “a” is the observed land map; map “b” is the simulated solution with the highest fitness score.
Source: Yang et al. 2019.

The patch-based CA model, which represents urban growth as an organic and spontaneous process, can simulate more realistic urban landscapes by coupling the spatial process with the pattern of urban development. The RF model can successfully show the relationship between driving policy factors and the urban development probability. Key model parameter calibration is achieved through genetic algorithms that capture the landscape characteristics of historical urban changes quite well and can therefore be used for future projections.

The results also suggest that empirical (observed) knowledge from historical observations can help the genetic algorithm avoid overfitting, to some extent. Although this model leverages simple population projection methods, the factors that drive future urban development can be further enhanced, and government planners can derive a deeper understanding and analysis of the resulting planning scenarios with more comprehensive data.

Energy and Smart Cities

The introduction of IoT in cities around the world enables the use of AI in the management and planning of electricity. IoT smart meters transmit information over Wi-Fi using passive communication between a grid of meters, distributed in high and moderate density urban environments. The data allows energy companies to adjust electricity production to nearly real-time accuracy. Prior to the advent of smart grid technology, power companies needed to predict demand based on a combination of weather and temperature forecasts and almanac predictions. This led to massive inefficiencies and wasteful production of electricity.

The invention of large scale data processing systems and the introduction of data fabric infrastructure allowed power companies to transition to consuming massive pipelines of information about electricity use, thereby reducing the impact of electricity production on the environment and improving the overall efficiency of the electricity marketplace. The utilization of electricity in modern cities is now burgeoning as a result of the widespread transition to connected electric vehicles, which serve as distributed reserves of electricity.

One notable example of AI in a data fabric stems from the inception of SmartGrid AI systems using a large scale data layer that transformed power grid utilization in Ontario, Canada. The project serves over 70 regional distribution companies, handling reads from over four million meters and processing over 100 million transactions per day (KX Systems 2014). A similar project is called FinGrid, which is run by the primary transmission provider for Finland (KX Systems 2018). FinGrid will process data from 3.7 million locations to deliver 15-minute imbalance settlements between electricity suppliers and consumers, an EU regulatory requirement, by December 2020. This architecture is called DataHub. The data migration and go-live planning are important examples of how existing systems can transition to entirely new data architectures with minimal disruptions to existing mission-critical services. Figure B.6 illustrates the proposed transition plan for FinGrid (Fingrid Datahub 2019).

>>> FIGURE B.6 - Proposed Transition Plan for FinGrid
[Figure: quarterly timeline from 2019 to early 2022. Data migration runs in three phases: Phase 3 (1/2020 to 12/2020), Phase 4 (1/2021 to 5/2021), and Phase 5 (6/2021 to 2/2022). Milestones along the timeline include the start of certification testing, consistency checks, two trial operations, market party certification, and a go-live dress rehearsal.]
Source: Fingrid Datahub 2019.
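The 15-minute imbalance settlement mentioned above can be illustrated with a small aggregation sketch. It is a simplified assumption of how such a calculation might look, not a description of the Datahub system: the function names `period_start` and `settle`, the reading format, and the definition of imbalance as metered energy minus scheduled energy per period are hypothetical, and real settlements also involve prices and balance responsible parties.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def period_start(ts):
    """Floor a timestamp to the start of its 15-minute settlement period."""
    return ts - timedelta(
        minutes=ts.minute % 15, seconds=ts.second, microseconds=ts.microsecond
    )


def settle(readings, scheduled_kwh):
    """Compute the imbalance per 15-minute period.

    readings      -- iterable of (timestamp, location_id, kwh) meter reads
    scheduled_kwh -- dict mapping period start -> energy scheduled in advance
    Returns a dict mapping period start -> metered minus scheduled energy.
    """
    metered = defaultdict(float)
    for ts, _location, kwh in readings:
        metered[period_start(ts)] += kwh  # sum all locations within the period

    periods = sorted(set(metered) | set(scheduled_kwh))
    return {p: metered.get(p, 0.0) - scheduled_kwh.get(p, 0.0) for p in periods}


# Hypothetical reads from two metering locations, "A" and "B".
readings = [
    (datetime(2021, 1, 1, 10, 3), "A", 1.5),
    (datetime(2021, 1, 1, 10, 14), "B", 2.5),
    (datetime(2021, 1, 1, 10, 15), "A", 1.0),
    (datetime(2021, 1, 1, 10, 29, 59), "B", 3.0),
]
scheduled = {
    datetime(2021, 1, 1, 10, 0): 3.5,
    datetime(2021, 1, 1, 10, 15): 4.5,
}
imbalance = settle(readings, scheduled)
```

In this example the first period is 0.5 kWh over its schedule and the second is 0.5 kWh under it. A production system would run the same grouping over millions of locations, which is why the text emphasizes a high-speed time series store.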
Figure B.7 illustrates the workflow of the data migration project for FinGrid using kdb+, which is defined here as a column-based relational time series database (TSDB) with in-memory database (IMDB) abilities. It is commonly used for high-frequency data that needs storage and retrieval of large data sets at high speed.

>>> FIGURE B.7 - The Workflow of the Data Migration Project for FinGrid Using kdb+
[Figure: market parties supply source data through the market party portal; the data passes through data checks, transfer, verification, and migration into Datahub, drawing on external data sources and reference data; check reports, migration reports, follow-up, support, and reporting flow back to the market parties. Labeled participants include Fingrid, Titta, and the service provider Solteq Oyj.]
Source: Fingrid Datahub (2019).

For more information about Datahub and Fingrid, visit the Fingrid website at https://www.ediel.fi/en/datahub/business-processes/business-process-other-datahub-instructions.

>>> Glossary

Big Data: One or more databases containing extremely large data from various sources.
Data Dominion: The scope of a government’s ownership and use of data, applications, and infrastructure defined by geographic, political, and national boundaries.
Data Fabric, Data Lake: An interconnected data storage infrastructure that provides a common set of interfaces and access control layers for “Big Data” operations spread across thousands of servers that may be geographically distributed.
Data Silo: An architecture that is isolated due to the absence of a common application programming interface (API) for inter-process communication.
Document: A self-contained JSON object specifying the attributes and values in a comprehensive unit of information that is iterable, transactable, and mutable.
Dummy Variable: A binary feature that indicates that an observation is (or is not) a member of a category.
Features: The input attributes that are used to predict the target, which may be numerical or categorical.
Feature Engineering: A form of machine learning optimization that leverages collected data to extract features.
Generative Adversarial Network: GANs are a class of machine learning techniques that consist of two simultaneously trained models competing as adversaries with one another: one (the Generator) trained to generate fake data, and the other (the Discriminator) trained to discern the fake data from real examples.
Ground Truth: The value of a known target variable or label for a training or test set.
Instance (or Example): A single object, observation, transaction, or record.
Model: A mathematical object describing the relationship between features and the target.
Online Machine Learning: A form of machine learning in which predictions are made, and the model is updated, for each example.
Preprocessing: The process of cleaning and correcting errors and inconsistencies in collected data. Also referred to as data munging or data wrangling.
Protocol Buffers, Protobuffers, Protobuf: A language-neutral, platform-neutral, extensible mechanism for serializing structured data.
Recall: Using a model to predict a target or label.
Supervised Machine Learning: Machine learning in which, given examples for which the output value is known, the training process infers a function that relates input values to the output.
Target (or Label): The numerical or categorical (label) attribute of interest. This is the variable to be predicted for each new instance.
Training Data: The set of instances with a known target to be used to fit an ML model.
Unsupervised Machine Learning: Machine learning techniques that do not rely on labeled examples, but rather attempt to find hidden structure in unlabeled data.

>>> References

ACT-IAC (American Council for Technology and Industry Advisory Council). 2020. Artificial Intelligence (AI) Playbook for the U.S. Federal Government. Fairfax, VA: ACT-IAC. https://www.actiac.org/system/files/AI%20Playbook_1.pdf.
ANI (Asian News International). 2019. “Bahrain and UK First in the World to Pilot New Artificial Intelligence Procurement Guidelines Across Government.” Business Standard, July 4, 2019. https://www.business-standard.com/article/news-ani/bahrain-and-uk-first-in-the-world-to-pilot-new-artificial-intelligence-procurement-guidelines-across-government-119070401389_1.html.
Bansal, Aayush. 2018. “Donald Trump to Barack Obama.” August 11, 2018. YouTube video, 0:06. https://www.youtube.com/watch?v=F51RCdDIuUw.
Berryhill, Jamie, Kévin Kok Heang, Rob Clogher, and Keegan McBride. 2019. Hello, World: Artificial Intelligence and its Use in the Public Sector. Paris: OECD Publishing. https://oecd-opsi.org/wp-content/uploads/2019/11/AI-report-Online.pdf.
Brundage, Miles, Shahar Avin, Jack Clark, Helen Toner, Peter Eckersley, Ben Garfinkel, Allan Dafoe, Paul Scharre, Thomas Zeitzoff, Bobby Filar, Hyrum Anderson, Heather Roff, Gregory C. Allen, Jacob Steinhardt, Carrick Flynn, Seán Ó hÉigeartaigh, Simon Beard, Haydn Belfield, Sebastian Farquhar, Clare Lyle, Rebecca Crootof, Owain Evans, Michael Page, Joanna Bryson, Roman Yampolskiy, and Dario Amodei. 2018. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation. Oxford, UK: Future of Humanity Institute. https://maliciousaireport.com/.
Buchanan, Ben. 2020. “A National Security Research Agenda for Cybersecurity and Artificial Intelligence.” CSET Issue Brief, Center for Security and Emerging Technology, Washington, DC.
https://cset.georgetown.edu/wp-content/uploads/CSET-A-National-Security-Research-Agenda-for-Cybersecurity-and-Artificial-Intelligence.pdf.
Bughin, Jacques, Jeongmin Seong, James Manyika, Michael Chui, and Raoul Joshi. 2018. “Notes from the AI Frontier: Modeling the Impact of AI on the World Economy.” Report, McKinsey & Company, Washington, DC. https://www.mckinsey.com/featured-insights/artificial-intelligence/notes-from-the-ai-frontier-modeling-the-impact-of-ai-on-the-world-economy#.
Chen, Stephen. 2019. “Is Fraud-Busting AI Systems Being Turned Off for Being Too Efficient?” South China Morning Post, February 4, 2019. https://www.scmp.com/news/china/science/article/2184857/chinas-corruption-busting-ai-system-zero-trust-being-turned-being.
Chilamkurthy, Kowshik. 2020. “Reinforcement Learning for Covid-19: Simulation and Optimal Policy.” Towards Data Science (blog), March 31, 2020. https://towardsdatascience.com/reinforcement-learning-for-covid-19-simulation-and-optimal-policy-b90719820a7f.
Coursera. 2019.
Craddock, M. 2019. “UN Global Platform.” Retrieved June 27, 2020, from https://unstats.un.org/unsd/bigdata/conferences/2019/presentations/seminar/day1/5th%20Big%20Data%20External%20Workshop%20Slides%20-%20UN%20Global%20Platform.pdf.
Dandekar, Raj, and George Barbastathis. 2020. “Quantifying the Effect of Quarantine Control in Covid-19 Infectious Spread Using Machine Learning.” medRxiv; 2020. DOI: 10.1101/2020.04.03.20052084. https://www.medrxiv.org/content/10.1101/2020.04.03.20052084v1.
Data Center Map. 2020.
Dignan, Larry. 2017. “IBM’s Rometty Lays Out AI Considerations, Ethical Principles.” Between the Lines (blog), June 17, 2017. https://www.zdnet.com/article/ibms-rometty-lays-out-ai-considerations-ethical-principles/.
Dutton, Tim. 2018. Building an AI World: Report on National and Regional AI Strategies. Toronto, Canada: CIFAR.
https://www.cifar.ca/docs/default-source/ai-society/buildinganaiworld_eng.pdf.
EC (European Commission). 2019. Ethics Guidelines for Trustworthy AI. An independent Report by the High-Level Expert Group on Artificial Intelligence. Brussels: European Commission.
Eykholt, Kevin, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. 2018. “Robust Physical-World Attacks on Deep Learning Models.” arXiv:1707.08945v5. https://arxiv.org/abs/1707.08945.
Fang, Fei. 2013. “Protection Assistant for Wildlife Security.” Societal Computing (Applied Systems and Infrastructure), November 13, 2013. https://sc.cs.cmu.edu/research-detail/102-protection-assistant-for-wildlife-security.
Federico, C., and T. Thompson. 2019. “Do IRS Computers Dream About Tax Cheats? Artificial Intelligence and Big Data in Tax Enforcement and Compliance.” Journal of Tax Practice & Procedure February–March 2019: 43–47. https://www.crowell.com/files/2019-Feb-March-Do-IRS-Computers-Dream-About-Tax-Cheats-Federico.pdf.
Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Working Paper, Carnegie Endowment for International Peace, Washington, DC. https://carnegieendowment.org/files/WP-Feldstein-AISurveillance_final1.pdf.
Fingrid Datahub. 2019. Go-Live Plan for Centralized Information Exchange Services (Datahub) for Electricity Market. Helsinki, Finland: Fingrid Datahub Oy. (May 28, 2019). https://www.ediel.fi/sites/default/files/Go-Live%20plan%20for%20centralised%20information%20exchange%20services%20%28Datahub%29%20for%20electricity%20market.pdf.
Fjeld, Jessica, Nele Achten, Hannah Hilligoss, Adam Nagy, and Madhulika Srikumar. 2020. Principled Artificial Intelligence: Mapping Consensus in Ethical and Rights-Based Approaches to Principles for AI. Cambridge, MA: Berkman Klein Center for Internet and Society.
https://dash.harvard.edu/handle/1/42160420.
Floridi, Luciano, Josh Cowls, Monica Beltrametti, Raja Chatila, Patrice Chazerand, Virginia Dignum, Christoph Luetge, Robert Madelin, Ugo Pagallo, Francesca Rossi, Burkhard Schafer, Peggy Valcke, and Effy Vayena. 2018. AI4People’s Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Brussels: Atomium–European Institute for Science, Media and Democracy Atomium. https://www.eismd.eu/wp-content/uploads/2019/11/AI4People%E2%80%99s-Ethical-Framework-for-a-Good-AI-Society_compressed.pdf.
Gartner. 2019. “Gartner Identifies Top 10 Data and Analytics Technology Trends for 2019.” Press Release, February 18, 2019. https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo.
GDS (Government Digital Service) and OAI (Office for Artificial Intelligence). 2019. “A Guide to Using Artificial Intelligence in the Public Sector.” GOV.UK, June 10. https://www.gov.uk/government/collections/a-guide-to-using-artificial-intelligence-in-the-public-sector#contents.
Government of Canada. 2019a. “Directive on Automated Decision-Making.” Government of Canada, modified February 5, 2019. https://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=32592.
IEEE (Institute of Electrical and Electronics Engineers). 2019. Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems, First Edition. Piscataway, NJ: IEEE. https://standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead1e.pdf.
IOTA (Intra-European Organisation of Tax Administrators). 2018. Impact of Digitalisation on the Transformation of Tax Administrations. Budapest, Hungary: IOTA. https://www.iota-tax.org/sites/default/files/publications/public_files/impact-of-digitalisation-online-final.pdf.
ITU (International Telecommunication Union). 2019. Measuring Digital Development: Facts and Figures 2019. Geneva: ITU.
https://www.itu.int/en/ITU-D/Statistics/Documents/facts/FactsFigures2019.pdf.
Kernighan, Brian and Rob Pike. 1984. The Unix Programming Environment. New Jersey: Prentice Hall.
KX Systems. 2014. “KX Systems’ kdb+ Chosen by Ontario Electric Grid Operator.” Press Release, June 22, 2014. https://kx.com/news/kdb-technology-chosen-for-retrieval-and-querying-of-smart-meter-data-processed-by-ontario-smart-metering-system/.
KX Systems. 2018. “European Energy Market Contract Win with FinGrid.” Press Release, July 16, 2018. https://kx.com/news/european-energy-market-contract-win/.
Lane, Hobson, Hannes Hapke, and Cole Howard. 2019. Natural Language Processing in Action. Shelter Island, NY: Manning Publications.
McKinsey Global Institute. 2017. Harnessing Automation for a Future that Works. Washington, DC: McKinsey & Company.
Mazzucato, Mariana. 2015. “Re-Igniting Public and Private Investments in Innovation.” Report presented at the U.S. Senate Forum of the Middle Class Prosperity Project “Building the Economy of the Future: Why Federal Investments in Science and Innovation Matter.” Washington, DC, July 27. https://marianamazzucato.com/wp-content/uploads/2015/07/Mazzucato-Statement-Middle-Class-Prosperity-Project-.pdf.
Mozur, Paul, and Lin Qiqing. 2019. “Hong Kong Takes Symbolic Stand Against China’s High-Tech Controls.” New York Times, October 3, 2019. https://www.nytimes.com/2019/10/03/technology/hong-kong-china-tech-surveillance.html.
Nakasone, Keith. “Game Changers: Artificial Intelligence Part II; Artificial Intelligence and the Federal Government.” Statement of Keith Nakasone, Deputy Assistant Commissioner, Acquisition Operations, Office of Information Technology Category (ITC), U.S. General Services Administration, before the Subcommittee on Information Technology of the Committee on Oversight and Government Reform, Washington, DC, March 7, 2018.
https://republicans-oversight.house.gov/wp-content/uploads/2018/03/Nakasone-GSA-Statement-AI-II-3-7.pdf.
Ntoutsi, Eirini, Pavlos Fafalios, Ujwal Gadiraju, Vasileios Iosifidis, Wolfgang Nejdl, Maria-Esther Vidal, Salvatore Ruggieri, Franco Turini, Symeon Papadopoulos, Emmanouil Krasanakis, Ioannis Kompatsiaris, Katharina Kinder-Kurlanda, Claudia Wagner, Fariba Karimi, Miriam Fernandez, Harith Alani, Bettina Berendt, Tina Kruegel, Christian Heinze, Klaus Broelemann, Gjergji Kasneci, Thanassis Tiropanis, and Steffen Staab. 2020. “Bias in Data-Driven Artificial Intelligence Systems—An introductory Survey.” WIREs Data Mining and Knowledge Discovery 10 (3).
O’Brien, Tim, Steve Sweetman, Natasha Crampton, and Venky Veeraraghavan. 2020. “How Global Tech Companies Can Champion Ethical AI.” World Economic Forum Annual Meeting, Davos-Klosters, Switzerland, January 21-24, 2020. https://www.weforum.org/agenda/2020/01/tech-companies-ethics-responsible-ai-microsoft/.
OECD (Organisation for Economic Co-operation and Development). 2016. Preventing Corruption in Public Procurement. Paris: OECD Publishing. http://www.oecd.org/gov/ethics/Corruption-Public-Procurement-Brochure.pdf.
OECD (Organisation for Economic Co-operation and Development). 2019. “Recommendation of the Council on Artificial Intelligence.” OECD Legal Instruments, May 21, 2019. https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449.
PDPC (Personal Data Protection Commission). 2020. Model Artificial Intelligence Governance Framework (Second Edition). Mapletree Business City, Singapore: Infocomm Media Development Authority and PDPC. https://www.pdpc.gov.sg/-/media/files/pdpc/pdf-files/resource-for-organisation/ai/sgmodelaigovframework2.pdf.
Perrault, Raymond, Yoav Shoham, Erik Brynjolfsson, Jack Clark, John Etchemendy, Barbara Grosz, Terah Lyons, James Manyika, Saurabh Mishra, and Juan Carlos Niebles. 2019. The AI Index 2019 Annual Report. Stanford, CA: AI Index Steering Committee, Human-Centered AI Institute, Stanford University.
Public-Private Analytic Exchange Program. 2018. AI: Using Standards to Mitigate Risks. Washington, DC: U.S. Department of Homeland Security. https://www.dhs.gov/sites/default/files/publications/2018_AEP_Artificial_Intelligence.pdf.
Rolnick, David, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, and Yoshua Bengio. 2019. “Tackling Climate Change with Machine Learning.” arXiv:1906.05433v2. https://arxiv.org/pdf/1906.05433v2.pdf.
Rossi, Francesca. 2019. “Building Trust In Artificial Intelligence.” Journal of International Affairs 72 (1). https://jia.sipa.columbia.edu/building-trust-artificial-intelligence.
Stiglitz. 2018. https://royalsociety.org/science-events-and-lectures/2018/09/you-and-ai/
The Open Group. 2018. The Open Group Base Specifications Issue 7. San Francisco: The Open Group. https://pubs.opengroup.org/onlinepubs/9699919799/.
UN (United Nations). 2004. United Nations Convention Against Corruption. New York: United Nations. https://www.unodc.org/documents/treaties/UNCAC/Publications/Convention/08-50026_E.pdf.
UNESCO (United Nations Educational, Scientific, and Cultural Organization). 2020. “UNESCO Appoints International Expert Group to Draft Global Recommendation on the Ethics of AI.” Press Release, March 11, 2020. https://en.unesco.org/news/unesco-appoints-international-expert-group-draft-global-recommendation-ethics-ai.
U.S. Department of the Treasury. 2017.
2017 Annual Privacy, Data Mining, and Section 803 Reports. Washington, DC: Department of the Treasury. https://home.treasury.gov/ system/files/236/annual-privacy-data-mining-Report-and-section-803-Report-final-2.pdf. van Eyk, E., L. Toader, S. Talluri, L. Versluis, A. Uta, and A. Iosup. 2018. “Serverless Is More: From PaaS to Present Cloud Computing.” IEEE Internet Computing 22 (5): 8–17. Venkateswaran, T.V. 2020. “AI Isn’t Unbiased because Humans are Biased.” The Eighth Column (blog), February 19, 2020. https://thefederal.com/the-eighth-column/artificial- intelligence-algorithms-unbiased-humans-biased/. WEF (World Economic Forum). 2018. Harnessing Artificial Intelligence for the Earth Cologny, Switzerland: World Economic Forum. http://www3.weforum.org/docs/Harnessing_Artificial_Intelligence_for_the_Earth_ Report_2018.pdf. 106 >>> EQUITABLE GROWTH, FINANCE & INSTITUTIONS INSIGHT | GOVTECH LAUNCH REPORT AND SHORT-TERM ACTION PLAN WEF (World Economic Forum). 2020. “AI Procurement in a Box: AI Government Procurement Guidelines.” Toolkit June 2020, World Economic Forum, Cologny, Switzerland. http://www3.weforum.org/docs/WEF_AI_Procurement_in_a_Box_AI_ Government_Procurement_Guidelines_2020.pdf. West, Darrell. 2018. “Will Robots and AI Take Your Job? The Economic and Political Consequences of Automation.” TechTank (blog), April 18, 2018. https://www.brookings. edu/blog/techtank/2018/04/18/will-robots-and-ai-take-your-job-the-economic-and-political- consequences-of-automation/. World Bank. 2016. World Development Report 2016: Digital Dividends. Washington, DC: World Bank. https://www.worldbank.org/en/publication/wdr2016. Yang, J., J. Gong, W. Tang, Y. Shen, C. Liu, and J. Gao. 2019. “Delineation of Urban Growth Boundaries Using a Patch-Based Cellular Automata Model under Multiple Spatial and Socio-Economic Scenarios.” Sustainability 11 (21): 6159. https://www.mdpi.com/2071-1050/11/21/6159. 
Zheng, Stephan, Alex Trott, Sunil Srinivasa, Nikhil Naik, Melvin Gruesbeck, David Parkes, and Richard Socher. 2020. “The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies.” Salesforce Research (blog), April 28, 2020. https://blog.einstein.ai/the-ai-economist/.