Publication: Missing Evidence: Tracking Academic Data Use around the World
Date
2024-01-25
Published
2024-01-25
Author(s)
Stacy, Brian; Kitzmüller, Lucas; Wang, Xiaoyu; Mahler, Daniel Gerszon; Serajuddin, Umar
Abstract
Data-driven research on a country is key to producing evidence-based public policies. Yet little is known about where data-driven research is lacking and how it could be expanded. This paper proposes a method for tracking academic data use by country of subject, applying natural language processing to open-access research papers. The model’s predictions produce country estimates of the number of articles using data that are highly correlated with a human-coded approach, with a correlation of 0.99. Analyzing more than 1 million academic articles, the paper finds that the number of articles on a country is strongly correlated with its gross domestic product per capita, population, and the quality of its national statistical system. The paper identifies data sources that are strongly associated with data-driven research and finds that availability of subnational data appears to be particularly important. Finally, the paper classifies countries into groups based on whether they could most benefit from increasing their supply of or demand for data. The findings show that the former applies to many low- and lower-middle-income countries, while the latter applies to many upper-middle- and high-income countries.
Citation
“Stacy, Brian; Kitzmüller, Lucas; Wang, Xiaoyu; Mahler, Daniel Gerszon; Serajuddin, Umar. 2024. Missing Evidence: Tracking Academic Data Use around the World. Policy Research Working Paper; 10673. © World Bank. http://hdl.handle.net/10986/40965 License: CC BY 3.0 IGO.”
Other publications in this report series
Publication: The Macroeconomic Implications of Climate Change Impacts and Adaptation Options (Washington, DC: World Bank, 2025-05-29)
Estimating the macroeconomic implications of climate change impacts and adaptation options is a topic of intense research. This paper presents a framework in the World Bank's macrostructural model to assess climate-related damages. This approach has been used in many Country Climate and Development Reports, a World Bank diagnostic that identifies priorities to ensure continued development in spite of climate change and climate policy objectives. The methodology captures a set of impact channels through which climate change affects the economy by (1) connecting a set of biophysical models to the macroeconomic model and (2) exploring a set of development and climate scenarios. The paper summarizes the results for five countries, highlighting the sources and magnitudes of their vulnerability, with estimated gross domestic product losses in 2050 exceeding 10 percent of gross domestic product in some countries and scenarios, although only a small set of impact channels is included. The paper also presents estimates of the macroeconomic gains from sector-level adaptation interventions, considering their upfront costs and avoided climate impacts and finding significant net gross domestic product gains from adaptation opportunities identified in the Country Climate and Development Reports. Finally, the paper discusses the limits of current modeling approaches, and their complementarity with empirical approaches based on historical data series. The integrated modeling approach proposed in this paper can inform policymakers as they make proactive decisions on climate change adaptation and resilience.
Publication: From Patriarchy to Policy (Washington, DC: World Bank, 2025-05-29)
Legal institutions play an important role in shaping gender equality in economic domains, from inheritance to labor markets. But where do gender-equal laws come from? Using cross-country data on social norms and legal equality, this paper investigates the socio-cultural roots of gender inequity in the legal system and its implications for female labor force participation. To identify the impact of social norms, the analysis uses an empirical strategy that exploits pre-modern differences in ancestral patriarchal culture as an instrument for present-day gender norms. The findings show that ancestral patriarchal culture is a strong predictor of contemporary norms, and conservative social norms are associated with more gender inequality in the de jure legal framework, the de facto implementation of laws, and the labor market. The paper presents evidence for a political selection mechanism linking norms to laws: countries with more conservative norms elect political leaders who are more hostile to gender equality, who then pass less progressive legislation. The results highlight the cultural roots and political drivers of legalized gender inequality.
Publication: Geopolitics and the World Trading System (Washington, DC: World Bank, 2024-12-23)
Until the beginning of this century, the GATT/WTO system worked. Economic research provided a compelling explanation. It showed that if governments maximize the well-being of their own countries broadly defined, GATT/WTO principles would facilitate mutually beneficial cooperation over their trade policy choices. Now heightened geopolitical rivalry seems to have undermined the WTO. A simple transposition of the previous rationalization suggests that geopolitics and trade cooperation are not compatible. The paper shows that this is only true if rivalry eclipses any consideration of own-country well-being. In all other circumstances, there are gains from trade cooperation even with geopolitics. Furthermore, the WTO’s relevance is in question only if it adheres too rigidly to its existing rules and norms. Through measured adaptation to the geopolitical imperative, the WTO can continue to thrive as a forum for multilateral trade cooperation in the age of geopolitics.
Publication: Rethinking Fiscal Policies (Washington, DC: World Bank, 2025-05-27)
This paper examines the redistributive impact of fiscal policy, specifically taxes and transfers, on poverty and inequality in eight countries in the Middle East and North Africa: the Arab Republic of Egypt, Djibouti, the Islamic Republic of Iran, Iraq, Jordan, Morocco, Tunisia, and the West Bank and Gaza. Utilizing the Commitment to Equity framework, the analysis evaluates how fiscal interventions alter income distribution across these diverse national contexts. The results indicate that direct cash transfers and social assistance programs are generally effective in reducing poverty and shielding vulnerable populations, while in-kind benefits, particularly in education and healthcare, significantly contribute to mitigating income inequality. In contrast, generalized subsidies, especially in the energy sector, are fiscally burdensome and largely regressive, offering limited equity gains. Indirect taxes, although important for revenue generation, often exacerbate income disparities. The study underscores the need for comprehensive fiscal reforms, including the expansion of well-targeted transfers, adoption of progressive taxation, and reallocation of inefficient subsidies toward investments in human capital. Successful initiatives, such as Egypt’s Takaful and Karama and Jordan’s Takaful and bread subsidy compensation programs, illustrate scalable models of effective redistribution. Moreover, the Islamic Republic of Iran’s progressive tax policies highlight viable pathways to equitable revenue mobilization. Strengthening investment in education and health is essential for promoting long-term equity, enhancing upward mobility, and supporting inclusive and sustainable development across the region.
Publication: Yield Gains from Balancing Fertilizer Use (Washington, DC: World Bank, 2025-05-29)
As with most agricultural inputs, the optimal use of fertilizer leverages the production complementarities between different types of nutrients. Wide variation in the intensity of nutrient application rates suggests there are potentially large productivity gains to be had from rebalancing fertilizer use across nutrient types even under a fixed expenditure budget. Using detailed information on a large sample of rice fields across three states in eastern India, this paper investigates whether a more balanced use of fertilizer, measured as the ratio of potash to nitrogen applied to a field, can lead to higher yields and revenues. To address the endogeneity of fertilizer application decisions, the analysis exploits the fact that nitrogen-based fertilizers demanded by Indian farmers are mostly produced domestically in a limited number of manufacturing plants, while all potash-based fertilizers must be imported by ship from abroad. Instrumenting for the ratio of potassium-to-nitrogen fertilizer applied on a field with the relative travel distances between farmers’ villages and both the nearest urea production plant and the nearest international port, the paper estimates the impact of more balanced fertilizer use on yields and revenues. The estimates show that at median levels of fertilizer use, and keeping the level of expenditure on fertilizers constant, rebalancing fertilizer application choices such that the potassium-to-nitrogen ratio of fertilizer is doubled would lead to a 4.8 percent increase in yield.
Related items
Showing items related by metadata.
Publication: When Is There Enough Data to Create a Global Statistic? (Washington, DC: World Bank, 2022-05-05)
To monitor progress toward global goals such as the Sustainable Development Goals, global statistics are needed. Yet cross-country data sets are rarely truly global, creating a trade-off for producers of global statistics: the lower the data coverage threshold for disseminating global statistics, the more statistics can be made available, but the lower the accuracy of these statistics. This paper quantifies the availability-accuracy trade-off by running more than 10 million simulations on the World Development Indicators. It shows that if the fraction of the world’s population for which data are lacking is x, then the global value will in expectation be off by 0.37*x standard deviations, and it could be off by as much as x standard deviations. The paper shows the robustness of this result to various assumptions and provides recommendations on when there is enough data to create global statistics. Although the decision will be context specific, in a baseline scenario, it is suggested not to create global statistics when there are data for less than half of the world’s population.
Publication: Missing SDG Gender Indicators (Washington, DC: World Bank, 2023-08-15)
The Sustainable Development Goal agenda lays out an ambitious set of 231 indicators to track progress. Countries continue to fall short in terms of reporting on the indicators in general, and this is particularly the case for the subset of 50 gender-related indicators, where countries reported on average on 31 percent of these indicators in at least one year from 2016 to 2020. A closer look at this low coverage reveals four salient findings. First, this is not just a problem of missing data; lack of reporting on existing data is detected to be a problem. For example, of the 32 gender-related indicators that are sex disaggregated, if countries that had a population estimate also had a sex-disaggregated estimate (which is almost always feasible), the Sustainable Development Goal gender coverage rate would be 43 percent instead of 31 percent. Second, better statistical systems are a major part of the solution, as statistical system strength is correlated with higher coverage. Third, poorer countries are doing no worse in reporting on gender-related Sustainable Development Goal indicators than high-income countries, despite weaker statistical systems. Lastly, sizable over- (and under-) performance in reporting, conditional on statistical strength, suggests that country-level advocacy and focus can yield wins in Sustainable Development Goal gender indicator coverage.
Publication: Updating Poverty Estimates at Frequent Intervals in the Absence of Consumption Data: Methods and Illustration with Reference to a Middle-Income Country (World Bank Group, Washington, DC, 2014-09)
Obtaining consistent estimates on poverty over time as well as monitoring poverty trends on a timely basis is a priority concern for policy makers. However, these objectives are not readily achieved in practice when household consumption data are neither frequently collected, nor constructed using consistent and transparent criteria. This paper develops a formal framework for survey-to-survey poverty imputation in an attempt to overcome these obstacles, and to elevate the discussion of these methods beyond the largely ad-hoc efforts in the existing literature. The framework introduced here imposes few restrictive assumptions, works with simple variance formulas, provides guidance on the selection of control variables for model building, and can be generally applied to imputation either from one survey to another survey with the same design, or to another survey with a different design. Empirical results analyzing the Household Expenditure and Income Survey and the Unemployment and Employment Survey in Jordan are quite encouraging, with imputation-based poverty estimates closely tracking the direct estimates of poverty.
Publication: Technical Assessment of Open Data Platforms for National Statistical Organisations (World Bank, Washington, DC, 2014-10-18)
The term "open data" is generally understood to be data that are made available to the public free of charge, without registration or restrictive licenses, for any purpose whatsoever (including commercial purposes), in electronic, machine-readable formats that ensure data are easy to find, download and use. National Statistics Offices (NSOs) have the potential to play a pivotal role in the implementation of open data initiatives. As producers and curators of data, the objective of making high quality data more accessible and usable is consistent with their guiding principles. NSOs indicate, in research conducted in support of this report, that one of the difficulties they encounter is that the technology they use to publish, or electronically distribute, data for public use is not compatible with open formats. They also indicate that common software packages used for open data portals do not accommodate the data formats and metadata they produce. Two key concerns related to data dissemination products are addressed: (1) Can such products designed primarily for NSOs satisfy requirements for an open data initiative? and (2) Can such products designed primarily for open data satisfy the requirements of NSOs? Furthermore, data reuse, both by data experts and the public at large, is key to creating new opportunities and benefits from government data. The following recommendations are made to improve the overall utility of data publication platforms to NSOs and the open data community: improve technical documentation; ensure public Application Programming Interfaces (APIs) and endpoints are interoperable; presentation of metadata and Uniform Resource Identifiers (URIs) must conform to W3C standards; natural language search and metadata faceting should be standard; structural metadata and hypercube support are core NSO requirements; dashboards and visualisations are necessary for user engagement; and develop data engagement tools for improving data-quality and reuse.
Publication: Reviewing Assessment Tools for Measuring Country Statistical Capacity (Washington, DC: World Bank, 2024-03-11)
Country statistical capacity is increasingly recognized as crucial for development, but no academic study exists that reviews the available assessment tools. This paper offers the first review study that fills this gap, paying particular attention to data and practical measurement challenges. It compares the World Bank’s recently developed Statistical Performance Indicators and Index with other widely used indexes, such as the Open Data Inventory index, the Global Data Barometer index, and other regional and self-assessment tools. The findings show that each index brings advantages in data sources, number of indicators, measurement focus, coverage of countries and time periods, and correlation with common development indexes. The Open Data Inventory index covers the most countries, the Global Data Barometer index collects data through its surveys, and the Statistical Performance Indicators and Index offer a broader framework for assessing statistical systems. The paper offers further thoughts on the potential mechanisms through which these tools can bring positive impacts on economic activities and some political economy concerns, as well as future directions for development.