Policy Research Working Paper 11030 Global Expansion of Marine Protected Areas and the Redistribution of Fishing Effort Gavin McDonald Jennifer Bone Christopher Costello Gabriel Englander Jennifer Raynor Development Economics Development Research Group January 2025 Policy Research Working Paper 11030 Abstract The expansion of marine protected areas (MPAs) is a core difference between these scenarios represents the predicted focus of global conservation efforts, with the “30x30” ini- change in fishing effort resulting from MPA expansion. The tiative to protect 30% of the ocean by 2030 serving as a results show that, regardless of the MPA network’s objec- prominent example of this trend. This paper examines a tive or size, fishing effort would decrease inside the MPAs, series of proposed MPA network expansions of various sizes though by much less than 100%. Moreover, this reduc- and forecasts the impact that increased protection could tion in fishing effort within MPAs does not simply shift have on global patterns of fishing effort. This is accom- outside—fishing effort outside MPAs also declines. The plished using a predictive machine learning model trained overall magnitude of the predicted decrease in global fishing on a global dataset of satellite-based fishing vessel moni- effort principally depends on where networks are placed in toring data, current MPA locations, and spatiotemporal relation to existing fishing effort. MPA expansion will lead environmental, geographic, political, and economic fea- to a global redistribution of fishing effort, which should be tures. The model predicts future fishing effort under various considered in network design, implementation, and impact MPA expansion scenarios, compared to a business-as-usual evaluation. counterfactual scenario that includes no new MPAs. The This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at aenglander@worldbank.org. The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent. Produced by the Research Support Team Global expansion of marine protected areas and the redistribution of fishing effort Gavin McDonald1,2,3*, Jennifer Bone1,2,3 , Christopher Costello1,2,3 , Gabriel Englander4 , Jennifer Raynor5 1 Marine Science Institute, University of California, Santa Barbara, USA. 2 Bren School of Environmental Science and Management, University of California, Santa Barbara, USA. 3 Environmental Markets Lab, University of California, Santa Barbara, USA. 4 Development Research Group, The World Bank, USA. 5 Department of Forest and Wildlife Ecology, University of Wisconsin-Madison, USA. *Corresponding author(s). E-mail(s): gmcdonald@bren.ucsb.edu; Contributing authors: jbone@bren.ucsb.edu; ccostello@bren.ucsb.edu; aenglander@worldbank.org; jraynor@wisc.edu; JEL classification: Q22, Q28 Keywords: marine protected areas | isheries economics | isher behavior | predictive machine learning 1 Introduction Simulation methods developed in the fisheries lit- erature [10] and location choice models developed The expansion of marine protected areas (MPAs) in the economics literature [6, 11–14] are help- is a crucial part of global conservation efforts [1]. ful for understanding the structure of individual The “30x30” initiative, for example, aims to pro- behavior, but are unlikely to apply when con- tect at least 30% of the world’s oceans by 2030 sidering complex interactions between multiple through a combination of fully protected areas fleets at the global scale. They also often require (no extractive activities allowed) and partially detailed vessel-level data, which are rarely avail- protected areas (some activities remain permit- able globally even with modern satellite tracking. ted) [2, 3]. Currently, fully protected MPAs cover Causal inference methods have been effective in less than 3% of the world’s oceans, but this is examining regional effects of individual marine expected to increase [4, 5]. As fully protected protection policies [15–18], but these methods MPAs expand, it is crucial to understand their require an unaffected control group, which by def- impact on global fishing effort. The creation of inition does not exist for a policy that induces fully protected MPAs will shift the location and global effects. As a result, simulations of large- intensity of industrial fishing effort—as fishers scale MPA expansions have relied on heuristic move out of newly-created MPAs, increased con- assumptions of fisher responses, such as assuming gestion and altered economic opportunities from no change outside MPAs or a uniform realloca- fishing could lead to a cascade of redistribution tion of effort from within MPAs to areas outside that ripples from proximate to far flung areas [6]. [19–23]. This approach provides an “all else equal” But where and how much fishing effort will move reference point, but it is an assumed scenario, remains an unanswered question. not an empirically-driven response, so does not The effectiveness and longevity of fully pro- capture the true net effect of the policy. tected MPA expansion depends on how fishing We develop the first data-driven, predictive effort responds. If fishing effort simply moves behavioral model of global fishing effort response elsewhere, it could increase fishing intensity and following large-scale spatial closures. We begin by threaten biodiversity outside of MPAs, possibly compiling a global dataset of fishing effort for even reversing the presumed biodiversity bene- all industrial fishing vessels that used Automatic fits of protection [7]. However, if fishing effort Identification System (AIS) transponders between decreases (e.g., due to increased competition and 2016 and 2021 [24], which is our outcome vari- reduced profitability), it could protect biodiversity able (Fig. 1). We then generate 42 model features [8] but potentially harm the economies of fishery- that include spatial and temporal information dependent nations and the feasibility of long-term on the geographic distribution of fully protected protection commitments. MPAs, environmental and economic conditions, For example, two of the largest fully protected and geographic and governance characteristics; MPAs ever created, Phoenix Islands Protected we also assess AIS reception quality, which can Area and Palau National Marine Sanctuary, were affect perceived fishing effort from AIS transpon- recently re-opened to fishing because of their per- ders (see Materials and Methods for a complete ceived negative economic effects. In fact, protected list of model features). In order to promote com- area downgrading, downsizing, and degazettement putational tractability, we aggregate all data to of MPAs has been observed in dozens of MPAs a 1x1 degree pixel level annually; however, the around the world, with commercial fishing inter- model can be implemented, in principle, at any ests being one of the driving factors [9]. Thus, geographic resolution. Next we train a series of understanding how fishing effort will redistribute two-stage hurdle random forest models to pre- must be a central component of marine spatial dict fishing effort one, two, and three years in the planning. future: the first stage predicts whether any fishing Previous research has measured the effect of occurs in a pixel, and the second stage predicts individual MPAs on fishing responses, but typ- the intensity of fishing if it occurred. We tune ically for a single fleet in a limited area; none the model hyperparameters using cross-validation of the existing methods can capture the effects (CV) with time-based folds, and we quantify out- of global interventions affecting all fishing fleets. of-sample performance over both time and space. 2 Finally, we use the trained models to predict: (EBSAs) proposed by the Convention on Biolog- (i) a business-as-usual (BAU) counterfactual sce- ical Diversity [27]. We finally evaluate a set of nario, which represents future fishing effort if networks that randomly protect pixels to achieve no new fully protected MPAs are implemented certain area-based targets, as well as networks and (ii) MPA expansion scenarios, which repre- that protect either the currently most-fished areas sent future fishing effort as fully protected MPA or the areas that are currently not fished. Impor- coverage incrementally increases from current lev- tantly, each network differs in its overlap with els. The difference between each MPA expansion current fishing effort (Fig. 1); for the same ocean scenario and the BAU counterfactual scenario rep- coverage percentage, some networks would place resents the predicted change in fishing effort as a MPAs in regions with significantly higher current result of the MPA expansion. We focus our anal- fishing activity than others (Fig. 3). Comparing ysis on the potential impacts of fully protected results across MPA networks therefore allows us MPAs, although in practice the levels of protec- to explore not only the potential impact of fully tion afforded by MPA expansion will vary across protecting, for example, 15% of the ocean, but specific policies and regions. For example, under also whether it matters which 15% of the ocean is 30x30 the European Union has proposed to afford protected. some level of protection to 30% of its waters, while We make three distinct scientific contributions. fully protecting only 10% [25]. First, our results contribute to a timely, interna- In forecasting the effects of future fully pro- tional policy debate on the impacts of large-scale tected MPAs, we do not favor or limit our analysis MPA expansion and the best implementation to any particular MPA network proposal. Rather, strategies. While previous work has modeled the we explore how a suite of alternative proposed potential impacts of large-scale expansion of ter- MPA networks would each affect fishing effort. We restrial protected areas on land [28], no such include a number of networks that are the out- similar work has modeled the potential impacts puts of global prioritization analyses. These are of marine protected areas in the ocean. Second, a network that focuses on areas beyond national we develop a novel machine learning technique jurisdiction in the high seas [26], and a suite of for predicting global changes in fishing effort as a networks that prioritize either biodiversity pro- result of MPA implementation that is tractable, tection, carbon sequestration, food provision, or flexible, and data-driven. Machine learning lets multiple objectives [20] (Fig. 2). We also consider us move beyond rigid assumptions about the an expert-designed bottom-up network of Eco- structure of complex economic and ecological logically or Biologically Significant Marine Areas interactions because the algorithm allows for non- linear, data-driven relationships [29]. We also use machine learning to build a plausible BAU coun- terfactual, which allows for inference when there is no unaffected control group [30]. Finally, our model can provide a decision-support tool for local marine managers to predict the effects of future spatial closures (fully protected MPAs or other spatially-explicit fishing prohibitions). By providing more clarity on potential fishing effort outcomes, our model can help mangers reduce the probability of downgrading, downsizing, and degazettement of future MPAs, thus decreasing Fishing hours 0 10 1,000 100,000 the uncertainty and regulatory burden associated Fig. 1: Map of observed fishing effort (hours) with MPA expansion. Importantly, our model is in 2021, shown using a log10 scale for visualiza- general enough that it could also predict fishing tion purposes. Pixels have a 1x1 degree geographic redistribution from other spatiotemporal changes coordinate resolution, the spatial unit of our anal- such as climate change. ysis. 3 Business−as−usual Expert−derived EBSA Visalli et al. 2020 Sala et al. 2021 biodiversity Sala et al. 2021 carbon Sala et al. 2021 food Sala et al. 2021 multi−objective Random Protecting most−fished pixels Protecting unfished pixels Area protected 2.5% 3% 5% 10% 16% 20% 30% Fig. 2: Maps of business-as-usual (BAU) network and hypothetical global MPA networks used in our simulations. The fill color of the global MPA network maps is by the global area coverage size, and only pixels that are fully enclosed in MPAs are colored. The BAU scenario holds fixed the existing fully protected MPA coverage as of the end of 2020 (2.5% of ocean area). Since the Sala et al. 2021 network scenarios, protecting most-fished pixel scenario, and random, unfished, and most-fished scenarios each protect pixels in descending order of priority, the network for each area protected size (3%, 5%, 10%, 16%, 20%, and 30%) is inclusive of all pixels in smaller coverage sizes. The Visalli et al. 2020 and expert- derived EBSA scenarios are each only available for a single coverage size (16% and 20%). Pixels have a 1x1 degree geographic coordinate resolution, the spatial unit of our analysis. 2 Results these out-of-sample tests and is sufficient for our purpose of predicting future fishing effort under How well can a globally tuned machine learning the expansion of fully protected MPAs. To test model actually predict fishing? An important first the model’s ability to predict future fishing effort, step in validation is to test the predictions of our we perform a temporal out-of-sample test using a trained models against out-of-sample fishing effort model trained on early years of the dataset and data. We find that the model performs well in tested on held-out later years of the dataset. Using 4 90% Business−as−usual 80% 70% MPA network scenario Current Protecting most−fished pixels fishing hours 60% overlapping Sala et al. 2021 biodiversity 50% with Sala et al. 2021 carbon area protected 40% Sala et al. 2021 multi−objective 30% Random 20% Sala et al. 2021 food 10% Expert−derived EBSA Visalli et al. 2020 0% Protecting unfished pixels 0% 10% 20% 30% Area protected Fig. 3: Percent of global fishing effort (hours) that spatially occurs within pixels that would be protected by hypothetical MPA networks versus percent of global ocean area protected, colored by MPA scenario. Colors differentiate the various hypothetical MPA networks. Linetypes are used to further differentiate networks that can have various levels of protection, while shapes are used to differentiate networks that only have a single level of protection. The scenarios are listed in the legend in the same order as they appear in the figure at the level of their largest area protected. this test, the ROC area-under-the-curve in the models across all performance metrics (Fig. S6). first stage prediction is approximately 0.97 and These out-of-sample evaluations give credence to the F1 score is approximately 0.91 (Fig. S4A and our subsequent MPA network scenario predictions Table S1). In the second stage prediction of fishing because the MPA network scenario predictions intensity, the R2 is approximately 0.8 (Fig. S4B consider MPAs in future years and in locations and Table S1). Additionally, across performance that may not yet have MPAs. metrics, there is little reduction in performance In the absence of any new MPAs, our BAU as we predict fishing effort additional years into counterfactual scenario predicts greater total the future. To test the model’s ability to predict global fishing effort in the future compared to cur- both future fishing effort and in spatial areas that rently observed levels (Fig. 4). Under any of the have never before seen fully protected MPAs, we hypothetical MPA expansion scenarios we con- perform a spatiotemporal out-of-sample test that sider, our model predicts that total future global uses a suite of models for each ocean trained on fishing effort will invariably be lower than the early years of the dataset and in other oceans, BAU counterfactual scenario of no new MPAs. and tested on held-out later years of the dataset The extent of this decrease depends on the MPA and in the ocean of interest. We do this leave- network and the percentage of ocean area it one-out test for each ocean, allowing us to see encompasses. For some scenarios such as protect- how well the model can predict fishing effort in ing unfished pixels, “Visalli et al. 2020”, “Expert- spatial areas where the model has not seen any derived EBSA”, “Random”, and for MPA cover- training data. Again, we find high performance age levels of 5% to 20%, we find that predicted for the spatiotemporal out-of-sample testing (Fig. total fishing effort may be roughly equal to or even S5), indicating that the model can forecast across above the current levels we see today. However, both time and space. Finally, we test our model’s for other scenarios, and all scenarios with 30% performance against a series of simpler models, protection (other than protecting unfished pix- again using temporal out-of-sample testing. We els), we find that total global fishing effort would find that our model outperforms these simpler be both lower than the currently observed level 5 and lower than the predicted future level under (Figs. S11 - S12). Aggregate fishing effort outside business-as-usual. MPAs also decreases across all scenarios, which Intuitively, the decrease in future total fishing drives the majority of the global decline (Figs. S11 from MPA expansion is smallest under the sce- - S12). This effect is most prominent in locations narios that extend protection to areas that are closer to MPA boundaries. Pixels located fully currently unfished (-0.4% to -6%, dark gray line inside new MPAs see the largest median decrease in Fig. 5A). Scenarios that extend protection to in fishing effort (-18%), and this median decrease areas that are currently most-fished would lead becomes smaller and approaches zero as the dis- to the largest decrease in total fishing effort (- tance increases from the MPA boundary (Fig. 6% to -55%, light gray line). These two scenarios 6B). This spatial dissipation is consistent with the are not meant to represent plausible real-world before-versus-after changes we observe in the his- MPA networks; rather, they are intended to dis- torical raw data following the implementation of play a range of possible effects from the large-scale real MPAs (with an observed median decrease of expansion of fully protected MPAs. Most of the -57% for pixels fully inside new MPAs, and dimin- actual proposed networks result in reductions in ishing magnitude changes as the distance to the fishing effort of about 10% to 20%. Two excep- nearest MPA increases) (Fig. 6A). tions are “Sala et al. 2021 carbon” and “Sala et al. How do our results compare to the conven- 2021 biodiversity”, which have predicted declines tional wisdom? The two most common assump- of 37% and 38%, respectively, after three years and tions employed in the previous literature are “full with full 30% area protection. For all scenarios, as displacement” (where each unit of fishing effort the percentage of ocean area protected increases, covered by an MPA is displaced outside the MPA) we predict incrementally larger decreases in total and “complete exit” (where each unit of fishing fishing effort. effort covered by an MPA disappears) [19–23]. Our We have shown that across the range of results suggest that neither of these hypotheses proposed protection scenarios (i.e. excluding the is correct. Instead, we find that the large-scale “Most-fished”, “Unfished”, and “Random” sce- expansion of fully protected MPAs is likely to narios”), global fishing effort is likely to decrease, trigger a redistribution of fishing effort, most and the magnitude ranges from about -3% to - prominently near new MPAs but with effects that 38%. What drives the magnitude of predicted also arise farther away, which in aggregate imply decreases for different MPA networks? Regardless a global reduction in fishing. Furthermore, as fully of the network’s objectives, how it was designed, protected MPAs cover larger fractions of current the percentage of the ocean covered, or the fore- fishing activity, this reduction magnifies, under- cast horizon, the key driver is the overlap of the scoring the importance of carefully considering proposed network with current fishing effort (Fig. fishers’ adaptation to MPA expansion. 5B). As this percentage increases, we consistently predict larger decreases in global fishing effort. For 3 Discussion instance, the proposed scenario that results in the smallest predicted decrease covers only 4% of cur- The ecological and economic consequences of rent global fishing hours. Similarly, the proposed large-scale MPA expansion will hinge to a large scenario prompting the largest predicted decrease extent on how global fishing effort responds to covers the greatest percentage of current global increased protection, yet little is known about fishing hours (78%). This pattern implies that the how this will unfold. Understanding the potential overlap between current fishing effort and new redistribution of fishing effort, and the subse- MPAs will play a crucial role in determining the quent changes in the aggregate quantity of global impacts of future MPA expansion. fishing, is an important first step. We developed Crucially, these reductions in aggregate fish- an empirical global model to predict how fishing ing effort arise both inside and outside the new activity will respond to changes in fully protected MPAs. Aggregate fishing effort inside the MPAs MPA coverage. Importantly, our model is general diminishes under all network scenarios; however, enough that it could also be used to predict fish- it never drops to zero, even though these new ing changes in response to other oceanic changes, MPAs ostensibly prohibit all commercial fishing such as climate change or new fishing regulations. 6 MPA coverage: 3% MPA coverage: 5% MPA coverage: 10% Business−as−usual 40 20 MPA network scenario Fishing Protecting unfished pixels hours 0 Visalli et al. 2020 (millions) MPA coverage: 16% MPA coverage: 20% MPA coverage: 30% Expert−derived EBSA Sala et al. 2021 food 40 Random Sala et al. 2021 multi−objective 20 Sala et al. 2021 carbon Sala et al. 2021 biodiversity 0 Protecting most−fished pixels −5 −4 −3 −2 −1 0 1 2 3 −5 −4 −3 −2 −1 0 1 2 3 −5 −4 −3 −2 −1 0 1 2 3 Forecast horizon (years) Fig. 4: Total observed global fishing effort (hours) (where forecast horizons -5 to 0 correspond to observed data from years 2016 to 2021), and total predicted global fishing effort in the business-as-usual (BAU) scenario and all MPA scenarios for the three predicted forecast horizons. A vertical dashed line is shown at 0 years (such that the line to the left represents observed data, and the lines to the right represent predictions). A horizontal dashed line is shown at the level of currently observed fishing effort in the last year of historically observed data. Each panel represents global MPA networks that are sized for a given percentage coverage. Colors and linetypes differentiate the various hypothetical MPA networks and the BAU scenario. The MPA network scenarios are listed in the legend in the same order as they appear in the figure at a forecast horizon of 3 years and their largest MPA coverage. Across a wide range of hypothetical MPA net- Our results show that new fully protected works, our key finding is clear: aggregate global MPAs will lead to a global redistribution of fish- fishing is likely to decline, and the magnitude ing effort, but we also find that other factors play of this decline is largely driven by the amount an important role in determining fishing effort. of current fishing activity in the newly protected While model features based on MPAs make impor- areas. This factor is more important than either tant contributions towards predictions of future the conservation objectives of the MPA network fishing effort (Figs. S7, S9, and S10), in aggre- or even the total area it covers. Specifically, when gate they are less important than features like new fully protected MPAs are placed in regions previous fishing effort and spatiotemporal environ- currently experiencing intense fishing activity, we mental factors (Fig. S8). Climate change, which predict the most substantial decreases in total fish- may have impacts on these spatiotemporal envi- ing effort. While policymakers may choose to place ronmental factors such as sea surface temperature MPAs in locations with limited fishing activity— and the El Ni˜ no-Southern Oscillation [31], could for strategic reasons such as protecting critical therefore play an important role in the redistri- habitats, or for political reasons such as protect- bution of future fishing effort. In fact, changes in ing areas that are currently unfished or minimally sea surface temperature are already playing a role fished—our results suggest that such placements in the redistribution of tuna catch in the Eastern would exert minimal impact on fishers’ decisions. Pacific Ocean [32]. In other words: which parts of the ocean are Because our model is training on historical protected is more important in determining over- data, the validity of our predictions relies on all fishing effort than how much of the ocean is the assumption that future fisher responses to protected. MPAs will resemble past responses of fishers to 7 (A) 0% Forecast: 1 year Forecast: 2 years Forecast: 3 years Change in −20% fishing hours MPA network scenario −40% Protecting unfished pixels Visalli et al. 2020 3% 10% 20% 30% 3% 10% 20% 30% 3% 10% 20% 30% Area protected Expert−derived EBSA Sala et al. 2021 food Random (B) Forecast: 1 year Forecast: 2 years Forecast: 3 years 0% Sala et al. 2021 multi−objective Sala et al. 2021 carbon Change −20% Sala et al. 2021 biodiversity in fishing Protecting most−fished pixels hours −40% 0% 25% 50% 75% 0% 25% 50% 75% 0% 25% 50% 75% Current fishing hours overlapping with area protected Fig. 5: Predicted aggregate percentage changes in global fishing hours from expanding MPAs. The y-axis shows the relative difference in fishing hours between each MPA network scenario and the business-as- usual counterfactual scenario, with values aggregated globally. (A) presents the relationship between the percentage change in fishing hours and the percentage of ocean area covered by each hypothetical MPA network, with panels for each of the three forecast horizon years. (B) presents the relationship between the percentage change in fishing hours and the percentage of current global fishing hours occurring in areas that would be covered by each hypothetical MPA network, with panels for each of the three forecast horizon years. In all panels, colors differentiate the various hypothetical MPA networks. Linetypes are used to further differentiate networks that can have various levels of protection, while shapes are used to differentiate networks that only have a single level of protection. The MPA network scenarios are listed in the legend in the same order as they appear in the top right panel and their largest MPA coverage. MPAs. While our MPA expansion scenarios rep- provisioning, catch-per-unit-effort (CPUE), rev- resent different degrees of extrapolation from past enue, or profits. A recent review paper found 48 experience, it is reassuring that predictions across examples of fisheries benefits relating to MPAs proposed MPA networks consistently align when across 25 countries [33]. For example, an empirical assessed in relation to the percentage of current analysis of the impacts from Papah¯ anaumoku¯ akea fishing activity they cover. Our predicted per- Marine National Monument found that CPUE centage changes inside and nearby MPAs are also increased in areas surrounding the MPA as a result consistent with the changes seen in the historically of the MPA [34]. Increased CPUE – as empirically observed data for MPAs that were implemented shown in this example – combined with decreased after 2016 and before 2021 (Fig. 6). effort – as predicted by our model – could lead Although our results consistently predict that to either increases or decreases in catch, depend- large new networks of MPAs will lead to decreased ing on the relative magnitude of the changes of global fishing effort, decreases in effort do not each. A study of Revillagigedo National Park in necessarily translate to decreases in catch, food Mexico, the largest fully protected MPA yet to be implemented in North America, found that while 8 Change in observed fishing effort between (e.g., catch exceeds maximum sustainable level), (A) year after and year before MPA implementation standard fisheries surplus production models pre- 150% 100% dict that reducing fishing effort will increase both 50% catch and biomass [36]. But in fisheries that are 0% currently fished at or below sustainable levels, −50% reducing fishing effort may reduce catch (while −100% still increasing biomass). MPAs are not used in isolation but in the con- Change in predicted fishing effort between text of other types of fisheries management. As (B) MPA scenario and BAU scenario an alternative or complement to spatial protec- 150% tions, fisheries management has been shown to 100% effectively increase biomass while simultaneously 50% increasing catch and profits [36, 37]. While our current analysis on the impact of global MPA 0% expansion on fishing effort is an important first −50% step, future research could use bioeconomic sim- −100% ulation modeling to explore how this predicted Inside MPA 0−100 100−200 >200 change in fishing effort would translate to changes Distance to nearest MPA (km) in catch, CPUE, revenue, or profits. Fig. 6: (A) Observed historic pixel-level changes While the decrease in fishing effort within the in fishing hours between one-year after MPA bounds of newly-created MPAs aligns with expec- implementation and one-year before MPA imple- tations and historically observed patterns [38], our mentation, for fully protected MPAs implemented consistent prediction that fishing effort will also after 2016 and before 2021. Pixels are bucketed decrease outside of MPAs is more surprising. This into different distance bins to the nearest MPA. is in contrast to the commonly discussed con- (B) Predicted pixel-level changes in fishing hours cept of ‘fishing-the-line’, in which fishing effort from expanding MPAs, relative to business-as- concentrates around the edges of newly created usual scenario, with pixels grouped into different MPAs [39]. Here we hypothesize three potential distance categories to the nearest MPA. Each mechanisms that could lead to our result. In prac- point represents the pixel-level results for a differ- tice, different mechanisms may be at play under ent hypothetical MPA scenario, coverage size, and different contexts, and multiple mechanisms may forecast horizon. For both (A) and (B), the points simultaneously occur. While it is outside the scope are jittered to avoid visual overlap. Boxplots and of this paper, future research could empirically violin plots show distributions for each bin. The examine which of these different explanations are boxplots for each bin show the median, 25th per- most important and in which contexts. centile, and 75th percentile. A line connects the First, when there is significant biological median values for each distance bin. The y-axis is spillover, which refers to the movement of fish and limited to -100% to 150%, although some outlier other marine life from newly protected areas to the values extend beyond 150%. remaining fishing grounds, CPUE may actually increase in the remaining fishing grounds. This increase could allow vessels to reduce their fishing the MPA reduced fishing effort, CPUE did not sig- effort while maintaining catch levels (a decision- nificantly change [35]. In general, constant CPUE making strategy known as ‘satisficing’ in the combined with decreased effort would lead to economics, psychology, and political science liter- decreases in catch. Aside from potential fisheries atures). This phenomenon may also arise when impacts, benefits from MPAs relating to tourism catch is closely regulated outside the protected have also been widely documented [33]. area. The impact of MPAs on catch and biomass A second possible explanation is that under will depend on current fishery status and other certain circumstances, a new MPA may close off types of existing fisheries management institu- the most productive fishing grounds, leaving less tions. In fisheries that are currently overfished 9 productive areas open to fishing. If there is lit- of the MPA, Japanese fishing effort decreased by tle or no biological spillover of the region’s target 95% inside the MPA, but actually increased out- species, fishing in the remaining open areas may side the MPA by 155%, leading to a net increase become less profitable. If this is the case and in Japanese fishing in Palua’s waters of 0.2%. fishers use a ‘profit-maximizing’ decision-making Since both fleets predominantly use drifting long- strategy, this reduced profitability could lead to a lines in Palau and are therefore likely fishing for decrease in fishing effort outside the MPA. similar species that would have similar biologi- A third possible explanation is that the fixed cal responses to the MPA, the different behavioral cost of traveling to distant waters could lead to responses from these two fleets suggests that eco- a decrease in fishing outside large and distant nomics plays an important role in the spatial MPAs. Consider a scenario where the MPA gen- changes in fishing effort. erates negligible biological spillovers for the target We next investigate whether this anecdote can species and where the MPA covers a large fraction also exemplify our main finding that large-scale of a fleet’s fishing grounds. Some distant water expansion of fully protected MPAs is likely to fishing fleets that specialize in fishing this species reduce global fishing effort. We do so by following and region may have historically incurred large Palau’s “pre-MPA fishing fleet”, which consists fixed costs to transit from their home ports to the of all vessels that fished in Palau’s EEZ prior fishing grounds. If a large fraction of those his- to the 2020 MPA implementation, regardless of torical grounds are closed to fishing by an MPA, their flag. While some of these vessels reallocated fishing the remaining open area may become fishing activity to regions outside Palau’s EEZ unprofitable, so the fleet may divest entirely from (Fig. S14B), others ceased fishing entirely (Fig. those fishing grounds, and we would observe a S14A). Between 2019 and 2021, the number of decrease both inside and outside the MPA. This active vessels who fished anywhere in the world fixed cost effect could even be compounded by from Palau’s pre-MPA fishing fleet decreased in displacement to less productive fishing grounds size from 202 to 159 vessels (-21%). The pre-MPA which have lower CPUE, the second mechanism fishing fleet’s global fishing effort also decreased discussed above. by 49% between 2019 and 2021. By comparison, Our discussion of all three mechanisms is global fishing effort across all vessels decreased by speculative; we merely intend to offer potential only 1.9% between 2019 and 2021. interpretations of our results and motivate future Global reductions in fishing effort will occur research. In that spirit, we share the following across a range of habitats, which could have eco- real-world example regarding our third mecha- logical benefits for both biodiversity and fish pop- nism. In 2020 the Pacific island nation of Palau ulations. We can gauge these benefits by overlay- closed 80% of its EEZ to fishing, and the Tai- ing the predicted spatial changes in fishing effort wanese fleet, which was the largest fleet fishing in with regions of biological or ecological interest. Palau, is said to have left Palau entirely. Indeed, There are many different ways to do this, so as one an inspection of global fishing effort reveals that example we overlay our spatial predictions with this is exactly what happened. In the year fol- the spatial boundaries of Large Marine Ecosys- lowing Palau’s MPA implementation, Taiwan’s tems (LMEs) [41] (Fig. S15). LMEs represent fishing effort decreased by over 99% inside the regions in continental coastal waters characterized new MPA and by 62% outside the MPA, for a by trophically-dependent populations and higher total decrease of 97% (Fig. S13). This finding is primary productivity compared to open water corroborated by a major Taiwanese-owned fish- areas. They can thus be relevant for informing ing company that stated that the reduced size of ecosystem-based management and MPA network Palau’s fishing grounds caused by the MPA made design. Naturally, the specific changes in each it no longer financially viable to continue oper- LME will be contingent on the exact placement ating anywhere in Palau [40]. While fixed costs of new MPAs. Given that we predict the largest may substantially reduce fishing by some fleets, decreases in effort inside the boundaries of new other fleets may respond differently. For exam- MPAs, we can expect that fishing pressure will be ple, prior to 2020, Japan was the second-largest reduced most in those habitats directly protected fleet operating in Palau. After the establishment by new MPAs. A clear pattern emerges across the 10 range of hypothetical networks we considered in interventions. Such understanding will be essen- our analysis: Regardless of the MPA network sce- tial for formulating policies that effectively pro- nario, almost all LMEs will experience a decrease tect marine ecosystems and economically benefit in fishing effort compared to business-as-usual. fisheries-dependent people. Similar to understanding how predicted changes in effort would translate to predicted changes in catch, making precise predictions for the resulting 4 Methods biological effects would require explicit modeling Data processing and feature of population dynamics [42]. Since our analysis uses AIS-based vessel mon- engineering itoring as the basis of our global dataset of We build a global, spatial-temporal dataset of observed fishing effort, it is important to acknowl- AIS fishing hours and 42 model features that pre- edge that not all fishing vessels use AIS, and thus dict the location and intensity of fishing effort. our results are likely most relevant to those that For each feature, we discuss the rationale for do. AIS can be used to monitor most of the world’s its inclusion in the model and data processing large fishing vessels above 24m in length [43]; the steps below. Table 1 summarizes the complete data set in our analysis represents fishing activity list of features, measurement units, variable types, by over 110,000 fishing vessels between 2016-2021. spatial-temporal variation, and data sources. For However, most vessels below 24m do not use AIS all features, our observation unit is a 1x1 degree [43], such as many small-scale, artisanal, or subsis- pixel-year, although our model specification can tence fishing vessels. Our results should therefore theoretically handle any spatial-temporal resolu- be interpreted with caution when trying to under- tion. stand how MPAs may impact fishing effort of 1. Outcome variable. Our outcome variable small-scale vessels. Even for industrial fleets, a is AIS fishing effort (hours) from Global Fishing recent analysis using satellite imagery data found Watch (GFW) between 2016 and 2021 [24]. We that many fishing vessels do not broadcast AIS normalize the fishing effort in each pixel by the [44]. Future research could empirically examine spatial area of each pixel (m2 ) in order to account how effort by these ‘dark’ vessels respond to the for slightly different pixel sizes across the globe, implementation of MPAs. depending on the latitude. We also log trans- Our findings, indicating a decrease in global form fishing effort because it is a highly skewed fishing effort both inside and outside of MPAs distribution. Our final measurement unit is there- regardless of the hypothetical network scenario, fore log (hours/m2 ). After making predictions, we suggest that expanding fully protected MPA cov- back-transform the predicted log fishing effort to erage would likely benefit fish populations. This obtain predicted level fishing effort [45], and then could also benefit fishers who are operating in multiply this area-normalized effort by the pixel fisheries that are currently experiencing over- area to obtain absolute predicted fishing effort. We fishing. However, fishers may bear an economic do not use earlier years of GFW data (2012-2015) cost in fisheries that are currently being fished because AIS coverage and quality were generally more conservatively. This divergence underscores increasing during those years; these trends could an important challenge that international conser- cause confounding in the model because MPA vation initiatives like 30x30 confront: balancing expansion is also increasing over time. ambitious conservation goals with the livelihoods 2. MPA implementation. We delineate the of those who depend on the sea. Achieving such spatial-temporal spread of MPAs globally. For a balance is possible, and we are not arguing in each year, we determine the global network of this paper against MPA expansion or that MPAs fully protected MPAs that had been implemented will have a negative impact on fisheries. Future during or before that year. The boundaries of research could strive to furnish direct evidence MPAs come from a version of the MPA Atlas regarding both the conservation impacts and eco- downloaded in December 2020 [5]. We first fil- nomic fisheries impacts of proposed global ocean ter these MPAs to those that are listed as fully “no-take”, and also those that are either “Desig- nated” or “Established”. Implementation dates in 11 MPA Atlas frequently do not correspond to the partial MPA coverage or near MPAs with any cov- date when fishing bans begin, so for each MPA erage could experience an increase, rather than a we also performed an extensive gray literature decrease, in fishing effort post-implementation review using Google and Google Scholar to deter- Then, we quantify the fraction of the cur- mine the exact date when the no-take zone was rent year that the MPA was in place. For pixels fully implemented and enforced. When the exact inside or partially covered by MPAs, the frac- date could not be found, we next tried to deter- tion =1 for full year coverage and is >0 but mine the month, and if that could not be found <1 for partial year coverage; for pixels outside the year. We used the search terms: “mpa name MPAs, the fraction always =0. We expect that no take implementation date” and “mpa name MPAs implemented later in the year will be asso- implementation date”. Priority of dates was given ciated with a smaller reduction in annual fishing to government management plans, followed by effort. In fact, an MPA implemented later in the non-government organization reports, news arti- year could be associated with an increase in fish- cles, and blogs. This left us with 845 “no-take” ing effort if the announcement of a new MPA MPAs in the analysis for which we had an MPA causes an anticipatory increase in fishing effort implementation date. To remain consistent with prior to implementation, an effect known as the the MPA classification language provided in the ‘Blue Paradox’ [17]. To further capture anticipa- MPA Guide [4], we refer to these MPAs in the text tory effects, we also include two boolean features as “fully protected”. for whether an MPA will be implemented in a pixel We then calculate 11 MPA-based model fea- in the following year or in two years, interacted tures, which vary spatially and temporally. First, with two boolean features for whether the MPA we calculate the distance to the nearest MPA (=0 will be fully or partially covered by that MPA (this if the cell contains an MPA) and the years since procedure generates four unique features). that MPA was designated (=0 if the MPA was In the special case where a pixel overlaps with designated in the current year). The distance is multiple MPAs, the fraction of MPA overlap is calculated based on the distance to the nearest based on the union of these MPAs, and the years 1x1 degree rasterized version of the MPA bound- since designation is based on the oldest MPA. ary shapefile, while preventing travel through land 3. Environmental. Twelve features capture masses. We recalculate distances for every pixel environmental conditions that may influence fish- in every year because MPAs are implemented ing desirability. Sea surface temperature and sea over time, so the nearest MPA will change from surface temperature anomaly [46], chlorophyll- year to year. We expect that MPAs will have a A [47], and wind speed [48] vary spatially and larger effect on nearby pixels, and that the effect temporally, whereas indices for both the El Ni˜ no- may change over time following implementation Southern Oscillation [49] and Pacific Decadal (e.g., due to changes in enforcement, compliance, Oscillation [50] vary only temporally. Since these and/or transitions to new fishing grounds over variables all have higher spatial and/or temporal time). resolution than our final data set, we calculate Next, we quantify the fraction of MPA cover- both the mean and standard deviation across age for each pixel, its first degree neighbor pixels, source pixels and months for each of our 1x1 and its second degree neighbor pixels. We also degree pixel-years. generate a categorical variable for whether a pixel 4. Geographic. Seven features capture geo- is fully inside (pixel coverage = 100%), fully out- graphic features of each pixel, which vary spatially side (pixel coverage = 0%), or partially covered by but not temporally. We include numeric features an MPA (pixel coverage >0% and <100%). Since for latitude, longitude, distance to shore, dis- fishing effort is allowed near the edges of MPAs, a tance to the nearest seamount, and bathymetry pixel with only partial MPA coverage could have depth. Global Fishing Watch provides distance to a smaller reduction in fishing effort than an MPA shore and bathymetry depth for the centroids of with full coverage; indeed, if there is significant 0.01x0.01 pixels [24]; we calculate the average of ‘fishing the line’ (aggregation of fishing effort near these values for our 1x1 degree pixels. Distance MPA boundaries to capture spillover), pixels with to the nearest seamount is based on the distance between each 1x1 degree pixel centroid and over 12 37,000 seamount point locations [51]. Fish tend of ‘high seas’; the GFI also does not have com- to aggregate near seamounts, so we expect pixels plete global coverage for all EEZs, so for pixels in that are closer to seamounts to have more fish- EEZs that don’t have data from GFI we assign a ing effort. The relationship between fishing effort category value of ‘no data’). and distance to shore or bathymetry depth likely 6. Economic. Three features capture eco- depends on gear type; trawlers tend to prefer nomic conditions that may affect the profitability closer, shallower waters, while longliners tend to of fishing, and therefore fishing effort. Distance to prefer farther, pelagic waters. Latitude and longi- port, which varies spatially but not temporally, is tude flexibly capture other elements of geographic a proxy variable for the cost of fishing, with far- location that we do not directly measure. ther trips tending to cost more. GFW provides We also include categorical variables for the distance to port for the centroids of 0.01x0.01 pix- ocean [52] and mesopelagic zone [53] of the pixel. els [24]; we calculate the average of these values If a pixel falls into more than one ocean or for our 1x1 degree pixels. Fuel prices, which varies mesopelagic zone, the pixel is assigned to the cat- temporally but not spatially, also affect fishing egory that covers the largest fraction of the pixel. costs. We calculate the mean and standard devia- These variables capture potential geographic dif- tion across months in the year of the Intermediate ferences in where commercially valuable fish tend Fuel Oil (IFO) 380 fuel price [57]. For each cost to aggregate, which therefore influences fishing proxy, we expect that higher costs tend to reduce effort. fishing effort. 5. Governance. Six features describe the gov- 7. Technological. Our fishing effort outcome ernance characteristics of a pixel, which vary spa- variable is derived from satellite-based AIS ves- tially but not temporally. The first set of features sel messages. Satellite-based AIS reception varies concern presence of Exclusive Economic Zones globally, and there are some regions with consis- (EEZs), which may differ in their capacity to effec- tently poor reception (e.g., Southeast Asia). We tively manage fisheries. First, we intersect pixels capture the quality of AIS reception with two fea- with EEZs [54] and assign pixels to the EEZ with tures that vary spatially: messages per day for the largest pixel overlap using its sovereign ISO3 Type A transponders and for Type B transpon- EEZ label. To reduce the dimensionality of the ders [24]. Type A transponders are higher quality data, we group all EEZs that individually rep- and usually used by commercial vessels, whereas resent less than 1% of the pixels in the training Type B transponders tend to be lower quality and data into an ‘other’ category. Pixels that do not used by recreational vessels. overlap with any EEZs are assigned an EEZ value 8. Residual effects. Finally, we cannot cap- of ‘high seas’. Next, for each pixel, we calculate ture every feature that determines the location the fraction of the pixel area overlapping with its and intensity of fishing effort in a pixel-year, assigned EEZ (this takes a value of 0 for pixels so we include two features that capture a wide in the high seas; for pixels that overlap multiple range of the residual (or otherwise unmeasured) EEZs, this value is calculated using only the single effects. First, we calculate lagged AIS fishing effort EEZ with which it has the largest overlap). Then, (measured as log (hours/m2 ), like our outcome we calculate the distance to the nearest 1x1 degree variable), which varies spatially and temporally rasterized pixel of the EEZ boundary shapefile, [24]. We use a 1, 2, or 3 year lag, depending on the while preventing travel through land masses (=0 model’s forecast horizon (see details in the next for pixels fully inside EEZs). We also determine section). We also include a numeric variable for the name of this nearest EEZ sovereign state. We the current year to flexibly capture time trends assign each pixel to one of 7 World Bank Devel- not otherwise defined by the temporally-varying opment Indicators regions based on its EEZ [55]; features. if there is no EEZ, it is assigned to the high seas. Then using the Global Fishing Index (GFI) [56], we assign each pixel to one of 11 governance capac- ity score categories based on its EEZ (the GFI does not have data for high seas areas, so for pix- els in the high seas we assign a category value 13 Model training, out-of-sample (i.e., testing) split. We use these for optimizing performance testing, and robustness hyperparameters for a Stage 1 model (classifica- checks tion, or extensive margin), and then for optimizing hyperparameters for a Stage 2 model (regres- We train and test three separate models for three sion, or intensive margin). We create a series of separate forecast horizons t (1, 2, and 3 years). time-based folds where the assessment split comes For each forecast horizon and pixel-year observa- from the last year of data and the analysis split tion, we predict future fishing effort in t years. To uses the preceding year of data. In this way, the evaluate the performance of the three models, we folds allow us to independently test the CV per- separately perform both temporal and spatiotem- formance in predicting different years from the poral out-of-sample testing using holdout testing training dataset. We use the timetk R package to datasets. The following steps apply to each of the implement time-based CV splitting [58]. three models. 4. For each of the CV partitions, train and 1. For each pixel-year, modify the full dataset test random forest models [59] across a grid to create a new outcome variable for fishing effort of 10 hyperparameter combinations. We test an in t years using time leads by pixel. Naturally entropy-maximizing grid over two hyperparam- this reduces the size of the dataset - for any given eters - mtry (the number of features that will year, there are only so many years in the future be randomly sampled at each tree split node) with existing data. Larger forecast horizons will and min n (the minimum number of observations therefore have smaller datasets to work with. For required for further splitting each node). We use example, for t = 1, we use a dataset that includes 500 trees for all random forest models. We inde- rows of model features from 2020 and fishing effort pendently optimize hyperparameters separately in 2021, rows of model features from 2019 and fish- for both the Stage 1 and Stage 2 models, so that ing effort in 2020, etc. For t = 3, we use a dataset each gets its own set of hyperparameters. Note that includes rows of model features from 2018 that the Stage 1 model uses all observations from and fishing effort in 2021, rows of model features the analysis partition for training, while the Stage from 2017 and fishing effort in 2020, etc. 2 model is conditional on there being fishing effort 2. For each of these three datasets, temporally and therefore uses only those observations with split the data into a training dataset and a testing a non-zero fishing effort from the analysis parti- dataset. The testing dataset contains all data in tion for training. We use the ranger R package to the last year of the data, while the training dataset implement random forests [60]. contains all data from the previous years. We can 5. For the Stage 1 model, we choose the opti- therefore test the temporal out-of-sample perfor- mized hyperparameter set that maximizes ROC mance of the model to assess how well it predicts area-under-the curve (roc auc ), a model perfor- future years that were unseen during the model mance metric that measures the model’s general training. This helps us understand how well the ability to differentiate between two classes and model will perform in our simulations, which pre- is agnostic to the classification cutoff threshold. dict fishing effort in future years that have not yet We calculate disaggregated roc auc separately for been observed. Fig. S1 shows the years represented each CV fold and then average it across folds. in the training and testing datasets for each ocean 6. For the Stage 1 model and using its opti- and forecast horizon. Fig. S2A shows the result- mized hyperparameter set, we further choose the ing dataset sizes, by region (inside MPA, partial optimal classification cutoff threshold that maxi- MPA overlap, or outside MPAs). Fig. S2B shows mizes the F1 score (f meas ), which is the harmonic the number of distinct MPAs represented in each mean of precision and recall. Precision and recall of these datasets, by region (i.e., distinct MPAs depend on the number of true positives (TP) and that are fully covered by inside MPA pixels, MPAs false negatives (FN). Precision is TP / (TP + FP), that are partially covered, and MPAs that are the and recall is TP / (TP + FN). nearest MPA to outside MPA pixels). 7. For the Stage 2 model, we choose the 3. Using the training dataset, create cross- optimized hyperparameter set that maximizes validation (CV) partitions which each have an rsq trad. rsq trad is calculated using the tradi- analysis (i.e., training) split and an assessment tional definition of R squared which uses sum 14 of squares, and allows for negative values. This 13. We train the final Stage 1 and Stage is opposed to rsq which forces the values to be 2 models using the entire dataset (which uses between 0 and 1. rsq trad is therefore a more con- data from all years). Again, we use CV to tune servative estimate of model performance [61]. We hyperparameters for each model, as in Steps 3- calculate the back-transformation smearing coef- 8. For the Stage 2 model, we calculate the final ficient for each fold based on the predictions from back-transformation smearing coefficient using the the training split, and then back-transform the entire dataset. These final Stage 1 and Stage test split predictions to get level predictions. Using 2 models, in combination with the final back- these level predictions, we calculate rsq trad for transformation smearing coefficient, will be used each CV fold and then summarize it as the average to make predictions in the simulations. rsq trad across folds. 14. As an additional out-of-sample perfor- 8. Using the optimized hyperparameter sets for mance test, we repeat Steps 2-12, but this time the Stage 1 and Stage 2 models, we train mod- doing a spatiotemporal training/testing data split els using the full training dataset. For the Stage instead of just a temporal split. For this test we 2 models, we calculate the back-transformation split the data into ten different training datasets smearing coefficient using the training data, and and ten matched leave-one-location-out testing then back-transform the testing data predictions datasets, one for each ocean. The testing dataset using this value to get level predictions. for each ocean contains all data in the last year 9. Using the Stage 1 and Stage 2 trained mod- of the data, while the training dataset contains all els, we make out-of-sample predictions for the data from the other nine oceans in the previous testing dataset, which has not been used up to this years. In this way, we test the spatiotemporal out- point. For Stage 1 predictions, we use the opti- of-sample performance of the model to assess how mized classification threshold cutoff for classifying well it predicts future years in spatial areas that each prediction. were unseen during the model training. This helps 10. We combine each Stage 1 and Stage 2 us understand how well the model will perform out-of-sample predictions into full hurdle model in our simulations, which predict fishing effort in predictions. For each observation, we simply mul- future years that have not yet been observed, and tiply the stage 1 classification prediction (0 or 1) in areas that have new MPAs but which did not by the stage 2 prediction (hours/m2 of fishing have MPAs in the training dataset. Using these effort). ten splits, we build and test ten optimized models 11. For the Stage 1 model, we calculate out-of- using the full training and tuning procedure out- sample roc auc, f meas, precision, and recall. We lined in Steps 3-12. This gives us out-of-sample do this for all pixels globally, as well as 3 mutually performance measures for each of our ten models exclusive regions (inside MPAs, partial overlap, corresponding to the out-of-sample performance and outside MPAs). This gives us the Stage 1 in each of the ten holdout testing oceans (Fig. S5). performance for predicting out-of-sample future 15. To test the robustness of the baseline model fishing effort (Fig. S4A, Table S1). specification from Steps 1-12 (random forest with 12. To quantify stage 2 performance for pre- all model features), we test three additional model dicting future fishing effort, we calculate out- specifications: 1) a random forest with just the of-sample rsq trad, rsq, rmse (root-mean-square model feature of lagged log fishing effort (using error), and nrmse (normalized root-mean-square- the entire procedure described above); 2) logistic error, which is rmse divided by the standard regression for stage 1 (using the stats::glm func- deviation of the outcome in the testing data). tion) and linear regression for stage 2 (using the Note that rsq trad, rsq, and nrmse are unitless, stats::lm function) [62], with all model features while nrmse is in units of the outcome variable (similar to what is described above, but without (hours/m2 ). We do this for all pixels globally to cross-validation since these models have no hyper- obtain global performance metrics, as well as for parameters to tune); and 3) logistic regression for three mutually exclusive regions (inside MPAs, stage 1 and linear regression for stage 2, with partial MPA overlap, and outside MPAs) (Fig. just the model feature of lagged log fishing effort S4B, Table S1). (again similar to what is described above, bbut 15 without cross-validation since these models have Simulations no hyperparameters to tune) (Fig. S6). 1. Using the final trained Stage 1 and Stage 2 mod- els and smearing coefficient, make predictions for Model interpretability each of the three forecast horizons using the actual fully protected MPA network observed in 2020 To better understand why the model gives certain (which covers 2.5% of the oceans). This is our predictions and to increase the model’s inter- business-as-usual (BAU) scenario of what would pretability, we calculate Shapley values for the happen in a world without any new MPAs. Since final trained Stage 1 and Stage 2 models (i.e., the the predicted outcome variable is area-normalized models that are trained using the entire dataset). effort in hours/m2 , we multiply this number by Shapley values quantify the absolute contribu- the area of each pixel to get predicted fishing hours tion of each feature towards the prediction of per pixel. each individual observation (i.e., it is a mea- 2. Using the final trained Stage 1 and Stage sure of local explanation) [63]. To translate local 2 models and smearing coefficient, make predic- observation-level explanations into measures of tions for each of the three forecast horizons under global explanation, we: a series of hypothetical MPA networks. These net- 1. We select a random sample of 1000 obser- works were designed for different objectives, and vations from the training dataset, and a random each cover different percentages of the world’s sample of 100 observations to represent the back- oceans, from current levels up to 30%. For each of ground data. For each observation, we use the Ker- these hypothetical networks, we include the loca- nel SHAP method to calculate the approximate tions of current fully protected MPAs from 2020 Shapley value, using the kernelshap R package in order to be able to directly compare predictions [64]. For each feature, we take the mean abso- to the BAU counterfactual scenario (which only lute value from across the observations to obtain includes fully protected MPAs from 2020): a single Shapley value (Fig. S7). - Networks from Sala et al. 2021 [20]: We 2. We then group the features into the 7 gen- consider four network types from this analysis eral categories defined in the “Data processing and that each prioritize different objectives: biodiver- feature engineering” section and Table 1 (except sity protection; carbon sequestration; food provi- including the previous fishing effort and year fea- sion; and a multi-objective network that equally tures individually); based on the additive property weights all three objectives. For each network of Shapley values, sum the observation-wise val- type, we use seven different protection targets: 3%; ues across features for each feature group and 5%; 10%; 16%, 20%; and 30%. observation, and then take the mean absolute fea- - Network to protect Ecologically or Biolog- ture group values from across all observations to ically Significant Marine Areas (EBSAs)[27, 65]: obtain a single Shapley value per feature group This network was proposed by an expert-driven (Fig. S8). We use the following groups: Previ- process facilitated by the Convention on Biological ous fishing effort; Geographic (spatial geographic Diversity, and covers roughly 20% of the world’s features including bathymetry depth, distance to oceans. shore, mesopelagic region, ocean, latitude, and - Network in Visalli et al. 2020 [26]: This longitude); Environmental (spatiotemporal and network proposes areas for priority protection in temporal environmental features, including sea marine areas beyond national jurisdiction, and surface temperature (SST), SST anomoly, wind covers roughly 16% of the world’s oceans. speed, chlorophyll concentration, ENSO and PDO - Random: We randomly rank the pixels and indices); AIS reception (class A and class B AIS successively close pixels until a desired target is reception); MPA (MPA-related features); Gover- reached. We use 3%; 5%; 10%; 16%, 20%; and 30% nance (EEZ indicators and Global Fishing Index area protected targets. governance capacity index); Economic (distance - Most-fished areas: We first rank pixels in to port and fuel price); and Year. descending order by the amount of fishing effort 3. We create dependence plots for each feature (hours) in 2020. We use 3%; 5%; 10%; 16%, 20%; that plot the Shapley value against the feature and 30% area protected targets. For each target, value (Figs. S9 - S10). 16 we close the top most-fished pixels necessary to Gaines, Ray Hilborn, Jack Kittinger, John Lyn- achieve the target. ham, Juan Mayorga, Michael Melnychuk, Dan - Unfished areas: We first select pixels that Ovando, Randi Rotjan, Dale Squires, and Juan have zero fishing effort in 2020. We use 3%; 5%; Carlos Villase˜nor-Derbez for their helpful feed- 10%; 16%, 20%; and 30% area protected targets. back at various stages of this research. We thank For each target, we close a random sample of these two anonymous referees for helping us to improve unfished pixels necessary to achieve the target. the manuscript. The findings, interpretations, and 3. For each hypothetical MPA network and conclusions expressed in this paper are entirely forecast horizon, calculate the absolute and per- those of the authors. They do not necessarily centage difference between the hypothetical MPA represent the views of the World Bank and its network predictions and the BAU network predic- affiliated organizations, or those of the Executive tions. We make calculate these differences for all Directors of the World Bank or the governments pixels globally, as well as for the three mutually they represent. exclusive pixel regions (fully inside MPAs, partial MPA overlap, and fully outside MPAs). Funding We gratefully acknowledge the financial support Data, Materials, and Software from Jon Arnhold that enabled this research. Availability Peer-Reviewed Version of Article. This All code necessary for reproducing the analy- article was also published in the Proceedings of the sis can be found at https://zenodo.org/records/ National Academy of Sciences : https://doi.org/ 11625791 [66]. Most data necessary for repro- 10.1073/pnas.2400592121 ducing the analysis are also available directly in this repository, including our fishing effort out- come variable and all model feature data except References those relating to bunker fuel prices. The bunker fuel price data used in our analysis are subject [1] S Woodley, et al., A review of evidence for to restricted use and are not available for pub- area-based conservation targets for the post- lic redistribution. Information on obtaining these 2020 global biodiversity framework. Parks data directly from Bunker Index can be found at 25, 31–46 (2019). https://bunkerindex.com/ [2] I Eckert, A Brown, D Caron, F Riva, LJ We use the R programming language for all Pollock, 30× 30 biodiversity gains rely on code [62]. We use the targets package to ensure national coordination. Nature Communica- full reproducibility of the data processing and tions 14, 7113 (2023). analysis pipeline [67], and the renv package to ensure a reproducible package environment [68]. [3] E Dinerstein, et al., A Global Deal For We use the tidyverse suite of packages for all Nature: Guiding principles, milestones, and data wrangling tasks [69], and the tidymodels targets. Science Advances 5, eaaw2869 suite of packages for the general machine learn- (2019). ing framework [70]. We use functions from the yardstick package for calculating all model perfor- [4] K Grorud-Colvert, et al., The MPA Guide: mance metrics. For spatial operations, we use the A framework to achieve global goals for the sf [71], terra [72], stars [73], exactextractr [74], ocean. Science 373, eabf0861 (2021). and raster [75] packages. For all spatial joins, area calculations, and distance calculations, we use the [5] MC Institute, Mpatlas, version accessed and Mollweide equal-area projection. downloaded december 2020. (2020). Acknowledgments. We gratefully thank [6] MD Smith, JE Wilen, Economic impacts of Anannya Deshmukh for building our dataset of marine reserves: the importance of spatial fully protected MPA implementation dates. We behavior. Journal of Environmental Eco- thank Kristina Boerder, Ginny Farmer, Steve nomics and Management 46, 183–206 (2003). 17 [7] R Hilborn, Are MPAs effective? ICES Jour- fishery catch rates. Nature Communications nal of Marine Science 75, 1160–1162 (2018). 11, 979 (2020). [8] S Lester, et al., Biological effects within [17] GR McDermott, KC Meng, GG McDonald, no-take marine reserves: a global synthesis. CJ Costello, The blue paradox: Preemptive Marine Ecology Progress Series 384, 33–46 overfishing in marine reserves. Proceedings of (2009). the National Academy of Sciences 116, 5319– 5325 (2019). [9] R Albrecht, et al., Protected area downgrad- ing, downsizing, and degazettement (paddd) [18] G Englander, Information and Spillovers in marine protected areas. Marine Policy from Targeting Policy in Peru’s Anchoveta 129, 104437 (2021). Fishery. American Economic Journal: Eco- nomic Policy 15, 390–427 (2023). [10] F Bastardie, JR Nielsen, T Miethe, DIS- PLACE: a dynamic, individual-based model [19] J Hampton, et al., Limited conservation effi- for spatial fishing planning and effort dis- cacy of large-scale marine protected areas for placement — integrating underlying fish pacific skipjack and bigeye tunas. Frontiers population models. Canadian Journal of in Marine Science 9, 2817 (2023). Fisheries and Aquatic Sciences 71, 366–386 (2014). [20] E Sala, et al., Protecting the global ocean for biodiversity, food and climate. Nature 592, [11] AC Haynie, DF Layton, An expected profit 397–402 (2021). model for monetizing fishing location choices. Journal of Environmental Economics and [21] SP Greenstreet, HM Fraser, GJ Piet, Using Management 59, 165–176 (2010). MPAs to address regional-scale ecological objectives in the North Sea: modelling the [12] JK Abbott, AC Haynie, What are we pro- effects of fishing effort displacement. ICES tecting? Fisher behavior and the unintended Journal of Marine Science 66, 90–100 (2009). consequences of spatial closures as a fish- ery management tool. Ecological Applications [22] ES Klein, GM Watters, What’s the catch? 22, 762–777 (2012). Profiling the benefits and costs associated with marine protected areas and displaced [13] V Kahui, WRJ Alexander, A Bioeconomic fishing in the Scotia Sea. PLoS One 15, Analysis of Marine Reserves for Paua e0237425 (2020). (Abalone) Management at Stewart Island, New Zealand. Environmental and Resource [23] HJ Albers, MF Ashworth, Economics of Economics 40, 339–367 (2008). marine protected areas: Assessing the lit- erature for marine protected area network [14] S Hynes, H Gerritsen, B Breen, M Johnson, expansions. Annual Review of Resource Eco- Discrete choice modelling of fisheries with nomics 14, 533–554 (2022). nuanced spatial information. Marine Policy 72, 156–165 (2016). [24] DA Kroodsma, et al., Tracking the global footprint of fisheries. Science 359, 904–908 [15] J Lynham, Fishing activity before closure, (2018). during closure, and after reopening of the northeast canyons and seamounts marine [25] E Commission, Eu biodiversity strategy for national monument. Scientific Reports 12, 2030: Bringing nature back into our live 917 (2022). (2020). [16] J Lynham, A Nikolaev, J Raynor, T Vilela, [26] ME Visalli, et al., Data-driven approach for JC Villase˜ nor-Derbez, Impact of two of the highlighting priority areas for protection in world’s largest protected areas on longline marine areas beyond national jurisdiction. 18 Marine Policy 122, 103927 (2020). [37] R Hilborn, et al., Effective fisheries manage- ment instrumental in improving fish stock [27] DC Dunn, et al., The convention on biological status. Proceedings of the National Academy diversity’s ecologically or biologically signifi- of Sciences 117, 2218–2224 (2020). cant areas: origins, development, and current status. Marine Policy 49, 137–145 (2014). [38] TD White, et al., Tracking the response of industrial fishing fleets to large marine pro- [28] Y Zeng, LP Koh, DS Wilcove, Gains in bio- tected areas in the pacific ocean. Conserva- diversity conservation and ecosystem services tion Biology 34, 1571–1578 (2020). from the expansion of the planet’s protected areas. Science Advances 8, eabl9885 (2022). [39] JB Kellner, I Tetreault, SD Gaines, RM Nis- bet, Fishing the line near marine reserves in [29] CU Soykan, T Eguchi, S Kohin, H Dewar, single and multispecies fisheries. Ecological Prediction of fishing effort distributions using Applications 17, 1039–1054 (2007). boosted regression trees. Ecological Applica- tions 24, 71–83 (2014). [40] Island Times, PITI suspends fish- ing in Palau, blames PNMS [30] BC Prest, CJ Wichman, K Palmer, RCTs for exit (https://islandtimes.org/ Against the Machine: Can Machine Learn- piti-suspends-fishing-in-palau-blames-pnms-for-exit/) ing Prediction Methods Recover Experimen- (2020) Accessed: 2023-11-08. tal Treatment Effects? Journal of the Association of Environmental and Resource [41] K Sherman, AM Duda, Large marine ecosys- Economists 10, 1231–1264 (2023). tems: an emerging paradigm for fishery sus- tainability. Fisheries 24, 15–26 (1999). [31] W Cai, et al., Increased enso sea surface temperature variability under four ipcc emis- [42] D Ovando, O Liu, R Molina, A Parma, C sion scenarios. Nature Climate Change 12, Szuwalski, Global effects of marine protected 228–231 (2022). areas on food security are unknown. Nature 621, E34–E36 (2023). [32] HJ Mediodia, V Kahui, I Noy, Sea surface temperature and tuna catch in the Eastern [43] M Taconet, D Kroodsma, JA Fernandes, Pacific Ocean under climate change. Marine Global atlas of ais-based fishing activ- Resource Economics 38, 329–351 (2023). ity—challenges and opportunities. (2019). [33] MJ Costello, Evidence of economic benefits [44] FS Paolo, et al., Satellite mapping reveals from marine protected areas. Scientia Marina extensive industrial activity at sea. Nature 88, e080 (2024). 625, 85–91 (2024). [34] S Medoff, J Lynham, J Raynor, Spillover ben- [45] N Duan, Smearing estimate: a nonparamet- efits from the world’s largest fully protected ric retransformation method. Journal of the mpa. Science 378, 313–316 (2022). American Statistical Association 78, 605–610 (1983). [35] F Favoretto, C L´ opez-Sag´astegui, E Sala, O Aburto-Oropeza, The largest fully pro- [46] B Huang, et al., Noaa 0.25-degree daily opti- tected marine area in north america does not mum interpolation sea surface temperature harm industrial fishing. Science Advances 9, (oisst), version 2.1. NOAA National Centers eadg0709 (2023). for Environmental Information (2020). [36] C Costello, et al., Global fishery prospects [47] C Hu, Z Lee, B Franz, Chlorophyll aal- under contrasting management regimes. Pro- gorithms for oligotrophic oceans: A novel ceedings of the National Academy of Sciences approach based on three-band reflectance dif- 113, 5125–5129 (2016). ference. Journal of Geophysical Research: 19 Oceans 117 (2012). [59] L Breiman, Random forests. Machine learn- ing 45, 5–32 (2001). [48] C Mears, T Lee, L Ricciardulli, X Wang, F Wentz, Improving the accuracy of the [60] MN Wright, A Ziegler, ranger: A fast imple- cross-calibrated multi-platform (ccmp) ocean mentation of random forests for high dimen- vector winds. Remote Sensing 14, 4230 sional data in C++ and R. Journal of (2022). Statistical Software 77, 1–17 (2017). [49] N Rayner, et al., Global analyses of sea sur- [61] M Kuhn, D Vaughan, E Hvitfeldt, yard- face temperature, sea ice, and night marine stick: Tidy Characterizations of Model Per- air temperature since the late nineteenth formance, (2023) R package version 1.2.0. century. Journal of Geophysical Research: Atmospheres 108 (2003). [62] R Core Team, R: A Language and Envi- ronment for Statistical Computing (R Foun- [50] N Mantua, The pacific decadal oscillation: a dation for Statistical Computing, Vienna, brief overview for non-specialists. Encyclope- Austria), (2023). dia of Environmental Change (1999). [63] SM Lundberg, SI Lee, A unified approach [51] C Yesson, et al., List of seamounts in the to interpreting model predictions. Advances world oceans - An update (2020). in neural information processing systems 30 (2017). [52] FM Institute, Global oceans and seas, version 1. available online at https:// [64] M Mayer, D Watson, kernelshap: Kernel www.marineregions.org/, https://doi.org/10. SHAP, (2023) R package version 0.3.7. 14284/542 (2021). [65] Ebsa statistics using r (https://github.com/ [53] TT Sutton, et al., A global biogeographic iobis/ebsa np) (2018) Accessed: Novermber classification of the mesopelagic zone. Deep 20, 2022. Sea Research Part I: Oceanographic Research Papers 126, 85–102 (2017). [66] G McDonald, emlab-ucsb/mpa-fishing-effort- redistribution: Public release to accompany [54] FM Institute, Maritime boundaries geo- published paper (2024). database: Maritime boundaries and exclusive economic zones (200nm), version 11. avail- [67] WM Landau, The targets r package: able online at https://www.marineregions. a dynamic make-like function-oriented org/, https://doi.org/10.14284/386 (2019). pipeline toolkit for reproducibility and high- performance computing. Journal of Open [55] V Arel-Bundock, N Enevoldsen, C Yetman, Source Software 6, 2959 (2021). countrycode: An r package to convert country names and country codes. Journal of Open [68] K Ushey, H Wickham, renv: Project Environ- Source Software 3, 848 (2018). ments, (2023) R package version 1.0.2. [56] J Spijkers, et al., Diversity of global fisheries [69] H Wickham, et al., Welcome to the tidy- governance: Types and contexts. Fish and verse. Journal of Open Source Software 4, Fisheries 24, 111–125 (2023). 1686 (2019). [57] Bunker Index, Bunker Index BIX World [70] M Kuhn, H Wickham, Tidymodels: a collec- Indices (2021). tion of packages for modeling and machine learning using tidyverse principles., (2020). [58] M Dancho, D Vaughan, timetk: A Tool Kit for Working with Time Series, (2023) R package [71] EJ Pebesma, , et al., Simple features for r: version 2.8.3. standardized support for spatial vector data. R J. 10, 439 (2018). 20 [72] RJ Hijmans, terra: Spatial Data Analysis, (2023) R package version 1.7-29. [73] E Pebesma, R Bivand, Spatial Data Sci- ence: With applications in R. (Chapman and Hall/CRC, London), p. 352 (2023). [74] Daniel Baston, exactextractr: Fast Extraction from Raster Datasets using Polygons, (2022) R package version 0.9.1. [75] RJ Hijmans, raster: Geographic Data Analy- sis and Modeling, (2023) R package version 3.6-20. 21 Table 1: Data sources for all included model features, including the units, feature type, whether it varies spatially, whether it varies temporally, and the source name and reference citation. No. Feature (units) Type Spatial Temporal Source Ref. MPA implementation 1 Nearest MPA: distance (m) Numeric Yes Yes MPA Atlas, December 2020 [5] 2 Nearest MPA: years since designation (years) Numeric Yes Yes MPA Atlas, December 2020 [5] 3-5 Spatial coverage: fraction of pixel, 1st and 2nd neighbor Numeric Yes Yes MPA Atlas, December 2020 [5] 6 Spatial coverage: inside, outside, or partial Categorical Yes Yes MPA Atlas, December 2020 [5] 7 Temporal coverage: fraction of year Numeric Yes Yes MPA Atlas, December 2020 [5] 8-11 Future coverage: inside or partial, 1 and 2 year leads Boolean Yes Yes MPA Atlas, December 2020 [5] Environmental (mean and sd for each) 12-15 Sea surface temperature and anomaly (°C) Numeric Yes Yes NOAA 0.25-deg Daily OI SST V2.1 [46] 16-17 Chlorophyll-A (mg/m3 ) Numeric Yes Yes Aqua MODIS Chl-a 4km monthly [47] 18-19 Wind speed (m/s) Numeric Yes Yes CCMP Wind Vector V2.1, 4x daily [48] 20-21 no Southern Oscillation index El Ni˜ Numeric No Yes NOAA ENSO 3.4 Index [49] 22-23 Pacific Decadal Oscillation index Numeric No Yes NOAA PDO Index [50] Geographic 24-25 Latitude and longitude (°) Numeric Yes No Global Fishing Watch [24] 26 Shore: nearest distance (m) Numeric Yes No Global Fishing Watch [24] 27 Seamount: nearest distance (m) Numeric Yes No Yesson, et al. 2020 [51] 28 Bathymetry: depth (m) Numeric Yes No Global Fishing Watch [24] 29 Ocean Categorical Yes No Marine Regions Global Oceans [52] 30 Mesopelagic zone Categorical Yes No Biogeographic mesolpelagic zones [53] Governance 31 Exclusive Economic Zone: sovereign state Categorical Yes No Marine Regions V11 [54] 32 Exclusive Economic Zone: fraction of pixel coverage Numeric Yes No Marine Regions V11 [54] 33 Exclusive Economic Zone: nearest distance (m) Numeric Yes No Marine Regions V11 [54] 33 Exclusive Economic Zone: nearest sovereign state Numeric Yes No Marine Regions V11 [54] 34 World Bank Development Indicators region Categorical Yes No R ‘countrycode’ package [55] 35 Global Fishing Index governance capacity Categorical Yes No Global Fishing Index [56] Economic 36 Distance from port (m) Numeric Yes No Global Fishing Watch [24] 37-38 IFO 380 fuel price, mean and sd (USD/MT) Numeric No Yes Bunker Index [57] Technological 39-40 AIS reception: Type A and Type B transponders (pings/day) Numeric Yes No Global Fishing Watch [24] Residual effects 41 AIS fishing effort: 1, 2, or 3 year lag (log(hours / m2 )) Numeric Yes Yes Global Fishing Watch [24] 42 Year Numeric No Yes Global Fishing Watch [24] 22 1 2 Supporting Information for 3 Global expansion of marine protected areas and the redistribution of fishing effort 4 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 5 Corresponding Author: Gavin McDonald 6 E-mail: gmcdonald@bren.ucsb.edu 7 This PDF file includes: 8 Supporting text 9 Figs. S1 to S15 10 Table S1 11 SI References Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 1 of 18 12 Supporting Information Text 13 Materials and Methods. The first set of figures and tables in the Supporting Information are those referenced in the Materials 14 and Methods section of the main text (Figs. S1 - S10 and Table S1). The next figures show our simulation prediction results 15 disaggregated by forecast horizon region (inside MPAs, partial overlap with MPAs, outside MPAs, and global), both in terms 16 of the number of fished pixels (Fig. S11) and in terms of the amount of effort (Fig. S12). 17 Palau case study. We then use Palau as a case study to look at the historically observed fishing effort from 2016-2021 inside 18 and outside the Palau National Marine Sanctuary, which was implemented in 2020 (Fig. S13). We do this for the two largest 19 fishing fleets in Palau, Taiwan and Japan. We also consider the fleets of Taiwanese and Japanese vessels that fished in Palau 20 prior to the MPA implementation (which we call the “pre-MPA fishing fleet”) and look at where these vessels fished before and 21 after the MPA implementation (Fig. S14). 22 Results by Large Marine Ecosystem. For each each Large Marine Ecosystem (1) and MPA network scenario of 30% coverage, 23 we: 1) calculate the predicted total business-as-usual counterfactual fishing hours after a 3 year time horizon; 2) calculate the 24 predicted total fishing hours in the MPA network scenario after a 3 year time horizon; and 3) calculate the percentage change 25 in fishing hours in the MPA network scenario relative to the business-as-usual (Fig. S15). Table S1. Temporal out-of-sample global performance for Stage 1 classification models and Stage 2 regression models, for each of the three forecast horizon models. Model Metric Forecast horizon: 1 Forecast horizon: 2 Forecast horizon: 3 Stage 1 f_meas 0.912 0.909 0.91 Stage 1 precision 0.9 0.889 0.907 Stage 1 recall 0.924 0.931 0.913 Stage 1 roc_auc 0.97 0.967 0.969 Stage 2 nrmse 0.409 0.469 0.423 Stage 2 rmse 6.10e-07 6.98e-07 6.31e-07 Stage 2 rsq 0.898 0.853 0.861 Stage 2 rsq_trad 0.832 0.78 0.821 2 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor Model features Forecast horizon: 1 Outcome variable Model features Forecast horizon: 2 Outcome variable Model features Forecast horizon: 3 Outcome variable 16 17 18 19 20 21 20 20 20 20 20 20 Data partition Training Testing Fig. S1. Summary of the years contained in the training and testing datasets for each forecast horizon (this is the same for each of the ten out-of-sample oceans). The rows of each dataset contain numerous model features (e.g., sea surface temperature) and a single outcome variable (i.e., hours/m2 ). The arrows indicate which outcome variable year is being predicted by which model feature year. For example, in the training dataset for a forecast horizon of 1, model features from 2016 are used to predict fishing effort in 2017; in the testing dataset for a forecast horizon of 2, model features from 2019 are used to predict fishing effort in 2021. All testing datasets are predicting fishing effort in 2021. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 3 of 18 (A) Inside MPAs Partial MPA overlap Outside MPAs Forecast horizon: 1 Forecast horizon: 2 Forecast horizon: 3 0 0 0 0 0 0 0 0 0 00 00 00 00 00 00 00 00 00 00 ,0 0 0 0 1, 2, 2, 4, 6, 8, 0, 0, 0, 50 10 15 20 Number pixel−years (B) Inside MPAs Partial MPA overlap Outside MPAs Forecast horizon: 1 Forecast horizon: 2 Forecast horizon: 3 0 10 20 30 40 50 0 0 0 0 0 0 0 0 20 40 60 20 40 60 Number distinct fully covered (Inside), partially covered (Partial), or nearest MPAs (Outside) Training Testing Fig. S2. Summary of the (A) training and temporal out-of-sample testing dataset observations (pixel-years), and (B) the number of distinct MPAs represented in the training and temporal out-of-sample testing datasets. Summaries are shown for each of the three forecast horizon models, by region (i.e., distinct MPAs that are fully covered by inside MPA pixels, distinct MPAs that are partially covered, and distinct MPAs that are the nearest MPA to outside MPA pixels). 4 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor CV fold: 1 CV fold: 2 CV fold: 3 Model features Forecast horizon: Outcome variable 1 Model features Forecast horizon: Outcome variable 2 Model features Forecast horizon: Outcome variable 3 16 17 18 19 20 16 17 18 19 20 16 17 18 19 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 CV split Analysis Assessement Fig. S3. Summary of the years represented in the cross-validation (CV) assessment and analysis splits for each fold and forecast horizon (this is the same for each of the ten out-of-sample oceans). The rows of each split contain numerous model features (e.g., sea surface temperature) and a single outcome variable (i.e., hours/m2 ). The arrows indicate which outcome variable year is being predicted by which model feature year. For example, in the analysis split for CV fold 1 and a forecast horizon of 1, model features from 2019 are used to predict fishing effort in 2020; in the assessment split for CV fold 2 and forecast horizon of 2, model features from 2017 are used to predict fishing effort in 2019. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 5 of 18 (A) Stage 1 models Inside MPAs Partial MPA overlap Outside MPAs Global 1.00 0.75 0.50 f_meas 0.25 0.00 1.00 0.75 0.50 precision 0.25 0.00 1.00 0.75 0.50 recall 0.25 0.00 1.00 0.75 0.50 roc_auc 0.25 0.00 1 2 3 1 2 3 1 2 3 1 2 3 Forecast horizon (years) (B) Stage 2 models Inside MPAs Partial MPA overlap Outside MPAs Global 1.00 0.75 0.50 nrmse 0.25 0.00 6e−07 4e−07 rmse 2e−07 0e+00 1.00 0.75 0.50 rsq 0.25 0.00 1.00 0.75 0.50 rsq_trad 0.25 0.00 1 2 3 1 2 3 1 2 3 1 2 3 Forecast horizon (years) Fig. S4. Temporal out-of-sample performance for (A) Stage 1 classification models and (B) Stage 2 regression models, for different forecast horizons, performance metrics, and regions (the Global region aggregates all pixels). 6 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor (A) Stage 1 models (B) Stage 2 models 1.00 1.00 0.75 0.75 0.50 f_meas 0.50 nrmse 0.25 0.25 0.00 0.00 1.00 5e−06 0.75 4e−06 3e−06 0.50 precision rmse 2e−06 0.25 1e−06 0.00 0e+00 1.00 1.00 0.75 0.75 0.50 recall 0.50 rsq 0.25 0.25 0.00 0.00 1.00 1.00 0.75 0.75 0.50 roc_auc 0.50 rsq_trad 0.25 0.25 0.00 0.00 1 2 3 1 2 3 Forecast horizon (years) Forecast horizon (years) Fig. S5. Spatiotemporal out-of-sample performance for (A) Stage 1 classification models and (B) Stage 2 regression models for each forecast horizon and different performance metrics. Each point represents a different out-of-sample testing ocean, and the points are jittered to avoid visual overlap. Boxplots and violin plots show distributions of values from across the ten testing oceans. The boxplots for each bin show the median, 25th percentile, and 75th percentile. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 7 of 18 (A) Stage 1 models Forecast horizon: 1 Forecast horizon: 2 Forecast horizon: 3 1.00 0.75 0.50 f_meas 0.25 0.00 1.00 0.75 0.50 precision 0.25 0.00 1.00 0.75 0.50 recall 0.25 0.00 1.00 0.75 0.50 roc_auc 0.25 0.00 (B) Stage 2 models Forecast horizon: 1 Forecast horizon: 2 Forecast horizon: 3 2.0 1.5 1.0 nrmse 0.5 0.0 3e−06 2e−06 rmse 1e−06 0e+00 1.00 0.75 0.50 rsq 0.25 0.00 1 0 −1 rsq_trad −2 −3 Baseline (RF with all model features) RF with just lagged fishing model feature LM with all model features LM with just lagged fishing model feature Fig. S6. Temporal out-of-sample performance for (A) Stage 1 classification models and (B) Stage 2 regression models, using four model specifications: 1) baseline using random forest (RF) with all model features; 2) random forest (RF) with just the model feature of lagged log fishing effort; 2) linear model (LM) using logistic regression for stage 1 and linear regression for stage 2, with all model features; and 4) linear model (LM) using logistic regression for stage 1 and linear regression for stage 2, with just the model feature of lagged log fishing effort. 8 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor Stage 1 Stage 2 AIS fishing effort: 1, 2, or 3 year lag (log(hour/m^2)) Sea surface temperature, mean (degree C) AIS reception: Type A transponders (pings/day) AIS reception: Type Type B transponders (pings/day) Latitude Mesopelagic zone Sea surface temperature, SD (degree C) Shore: nearest distance (m) Distance from port (m) Longitude Wind speed, mean (m/s) Bathymetry: depth (m) Wind speed, SD (m/s) Chlorophyll−A, mean (mg/m^3) Exclusive economic zone: nearest distance (m) MPA future coverage: partial, 2 year lead MPA future coverage: inside, 2 year lead MPA future coverage: inside, 1 year lead MPA future coverage: partial, 1 year lead Temporal coverage: fraction of year with MPA MPA Spatial coverage: fraction of pixel MPA spatial coverage: inside, outside, or partial MPA Spatial coverage: fraction of pixels, 1st neighbor Nearest MPA: distance (m) Exclusive economic zone: sovereign state Ocean Pacific Decadal Oscillation index, SD IFO 380 fuel price, SD (USD/MT) MPA Spatial coverage: fraction of pixels, 2nd neighbor IFO 380 fuel price, mean (USD/MT) El Niño Southern Oscillation index, SD Year El Niño Southern Oscillation index, mean Pacific Decadal Oscillation index, mean Sea surface temperature anomaly, SD (degree C) Exclusive economic zone: fraction of pixel coverage Chlorophyll−A, SD (mg/m^3) Sea surface temperature anomaly, mean (degree C) Seamount: nearest distance (m) World Bank Development Indicators region Nearest MPA: years since designation (years) Global Fishing Index governance capacity Exclusive economic zone: nearest sovereign state 0.00 0.05 0.10 0.15 0.0 0.4 0.8 1.2 Mean absolute Shapley value MPA implementation feature All other features Fig. S7. Mean absolute Shapley values from final trained Stage 1 and Stage 2 models for a 1 year forecast horizon. Bars are color-coded by whether or not the feature is related to MPA implementation. Bars are arranged in descending order based on the feature importance from the Stage 1 model. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 9 of 18 Stage 1 Stage 2 AIS fishing effort: 1, 2, or 3 year lag Environmental Technological Geographic MPA implementation Governance Economic Year 0.00 0.05 0.10 0.15 0.0 0.4 0.8 1.2 Sum of mean absolute Shapley value by feature group MPA implementation feature All other features Fig. S8. Sum of mean absolute Shapley values by feature group from final trained Stage 1 and Stage 2 models for a 1 year forecast horizon, aggregated by feature group. We use the following groups: Previous fishing effort; Geographic (spatial geographic features including bathymetry depth, distance to shore, mesopelagic region, distance to seamount, ocean, latitude, and longitude); Environmental (spatiotemporal and temporal environmental features, including sea surface temperature (SST), SST anomaly, wind speed, chlorophyll concentration, ENSO and PDO indices); Technological (class A and class B AIS transponder, reception); MPA implementation (MPA-related features); Governance (EEZ sovereign state, nearest EEZ sovereign state, distance to nearest EEZ World Bank region, and Global Fishing Index governance capacity index); Economic (distance to port and fuel price); and Year. Bars are arranged in descending order based on the feature importance from the Stage 1 model. 10 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor AIS fishing effort: 1, 2, or 3 AIS reception: Type A transponders AIS reception: Type Type B Bathymetry: depth (m) year lag (log(hour/m^2)) (pings/day) transponders (pings/day) 0.1 0.05 0.10 0.2 0.0 0.05 0.00 0.0 0.00 −0.1 −0.05 −0.05 −0.2 −0.10 −0.10 −25 −20 −15 −10 0 200 400 600 0 100 200 300 −6000 −4000 −2000 0 El Niño Southern Oscillation Chlorophyll−A, mean (mg/m^3) Chlorophyll−A, SD (mg/m^3) Distance from port (m) index, mean 0.05 0.2 0.04 0.025 0.00 0.1 0.02 0.000 −0.05 −0.025 0.0 0.00 −0.050 −0.02 0 10 20 0 5 10 15 20 0e+00 1e+06 2e+06 3e+06 4e+06 −0.4 −0.2 0.0 0.2 0.4 El Niño Southern Oscillation Exclusive economic zone: fraction Exclusive economic zone: nearest IFO 380 fuel price, mean (USD/MT) index, SD of pixel coverage distance (m) 0.04 0.025 0.10 0.000 0.050 0.02 0.05 −0.025 0.025 −0.050 0.00 0.00 −0.075 0.000 −0.05 −0.02 0.3 0.6 0.9 1.2 0.00 0.25 0.50 0.75 1.00 0e+00 2e+06 4e+06 6e+06 300 350 400 450 MPA Spatial coverage: fraction of IFO 380 fuel price, SD (USD/MT) Latitude Longitude pixel 0.04 0.10 0.10 0.00 0.05 0.02 0.05 0.00 −0.04 0.00 0.00 −0.05 Shapley value −0.05 −0.08 −0.10 −0.02 −0.10 25 30 35 40 45 −50 0 50 −100 0 100 0.00 0.25 0.50 0.75 1.00 MPA Spatial coverage: fraction of MPA Spatial coverage: fraction of Nearest MPA: years since Nearest MPA: distance (m) pixels, 1st neighbor pixels, 2nd neighbor designation (years) 0.00 0.03 0.05 0.00 0.00 −0.04 0.00 −0.05 −0.03 −0.08 −0.05 −0.06 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0e+00 2e+06 4e+06 6e+06 0 25 50 75 Pacific Decadal Oscillation index, Pacific Decadal Oscillation index, Sea surface temperature anomaly, Sea surface temperature anomaly, mean SD mean (degree C) SD (degree C) 0.075 0.10 0.02 0.05 0.01 0.050 0.05 0.00 0.025 0.00 −0.01 −0.05 0.00 0.000 −0.02 −0.5 0.0 0.5 1.0 1.5 0.3 0.4 0.5 0.6 0.7 −2 0 2 0 1 2 3 Sea surface temperature, mean Sea surface temperature, SD Seamount: nearest distance (m) Shore: nearest distance (m) (degree C) (degree C) 0.10 0.05 0.05 0.1 0.05 0.00 0.0 0.00 0.00 −0.05 −0.1 −0.05 −0.10 −0.2 −0.05 −0.10 0 10 20 30 0 2 4 6 0e+00 1e+06 2e+06 3e+06 0e+00 1e+06 2e+06 Temporal coverage: fraction of Wind speed, mean (m/s) Wind speed, SD (m/s) Year year with MPA 0.02 0.04 0.05 0.05 0.01 0.02 0.00 0.00 0.00 −0.01 0.00 −0.05 −0.05 −0.02 −0.02 −0.03 −0.10 −0.10 0.00 0.25 0.50 0.75 1.00 2.5 5.0 7.5 10.0 12.5 1 2 3 4 5 2016 2017 2018 2019 2020 Feature value Fig. S9. Shapley value dependence plots from final trained Stage 1 model for all numeric model features and a 1 year forecast horizon. Each point represents the Shapley value from an individual observation, and the blue line uses a Generalized Additive Model smoothing function. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 11 of 18 AIS fishing effort: 1, 2, or 3 AIS reception: Type A transponders AIS reception: Type Type B Bathymetry: depth (m) year lag (log(hour/m^2)) (pings/day) transponders (pings/day) 0.5 4 0.25 0.50 2 0.0 0.25 0.00 0 0.00 −0.5 −0.25 −2 −0.25 −0.50 −0.50 −25 −20 −15 −10 0 200 400 600 0 100 200 300 −6000 −4000 −2000 0 El Niño Southern Oscillation Chlorophyll−A, mean (mg/m^3) Chlorophyll−A, SD (mg/m^3) Distance from port (m) index, mean 0.5 0.4 0.6 0.5 0.4 0.4 0.0 0.2 0.3 0.2 0.2 0.0 0.0 0.1 −0.5 −0.2 0.0 −0.2 −0.1 −0.4 0 5 10 15 20 0 5 10 15 0e+00 1e+06 2e+06 3e+06 −0.4 −0.2 0.0 0.2 0.4 El Niño Southern Oscillation Exclusive economic zone: fraction Exclusive economic zone: nearest IFO 380 fuel price, mean (USD/MT) index, SD of pixel coverage distance (m) 0.2 0.4 0.1 0.1 0.05 0.3 0.0 0.2 0.0 −0.1 0.00 0.1 −0.1 0.0 −0.2 −0.05 −0.2 −0.1 −0.3 0.3 0.6 0.9 1.2 0.00 0.25 0.50 0.75 1.00 0e+00 1e+06 2e+06 3e+06 4e+06 300 350 400 450 MPA Spatial coverage: fraction of IFO 380 fuel price, SD (USD/MT) Latitude Longitude pixel 0.08 0.5 0.8 0.00 0.04 0.4 0.0 0.00 0.0 −0.25 −0.5 −0.50 Shapley value −0.04 −0.4 −0.8 25 30 35 40 45 −50 0 50 −100 0 100 0.00 0.25 0.50 0.75 1.00 MPA Spatial coverage: fraction of MPA Spatial coverage: fraction of Nearest MPA: years since Nearest MPA: distance (m) pixels, 1st neighbor pixels, 2nd neighbor designation (years) 0.1 0.1 0.3 0.0 0.6 0.2 0.0 −0.1 0.1 −0.1 0.3 −0.2 0.0 −0.3 0.0 −0.1 −0.2 −0.4 −0.2 0.00 0.25 0.50 0.75 0.0 0.2 0.4 0.6 0e+00 1e+06 2e+06 3e+06 4e+06 0 25 50 75 100 Pacific Decadal Oscillation index, Pacific Decadal Oscillation index, Sea surface temperature anomaly, Sea surface temperature anomaly, mean SD mean (degree C) SD (degree C) 0.08 0.2 0.50 0.05 0.1 0.04 0.25 0.00 0.0 −0.05 0.00 0.00 −0.1 −0.10 −0.04 −0.25 −0.2 −0.3 −0.5 0.0 0.5 1.0 1.5 0.3 0.4 0.5 0.6 0.7 −2 0 2 0 1 2 3 Sea surface temperature, mean Sea surface temperature, SD Seamount: nearest distance (m) Shore: nearest distance (m) (degree C) (degree C) 0.5 0.4 0.2 0.4 0.2 0.0 0.0 0.0 0.0 −0.5 −0.2 −0.4 −0.2 −0.8 0 10 20 30 0.0 2.5 5.0 7.5 0e+00 1e+06 2e+06 3e+06 0 500000 100000015000002000000 Temporal coverage: fraction of Wind speed, mean (m/s) Wind speed, SD (m/s) Year year with MPA 0.10 0.4 0.2 0.4 0.05 0.2 0.1 0.2 0.00 0.0 0.0 0.0 −0.05 −0.1 −0.2 −0.2 −0.10 −0.2 0.00 0.25 0.50 0.75 1.00 2.5 5.0 7.5 10.0 12.5 1 2 3 4 5 2016 2017 2018 2019 2020 Feature value Fig. S10. Shapley value dependence plots from final trained Stage 2 model for all numeric model features and a 1 year forecast horizon. Each point represents the Shapley value from an individual observation, and the blue line uses a Generalized Additive Model smoothing function. 12 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor Forecast: 1 year Forecast: 2 years Forecast: 3 years 0% −20% Inside −40% MPAs −60% MPA network scenario 0% Protecting unfished pixels −20% Visalli et al. 2020 Partial Change in −40% MPA overlap number of Expert−derived EBSA fished pixels −60% Sala et al. 2021 food 0% Random −20% Sala et al. 2021 multi−objective Outside −40% MPAs Sala et al. 2021 carbon −60% Sala et al. 2021 biodiversity 0% Protecting most−fished pixels −20% −40% Global −60% 10% 20% 30% 10% 20% 30% 10% 20% 30% Area protected Fig. S11. Simulation results for the number of fished pixels (i.e., the Stage 1 model predictions). The y-axis shows the relative percentage difference between the MPA network scenario and the business-as-usual scenario. The x-axis shows the percent area of global oceans protected. The left-to-right panels represent the three model forecast horizons, and the top-to bottom panels represent results for pixels from the three mutually exclusive regions (inside MPAs, partial MPA overlap, and outside MPAs), as well as all pixels globally. Colors differentiate the various hypothetical MPA networks. Linetypes are used to further differentiate networks that can have various levels of protection, while shapes are used to differentiate networks that only have a single level of protection. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 13 of 18 Forecast: 1 year Forecast: 2 years Forecast: 3 years 0% −25% Inside −50% MPAs −75% MPA network scenario 0% Protecting unfished pixels −25% Visalli et al. 2020 Partial Change in −50% MPA overlap Expert−derived EBSA fishing effort −75% Sala et al. 2021 food 0% Random −25% Sala et al. 2021 multi−objective Outside −50% MPAs Sala et al. 2021 carbon −75% Sala et al. 2021 biodiversity 0% Protecting most−fished pixels −25% Global −50% −75% 10% 20% 30% 10% 20% 30% 10% 20% 30% Area protected Fig. S12. Simulation results for the amount of predicted fishing effort for each MPA scenario from the full hurdle model. The y-axis shows the relative percentage difference between the MPA network scenario and the business-as-usual scenario. The x-axis shows the percent area of global oceans protected. The left-to-right panels represent the three model forecast horizons, and the top-to bottom panels represent results for pixels from the three mutually exclusive regions (inside MPAs, partial MPA overlap, and outside MPAs), as well as all pixels globally. Colors differentiate the various hypothetical MPA networks. Linetypes are used to further differentiate networks that can have various levels of protection, while shapes are used to differentiate networks that only have a single level of protection. 14 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor (A) 2016 2017 2018 2019 2020 2021 Taiwan Fishing hours 200 150 100 50 0 Japan 60,000 (B) 40,000 Taiwan 20,000 Fishing hours 0 30,000 20,000 Japan 10,000 0 2016 2017 2018 2019 2020 2021 Inside 2020 MPA area Outside 2020 MPA area Fig. S13. Spatiotemporal trends of observed annual fishing effort from Global Fishing Watch in the Palau EEZ by Taiwanese and Japanese fleets from 2016 to 2021. (A) Spatial fishing patterns by flag and year, aggregated to 0.1x0.1 degree pixels; the outline of the Palau National Marine Sanctuary, which was implemented on January 1, 2020, is shown in dark orange. (B) Temporal trends of annual fishing effort, by flag, and disaggregated by inside and outside the Palau National Marine Sanctuary area, which was implemented on January 1, 2020 (a vertical dashed line is shown at this implementation date). Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 15 of 18 (A) Number distinct fishing vessels in pre−MPA Palau fleet operating globally, across flags 200 150 100 50 0 Total global fishing hours by pre−MPA Palau fleet, across flags 500,000 400,000 300,000 200,000 100,000 0 2016 2017 2018 2019 2020 2021 Number of distinct Taiwanese− and Japanese−flagged fishing vessels in pre−MPA Palau fleet operating by EEZ and high seas (B) Palau High seas Taiwan Japan 60 50 60 80 50 40 60 40 40 30 30 40 20 20 20 20 10 10 0 0 0 0 Micronesia (Federated States of) Philippines China United States 30 20 10 15 Flag 15 8 20 10 Taiwan 6 10 4 10 5 Japan 5 2 0 0 0 0 Marshall Islands Papua New Guinea Kiribati Solomon Islands 10 6 20 8 15 15 6 4 10 10 4 2 5 5 2 0 0 0 0 2016 2017 2018 2019 2020 2021 2016 2017 2018 2019 2020 2021 2016 2017 2018 2019 2020 2021 2016 2017 2018 2019 2020 2021 Fig. S14. Temporal trends of Palau’s “pre-MPA fishing fleet” (i.e., those vessels observed fishing in Palau’s EEZ between 2016 and 2019, prior to the Palau National Marine Sanctuary implementation on January 1, 2020). (A) shows the number of distinct vessels from this fleet that operated globally over time, as well as the total global fishing hours by this fleet over time. Panel A aggregates vessels and effort across all fishing flags. (B) shows the number of unique Taiwanese- and Japanese-flagged vessels from this fleet that fished in different EEZs and the high seas, focusing on the 12 regions with the largest number of active fishing vessels from this fleet. A vertical dashed line is shown at the January 1, 2020 implementation date of the Palau National Marine Sanctuary. 16 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor Protecting Sala et al. Sala et al. Protecting Sala et al. Sala et al. unfished Random 2021 2021 most−fished 2021 food 2021 carbon pixels multi−objective biodiversity pixels Black Sea Baltic Sea Bay of Bengal Scotian Shelf Gulf of Thailand California Current Labrador − Newfoundland North Sea Yellow Sea Celtic−Biscay Shelf Faroe Plateau Southeast U.S. Continental Shelf Norwegian Sea East Central Australian Shelf Iceland Shelf and Sea Arabian Sea East Bering Sea Sulu−Celebes Sea Sea of Okhotsk Sea of Japan Barents Sea Gulf of Alaska Oyashio Current Northeast U.S. Continental Shelf Greenland Sea Antarctica Pacific Central−American Coastal New Zealand Shelf Northern Bering − Chukchi Seas Aleutian Islands Red Sea Kuroshio Current Northeast Australian Shelf South China Sea Mediterranean Sea Hudson Bay Complex Southeast Australian Shelf Gulf of California Iberian Coastal East Brazil Shelf East China Sea Somali Coastal Current Northwest Australian Shelf West Bering Sea Indonesian Sea Patagonian Shelf Canadian E. Arctic ... W. Greenland Gulf of Mexico West Central Australian Shelf Insular Pacific−Hawaiian Agulhas Current Caribbean Sea South Brazil Shelf Guinea Current South West Australian Shelf Humboldt Current North Brazil Shelf North Australian Shelf Canary Current Beaufort Sea Benguela Current −100% 0% −100% 0% −100% 0% −100% 0% −100% 0% −100% 0% −100% 0% Change in fishing hours Fig. S15. Predicted change in fishing hours between various 30% MPA network scenarios and the business-as-usual counterfactual scenario, aggregated by Large Marine Ecosystem (LME). All results represent the predictions from a 3 year forecast horizon. An inset map of the LMEs is provided in the upper left corner. Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor 17 of 18 26 References 27 1. K Sherman, AM Duda, Large marine ecosystems: an emerging paradigm for fishery sustainability. Fisheries 24, 15–26 28 (1999). 18 of 18 Gavin McDonald, Jennifer Bone, Christopher Costello, Gabriel Englander, Jennifer Raynor