Policy Research Working Paper 9539

Improving Tax Compliance without Increasing Revenue: Evidence from Population-Wide Randomized Controlled Trials in Papua New Guinea

Christopher Hoy
Luke McKenzie
Mathias Sinning

Poverty and Equity Global Practice
February 2021

Abstract

This paper studies the impact of “nudges” on taxpayers with varying tax compliance histories in Papua New Guinea. It presents the results from two population-wide randomized controlled trials in a setting that is characterized by low compliance rates and a lack of effective enforcement. The study tests the impact of text messages, flyers, and emails that remind taxpayers of declaration due dates and provide information about the public benefits of paying tax. The findings show that the treatments increased the number of tax declarations filed without increasing the amount of tax paid because the taxpayers who responded to the nudges were largely exempt from paying tax. This result is consistent across tax types, communication channels, and time periods. The findings also show that the treatments had no impact on previously non-filing taxpayers. Collectively, the results illustrate that taxpayers who face the lowest cost from complying are the most likely to respond to a nudge.

This paper is a product of the Poverty and Equity Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at choy@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished.
The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Improving Tax Compliance without Increasing Revenue: Evidence from Population-Wide Randomized Controlled Trials in Papua New Guinea*

Christopher Hoy¹,²,³ Luke McKenzie² Mathias Sinning²,⁴

JEL-Classification: C93, D91, H2, H20, O1, O17
Keywords: Tax Compliance, Field Experiments, Behavioral Economics

* This study was pre-registered with the American Economic Association RCT Registry (ID number AEARCTR-0004056). We thank Robert Breunig, Anne Brockmeyer, Guillermo Cruces, Ryan Edwards, Christian Gillitzer, Rema Hanna, Stephen Howes, Ricardo Perez-Truglia, Fabrizio Santoro, Carlos Scartascini, Joel Slemrod, Teodora Tsankova and participants of the Centre for the Study of African Economies (CSAE) Conference and the Midwest International Economics and Development Conference (MWIEDC) for helpful comments. We gratefully acknowledge financial support from the Australian Department of Foreign Affairs and Trade and the Australian Research Council (LP160100810). The views expressed in this paper are those of the authors and do not necessarily reflect those of the Australian Government.
¹ Corresponding author (choy@worldbank.org). ² World Bank. ³ ANU Crawford School of Public Policy. ⁴ PKU NSD/CCER, RWI and IZA.

1 Introduction

Tax as a share of GDP and levels of tax compliance are substantially lower in the poorest countries in the world (Besley and Persson, 2014; Slemrod, 2019; Brockmeyer et al., 2019).
In these settings, authorities typically have less capacity to enforce tax legislation, and attempts to deter taxpayers from being non-compliant by threatening punishment are often not seen as credible (Bergolo et al., 2019; IMF, 2020; Kleven et al., 2018). As a result, governments have begun exploring alternative, non-deterrence approaches to encourage taxpayers to comply, such as providing them with information that tax revenue funds public goods and services (Antinyan and Asatryan, 2019; Mascagni, 2018). The impacts of these “nudges” are likely to be inversely related to the cost taxpayers face from complying, especially when enforcement is weak, as taxpayers may believe they can easily avoid being punished for non-compliance (Gordon and Li, 2009). However, to meaningfully increase revenue, nudges would need to have a positive impact on taxpayers who face a high cost from complying (i.e. those that have a large tax liability). This raises the question of which types of taxpayers respond to credible, non-deterrence nudges in low- and lower-middle-income countries and, consequently, whether these messages are an effective way to raise revenue.

We examine this issue by conducting two population-wide randomized controlled trials involving all firms registered to pay Salaries and Wages Tax (SWT) and Value Added Tax (VAT) in Papua New Guinea (PNG). PNG faces similar development challenges to other resource-dependent, fragile states. It is a lower-middle-income country with one of the highest rates of extreme poverty in Asia, and it ranks in the bottom 20 percent of countries globally in terms of the United Nations Human Development Index (World Bank, 2019; UNDP, 2019).¹ Tax revenue has remained around 13 percent of GDP over the last five years, which is in line with the average for lower-middle-income countries (World Bank, 2019). Tax compliance is extremely low.
In 2018, around 95 percent of registered taxpayers did not file the required SWT and VAT declarations every month with the tax authority, the Internal Revenue Commission (IRC), and no one has been imprisoned for non-compliance.

¹ There were only four countries outside Sub-Saharan Africa with a lower Human Development Index ranking in 2018 than PNG (the Syrian Arab Republic, Afghanistan, the Republic of Yemen and Haiti), three of which were undergoing severe conflict.

To develop an understanding of what types of messages are likely to be effective in this environment, we conducted a survey of taxpayers, a series of focus groups and extensive consultation with the IRC. This process revealed that deterrence messages were not seen as credible by taxpayers, and there were genuine concerns that these types of nudges may even backfire because the tax authority is unable to follow through on threats of punishment for non-compliance.

Our two distinct but complementary trials test the impact of credible, non-deterrence nudges across all taxpayers and multiple tax types, using a range of communication channels and at different points in time. Our first trial includes all taxpayers registered for SWT and VAT that were contactable by mobile phone (23,489 enterprises). They were randomly assigned to receive either a simple reminder (Treatment Group 1), information about the public benefits from paying taxes (Treatment Group 2) or no information (Control Group). The two treatment groups were contacted via SMS message ten and three days before SWT declarations were due in April 2019. The second trial involved all taxpayers who were non-compliant for either SWT or VAT and were contacted by the IRC via post and email over a nine-month period (May 2019 to February 2020). Taxpayers with an odd Tax Identification Number (TIN) received a flyer highlighting the public benefits from paying tax, in addition to standard correspondence from the IRC.
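The two assignment rules described above can be illustrated with a minimal sketch. It is not the procedure the IRC or the authors used: the arm labels, the equal arm sizes, the fixed random seed and the function names `assign_trial1` and `assign_trial2` are all assumptions introduced here for illustration, and the treatment of even TINs (standard correspondence only) is inferred from the description of the second trial.

```python
import random
from collections import Counter

# Arm labels are hypothetical names for the three groups described in the text.
TRIAL1_ARMS = ("control", "reminder", "public_benefit")


def assign_trial1(taxpayer_ids, seed=0):
    """Shuffle taxpayers and deal them into three arms of near-equal size.

    Equal arm sizes and the fixed seed are assumptions of this sketch.
    """
    rng = random.Random(seed)
    ids = list(taxpayer_ids)
    rng.shuffle(ids)
    return {tid: TRIAL1_ARMS[i % len(TRIAL1_ARMS)] for i, tid in enumerate(ids)}


def assign_trial2(tin):
    """Odd TIN: flyer plus standard correspondence; even TIN: standard only."""
    return "flyer" if tin % 2 == 1 else "standard_only"


if __name__ == "__main__":
    # 23,489 is the number of enterprises in the first trial; the IDs are made up.
    arms = assign_trial1(range(23489), seed=1)
    print(Counter(arms.values()))  # arm sizes differ by at most one
    print(assign_trial2(1001), assign_trial2(1002))
```

Assigning on the parity of an administrative identifier is simple to implement within existing mailing processes, provided the last digit of the TIN is unrelated to taxpayer characteristics.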
In both trials, we study the effects of the nudges on declarations filed and on the amount of tax paid. We are particularly interested in understanding differences between three broad types of registered taxpayers: firms that had not filed before (“non-filers”), firms that had filed previously but claimed they were exempt from paying tax (“zero filers”) and firms that had filed previously and paid tax (“non-zero filers”).

Standard theoretical models of tax compliance (Allingham and Sandmo, 1972) have been extended recently to illustrate that different types of taxpayers decide whether to comply based on a broad range of costs and benefits, not just the risk of being punished for non-compliance (Slemrod, 2019). We build on this existing theory by introducing the idea that, in addition to financial costs (the amount of tax due), taxpayers face transaction costs associated with complying (the time and effort spent on filing a declaration). Two predictions emerge from our model. Firstly, taxpayers with a history of compliant behavior (i.e. zero filers and non-zero filers) are expected to be more responsive to nudges than those that have not filed before (i.e. non-filers) because their familiarity with filing a declaration means they face a lower transaction cost from complying. Secondly, nudges are expected to be more effective at increasing the number of declarations filed by taxpayers who are exempt from paying tax (zero filers), as they face only transaction costs from complying, than by those who face both transaction and financial costs (non-zero filers).

Previous studies in low- and lower-middle-income countries have been unable to test these predictions, or to examine how nudges impact different types of taxpayers in general, as they have focused on comparing the effects of various messages on a narrow subset of taxpayers (Mascagni et al., 2017; Shimeles et al., 2017; Kettle et al., 2016).
In addition, the results from existing research on tax compliance in upper-middle- and high-income countries may not translate into lower income settings characterized by low compliance rates, weak enforcement and limited third party reporting (Besley and Persson, 2014; Brockmeyer et al., 2019; Kleven et al., 2018; Pomeranz, 2015).

The findings from both trials illustrate that it is primarily taxpayers who face the lowest cost from complying with tax legislation that respond to credible, non-deterrence nudges. Our first trial reveals that both nudges almost doubled the number of on-time SWT declarations made by zero filers (i.e. those that had filed a declaration in the previous 15 months and claimed they were exempt from paying SWT as they have no employees). However, there were no significant effects on non-zero filers and non-filers. Similarly, the findings of our second trial indicate that the inclusion of the flyer along with standard correspondence from the IRC primarily increased the number of SWT and VAT declarations made by zero filers (i.e. those who previously reported they had no employees in the case of SWT and those who reported that their revenue stream was exempt in the case of VAT). As such, in both trials there was no change in the total amount of tax revenue collected.

Further analysis reveals three key findings suggesting that simply being contacted by the tax authority is driving the effects we observe; as such, the nudges we provide are essentially working as a deterrent. Firstly, there were no statistically significant differences between the effects of the different treatments on filing on-time declarations in the first trial. This is despite the fact that one message provided details about due dates for specific tax types and the other only provided general information.
Secondly, the differences in declaration behavior between taxpayers in the treatment group and the control group in the first trial were only short-lived. It appears likely that the effect we observe is purely an immediate response to having received a nudge, as opposed to a sustained change in behavior. Thirdly, across both trials the overall impact of the treatments is primarily driven by taxpayers that are outside the largest city and capital of PNG, Port Moresby. These firms rarely come into contact with the tax authority and, as such, may well be more likely to perceive a nudge as a sign that they are under greater scrutiny.

Our research contributes to the existing literature in several ways, which relate to the scale and setting of the study, the underlying mechanisms we uncover that drive the behavior of different types of taxpayers, and the innovative design of our trials. Firstly, this study involves the first population-wide randomized controlled trials examining the impact of nudges on tax compliance in a low- or lower-middle-income country. The population-wide nature of our trials allows us to provide insight into the effect of nudges when they are scaled up across all taxpayers, which makes the findings far more relevant to policy makers and more generalizable (Muralidharan and Niehaus, 2017; DellaVigna and Linos, 2020). This also enables us to examine heterogeneity in the responsiveness of different types of taxpayers based on their history of compliance and their location, which helps us to isolate the channels driving the findings we observe. Existing research in these settings has been limited by only involving a sample of taxpayers in a capital city or a specific type of non-compliant taxpayers (Mascagni et al., 2017; Shimeles et al., 2017; Kettle et al., 2016).
These studies focused on comparing the impact of different kinds of nudges as opposed to the impact of nudges on different types of taxpayers (Antinyan and Asatryan, 2019; Mascagni, 2018).

Secondly, we provide evidence across tax types, communication channels and time periods, which supports theoretical predictions that nudges primarily increase the declarations of taxpayers who face the lowest cost from complying. We extend existing theory (Allingham and Sandmo, 1972; Slemrod, 2019) by introducing transaction costs into a simple model of tax compliance and theorize that nudges will have dramatically different effects on taxpayers depending on their compliance history. Our results are consistent with these predictions as we show that taxpayers who claim they are exempt from paying tax (zero filers) are the most likely to respond to a nudge and that non-filers are the least likely to respond. By doing so, our study extends to a lower income setting trends from prior work in upper-middle- and high-income countries that show taxpayers who have the least to lose from complying are often the most likely to respond to messages from tax authorities (Carrillo et al., 2017; Perez-Truglia and Troiano, 2018).

Finally, there are a number of innovative design features in our study. Our trial is one of the first to examine the effect of nudges across different types of taxes that constitute a major source of revenue (SWT and VAT make up more than half of the total tax revenue collected in PNG) (Mascagni, 2018; OECD, 2019). In addition to the amount of tax paid, our first trial measures the likelihood of filing a declaration on-time because we contact all taxpayers prior to the due date. Most studies are unable to analyze on-time filing behavior because they focus exclusively on non-compliant taxpayers (Castro and Scartascini, 2015).
Furthermore, our nudges were designed to be easily scalable and did not require tax authorities to make substantial changes to existing processes (Mascagni, 2018; Muralidharan and Niehaus, 2017). We also implemented our interventions in line with best practice by communicating with taxpayers via SMS messages and email, in addition to the traditional methods of using postal services (Ortega et al., 2020; Ortega and Scartascini, 2020; Brockmeyer et al., 2019).

The remainder of this paper is structured as follows. We provide details about the conceptual framework and summarize existing research on this topic in Section 2. We discuss the setting of the study in Section 3. We describe the study design and explain our empirical model in Section 4. We present the findings of our analysis in Section 5. In Section 6, we discuss the implications from our study and conclude.

2 Conceptual framework and related literature

2.1 Conceptual framework

Seminal theories of tax compliance argue that taxpayer behavior depends on the costs and benefits of tax compliance (Allingham and Sandmo, 1972), similar to economic theories about crime (Becker, 1968). Put simply, Allingham and Sandmo (1972) theorize that people choose between the certain cost of paying tax ($t$, which is assumed to be a fixed amount that represents a taxpayer’s tax obligation) and the potential punishment they could face if they do not file ($\gamma p$, where $\gamma$ is the probability of being found to be non-compliant and $p$ is a fixed amount that represents the punishment a taxpayer will receive). These theories have been extended in recent years to take into account broader reasons for why individuals may choose to pay tax (Slemrod, 2019). For example, taxpayers may value the public benefits financed by tax revenue or the value of complying with social norms (Hallsworth, 2014; Kettle et al., 2016). The utility gain from paying tax is represented formally as $a$ in the model below.
Recent studies have also recognized that taxpayer behavior is more likely to be determined by their belief about the probability of being punished, $\gamma_b$, as opposed to the actual probability, $\gamma$. We follow a simple model of tax compliance in the literature, discussed by Kettle et al. (2016), to illustrate that in a given reporting period the individual utility associated with compliance, $U_c$, and non-compliance, $U_n$, can be written as:

$$U_c = y - t + a \quad \text{and} \quad U_n = y - \gamma_b p,$$

where $y$ is income before tax. According to this model, taxpayers comply if $U_c > U_n$, which requires that $t < \gamma_b p + a$.

We extend this simple model by introducing the idea that there are transaction costs associated with filing a declaration, $r$, along with the financial costs of paying tax, $t$ (a somewhat similar approach is taken in Meiselman, 2018). We update the utility functions above accordingly and therefore predict that taxpayers will comply if

$$t + r < \gamma_b p + a.$$

Although this model is far from comprehensive, it provides a basic framework to illustrate the different channels through which nudges can impact taxpayers’ behavior. Nudges can make the risk of punishment more salient for taxpayers (in our model this would be consistent with increasing the term $\gamma_b$).² They may also provide taxpayers with information that increases their utility from complying, such as letting them know the public benefits from paying tax or suggesting they will be abiding by social norms (in our model this would mean increasing the term $a$). Another channel through which nudges may affect compliance is by reducing the transaction costs for taxpayers by reminding them of due dates or by simplifying the process (in our model this would mean reducing the term $r$) (Hallsworth, 2014).

Our simple model also helps us to illustrate how the impacts of nudges are likely to vary based on differences in the costs of complying (both financial and transaction costs) faced by different types of taxpayers.
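The compliance condition and the three nudge channels described above can be made concrete with a short numerical sketch. All parameter values are hypothetical and chosen only so that the inequality is easy to check by hand; the function names `complies` and `responses` are introduced here and do not come from the paper.

```python
def complies(t, r, gamma_b, p, a):
    """Return True when U_c > U_n, i.e. when t + r < gamma_b * p + a."""
    return t + r < gamma_b * p + a


# Hypothetical baseline: a taxpayer with no tax due (t = 0) who does not comply,
# because the cost of complying (3.0) exceeds the benefit (0.1 * 20 + 0.5 = 2.5).
baseline = dict(t=0.0, r=3.0, gamma_b=0.1, p=20.0, a=0.5)

# Each nudge channel changes exactly one term of the condition.
channels = {
    "no nudge": {},
    "punishment more salient (higher gamma_b)": {"gamma_b": 0.15},
    "public benefit information (higher a)": {"a": 1.5},
    "reminder or simplification (lower r)": {"r": 2.0},
}


def responses(params):
    """Evaluate the compliance condition under each channel."""
    return {name: complies(**{**params, **change})
            for name, change in channels.items()}


if __name__ == "__main__":
    for label, params in [("t = 0", baseline), ("t = 10", {**baseline, "t": 10.0})]:
        print(label)
        for name, ok in responses(params).items():
            print(f"  {name}: {'complies' if ok else 'does not comply'}")
```

With these illustrative numbers, any one of the three channels is enough to tip the taxpayer with no tax due into compliance, while the same shifts leave a taxpayer with a tax liability of 10 unmoved, because the financial cost dominates the condition.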
Specifically, we make predictions about the effect of nudges on the likelihood a taxpayer files a declaration based on their history of compliant behavior. We consider three broad types of taxpayers: those that have not filed a declaration previously (non-filers),³ those that have filed in a previous reporting period but claimed to be exempt from paying tax (zero filers), and those that have filed in a previous reporting period and paid tax (non-zero filers).

In the following, we consider the possibility that transaction costs associated with filing a declaration, $r$, beliefs about the probability of being punished, $\gamma_b$, and utility gains from paying taxes, $a$, may differ between non-filers, zero filers and non-zero filers. For taxpayers who have not filed a declaration in a previous reporting period, their compliance costs have always outweighed the benefits from complying. Formally, $t + r^n > \gamma_b^n p + a^n$, where the superscript “$n$” refers to non-filers. In contrast, zero filers and non-zero filers have filed a declaration in a previous reporting period (i.e. during the reporting period in which they filed, $t + r^f < \gamma_b^f p + a^f$, where the superscript “$f$” refers to having filed a declaration in a previous reporting period). For simplicity, we assume that all taxpayers who have filed in a previous reporting period face the same transaction costs ($r^f$) resulting from the time taken to complete a declaration and file it with the tax authority.⁴ We also assume that non-filing taxpayers face larger transaction costs than taxpayers that have filed in a previous reporting period (i.e. $r^f < r^n$) because they may not be familiar with how to complete a declaration and how to file it with the tax authority.

² A more traditional approach of increasing the costs of non-compliance would be to directly increase the term $p$.
³ To be precise, we define not filing previously as not having filed a declaration in the past 15 months. We only observe the filing behavior of taxpayers in the 15 months before the trial.

Given these assumptions, the following two predictions emerge:

Prediction 1: A nudge will be more likely to lead taxpayers who have filed previously to file a declaration compared to those who have not filed previously.

This prediction is a consequence of non-filers facing larger transaction costs than zero filers and non-zero filers (i.e. $r^f < r^n$). In addition, it is reasonable to believe that a higher share of non-filing firms may no longer be in operation, and consequently no longer need to file a declaration, when compared to firms that have filed recently (McKenzie and Paffhausen, 2019).

Prediction 2: Among taxpayers that have filed previously, a nudge will be more likely to lead taxpayers that are exempt from paying tax to file a declaration compared to those that are not exempt.

This prediction is a consequence of the fact that the total cost of complying for taxpayers that claim to be exempt from paying tax (zero filers) is equal to their transaction costs, $r^f$, while the total cost of complying for taxpayers that are not exempt from paying tax (non-zero filers) is equal to both their transaction costs and financial costs from paying tax, $r^f + t$. The population-wide experiments that follow allow us to test these predictions.

2.2 Related literature

Since the 1970s, there have been numerous studies examining the effectiveness of strategies used to increase tax compliance; however, the use of messages drawing on insights from behavioral economics has only recently been tested (Arcos Holzinger and Biddle, 2016).
Studies using nudges have illustrated that tax revenue can be increased through different types of messages that draw on a range of motivations to comply, including the risk of punishment, social norms of compliance, reminders of legal obligations or highlighting the public benefits from paying tax (Hallsworth, 2014; Mascagni, 2018). An extensive number of trials have been conducted testing the impact of these types of nudges in high income countries and in general they show that deterrence nudges are more likely to raise revenue than non-deterrence nudges (Antinyan and Asatryan, 2019; Slemrod, 2019). However, even in these settings far less attention has been given to how the impact of nudges varies between different types of taxpayers (De Neve et al., 2019).

⁴ Arguably, transaction costs are lower for taxpayers who are exempt from paying tax as they are not required to make complex calculations to determine their tax liability. If this is the case, our second prediction is even more likely to hold.

There is a growing body of literature about tax compliance in upper-middle-income countries, largely in Latin America, that examines various strategies that can be implemented even when enforcement is not as strong as is the case in many high income countries. In particular, there has been a strong focus on the potential of third party reporting to improve tax compliance (Pomeranz, 2015; Brockmeyer et al., 2019; Naritomi, 2019). These studies in Chile, Costa Rica and Brazil have consistently illustrated that, when available, third party reporting can be used by tax authorities to improve compliance. There is also growing evidence that there is considerable variation in the way different types of taxpayers respond to information from the tax authority (Castro and Scartascini, 2015; Bergolo et al., 2019; Carrillo et al., 2017).
While this body of work has greatly improved the knowledge base about tax compliance in a setting outside high income countries, very little is known about whether these findings would extend to countries with lower incomes, high rates of non-compliance, weaker enforcement and limited third party information.

Only a small number of field experiments have tested how nudges impact tax compliance in low- and lower-middle-income countries and none of them has been conducted across an entire population.⁵ In Guatemala, Kettle et al. (2016) examine the effectiveness of sending different kinds of letters to a small subset of non-compliant individuals and enterprises.⁶ They find that simply receiving a letter increased declarations. Letters that highlighted the risk of punishment and letters that referred to social norms were the most effective in increasing tax revenue. Shimeles et al. (2017) show that hand-delivered letters, which refer to the risk of punishment or to public benefits from paying taxes, increased the amount of tax paid by enterprises in the capital city of Ethiopia, Addis Ababa, in the short run. They restrict their sample to taxpayers who were not exempt from paying tax and had never been contacted by the tax authority before. Similarly, Mascagni et al. (2017) find positive effects on tax paid by individuals and enterprises in the capital city of Rwanda, Kigali, from reminder SMS messages and emails that included either a punishment or a public benefit motivation.

⁵ In addition to the studies discussed here, a number of related studies have focused on the effects of “naming and shaming” taxpayers for non-compliant behavior or publicly celebrating those who are compliant in low- and lower-middle-income countries (Chetty et al., 2014; Slemrod et al., 2020).
⁶ Kettle et al. (2016) focus on all non-compliant taxpayers who opted in to pay profits tax, which captures only 3.5 percent of declarations.
Their analysis is limited to firms that have previously filed and recently registered their contact details with the tax authority.

3 Setting of the Study

3.1 Taxation in Papua New Guinea

Tax revenue as a share of GDP in PNG has been declining since its peak in 2011 and was at 13.7 percent in 2017, which is around the average for lower-middle-income countries (World Bank, 2019). The three types of taxes that generated the highest shares of revenue in 2017 were SWT (34 percent), Corporate Income Tax (26 percent) and VAT (21 percent) (OECD, 2019). This is a somewhat similar tax structure to comparable countries, except that PNG is particularly reliant on SWT.

All firms in PNG are expected to register with the IRC and as part of this process they receive a Tax Identification Number (TIN). It is not possible for firms to engage in international trade or bid for government projects without a TIN. The IRC regularly updates the tax register and went through an extensive process of removing firms that were no longer operational in 2018.⁷ At the start of 2019, 61.4 percent of firms in the IRC tax register were less than five years old.

All registered firms in PNG are required to file SWT declarations with the IRC by the 7th day of each month and VAT declarations by the 21st day of each month. SWT consists of taxes withheld from employees who are required to pay Personal Income Tax (PIT). Sole traders do not need to pay SWT (but they are still required to file a declaration). VAT is applied at the rate of 10 percent on the sale and import of goods and services in PNG for most sectors of the economy; however, there is a minimum revenue threshold below which firms are exempt from paying VAT (but they are still required to file a declaration).

⁷ This involved more than 50 staff working full-time for one year contacting taxpayers to check if they are still operational.
The requirement for all registered firms to regularly file declarations regardless of whether they are exempt from paying tax is common practice in most low- and lower-middle-income countries (Mascagni, 2018).

To file a tax declaration, firms must complete a form providing information about their business activities and tax liabilities over the previous month. An example of the declaration form that is required to be submitted to the IRC for SWT is provided in Figure B1 in Appendix B. In the case of SWT, taxpayers are expected to calculate the withholding tax for each of their employees, while in the case of VAT, taxpayers are expected to identify how much of their revenue and expenditure is non-exempt. Accurately completing these calculations can be quite challenging for many taxpayers.⁸ The majority of SWT and VAT declarations made by small businesses and sole traders are filed in person at an IRC office. In the days immediately prior to due dates there are long queues of taxpayers waiting to file their declarations.

The formal sector in PNG is largely concentrated in the largest city and capital, Port Moresby, which is the case in many low- and lower-middle-income countries. The population of Port Moresby is around four times larger than that of the second most populous city, Lae, and the capital is home to 56 percent of firms that are registered with the tax authority. Furthermore, 94 percent of the declarations are processed through the head office in Port Moresby as the IRC only has three small offices located elsewhere throughout the country. As such, firms located outside Port Moresby are substantially less likely to come into contact with the IRC and, unsurprisingly, these firms have lower rates of compliance.

3.2 Tax Compliance in Papua New Guinea

Tax compliance has multiple dimensions.
Three key dimensions are: whether individuals and firms have registered with the tax authority for the relevant taxes; whether registered individuals and firms file declarations; and whether registered individuals and firms ultimately pay the correct amount of tax. The tax authority is only able to accurately measure the second of these because the extent of the other dimensions of compliance depends on information that is not reported directly. Consequently, we focus on measuring tax compliance by assessing whether registered firms file their SWT and/or VAT declarations (either on-time or late). We also analyze the amount of tax paid.⁹

⁸ The IRC provides guidance on their website, however this may still be too advanced for many taxpayers. For example, in early 2019 a six-page guide for employers was released explaining how to calculate SWT for employees. The guide assumes among other things that all taxpayers have access to Microsoft Excel.

The rate of compliance among taxpayers in PNG is very low. Over 75 percent of firms registered for SWT and VAT did not file a declaration in 2018 (see Figures 1 and 2). We refer to taxpayers that have not filed a declaration as “non-filers”, and to the remaining 25 percent of taxpayers that have filed at least once in 2018 or the first three months of 2019 as “filers.” To test the theoretical predictions discussed in Section 2.1, we estimate heterogeneous treatment effects based on these categories to account for the possibility that these types of firms will respond differently to the nudges we provide.

[Insert Figures 1 and 2]

We observe a considerable amount of heterogeneity in tax compliance behavior among firms that have filed in the last 15 months. Around 25 percent of these firms file a declaration every month (we refer to them as “regular filers”). The remaining 75 percent file an average of once every two months (we refer to them as “semi-regular filers”).
Importantly, only around half of SWT and VAT declarations are for a non-zero amount. The reason for this is that many taxpayers state in their declarations that they are exempt from paying tax. This is the case for SWT when firms have no employees, and for VAT when firms do not earn enough revenue from activities subject to VAT (or do not earn any revenue at all). To test the theoretical predictions discussed in Section 2.1, we examine heterogeneous treatment effects based on whether taxpayers have previously reported that they are exempt from paying tax (zero filers) and those that have not (non-zero filers) because we expect the responses of taxpayers to vary with the compliance costs they face.

9 We do not consider firms that are not registered because our analysis is based on administrative records from the IRC.

Existing enforcement activities by the IRC are similar to those used by tax authorities in other low- and middle-income countries (Mascagni, 2018). Firstly, the focus of compliance activities is primarily on ensuring that firms register and then on whether they file declarations. A lower priority is assigned to whether declarations are accurate. Firms that do not file a declaration have a small chance of receiving a letter in the post or via email asking them to file their overdue declaration. If they do not respond, three more letters are sent. The mechanism for determining which firms are contacted is based on the number of outstanding declarations or the amount of debt overdue.

3.3 Local explanations for why tax compliance is so low

We conducted focus groups and a short survey in order to understand local explanations for why around 95 percent of firms in PNG do not file SWT and VAT declarations every month and to help inform the development of messages that would make firms more likely to be compliant.
The survey was conducted in February 2019 and had 137 respondents, of whom 58 percent were IRC staff who regularly interact with taxpayers and also pay tax themselves. The remaining respondents were primarily owners of small and medium sized enterprises registered with the IRC. The answers did not differ significantly between the types of respondents. We summarize below the main reasons why respondents believe firms do not pay tax and the main types of messages that they believe could change tax behavior. In Appendix A, we provide more details about the implementation of this scoping survey.

3.3.1 Respondents’ views on why people do not pay tax

Respondents had mixed views about why some Papua New Guineans do not pay the right amount of tax (see Figure A1 in Appendix A). Uncertainty about how the tax revenue would be used was ranked as the most important reason. Respondents mentioned that many taxpayers are not aware that tax revenue helps to fund services that benefit the public, such as health and education. The other main reasons provided were that firms find paying tax too complicated or they forget to pay.

3.3.2 Respondents’ views on what messages would lead firms to pay more tax

Two types of messages were expected to be the most effective in encouraging taxpayers to pay tax (see Figure A2 in Appendix A). These were messages that reminded firms of their tax obligations (“reminder message”) and messages that explained how the tax revenue was spent (“public benefit message”). Messages that refer to social norms or highlight the risk of punishment were viewed as not being credible. More than five times as many respondents thought a reminder or public benefit message would be effective, compared to a message that referred to social norms or that highlighted the risk of punishment.
4 Methodology

4.1 SMS Trial design

This trial focused on the entire population of firms registered to pay SWT and VAT in PNG with a valid mobile phone number (23,489 firms).10 The 23,489 firms were randomly assigned to Treatment Group 1 (7,828 firms), Treatment Group 2 (7,830 firms) and to the Control Group (7,831 firms). Stratified randomization was implemented using a set of baseline characteristics, including location, industry, age of firm and an indicator for being inactive in 2018. The randomization was done using the Stata command randtreat (version 1.4, 5 April 2017). The stratified randomization resulted in no statistically significant differences in baseline characteristics between the groups (see Table A1 in Appendix A).

10 Because we focus on firms with a mobile number listed with the IRC, some large enterprises are excluded because they provided the IRC with a landline phone number.

Our main outcomes of interest are the number of declarations filed on-time and the amount of tax paid on-time. We also use information about firms’ taxpaying behavior over a period of four months from when the SMS messages were sent, which allows us to measure whether the treatment effects were only short-lived or sustained over a longer period.

The treatments we provide in both trials were informed by other studies (e.g. Mascagni et al., 2017) and extensive consultation with senior management of the IRC and focus groups (see Section 3.3). We did not include a deterrence nudge because research in low- and lower-middle-income countries indicates that the effects of deterrence and non-deterrence messages are often similar (Mascagni, 2018) and because our consultations revealed that the threat of punishment from the tax authority for non-compliance is not seen as credible in this setting. We also did not include a nudge that referred to social norms of tax compliance because informing firms that around 95 percent of taxpayers were non-compliant was unlikely to trigger a positive reaction. Moreover, since we were particularly interested in the impact of our nudges on different types of taxpayers (in contrast to previous research that focuses on the effects of different types of nudges), we did not consider deterrence or social norms nudges as essential.

The two types of non-deterrence messages provided to firms in the treatment groups aimed to address different reasons for why people do not pay tax (see Section 3.3). Treatment 1 provided firms with a reminder about what they were required to do to comply with tax legislation (see Figure 3). This reminder SMS message was sent ten days before SWT declarations were due (the relevant due date for our trial was the 7th of April 2019). Firms also received a second reminder SMS message three days before the due date.11 Treatment 2 provided firms with information about the public benefit from paying taxes (see Figure 4). This SMS message is based on what has been used in other countries, but it was modified to suit the local context.12 This message was also sent ten and three days before SWT declarations were due.

[Insert Figures 3 and 4]

11 The second reminder was identical to the first reminder but instead of “REMINDER” it stated “FINAL REMINDER.”

12 Our SMS message is very similar to a message used by Mascagni et al. (2017) in Rwanda. They used the following message: “By paying your taxes you make it possible to educate our children, fund our healthcare, and keep us safe. Pay taxes. Build Rwanda. Be proud.”

4.2 Flyer Trial design

This trial studied all firms that were contacted by the IRC via post and email for being non-compliant for either SWT or VAT over a nine-month period (May 2019 to February 2020).13 In total, the IRC sent 9,709 letters and emails to non-compliant taxpayers. We present the template of the letter (called a “Demand Notice”) that was sent via post and email in Figure B2 in Appendix B. To measure the effectiveness of including a flyer along with standard correspondence,14 taxpayers were assigned to two groups based on whether they had an odd or an even TIN. TINs are a sequence of randomly allocated numbers that were determined at the time of registration. As a result, we can exploit TIN allocation as a form of random assignment. Firms with an odd TIN received a flyer (the treatment group) and those with an even TIN did not (the control group). To implement the trial, we leveraged an existing random split in the compliance enforcement division of the IRC across two floors. The compliance enforcement team on one floor focused exclusively on firms with an odd TIN, and the team on the other floor focused exclusively on firms with an even TIN. This minimized the risk of an incomplete intervention due to taxpayers in the control group receiving a flyer or taxpayers in the treatment group not receiving a flyer.15

13 This trial was implemented after the SMS trial. As the treatment was randomized there is no reason to believe that the validity of this trial was affected by the first trial.

14 The Demand Notice letter includes language trying to deter taxpayers from being non-compliant. As such our trial is based on a comparison of a deterrence letter combined with a non-deterrence nudge (treatment) and a deterrence letter without a nudge (control).

As we were unable to stratify the randomization of the second trial, there was a slight imbalance between the treatment and control groups (see Tables A2 and A3 in Appendix A). To address this, we control for differences in observed characteristics in our regression analysis (see Section 4.3 below). The outcome measures considered for this trial are the likelihood of filing a SWT or VAT declaration with the IRC and the size of tax payments.
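The two assignment rules described in Sections 4.1 and 4.2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure the authors ran: the SMS trial used the Stata command randtreat, whose treatment of leftover units within strata may differ from the simple rotation below, and the function names, field names and toy data (`stratified_assign`, `tin_parity_assign`, `inactive_2018`, and so on) are hypothetical.

```python
import random
from collections import defaultdict


def stratified_assign(firms, strata_keys, arms, seed=0):
    """Shuffle firms within each stratum, then deal them out to the arms in turn.

    Illustrative only: the paper used Stata's randtreat, whose handling of
    leftover units ("misfits") may differ from this simple scheme.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for firm in firms:
        by_stratum[tuple(firm[k] for k in strata_keys)].append(firm["id"])

    assignment = {}
    offset = 0  # continue the rotation across strata so overall arm sizes stay even
    for key in sorted(by_stratum):
        ids = by_stratum[key]
        rng.shuffle(ids)
        for position, firm_id in enumerate(ids):
            assignment[firm_id] = arms[(offset + position) % len(arms)]
        offset += len(ids)
    return assignment


def tin_parity_assign(tin):
    """Flyer trial rule from Section 4.2: odd TIN -> flyer, even TIN -> no flyer."""
    return "treatment" if tin % 2 == 1 else "control"


# Hypothetical toy data; field names and values are made up for illustration.
firms = [
    {"id": i,
     "location": "Port Moresby" if i % 2 == 0 else "Other",
     "inactive_2018": i % 3 == 0}
    for i in range(30)
]
arms = ["Treatment 1", "Treatment 2", "Control"]
assignment = stratified_assign(firms, ["location", "inactive_2018"], arms, seed=1)
```

Dealing shuffled firms out in rotation keeps arm sizes within one firm of each other inside every stratum, which is the property that stratification is meant to deliver. The parity rule is only as good as the randomness of TIN allocation, which is why the flyer trial analysis additionally controls for baseline characteristics.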
The flyer that was included with standard correspondence from the IRC to firms in the treatment group drew upon a public benefit motivation for why taxpayers should comply. This included information about how tax revenue helps to provide access to free education to over two million children and free healthcare to all citizens as well as the same wording as the SMS message sent to Treatment Group 2 in the first trial (i.e. “help educate our children, fund our healthcare and build our roads”). The flyer was sent as an attachment via email and also included in the envelope along with a Demand Notice from the IRC that explains how a taxpayer has been non-compliant. Each flyer had an English version printed on one side and the Tok Pisin (a near-universal dialect in PNG) version on the other side. A copy of the English version of the flyer is shown as Figure B3 in Appendix B.

15 Staff were randomly allocated to either the odd TIN or even TIN teams.

4.3 Empirical model

To comply with international research standards, we pre-registered our trial with the AEA RCT Registry (ID number AEARCTR-0004056). The most straightforward type of empirical analysis we conduct is an OLS regression, which measures the effect of receiving a treatment compared to not receiving a treatment (i.e. the control group) on whether taxpayers filed a declaration and on the total amount of taxes paid.16 In the first trial, our analysis involved combining both treatment groups and comparing members of this combined group to the control group. The results from our analysis provide information about the effect of receiving a SMS message on taxpayer behavior. In the second trial, we compare the behavior of taxpayers who received a flyer (those with an odd TIN) along with standard correspondence from the IRC to those that only received standard correspondence from the IRC without receiving a flyer (those with an even TIN). In both cases, we control for a set of baseline characteristics (such as location, industry and age of firm).17

16 As we only consider two outcomes of interest and heterogeneous treatment effects along a small number of pre-specified dimensions, it appears very unlikely that multiple hypothesis testing changes the conclusions drawn from our results.

17 We control for the baseline characteristics of taxpayers, as opposed to the characteristics of firms reported in their tax declarations following the treatment. This is because after the nudge is provided, the treatment is not exogenous to the information taxpayers provide in their declaration. For example, the treatment may lead some firms to change their reporting of the number of employees they pay in their SWT declaration. As such we do not include the number of employees in the current period as a control variable and we do not consider heterogeneous treatment effects along this dimension.

Our empirical strategy can be formalized as follows. To study the effect of treatment t on an outcome measure of interest, we estimate the model

$$Y_i^t = \beta_0^t + \beta_1^t T_i^t + X_i^t \beta_2^t + \varepsilon_i^t, \qquad (1)$$

where $Y_i^t$ is a given outcome of taxpayer $i$, $T_i^t$ is the treatment indicator for the comparison of treatment group $t$ to the control group, $\varepsilon_i^t$ is the model error term and $X_i^t$ is a vector of baseline characteristics.

We also use the baseline characteristics to estimate heterogeneous treatment effects. We are particularly interested in the history of tax compliance because treatment effects are predicted to vary based on this dimension (see Section 2.1). We use baseline information about whether taxpayers filed a declaration over the period from the 1st of January 2018 to the 27th of March 2019 and compare the treatment effect on non-filing and filing firms. We also examine heterogeneous treatment effects between zero filers and non-zero filers based on whether firms had previously reported to be exempt from paying tax (those without employees in the case of SWT and those whose revenue stream was exempt from VAT in the case of VAT).

We conduct three types of additional analyses to allow us to examine the mechanisms behind the effects we observe. Firstly, in our first trial, we examine whether there are differences in responses of taxpayers based on which treatment they received. To investigate this, we estimate separate regression models to compare the members of one of the two treatment groups to the control group. Secondly, to examine whether the effects of the treatments of the first trial are sustained over time, we conduct multiple OLS regressions based on the model specified above, but we vary the time frame that we examine. Specifically, we analyze the cumulative treatment effect over a period of two-week intervals from when the SMS messages were sent (i.e. 0-2 weeks, 0-4 weeks, 0-6 weeks, and so on).18 Finally, to examine if firms in the capital city of PNG, Port Moresby, respond differently to firms elsewhere in the country we also explore heterogeneous treatment effects based on the location of taxpayers.

In Appendix A, we present robustness checks of the treatment effects on the amount of tax paid (see Tables A4 and A5). Specifically, we winsorize this variable (at the 5th and the 95th percentile) to reduce the influence of outliers. We also use an inverse hyperbolic sine transformation of this variable to minimize the effect of having a large share of observations taking on the value zero. The results we present in the body of the paper are qualitatively similar to the results of these robustness checks.
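The estimation steps in Section 4.3 can be illustrated with a small, dependency-free sketch: an OLS fit of an outcome on an intercept, a treatment dummy and baseline controls, together with the two transformations used in the robustness checks. Everything here is an assumption made for illustration. The function names (`ols`, `winsorize`, `ihs`), the nearest-rank percentile rule and the toy data are hypothetical, the authors' exact specification and software are not shown in the text, and the robust standard errors reported in the tables are omitted.

```python
import math


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clip values to the nearest-rank lower and upper percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    # Nearest-rank percentile: 1-based rank = ceil(pct * n / 100), in integer arithmetic.
    lo = ordered[max(1, -(-lower_pct * n // 100)) - 1]
    hi = ordered[max(1, -(-upper_pct * n // 100)) - 1]
    return [min(max(v, lo), hi) for v in values]


def ihs(value):
    """Inverse hyperbolic sine, log(v + sqrt(v^2 + 1)); unlike log, defined at zero."""
    return math.asinh(value)


def ols(y, columns):
    """Return OLS coefficients for y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the normal equations (X'X) b = X'y as an augmented k x (k+1) matrix.
    A = [
        [sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)]
        + [sum(X[r][a] * y[r] for r in range(n))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [x - factor * z for x, z in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical toy data: filing indicator, treatment dummy, one baseline control.
treated = [0, 1, 0, 1, 0, 1, 0, 1]
in_capital = [1, 1, 0, 0, 1, 0, 1, 0]
filed = [0, 1, 0, 1, 1, 1, 0, 0]
intercept, treatment_effect, capital_coef = ols(filed, [treated, in_capital])

# Robustness-style transformations of a hypothetical tax-paid variable (PGK).
tax_paid = [0, 0, 0, 120, 450, 0, 98000, 30]
tax_paid_winsorized = winsorize(tax_paid)
tax_paid_ihs = [ihs(v) for v in tax_paid]
```

In this sketch `treatment_effect` plays the role of the coefficient on the treatment indicator in equation (1). With a binary outcome such as whether a declaration was filed, that coefficient reads as a difference in filing probability, which is why the results are reported in percentage points.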
5 Results

5.1 Treatment effects on previously non-filing and previously filing firms

Table 1 shows the treatment effects on the number of SWT declarations and the amount of SWT paid by firms observed in Trial 1 (Columns (1) and (2)), the effects on the number of SWT declarations and the amount of SWT paid by firms observed in Trial 2 (Columns (3) and (4)), and the effects on the number of VAT declarations and the amount of VAT paid by firms observed in Trial 2 (Columns (5) and (6)). In each case, we estimate the treatment effect for the full sample of all firms (Panel A) and for the subsamples of previously filing and previously non-filing firms (Panels B and C). The numbers in Panel A indicate that on average the treatments had little effect on whether firms filed a declaration and on the amount of tax paid. We only observe a statistically significant effect on SWT and VAT declarations made by all taxpayers in the flyer trial and the effect size is between 4.1 and 6.3 percentage points (between 19.9 and 27.5 percent).

18 Our main results for the first trial only examine on-time payments, as this additional analysis captures the effect of the treatment over various time frames.

[Insert Table 1]

The results in Panel B of Table 1 reveal that the treatments had a significantly positive effect on declarations of previously filing firms in both trials and across both tax types. The size of the treatment effect ranges from 8.3 to 11.1 percentage points (between 25.1 and 31.1 percent). However, we find no effect on the amount of tax collected. There was also no effect on previously non-filing taxpayers, with the exception of a slight increase in VAT declarations made in Trial 2 (see Panel C of Table 1).

5.2 Treatment effects on previously filing firms based on whether they are exempt from paying tax

Table 2 presents separate treatment effect estimates for previously filing firms based on whether they are exempt (zero filers) or non-exempt from paying tax (non-zero filers).
We find a positive treatment effect on SWT declarations made by zero filers in both trials. The corresponding effects on non-zero filers are not statistically significant. In the first trial, the treatment increases the average number of declarations made by 17.7 percentage points (an increase of more than 90 percent). In the second trial, the treatment increases the number of declarations made by 3.3 to 4.3 percentage points (an increase of 13.6 to 22.4 percent). In the case of VAT declarations, the effect is not significant (the p-value is 0.155), but it is clear from examining the direction of the point estimates that the overall positive effect across previously filing firms is driven by zero filers. The negative effect on the amount of VAT paid is not robust to alternative specifications of this variable that we show in Tables A4 and A5.

[Insert Table 2]

5.3 Effects of different treatments in the first trial

Table 3 presents separate treatment effects for the two nudges that were used in Trial 1. We find that both treatments increase the number of on-time SWT declarations among previously filing firms, although the effect is only significant for the reminder treatment (the p-value for the public benefit treatment is 0.163). The two treatments do not lead to a significant increase in the amount of SWT paid. The size of the effect on SWT declarations was between 7.4 and 11.1 percentage points (ranging from 19.9 to 29.9 percent) and the difference between the two treatments is not statistically significant. There are no effects from the treatments on previously non-filing taxpayers.

[Insert Table 3]

5.4 Treatment effects in the first trial over time

To study how the effect of the SMS messages evolves over time, we obtain the treatment effect observed after two weeks and then gradually add two-week intervals to our sample to re-estimate the effect.
We also examine whether any differences existed between the treatment and control groups in the four weeks prior to the SMS messages being sent. The estimates presented in Figure 5 reveal that the treatment effect of our first trial was short-lived. We observe a significant and economically meaningful effect on on-time declarations of previously filing firms, which is 25 percent higher than the control group. However, within four months the difference between the treatment and control groups disappears entirely. This can be seen in Figure 5, which shows the cumulative treatment effect at two-week intervals from when the SMS messages were sent.19

19 Figure A3 in Appendix A shows the treatment effect within each two-week period. The estimates indicate that the observed point estimate turns negative after six weeks.

[Insert Figure 5]

5.5 Differences in treatment effects across PNG

Table 4 shows the treatment effects on previously filing taxpayers disaggregated by whether or not they are based in Port Moresby. We present the effects on the number of SWT declarations and SWT paid in Trial 1 (Panel A), the number of SWT declarations and SWT paid in Trial 2 (Panel B), and the number of VAT declarations and VAT paid in Trial 2 (Panel C). We find that the treatment effect on SWT declarations is more than twice as large for previously filing firms outside Port Moresby in both trials. The effects of the nudges as a proportion of the control group are particularly large because tax compliance is considerably lower outside Port Moresby. We observe treatment effects between 41.7 and 67.3 percent for previously filing firms outside Port Moresby and between 13.6 and 26.0 percent for previously filing firms in Port Moresby.

[Insert Table 4]

6 Discussion and Conclusion

6.1 Channel driving our effects

Our results indicate firms that face the lowest cost from tax compliance are the most likely to respond to credible, non-deterrence nudges.
These findings support the predictions of our model in Section 2.1 where we build on standard theory by introducing the idea that taxpayers face transaction costs from complying, in addition to financial costs. In line with our first prediction, we find that previously filing taxpayers are more responsive to nudges than previously non-filing taxpayers. In both trials, the treatments had a positive effect on firms that had filed a declaration at least once in the previous 15 months but were not potent enough to change the behavior of firms that had an entrenched behavior of not paying tax. In support of our second prediction, we show that zero filers (i.e. taxpayers who are exempt from paying tax) are more responsive to nudges than non-zero filers (i.e. taxpayers who are not exempt from paying tax). In both trials there was no effect on the total amount of revenue collected because the increase in the number of declarations made was primarily driven by firms that are exempt from paying tax.

We provide evidence suggesting that the channel through which the treatments affect taxpayer behavior is simply being contacted by the tax authority. As such, the nudges we provide appear to be working as a deterrent. We find very similar effects on filing on-time declarations and the amount of tax paid from the two types of SMS messages in the first trial (see Section 5.3). This is despite the fact that one message explicitly refers to the specific due date for SWT (the reminder nudge) and the other message does not refer to dates or types of taxes (the public benefit nudge). We also show that the differences in declaration behavior between taxpayers in the treatment group and the control group in the first trial appear to be purely an immediate response to having received a nudge, as opposed to a sustained change in behavior.
Further, we show across both trials that the overall impact of the treatments is primarily driven by firms that rarely come into contact with the tax authority because they are based outside Port Moresby. Collectively, these findings point toward the idea that taxpayers may well perceive nudges as a sign that they are under greater scrutiny.

Relating this proposed mechanism to our model in Section 2.1, the treatments we provided potentially increased taxpayers’ belief that they may receive punishment from non-compliance in the short-term (i.e. the nudges may have increased γb). This could explain why, in the first trial, we observe the same effect regardless of whether we attempt to decrease the transaction cost taxpayers face from complying with the reminder message (r) or whether we attempt to increase the utility they gain from paying tax with the public benefit message (a). This mechanism could also explain why the treatment effect disappears over time without further correspondence from the IRC because taxpayers may gradually reduce their perceived likelihood of receiving punishment resulting from non-compliance to a level similar to what it was before. At a minimum, our results indicate that the absence of on-time filing of declarations is not due to taxpayers being unaware of due dates. If this were the case, we would have found a substantially larger effect from the reminder treatment that mentioned the due date, than from the public benefit treatment in the first trial.

6.2 How our findings relate to previous studies

Our analysis extends the frontier of knowledge on the use of nudges to improve tax compliance in low- and lower-middle-income countries in several directions. For example, this is the first population-wide tax compliance trial in a country with a low level of development, which allows for a thorough examination of how the impact of nudges varies between different types of taxpayers based on their history of tax compliance and location. However, there are still important parallels between the findings of our research and previous studies.

Firstly, our finding that correspondence from the tax authority can change filing behavior in the short run but not revenue has also been observed in previous studies (Carrillo et al., 2017; Kettle et al., 2016; Perez-Truglia and Troiano, 2018). Carrillo et al. (2017) illustrate this occurred in Ecuador where in response to letters drawing on third party information firms increased their reported costs to largely offset the higher revenues that had been detected. Kettle et al. (2016) find that a simple reminder message increases declarations without increasing tax paid in Guatemala because the effect on declarations is driven by non-compliant firms that subsequently report that they are exempt from paying tax. These findings are consistent with the explanation in our paper, whereby taxpayers facing lower total costs from compliance are the most likely to respond to nudges.

Secondly, the finding of similar effects of different types of nudges is consistent with earlier research (Antinyan and Asatryan, 2019; Mascagni, 2018; Slemrod, 2019). For example, in Ethiopia and Rwanda, Shimeles et al. (2017) and Mascagni et al. (2017) respectively, find that deterrence and non-deterrence messages have similar effects on the likelihood of taxpayers filing a declaration and on the amount of tax paid. They note that in these settings most registered taxpayers do not receive any correspondence from their tax authority (in the case of Shimeles et al. (2017), their sample consisted of taxpayers that had never been contacted before). As such, they argue that any nudge may make the risk of punishment from non-compliance more salient for taxpayers. We provide suggestive evidence that this is also the case for the trials we conducted.
Future research could explore this issue in a low- or lower-middle-income country setting in more detail, such as by conducting population-wide trials that directly compare the effects of deterrence and non-deterrence nudges.

6.3 Implications of our findings

Our results draw into question the value for money of credible, non-deterrence nudges aiming to generate substantial amounts of tax revenue across the population of taxpayers in low- and lower-middle-income countries. While we do detect a sizeable effect on the likelihood of firms filing a declaration, this does not translate into greater tax revenue being collected. However, there are potentially other benefits that are difficult to quantify from taxpayers filing declarations. These include providing the government with up-to-date information about economic activity and helping taxpayers to form habits so that if in the future they are no longer exempt from paying tax they will still continue to file declarations. Although the SMS messages and printing of flyers are relatively low cost compared to traditional methods of hiring and training more staff to audit taxpayers, they are not free. In total, the SMS messages sent as part of the first trial cost around US$1,625 (PGK 5,500) and the printing of flyers for the second trial cost around US$500 (PGK 1,700). As such, it is highly unlikely the interventions we provided would pass a cost-benefit test.

By introducing the idea that taxpayers face transaction costs from complying, and providing extensive experimental findings to support this, we illustrate that substantially different approaches are required to improve the compliance of various subsets of taxpayers depending on their previous tax compliance behavior. Firms that are either non-filers or non-zero filers did not respond to the treatments we provided.
To be able to improve the compliance of these types of firms, tax authorities may need to rely on traditional methods, such as increasing punishment for non-compliance (this is captured as p in our model in Section 2.1) or increasing the likelihood that firms are caught if they do not comply (this is illustrated by γ in our model in Section 2.1). Alternatively, governments in low- and lower-middle-income countries could consider investing in raising the quality and quantity of third party reporting and using personalized information to nudge taxpayers (Kleven et al., 2018; Brockmeyer et al., 2019).

7 Tables and figures

Figure 1: Breakdown of registered taxpayers by degree of SWT Compliance
Note: This figure shows that 76 percent of taxpayers registered to pay SWT did not file a declaration in 2018 and only 7 percent filed a declaration each month.

Figure 2: Breakdown of registered taxpayers by degree of VAT Compliance
Note: This figure shows that 74 percent of taxpayers registered to pay VAT did not file a declaration in 2018 and only 6 percent filed a declaration each month.

Figure 3: Message sent to firms in Treatment Group 1 (Reminder)
Note: This figure shows the SMS message that was sent from the IRC to taxpayers in Treatment Group 1.

Figure 4: Message sent to firms in Treatment Group 2 (Public Benefit)
Note: This figure shows the SMS message that was sent from the IRC to taxpayers in Treatment Group 2.

Figure 5: Treatment effects over time
Note: This figure shows the cumulative treatment effect at two-week intervals in the four weeks before and 16 weeks after the SMS messages were sent. The coefficient on the treatment dummy of equation (1) (presented in Section 4.3) is plotted on the y-axis. The vertical red line indicates when the SMS messages were sent.
Table 1 - Treatment effects across different types of firms

                            SMS Trial                   Flyer Trial
                            SWT           SWT Paid      SWT           SWT Paid      VAT           VAT Paid
                            Declarations  (PGK)         Declarations  (PGK)         Declarations  (PGK)
                            (1)           (2)           (3)           (4)           (5)           (6)

Panel A - All firms
Treatment Effect            0.030         1185          0.041∗∗∗      4679          0.063∗∗∗      -979
                            (0.02)        (2315)        (0.01)        (5237)        (0.01)        (1302)
p-value                     0.156         0.609         0.000         0.372         0.000         0.452
Controls                    Y             Y             Y             Y             Y             Y
Mean of dependent variable  0.149         6,500         0.149         10,362        0.151         2,383
Observations                23,489        23,489        5,157         5,157         4,552         4,552

Panel B - Previously filing firms
Treatment Effect            0.093∗∗       3986          0.083∗∗∗      6230          0.111∗∗∗      -1166
                            (0.05)        (8733)        (0.02)        (11150)       (0.02)        (1340)
p-value                     0.042         0.648         0.000         0.576         0.000         0.384
Controls                    Y             Y             Y             Y             Y             Y
Mean of dependent variable  0.371         4,486         0.309         25,164        0.357         3,882
Observations                6,169         6,169         2,249         2,249         2,045         2,045

Panel C - Previously non-filing firms
Treatment Effect            0.006         -72∗          0.000         2530          0.022∗∗       -751
                            (0.02)        (42)          (0.01)        (3544)        (0.01)        (2097)
p-value                     0.790         0.089         0.969         0.475         0.039         0.720
Controls                    Y             Y             Y             Y             Y             Y
Mean of dependent variable  0.068         84.62         0.054         862.95        0.066         2,455
Observations                17,319        17,319        2,908         2,908         2,507         2,507

Note: This table shows the treatment effects for all firms (Panel A) and for the subsamples of previously filing firms (Panel B) and previously non-filing firms (Panel C) of the first trial (Columns (1) and (2)) and the second trial (Columns (3) to (6)). PGK - Papua New Guinea Kina (USD1 was approx. PGK3.4 on the 29th August 2019). Previously filing firms - defined as having filed a declaration between the 1st of January 2018 and 27th of March 2019. Previously non-filing firms - defined as having not filed a declaration between the 1st of January 2018 and 27th of March 2019. Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.
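As a worked check on how the percentage-point and percent figures in Section 5.1 relate, the sketch below divides each declaration coefficient in Panel B of Table 1 by the reported mean of the dependent variable. This is an assumption about how the percent effects were computed, since the text does not state the formula, and `relative_effect` is a hypothetical helper name. Under that assumption the calculation gives about 25.1 percent for the SMS trial and 31.1 percent for VAT in the flyer trial, matching the range quoted in Section 5.1.

```python
def relative_effect(coefficient, mean_of_dependent_variable):
    """Express a percentage-point coefficient as a percent of the reported mean."""
    return 100.0 * coefficient / mean_of_dependent_variable


# (coefficient, mean of dependent variable) for declarations, Table 1, Panel B.
panel_b = {
    "SMS trial, SWT": (0.093, 0.371),
    "Flyer trial, SWT": (0.083, 0.309),
    "Flyer trial, VAT": (0.111, 0.357),
}

for label, (coef, mean) in panel_b.items():
    print(f"{label}: {100 * coef:.1f} pp = {relative_effect(coef, mean):.1f} percent")
```

The same division applied to the zero-filer estimate in Table 2 (0.177 against a mean of 0.196) gives roughly 90 percent, which is why a modest-looking coefficient can correspond to a large proportional change when baseline filing rates are low.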
Table 2 - Treatment effects on zero filers and non-zero filers

                            SMS Trial                   Flyer Trial
                            SWT           SWT Paid      SWT           SWT Paid      VAT           VAT Paid
                            Declarations  (PGK)         Declarations  (PGK)         Declarations  (PGK)
                            (1)           (2)           (3)           (4)           (5)           (6)

Panel A - Zero filers
Treatment Effect            0.177∗∗       0             0.043∗∗       0             0.033         0
                            (0.08)        0             (0.02)        0             (0.02)        0
p-value                     0.024                       0.050                       0.155
Controls                    Y             Y             Y             Y             Y             Y
Mean of dependent variable  0.196         0             0.192         0             0.242         0
Observations                3,150         3,150         1,406         1,406         1,421         1,421

Panel B - Non-zero filers
Treatment Effect            0.017         12680         0.036         -9328         -0.022        -11776∗∗
                            (0.05)        (17612)       (0.03)        (30537)       (0.03)        (4715)
p-value                     0.713         0.472         0.288         0.760         0.528         0.013
Controls                    Y             Y             Y             Y             Y             Y
Mean of dependent variable  0.544         48,105        0.586         85,076        0.807         19,110
Observations                3,019         3,019         843           843           624           624

Note: This table shows the treatment effects on zero filers (Panel A) and non-zero filers (Panel B) of the first trial (Columns (1) and (2)) and the second trial (Columns (3) to (6)). PGK - Papua New Guinea Kina (USD1 was approx. PGK3.4 on the 29th August 2019). Zero filers - defined as having filed a declaration between the 1st of January 2018 and 27th of March 2019, but claimed to be exempt from paying tax. Non-zero filers - defined as having filed a declaration between the 1st of January 2018 and 27th of March 2019 and did not claim to be exempt from paying tax. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.
Table 3 - Effects of the different treatments in trial 1

| | SWT Declarations (1) | SWT Paid (PGK) (2) |
|---|---|---|
| Panel A - Reminder | | |
| Previously filing firms | 0.111∗∗∗ | 2515 |
| | (0.04) | (6901) |
| p-value | 0.010 | 0.716 |
| Controls | Y | Y |
| Mean of dependent variable | 0.371 | 24,156 |
| Observations | 4,145 | 4,145 |
| Previously non-filing firms | 0.006 | -72 |
| | (0.03) | (59) |
| p-value | 0.810 | 0.226 |
| Controls | Y | Y |
| Mean of dependent variable | 0.068 | 85 |
| Observations | 11,514 | 11,514 |
| Panel B - Public Benefit | | |
| Previously filing firms | 0.074 | 5097 |
| | (0.05) | (11198) |
| p-value | 0.163 | 0.649 |
| Controls | Y | Y |
| Mean of dependent variable | 0.371 | 24,156 |
| Observations | 4,111 | 4,111 |
| Previously non-filing firms | 0.006 | -73 |
| | (0.03) | (59) |
| p-value | 0.826 | 0.217 |
| Controls | Y | Y |
| Mean of dependent variable | 0.068 | 85 |
| Observations | 11,550 | 11,550 |

Note: This table shows the effect of the reminder treatment on previously filing and non-filing firms in Trial 1 (Panel A) and the effect of the public benefit treatment on previously filing and non-filing firms in Trial 1 (Panel B). PGK - Papua New Guinea Kina (USD1 was approx. PGK3.4 on the 29th August 2019). Previously filing firms - defined as having filed a declaration between the 1st of January 2018 and 27th of March 2019. Previously non-filing firms - defined as having not filed a declaration between the 1st of January 2018 and 27th of March 2019. Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.
Table 4 - Heterogeneous treatment effects by location of previously filing firms

| | Outside Port Moresby: Declarations (1) | Outside Port Moresby: Tax Paid (PGK) (2) | In Port Moresby: Declarations (3) | In Port Moresby: Tax Paid (PGK) (4) |
|---|---|---|---|---|
| Panel A - Trial 1 (SWT) | | | | |
| Treatment Effect | 0.140 | 7607 | 0.064 | 6338 |
| | (0.09) | (5725) | (0.05) | (13428) |
| p-value | 0.104 | 0.184 | 0.214 | 0.637 |
| Controls | Y | Y | Y | Y |
| Mean of dependent variable | 0.336 | 6,392 | 0.394 | 36,098 |
| Observations | 2,439 | 2,439 | 3,730 | 3,730 |
| Panel B - Trial 2 (SWT) | | | | |
| Treatment Effect | 0.138∗∗∗ | 2889 | 0.051∗∗ | 8813 |
| | (0.03) | (12183) | (0.03) | (16365) |
| p-value | 0.000 | 0.813 | 0.047 | 0.590 |
| Controls | Y | Y | Y | Y |
| Mean of dependent variable | 0.205 | 14,100 | 0.374 | 32,146 |
| Observations | 853 | 853 | 1,396 | 1,396 |
| Panel C - Trial 2 (VAT) | | | | |
| Treatment Effect | 0.125∗∗∗ | -30 | 0.106∗∗∗ | -1743 |
| | (0.04) | (473) | (0.03) | (2030) |
| p-value | 0.000 | 0.949 | 0.000 | 0.391 |
| Controls | Y | Y | Y | Y |
| Mean of dependent variable | 0.257 | 760 | 0.407 | 5,435 |
| Observations | 709 | 709 | 1,336 | 1,336 |

Note: This table shows the treatment effects on previously filing firms in Trial 1 (Panel A) and Trial 2 (Panels B and C) disaggregated based on whether firms are based outside Port Moresby (Columns (1) and (2)) or in Port Moresby (Columns (3) and (4)). PGK - Papua New Guinea Kina (USD1 was approx. PGK3.4 on the 29th August 2019). Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

References

Allingham, M. and A. Sandmo, “Income Tax Evasion: A Theoretical Analysis,” Journal of Public Economics, 1972, 1 (3-4), 323–338.

Antinyan, A. and Z. Asatryan, “Nudging for Tax Compliance: A Meta-Analysis,” ZEW Discussion Paper No. 19-055, 2019.

Becker, G., “Crime and Punishment: An Economic Approach,” Journal of Political Economy, 1968, 76 (2), 169–217.

Bergolo, M., R. Ceni, G. Cruces, M. Giaccobasso, and R. Perez-Truglia, “Tax Audits as Scarecrows. Evidence from a Large-Scale Field Experiment,” IZA Discussion Papers 12335, 2019.

Besley, T. and T. Persson, “Why Do Developing Countries Tax So Little?,” Journal of Economic Perspectives, 2014, 28 (4), 99–120.

Brockmeyer, A., S. Smith, M.
Hernandez, and S. Kettle, “Casting a Wider Tax Net: Experimental Evidence from Costa Rica,” American Economic Journal: Economic Policy, 2019, 11 (3), 55–87.

Carrillo, P., D. Pomeranz, and M. Singhal, “Dodging the Taxman: Firm Misreporting and Limits to Tax Enforcement,” American Economic Journal: Applied Economics, 2017, 9 (2), 144–164.

Castro, L. and C. Scartascini, “Tax Compliance and Enforcement in the Pampas: Evidence from a Field Experiment,” Technical Report, 2015.

Chetty, R., M. Mobarak, and M. Singhal, “Increasing Tax Compliance through Social Recognition,” International Growth Centre Policy Brief 14/0658, 2014.

De Neve, J., C. Imbert, J. Spinnewijn, T. Tsankova, and M. Luts, “How to Improve Tax Compliance? Evidence from Population-Wide Experiments in Belgium,” Said Business School Working Paper 2019-07, 2019.

DellaVigna, S. and E. Linos, “RCTs to Scale: Comprehensive Evidence from Two Nudge Units,” Working Paper, 2020.

Gordon, R. and W. Li, “Tax structures in developing countries: Many puzzles and a possible explanation,” Journal of Public Economics, 2009, 93 (7-8), 855–866.

Hallsworth, M., “The use of Field Experiments to Increase Tax Compliance,” Oxford Review of Economic Policy, 2014, 30 (4), 658–679.

Holzinger, L. Arcos and N. Biddle, “Behavioural insights of Tax Compliance: An Overview of Recent Conceptual and Empirical Approaches,” Tax and Transfer Policy Institute Working Paper 8/2016, 2016.

IMF, “Revenue Administration Fiscal Information Tool (RA-FIT),” International Monetary Fund (IMF), available at: https://data.imf.org/?sk=BA91013D-3261-42F8-A931-A829A78CB1EC, 2020.

Kettle, S., M. Hernandez, S. Ruda, and M. Sanders, “Behavioral Interventions in Tax Compliance: Evidence from Guatemala,” World Bank Group Policy Research Working Paper 7690, 2016.

Kleven, H., C. Kreiner, and E. Saez, “Why Can Modern Governments Tax So Much? An Agency Model of Firms as Fiscal Intermediaries,” Economica, 2018, 83, 219–246.
Mascagni, G., “From The Lab To The Field: A Review Of Tax Experiments,” Journal of Economic Surveys, 2018, 32 (2), 273–301.

Mascagni, G., C. Nell, and N. Monkam, “One Size Does Not Fit All: A Field Experiment on the Drivers of Tax Compliance and Delivery Methods in Rwanda,” International Centre for Tax and Development Working Paper 58, 2017.

McKenzie, D. and A. Paffhausen, “Small Firm Death in Developing Countries,” The Review of Economics and Statistics, 2019, 101 (4), 645–657.

Meiselman, B. S., “Ghostbusting in Detroit: Evidence on nonfilers from a controlled field experiment,” Journal of Public Economics, 2018, 158 (2018), 180–193.

Muralidharan, K. and P. Niehaus, “Experimentation at Scale,” Journal of Economic Perspectives, 2017, 31 (4), 103–124.

Naritomi, J., “Consumers as Tax Auditors,” American Economic Review, 2019, 109 (9), 3031–72.

OECD, “Revenue Statistics Asia and Pacific - Papua New Guinea,” Organisation for Economic Cooperation and Development (OECD), available at: https://www.oecd.org/countries/papuanewguinea/revenue-statistics-asia-and-pacific-papua-new-guinea.pdf, 2019.

Ortega, D. and C. Scartascini, “Don’t Blame the Messenger. The Delivery method of a message matters,” Journal of Economic Behavior and Organisation, 2020, 170, 286–300.

Ortega, D., C. Scartascini, and M. Mogollon, “Who’s Calling?: The Effect of Phone Calls and Personal Interaction on Tax Compliance,” Inter-American Development Bank Working Paper 10057, 2020.

Perez-Truglia, R. and U. Troiano, “Shaming tax delinquents,” Journal of Public Economics, 2018, 167, 120–137.

Pomeranz, D., “No Taxation without Information: Deterrence and Self-Enforcement in the Value Added Tax,” American Economic Review, 2015, 105 (8), 2539–2569.

Shimeles, A., D. Z. Gurara, and F. Woldeyes, “Taxman’s Dilemma: Coercion or Persuasion? Evidence from a Randomized Field Experiment in Ethiopia,” American Economic Association: Papers and Proceedings, 2017, 107 (5), 420–424.
Slemrod, J., “Tax Compliance and Enforcement,” Journal of Economic Literature, 2019, 57 (4), 904–954.

Slemrod, J., O. U. Rehman, and M. Waseem, “How Do Taxpayers Respond to Public Disclosure and Social Recognition Programs? Evidence from Pakistan,” Review of Economics and Statistics, forthcoming, 2020.

UNDP, “Human Development Index,” United Nations Development Programme (UNDP), New York City: The United Nations, 2019.

World Bank, “World Bank Country and Lending Groups,” Washington DC: The World Bank, 2019.

Appendix A Description of the scoping survey

To better understand why Papua New Guineans might not be paying the right amount of tax, and which messages might make them more likely to do so, we conducted a short scoping survey. We emailed a link to the survey to IRC staff, staff and students at the University of Papua New Guinea (UPNG), staff of a large managing contractor (Abt Associates) and small business owners. We did not have email addresses for individual IRC staff and Abt employees, so we asked management to forward the email on our behalf. We also provided hard copy printouts to small business owners who attended focus groups. The list of small business owners who received the survey was provided by the PNG Women’s Business Resource Centre.

Participation was voluntary and no reward was provided. A large share of the IRC respondents worked in the Debt and Lodgement Enforcement Division. This is the Division we worked most closely with for the trials, and we expected its staff to have the best understanding of tax compliance behavior. Its management was also the most proactive in encouraging and facilitating staff to complete the survey. Technical issues meant that some IRC staff who tried to complete the survey online were unable to access it (see footnote 20). We used SurveyMonkey and randomized the ordering of options presented within each question. There were 137 responses.
Fifty-eight percent were IRC staff, 29 percent were employees of Abt Associates and 12 percent were small business owners. No UPNG staff or students responded. Other than the types of taxes respondents pay, occupation was the only demographic measure collected. The answers did not differ significantly between occupations. Responses were collected from February 5 to 11, 2019.

Footnote 20: In these instances we provided hard copy printouts to these IRC staff.

Six questions were asked regarding why Papua New Guineans might not be paying the right amount of tax and possible messages that would make them more likely to do so.

The first set of questions related to why Papua New Guineans might not be paying the right amount of tax. They were: “We are interested in understanding why some people in Papua New Guinea do not pay tax (or the right amount of tax). Please rank in order (with the most likely reason first) the following reasons why you think some Papua New Guineans do not pay tax (or the right amount of tax).” The options provided were: “They are unsure what the tax is used for OR think the tax will not be used well”; “They think most other people like them avoid paying tax”; “They find paying tax too complicated OR they forget to do so”; “They think they can avoid paying tax and not face punishment”. Figure A1 shows the proportion of respondents who selected each option as their most likely reason.

Figure A1: Respondents’ views on why people do not pay tax
[Bar chart of the share of respondents ranking each reason first; vertical axis from 0 to 40. Categories: “They think they can avoid paying tax and not face punishment”; “They think most other people like them avoid paying tax”; “They are unsure what the tax is used for OR think the tax will not be used well”; “They find paying tax too complicated OR they forget to do so”. Bar labels as extracted: 36, 32, 28, 10.]

Then we asked “In the list in the previous question, you selected one option as number 1.
Why did you select this option as the most likely reason why some Papua New Guineans do not pay tax (or the right amount of tax)?” and “What are other reasons you think some people in Papua New Guinea do not pay tax (or the right amount of tax)?” The qualitative answers provided to these questions helped enrich our understanding of why respondents believe many people in Papua New Guinea do not pay the right amount of tax.

The next set of questions related to possible messages that would make Papua New Guineans more likely to pay tax. They were: “We are interested in understanding what types of messages could be provided to improve tax compliance. Please rank in order (with the most likely message first) which of the following types of messages would be the most effective in encouraging people to pay the right amount of tax.” The options provided were: “A message that explains the possible punishment people can face for not paying the right amount of tax (Deterrence)”; “A message that explains simply how people should determine the amount of tax they owe and easy ways to make this payment (Simplification)”; “A message that explains how the tax they pay is used (Shared Benefit)”; “A message that explains the majority of taxpayers in Papua New Guinea pay their taxes on time (Shared Norms)”; “A message that appeals to people’s national pride and their duty to pay tax to support the development of Papua New Guinea (Tax Morale)”. Figure A2 shows the proportion of respondents who selected each option as the message type they thought would most likely be effective.

Figure A2: Respondents’ views on what messages would lead people to pay more tax
[Bar chart of the share of respondents ranking each message type first; vertical axis from 0 to 40. Categories: Simplification, Deterrence, Shared Benefit, Shared Norms, Tax Morale. Bar labels as extracted: 38, 32, 27, 6, 6.]

Note: Simplification defined as a message that explains simply how people should determine the amount of tax they owe and easy ways to make this payment.
Deterrence defined as a message that explains the possible punishment people can face for not paying the right amount of tax. Shared benefit defined as a message that explains how the tax they pay is used. Shared Norms defined as a message that explains that the majority of taxpayers in PNG pay their taxes on time. Tax Morale defined as a message that appeals to people’s national pride and their duty to pay tax to support the development of PNG.

Then we asked: “In the list in the previous question, you selected one option as number 1. Why did you select this message as the most effective in encouraging people to pay the right amount of tax?” and “What is another type of message you think would improve the payment of taxes?” The qualitative answers provided to these questions helped enrich our understanding of what types of messages respondents believed would be the most effective at leading Papua New Guineans to pay more tax.

Additional Tables and Figures

Table A1 - Balance across the background characteristics of taxpayers in trial 1

| Variable | Control Group (1) | Treatment Group 1 (2) | Treatment Group 2 (3) | t-test Difference (1)-(2) | t-test Difference (1)-(3) |
|---|---|---|---|---|---|
| Enterprise over five years old | 0.384 | 0.389 | 0.385 | -0.004 | -0.001 |
| | (0.006) | (0.006) | (0.006) | (0.007) | (0.008) |
| Filed a declaration in last 15 months | 0.265 | 0.262 | 0.257 | 0.003 | 0.008 |
| | (0.005) | (0.005) | (0.005) | (0.007) | (0.007) |
| Located in greater Port Moresby region | 0.560 | 0.560 | 0.560 | -0.000 | 0.000 |
| | (0.006) | (0.006) | (0.006) | (0.008) | (0.008) |
| Tax processed through head office | 0.939 | 0.943 | 0.940 | -0.004 | -0.001 |
| | (0.003) | (0.003) | (0.003) | (0.004) | (0.004) |
| Registered as a company | 0.652 | 0.659 | 0.659 | -0.007 | -0.007 |
| | (0.005) | (0.005) | (0.005) | (0.008) | (0.008) |
| Industry classified as business services | 0.424 | 0.426 | 0.431 | -0.002 | -0.007 |
| | (0.006) | (0.006) | (0.006) | (0.008) | (0.008) |
| Observations | 7,831 | 7,828 | 7,830 | | |

Note: Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.
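A balance comparison of this kind can be approximated from the reported group means and sample sizes. The sketch below assumes that the characteristic is a binary indicator, uses the unpooled variance formula for a difference in proportions, and relies on a normal approximation for the p-value; these are illustrative assumptions rather than the exact procedure behind Table A1, and the function name is hypothetical. It takes the "Filed a declaration in last 15 months" row for the control group and treatment group 2.

```python
import math


def proportion_balance_check(p_control, n_control, p_treat, n_treat):
    """Difference in two proportions with an unpooled standard error.

    Returns (difference, standard error, z statistic, two-sided p-value).
    The p-value uses a normal approximation, which is an assumption.
    """
    diff = p_control - p_treat
    variance = (p_control * (1 - p_control) / n_control
                + p_treat * (1 - p_treat) / n_treat)
    se = math.sqrt(variance)
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, se, z, p_value


# Table A1: share that filed a declaration in the last 15 months.
# Control group: 0.265 (N = 7,831); treatment group 2: 0.257 (N = 7,830).
diff, se, z, p_value = proportion_balance_check(0.265, 7831, 0.257, 7830)

print(f"difference: {diff:.3f}, standard error: {se:.3f}")
print(f"z statistic: {z:.2f}, p-value: {p_value:.2f}")
```

With these inputs the difference is not statistically distinguishable from zero at conventional levels, in line with the absence of significance stars in that row of the table.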
Table A2 - Balance across the background characteristics of SWT taxpayers in trial 2

| Variable | N | Control Group (1) | N | Treatment Group (2) | t-test Difference (1)-(2) |
|---|---|---|---|---|---|
| Enterprise over five years old | 2,324 | 0.359 | 2,833 | 0.442 | -0.082∗∗∗ |
| | | (0.010) | | (0.009) | (0.014) |
| Filed a declaration in last 15 months | 2,324 | 0.439 | 2,833 | 0.433 | 0.006 |
| | | (0.010) | | (0.009) | (0.014) |
| Located in greater Port Moresby region | 2,324 | 0.608 | 2,833 | 0.609 | -0.001 |
| | | (0.010) | | (0.009) | (0.014) |
| Tax processed through head office | 2,324 | 0.970 | 2,833 | 0.962 | 0.008 |
| | | (0.004) | | (0.004) | (0.005) |
| Registered as a company | 2,324 | 0.568 | 2,833 | 0.640 | -0.073∗∗∗ |
| | | (0.010) | | (0.009) | (0.014) |
| Industry classified as business services | 2,324 | 0.408 | 2,833 | 0.413 | -0.005 |
| | | (0.010) | | (0.009) | (0.014) |

Note: Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Table A3 - Balance across the background characteristics of VAT taxpayers in trial 2

| Variable | N | Control Group (1) | N | Treatment Group (2) | t-test Difference (1)-(2) |
|---|---|---|---|---|---|
| Enterprise over five years old | 2,002 | 0.295 | 2,550 | 0.308 | -0.013 |
| | | (0.010) | | (0.009) | (0.014) |
| Filed a declaration in last 15 months | 2,002 | 0.445 | 2,550 | 0.453 | -0.007 |
| | | (0.011) | | (0.010) | (0.015) |
| Located in greater Port Moresby region | 2,002 | 0.646 | 2,550 | 0.625 | 0.020 |
| | | (0.011) | | (0.010) | (0.014) |
| Tax processed through head office | 2,002 | 0.976 | 2,550 | 0.965 | 0.011∗∗ |
| | | (0.003) | | (0.004) | (0.005) |
| Registered as a company | 2,002 | 0.644 | 2,550 | 0.679 | -0.035∗∗ |
| | | (0.011) | | (0.009) | (0.014) |
| Industry classified as business services | 2,002 | 0.407 | 2,550 | 0.417 | -0.011 |
| | | (0.011) | | (0.010) | (0.015) |

Note: Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.
Table A4 - Robustness checks for treatment effect on tax paid for previously filing firms

| | Tax Paid (winsorized) | Tax Paid (IHS†) |
|---|---|---|
| Panel A - Trial 1 (SWT) | | |
| Treatment effect | 63 | -0.085 |
| | (165) | (0.10) |
| p-value | 0.703 | 0.408 |
| Controls | Y | Y |
| Observations | 6,169 | 6,169 |
| Panel B - Trial 2 (SWT) | | |
| Treatment effect | 2019∗ | 0.846∗∗∗ |
| | (1207) | (0.18) |
| p-value | 0.095 | 0.000 |
| Controls | Y | Y |
| Observations | 2,249 | 2,249 |
| Panel C - Trial 2 (VAT) | | |
| Treatment effect | -1166 | 0.476∗∗∗ |
| | (1340) | (0.13) |
| p-value | 0.384 | 0.000 |
| Controls | Y | Y |
| Observations | 2,045 | 2,045 |

Note: † IHS: inverse hyperbolic sine transformation. We do not present marginal effects of the IHS model. These results are qualitatively similar to those presented in the paper. PGK - Papua New Guinea Kina (USD1 was approx. PGK3.4 on the 29th August 2019). Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Figure A3: Additional figure showing effect of the treatments over time

Table A5 - Robustness checks for treatment effect on tax paid for previously non-filing firms

| | Tax Paid (winsorized) | Tax Paid (IHS†) |
|---|---|---|
| Panel A - Trial 1 (SWT) | | |
| Treatment effect | -2 | -0.003 |
| | (8) | (0.00) |
| p-value | 0.775 | 0.440 |
| Controls | Y | Y |
| Observations | 17,320 | 17,320 |
| Panel B - Trial 2 (SWT) | | |
| Treatment effect | 26 | 0.043 |
| | (259) | (0.04) |
| p-value | 0.919 | 0.261 |
| Controls | Y | Y |
| Observations | 2,908 | 2,908 |
| Panel C - Trial 2 (VAT) | | |
| Treatment effect | -751 | 0.136∗∗∗ |
| | (2097) | (0.05) |
| p-value | 0.720 | 0.010 |
| Controls | Y | Y |
| Observations | 2,507 | 2,507 |

Note: † IHS: inverse hyperbolic sine transformation. We do not present marginal effects of the IHS model. These results are qualitatively similar to those presented in the paper. PGK - Papua New Guinea Kina (USD1 was approx. PGK3.4 on the 29th August 2019). Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

Appendix B

Figure B1: Form required to be completed every month for SWT registered firms

Figure B2: Example of a letter sent to non-compliant taxpayers

Figure B3: Flyer
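The robustness checks in Tables A4 and A5 rely on winsorized and inverse hyperbolic sine (IHS) transformed tax payments. A minimal sketch of both transformations follows. The winsorizing rule (an upper cap at the 90th percentile, nearest-rank method), the function names and the sample values are assumptions for illustration, since the cutoff used in the tables is not stated here. The IHS is the standard function asinh(x) = ln(x + sqrt(x^2 + 1)), which is defined at zero and behaves like ln(2x) for large x.

```python
import math


def ihs(value: float) -> float:
    """Inverse hyperbolic sine: ln(x + sqrt(x^2 + 1)); defined at zero."""
    return math.asinh(value)


def winsorize_upper(values, upper_percent=90):
    """Cap values above the given percentile (nearest-rank method).

    The percentile is an illustrative assumption, not the paper's cutoff.
    """
    ordered = sorted(values)
    rank = max(1, math.ceil(upper_percent * len(ordered) / 100))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# Hypothetical tax payments in PGK: many zeros and one very large payer.
tax_paid = [0.0, 0.0, 0.0, 120.0, 850.0, 2400.0, 3900.0, 7200.0,
            15000.0, 950000.0]

capped = winsorize_upper(tax_paid, upper_percent=90)
transformed = [ihs(v) for v in tax_paid]

print("largest value after winsorizing:", max(capped))
print("IHS of zero:", transformed[0])
print("IHS of largest value:", round(transformed[-1], 3))
```

Both transformations limit the influence of a few very large payments on the estimated treatment effect. The IHS does so while keeping the zero payments that a logarithm would drop, which matters in a sample where many firms pay no tax.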