Policy Research Working Paper 10901

Evaluation of Door-to-Door Tax Enforcement Strategy in Indonesia

Paulo Antonacci
Muhammad Khudadad Chattha

Governance Global Practice
September 2024

Abstract

This paper presents an evaluation of a tax enforcement program conducted in Indonesia where officials from the tax authority visited properties to engage directly with owners about their property tax obligations. Through these visits, auditors explained outstanding debts and payment processes, aiming to improve tax compliance and revenue collection. The paper uses an administrative data set and a new set of machine learning–based techniques to assess the program’s effectiveness. The program was responsible for increasing tax compliance on the extensive margin by 4.3 percent and on the intensive margin by 5.1 percent in the first year it was implemented. These effects are particularly strong as they persist in the following period. The findings show that the visited properties had better compliance history, lower value, smaller area, and were more likely to have some construction on them. A key finding from the analysis is that higher-value properties are less sensitive to the visits. In other words, if a data-driven tax-enforcement strategy is to be applied, then it may focus resources on enforcing taxation at the poorest part of the population in this case. This opens up the discussion of the distributional consequences of an algorithm-based enforcement strategy, which is increasingly important as machine learning techniques are used by tax authorities.

This paper is a product of the Governance Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp.
The authors may be contacted at paulo.antonacci@duke.edu and mchattha@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Evaluation of Door-to-Door Tax Enforcement Strategy in Indonesia*

Paulo Antonacci†
Muhammad Khudadad Chattha‡

JEL: H20, C54, C99, D04

*We thank Peter Arcidiacono, Christopher Alexander Hoy, Viet Anh Nguyen and Naranggi Pramudya Soko for their valued comments. In memoriam, we thank Raka Rizky Fadilla for his exceptional research assistance, work ethic and kindness. We are grateful for the cooperation and assistance of the local government representatives in Gorontalo. We thank Alma Kanani, Habib Rab, Rama Krishnan Venkateswaran, Jurgen Blum and Daniel Ortega for their guidance. The work was financed by the governments of Canada, Switzerland, and the European Union, through the World Bank’s Public Financial Management Multi-Donor Trust Fund for Indonesia. All potential errors are our own.
†Duke University; paulo.antonacci@duke.edu
‡World Bank; mchattha@worldbank.org

1. Introduction

Increasing tax collection in developing countries is challenging. The core issues encompass a broad spectrum, including inefficient tax systems, high levels of informality, and constrained tax bases, which lead to high levels of tax evasion and avoidance.
In this paper, we present an evaluation of a tax enforcement program conducted in the City of Gorontalo, Indonesia, where the tax authorities sent officers to visit properties and engage directly with owners about their property tax obligations. By conducting visits to discuss tax liabilities and explaining the payment process, the program aimed to enhance tax compliance and augment revenue.

We had access to an administrative dataset encompassing all properties in the City of Gorontalo from 2014 until 2022. The data includes property characteristics, historical compliance, and some information on the ownership. This allowed us to use machine learning techniques to assess the program’s effectiveness and investigate heterogeneous effects. These techniques assess the program’s effectiveness by contrasting the outcomes of visited households with those of comparable households that were not visited.

We are mainly interested in the program’s capacity to induce compliance in the absence of immediate penalties. More explicitly, our goals are to answer the following questions: (1) Are visited houses more likely to pay property taxes? (2) Does this effect persist over time? (3) What are the characteristics associated with higher/lower responsiveness to the visits?

We find a notable positive impact of the visits on tax compliance, underscoring the potential of direct engagement strategies in improving tax collection efforts. Specifically, these interventions increased tax compliance on the extensive margin by 3.5 percentage points and on the intensive margin by 3.4 percentage points. Relative to the baseline for the visited properties, the program was responsible for an increase of 4.3% on the extensive margin and 5.1% on the intensive margin.

Visited properties differ considerably from their counterparts; among many factors, they displayed much better historical tax compliance. The methods we use control for the unbalanced selection of visited households.
We found that the visited properties were much less sensitive to the program, i.e., the properties that received the visits were less likely to increase compliance than the properties that were not visited. Therefore, there is potential for program improvement through better targeting of the visits. We estimate that if the intervention had been exclusively focused on the untreated units, the impact of the program would have been 4.7 and 3.8 percentage points for the extensive and intensive margins, respectively. Among the unvisited properties, the baseline compliance was 66.9% and 55%. Therefore, had these units been visited, we would expect the program to have raised compliance by 7.0% and 6.9% relative to these baselines in 2021.

The program results are strengthened by the persistent effects we find. The absolute effects of the program increased in 2022. We find an increase of 6.1 percentage points in the intensive/extensive margin.1 Out of the baseline among the visited properties (76.3%), the program was responsible for increasing the overall compliance by 8.6% in 2022. We do not have a definitive explanation for the increasing results. However, we strongly suspect the effect of COVID-19 policies in 2021, and we speculate that the program results for 2021 would have been even greater in the absence of the pandemic.

Our methodological approach offers a significant advantage by accommodating heterogeneous effects, enabling us to pinpoint the characteristics of properties that exhibit the best and worst responses to the intervention. We discuss heterogeneous effects on several dimensions. A key finding from our analysis is that the higher the value of the tax object, the lower the effect of the program. That opens up a discussion on the distributional consequences of a fully data-driven enforcement strategy. Such a strategy would redirect efforts toward law enforcement in low-value properties.
While the optimal strategy for enforcement is still an open question, our results make a case for the need for interpretable algorithms in using machine learning methods for policy.

1 Due to changes in payment rules, the extensive and intensive margins for 2021 are the same margin.

This research is connected to a vast literature on tax compliance and enforcement strategies. However, to the best of our knowledge, this is the first such work using tax data from developing countries. Also, unlike most previous studies,2 the tax authorities have not imposed any extra penalty on the visited properties. Together, these two characteristics represent a setting where auditing programs were less likely to work and make a stronger case for the adoption of such programs in a developing country setting.

Our results also open a discussion on the importance of considering enforcement strategies while designing tax law. The distributional impacts of a tax system are frequently addressed through tax policy design separately from tax enforcement strategies. The presence of heterogeneous effects in enforcement strategies illustrates how a progressively designed tax system in which mainly the poorest are induced to comply may yield a regressive outcome. We are investigating these issues more deeply in another working paper, "Policy Learning in Taxation".

This paper is structured to provide a comprehensive exploration of the evaluation of a tax audit program in Gorontalo, Indonesia, from theoretical models to empirical findings. Section 2 is a literature review, focusing on two pivotal areas: the economics of tax compliance, where we examine the theoretical underpinnings and empirical evidence on audit-like strategies and their effectiveness, and Machine Learning for causal inference with heterogeneous treatment effects.
Section 3 offers a background description of revenue collection in Indonesia and motivates the importance of fostering tax compliance, followed by a description of the context within which the tax audit experiment was conducted in Gorontalo. Section 4 presents the empirical results of the study. Section 5 presents the conclusions.

2 The exception is Hebous et al., 2023 for Norway.

2. Literature Review

This paper intersects with two distinct bodies of scholarly work. First, it contributes to the field of public finance, more specifically to the economics of tax compliance. Second, it explores the adoption of recent machine learning techniques to examine heterogeneous effects. The following literature review is divided into two parts, each summarizing key contributions within these areas.

2.1. Economics of Tax Compliance

There are three main channels through which our results can be rationalized: (i) Pecuniary Incentives; (ii) "Tax Morale"; (iii) Information Friction.

The classic economic models of tax evasion analyze the pecuniary incentives for evasion. The models are modified versions of Becker’s (1968) framework for dealing with crime and punishment. Early research rooted in economic theory posited that the deterrent effect of audits hinges on the perceived probability of being audited and the ensuing penalties if caught (Allingham and Sandmo, 1972). This perspective views the taxpayer’s decision as a gamble, weighing potential savings against the risks of evading. Slemrod and Yitzhaki (1987) extended this framework, suggesting that taxpayers adjust their compliance levels based on changing perceptions of audit risks and penalties. Moreover, Yitzhaki (1987) explored the relationship between the penalty rate and tax evasion, suggesting that it is the combined effect of both audit probability and penalties that molds taxpayer behavior.

The visiting program we study is not technically an auditing program, as its main goal is not to identify tax evasion.
However, within the framework of tax compliance through pecuniary incentives mentioned in the paragraph above, the visitation program works similarly to auditing in rationalizing the results. The visits provide a personal interaction between the taxpayers and the tax authorities. The taxpayers perceive it as an increase in the authorities’ effort to recover their money; therefore, the relative return of avoidance decreases.

The problem with advocating for individual-specific interventions such as auditing or door-to-door visitations is that those are costly programs. There has been some disagreement in the literature regarding such interventions. On one hand, Slemrod and Yitzhaki (2002) argue that due to the potentially significant costs of auditing, the optimization of audit numbers and their effective distribution emerge as a pivotal policy dilemma.

On the other hand, many studies have found high returns to investing in tax auditing. Tax authorities’ official statements seem to corroborate this understanding. The Congressional Budget Office (CBO, 2020) estimated that modest investments in the IRS would generate somewhere between $60 billion and $100 billion in additional revenue over a decade. Sarin and Summers (2020) suggest that the methodology used seriously underestimates the potential revenue. Moreover, Boning et al. (2023) find that the direct revenues collected during an audit exceed the costs of the audit by a factor of more than 2:1. The returns to audits of high-income taxpayers are substantially greater than the returns to audits of low-income taxpayers.

If sticky behavioral responses are considered, the return of auditing is much higher. H. Kleven et al. (2011) conducted a seminal study in Denmark, which showed that audited taxpayers increased their reported liabilities by 55% of the audit adjustment in the subsequent year. This suggests that audits not only recover immediate lost revenue, but can also have a positive influence on future compliance.
Advani et al. (2023) use microdata from the UK to find a persistent impact of audits on increasing reported tax liabilities for five years after the audit. Moreover, effects are longer-lasting for more stable sources of income, and only individuals found to have made errors respond to audits. Similarly to our setting, Hebous et al. (2023) show that dynamic effects seem to be present despite the absence of penalties. For those found non-compliant in the audits, compliance improved for up to six post-audit years. The main caveat is that this study took place in Norway.

It is imperative to note that the impact of audits can be contingent on the specific socio-economic and cultural context. On the one hand, there are questions on the external validity of such results. On the other hand, if there is one country in which we would expect this reaction, it is Norway. Asatryan and Peichl (2017) underscored that in developing countries, where formal institutions might be weaker, the effect of audits could be different from that in developed economies. This is precisely the context we explore in our analysis of Indonesian data.

The second class of mechanisms behind the increase in tax compliance is often referred to as "tax morale". It represents a more recent body of literature that has investigated the role of nonpecuniary motives more broadly (Kirchler, 2007; Luttmer and Singhal, 2014; Besley et al., 2023). In that sense, the visitation program would increase the intrinsic value of being compliant.

The last class of mechanisms explains compliance behavior through "information frictions" (Bhargava and Manoli, 2015; H. J. Kleven and Kopczuk, 2011; Cox et al., 2020). Here, evasion could originate from taxpayers who are not aware of their obligation or who do not know how to be compliant. The visits would increase compliance to the extent that tax authorities inform the taxpayers how to fulfill their obligations (Kirchler, 2007; Slemrod et al., 2001).
Recently, De Neve et al. (2021) explored this hypothesis in an experiment in Belgium using quantitative methods similar to those used in this paper.

2.2. Use of Machine Learning to Investigate Heterogeneous Effects

The advent of Machine Learning (ML) techniques signifies a pivotal advancement in the toolkit available to applied researchers across various fields. It encompasses a broad spectrum of statistical learning methods such as Random Forest, Neural Networks, and Penalized Regression, which have revolutionized prediction and pattern recognition, especially in high-dimensional settings (Athey and Imbens, 2019; Hastie et al., 2009; Bishop, 2006).

ML methods are well suited to observational studies, particularly in scenarios where significant sample imbalances exist between treated and non-treated groups. Traditional approaches in this domain have largely relied on propensity score matching. However, this method can be inefficient and may even exacerbate sample imbalances, resulting in model dependency and bias. To address these shortcomings, more efficient "pruning" strategies have been developed, including Coarsened Exact Matching (CEM) (Stefano M. Iacus and Porro, 2011), k-nearest neighbors (knn) matching, and Matching After Learning to Stretch (MALTS) (Parikh et al., 2022), offering more robust solutions to the challenges of observational study designs.

Estimations derived from matching strategies that employ pruning are vulnerable to extrapolation bias. In response, researchers have turned to popular interpolation methods like synthetic controls to examine causal effects using disaggregated data, though these methods can introduce interpolation bias. To address these limitations, recent methodologies propose a hybrid approach, combining elements of both matching and interpolation to achieve more efficient estimators, as suggested by Abadie and Kellog.
This innovative strategy aims to mitigate the biases inherent in each method individually, offering a more balanced and accurate estimation of causal effects.

The lasso method proposed in Belloni et al. (2013) and Belloni et al. (2017) deals with variable selection within high-dimensional datasets, efficiently identifying key predictors by applying penalties to coefficients, thus enhancing treatment effect estimations. This method outperforms traditional approaches like propensity score matching, which often falls short in complex data scenarios. Chernozhukov et al. (2018) further this exploration with Double Machine Learning (DML), a method that divides the estimation into two phases: prediction and causal estimation. First, machine learning models predict the outcome and the treatment from the covariates; the residuals of these predictions are then used for causal estimation. Both strategies have proven useful for causal inference in the presence of imbalanced samples.

Metalearners for causal inference are frameworks that use ML models to estimate the Conditional Average Treatment Effect (CATE) or the Average Treatment Effect (ATE) within various subpopulations in a dataset. They streamline the causal inference process by breaking it down into simpler components that ML algorithms can tackle more effectively through their prediction capabilities.

The Single-Model Learner (S-Learner) applies a single ML model to assess treatment effects, treating the treatment indicator as an additional input feature (Hill, 2011; Foster et al., 2011). On the other hand, the Two-Model Learner (T-Learner) trains two distinct ML models: one for individuals in the treatment group and another for those in the control group. The difference between the predictions of these two models serves as the estimate of the treatment effect, as discussed by Athey and Imbens (2016) and further elaborated by Lu et al. (2018) and Powers et al. (2018).
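The distinction between the S-Learner and the T-Learner can be illustrated with a minimal sketch. It is not the estimation used in this paper: the data are synthetic and noise-free, every name and number is hypothetical, and ordinary least squares stands in for the ML base learner. Because the base learner is linear, the S-Learner is given an interaction between the covariate and the treatment indicator so that it can represent an effect that varies with the covariate.

```python
# Minimal S-learner and T-learner sketch on synthetic, noise-free data.
# All names and numbers are hypothetical; OLS stands in for an ML base learner.

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[pivot] = A[pivot], A[c]
        pivot_value = A[c][c]
        A[c] = [v / pivot_value for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def predict(beta, features):
    return sum(b * f for b, f in zip(beta, features))


# Synthetic sample: x is a covariate, z the treatment indicator.
# Assumed outcome: y = 2 + x + z * (1 - 0.5 * x), so the true CATE is 1 - 0.5 * x.
xs = [i / 10 for i in range(40)]
zs = [1 if i % 3 == 0 else 0 for i in range(40)]
ys = [2 + x + z * (1 - 0.5 * x) for x, z in zip(xs, zs)]

# S-learner: a single model with the treatment indicator among the features.
beta_s = ols([[1, x, z, x * z] for x, z in zip(xs, zs)], ys)


def cate_s(x):
    # Predict with z switched on and off, then take the difference.
    return predict(beta_s, [1, x, 1, x]) - predict(beta_s, [1, x, 0, 0])


# T-learner: one model fitted on treated units, another on control units.
beta_1 = ols([[1, x] for x, z in zip(xs, zs) if z == 1],
             [y for y, z in zip(ys, zs) if z == 1])
beta_0 = ols([[1, x] for x, z in zip(xs, zs) if z == 0],
             [y for y, z in zip(ys, zs) if z == 0])


def cate_t(x):
    return predict(beta_1, [1, x]) - predict(beta_0, [1, x])


for x in (0.0, 1.0, 3.0):
    print(x, round(cate_s(x), 6), round(cate_t(x), 6), 1 - 0.5 * x)
```

In this idealized setting both learners recover the assumed effect up to floating-point error. With flexible base learners and noisy, imbalanced data the two can diverge, which is part of the motivation for the refinements discussed next.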
The X-Learner is an extension of the T-learner. It works in three steps. First, it fits separate outcome models for the treated and control groups, as in the T-learner. Second, it calculates "imputed treatment effects" for each group, aiming to identify the potential outcomes for untreated units had they been treated, and vice versa. Third, these imputed effects are used as outcomes in separate regression models, employing the chosen ML model and using covariates as predictors, as detailed by Künzel et al. (2019).

The R-learner, introduced by Nie and Wager (2020), uses a two-stage meta-algorithm that builds upon Robinson’s (1988) decomposition of the outcome, hence the name R-Learner. This approach distinctively incorporates the Conditional Average Treatment Effect (CATE) directly into the outcome regression model, facilitating a nuanced analysis of treatment effects across different covariate levels.

This approach is characterized by its structured process: predicting nuisance functions in the first phase and estimating CATE in the second by applying a loss function specifically tailored to it. It adeptly addresses selection bias by adjusting for propensity scores, and is particularly effective in scenarios with extreme propensity scores where treatment assignment is nearly deterministic. By employing this model, the R-learner corrects biases in treatment effect estimation, accommodating variations in treatment assignment among individuals through a unique parameter tuning mechanism. This is conducted using base learners like random forests or gradient boosted trees, tailored not to the raw outcomes but to a model that compensates for covariate effects and propensity score balancing. This method significantly advances the estimation of CATE in observational studies, offering a precise and dependable approach to causal analysis.

Recent literature has developed some methods to investigate the presence of heterogeneous treatment effects.
Chernozhukov et al. (2023) propose a comprehensive method for machine learning inference on heterogeneous treatment effects, incorporating the Best Linear Predictor (BLP), Sorted Group Average Treatment Effects (GATES), and Classification Analysis (CLAN). This method effectively identifies and analyzes variations in treatment effects and their associated covariates. On the other hand, Semenova and Chernozhukov (2020) focus on model-based strategies for "partial" CATE inference, using machine learning to analyze outcomes relative to a specific, lower-dimensional set of covariates. This technique facilitates nonparametric inference on partial CATE, revealing treatment effects within selected covariate groups, though it requires arbitrary selection of baseline covariates. Together, these approaches enhance our understanding of treatment effects, addressing both broad heterogeneity and effects within specific covariate groups.

Our chosen approach, the Clustered Random Causal Forest (C-RCF) (Athey and Wager, 2019; Wager and Athey, 2018), aligns with the R-Learner framework by employing honest trees to compute R-Learners. We chose this method for several reasons. First, it has demonstrated strong performance in recovering the Conditional Average Treatment Effect (CATE) compared to other methods (Caron et al., 2022). In that sense, Causal BART as in Hahn et al. (2020) may have an even better performance in simulation studies and was a strong candidate; however, the causal forest allows us to proceed with inference techniques on the partial CATE, which Causal BART does not.3 Lastly, the design of the RCF algorithm naturally extends to the domain of policy learning in observational studies (Athey and Wager, 2021), facilitating the application of various policy targeting rules.4 After the estimation we proceed with the analysis proposed by Chernozhukov et al. (2023) and Semenova and Chernozhukov (2020).

3 Causal Bayesian additive regression trees (C-BART) is not orthogonal due to the errors in nuisance components. Other inference techniques may be available within the Bayesian framework to analyze the partial CATE. However, we would rather use techniques within the frequentist framework. A possible extension of this paper is to add results from C-BART and other methods as a robustness check.
4 Notice that RCF is asymptotically equivalent to propensity score matching in recovering some structural parameters such as ATE.

3. Background

3.1. Revenue Mobilization in Indonesia: Low Revenue Limits Public Expenditure

Indonesia has achieved remarkable economic stability, growth, and poverty reduction over the last two decades. With an annual growth rate of 5.3% from 2000 to 2018, surpassing the 4.9% average of lower-middle-income countries, Indonesia has seen significant economic progress. This period was characterized by a reduction in growth volatility and the creation of over 30 million service and industrial jobs, which replaced less productive agricultural roles and increased household incomes. Consequently, poverty rates dropped dramatically from 19.1% in 2000 to 9.4% in 2019, and Gross National Income (GNI) per capita increased six-fold, reaching US$3,840, moving millions into the middle class.

Prudent fiscal management has been key to this success, particularly since the State Finance Law of 2003 mandated fiscal deficit and public debt limits, which have been well maintained. The public debt-to-GDP ratio declined sharply from 83 percent in 2000 to 30 percent of GDP in 2018, thereby earning Indonesia an investment-grade rating from the four major credit agencies. Fiscal policies have also contributed to poverty and inequality reduction, albeit modestly compared to peers.

Indonesia’s expenditure-to-GDP ratio (9.3% in 2018) is below the emerging economy average and restricts the expansion of public services. Figure 1 displays a cross-country comparison of Indonesia’s general revenue and government expenditure.
In both cases Indonesia is below the average of its peers.

According to the World Bank (2020), "The main reason for the low level of spending is the structurally low level of revenue collections. Indonesia’s revenue-to-GDP ratio is low at 14.6 percent in 2018, compared with the emerging economy average of 27.8 percent." Revenue mobilization relies heavily on tax revenues, which have historically accounted for over 75% of the government’s total revenues. However, the country has experienced a decline in tax performance over the last two decades, evidenced by a significant drop in the tax-to-GDP ratio from 13.3% in 2008 to 9.1% in 2021. This decrease, even after a recovery post-pandemic, indicates a persistently low tax-to-GDP ratio compared to peers.5

The World Bank identifies the compliance gap as the primary issue for the decline of revenue mobilization performance. It accounts for about 85% of the total revenue gap, or roughly 6.1% of GDP. The compliance gap arises from taxpayer non-compliance, including evasion, fraud, and avoidance, rather than policy decisions like rate reductions or exemptions. For instance, Indonesia’s audit performance is weaker than that of its peers, with fewer audits per capita between 2018 and 2020 and less tax liability generated from these audits when compared to peer countries. The system’s vulnerability to evasion and avoidance is exacerbated by an ineffective risk-based audit system, which has shown diminishing returns over time.

Improving audit performance, particularly in risk-based audits, is crucial for Indonesia to bolster its revenue mobilization. Following the deterrence model by Allingham and Sandmo (1972), enhancing the probability of detection through more effective special and risk-based audits is key to inducing compliance and addressing the country’s revenue challenges.
5 The figures presented in the last two paragraphs are from the central government revenue authority, which is different from subnational taxation, which is what we focus on in this paper. However, they reflect Indonesia’s structural problems with revenue mobilization that also occur at the local level.

3.2. Property Taxes in Kota Gorontalo

In this paper, we analyze property tax compliance in Kota Gorontalo (KG),6 in the state of Gorontalo, Indonesia. The city covers an area of 79.59 km², around 2.26 times larger than Duke University’s campus, with an estimated population of 201,350 in 2022. Figure 3 is a map showing where the experiment took place.

Property tax revenues in KG have risen notably over recent years due to increases in house prices and in the number of registered tax-paying properties. Average house prices increased 104% and the number of registered properties increased 7% between 2014 and 2022. Despite the increase in tax revenues, there has been an increase in non-compliance. The debt associated with properties that have not cleared their dues increased 225%. This can be traced back to both an escalation in the property values that remain unpaid (88%) and a surge in the actual number of properties not meeting their tax obligations (76%). Figure 5 illustrates the evolution of the Tax Bill, Tax Revenue, and Tax Debt between 2014 and 2022.

We observed that partial tax payments increased exceptionally in 2021. We speculate7 that these partial payments arose from the possibility of paying taxes in installments. Later, due to the pandemic, some taxpayers opted to suspend their property tax payments. For this reason, it is important to consider both the intensive and extensive margins in our analysis. In 2021, the largest share of properties was partially compliant8 (48.9%), while 28.3% were non-compliant and 22.8% were fully compliant.
Moreover, 58% of the total property tax bill was paid: 30% by fully compliant properties and 28% by partially compliant properties. Of the outstanding debt (42% of the tax bill), 46.7% comes from properties that made no payment and 53.2% comes from properties that made some payment (Figures 7 and 8).

6 Kota Gorontalo means Gorontalo City. The city is the capital of the homonymous state of Gorontalo.
7 Inquiries regarding this matter have been directed to the tax authorities in Kota Gorontalo, with response pending.
8 We define a property as partially compliant in a given year if its property tax payment was positive but lower than the total tax bill.

3.3. The Door-to-Door Intervention

To tackle the growing evasion, the local tax authority created the door-to-door tax enforcement campaign in Kota Gorontalo. It consisted of sending tax officials to properties to interact directly with property owners regarding their tax responsibilities.

The program spanned 14 months, from November 2020 to December 2021. Visits peaked in January 2021 with 8,776 monthly visits, but they decreased significantly from March 2021 due to COVID-19, resuming at a slower pace from August to December 2021, when the program was terminated (Figure 6c). Had the visitation rate been maintained, all properties would have been visited before the end of 2021. By the end of the program in December 2021, 31,730 of the 52,941 registered properties had received visits, 59.9% of the total.

The tax officials were equipped with geolocated tablets. Once a visit is entered into the tablet, its location is recorded along with the name of the citizen who answers the door and the name of the person responsible for the property. The visit finishes after the tax official talks to the person responsible for the property or if no one is found at the house. The official then starts another visit at the next-door neighbor. No house was visited twice.
The geolocation allows us to attest that the visits took place, as well as to check visitation patterns. Our initial analysis shows that the geolocation records are consistent with the registered addresses in our administrative dataset. In Kota Gorontalo, there are 50 neighborhoods. Of the 31,730 visitation records, no geolocation record was located more than 3 km away from Kota Gorontalo. Moreover, only 18 records were located more than 500 meters from the neighborhood in which they were supposed to be.

The high level of autonomy of tax officials in deciding where to visit first led to substantial selection in which properties were visited. For example, although the visits covered almost 60% of the registered properties, these properties accounted for only 46.2% of the total tax bill and 50.4% of the total taxes paid in 2021.

3.4. Selection on Treatment

Our analysis highlights a pronounced selection. First, we look at the difference between the visited and non-visited groups across neighborhoods. Figure 6a contains a bivariate map that shows the distribution of average compliance and the proportion of visited properties in each neighborhood. The shares of visited properties in a neighborhood range from 34% to 95%, while historical compliance ranges from 63% to 98%.

The neighborhoods more frequently visited were the ones with higher historical compliance; these are the neighborhoods in dark blue in Figure 6a. Conversely, the neighborhoods in grey were less likely to have paid the property taxes and also less likely to be visited. Figure 6b illustrates this correlation. Also, notice that the number of properties in each neighborhood, represented by the size of each dot, is roughly the same across all the neighborhoods.

The baseline comparison in Table 2 reveals notable differences between the treated and control groups.9 The mean test indicates that all variables except for the logarithm of taxable value have statistically different means.
However, the composition of this variable varies widely between treated and non-treated properties. The properties in each group differ in their main characteristics: size, price and land use. On average, visited properties had lower total values, smaller land areas, and lower land prices. However, they also had much higher construction value, building area, and construction value per square meter, and were more likely to have construction on them.

The properties also differ in the characteristics of their owners. The owners of properties in the treatment group are more likely to live in the visited property and less likely to own multiple properties. On average, the value of their portfolio in Kota Gorontalo is lower.

Treated and non-treated properties also differ vastly in historical compliance. Visited properties had better historical compliance: they were more likely to have fulfilled their tax obligations in 2020, less likely to have made payments after the deadline, and less likely to have received a fine beforehand.

The distinctions above highlight the complex dynamics influencing property tax compliance. Despite all the differences between the treatment and control groups, given the observable characteristics in our dataset, no unit was deterministically excluded from or targeted by the program.10 In other words, no covariate has an effect strong enough to make the probability distribution of treatment assignment degenerate. Therefore, we can vouch for the existence of overlapping support in treatment for all observations in our dataset. We discuss this further in our model identification assumptions.

9 We use the terms treated/control as equivalent to visited/non-visited.

4. Empirical Results

This section is composed of two parts. First, we discuss the estimation strategies suitable for analyzing the impact of a specific intervention, focusing on the detection of heterogeneous treatment effects.
We begin by delineating the array of estimation techniques at our disposal, ultimately arguing in favor of the Clustered Random Forest (CRF) method. The preference for CRF is due to its ability to support provably valid statistical inference and the robustness of its results to the possible presence of irrelevant covariates.

The second part of this section discusses the results of our empirical estimation. We present empirical evidence demonstrating the positive impacts of the tax compliance program in both 2021 and 2022. The findings underscore significant variability in outcomes, with the Average Treatment effect on the Untreated (ATU) surpassing the Average Treatment effect on the Treated (ATT). Our analysis investigates the program and highlights the characteristics that correlate with a higher program effect.

10 Figure 9 shows the distribution of the propensity score estimated through boosted regression trees. This is the propensity score estimation method we use as input for the causal forest. The lowest probability of receiving treatment was 14% and the highest was 91%.

4.1. Model

We begin this section with a high-level overview of the C-RCF model. Our approach is guided by the Neyman-Rubin Causal Model, as described in Rubin, 1978 and further elaborated in Imbens and Rubin, 2015.

Every property is indexed by $i$. We consider a binary variable for treatment assignment, $Z_i$, which can take values in the set $\{0, 1\}$. Here, $Z_i = 1$ signifies that unit $i$ was visited, and $Z_i = 0$ means otherwise.

Potential outcomes for each unit are denoted $(Y_i(0), Y_i(1))$. The term $Y_i(1)$ represents the outcome for unit $i$ when it is treated, which in our study means that the property was ever visited. Conversely, $Y_i(0)$ represents the outcome without treatment exposure. The outcome variable is continuous, meaning $(Y_i(0), Y_i(1)) \in \mathbb{R}^2$.
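For reference, the estimands reported in the results section can be written in this notation as follows. This is the standard potential-outcomes formulation (Imbens and Rubin, 2015), given here as a sketch; $X_i$ denotes the covariate vector, and the propensity score symbol $e(x)$ is notation assumed for illustration rather than taken from the text.

```latex
\begin{align*}
\tau(x)     &= E[\,Y_i(1) - Y_i(0) \mid X_i = x\,] && \text{(CATE)} \\
\text{ATE}  &= E[\,Y_i(1) - Y_i(0)\,] \\
\text{ATT}  &= E[\,Y_i(1) - Y_i(0) \mid Z_i = 1\,] \\
\text{ATU}  &= E[\,Y_i(1) - Y_i(0) \mid Z_i = 0\,] \\
e(x)        &= P(Z_i = 1 \mid X_i = x), \qquad 0 < e(x) < 1 && \text{(overlap)}
\end{align*}
```

The overlap condition in the last line is the formal counterpart of the statement that no unit was deterministically excluded from or targeted by the program.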
The vector $X_i$ contains the covariates, among them the categorical variable $u_i$ that indicates the neighborhood where the property is located. We proceed with a cluster-robust estimation procedure.11

We start by proposing a partially linear model, assuming that a unit's baseline outcome can be described by an unknown (potentially complex) function $f$, with treatment assignment introducing a constant shift in the outcome by an amount $\tau(X_i)$:

$$Y_i = \tau(X_i) Z_i + f(X_i) + \varepsilon_i, \qquad E[\varepsilon_i \mid X_i, Z_i] = 0. \tag{1}$$

Our analysis considers four outcome variables: (i) Tax Participation, $Y^{Pa}$, a dummy that takes the value 1 if there was any positive payment; (ii) Compliance, $Y^{Co}$, the ratio of paid to payable; (iii) log Value Paid, $Y^{lnP}$; and (iv) Value Paid, $Y^{VP}$. We do not find any statistically significant result for the outcome $Y^{VP}$. We keep this variable in the tables in case the reader wants to check it, but we do not comment on it in this section.

11 For a more detailed explanation, see Athey and Wager, 2019, section 1.2.

4.2. Estimation Results

Result 1 The effect of the door-to-door intervention is positive.

Result 2 There was negative selection into treatment: the units with the highest potential effects were not visited.

Table 4 presents the point estimates. For all outcomes except Value Paid, the ATE and ATU are statistically different from zero. The positive Average Treatment Effect (ATE) and Average Treatment Effect on the Treated (ATT) indicate beneficial outcomes from the treatment. In 2021, we observe the pattern ATU > ATE > ATT, which suggests the presence of heterogeneous treatment effects and negative selection into the visits: the visited units show, on average, a lower treatment effect than those not visited.
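As a rough consistency check on these estimands, the ATE should lie close to the group-size-weighted average of the ATT and the ATU. The sketch below applies this identity to the 2021 point estimates in Table 4, using the group sizes reported in Table 3. The function name is hypothetical, and the agreement is only approximate because the published estimates are rounded and each is estimated separately.

```python
def weighted_ate(att, atu, n_treated, n_control):
    """ATE implied by ATT and ATU: p * ATT + (1 - p) * ATU, with p the treated share."""
    p = n_treated / (n_treated + n_control)
    return p * att + (1 - p) * atu


# Group sizes from Table 3.
N_VISITED, N_NOT_VISITED = 31725, 21184

# 2021 point estimates from Table 4: outcome -> (ATT, ATU, reported ATE).
table4_2021 = {
    "participation": (0.035, 0.047, 0.040),
    "compliance": (0.034, 0.038, 0.035),
    "ln_value_paid": (0.377, 0.492, 0.422),
}

implied = {
    name: weighted_ate(att, atu, N_VISITED, N_NOT_VISITED)
    for name, (att, atu, _) in table4_2021.items()
}

for name, (_, _, reported) in table4_2021.items():
    print(f"{name}: implied ATE = {implied[name]:.4f}, reported ATE = {reported}")
```

Because the visited group is the larger one, the ATE sits closer to the ATT than to the ATU, which is the pattern visible in Table 4.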
Tax participation increased by 4 percentage points in 2021, with the effect varying by treatment status: a 3.5 percentage point increase was observed in the units directly visited by tax officials, whereas non-visited units are estimated to have had a boost of 4.7 percentage points had they been visited. Notice here that the variance of our estimated CATE is much higher among visited properties. Considering the baseline results (Table 3), the program was responsible for an increase of between 0.2% and 8.1% among the visited units.12 However, the results would be significantly higher if the visits had taken place in the houses that were not visited. We estimate that the average participation for the non-visited group, had it been treated, would be 63.9%, and that between 3.8% and 9.3% of it could be attributed to the intervention. If the program were fully implemented, it would be responsible for between 1.6% and 8.7% of total tax participation.

12 We use 95% confidence intervals.

Compliance increased overall by 3.5 percentage points. Among visited properties it increased by 3.4 percentage points, and we estimate that it would have increased by 3.8 percentage points in the non-visited properties. In comparison to the baseline values (Table 3), the program had the potential to increase overall compliance by between 5.1% and 5.8%; it increased the compliance of the visited properties by between 1.2% and 8.6% and could have increased the average compliance among the control group to 58.9%, where we expect the program to be responsible for between 3.7% and 9.1% of total compliance.

The impact on the logarithm of total payments was 0.422, with visited units experiencing an effect of 0.377 and non-visited units a higher estimated impact of 0.492. These outcomes suggest an average uplift in payment values ranging from 31.9% to 76.3%. Notice that this result is mainly due to the ATU. If we consider only the group of non-visited units, there are larger potential gains, between 46.9% and 82%.
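The ranges quoted above can be reproduced approximately from Table 4 and Table 3. The sketch below is a reconstruction under assumptions, not the authors' code: it assumes that the shares for the visited units divide a normal-approximation 95% interval for the ATT by the observed baseline mean, and that the payment uplift applies $e^{x} - 1$ to bounds expressed in log points. Under those assumptions the participation and compliance ranges line up with a 95% interval, whereas the quoted uplift ranges appear to line up with the point estimate plus or minus one standard error rather than 1.96 standard errors. All function names are hypothetical.

```python
import math


def interval(estimate, se, z=1.96):
    """Symmetric normal-approximation interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se


def share_of_baseline(estimate, se, baseline, z=1.96):
    """Interval bounds expressed as a percentage of a baseline mean."""
    lo, hi = interval(estimate, se, z)
    return 100 * lo / baseline, 100 * hi / baseline


def log_points_to_percent(estimate, se, z):
    """Convert interval bounds in log points to percent changes via exp(x) - 1."""
    lo, hi = interval(estimate, se, z)
    return 100 * (math.exp(lo) - 1), 100 * (math.exp(hi) - 1)


# ATT on participation, 2021 (Table 4), over the visited baseline of 83.9% (Table 3).
participation_att = share_of_baseline(0.035, 0.017, 0.839)
# ATT on compliance, 2021 (Table 4), over the visited baseline of 0.69 (Table 3).
compliance_att = share_of_baseline(0.034, 0.013, 0.69)
# ATE and ATU on ln value paid, 2021 (Table 4), using +/- 1 standard error (assumption).
uplift_ate = log_points_to_percent(0.422, 0.145, z=1.0)
uplift_atu = log_points_to_percent(0.492, 0.107, z=1.0)

results = {
    "participation, visited (% of baseline)": participation_att,
    "compliance, visited (% of baseline)": compliance_att,
    "payment uplift, ATE (%)": uplift_ate,
    "payment uplift, ATU (%)": uplift_atu,
}
for label, (lo, hi) in results.items():
    print(f"{label}: {lo:.1f} to {hi:.1f}")
```

Passing `z=1.96` to `log_points_to_percent` instead gives a noticeably wider range for the uplift, which is worth keeping in mind when comparing it with the other intervals in this section.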
Due to the presence of outliers, we do not see any statistically significant effect when analyzing only the value paid.

Result 3 The effects of the door-to-door visits in 2022, compared to 2021: (i) are on average higher; (ii) have narrower confidence intervals; (iii) show a smaller difference between ATT and ATU.

First, notice that because partial payments virtually disappear from our dataset in 2022, $Y^{Co}$ converges to $Y^{Pa}$, and therefore $\tau^{Co}$ converges to $\tau^{Pa}$. Overall, the intervention increased Tax Participation/Compliance by 6.3 percentage points in 2022, that is, between 5.4% and 11.1% of overall compliance. These results are higher than what we found for 2021.

Result 3 is in line with the literature on the economics of tax compliance, which consistently finds time persistence in the effects of interventions. However, those effects typically lose strength as time passes. We do not have an explanation for the increase in the effect, but we speculate that it may be due to the economic effects of the COVID-19 pandemic, which lost strength in 2022. It is therefore also plausible that the results we find underestimate the potential gains in regular years.

The difference between ATT and ATU is smaller in 2022. However, when we compare the results to the aggregated outcomes, we see that the treated group had a much higher participation and compliance rate in 2021 as well as in 2022. The estimated ATT for Tax Participation/Compliance was 6.1 percentage points, which is 8.0% of the observed aggregated outcome, whereas the ATU was 6.4 percentage points, which represents 9.8% of the estimated Tax Participation/Compliance had the properties been visited.

4.3. Heterogeneous Effects

Result 4 There is statistically significant heterogeneity in the treatment effect. The heterogeneity seems to be concentrated in the control group.

Analyzing only average treatment effects can be deceiving, as averages may hide potentially high heterogeneity.
Figure 10 illustrates the dispersion of our CATE estimator, where red is the treatment group and blue is the control group. We start by checking whether our model is capable of capturing heterogeneity, using two different methods.

First, we compute the Average Treatment Effect (ATE) for both the upper and lower halves of the Conditional Average Treatment Effect (CATE) distribution using a doubly robust method. Subsequently, we assess the statistical significance of the mean difference between these groups. The results, presented in Table 5, indicate that for 2021, a significant difference between the high and low treatment effect groups is observed only for the outcome $\tau^{lnP}$. Analysis of the treated and untreated units reveals a notable difference between these groups, suggesting the presence of heterogeneous treatment effects. Furthermore, this disparity became more pronounced in 2022.

The second method to assess heterogeneity follows the suggestions of Athey and Wager, 2019 and Chernozhukov et al., 2023. This method models the Conditional Average Treatment Effect (CATE) as a linear function of out-of-bag estimates from a causal forest. Two synthetic predictors are generated: $C_i$, the average out-of-bag treatment effect estimate scaled by the deviation of the treatment indicator from the propensity score, and $D_i$, the deviation of each unit's out-of-bag estimate from that average, scaled in the same way. By regressing the residualized outcomes on these predictors, the coefficient on $D_i$ serves as an indicator of how well treatment heterogeneity is estimated. Table 6 displays the results, where the estimates for $C$ and $D$ appear in the rows named Mean and Differential. The closer the coefficient on Mean is to 1, the more confident we can be that the average forest prediction is correct. A coefficient of 1 on $D_i$ suggests accurate calibration of the heterogeneity estimates. Moreover, if this coefficient is significantly greater than 0, we reject the null of no heterogeneity.
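The calibration regression can be sketched as follows, assuming the definitions in Athey and Wager, 2019: $C_i = \bar{\tau}\,(Z_i - \hat{e}(X_i))$ and $D_i = (\hat{\tau}(X_i) - \bar{\tau})(Z_i - \hat{e}(X_i))$, with the residualized outcome $Y_i - \hat{m}(X_i)$ regressed on both without an intercept. The example uses simulated data and oracle nuisance functions purely for illustration; every name and the data-generating process are assumptions, and the sketch omits the out-of-bag prediction and cluster-robust standard errors that the analysis in the paper relies on.

```python
import random


def calibration_coefficients(y_resid, z_resid, tau_hat):
    """No-intercept OLS of residualized outcomes on the Mean (C) and Differential (D) predictors."""
    tau_bar = sum(tau_hat) / len(tau_hat)
    c = [tau_bar * zr for zr in z_resid]
    d = [(t - tau_bar) * zr for t, zr in zip(tau_hat, z_resid)]
    # Solve the 2x2 normal equations with Cramer's rule.
    s_cc = sum(ci * ci for ci in c)
    s_dd = sum(di * di for di in d)
    s_cd = sum(ci * di for ci, di in zip(c, d))
    s_cy = sum(ci * yi for ci, yi in zip(c, y_resid))
    s_dy = sum(di * yi for di, yi in zip(d, y_resid))
    det = s_cc * s_dd - s_cd * s_cd
    mean_coef = (s_cy * s_dd - s_dy * s_cd) / det
    diff_coef = (s_dy * s_cc - s_cy * s_cd) / det
    return mean_coef, diff_coef


# Simulated example (hypothetical data-generating process).
rng = random.Random(0)
n, e = 5000, 0.5                      # sample size and a constant propensity score
x = [rng.random() for _ in range(n)]
z = [1 if rng.random() < e else 0 for _ in range(n)]
tau = [1.0 + 2.0 * xi for xi in x]    # true CATE, heterogeneous in x
# Residualized outcome implied by the partially linear model:
# Y - m(X) = tau(X) * (Z - e(X)) + noise
y_resid = [t * (zi - e) + rng.gauss(0.0, 0.5) for t, zi in zip(tau, z)]
z_resid = [zi - e for zi in z]

# Feeding in the true CATE as the "forest prediction" should give both coefficients near 1.
mean_coef, diff_coef = calibration_coefficients(y_resid, z_resid, tau)
print(f"Mean: {mean_coef:.3f}, Differential: {diff_coef:.3f}")
```

In this idealized setting the predictions are perfectly calibrated by construction, so both coefficients are close to 1; with real forest predictions the Differential coefficient is the one that carries the evidence on heterogeneity.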
Even if Differential is far from one, a significant and positive coefficient is evidence of a useful association between the out-of-bag estimated CATE and the true CATE. Our estimation suggests that C-RCF correctly captures the average treatment effect and that there is heterogeneity in all the tested outcomes, except for the paid value before the log transformation, $Y^{VP}$.

4.3.1. Heterogeneous Effects

In this section, we explore how various property characteristics influence the program's effectiveness. The results, presented in Tables 7 to 14, stem from fitting a doubly robust linear model

$$E[Y(1) - Y(0) \mid X = X_i] = \tau(X_i) \sim \beta_0 + A_i \beta,$$

where $A_i$ represents the covariates onto which the Conditional Average Treatment Effect (CATE) is projected. More specifically, $A_i$ includes the following: land area; land price per m2; a dummy for construction on the land, interacted with the construction area and the construction price per m2; a vector of owner characteristics; a vector of the property's payment history; and a vector of neighborhood fixed effects.

The tables report estimates with neighborhood-level fixed effects in columns (1) to (7), while column (8) provides estimates akin to those in column (7) but excluding neighborhood fixed effects.

Result 5 The visits were more effective in properties with lower tax base values. The program is more effective in inducing compliance in properties with smaller area and lower value per square meter.

Tables 7 to 14 show that, for most of our outcomes of interest, the effectiveness of the program is negatively correlated with the value of the tax base. When decomposing these effects by the value of the property and the value of the construction, we see that the main drivers of this correlation are the values associated with the land. The visits are more effective in inducing compliance in properties with smaller area and lower value per square meter.
This is reinforced by the dummy "Dummy - Lower Tax Bracket - 2021", which indicates that a property's value is below 1,000,000 rupiah. Interestingly, for 2021, these results do not hold for the outcome that measures whether any payment was made or for the logarithm of the paid value. However, the results hold for all outcomes in 2022. Overall, the coefficient associated with land area has a much higher magnitude than the coefficient associated with land price.

These results suggest that the optimal strategy to induce compliance includes visiting properties with lower value. A visiting strategy that aims to induce compliance may therefore have serious regressive consequences.

Result 6 The visits were more effective in properties that were fined in the past. However, the effectiveness of the visits decreases with the value of the fine.

For all the relevant outcomes, the effect of the visits was higher in properties that received fines in previous years. Moreover, the effectiveness of the visits decreases with the value of the fines. One may argue that this is mechanically driven by the fact that higher-value properties receive higher fines. However, even after controlling linearly for the value of the property, the negative correlation persists. This holds for all previous years and all outcomes, and it is aligned with the literature on the persistent effects of auditing on tax compliance.

Result 7 Characteristics of the owner, the presence of a building, and the building's characteristics do not seem to correlate with the effects of the visits.

5. Conclusion

This paper evaluates a tax enforcement program conducted in Gorontalo, Indonesia, where tax officials visited properties to engage directly with owners about their tax obligations, aiming to improve compliance and revenue collection.
Initiated to address growing tax evasion, the door-to-door campaign ran from November 2020 to December 2021, peaking in January 2021 with 8,776 monthly visits before slowing due to COVID-19. By December 2021, 31,730 of 52,941 properties (59.9%) had been visited.

A strength of this paper is its connection to the literature on the effects of close intervention on tax compliance and its potential to generalize findings to other contexts. Many equivalent studies in developed countries demonstrate that direct contact with tax authorities, often through audits, increases compliance. However, the literature lacks examples from developing countries, where tax compliance and state capacity are generally lower. Moreover, we find a positive effect on tax compliance even in the absence of punishments for non-compliant properties. These results suggest that implementing such a policy in other settings could yield positive results.

While the specificity of property taxes might limit generalizability, this specificity is advantageous for statistical identification, as it gives us access to the actual tax bill of both treatment and control groups, enabling a comprehensive analysis using the entire dataset, not just the treated units.

We applied observational study analysis techniques to address the imbalance between treated and untreated units and conducted cluster-robust analysis to mitigate the impact of local treatment spillovers.

The baseline comparison in the study reveals notable differences between the treated (visited) and control (non-visited) groups. The mean test shows statistically significant differences in all variables except for the logarithm of taxable value, whose composition nonetheless varies greatly between the two groups. Treated properties, on average, had lower total values, smaller land areas, and lower land prices but higher construction value, building area, and construction value per square meter, and they were more likely to have constructions on them.
Owners of treated properties were more likely to reside in them and less likely to own multiple properties, with lower overall portfolio values. Additionally, treated properties exhibited better historical compliance, being more likely to have met tax obligations in 2020, less likely to have made late payments, and less likely to have received fines. These distinctions highlight the complex dynamics influencing property tax compliance.

Our findings reveal statistically significant evidence that Gorontalo's tax enforcement strategy through visitation notably increased compliance rates among property owners, with effects intensifying in the following year. The influence of the COVID-19 pandemic in 2021 might have played a significant role in these outcomes.

For 2021, if the program had been implemented in all the districts, we would expect an increase in tax participation of between 1.6% and 8.7% and an increase in overall compliance of between 5.1% and 5.8%. The visited properties experienced less pronounced effects than the control group would have, suggesting potential benefits from revising the targeting strategy of the visitations. Specifically, implementing the program for the control group could potentially increase tax participation by 7.0% and tax compliance by 6.4%. These effects not only persist in 2022 but also increase. We do not have an explanation as to why the effect would increase over time, but we find it reasonable to speculate that the results in 2021 are underestimated due to the pandemic.

We find strong evidence of heterogeneity in the program's effects, which is more pronounced among the control group. Moreover, visits were more effective in properties with lower tax base values, especially those with smaller land areas and lower land values per square meter.
Payment history is also strongly associated with the visitation effect: properties previously fined showed higher compliance post-visit, though the visitation's effectiveness diminished with the fine's value, even after adjusting for property value. Also, the set of owner characteristics we have access to does not seem to influence the program's effect, although this could change in a setting where more variables are available.

Notably, the enhanced effectiveness in lower-value properties prompts a discussion of algorithm-based taxation strategies, which, while aiming to improve compliance, could inadvertently focus enforcement on lower-income owners, potentially leading to regressive taxation. This underscores the need for further research on algorithm-based taxation and policy learning as a natural progression from this study.

References

Advani, A., Elming, W., & Shaw, J. (2023). The dynamic effects of tax audits. The Review of Economics and Statistics, 105(3), 545–561. https://doi.org/10.1162/rest_a_01101
Allingham, M., & Sandmo, A. (1972). Income tax evasion: A theoretical analysis. Journal of Public Economics, 1(3-4), 323–338. https://doi.org/10.1016/0047-2727(72)90010-2
Asatryan, Z., & Peichl, A. (2017). Responses of firms to tax, administrative and accounting rules: Evidence from Armenia.
Athey, S., & Imbens, G. (2016). Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences, 113(27), 7353–7360. https://doi.org/10.1073/pnas.1510489113
Athey, S., & Imbens, G. W. (2019). Machine learning methods that economists should know about. Annual Review of Economics, 11, 685–725. https://doi.org/10.1146/annurev-economics-080217-053433
Athey, S., & Wager, S. (2019). Estimating treatment effects with causal forests: An application.
Observational Studies, 5(2), 37–51. https://doi.org/10.1353/obs.2019.0001
Athey, S., & Wager, S. (2021). Policy learning with observational data. Econometrica, 89(1), 133–161. https://doi.org/10.3982/ECTA15732
Becker, G. S. (1968). Crime and punishment: An economic approach. Journal of Political Economy, 76(2), 169–217. Retrieved January 22, 2024, from http://www.jstor.org/stable/1830482
Belloni, A., Chernozhukov, V., Fernández-Val, I., & Hansen, C. (2017). Program evaluation and causal inference with high-dimensional data. Econometrica, 85(1), 233–298. https://doi.org/10.3982/ECTA12723
Belloni, A., Chernozhukov, V., & Hansen, C. (2013). Inference on Treatment Effects after Selection among High-Dimensional Controls. The Review of Economic Studies, 81(2), 608–650. https://doi.org/10.1093/restud/rdt044
Besley, T., Jensen, A., & Persson, T. (2023). Norms, enforcement, and tax evasion. Review of Economics and Statistics, 105(4), 998–1007.
Bhargava, S., & Manoli, D. (2015). Psychological frictions and the incomplete take-up of social benefits: Evidence from an IRS field experiment. American Economic Review, 105(11), 3489–3529.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning (M. Jordan, J. Kleinberg, & B. Schölkopf, Eds.; Vol. 4). Springer. https://doi.org/10.1117/1.2819119
Boning, W. C., Hendren, N., Sprung-Keyser, B., & Stuart, E. (2023). A welfare analysis of tax audits across the income distribution.
Caron, A., Baio, G., & Manolopoulou, I. (2022). Estimating Individual Treatment Effects using Non-Parametric Regression Models: a Review. Journal of the Royal Statistical Society Series A: Statistics in Society, 185(3), 1115–1149. https://doi.org/10.1111/rssa.12824
CBO. (2020). Trends in the internal revenue service's funding and enforcement. https://www.cbo.gov/publication/56422
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1–C68. https://doi.org/10.1111/ectj.12097
Chernozhukov, V., Demirer, M., Duflo, E., & Fernández-Val, I. (2023). Fisher-Schultz lecture: Generic machine learning inference on heterogenous treatment effects in randomized experiments, with an application to immunization in India.
Cox, J. C., Kreisman, D., & Dynarski, S. (2020). Designed to fail: Effects of the default option and information complexity on student loan repayment. Journal of Public Economics, 192(11). https://doi.org/10.1016/j.jpubeco.2020.104298
De Neve, J.-E., Imbert, C., Spinnewijn, J., Tsankova, T., & Luts, M. (2021). How to improve tax compliance? Evidence from population-wide experiments in Belgium. Journal of Political Economy, 129(5), 1425–1463. https://doi.org/10.1086/713096
Foster, J. C., Taylor, J. M., & Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30(24), 2867–2880. https://doi.org/10.1002/sim.4322
Hahn, P. R., Murray, J. S., & Carvalho, C. M. (2020). Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects (with Discussion). Bayesian Analysis, 15(3), 965–2020. https://doi.org/10.1214/19-BA1195
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer. https://books.google.com/books?id=eBSgoAEACAAJ
Hebous, S., Jia, Z., Løyland, K., Thoresen, T. O., & Øvrum, A. (2023). Do audits improve future tax compliance in the absence of penalties? Evidence from random audits in Norway. Journal of Economic Behavior & Organization, 207, 305–326. https://doi.org/10.1016/j.jebo.2023.01.001
Hill, J. L. (2011). Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20(1), 217–240. https://doi.org/10.1198/jcgs.2010.08162
Imbens, G. W., & Rubin, D. B. (2015). Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge University Press.
Kirchler, E. (2007). The economic psychology of tax behaviour. Cambridge University Press.
Kleven, H. J., & Kopczuk, W. (2011). Transfer program complexity and the take-up of social benefits. American Economic Journal: Economic Policy, 3(1), 54–90. https://doi.org/10.1257/pol.3.1.54
Kleven, H., Knudsen, M., Kreiner, C., Pedersen, S., & Saez, E. (2011). Unwilling or unable to cheat? Evidence from a tax audit experiment in Denmark. Econometrica, 79(3), 651–692. https://doi.org/10.3982/ECTA9113
Künzel, S. R., Sekhon, J. S., Bickel, P. J., & Yu, B. (2019). Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10), 4156–4165. https://doi.org/10.1073/pnas.1804597116
Lu, M., Sadiq, S., Feaster, D. J., & Ishwaran, H. (2018). Estimating individual treatment effect in observational data using random forest methods. Journal of Computational and Graphical Statistics, 27(1), 209–219. https://doi.org/10.1080/10618600.2017.1356325
Luttmer, E. F., & Singhal, M. (2014). Tax morale. Journal of Economic Perspectives, 28(4), 149–168.
Nie, X., & Wager, S. (2020). Quasi-oracle estimation of heterogeneous treatment effects. Biometrika, 108(2), 299–319. https://doi.org/10.1093/biomet/asaa076
Parikh, H., Volfovsky, A., & Rudin, C. (2022). MALTS: Matching after learning to stretch. Journal of Machine Learning Research, 23(240).
Powers, S., Qian, J., Jung, K., Schuler, A., Shah, N. H., Hastie, T., & Tibshirani, R. (2018). Some methods for heterogeneous treatment effect estimation in high dimensions. Statistics in Medicine, 37(11), 1767–1787. https://doi.org/10.1002/sim.7623
Robinson, P. M. (1988). Root-n-consistent semiparametric regression. Econometrica, 56(4), 931–954. Retrieved January 26, 2024, from http://www.jstor.org/stable/1912705
Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, 6(1), 34–58. Retrieved January 25, 2024, from http://www.jstor.org/stable/2958688
Sarin, N., & Summers, L. H. (2020, July). Understanding the revenue potential of tax compliance investment (Working Paper No. 27571). National Bureau of Economic Research. https://doi.org/10.3386/w27571
Semenova, V., & Chernozhukov, V. (2020). Debiased machine learning of conditional average treatment effects and other causal functions. The Econometrics Journal, 24(2), 264–289. https://doi.org/10.1093/ectj/utaa027
Slemrod, J., Blumenthal, M., & Christian, C. (2001). Taxpayer response to an increased probability of audit: Evidence from a controlled experiment in Minnesota. Journal of Public Economics, 79(3), 455–483. https://doi.org/10.1016/S0047-2727(99)00107-3
Slemrod, J., & Yitzhaki, S. (1987). The optimal size of a tax collection agency. Scandinavian Journal of Economics, 89(2), 183–192.
Slemrod, J., & Yitzhaki, S. (2002). Chapter 22 tax avoidance, evasion, and administration. Handbook of Public Economics, 3, 1423–1470. https://doi.org/10.1016/S1573-4420(02)80026-X
Iacus, S. M., King, G., & Porro, G. (2011). Multivariate matching methods that are monotonic imbalance bounding. Journal of the American Statistical Association, 106(493), 345–361. https://doi.org/10.1198/jasa.2011.tm09599
Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), 1228–1242. https://doi.org/10.1080/01621459.2017.1319839
World-Bank. (2020). Indonesia - public expenditure review: Spending for better results. http://documents.worldbank.
org/curated/en/154721588612658163/Indonesia-Public-Expenditure-Review-Spending-for-Better-Results
Yitzhaki, S. (1987). On the excess burden of tax evasion. Public Finance Review, 15(2), 123–137. https://doi.org/10.1177/109114218701500201

A. Tables

Table 1: Treatment group vs Control group comparison: Outcomes

Variable | Not Treated/Visited (N=21184) | Treated/Visited (N=31725) | P-value
Tax Participation - 2021 | 12468 (66.9%) | 24520 (83.9%) | <0.001
Tax Participation - 2022 | 10175 (59.2%) | 21878 (76.3%) | <0.001
Compliance - 2021 | 0.55 ± 0.44 | 0.69 ± 0.38 | <0.001
Compliance - 2022 | 0.59 ± 0.49 | 0.76 ± 0.43 | <0.001
ln Value Paid - 2021 | 7.61 ± 5.46 | 9.10 ± 4.11 | <0.001
ln Value Paid - 2022 | 6.88 ± 5.80 | 8.46 ± 4.80 | <0.001
Value Paid - 2021 | 212335 ± 1410139 | 138643 ± 2237168 | <0.001
Value Paid - 2022 | 246751 ± 1514197 | 150380 ± 2248708 | <0.001

B. Figures

Figure 1: Cross Country Finances. Panels: Expenditure; Revenue. Axes: % of GDP; Log GDP per capita (PPP). Series: Other Countries, Indonesia, Fitted values. Source: World Bank - World Development Indicators. Ref.: GC.REV.XGRT.GD.ZS, NY.GDP.PCAP.PP.CD, NE.CON.GOVT.ZS.

Figure 2: Tax to GDP

Figure 3: (a) Indonesia - Gorontalo in Blue; (b) Gorontalo - Kota Gorontalo in Blue

Figure 4: Distribution of Visits and Compliance

Figure 5: Distribution of Visits and Compliance. Panels: All; Paid; Not Paid. Axes: index normalized to 100 on first year; Year (2014 to 2022). Series: Average Tax Bill, Total Taxbill Payable, Number of Properties.

Figure 6: Map Pre-Treatment Compliance vs Propensity Score. Map of Kota Gorontalo neighborhoods. Scatter plot axes: Compliant property (%) - 2019; Visited property (%) - 2020.

Figure 7: Distribution Property by Payment and Treatment Status. Partially Paid (48.9%): Visited 32.8%, Control 16.2%. Not Paid (22.8%): Control 13.1%, Visited 9.7%. Fully Paid (28.3%): Visited 17.6%, Control 10.7%.

Figure 8: Distribution Payable by Payment and Treatment Status. Partially Compliant - Paid (28.0%): Visited 14.0%, Control 13.9%. Non compliant (19.7%): Control 12.7%, Visited 7.0%. Fully Compliant (30.0%): Visited 15.2%, Control 14.8%. Partially Compliant - Debt (22.4%): Control 12.3%, Visited 10.0%.

Figure 9: Distribution Propensity Score

Figure 10: Distribution ITE

Table 2: Baseline: Covariates

Variable | Not Treated/Visited (N=18631) | Treated/Visited (N=29226) | P-value
ln Value of Tax Object - 2021 | 18.28 ± 1.51 | 17.91 ± 1.36 | <0.001
Dummy - Lower Tax Rate - 2021 | 17632 (94.6%) | 28612 (97.9%) | <0.001
ln Land Area - 2021 | 5.86 ± 1.07 | 5.65 ± 0.91 | <0.001
ln Taxable Value - 2021 | 12.39 ± 0.87 | 12.18 ± 0.74 | 0.60
ln Taxable Value - Land (by m2) - 2021 | 6.53 ± 1.48 | 6.54 ± 1.21 | <0.001
Dummy - Construction - 2021 | 13128 (70.5%) | 25143 (86.0%) | <0.001
ln Value - Land (Total) - 2021 | 9.18 ± 5.97 | 11.11 ± 4.52 | <0.001
ln Building Area - 2021 | 3.22 ± 2.19 | 3.80 ± 1.65 | <0.001
ln Taxable Value - Building (by m2) - 2021 | 8.49 ± 0.72 | 8.54 ± 0.65 | <0.001
Dummy - Owner Lives in The Property - 2021 | 1914 (10.3%) | 3478 (11.9%) | <0.001
Dummy - Owner has multiple homes - 2021 | 1757 (9.4%) | 1596 (5.5%) | <0.001
ln # properties by owner if multiple - 2021 | 0.75 ± 0.21 | 0.73 ± 0.16 | <0.001
ln # properties by owner - 2021 | 0.03 ± 0.11 | 0.02 ± 0.09 | <0.001
ln Value Owner's Portfolio** - 2022 | 1.87 ± 5.82 | 1.07 ± 4.46 | <0.001
Dummy - Property Paid >0 - 2020 | 13754 (73.8%) | 26215 (89.7%) | <0.001
Dummy - Property Payable is Null - 2021 | 0 (0.0%) | 0 (0.0%) | <0.001
Dummy - Tax was paid after deadline - 2020 | 4965 (26.6%) | 6290 (21.5%) | <0.001
Dummy - Property fined - 2020 | 4870 (26.1%) | 6261 (21.4%) | <0.001
ln Tax Fine* - 2020 | 2.39 ± 4.11 | 1.86 ± 3.64 | <0.001
ln Tax Paid* - 2020 | 8.43 ± 5.13 | 9.79 ± 3.46 | <0.001

Table 3: Treatment group vs Control group comparison: Outcomes

Variable | Not Treated/Visited (N=21184) | Treated/Visited (N=31725) | P-value
Tax Participation - 2021 - $Y^{Pa}_{2021}$ | 12468 (66.9%) | 24520 (83.9%) | <0.001
Tax Participation - 2022 - $Y^{Pa}_{2022}$ | 10175 (59.2%) | 21878 (76.3%) | <0.001
Compliance - 2021 - $Y^{Co}_{2021}$ | 0.55 ± 0.44 | 0.69 ± 0.38 | <0.001
Compliance - 2022 - $Y^{Co}_{2022}$ | 0.59 ± 0.49 | 0.76 ± 0.43 | <0.001
ln Value Tax Bill Paid - 2021 - $Y^{lnP}_{2021}$ | 7.61 ± 5.46 | 9.10 ± 4.11 | <0.001
ln Value Tax Bill Paid - 2022 - $Y^{lnP}_{2022}$ | 6.88 ± 5.80 | 8.46 ± 4.80 | <0.001
Value Tax Bill Paid - 2021 - $Y^{VP}_{2021}$ | 212335 ± 1410139 | 138643 ± 2237168 | <0.001
Value Tax Bill Paid - 2022 - $Y^{VP}_{2022}$ | 246751 ± 1514197 | 150380 ± 2248708 | <0.001

Table 4: Treatment Effect (standard errors in parentheses)

Estimand | 2021 τPa | 2021 τCo | 2021 τlnP | 2021 τVP | 2022 τPa | 2022 τCo | 2022 τlnP | 2022 τVP
ATE | 0.04*** | 0.035*** | 0.422*** | 29597 | 0.063*** | 0.063*** | 0.678*** | 29645
 | (0.014) | (0.011) | (0.145) | (29156) | (0.011) | (0.011) | (0.122) | (28764)
ATT | 0.035** | 0.034*** | 0.377** | 18638 | 0.061*** | 0.061*** | 0.663*** | 19133
 | (0.017) | (0.013) | (0.172) | (18072) | (0.013) | (0.013) | (0.136) | (17554)
ATU | 0.047*** | 0.038*** | 0.492*** | 47986 | 0.064*** | 0.064*** | 0.693*** | 48431
 | (0.01) | (0.008) | (0.107) | (52113) | (0.009) | (0.009) | (0.102) | (53937)

Table 5: High vs Low CATE, 2021 (standard errors in parentheses)

Estimand | τPa (H) | τPa (L) | τPa (Diff) | τCo (H) | τCo (L) | τCo (Diff) | τlnP (H) | τlnP (L) | τlnP (Diff) | τVP (H) | τVP (L) | τVP (Diff)
ATE | 0.056** | 0.023*** | 0.033 | 0.06*** | 0.008 | 0.052*** | 0.622*** | 0.238*** | 0.384* | -17 | 55178 | -55195
 | (0.025) | (0.008) | (0.026) | (0.019) | (0.006) | (0.02) | (0.248) | (0.077) | (0.26) | (13377.1) | (53974.8) | (55607.8)
ATT | 0.044* | 0.025*** | 0.02 | 0.053*** | 0.01* | 0.043** | 0.513** | 0.249*** | 0.264 | 767.2 | 32176.1 | -31408.9
 | (0.031) | (0.008) | (0.031) | (0.022) | (0.006) | (0.023) | (0.3) | (0.078) | (0.31)
(10479) (31375.7) (33079.3) ATU 0.076*** 0.019*** 0.057*** 0.074*** 0.005 0.069*** 0.801*** 0.204*** 0.597*** -2968.1 103749.7 -106717.8 (0.017) (0.007) (0.019) (0.014) (0.007) (0.016) (0.172) (0.082) (0.19) (24975.4) (107574.8) (110436) 2022 τPa τCo τlnP τV P (H) (L) (Diff) (H) (L) (Diff) (H) (L) (Diff) (H) (L) (Diff) ATE 0.099*** 0.037*** 0.062*** 0.098*** 0.036*** 0.063*** 0.953*** 0.507*** 0.447** -17 55178 -55195 (0.014) (0.01) (0.017) (0.014) (0.01) (0.017) (0.143) (0.13) (0.193) (13377.1) (53974.8) (55607.8) ATT 0.095*** 0.04*** 0.055*** 0.094*** 0.038*** 0.056*** 0.889*** 0.562*** 0.327* 767.2 32176.1 -31408.9 (0.015) (0.011) (0.019) (0.015) (0.011) (0.018) (0.15) (0.141) (0.206) (10479) (31375.7) (33079.3) ATU 0.108*** 0.032*** 0.075*** 0.107*** 0.032*** 0.075*** 1.065*** 0.418*** 0.647*** -2968.1 103749.7 -106717.8 (0.015) (0.01) (0.018) (0.015) (0.01) (0.018) (0.154) (0.128) (0.201) (24975.4) (107574.8) (110436) 43 Table 6: Calibration Test 2021 2022 Estimate Sd P-Value Estimate Sd P-Value τPa Mean 1.00176 0.2718 0.00011 1.02106 0.13588 - Differential 1.36729 0.28072 - 1.20537 0.22464 - τCo Mean 1.00982 0.24418 0.00002 1.0073 0.13887 - Differential 1.47134 0.2412 - 1.28648 0.27367 - τlnP Mean 1.00429 0.26355 0.00007 1.02692 0.14324 - Differential 1.22987 0.21412 - 0.91808 0.22855 0.00003 τV P Mean 3.29768 3.28017 0.15737 3.29768 3.28017 0.15737 Differential -3.10676 3.33146 0.82447 -3.10676 3.33146 0.82447 44 Table 7: Outcome: Paid - 2021 (1) (2) (3) (4) (5) (6) (7) (8) ln Value Tax Base -0.015 ** -0.030 -0.058 (0.006) (0.020) (0.043) ln Value Tax Base - LAND -0.007 -0.004 (0.007) (0.007) ln Land Area - 2021 -0.022 ** -0.022 ** -0.063 *** -0.029 0.014 (0.011) (0.011) (0.016) (0.029) (0.067) ln Taxable Value - Land (by m2) - 2021 -0.004 -0.004 -0.024 *** -0.007 0.006 (0.007) (0.007) (0.009) (0.015) (0.039) Dummy - Contruction - 2021 -0.121 -0.207 -0.210 -0.071 -0.132 -0.167 -0.159 (0.119) (0.141) (0.143) (0.135) (0.118) (0.154) (0.142) ln Building Area - 
2021 0.010 0.010 -0.008 0.006 -0.000 (0.011) (0.011) (0.011) (0.014) (0.014) ln Value Tax Base - BUILDING 0.010 0.011 (0.009) (0.009) ln Taxable Value - Building (by m2) * - 2021 0.018 0.018 0.010 0.015 0.018 * (0.011) (0.011) (0.010) (0.011) (0.011) Dummy - Owner has lives in Property - 2021 -0.004 -0.004 -0.000 -0.004 0.018 (0.014) (0.014) (0.013) (0.014) (0.018) Dummy - Owner has multiple homes - 2021 0.028 0.014 0.081 -0.010 0.027 (0.234) (0.235) (0.243) (0.236) (0.235) ln Value Owners Portfolio** - 2021 -0.002 -0.001 -0.004 0.000 -0.001 (0.012) (0.012) (0.012) (0.012) (0.012) Dummy - Property Paid > 0 - 2020 -0.244 -0.091 -0.267 -0.267 (0.188) (0.188) (0.187) (0.195) Dummy - Property fined - 2020 0.073 * 0.072 * 0.072 * 0.076 * (0.041) (0.041) (0.041) (0.045) ln Value Fine - 2020 -0.021 *** -0.020 *** -0.021 *** -0.020 *** (0.006) (0.006) (0.006) (0.007) Dummy - Tax was paid after deadline - 2020 0.103 *** 0.100 ** 0.102 *** 0.092 ** (0.039) (0.040) (0.039) (0.041) Dummy - Lower Tax Bracket - 2021 -0.029 -0.012 -0.032 -0.015 (0.021) (0.020) (0.021) (0.034) *** p < 0.01; ** p < 0.05; * p < 0.1. 

Table 8: Outcome: Sh Paid - 2021

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | -0.020*** (0.007); 0.025 (0.022); -0.018 (0.029) |
| ln Value Tax Base - LAND | -0.025** (0.010); -0.024** (0.010) |
| ln Land Area - 2021 | -0.048*** (0.014); -0.049*** (0.014); -0.089*** (0.021); -0.117*** (0.035); -0.051 (0.049) |
| ln Taxable Value - Land (by m2) - 2021 | -0.025** (0.010); -0.025** (0.010); -0.044*** (0.013); -0.058*** (0.019); -0.026 (0.029) |
| Dummy - Construction - 2021 | -0.111 (0.121); -0.211 (0.144); -0.214 (0.147); -0.098 (0.138); -0.135 (0.121); -0.018 (0.147); -0.089 (0.133) |
| ln Building Area - 2021 | 0.019* (0.011); 0.019* (0.011); 0.003 (0.012); -0.009 (0.015); -0.003 (0.013) |
| ln Value Tax Base - BUILDING | 0.011 (0.009); 0.013 (0.009) |
| ln Taxable Value - Building (by m2) * - 2021 | 0.016 (0.012); 0.016 (0.012); 0.010 (0.011); 0.006 (0.011); 0.012 (0.011) |
| Dummy - Owner lives in Property - 2021 | -0.007 (0.014); -0.006 (0.014); -0.002 (0.014); -0.007 (0.014); 0.006 (0.017) |
| Dummy - Owner has multiple homes - 2021 | -0.178 (0.226); -0.190 (0.228); -0.106 (0.237); -0.171 (0.230); -0.158 (0.226) |
| ln Value Owner’s Portfolio** - 2021 | 0.009 (0.011); 0.009 (0.011); 0.005 (0.012); 0.008 (0.012); 0.008 (0.011) |
| Dummy - Property Paid > 0 - 2020 | 0.017 (0.187); 0.160 (0.192); 0.037 (0.186); 0.027 (0.192) |
| Dummy - Property fined - 2020 | 0.104** (0.047); 0.105** (0.046); 0.105** (0.047); 0.119** (0.049) |
| ln Value Fine - 2020 | -0.022*** (0.006); -0.021*** (0.006); -0.022*** (0.006); -0.023*** (0.007) |
| Dummy - Tax was paid after deadline - 2020 | 0.072 (0.047); 0.070 (0.048); 0.072 (0.047); 0.071 (0.048) |
| Dummy - Lower Tax Bracket - 2021 | -0.084*** (0.031); -0.067** (0.031); -0.080*** (0.031); -0.059 (0.037) |

*** p < 0.01; ** p < 0.05; * p < 0.1.

Table 9: Outcome: Ln Tax Bill - 2021

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | -0.133** (0.066); -0.264 (0.222); -0.555 (0.432) |
| ln Value Tax Base - LAND | -0.059 (0.079); -0.032 (0.076) |
| ln Land Area - 2021 | -0.209* (0.115); -0.205* (0.114); -0.574*** (0.178); -0.277 (0.333); 0.150 (0.678) |
| ln Taxable Value - Land (by m2) - 2021 | -0.038 (0.076); -0.036 (0.075); -0.206** (0.095); -0.059 (0.169); 0.058 (0.398) |
| Dummy - Construction - 2021 | -1.479 (1.274); -2.518* (1.521); -2.554* (1.548); -1.311 (1.479); -1.577 (1.271); -2.159 (1.697); -2.088 (1.535) |
| ln Building Area - 2021 | 0.152 (0.120); 0.155 (0.122); -0.005 (0.125); 0.120 (0.162); 0.056 (0.152) |
| ln Value Tax Base - BUILDING | 0.121 (0.095); 0.133 (0.095) |
| ln Taxable Value - Building (by m2) * - 2021 | 0.204* (0.118); 0.206* (0.120); 0.135 (0.112); 0.180 (0.123); 0.216* (0.115) |
| Dummy - Owner lives in Property - 2021 | -0.036 (0.142); -0.037 (0.143); 0.002 (0.139); -0.034 (0.142); 0.186 (0.191) |
| Dummy - Owner has multiple homes - 2021 | 0.554 (2.455); 0.300 (2.460); 0.943 (2.546); 0.095 (2.474); 0.510 (2.450) |
| ln Value Owner’s Portfolio** - 2021 | -0.030 (0.124); -0.017 (0.124); -0.049 (0.128); -0.006 (0.124); -0.026 (0.122) |
| Dummy - Property Paid > 0 - 2020 | -1.498 (2.082); -0.162 (2.103); -1.707 (2.058); -1.649 (2.136) |
| Dummy - Property fined - 2020 | 0.687 (0.444); 0.686 (0.441); 0.679 (0.444); 0.729 (0.486) |
| ln Value Fine - 2020 | -0.221*** (0.070); -0.215*** (0.069); -0.220*** (0.070); -0.215*** (0.072) |
| Dummy - Tax was paid after deadline - 2020 | 1.179** (0.459); 1.160** (0.461); 1.179** (0.458); 1.065** (0.473) |
| Dummy - Lower Tax Bracket - 2021 | -0.211 (0.276); -0.066 (0.267); -0.245 (0.278); -0.065 (0.391) |

*** p < 0.01; ** p < 0.05; * p < 0.1.

Table 10: Outcome: Tax Bill - 2021

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | 222556.640 (238387.509); 145230.336 (201134.874); 106586.402 (165276.316) |
| ln Value Tax Base - LAND | 141312.019 (124510.525); -10856.498 (61185.407) |
| ln Land Area - 2021 | 72004.568 (85054.791); 96342.613 (98539.560); -297889.522 (339333.927); -461245.351 (551840.292); -422964.872 (492083.566) |
| ln Taxable Value - Land (by m2) - 2021 | 72557.910 (47950.435); 85813.642 (56428.892); -85330.750 (138920.357); -166327.192 (244258.347); -200313.282 (251479.182) |
| Dummy - Construction - 2021 | -1773094.134 (1736977.289); -2711378.936 (2574868.952); -2846357.515 (2670915.225); -1557781.564 (1240293.393); -1129181.664 (956958.886); -1091813.908 (783220.229); -990397.573 (794946.760) |
| ln Building Area - 2021 | 459088.192 (450239.627); 472430.159 (459576.851); 284093.842 (250928.046); 215631.170 (179683.003); 206154.638 (180829.272) |
| ln Value Tax Base - BUILDING | 135769.624 (133091.175); 92088.337 (79755.672) |
| ln Taxable Value - Building (by m2) * - 2021 | 80765.418 (74747.500); 88473.464 (80241.888); 18876.799* (11111.874); -5974.395 (35908.242); -6335.561 (32132.195) |
| Dummy - Owner lives in Property - 2021 | -72756.846 (67744.420); -80161.736 (78066.459); -48808.148 (49222.839); -81853.644 (79962.813); -43829.009 (46748.139) |
| Dummy - Owner has multiple homes - 2021 | 3571921.222 (2347234.344); 3483366.365 (2335380.049); 3964645.613 (2681344.929); 3596353.590 (2440665.161); 3537330.581 (2404088.996) |
| ln Value Owner’s Portfolio** - 2021 | -190296.461 (125270.934); -186934.613 (125816.473); -210400.184 (142768.543); -192791.815 (131337.370); -189404.135 (129360.885) |
| Dummy - Property Paid > 0 - 2020 | 151592.047 (1022722.255); 158639.105 (1002016.777); 266208.731 (903613.994); 459714.822 (738871.049) |
| Dummy - Property fined - 2020 | -2331957.402 (2126048.710); -2311408.205 (2104967.352); -2327587.929 (2119490.815); -2409083.337 (2181472.530) |
| ln Value Fine - 2020 | -218064.353 (214449.852); -219624.406 (216701.669); -218475.117 (215206.130); -213020.195 (208956.165) |
| Dummy - Tax was paid after deadline - 2020 | 4333982.569 (4053395.258); 4329002.477 (4054572.019); 4334433.707 (4054654.588); 4320873.143 (4018702.211) |
| Dummy - Lower Tax Bracket - 2021 | -83810.999 (194584.078); -64311.203 (184923.738); -64652.907 (190673.323); -148636.771 (230035.851) |

*** p < 0.01; ** p < 0.05; * p < 0.1.

Table 11: Outcome: Paid - 2022

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | -0.027*** (0.009); 0.041 (0.026); 0.015 (0.031) |
| ln Value Tax Base - LAND | -0.005 (0.011); 0.008 (0.011) |
| ln Land Area - 2021 | -0.042** (0.017); -0.041** (0.017); -0.093*** (0.029); -0.142*** (0.040); -0.103* (0.053) |
| ln Taxable Value - Land (by m2) - 2021 | -0.000 (0.012); 0.000 (0.012); -0.025* (0.014); -0.048** (0.020); -0.041 (0.029) |
| Dummy - Construction - 2021 | 0.041 (0.094); -0.016 (0.121); -0.021 (0.121); 0.138 (0.141); -0.032 (0.094); 0.273* (0.164); 0.199 (0.173) |
| ln Building Area - 2021 | -0.000 (0.012); 0.000 (0.012); -0.022 (0.016); -0.042** (0.019); -0.039* (0.020) |
| ln Value Tax Base - BUILDING | -0.001 (0.007); 0.005 (0.007) |
| ln Taxable Value - Building (by m2) * - 2021 | 0.001 (0.010); 0.001 (0.010); -0.007 (0.011); -0.015 (0.012); -0.006 (0.012) |
| Dummy - Owner lives in Property - 2021 | 0.001 (0.015); 0.001 (0.015); 0.008 (0.014); 0.000 (0.015); 0.012 (0.015) |
| Dummy - Owner has multiple homes - 2021 | 0.099 (0.342); 0.102 (0.337); 0.217 (0.334); 0.140 (0.340); 0.219 (0.332) |
| ln Value Owner’s Portfolio** - 2021 | -0.006 (0.017); -0.006 (0.017); -0.012 (0.017); -0.008 (0.017); -0.012 (0.017) |
| Dummy - Property Paid > 0 - 2020 | -0.044 (0.168); 0.226 (0.175); -0.010 (0.171); 0.029 (0.186) |
| Dummy - Property fined - 2020 | 0.141* (0.072); 0.143** (0.072); 0.143** (0.072); 0.141* (0.072) |
| ln Value Fine - 2020 | -0.023*** (0.007); -0.022*** (0.007); -0.023*** (0.007); -0.022*** (0.007) |
| Dummy - Tax was paid after deadline - 2020 | 0.060 (0.056); 0.055 (0.057); 0.060 (0.056); 0.047 (0.056) |
| Dummy - Lower Tax Bracket - 2021 | -0.052* (0.027); -0.041 (0.025); -0.049* (0.027); -0.030 (0.032) |

*** p < 0.01; ** p < 0.05; * p < 0.1.

Table 12: Outcome: Sh Paid - 2022

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | -0.028*** (0.009); 0.041 (0.026); 0.016 (0.031) |
| ln Value Tax Base - LAND | -0.006 (0.011); 0.008 (0.011) |
| ln Land Area - 2021 | -0.044** (0.017); -0.043** (0.017); -0.093*** (0.029); -0.143*** (0.040); -0.105** (0.053) |
| ln Taxable Value - Land (by m2) - 2021 | -0.001 (0.012); -0.001 (0.012); -0.025* (0.014); -0.049** (0.020); -0.042 (0.029) |
| Dummy - Construction - 2021 | 0.038 (0.095); -0.021 (0.122); -0.025 (0.122); 0.129 (0.142); -0.037 (0.095); 0.266 (0.166); 0.195 (0.176) |
| ln Building Area - 2021 | 0.000 (0.012); 0.000 (0.012); -0.021 (0.016); -0.041** (0.019); -0.038* (0.021) |
| ln Value Tax Base - BUILDING | -0.001 (0.007); 0.005 (0.007) |
| ln Taxable Value - Building (by m2) * - 2021 | 0.001 (0.010); 0.002 (0.010); -0.006 (0.011); -0.015 (0.012); -0.006 (0.012) |
| Dummy - Owner lives in Property - 2021 | 0.001 (0.015); 0.001 (0.015); 0.008 (0.015); 0.000 (0.015); 0.013 (0.015) |
| Dummy - Owner has multiple homes - 2021 | 0.096 (0.348); 0.096 (0.343); 0.210 (0.341); 0.134 (0.345); 0.213 (0.338) |
| ln Value Owner’s Portfolio** - 2021 | -0.006 (0.018); -0.006 (0.017); -0.011 (0.017); -0.008 (0.017); -0.012 (0.017) |
| Dummy - Property Paid > 0 - 2020 | -0.040 (0.166); 0.229 (0.176); -0.005 (0.170); 0.040 (0.185) |
| Dummy - Property fined - 2020 | 0.138* (0.072); 0.139* (0.072); 0.140** (0.071); 0.138* (0.071) |
| ln Value Fine - 2020 | -0.023*** (0.007); -0.022*** (0.007); -0.024*** (0.007); -0.023*** (0.007) |
| Dummy - Tax was paid after deadline - 2020 | 0.069 (0.057); 0.064 (0.057); 0.069 (0.056); 0.055 (0.057) |
| Dummy - Lower Tax Bracket - 2021 | -0.046* (0.028); -0.035 (0.026); -0.043 (0.028); -0.025 (0.033) |

*** p < 0.01; ** p < 0.05; * p < 0.1.

Table 13: Outcome: Ln Tax Bill - 2022

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | -0.245** (0.100); 0.437 (0.273); 0.183 (0.321) |
| ln Value Tax Base - LAND | -0.009 (0.141); 0.125 (0.135) |
| ln Land Area - 2021 | -0.377* (0.203); -0.357* (0.203); -0.869*** (0.313); -1.392*** (0.434); -1.032* (0.558) |
| ln Taxable Value - Land (by m2) - 2021 | 0.038 (0.146); 0.048 (0.144); -0.195 (0.166); -0.447** (0.228); -0.411 (0.302) |
| Dummy - Construction - 2021 | 0.350 (1.061); -0.288 (1.375); -0.369 (1.382); 1.199 (1.569); -0.345 (1.051); 2.645 (1.793); 1.943 (1.905) |
| ln Building Area - 2021 | 0.026 (0.137); 0.033 (0.136); -0.189 (0.175); -0.394* (0.205); -0.375* (0.224) |
| ln Value Tax Base - BUILDING | -0.004 (0.079); 0.052 (0.078) |
| ln Taxable Value - Building (by m2) * - 2021 | 0.011 (0.109); 0.015 (0.110); -0.065 (0.118); -0.153 (0.128); -0.061 (0.134) |
| Dummy - Owner lives in Property - 2021 | 0.006 (0.163); 0.002 (0.166); 0.074 (0.159); -0.005 (0.166); 0.129 (0.168) |
| Dummy - Owner has multiple homes - 2021 | 2.375 (3.866); 2.338 (3.797); 3.473 (3.758); 2.738 (3.826); 3.671 (3.734) |
| ln Value Owner’s Portfolio** - 2021 | -0.131 (0.198); -0.130 (0.194); -0.186 (0.192); -0.151 (0.195); -0.196 (0.190) |
| Dummy - Property Paid > 0 - 2020 | 0.350 (1.810); 2.937 (1.868); 0.715 (1.827); 1.232 (2.035) |
| Dummy - Property fined - 2020 | 1.323 (0.852); 1.339 (0.853); 1.346 (0.849); 1.317 (0.848) |
| ln Value Fine - 2020 | -0.251*** (0.085); -0.240*** (0.084); -0.253*** (0.085); -0.243*** (0.085) |
| Dummy - Tax was paid after deadline - 2020 | 0.938 (0.674); 0.890 (0.677); 0.936 (0.672); 0.788 (0.670) |
| Dummy - Lower Tax Bracket - 2021 | -0.468 (0.386); -0.365 (0.370); -0.437 (0.387); -0.239 (0.430) |

*** p < 0.01; ** p < 0.05; * p < 0.1.

Table 14: Outcome: Tax Bill - 2022

Columns (1) to (8) are specifications. Each row lists the estimates in column order for the specifications that include the variable, each followed by its parenthesized value.

| Variable | Estimates |
|---|---|
| ln Value Tax Base | 191860.798 (192775.827); 202568.216 (203550.247); 169489.533 (175560.689) |
| ln Value Tax Base - LAND | 136749.389 (120279.781); -24377.663 (68229.876) |
| ln Land Area - 2021 | 80894.376 (50414.784); 97515.643 (61683.601); -286346.122 (375396.769); -529133.247 (598098.463); -502186.600 (553014.873) |
| ln Taxable Value - Land (by m2) - 2021 | 72248.362** (35833.826); 81222.329* (43098.857); -91849.304 (158008.785); -208611.677 (265315.244); -246587.359 (279355.853) |
| Dummy - Construction - 2021 | -1777883.746 (1787569.246); -2483952.412 (2410305.617); -2572881.388 (2493156.799); -1318861.544 (1043026.195); -1066597.281 (960797.511); -648312.950 (628525.904); -579828.070 (657482.555) |
| ln Building Area - 2021 | 411714.230 (430228.651); 419787.780 (437731.582); 232699.958 (222052.519); 137801.804 (157865.419); 134621.209 (159945.240) |
| ln Value Tax Base - BUILDING | 136166.811 (137106.085); 86767.259 (78814.779) |
| ln Taxable Value - Building (by m2) * - 2021 | 80918.511 (65369.652); 86364.893 (70302.607); 23508.452 (15815.938); -17345.846 (46178.666); -17901.532 (41811.290) |
| Dummy - Owner lives in Property - 2021 | -53032.328 (60608.882); -63491.737 (74880.861); -34891.594 (44483.350); -66494.618 (77194.959); -32618.266 (47027.638) |
| Dummy - Owner has multiple homes - 2021 | 2303081.400 (1964964.565); 2343631.583 (2090330.960); 2835274.643 (2537803.185); 2529147.274 (2213191.889); 2428627.282 (2130124.220) |
| ln Value Owner’s Portfolio** - 2021 | -124017.307 (105912.077); -127307.062 (113653.800); -151569.926 (135885.423); -136943.942 (120103.760); -131686.607 (115807.715) |
| Dummy - Property Paid > 0 - 2020 | 134380.306 (1053525.329); 335277.291 (824420.603); 303938.866 (932896.724); 574382.588 (753696.650) |
| Dummy - Property fined - 2020 | -2312804.727 (2120201.685); -2291088.091 (2095679.564); -2302214.253 (2109916.180); -2382672.164 (2169174.609) |
| ln Value Fine - 2020 | -233711.557 (202172.162); -234338.303 (203378.248); -234440.845 (203139.217); -229381.495 (197133.854) |
| Dummy - Tax was paid after deadline - 2020 | 4432343.180 (3913812.603); 4423667.898 (3909139.023); 4431304.303 (3914042.956); 4422784.682 (3880265.760) |
| Dummy - Lower Tax Bracket - 2021 | -73330.674 (159224.152); -81400.968 (160205.064); -58962.942 (156616.772); -146508.051 (199614.531) |

*** p < 0.01; ** p < 0.05; * p < 0.1.
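
The star notation used in the tables (*** p < 0.01; ** p < 0.05; * p < 0.1) can be illustrated with a minimal sketch. It assumes, which the tables do not state, that the parenthesized values are standard errors and that p-values come from a two-sided normal approximation; the function names `two_sided_p` and `stars` are hypothetical. Under those assumptions the sketch reproduces the stars on three 2021 entries of Table 4, but it is not guaranteed to match every entry, since the tables may rest on a different test.

```python
import math


def two_sided_p(estimate: float, std_error: float) -> float:
    """Two-sided p-value under a normal approximation (an assumption,
    not necessarily the test used in the paper)."""
    z = abs(estimate / std_error)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def stars(p_value: float) -> str:
    """Map a p-value to the legend: *** p < 0.01; ** p < 0.05; * p < 0.1."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""


if __name__ == "__main__":
    # (label, estimate, parenthesized value) taken from Table 4, 2021 columns.
    rows = [
        ("ATE, tax participation", 0.04, 0.014),
        ("ATT, tax participation", 0.035, 0.017),
        ("ATE, value paid", 29597.0, 29156.0),
    ]
    for label, est, se in rows:
        p = two_sided_p(est, se)
        print(f"{label}: z = {est / se:.2f}, p = {p:.4f}, stars = '{stars(p)}'")
```

The thresholds are checked from the strictest to the loosest so that each estimate receives only the strongest label it qualifies for.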