Policy Research Working Paper                              9124




              Digital Innovation in East Asia
                    Do Restrictive Data Policies Matter?

                             Martina Francesca Ferracane
                                 Erik van der Marel




East Asia and the Pacific Region
Office of the Chief Economist
January 2020


  Abstract
 Digital technologies encourage companies to innovate with new processes, goods, and services, which ultimately
 enhance their competitiveness in local and global markets. This paper analyzes whether a wide set of data restrictions
 are negatively associated with digital innovation of firms. The paper develops an index of data restrictions that
 measures the level of data policy restrictiveness for 15 East Asian countries over time. Using various firm-level data
 sets, the analysis shows that data restrictions inhibit firms’ ability to innovate. The analysis takes into account that
 data restrictions are likely to have a greater impact in sectors that are more reliant on software. Regressions show
 that in countries that have more restrictive data policies, firms are less likely to use foreign technologies through
 licensing as part of their innovation process. Country-specific cases for which data are available also show that
 restrictive data policies are negatively associated with firms’ likelihood of using intangible assets, such as patents
 and goodwill, for performing innovation (in Malaysia and China) and developing innovations as a result of research and
 development that are new to the market (in Vietnam). The paper concludes that open data policies are likely to foster
 digital innovation.




 This paper is a product of the Office of the Chief Economist, East Asia and the Pacific Region. It is part of a larger effort by
 the World Bank to provide open access to its research and make a contribution to development policy discussions around
 the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors
 may be contacted at erik.vandermarel@ecipe.org.




         The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
         issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
         names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
         of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
         its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                       Produced by the Research Support Team
             Digital Innovation in East Asia: Do 
             Restrictive Data Policies Matter? 

                                     Martina Francesca Ferracane 
                                          University of Hamburg, ECIPE 



                                             Erik van der Marel* 
                                 Université Libre de Bruxelles & ECARES, ECIPE 




    JEL classification: O31; D22; C54; F14 

    Keywords: Firm‐level innovation; data policy restrictions; software.  




* Corresponding author: Erik van der Marel, erik.vandermarel@ecipe.org, Senior Economist at ECIPE & Assistant
Professor at the Université Libre de Bruxelles (ULB), ECARES, Avenue des Arts 40, 1050, Brussels. Co‐author: Martina
Francesca Ferracane, PhD, martina.ferracane@gmail.com, Max Weber Fellow at the European University Institute (EUI)
and Research Associate at ECIPE. We thank Prerna Rakheja, Pinyi Chen and Faruk Miguel Liriano for their
excellent research assistance. We thank Francesca de Nicola for valuable advice and coordination when using the
different sets of firm‐level data.
    1. Introduction 
The digital transformation in many economies opens a wide range of innovation opportunities for 
firms. Digital technologies encourage companies to innovate with new processes, goods and 
services, which ultimately enhance their competitiveness in local and global markets. Digital 
innovation often happens through the internet and new online platforms to which firms increasingly 
have access across borders. However, many firms today face significant restrictions when it comes to 
these new digital technologies, access to the internet, the use of online platforms and the cross‐
border flow of data – most of which have been only recently enacted by governments. 

This paper analyzes data restrictions that are expected to affect digital innovation that happens with 
the support of data, the internet and online platforms. Together, we conveniently call them 
restrictions to data. Restrictions to data inhibit firms from innovating with advanced software and, more 
generally, with data across borders, which today are an essential part of the innovation process of many 
firms (Guellec and Paunov, 2018). For instance, big data, Artificial Intelligence (AI) and blockchain are 
new digital technology developments that generate and make available huge volumes of data that 
firms use to develop new products, services and even processes – all with the help of software. 
These new technologies help create a competitive advantage for the firm. Restrictions to data are 
therefore likely to slow down this competitive process.  

We record data policy restrictions for 15 countries in the East Asian region and investigate whether 
these barriers indeed affect the likelihood that firms innovate.1 We take the East Asian 
region as a case in point for two reasons. First, digital innovation is rife in the region. According to a 
recent OECD report, the increased use of digital technologies in East Asia is ushering in the 
transformation of economies and societies (OECD, 2019). Second, the region provides an interesting 
variation of policy responses over time regarding data. On the one hand, there are countries such as 
Indonesia, China and Vietnam that either have very strict data policies or have become much more 
restrictive with regard to data over time. On the other hand, there are countries such as the 
Republic of Korea, Malaysia and the Philippines which have removed data restrictions.  

For the purpose of this research, we have constructed an index that measures the extent to which 
15 East Asian countries are restricted regarding data policies. This restrictiveness index builds on 
previous work from Ferracane et al. (2018a; 2018b), but is expanded with new policies, such as 
those related to Intellectual Property Rights (IPR), that are expected to affect more generally digital 
innovation. The first step of this paper is to describe and analyze the developments of data 
restrictiveness for the 15 countries in the region for which we have collected policy developments. 
Then, the policy index is used to see how it correlates with firms’ performance regarding their 
innovation activities in 10 East Asian countries. In doing so, we take into account that data 
restrictions are likely to have a greater impact on digital innovation in sectors that are more data‐
intense, which we proxy by their software use. Finally, we select three countries (Malaysia, Vietnam 
and China) for which we have specific firm‐level data and analyze further whether our restrictiveness 
index has any bearing on firms’ innovation activities using different variables.  

The correlation exercises for both the cross‐country study and the three country studies 
show that a more restrictive policy framework regarding data correlates negatively with the 
extent to which firms innovate digitally. For instance, firms in countries that exhibit higher levels of 
restrictive data policies are less likely to use foreign technologies through licensing. In addition, the 
country cases show that firms in Malaysia that face higher levels of data restrictions are less likely to 
purchase foreign intangible assets, whereas firms in Vietnam that encounter higher levels of 
restrictions are less likely to develop goods and services that are new to international markets. 
Together, the results show that data policy restrictions are significant obstacles for firms seeking to develop 
digital innovation in East Asia.  

1 The countries are Cambodia, China, Indonesia, the Lao People’s Democratic Republic, Malaysia, Mongolia, 
Myanmar, the Philippines, Thailand and Vietnam. Other countries in the Southeast Asia region for which we 
have developed an index of data policy restrictiveness are Hong Kong SAR, China; Japan; the Republic of Korea; 
Singapore; and Taiwan, China, which will be discussed.  

The paper is organized as follows. The next section provides the motivation for performing this study 
and summarizes the recent literature regarding data, digital trade and digital trade policy 
restrictions. Section 3 presents the estimation strategy and discusses the two levels of empirical analysis 
we employ, i.e. the cross‐sectional regressions and the country‐specific regressions. 
Section 4 discusses the results of both analyses, and Section 5 
concludes by putting the results in a wider policy context.  

 

    2. Motivation and Previous Literature 
Despite the rising trend of data flowing across borders worldwide, research on this topic has been 
surprisingly limited. Manyika et al. (2016) claim that the contribution of cross‐border data flows 
to GDP has overtaken that of flows in goods as part of globalization today. The study 
states that data flows account for $2.8 trillion of the total increase in world GDP over the 
last decade, thereby exerting a larger impact on growth than traditional trade in goods.  

Recent literature has looked at the restrictive policies applied to data. A first attempt was made 
by Stone et al. (2015), who cover measures of data localization requirements only. Their study 
notes that data flows enhance the efficiency of trade for specialized services firms both domestically 
and across borders. Work by Ferracane (2017) further categorizes the different forms 
of existing data policies that affect the cross‐border movement of data. The study surveys data 
policies applied across 64 major economies to show that data restrictions are implemented in many 
countries, in different forms, and on different types of data. Finally, Ferracane et al. (2018b) have 
developed a sophisticated index for 64 countries in which the level of data restrictiveness is assessed 
covering many policy restrictions related to the cross‐border movement and domestic use of data. 
An updated and expanded version of the index is used in this paper. 

Research that analyzes the impact of data restrictions on economic outcomes is scarce. Van der 
Marel et al. (2016) and Ferracane et al. (2018b) are the only two studies that explore how regulatory 
policies related to data affect productivity. The authors analyze this linkage econometrically by 
setting up a regulatory restrictiveness index for the cross‐border and domestic use of data from 
Ferracane et al. (2018a) and extending this index over time. The authors calculate the costs 
associated with restrictive data policies by regressing firm‐level productivity on a composite 
indicator which measures the extent to which restrictive data regulations affect industries relying on 
data using software as a proxy. They find that stricter data policies tend to have a negative impact on 
the performance of firms in sectors which are more data‐intense. This paper employs a similar 
identification strategy and analyzes the impact of restrictive data policies on firms’ innovation.  

Other previous studies have looked specifically at one policy framework regarding data, namely the 
EU General Data Protection Regulation (GDPR), and estimated its costs to the economy. Christensen 
et al. (2013) use calibration techniques to evaluate the impact of the GDPR proposal on small and 
medium‐sized enterprises (SMEs) and conclude that SMEs that use data rather intensively are likely 
to incur substantial costs in complying with these new rules. The authors compute this result using a 
simulated dynamic stochastic general equilibrium model and show that up to 100,000 jobs could 
disappear in the short run and more than 300,000 in the long run. Another study by Bauer et al. 
(2013) uses a computable general equilibrium GTAP model to estimate the economic impact of the 
GDPR. It finds that this law could lead to losses of up to 1.3 percent of the EU’s GDP as a result of a 
reduction of trade between the EU and the rest of the world. 

Goldfarb and Tucker (2012) provide empirical evidence of an adverse link between restrictive data policies and 
innovation. Presenting the results of previous studies on two sectors, namely health 
services and online advertising, they point out that stricter privacy regulations may harm innovative 
activities. Both studies show that there are strong linkages between the 
effective sourcing and use of data and innovation based on open markets. Recent work by Goldfarb 
and Trefler (2018) discusses the potential theoretical implications of restrictive data policies such as 
data localization and strict privacy regulations on innovation and trade, albeit from the perspective 
of AI. The authors make clear that an expanded innovative AI industry in which data flows are an 
important factor would be distorted by restrictive data policies such as data localization.  

This paper combines the two strands of the literature by developing a specific yet expanded data 
policy restrictiveness index based on Ferracane et al. (2018a). It then relates the index to firms’ 
digital innovation activities for a set of East Asian countries for which we specifically have developed 
the data policy index. The index covers various measures related to data activities and is much 
broader in scope than equivalent indexes used in the papers described above. For instance, in 
addition to restrictions on the cross‐border flow and domestic use of data, we now also include 
restrictions related to IPR for digital sectors, intermediate liability and content access for online 
platforms, as well as regulatory policies regarding the telecommunication market. We expect these 
data restrictions to be negatively correlated with the extent to which firms innovate, particularly in 
sectors that are more reliant on data. In assessing this hypothesis, we use the identification strategy 
as developed in Ferracane et al. (2018b).  

 

    3. Empirical Strategy 
This section sets out the empirical strategy. We develop a composite indicator following the works of 
Ferracane et al. (2018a) and Ferracane and van der Marel (2018). In these two works, a data linkage 
variable is developed that interacts their data policy index with an industry‐level measure of 
data‐intensity. In our case, the composite indicator is composed of the index covering data 
restrictions, including the ones related to IPR and telecommunication, which is interacted with a 
variable that measures the extent to which sectors are intensive in the use of software. In our view, 
this latter proxy crudely specifies how much data each sector employs. Some sectors are more 
dependent on data than others, and we expect that data‐intensive sectors are proportionately more 
affected by changes in restrictive data policies. To reflect this consideration, we weight the 
data policy index with our measure of software use that signifies data‐intensity at the industry level.  

In a second step, we present our baseline specification for the regressions, in which we regress different 
firm‐level variables that measure innovation on our composite indicator of data 
restrictiveness. We perform regressions using two types of firm‐level data, namely one at the 
aggregate cross‐country and industry level using the World Bank Enterprise Survey database, in 
addition to three country‐specific firm‐level data sets for a small number of East Asian countries, 
namely Malaysia, Vietnam and China. The results derived from the two types of data are 
complementary, as they provide us with different insights. The World Bank data represent a 
cross‐country assembly of firms across time, which gives us a collective view of the policy choices 
countries made in the region. The country‐specific data sets look within Malaysia, Vietnam and 
China and show whether the policies that are significant drivers of the cross‐country firm results 
are also validated for firms within each of the three countries.  

 

3.1           Data Linkage 
The data linkage index builds on the methodology pioneered by Arnold et al. (2011; 2015). Their 
approach in which the authors create a so‐called services linkage index has been widely used in the 
empirical field. In our case, we develop a data linkage index variable for digital innovation and use 
this composite indicator in our regressions. For each country, we interact the country‐specific data 
policy index with software use as a proxy for the extent to which a sector uses data in its 
production process. This identification strategy relies on the assumption that sectors more reliant on 
the use of software are more affected by data restrictions. This weighted method is a more refined 
way of measuring the impact of restrictive data policies than the unweighted 
approach of simply regressing an innovation outcome variable on our data policy index. 

In doing so, the country‐specific index of restrictive data policies we develop is multiplied with 
sector‐specific data‐intensity proxied by software use for each downstream industry j in country c. 
This is how the data linkage (DL) variable is set up. In this variable, data‐intensities are expressed as 
(D/L) which is measured by the sector’s software use over labor (see below). In equation (1), 
therefore, the term $\varsigma_j$ denotes the software use for each sector j for which data is retrieved from 
the US Census ICT survey. Then in equation (1), the data‐intensities are stated as a ratio over labor, 
called $LAB_j$, that is employed in each downstream sector j. The data for labor is retrieved from the 
US Bureau of Labor Statistics (BLS). As a result, we apply the following formula: 

$$\text{data linkage (DL)}_{cjt} = \ln\left(\frac{\sum \varsigma_j}{LAB_j}\right) \times \text{data policy index}_{ct} \qquad (1)$$


Note that we put the intensity indicators in logs, in line with previous literature on factor intensities. 
This expression of intensities is close to the literature of comparative advantage such as Chor (2011), 
Nunn (2007) and Romalis (2004).2 Finally, in equation (1), the data policy index refers to a country‐
year specific variable measuring restrictive data policy (see Section 3.3), whereas the data on 
software refers to the US‐specific data on software use by industry for one year (see Section 3.2). 
We use US data for a single year to avoid endogeneity issues, which may arise if highly data‐intensive 
sectors with greater digital innovation activities over time push for lower regulatory restrictions 
regarding data in any particular country. The use of this common sector‐specific data‐intensity for 
one country therefore makes the variable more exogenous.  
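
As an illustration of equation (1), the following minimal Python sketch computes the data linkage for a few country, year and sector cells. The function name, the country labels and all numbers are hypothetical and chosen only for illustration; they are not taken from the US Census ICT Survey, the BLS or our data policy index.

```python
import math


def data_linkage(software_spend, labor, policy_index):
    """Return ln(software / labor) * policy index, following equation (1)."""
    if software_spend <= 0 or labor <= 0:
        raise ValueError("software_spend and labor must be positive")
    if not 0.0 <= policy_index <= 1.0:
        raise ValueError("policy_index must lie between 0 and 1")
    return math.log(software_spend / labor) * policy_index


# Hypothetical sector inputs: (software expenditure, employment) for one common year.
sector_inputs = {
    "telecommunication": (500.0, 100.0),
    "furniture": (150.0, 100.0),
}

# Hypothetical country-year values of the data policy index.
policy_index = {
    ("Country A", 2015): 0.8,
    ("Country B", 2015): 0.2,
}

# One data linkage value per (country, year, sector) cell.
linkage = {
    (country, year, sector): data_linkage(software, labor, index)
    for (country, year), index in policy_index.items()
    for sector, (software, labor) in sector_inputs.items()
}

for cell, value in sorted(linkage.items()):
    print(cell, round(value, 3))
```

Because the sector intensities in this sketch are common to all countries, variation in the linkage within a sector comes only from the policy index, which mirrors the identification idea described above.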


                                                            
2 An alternative way of measuring intensities, such as the one used in Arnold et al. (2011; 2015) and Bourlès et 
al. (2013), is to create an indicator of dependency using input‐output matrices, which we have also done in 
Ferracane et al. (2018a) and Ferracane and van der Marel (2018). We use this information as well in our paper 
for the country‐specific case studies. In this way, we have two approaches to data‐intensities, i.e. one based 
on data from surveys and one from accounting data. In our view, however, data‐intensities over labor are a 
more sophisticated way of measuring data‐intensities, particularly regarding innovation. 

3.2     Data Intensities 
For our measure of data‐intensity as defined in equation (1), we use information on software use 
from the 2011 US Census ICT Survey. These data are survey‐based and record at detailed 4‐digit 
NAICS sector‐level how much each industry and services sector spends on inputs from the ICT‐sector 
in terms of ICT equipment and types of computer software in million USD.  

We take computer software expenditure to compute data‐intensity. The ICT Survey records two 
separate variables on software expenditure, namely capitalized and non‐capitalized. Non‐capitalized 
computer software expenditure is comprised of purchases and payroll for developing software and 
software licensing and service/maintenance agreements for software. Capitalized computer 
software expenditures cover capital expenditures on equipment and software itself. Although this 
proxy does not entirely capture the extent to which sectors use electronic data, it 
is nonetheless the closest data‐use variable that is publicly available. Note, however, that 
data‐based innovation inside firms relies on software, which provides a good reason to use 
software as a proxy. We take the year 2010 for our regressions, divide this software expenditure 
by labor, also for 2010, and use the result for our data linkage variable.  

Admittedly, this proxy for data‐intensity is not ideal. Currently there are no data on the extent to 
which data are used by sectors. There are only some rough estimates of how much data are used by 
countries, such as those recorded by Cisco or TeleGeography, but even these sources provide data for only a 
handful of observations. That said, it is clear that the transmission of data for 
innovation within and across borders over the internet is performed using software technologies. 
Software is needed to develop digital innovations in their simplest form, and data are transmitted with 
the help of software. In addition, more technologically advanced transmissions of data over the internet 
rely on cloud computing technologies, which are themselves a form of software. 
Hence, despite not entirely capturing how much data are really being used in sectors, the 
intensity of each sector’s use of software is in our view the best available proxy. 

Figure 1 provides an overview of the data‐intensities for each sector calculated on the basis of 
non‐capitalized expenditures of software.3 The data to construct these intensities are downloaded at 
various digit levels in NAICS, given that the US Census records this information at mixed levels 
between 2‐digit and 4‐digit. All data are re‐concorded into the ISIC Rev 3.1 2‐digit level. Employment 
data are from the US Bureau of Labor Statistics, are given at the 6‐digit level and are also re‐concorded into the ISIC 
Rev 3.1 2‐digit level. We have developed our own concordance matrix at the most disaggregated 
level between the two data sources and then aggregated up to the 2‐digit level by taking the simple 
average. The reason for re‐classifying these data points is that our innovation variables are provided 
in 2‐digit ISIC Rev. 3.1. Since the software and labor data are given at different levels of aggregation, 
we first concord all data into ISIC and then compute the intensities.  
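
The concordance and aggregation steps described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the helper name `aggregate_to_isic`, the handful of NAICS-to-ISIC pairs and all expenditure and employment figures are hypothetical and do not reproduce our concordance matrix or the underlying US data.

```python
from collections import defaultdict
from statistics import mean


def aggregate_to_isic(values_by_naics, concordance):
    """Map NAICS-level values to ISIC 2-digit codes and take the simple average."""
    grouped = defaultdict(list)
    for naics_code, value in values_by_naics.items():
        isic_code = concordance.get(naics_code)
        if isic_code is not None:  # NAICS codes without a match are skipped
            grouped[isic_code].append(value)
    return {isic_code: mean(values) for isic_code, values in grouped.items()}


# Illustrative concordance entries only (not the matrix used in the paper).
concordance = {
    "5171": "64", "5172": "64",      # 4-digit NAICS codes in the software data
    "3371": "36",
    "517110": "64", "517210": "64",  # 6-digit NAICS codes in the employment data
    "337110": "36", "337121": "36",
}

# Hypothetical software expenditure (4-digit NAICS) and employment (6-digit NAICS).
software_by_naics = {"5171": 400.0, "5172": 600.0, "3371": 30.0}
labor_by_naics = {"517110": 80.0, "517210": 120.0, "337110": 50.0, "337121": 70.0}

# Step 1: concord both sources into ISIC Rev. 3.1 2-digit sectors.
software_isic = aggregate_to_isic(software_by_naics, concordance)
labor_isic = aggregate_to_isic(labor_by_naics, concordance)

# Step 2: compute intensities only after both series share the same classification.
data_intensity = {
    isic_code: software_isic[isic_code] / labor_isic[isic_code]
    for isic_code in software_isic
    if isic_code in labor_isic and labor_isic[isic_code] > 0
}

print(data_intensity)
```

Computing the ratio only after both series are in ISIC avoids dividing software figures by employment figures that refer to differently defined sectors.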

Figure 1 ranks the 15 sectors that have the highest data‐intensities based on 
our proxy of software expenditure. Not surprisingly, telecommunication is the sector that shows the 
highest data‐intensity level and is therefore very software‐intense compared to labor. Other sectors with very 
high data‐intensity are computer, and insurance and finance, which is also unsurprising. They 
also use a high amount of software compared to labor. The latter two sectors are more broadly 
considered very technology‐intensive, and the use of internet technologies has increased massively in the 
financial services industry. On the other side of the spectrum (not shown in the figure), sectors such 
as furniture, construction, sale of motor vehicles and wearing apparel are shown to be least 
data‐intense. The middle range of sectors using software intensively is a mix of modern and traditional 
sectors such as transport services and various manufacturing industries such as basic metals.4 

3 Of note, we take that part of non‐capitalized software expenditures which measures how much each industry 
spends on purchases and payroll for developing software, which represents, on average, 47 percent of total 
non‐capitalized software expenditures. The other 53 percent of non‐capitalized software expenditures covers, 
for each industry, software licensing and service/maintenance agreements.  

 

3.3           Data Policy Index for Digital Innovation 
The second term of our data linkage variable is the data policy index, which is based on a 
quantifiable set of country‐specific regulatory policies which are expected to have a restrictive 
impact on digital innovation. These restrictive policies relate to the use and transfer of data, IPR, 
intermediate liability, content access, as well as regulatory policies regarding the telecommunication 
market. We draw on Ferracane et al. (2018a) and ECIPE’s Digital Trade Estimates (DTE) database to 
develop and construct this index.5 The policies used for the analysis are those considered to create a 
regulatory cost burden for firms relying on data for their innovation activities. The criteria for listing 
a certain policy measure as a restriction in the database are the following: (i) it creates a more 
restrictive regime for online versus offline users of data; (ii) it implies a different treatment between 
domestic and foreign users of data; and (iii) it is applied in a manner considered disproportionately 
burdensome to achieve a certain policy objective.  

The data policy index is composed of six categories, each containing a set of policy 
restrictions related to a specific digital policy field: intellectual property rights (IPR), 
cross‐border data flows (CBDF), domestic use and processing of data (DP), intermediate liability (IL), 
content access (CA), and infrastructure and connectivity (INF). In our view, these categories of 
data‐related policies represent the most important policy restrictions to digital innovation that can be 
found in East Asia. As noted, each category comprises various specific restrictions, which are listed 
in Table 1. All restrictions are explained in Ferracane et al. (2018), which also provides further 
information on the motivation for why they form a restriction and discusses the way of scoring their 
level of restrictiveness. The index covers the years 2009‐2019. In addition, the policies in the index 
have been updated with new regulatory measures found in each country.  

To build up the index, each specific policy measure receives a score that varies between 0 
(completely open) and 1 (virtually closed) according to how vast its scope of restrictiveness is. A 
higher score represents a higher level of restrictiveness in data policies. While certain data policies 
can be legitimate and necessary to protect non‐economic objectives such as the privacy of the 
individual or to ensure national security, these policies nevertheless create substantial costs for 
businesses performing data‐related innovation activities and are therefore taken up in our index. 
Starting from the DTE database, the specific policies are aggregated into an index using a detailed 
weighting scheme adapted from Ferracane et al. (2018b), which can be found in the last column of 
Table 1. 

                                                            
4 One noticeable outlier in Figure 1 is the food products sector, which appears to have an extreme 
above‐average data‐intensity. One potential reason is that the US Bureau of Labor Statistics only records employment data for 
10 out of a total of 85 6‐digit NAICS sub‐sectors for this industry that all fall into the 2‐digit ISIC industry of 
food products as measured by the concordance table. Hence, it is very likely that labor is underreported for 
this sector, which as a result increases the data‐intensity given that our measure is expressed as a ratio in 
which labor forms the denominator. Therefore, in the regressions we exclude the sector of food products, 
although including this sector does not significantly alter the main results.   

5 The authors have contributed to the development of the database at ECIPE. The data set comprises 64 
economies and is publicly available on the website of the ECIPE at the link: www.ecipe.org/dte/database. 

More specifically, each category of data policy restriction is weighted in the full index. In addition, 
within each category, the specific policy restrictions are weighted against each other. In most 
cases, the policy restrictions receive equal weights within their respective category, as can also be 
seen in Table 1. For the categories, IPR and CBDF each receive a weight of 0.25 and 
therefore together account for half of the overall index. The other half is covered by the four 
remaining categories: DP and CA each receive a weight of 0.15, and IL 
and INF are each assigned a weight of 0.1. Note that on some occasions a new specific 
policy restriction is included, such as whether a country has a data protection law in place, which 
was not taken up in Ferracane et al. (2018). Annex A of Ferracane et al. (2018b) provides further 
detailed information on the weights, scoring and description of the policy measures. 
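
The two‐level aggregation just described can be sketched in Python as follows. The category weights are those stated above; the function names and the measure scores for the example country are hypothetical, and the sketch assumes equal weights within each category unless others are supplied, which holds in most but not all cases in Table 1.

```python
import math

# Category weights as stated in the text; they sum to one.
CATEGORY_WEIGHTS = {
    "IPR": 0.25, "CBDF": 0.25,
    "DP": 0.15, "CA": 0.15,
    "IL": 0.10, "INF": 0.10,
}


def category_score(measure_scores, measure_weights=None):
    """Weighted average of measure scores (each in [0, 1]) within one category."""
    if measure_weights is None:
        # Assumption: equal weights within the category.
        measure_weights = [1.0 / len(measure_scores)] * len(measure_scores)
    if not math.isclose(sum(measure_weights), 1.0):
        raise ValueError("measure weights must sum to one")
    if any(not 0.0 <= score <= 1.0 for score in measure_scores):
        raise ValueError("measure scores must lie between 0 and 1")
    return sum(w * s for w, s in zip(measure_weights, measure_scores))


def data_policy_index(scores_by_category):
    """Aggregate category scores into the overall index in [0, 1]."""
    return sum(
        CATEGORY_WEIGHTS[name] * category_score(scores_by_category[name])
        for name in CATEGORY_WEIGHTS
    )


# Hypothetical measure scores for one country (not taken from Table 1 or Table 2).
example_country = {
    "IPR": [1.0, 0.5],
    "CBDF": [1.0, 1.0, 0.0],
    "DP": [0.5],
    "CA": [0.0, 0.0],
    "IL": [1.0],
    "INF": [0.5, 0.5],
}

print(round(data_policy_index(example_country), 4))
```

Since the category weights sum to one and every measure score lies between 0 and 1, the resulting index is bounded by 0 and 1 by construction.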

After applying our weighting scheme, the data policy index varies between 0 (completely open) and 
1 (virtually closed). The higher the index, the stricter the data policies implemented in the countries. 
Table 2 presents an overview of the final index for each East Asian country and shows how each 
category of restrictions contributes to the final index score. China is the most 
restricted country, with a score of 0.91. In large part, this is caused by the high level of policy restrictiveness 
in the categories of IPR and CBDF. Vietnam follows with a score of 0.82, and Thailand and Indonesia 
share third place with a score of 0.64 each. The least 
restricted country is Hong Kong SAR, China, which has a score of 0.09 and is therefore almost completely 
open. It shows only some minor restrictions related to IPR and intermediate liability. Japan is the 
second least restricted country with a score of 0.20. Together, the set of East Asian countries allows 
for substantial variability in our data policy index, as illustrated in Figure 2 and Figure 4.  

Figure 3 shows how the full index of data policy restrictiveness has evolved between 2009 and
2019. The line is computed as the weighted average of the 15 East Asian countries covered by the
index, with their respective GDP used as weights. The reason for doing so is that an unbiased
trend of restrictiveness for the entire region requires each country's restrictions to be weighted
by its economic size. A small economy such as Vietnam might be very restricted, but compared to
China or Indonesia it has a much smaller economic impact in the region. Treating all countries
equally would therefore give a distorted picture of the aggregate level of restrictiveness for the
entire area. As one can see, there is a clear upward trend, reflecting the fact that data policies
in the East Asian region have become more restrictive over time.
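
The difference between a GDP-weighted and an unweighted regional average can be sketched as follows. The country labels, index values and GDP figures are hypothetical and serve only to show the mechanics; they are not taken from the paper's data.

```python
def gdp_weighted_average(index_by_country, gdp_by_country):
    """Average the index across countries using GDP as weights."""
    total_gdp = sum(gdp_by_country[c] for c in index_by_country)
    if total_gdp <= 0:
        raise ValueError("total GDP must be positive")
    weighted_sum = sum(index_by_country[c] * gdp_by_country[c] for c in index_by_country)
    return weighted_sum / total_gdp


# Hypothetical values for a single year: one large restrictive economy (A),
# one small restrictive economy (B) and one mid-sized open economy (C).
index_by_country = {"A": 0.9, "B": 0.8, "C": 0.1}
gdp_by_country = {"A": 100.0, "B": 2.0, "C": 50.0}

weighted = gdp_weighted_average(index_by_country, gdp_by_country)
unweighted = sum(index_by_country.values()) / len(index_by_country)
print(round(weighted, 3), round(unweighted, 3))
```

In this toy example the small restrictive economy B moves the unweighted mean as much as the large economy A does, but has almost no influence on the GDP-weighted figure.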

 
3.4     Descriptive Analysis 
Before turning to the econometric assessment using firm‐level data of innovation, we first provide 
some descriptive analysis of our data policy index and show how it relates to existing variables of 
innovation that are computed at the level of country and sector.  

We first do so by taking one of the firm-level innovation variables from the World Bank Enterprise
Survey and averaging this binary information by country and sector. Admittedly, this is
problematic, as the variable is originally dichotomous and the average depends heavily on the
number of firms included in the sample. Nonetheless, the analysis is worthwhile in order to
obtain a first impression of the direction that the econometric correlations may take. We
undertake this analysis for all sectors, but for now focus on the computer and related services
sector, given that our interest lies in the responsiveness of firms in a sector that is
data-intensive. Figure 1 showed that the computer and related services sector is an intense user
of software.

 

 
Figure 4 shows a negative relationship when we plot our data policy index of restrictiveness for
each East Asian country against our preferred variable of innovation in the computer and related
services sector from the Enterprise Survey database. The figure uses the average of the h5
variable, but an equally sharp negative correlation appears for the other innovation variables
selected from the same database. Countries with higher levels of data restrictiveness appear to
have lower innovation activity in the computer services sector. China is an interesting outlier in
Figure 4. The country is very restricted in data as measured by our data policy index, but at the
same time exhibits a very high degree of firm-level innovation in computer services. This is
hardly surprising given China’s fast-moving activities in the digital field, but the figure also
shows that the country is clearly an exception in the region.6

Another interesting correlation is illustrated in Figure 5, which plots a country-specific
variable against our data policy index. We take a standard variable that measures how much East
Asian countries import digital services as a share of their total commercial services imports. In
this case too, we see a tight negative correlation between the two variables. This suggests that
countries that are more restricted regarding data policies exhibit a lower share of digital
services imports. Although this services trade variable is not taken up in our econometric
analysis, it nonetheless points to the close link between digital innovation and open markets. The
services trade variable measures digital services imports delivered over the internet, such as
software, whilst the data policy index captures restrictive trade policies that target digital
technologies such as the internet, data and online platforms.

However, in order to formally assess whether across the entire economy of each East Asian country 
firms in data‐intensive sectors are truly affected in their innovation activities as a result of higher 
data restrictiveness, the identification strategy takes into account the extent to which each sector is 
data‐intense by employing a sector’s software use as a proxy, as explained above.  

 
3.5           Baseline Regressions 
As previously said, to measure whether the data policy index has any meaningful relation with
innovation activities at the level of the firm in East Asia, we employ two regression approaches.

The first approach takes a cross-country dimension in which we perform regressions for the 10 East
Asian countries for which we have data. As a next step, we select a number of countries for which
we have firm-level data from a national source and evaluate whether our cross-country outcomes are
consistent with these country-specific regressions. There are two main reasons why we undertake
this two-step approach. One is that the cross-country exercise tells us something about the
differences between countries over time, whereas the country-specific exercise puts more emphasis
on the development of the policy restrictions as such and therefore allows for more specific
policy advice.7 Also, the two sources of data record different innovation variables, which

                                                            
6
   Furthermore, Table A1 in Annex A shows that China’s high innovation activities are not caused by an
exceptionally high number of firms recorded in the Enterprise Survey database. Aggregating the
firm-level innovation variables into an average by country and sector therefore does not drive
China’s extreme position.
7
   Moreover, there are also technical reasons for why we exploit these two dimensions of data. One is that the 
source for the cross‐country approach reports years with intervals, is survey‐based and is unbalanced, whereas 
the country‐specific data are more complete and census‐based. Although one could argue that the latter 
approach is more meaningful to analyze, exploiting the two approaches is useful because of the reasons 

 
therefore provides us with further insight into which specific aspects of firms' innovation are
affected by data restrictions.

We start with the cross-country approach. The data linkage index defined in equation (1) enters
our baseline regression, which is specified in equation (2) below. Equation (2) measures the
correlation over time between the data linkage index as described above and several variables of
innovation (see below) measured at the level of the firm. Hence, we regress our variables of
firm-level innovation, recorded for each firm f in country c, sector j and year t, on the data
linkage (DL) index, which itself is specified at the country-sector-year level. As a result, the
baseline specification for our regressions takes the following form:

 

$$\mathrm{INNO}_{fcjt} = \alpha + \beta \, \mathrm{DL}_{cjt} + \delta_c + \gamma_j + \zeta_t + \varepsilon_{fcjt} \qquad (2)$$

  
In equation (2), the vector INNO consists of four firm-level innovation variables across our
selected group of East Asian countries. These variables are: (1) whether the firm has introduced a
new product or service over the last 3 years; (2) whether the firm has introduced a new process
over the last 3 years; (3) whether the firm uses technology that is licensed from a foreign
company; and (4) whether the firm has spent on new R&D (excluding market research) in the last 3
years. The four variables are indicated by h1, h5, e6 and h8 respectively, consistent with the
labeling of the World Bank Enterprise Survey database from which the data are sourced. Note that
these firm-level data are cross-sections for years between 2009 and 2018, with intervals, and as
such do not record data for the same firm each year. Tables A1, A2 and A3 and Figure A1 in Annex A
provide an overview of the cumulative firm distribution of the four innovation variables and give
summary statistics by country and sector.

Note that our dependent variables only allow for binary responses. The Enterprise Survey database
reports these answers with a simple Yes or No. We have transformed the variables into dummies, so
that INNO ∈ {0,1} and a non-linear estimator is in principle appropriate. We therefore estimate a
Probit model. However, we first estimate a linear probability model (LPM) with fixed effects
before moving to the Probit regression, as the former provides additional information about the
direction in which the Probit results are most likely to go.8 Moreover, for our three country
cases only LPM regressions can be performed, and thus for reasons of consistency we report both
types of result. We estimate our Probit model with a conditional (fixed-effects) logistic
regression because of the inclusion of our various dimensions of fixed effects.
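
To make the distinction between the two estimators concrete, the following standard textbook expressions (not taken from the paper) contrast the response probability and the marginal effect under each model. Here $x$ stands generically for the regressors, including DL and the fixed effects, $\beta$ for their coefficients, and $\Phi$ and $\phi$ for the standard normal distribution and density functions; this notation is an assumption made for illustration.

```latex
\begin{align*}
\text{LPM:} \quad & \Pr(\mathrm{INNO}=1 \mid x) = x'\beta,
  & \frac{\partial \Pr(\mathrm{INNO}=1 \mid x)}{\partial x_k} &= \beta_k \\
\text{Probit:} \quad & \Pr(\mathrm{INNO}=1 \mid x) = \Phi(x'\beta),
  & \frac{\partial \Pr(\mathrm{INNO}=1 \mid x)}{\partial x_k} &= \phi(x'\beta)\,\beta_k
\end{align*}
```

The LPM marginal effect is constant, whereas the Probit marginal effect depends on the point at which it is evaluated, which is why coefficient sizes from the two models are not directly comparable.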

As described above, our DL variable is defined at country‐sector‐year level following equation (1) 
and therefore varies over all three dimensions. Although we have data for our data policy index up 
until 2019, we can only include up to 2018 as the Enterprise Survey data do not go any further. 
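
Since equation (1) is not repeated here, the sketch below assumes the simplest reading of the data linkage variable: the country-year data policy index multiplied by a sector-level software intensity (software use relative to labor). Any further transformation used in the actual equation (1) is not captured, and all names and values are hypothetical.

```python
def data_linkage(policy_index, software_intensity):
    """Interact a country-year policy index with a sector-level intensity."""
    return policy_index * software_intensity


# Hypothetical inputs: policy index by (country, year), intensity by sector.
policy_index = {("X", 2015): 0.50, ("X", 2018): 0.60}
software_intensity = {"computer services": 0.40, "textiles": 0.05}

# The resulting variable varies by country, sector and year.
dl = {
    (country, sector, year): data_linkage(index, intensity)
    for (country, year), index in policy_index.items()
    for sector, intensity in software_intensity.items()
}

for key in sorted(dl):
    print(key, round(dl[key], 3))
```

Under this construction, the same tightening of data policy translates into a larger increase in DL for software-intensive sectors than for sectors that use little software.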
                                                            
provided above and because it effectively tests for two different kinds of variations. Moreover, the country‐
specific data source reports different types of innovation variables which are used as our dependent variable.  
8
   There are, however, problems with the LPM. One of the main issues is that the LPM does not estimate the
structural parameters of a non-linear model (Horrace and Oaxaca, 2006). If the Conditional Expectation
Function (CEF), that is, the expected value of a random variable conditional on the covariates, is linear,
then even an LPM regression gives the CEF. Instead, if the CEF is non-linear, the otherwise standard approach
of using Probit approximates the CEF, in which case the LPM does not give any meaningful marginal effects.
However, given that we do not know whether the model is truly non-linear, both LPM and Probit are useful.

 
Equation (2) also includes fixed effects by country ($\delta_c$), sector ($\gamma_j$) and time
($\zeta_t$), respectively. Note that although our dependent variable is given at the firm level,
we cannot include firm-level fixed effects because of the repeated cross-sectional nature of the
Enterprise Survey data set, which makes it impossible to follow the same firm over time. Finally,
$\varepsilon_{fcjt}$ is the error term; for the LPM regressions, standard errors are clustered by
country-sector. For our Probit regression, we are unable to cluster, but the data are grouped by
sector.

Our second approach uses country-specific firm-level data. We have firm-level data sets from
Malaysia, Vietnam and China. The three data sets differ in variable coverage, which means that the
innovation indicators are not consistent across them even though all three report companies’
balance sheet information. Data are available for the manufacturing sector only.

In the regression specification presented below in equation (3), the innovation variables are
again summarized in a vector called INNO, in which the dependent innovation variables are dummies
as well. Hence, INNO ∈ {0,1} in equation (3). The empirical setup is largely similar to equation
(2), with only some minor differences. One is that our data usage indicator as defined in equation
(1) needs to be adjusted, as we do not observe software use and labor in any of the three
countries. Data on these two variables are hard to find for any of the East Asian countries. The
second is that the regression equation is specified for one country only, so that policy changes
over time are the focus (as opposed to policy differences across countries in the cross-section
analysis). In order to analyze this latter aspect in more detail, our DL measure is now lagged by
2 years when possible and by 1 year otherwise.
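
One possible reading of this lag rule, assumed here purely for illustration, is to take the value of DL two years before the observation year when that value exists and to fall back to the one-year lag otherwise. The function name and the series below are hypothetical.

```python
def lagged_value(series_by_year, year, preferred_lag=2, fallback_lag=1):
    """Return (value, lag_used) for the first available lag, else (None, None)."""
    for lag in (preferred_lag, fallback_lag):
        if year - lag in series_by_year:
            return series_by_year[year - lag], lag
    return None, None


# Hypothetical DL series for one sector.
dl_by_year = {2009: 0.10, 2010: 0.12, 2011: 0.15}

print(lagged_value(dl_by_year, 2012))  # two-year lag is available
print(lagged_value(dl_by_year, 2010))  # only the one-year lag is available
print(lagged_value(dl_by_year, 2009))  # no lagged value exists
```

Observations for which no lagged value exists drop out of the estimation sample, which is why a lagged structure shortens the usable time span.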

In all, the baseline regression equation for the three countries looks as follows:  

 

$$\mathrm{INNO}_{fjt} = \alpha + \beta \, \mathrm{DL}_{jt} + \delta_f + \gamma_j + \zeta_t + \varepsilon_{fjt} \qquad (3)$$

 

where INNO comprises the innovation variables recorded in each country-specific data set for
Malaysia, Vietnam and China. The DL term has the same structure as in equation (2), where the data
policy index is interacted with the data/software intensity. In our three country cases, however,
given the lack of data on software intensity in the region, we instead measure data usage as a
proportion of total input use. We use national input-output (IO) matrices to compute the
proportion of ICT services in total input use for each sector in the three countries. The national
IO tables are taken from the World Bank and are reported at the 2-digit ISIC Rev. 4 level. IO
tables are available for each country and therefore represent a consistent source. For each
regression, we take input coefficients at the domestic level (i.e. excluding imports) and for a
year that falls at the beginning or in the middle of the time period of analysis.
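
The ICT-services input share can be sketched as follows for a single using sector. The input values are hypothetical, and treating ISIC Rev. 4 divisions 61, 62 and 63 as the ICT-services supplying sectors is an assumption made for this example; the text does not list the exact divisions used.

```python
# Assumed ICT-services supplying divisions (ISIC Rev. 4): telecommunications,
# computer programming and information service activities.
ICT_SERVICE_DIVISIONS = {"61", "62", "63"}


def ict_input_share(domestic_inputs, ict_divisions=ICT_SERVICE_DIVISIONS):
    """Share of ICT-services inputs in a sector's total domestic input use."""
    total = sum(domestic_inputs.values())
    if total <= 0:
        raise ValueError("total input use must be positive")
    ict = sum(v for division, v in domestic_inputs.items() if division in ict_divisions)
    return ict / total


# Hypothetical domestic intermediate inputs of one manufacturing sector,
# keyed by the 2-digit division of the supplying sector (imports excluded).
inputs_of_sector = {"20": 50.0, "35": 30.0, "61": 6.0, "62": 10.0, "63": 4.0}
share = ict_input_share(inputs_of_sector)
print(round(share, 2))
```

This share then takes the place of the software-over-labor ratio when constructing the DL term for the country-specific regressions.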

Further, the terms $\delta_f$, $\gamma_j$ and $\zeta_t$ are the firm, sector and year fixed
effects, respectively. Because we run three country-specific regressions, we cannot include
country fixed effects. Finally, $\varepsilon_{fjt}$ is the error term; for the LPM regressions,
standard errors are now clustered by sector. Of note, due to technical constraints we cannot run
the Probit model and therefore perform LPM for all three countries.9 In addition, for China we
also perform OLS on several occasions as the type of data allows us to do so.

                                                            
9
  More specifically, when running a Probit model while performing the regressions the combinations of groups 
and observations result in a numeric overflow in the two country cases of Vietnam and Malaysia. This 

 
       4. Results 
This section reports the results of both approaches in turn. The results of the cross-country
regressions are given in Tables 3 and 4, in which the LPM and Probit results are reported
respectively. The country-specific results are provided in subsequent tables.

 

4.1           Cross‐Country Results 
For the LPM regressions in Table 3, results are insignificant in all but one case, meaning that in
most cases no statistically significant correlation is found between the data linkage variable and
firms’ innovation activities. That is, restrictive data policies do not show any meaningful
correlation with the firm’s choice to introduce a new product or service (column 1), to introduce
a new organizational procedure (column 2), or to spend more on R&D (column 4). However, the data
linkage variable is statistically significant in column 3, which shows that restrictive data
policies are negatively and significantly correlated with whether a firm uses a technology that is
licensed from a foreign company.

The results for the Probit regressions reported in Table 4 are consistent with this, in the sense
that the coefficient in column 3 is again estimated with precision. This means that more
restrictive data policies are significantly correlated with a lower likelihood that firms acquire
a technology licensed from a foreign company. Bearing in mind that the marginal impact of changing
the data policy index is not constant in a non-linear model, a one-unit increase in restrictions
as part of the data-linkage variable is associated with a lower probability that firms use
foreign-licensed technology. The other three innovation variables again remain insignificant, even
though the coefficients in columns 1 and 4 are now negative, which was not the case in the LPM
regressions. Note as well that the coefficient sizes increase substantially compared to the LPM
results.

The fact that the variable of foreign-licensed technology is significant may raise suspicion. For
instance, one could suggest that the licensed foreign technology may also include software, which
may therefore be correlated with our data policy measure. This is because the multiplicative
data-linkage term in equation (2) also includes the extent to which each sector uses software.
However, the Enterprise Survey variable description states the survey question as: “Does this
establishment at present use technology licensed from a foreign-owned company, excluding office
software?” Hence, we are assured that no artificial or spurious correlation is being picked up in
our regressions.10 On the contrary, given that the coefficient results are significant, it seems
likely that foreign-licensed technology as part of a firm’s innovation activity is related to a
country’s framework of regulatory policies in data. Moreover, higher levels of data
restrictiveness in countries appear to hamper firm-level innovation in sectors that are more
intensive in using software.

 

 


                                                            
effectively means that the computations required for the estimation exceed the largest representable
number when an attempt is made to calculate the binomial coefficient.
10
    See the Enterprise Surveys Indicator Description, page 112, which can be found here:  
https://www.enterprisesurveys.org/content/dam/enterprisesurveys/documents/methodology/Indicator‐
Descriptions.pdf.  

 
4.2     Country‐Specific Results 
This section presents the regression findings for the country‐specific cases of Malaysia, Vietnam and 
China. Further details of the survey questions and variables covered in the regressions, as well as 
some summary statistics for each of the specific country data sets for Malaysia, Vietnam and China, 
are provided in Annexes B, C and D, respectively.  

 

4.2.1  Malaysia  
For Malaysia, we have data on the extent to which each firm has purchased, used and produced
intangible capital such as patents, goodwill and work in progress (including imports of both new
and used intangible assets), and on the amount of R&D spending of each firm. Used assets are
purchases of assets previously used in Malaysia, including those reconditioned or modified before
acquisition in the country. Purchased assets are newly bought assets and, finally, produced assets
are assets produced by the establishment in Malaysia for its own use. The data only span the years
after 2008 because they use the MSIC 2008 classification, which neatly corresponds to ISIC Rev. 4.
However, this leaves us with only two years for the analysis, namely 2010 and 2015, which is
demanding if we apply year fixed effects. All variables are transformed into binary form, so that
values greater than 0 are assigned a 1, and 0 otherwise.

Before turning to the regression results, Figure 7 provides a descriptive examination of the main
variables used for the empirical specification. The graph plots the IO coefficients of
ICT-services inputs for Malaysia on the horizontal axis against a composite indicator of all four
firm-level innovation variables from our Malaysian data set, which we call the Innovation Score
and which summarizes the INNO term. The Innovation Score is computed as
$\frac{1}{N}\sum_{n=1}^{N} \mathrm{INNO}_n$, where N is the total number of questions. The
Innovation Score is averaged by sector and year. In this graph, the fitted values line excludes
the sectors Coke & Petroleum and Other manuf. & Repair, as they appear to be extreme outliers.
(Note that the two sectors are also excluded from the regressions.) An upward-sloping correlation
is visible, in the sense that more ICT-services-intensive sectors have a higher value on our
Innovation Score. The regressions will show whether the index of data policy restrictiveness in
Malaysia, as shown in Figure 8, has any role to play.
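
The construction of the Innovation Score can be sketched as follows, assuming it is the simple mean of a firm's N binary innovation indicators, subsequently averaged by sector and year. The firm records, sector labels and function names are hypothetical.

```python
from collections import defaultdict


def to_binary(value):
    """Assign 1 to values greater than 0, and 0 otherwise (including missing)."""
    return 1 if value is not None and value > 0 else 0


def innovation_score(binary_indicators):
    """Mean of a firm's binary innovation indicators."""
    return sum(binary_indicators) / len(binary_indicators)


def average_score_by_sector_year(firms):
    """Average the firm-level score within each (sector, year) cell."""
    cells = defaultdict(list)
    for firm in firms:
        indicators = [to_binary(v) for v in firm["values"]]
        cells[(firm["sector"], firm["year"])].append(innovation_score(indicators))
    return {cell: sum(scores) / len(scores) for cell, scores in cells.items()}


# Hypothetical firms with four raw variables each: R&D spending and
# purchased, used and produced intangible assets.
firms = [
    {"sector": "A", "year": 2010, "values": [120.0, 0.0, 35.0, 0.0]},
    {"sector": "A", "year": 2010, "values": [0.0, 0.0, 0.0, 0.0]},
    {"sector": "B", "year": 2015, "values": [5.0, 1.0, 1.0, 1.0]},
]
scores = average_score_by_sector_year(firms)
print(scores)
```

Each sector-year average can then be plotted against the corresponding ICT-services input coefficient.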

Results are reported in Table 5. The regression coefficient for R&D expenditures in column 1 is
positive and significant, which is somewhat counterintuitive. This unexpected result could be seen
as a reaction by firms, which perform more R&D as a consequence of restricted access to the
foreign markets that are otherwise essential for their digital innovation activities. In column 3,
the coefficient is negative and significant. It indicates that for firms in data-intense sectors
(proxied by their share of ICT-services inputs), higher levels of data policy restrictions are
associated with a lower use of intangible assets as part of their
production. The variables for purchased and produced intangible assets, in columns 2 and 4
respectively, have negative coefficient signs but are statistically insignificant.

 

4.2.2  Vietnam  
In the case of Vietnam, we have a different set of variables, although the first variable overlaps
with the Malaysian data: both report whether the firm performs R&D activities. The second
innovation variable measures more precisely whether the firm’s R&D activities are targeted


 
at an innovation that is new to the market or the world, in which case the variable takes the
value 1. If the innovation is only new to the firm, the observation receives a score of 0. The
next two variables, also provided in the Vietnamese data set, measure whether the firm has any
national or international patents. Finally, the last variable measures whether the firm undertakes
a research collaboration in any format. All variables cover the years 2010-2013, but due to our
lagged structure only three years can be included.

Figure 9 first provides an overview of the extent to which the IO coefficients and the Innovation
Score for the five firm-level variables are correlated. The Innovation Score for Vietnam is
computed in a similar way as for Malaysia, and the IO coefficients are from the Vietnamese IO
tables. An upward-sloping fitted values line is plotted, indicating that, on the whole, a positive
association exists between the two variables. (Note, however, that for similar reasons as
described above, the Coke & Petroleum sector as well as the Paper & Printing sector are excluded
when plotting the fitted values. The two sectors are also excluded from our regressions.) The
sector of Chemicals & Pharmaceuticals has a much higher Innovation Score and also a high share of
ICT-services inputs in its total domestic input use. On the other hand, a sector like Food &
Beverages reports much lower levels on both indicators. Figure 10 provides an overview of the
development of the data policy index for Vietnam.

Results of the regressions are reported in Table 6. In almost all columns the results are
statistically insignificant, with positive coefficients. The only variable that is negative and
significant at the 5 percent level is whether firms target innovations that are new to the market
or world, in column 2. Interestingly, the R&D variable in column 1 is also positive in this case,
as in the case of Malaysia. When applying a 1-year lag, this result becomes significant at the 10
percent level, which is also the case for the research collaboration variable in column 5.

 

4.2.3  China  
For China, we have different data, which are not survey-based. Data on innovation for China are
generally extremely hard to obtain. We are therefore forced to use data from the Thomson Reuters
database, which records information on private and public companies whose headquarters are in
China. Only two recorded variables seem relevant for our research purpose: net intangible assets
and R&D expenditure (both in USD). The years span a longer time period, which covers the entire
duration of our data policy index, namely 2009-2019.11 Figure 11 provides an overview of the
development of our restrictiveness index for China. As one can see, little variation can be
detected for the country, given that the level of restrictiveness is extremely high throughout
the entire period.

Figure 10 shows how the variable of R&D expenditures when divided by the number of employees 
for each firm is correlated with our ICT‐services input coefficient. (Note again that the Coke & 
Petroleum sector is excluded.) As one can see, the correlation is positive and tight and shows that 
sectors intensive in the use of ICT services as part of their overall input structure have higher per 
capita firm‐level expenditures on R&D. For the regressions, because we perform an LPM we 

                                                            
11
    The Thomson Reuters data are reported by fiscal year, which may not entirely overlap with the calendar
year of our restrictiveness index. Moreover, the database labels the years “Fiscal Year 0”, “Fiscal Year -1”,
etc. In order to assign a calendar year to each fiscal year, we assume that the first reporting year of the
Thomson Reuters database, which is Fiscal Year 0, refers to 2019. Usually the end of the fiscal year falls in
the middle of the calendar year, although this may vary by firm or country.

 
transform our variables into binary form, with a 1 when firms report positive values for R&D
expenditures and intangibles, and a 0 when firms do not report any values. We then also use the
size of R&D expenditures as well as per capita expenditures in our regressions and perform OLS to
see if these results provide any further evidence.

Results are reported in Table 7. The first two columns show the results from the LPM regressions
for R&D expenditures and net intangible assets, respectively. Only the coefficient on net
intangible assets is negative and significant. This suggests that firms active in
ICT-services-intensive sectors report lower levels of intangible assets when faced with higher
levels of data restrictions. For R&D expenditures, the result is not significant and does not have
the expected negative sign; instead, the sign is positive, consistent with the results for
Malaysia and Vietnam. However, when performing standard OLS regressions using similar variables,
the results in column 3 show that in this case R&D expenditures have a negative and significant
sign. Yet the intangible assets variable remains insignificant. The per capita variables in
columns 5 and 6 do not show significant outcomes either when performing OLS.

         
    5. Conclusion 
Given the importance of open markets for firms to successfully innovate with data, policy 
restrictions on data, IPR, platforms and the telecom market are likely to have a knock‐on impact on 
the digital innovation success of firms. Indeed, this paper finds that restrictive policies for a set of 10 
East Asian countries regarding data, online platforms and other data‐related areas, are negatively 
associated with the likelihood of firms to perform innovation, which appears to be particularly true 
for sectors using a high amount of software. Therefore, less restrictive policies regarding data, IPR, 
platforms and telecom do matter for firms to successfully innovate in the digital economy.  

Using firm‐level data for 10 East Asian countries as well as using firm‐level data sets for three specific 
countries in the East Asian region, this paper in particular finds that for countries with a more 
restrictive set of data policies, firms are less likely to use foreign technologies through licensing as 
part of their innovation activities. Moreover, the three country‐specific cases show that restrictive 
data policies are negatively associated with firms’ likelihood to use intangible assets such as patents 
and goodwill for performing innovation (in the case of Malaysia and China) and to develop 
innovations as a result of R&D that are new to the market (in the case of Vietnam). For all cases, we 
therefore conclude that open digital markets free from unnecessary and restrictive policies for data 
and data‐related areas are likely to help firms to innovate.  

Even though this paper only shows correlations, nothing suggests that causal relationships are
unlikely to be present in the region as well. However, one should of course be careful with such a
conclusion. If anything, this paper has shown that closed markets regarding data and other
data-related technologies are unlikely to contribute to successful innovations in more digital
sectors. Moreover, it is telling that the restrictions picked up by our index also have a
significant bearing on firms using intangible assets as part of their innovation process. Many
countries around the world, including in East Asia, are currently undergoing significant
transformations from a tangible economy based on goods and commodities towards one that is
increasingly based on intangibles such as services, data and ideas. It is therefore of utmost
importance that countries develop a friendly policy environment in which firms can capitalize on
these new economic developments, while taking into account the various legitimate non-economic
objectives that may exist in countries.

                                   

 
Bibliography 
Arnold, J., B. Javorcik and A. Mattoo (2011) “The Productivity Effects of Services Liberalization: 
Evidence from the Czech Republic”, Journal of International Economics, Vol. 85, No. 1, pages 136‐
146.  

Arnold, J., B. Javorcik, M. Lipscomb and A. Mattoo (2015) “Services Reform and Manufacturing 
Performance: Evidence from India”, The Economic Journal, Vol. 126, Issue 590, pages 1‐39.  

Bauer, M., F. Erixon, H. Lee‐Makiyama, M. Krol (2013) “The Economic Importance of Getting Data 
Protection Right: Protecting Privacy, Transmitting Data, Moving Commerce”, Washington DC: US 
Chamber of Commerce. 

Bourlès, R., G. Cette, J. Lopez, J. Mairesse and G. Nicoletti (2013) “Do Product Market Regulations in 
Upstream Sectors Curb Productivity Growth? Panel Data Evidence for OECD Countries”, The Review 
of Economics and Statistics, Vol. 95, No. 5, pages 1750‐1768.  

Christensen, L., A. Colciago, F. Etro and G. Rafert (2013) “The Impact of the Data Protection 
Regulation in the EU”. Intertic Policy Paper, Intertic.  

Ferracane, M.F. (2017) “Restrictions on Cross‐Border Data Flows: A Taxonomy”, ECIPE Working Paper 
No. 1/2018, European Centre for International Political Economy, Brussels: ECIPE. 

Ferracane, M.F. and E. van der Marel (2018) “Do Data Flows Restrictions Inhibit Trade in Services?”, 
ECIPE DTE Working Paper Series No. 2, Brussels: ECIPE. 

Ferracane, M.F., H. Lee‐Makiyama and E. van der Marel (2018b) “Digital Trade Restrictiveness 
Index”, European Centre for International Political Economy, Brussels: ECIPE.  

Ferracane, M.F., J. Kren and E. van der Marel (2018a) “Do Data Policy Restrictions Impact the 
Productivity Performance of Firms and Industries?”, ECIPE DTE Working Paper Series No. 1, Brussels: 
ECIPE.  

Goldfarb, A. and C. Tucker (2012) “Privacy and Innovation”, in Innovation Policy and the Economy, 
J. Lerner and S. Stern (eds.), University of Chicago Press, pages 65–89. See also NBER 
Working Paper Series No. 17124, National Bureau of Economic Research, Cambridge MA: NBER.  

Guellec, D. and C. Paunov (2018) “Innovation Policies in the Digital Age”, OECD Science, Technology 
and Industry Policy Papers, No. 59, OECD Publishing, Paris. 

Horrace, W. and R. Oaxaca (2006) “Results on the Bias and Inconsistency of Ordinary Least Squares 
for the Linear Probability Model”, Economics Letters, Vol. 90, No. 3, pages 321‐327.  

Manyika, J., S. Lund, J. Bughin, J. Woetzel, K. and D. Dhingra (2016) “Digital Globalization: The New 
Era of Global Flows”, McKinsey Global Institute, Washington DC: McKinsey and Company. 

OECD (2019) “East Asia Going Digital: Connecting SMEs”, OECD, Paris, www.oecd.org/going‐
digital/East‐asia‐connecting‐SMEs.pdf. 

van der Marel, E., H. Lee‐Makiyama, M. Bauer and B. Verschelde (2016) “A Methodology to Estimate 
the Costs of Data Regulation”, International Economics, Vol. 146, Issue 2, pages 12‐39.   

 

                                  

 
Tables and Figures 
 
Table 1: Categories of the data policy index and weights 

    Categories  Type of measures                                                      Weighting 
  1  Intellectual Property Rights (IPR)                                                  0.25 
       1.1     Restrictions related to the application process                           0.20 
       1.2     Lack of clear copyright exceptions for the digital economy                0.20 
       1.3     Inadequate enforcement of copyrights                                      0.20 
       1.4     Mandatory disclosure of business trade secrets                            0.20 
       1.5     Mandatory encryption standards that deviate from int. standards           0.20 
  2  Cross‐border data flows (CBDF)                                                      0.25 
       2.1     Ban to transfer or local processing requirement                           0.25 
       2.2     Local storage requirement                                                 0.25 
       2.3     Conditional flow regime                                                   0.25 
       2.4     Infrastructure requirement (residency requirements)                       0.25 
  3  Domestic use and processing of data (DP)                                            0.15 
       3.1     Minimum / maximum period                                                  0.25 
       3.2     Data protection law in place                                             0.375 
       3.3     Impact assessment (DPIA) or Appoint a data protection officer (DPO)      0.125 
       3.4     Government access to personal data collected                              0.25 
  4  Intermediate liability (IL)                                                         0.10 
       4.1     Safe harbor for intermediaries                                            0.60 
       4.2     Identity / monitoring requirements                                        0.40 
  5  Content access (CA)                                                                 0.15 
       5.1     Blocking or filtering practices                                           0.40 
       5.2     Discriminatory use of license schemes & Bans on cloud services            0.40 
       5.3     Other restrictions                                                        0.20 
  6  Infrastructure & Connectivity (INF)                                                 0.10 
       6.1     Maximum foreign equity share for investment in telecom                    0.50 
       6.2     Anticompetitive practices in telecom                                      0.50 
Source: Authors, using Ferracane et al. (2018). 
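
The two-level weighting in Table 1 can be illustrated with a minimal sketch. It assumes that each sub-measure is scored between 0 and 1, that a category score is the weighted sum of its sub-measures, and that the overall index is the weighted sum of the category scores. The function names and the example country are hypothetical; only the weights come from Table 1.

```python
# Category weights and sub-measure weights copied from Table 1.
WEIGHTS = {
    "IPR":  (0.25, [0.20, 0.20, 0.20, 0.20, 0.20]),
    "CBDF": (0.25, [0.25, 0.25, 0.25, 0.25]),
    "DP":   (0.15, [0.25, 0.375, 0.125, 0.25]),
    "IL":   (0.10, [0.60, 0.40]),
    "CA":   (0.15, [0.40, 0.40, 0.20]),
    "INF":  (0.10, [0.50, 0.50]),
}


def category_contributions(scores):
    """Return each category's weighted contribution to the overall index.

    `scores` maps a category code to a list of sub-measure scores,
    assumed to lie in [0, 1] and ordered as in Table 1.
    """
    contributions = {}
    for category, (category_weight, sub_weights) in WEIGHTS.items():
        sub_scores = scores[category]
        if len(sub_scores) != len(sub_weights):
            raise ValueError(f"{category}: expected {len(sub_weights)} scores")
        within = sum(w * s for w, s in zip(sub_weights, sub_scores))
        contributions[category] = category_weight * within
    return contributions


def overall_index(scores):
    """Sum the weighted category contributions into one index value."""
    return sum(category_contributions(scores).values())


# Hypothetical country: fully restrictive on cross-border data flows, open elsewhere.
example = {category: [0.0] * len(sub) for category, (_, sub) in WEIGHTS.items()}
example["CBDF"] = [1.0, 1.0, 1.0, 1.0]
print(round(overall_index(example), 2))  # 0.25
```

Under these assumptions a category can contribute at most its own weight to the index, so the overall score is bounded between 0 and 1.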

 

                                 




 
Table 2: Data policy index by category of restriction and country. 

    Country             IPR       CBDF         DP          IL         CA        INF      Final index 

  Cambodia             0.10        0.00        0.09      0.06        0.06       0.07         0.38 
  China                0.23        0.25        0.09      0.10        0.15       0.09         0.91 
  Hong Kong SAR, China 0.03        0.00        0.00      0.06        0.00       0.00         0.09 
  Indonesia            0.18        0.20        0.07      0.06        0.08       0.07         0.64 
  Japan                0.05        0.10        0.01      0.00        0.00       0.04         0.20 
  Korea, Rep.          0.05        0.15        0.06      0.04        0.03       0.04         0.37 
  Lao PDR              0.05        0.00        0.06      0.10        0.00       0.05         0.26 
  Malaysia             0.08        0.05        0.04      0.00        0.14       0.05         0.35 
  Mongolia             0.10        0.00        0.03      0.06        0.06       0.03         0.27 
  Myanmar              0.13        0.00        0.09      0.10        0.08       0.08         0.47 
  Philippines          0.06        0.05        0.04      0.00        0.06       0.07         0.27 
  Singapore            0.01        0.10        0.06      0.00        0.12       0.03         0.31 
  Taiwan, China        0.09        0.08        0.02      0.00        0.06       0.07         0.30 
  Thailand             0.10        0.25        0.08      0.03        0.12       0.07         0.64 
  Vietnam              0.13        0.25        0.10      0.10        0.15       0.09         0.82 
Note: The latest year, 2019, is taken. Abbreviations in each column are consistent with Table 1, which 
provides the type of restriction falling into each category: Intellectual Property Rights (IPR); Cross‐
border data flows (CBDF); Domestic use and processing of data (DP); Intermediate liability (IL); 
Content access (CA); Infrastructure & Connectivity (INF). The final column represents the overall 
index score, computed as the sum of all categories (i.e. columns 2‐7). 
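
As a quick check of the note above, the sketch below sums the six category columns for a few rows copied from Table 2 and compares the result with the reported final index. The helper name is hypothetical. Gaps of a hundredth or two are presumably due to the category columns being rounded to two decimals.

```python
# Rows copied from Table 2: (IPR, CBDF, DP, IL, CA, INF) and the reported final index.
TABLE_2 = {
    "China":    ((0.23, 0.25, 0.09, 0.10, 0.15, 0.09), 0.91),
    "Japan":    ((0.05, 0.10, 0.01, 0.00, 0.00, 0.04), 0.20),
    "Malaysia": ((0.08, 0.05, 0.04, 0.00, 0.14, 0.05), 0.35),
    "Vietnam":  ((0.13, 0.25, 0.10, 0.10, 0.15, 0.09), 0.82),
}


def gaps(table):
    """Difference between the summed category scores and the reported index."""
    return {
        country: round(sum(parts) - final, 2)
        for country, (parts, final) in table.items()
    }


for country, gap in gaps(TABLE_2).items():
    print(country, gap)
```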

                                  




 
Figure 1: Data‐intensities using US Census software expenditures over labor by sector (2010) 


[Bar chart. Vertical axis: (D/L) data-intensity, US Census & BLS, non-capitalized software 
expenditures over labour (ISIC Rev 3.1), scale 0 to 3. Bar values for 15 sectors, in descending 
order: 2.7, 2.2, 1.8, 1.6, 1.5, 1.4, 1.2, 0.8, 0.7, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3.] 
Source: US Bureau of Labor Statistics and US Census Bureau. 

 

Figure 2: Data policy index by country and type (2019) 


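[Stacked bar chart titled “Data policy restrictiveness index for digital innovation”. Vertical axis: 
Restrictiveness index (0-1). Countries, in the order shown: CHN, VNM, IDN, THA, MMR, KHM, KOR, MYS, 
SGP, TWN, PHL, MNG, LAO, JPN, HKG. Stacked components: CA, CBDF, DP, IL, INF, IPR.] 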
Source: Authors, using Ferracane and van der Marel (2018). Note: The latest year, 2019, is taken. 
Abbreviations in the legend are consistent with Table 1, which provides the type of restriction 
falling into each category: Intellectual Property Rights (IPR); Cross‐border data flows (CBDF); 
Domestic use and processing of data (DP); Intermediate liability (IL); Content access (CA); 
Infrastructure & Connectivity (INF). 

 

 
Figure 3: Level of data policy restrictiveness over time. 


[Line chart titled “Level of data policy restrictiveness (weighted)”. Vertical axis: Restrictiveness 
index (0-1). Horizontal axis: year, 2008 to 2018.] 
Source: ECIPE. Note: Countries are the ones covered in Figure 2. The average is weighted by GDP 
in order to reflect the size of each market (some markets, such as China, are huge, whereas 
others, such as Hong Kong SAR, China, are small). Checks using population as weights show a 
similar, albeit smaller, increasing trend. 
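
The GDP weighting described in the note can be sketched as follows. The index values are taken from Table 2, while the GDP figures are hypothetical round numbers chosen only to illustrate how a large market pulls the weighted average towards its own score; the function name is likewise an assumption.

```python
def weighted_average(values, weights):
    """Average of `values`, with each entry weighted by `weights` (e.g. GDP)."""
    total_weight = sum(weights[key] for key in values)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(values[key] * weights[key] for key in values) / total_weight


# Index values from Table 2 (2019).
index_2019 = {"China": 0.91, "Hong Kong SAR, China": 0.09, "Japan": 0.20}

# Hypothetical GDP levels in arbitrary common units (illustration only).
gdp = {"China": 14.0, "Hong Kong SAR, China": 0.4, "Japan": 5.0}

simple = sum(index_2019.values()) / len(index_2019)
weighted = weighted_average(index_2019, gdp)
print(round(simple, 2), round(weighted, 2))
```

Swapping `gdp` for a population dictionary gives the alternative weighting mentioned in the note.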

                                                          




 
Figure 4: Level of data policy restrictiveness over time, by country 



[Panel of line charts, one per country: Cambodia, China, Hong Kong, Indonesia, Japan, Korea, Laos, 
Malaysia, Mongolia, Myanmar, Philippines, Singapore, Taiwan, Thailand, Vietnam. Vertical axis: 
Restrictiveness index (0-1). Horizontal axis: Year, 2010 to 2020.] 
Source: ECIPE 

                                                                        


 
Figure 5: Level of data policy restrictiveness and innovation score in digital services (2018) 

[Scatter plot titled “Data policy index & Digital innovation”. Vertical axis: Indicator of 
innovation in digital sector (0 to .5). Horizontal axis: Restrictiveness index (0-1). Labelled 
points: CHN, MNG, PHL, MYS, MMR, IDN, LAO, KHM, VNM, THA.] 
Source: ECIPE and World Bank Enterprise Survey Database. Note: The latest year, 2019, is taken. The 
indicator of innovation represents the “h5” question in the World Bank Enterprise Survey Database, 
which asks whether the establishment introduced a new or significantly improved process, or new 
organizational or management structures, over the last 3 years. For this figure, only the 
sectors of Computer and related services; Publishing, printing and recorded media; and Post and 
Telecommunication (ISIC Rev. 4) have been selected and averaged by country. The trend line is 
plotted excluding China.  

 

 

                                                                                   

                                                                




 
Figure 6: Level of data policy restrictiveness and ICT‐services imports (2018) 

[Scatter plot titled “Data policy index & ICT-services imports”. Vertical axis: Computer services 
(% of commercial service imports), 20 to 60. Horizontal axis: Restrictiveness index (0-1). Labelled 
points: JPN, SGP, KOR, MNG, THA, HKG, IDN, PHL, CHN, KHM.] 
Source: ECIPE and World Bank World Development Indicators. Note: Digital services are measured as 
computer, communications and other services, which include such activities as international 
telecommunications, and postal and courier services; computer data; news‐related service 
transactions between residents and non‐residents; construction services; royalties and license fees; 
miscellaneous business, professional, and technical services; and personal, cultural, and 
recreational services. The trend line is plotted with China included.  

 

 

                                                                              




 
Table 3: LPM results for regressions as correlations  

                               (1)            (2)            (3)            (4) 
                               h1             h5             e6             h8 

  Index * ln(D/L)             0.016          0.025         -0.041**        0.003 
                             (0.310)        (0.215)        (0.025)        (0.881) 
  Constant                   0.265***       0.432***       0.135***       0.188*** 
                             (0.000)        (0.000)        (0.000)        (0.000) 

  Observations                8775           7694           7654           7773 
  Adjusted R2                 0.142          0.140          0.033          0.143 
  Within R2                   0.000          0.000          0.001          0.000 
  RMSE                        0.397          0.454          0.380          0.359 
Note: * p<0.10; ** p<0.05; *** p<0.01. Values in parentheses are p‐values, not standard errors. The 
dependent variable h1 indicates whether a new product or service has been introduced over the last 
3 years (Yes = 1, No = 0). The variable h5 indicates whether the firm has introduced a new process 
over the last 3 years (Yes = 1, No = 0). The variable e6 indicates whether the firm has used 
technology licensed from a foreign company (Yes = 1, No = 0). The variable h8 indicates whether the 
firm has spent on new R&D (excluding market research) in the last 3 years (Yes = 1, No = 0). The 
term (D/L) is non‐capitalized computer software expenditures over labor. Fixed effects by country, 
sector and year are applied separately. Robust standard errors are clustered by country‐sector. 
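
To make the reading of Table 3 concrete, the sketch below builds the interaction regressor Index * ln(D/L) and maps the reported p-values to the significance stars defined in the note. The function names are hypothetical. The worked regressor value combines Vietnam's index from Table 2 with the largest bar in Figure 1 purely as an illustration.

```python
import math


def interaction(index, d_over_l):
    """Country-level data policy index times the log of sector data-intensity."""
    return index * math.log(d_over_l)


def stars(p_value):
    """Significance stars using the thresholds in the table note."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# Coefficients and p-values on Index * ln(D/L), copied from Table 3.
TABLE_3 = {
    "h1": (0.016, 0.310),
    "h5": (0.025, 0.215),
    "e6": (-0.041, 0.025),
    "h8": (0.003, 0.881),
}

for outcome, (coefficient, p_value) in TABLE_3.items():
    print(f"{outcome}: {coefficient:+.3f}{stars(p_value)}")

# Illustrative regressor value: index 0.82 (Vietnam, Table 2), D/L = 2.7 (Figure 1).
x = interaction(0.82, 2.7)
print(round(x, 3))
```

Because the regressor uses the logarithm of D/L, sectors with a data-intensity below 1 enter with a negative sign and a sector with D/L equal to 1 contributes zero.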
 

 

 

Table 4: Probit estimates for regressions as correlations 

                               (1)            (2)            (3)            (4) 
                               h1             h5             e6             h8 

  Index * ln(D/L)            -0.014          0.121         -0.336***      -0.122 
                             (0.887)        (0.283)        (0.001)        (0.297) 

  Observations                9988           8855           9276           8933 
  LR chi2(10)               1193.96        1072.52         215.74         947.12 
  No. groups                   32             32             24             12 
  Log likelihood            -4690.1        -5207.8        -4045.7        -3475.1 
Note: * p<0.10; ** p<0.05; *** p<0.01. Values in parentheses are p‐values, not standard errors. The 
dependent variable h1 indicates whether a new product or service has been introduced over the last 
3 years (Yes = 1, No = 0). The variable h5 indicates whether the firm has introduced a new process 
over the last 3 years (Yes = 1, No = 0). The variable e6 indicates whether the firm has used 
technology licensed from a foreign company (Yes = 1, No = 0). The variable h8 indicates whether the 
firm has spent on new R&D (excluding market research) in the last 3 years (Yes = 1, No = 0). The 
term (D/L) is non‐capitalized computer software expenditures over labor. Fixed effects by country, 
sector and year are applied separately. Data are grouped by sector. 
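
Tables 3 and 4 use a linear probability model and a probit, respectively. The sketch below contrasts how the two functional forms turn a regressor value into a predicted probability. It ignores the fixed effects, uses the e6 slopes from Tables 3 and 4 and the LPM constant from Table 3, and assumes a hypothetical probit intercept of -1.0 (Table 4 reports none) and a hypothetical grid of regressor values.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lpm_probability(intercept, slope, x):
    """Linear probability model: linear in x, not bounded to [0, 1]."""
    return intercept + slope * x


def probit_probability(intercept, slope, x):
    """Probit: the linear index is passed through the normal CDF."""
    return normal_cdf(intercept + slope * x)


LPM_CONSTANT, LPM_SLOPE = 0.135, -0.041   # Table 3, column (3)
PROBIT_SLOPE = -0.336                      # Table 4, column (3)
PROBIT_INTERCEPT = -1.0                    # hypothetical, not reported in Table 4

for x in (-1.0, 0.0, 1.0, 5.0):            # hypothetical regressor values
    lpm = lpm_probability(LPM_CONSTANT, LPM_SLOPE, x)
    probit = probit_probability(PROBIT_INTERCEPT, PROBIT_SLOPE, x)
    print(f"x={x:+.1f}  LPM={lpm:+.3f}  probit={probit:.3f}")
```

For extreme regressor values the linear fit can leave the unit interval while the probit prediction cannot, which is one reason to report both estimators.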


 
Figure 7: Correlation between Innovation Score and IO Coefficient of ICT‐services (2015), Malaysia 

[Scatter plot titled “Innovation Score and IO Coefficient”. Vertical axis: Innovation Score (.02 to 
.1). Horizontal axis: IO Coefficients (Domestic), 0 to .005. Labelled sectors: Computer & Electr.; 
Coke & Petroleum; Chemicals & Pharm.; Motor vehicles; Rubber & Plastics; Other transport; Electrical 
equipment; Basic metals; Machinery and equip.; Non-metallic mineral; Food & Beverages; Wood products; 
Paper & Printing; Metal products; Other manuf. & Repair; Textiles & Wearing.] 
Source: World Bank. Note: The IO coefficients are computed as the fraction of ICT‐services usage in 
total input use for each industry in Malaysia, using IO tables. Innovation represents a composite 
indicator, varying between 0 and 1, of all innovation variables from the Malaysian firm‐level dataset 
(see text for further explanations). The fitted‐values line excludes Coke & Petroleum and Other 
manuf. & Repair.  
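
The IO coefficient defined in the note is a simple share. A minimal sketch, using hypothetical input purchases for a single industry and a hypothetical function name:

```python
def io_coefficient(input_use, ict_key="ICT services"):
    """Share of ICT-services inputs in an industry's total input use."""
    total = sum(input_use.values())
    if total <= 0:
        raise ValueError("total input use must be positive")
    return input_use.get(ict_key, 0.0) / total


# Hypothetical input purchases of one industry, in a common currency unit.
inputs = {
    "ICT services": 3.0,
    "Energy": 400.0,
    "Materials": 500.0,
    "Other services": 97.0,
}

coefficient = io_coefficient(inputs)
print(coefficient)  # 3 / 1000 = 0.003
```

A value of 0.003 would sit in the middle of the horizontal axis range of Figure 7.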

Figure 8: Malaysia’s level of data policy restrictiveness (2010‐2015) 


[Line chart titled “Level of data innovation restrictiveness”. Vertical axis: Data innovation 
restrictiveness (0 to .4). Horizontal axis: year, 2010 to 2015.] 
Source: ECIPE  

 
Table 5: LPM estimates for regressions as correlations for Malaysia 

                               (1)            (2)            (3)            (4) 
                               R&D        Intg. Assets   Intg. Assets   Intg. Assets 
                                             Purch.          Used           Prod. 

  Index * (D/T)               1.296**       -0.754         -0.090**       -0.017 
                             (0.043)        (0.177)        (0.045)        (0.572) 
  Constant                   -0.006          0.126***       0.011***       0.005** 
                             (0.897)        (0.009)        (0.004)        (0.043) 

  Observations                39206          39206          39206          39206 
  Adjusted R2                 0.476          0.245          0.067          0.064 
  Within R2                   0.005          0.002          0.000          0.000 
  RMSE                        0.216          0.214          0.059          0.060 
Note: * p<0.10; ** p<0.05; *** p<0.01. Values in parentheses are p‐values, not standard errors. The
term (D/T) is computed as the proportion of ICT‐services inputs in total input expenditure for each
sector. R&D is expenditure on research and development activities. Intg. Assets Purch. is new
purchases of other assets (such as patents, goodwill, work in progress), including imports of both new
and used assets. Intg. Assets Used is purchases of other assets (such as patents, goodwill, work in
progress) previously used in Malaysia, including those reconditioned or modified before acquisition.
Intg. Assets Prod. is other assets (such as patents, goodwill, work in progress) produced by the
establishment for its own use, recorded as the cost of all work done during the year. All underlying
amounts are in thousands RM. All dependent variables are transformed into binary indicators equal to
1 for any value > 0 and 0 otherwise. Robust standard errors are clustered by sector. Firm, sector,
and year fixed effects are applied, as is a lag of 1 year.
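
The setup described in the note can be sketched in a few lines. This is a minimal illustration and not the authors' estimation code: the helper names `to_binary` and `within_slope`, the toy panel, and every number in it are hypothetical. The sketch assumes a balanced panel, so that firm and year fixed effects can be removed by two‐way demeaning, and it leaves out the sector fixed effects (which firm effects absorb when firms do not change sector), the one‐year lag, and the sector‐clustered standard errors.

```python
from statistics import mean


def to_binary(value):
    """Map any strictly positive amount to 1 and everything else to 0."""
    return 1 if value > 0 else 0


def within_slope(rows):
    """Slope of y on x after removing firm and year fixed effects.

    rows: list of dicts with keys "firm", "year", "y", "x".
    Two-way demeaning is exact only for a balanced panel, so that is enforced.
    """
    firms = sorted({r["firm"] for r in rows})
    years = sorted({r["year"] for r in rows})
    cells = {(r["firm"], r["year"]) for r in rows}
    if not (len(rows) == len(cells) == len(firms) * len(years)):
        raise ValueError("balanced panel required")

    def demean(key):
        grand = mean(r[key] for r in rows)
        by_firm = {f: mean(r[key] for r in rows if r["firm"] == f) for f in firms}
        by_year = {t: mean(r[key] for r in rows if r["year"] == t) for t in years}
        return [r[key] - by_firm[r["firm"]] - by_year[r["year"]] + grand for r in rows]

    y_dm, x_dm = demean("y"), demean("x")
    return sum(x * y for x, y in zip(x_dm, y_dm)) / sum(x * x for x in x_dm)


# Hypothetical inputs: a country-level index by year and a sector-level D/T share.
index_by_year = {2010: 0.2, 2015: 0.3}
dt_by_sector = {"s1": 0.1, "s2": 0.3}
# (firm, sector, year, R&D spending in thousands RM); all values are made up.
raw = [
    ("a", "s1", 2010, 0.0), ("a", "s1", 2015, 12.5),
    ("b", "s1", 2010, 4.0), ("b", "s1", 2015, 6.0),
    ("c", "s2", 2010, 3.0), ("c", "s2", 2015, 0.0),
    ("d", "s2", 2010, 0.0), ("d", "s2", 2015, 0.0),
]
panel = [
    {"firm": f, "year": t, "y": to_binary(v), "x": index_by_year[t] * dt_by_sector[s]}
    for f, s, t, v in raw
]
beta = within_slope(panel)
print(f"LPM coefficient on Index * (D/T): {beta:.3f}")
```

Because the index varies only over time and (D/T) only across sectors, the coefficient is identified from how the binary outcome changes differently in sectors with high and low ICT‐services intensity once the index moves.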

                                       




 
Figure 9: Correlation between Innovation score and IO Coefficient of ICT‐services (2013), Vietnam 

[Figure: scatter plot titled "Innovation Score and IO Coefficient", with a fitted values line. Vertical axis: Innovation Score (.02 to .1); horizontal axis: IO Coefficients (Domestic) (.0005 to .0025). Labeled sectors: Chemicals & Pharm.; Electrical equipment; Motor vehicles; Machinery and equip.; Computer & Electr.; Other transport; Non-metallic mineral; Basic metals; Food & Beverages; Rubber & Plastics; Metal products; Other manuf. & Repair; Textiles & Wearing; Paper & Printing; Wood products.]
                                                                                                                              
Source: World Bank. Note: The IO coefficients are computed as the fraction of ICT‐services usage in 
total input use for each industry in Vietnam using IO tables. The innovation variable in this figure 
is a composite indicator measuring (i) whether the firm undertakes R&D; (ii) whether the R&D 
is new to the market or world; (iii) whether the firm has national and international patents; and (iv) 
whether the firm is involved in any research collaborations. The fitted values line is plotted 
excluding Paper & Printing. Coke & Petroleum is omitted because of a lack of credible data.  
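
The IO coefficient described in the note is an input share and can be computed directly from an input‐output table. The sketch below is illustrative only: `io_coefficient` is a hypothetical helper, and the supplying sectors and amounts in the toy table are invented rather than taken from the Vietnamese IO tables.

```python
def io_coefficient(inputs, ict_key="ICT services"):
    """Share of ICT-services inputs in an industry's total input use."""
    total = sum(inputs.values())
    if total <= 0:
        raise ValueError("total input use must be positive")
    return inputs.get(ict_key, 0.0) / total


# Made-up input use by supplying sector for two industries (any currency unit).
io_table = {
    "Wood products": {"Forestry": 700.0, "Energy": 299.0, "ICT services": 1.0},
    "Chemicals & Pharm.": {"Basic chemicals": 600.0, "Energy": 398.0, "ICT services": 2.0},
}
coefficients = {industry: io_coefficient(row) for industry, row in io_table.items()}
for industry, coef in coefficients.items():
    print(f"{industry}: {coef:.4f}")
```

With these invented inputs the two industries have coefficients of 0.0010 and 0.0020, the same order of magnitude as the horizontal axis of the figure.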

Figure 10: Vietnam’s level of data policy restrictiveness (2010‐2013) 


[Figure: line chart titled "Level of data innovation restrictiveness". Vertical axis: Data innovation restrictiveness (ticks up to .8); horizontal axis: years 2010, 2011, 2012, 2013.]
                                                                                                                                     
Source: ECIPE  


 
Table 6: LPM estimates for regressions as correlations for Vietnam 

                              (1)             (2)                  (3)              (4)                (5) 
                             R&D            R&D new            Patent Nat.      Patent Int.          Collab. 
              
    Index * (D/T)           0.305            ‐3.490**            0.006             0.003              0.072 
                            (0.148)           (0.043)            (0.914)           (0.882)           (0.149) 
    Constant              0.064***          0.763***            0.008***          0.003**           0.006** 
                            (0.000)           (0.000)            (0.006)           (0.024)           (0.034) 
              
 Observations               20462               1123              20473            20473               20392 
 R2A                        0.283              0.485              0.251             0.067              0.172 
 R2W                        0.000              0.007              0.000             0.000              0.000 
 RMSE                       0.230              0.355              0.079             0.053              0.088 
Note: * p<0.10; ** p<0.05; *** p<0.01. Values in parentheses are p‐values, not standard errors. The
term (D/T) is computed as the proportion of ICT‐services inputs in total input expenditure for each
sector. R&D measures whether the firm undertakes any R&D activities (Yes = 1 | No = 0); R&D new
measures whether the firm's R&D activities are targeted at an innovation that is new to the market or
world (Yes = 1 | No = 0); Patent Nat. measures whether the firm has national patents (Yes = 1 |
No = 0); Patent Int. measures whether the firm has international patents (Yes = 1 | No = 0); Collab.
measures whether the firm is involved in any research collaborations (Yes = 1 | No = 0). Robust
standard errors are clustered by sector. Firm, sector, and year fixed effects are applied, as is a
lag of 2 years.

 

 
                                        




 
Figure 11: Correlation between R&D expenses and IO Coefficient of ICT‐services (2015), China 

[Figure: scatter plot titled "R&D per empl. and IO Coefficient". Vertical axis: R&D per employee (.5 to 2); horizontal axis: IO Coefficients (Domestic) (.0005 to .0025). Labeled sectors: Computer & Electr.; Other transport; Rubber & Plastics; Chemicals & Pharm.; Electrical equipment; Non-metallic mineral; Metal products; Other manuf. & Repair; Machinery and equip.; Basic metals; Paper & Printing; Textiles & Wearing; Food & Beverages; Wood products.]
                                                                                                                                                              
Source: World Bank. Note: The IO coefficients are computed as the fraction of ICT‐services usage in 
total input use for each industry in China using IO tables. The innovation variable in this figure is 
R&D expenditure per employee for each firm. Coke & Petroleum is omitted because of a lack of 
credible data.  

Figure 12: China’s level of data policy restrictiveness (2009‐2019) 


[Figure: line chart titled "Level of data innovation restrictiveness". Vertical axis: Data innovation restrictiveness (0 to 1); horizontal axis: years 2009 to 2019, labeled every two years.]
                                                                                                                                                   
Source: ECIPE  



 
Table 7: LPM and OLS estimates for regressions as correlations for China 

                         (1)         (2)          (3)          (4)          (5)          (6)
                         LPM         LPM          OLS          OLS          OLS          OLS
                         R&D        Intg.         R&D         Intg.         R&D         Intg.
                                                                         per empl.    per empl.

    Index * (D/T)       2.946     ‐6.216***   ‐49.178***     ‐2.952       ‐5.062        4.587
                       (0.634)     (0.000)      (0.000)      (0.707)      (0.630)      (0.601)
    Constant            0.444      1.319***    19.302***    16.531***     8.449***     8.549***
                       (0.357)     (0.000)      (0.000)      (0.000)      (0.000)      (0.000)

    Observations        38133       38133        25484        31933        24362        30696
    R2A                 0.684       0.508        0.876        0.822        0.828        0.716
    R2W                 0.000       0.000        0.002        0.000        0.000        0.000
    RMSE                0.264       0.258        0.504        0.684        0.487        0.626
Note: * p<0.10; ** p<0.05; *** p<0.01. Values in parentheses are p‐values, not standard errors. The
term (D/T) is computed as the proportion of ICT‐services inputs in total input expenditure for each
sector. In the first two columns, the dependent variables are transformed into binary variables (0 or
1) and the regressions are estimated using LPM. R&D is research and development expenditures by
firm in USD. Intg. represents the firm's net intangible assets (such as patents, goodwill, work in
progress) in USD. Empl. stands for the number of employees of each firm. R&D Rev Share stands for
R&D expenditures as a share of total revenue for each firm. All dependent variables except R&D Rev
Share are transformed into logs. Robust standard errors are clustered by sector. Firm, sector, and
year fixed effects are applied, as is a lag of 2 years.

 

 
 
 
                                     




 
Annex A: World Bank Enterprise Survey  
 

Table A1: Firm distribution and frequency World Bank Enterprise Survey Database by country 

 Country                                   Freq.                  Percent                   Cum. 
 Cambodia                                    845                     5.30                    5.30 
 China                                     2,700                    16.93                   22.23 
 Indonesia                                 2,763                    17.32                   39.55 
 Lao PDR                                     970                     6.08                   45.63 
 Malaysia                                  1,000                     6.27                   51.90 
 Mongolia                                    722                     4.53                   56.43 
 Myanmar                                   1,239                     7.77                   64.20 
 Philippines                               2,661                    16.68                   80.88 
 Thailand                                  1,000                     6.27                   87.15 
 Vietnam                                   2,049                    12.85                  100.00 
 Total                                    15,949                   100.00                          
Source: World Bank Enterprise Survey Database. Note: Data is over the years 200‐2016 
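
The Percent and Cum. columns follow directly from the frequencies. The short sketch below recomputes them from the Freq. column of Table A1; the variable names are illustrative and nothing beyond the table's own numbers is assumed.

```python
from itertools import accumulate

# Frequencies copied from the Freq. column of Table A1.
freq = {
    "Cambodia": 845, "China": 2700, "Indonesia": 2763, "Lao PDR": 970,
    "Malaysia": 1000, "Mongolia": 722, "Myanmar": 1239,
    "Philippines": 2661, "Thailand": 1000, "Vietnam": 2049,
}
total = sum(freq.values())
running = list(accumulate(freq.values()))  # running sum of frequencies

# Cumulative shares are taken from running frequencies, not from summing
# rounded percentages, so rounding errors do not build up down the column.
table = [
    (country, n, round(100 * n / total, 2), round(100 * cum / total, 2))
    for (country, n), cum in zip(freq.items(), running)
]
for country, n, pct, cum_pct in table:
    print(f"{country:<12}{n:>7,}{pct:>8.2f}{cum_pct:>9.2f}")
print(f"{'Total':<12}{total:>7,}{100.0:>8.2f}")
```

The recomputed values agree with the table row by row, and the frequencies sum to the reported total of 15,949 firms.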
 

 

 

 

 

 

 

 

Table A2: Firm distribution and frequency World Bank Enterprise Survey Database by year 

 year                                      Freq.                  Percent                   Cum. 
 2009                                      4,184                    26.23                   26.23 
 2012                                      2,970                    18.62                   44.86 
 2013                                        832                     5.22                   50.07 
 2014                                        632                     3.96                   54.03 
 2015                                      4,651                    29.16                   83.20 
 2016                                      2,348                    14.72                   97.92 
 2018                                        332                     2.08                  100.00 
 Total                                    15,949                   100.00    
Source: World Bank Enterprise Survey Database 
 

                                




 
Table A3: Summary statistics for the variables used (2009‐2016) 

    Variable               Observations         Mean           Std. Dev.         Min           Max 
               
    Cambodia            
    Index * ln(D/L)            324              ‐1.03            0.26            ‐1.64         0.14 
    h1                         316               0.14            0.35            0.00          1.00 
    h5                         300               0.34            0.48            0.00          1.00 
    e6                          86               0.22            0.42            0.00          1.00 
    h8                         306               0.11            0.32            0.00          1.00 
               
    China                                                                                   
    Index * ln(D/L)           2,492             ‐2.16            1.15            ‐4.13         0.97 
    h1                        2,543              0.46            0.50            0.00          1.00 
    h5                        1,535              0.65            0.48            0.00          1.00 
    e6                        1,529              0.24            0.43            0.00          1.00 
    h8                        1,525              0.42            0.49            0.00          1.00 
               
    Indonesia           
    Index * ln(D/L)           2,307             ‐1.16            0.48            ‐2.39         0.56 
    h1                        1,142              0.10            0.30            0.00          1.00 
    h5                        1,139              0.17            0.38            0.00          1.00 
    e6                        1,877              0.21            0.41            0.00          1.00 
    h8                        1,144              0.05            0.22            0.00          1.00 
               
    Malaysia            
    Index * ln(D/L)            834              ‐0.71            0.33            ‐1.29         0.30 
    h1                         830               0.07            0.25            0.00          1.00 
    h5                         827               0.51            0.50            0.00          1.00 
    e6                         416               0.23            0.42            0.00          1.00 
    h8                         831               0.20            0.40            0.00          1.00 
               
    Mongolia            
    Index * ln(D/L)            627              ‐0.79            0.29            ‐1.30         0.28 
    h1                         326               0.26            0.44            0.00          1.00 
    h5                         325               0.42            0.49            0.00          1.00 
    e6                         160               0.18            0.38            0.00          1.00 
    h8                         327               0.19            0.39            0.00          1.00 
 




 
Table A3: Summary statistics for the variables used (2009‐2016), continued 

    Variable               Observations     Mean         Std. Dev.           Min      Max 
               
    Myanmar             
    Index * ln(D/L)           1,022         ‐1.26          0.35              ‐2.16    0.22 
    h1                        1,041          0.17          0.38              0.00     1.00 
    h5                        1,040          0.26          0.44              0.00     1.00 
    e6                         522           0.07          0.25              0.00     1.00 
    h8                        1,019          0.02          0.15              0.00     1.00 
               
    The Philippines     
    Index * ln(D/L)           2,329         ‐0.66          0.30              ‐1.34    0.31 
    h1                        1,157          0.22          0.41              0.00     1.00 
    h5                        1,124          0.44          0.50              0.00     1.00 
    e6                        1,624          0.18          0.39              0.00     1.00 
    h8                        1,143          0.22          0.41              0.00     1.00 
               
    Thailand            
    Index * ln(D/L)            835          ‐0.93          0.33              ‐1.60    0.18 
    h1                         773           0.08          0.26              0.00     1.00 
    h5                         754           0.18          0.39              0.00     1.00 
    e6                         498           0.12          0.33              0.00     1.00 
    h8                         797           0.04          0.18              0.00     1.00 
               
    Vietnam             
    Index * ln(D/L)           1,741         ‐1.34          0.70              ‐3.10    0.27 
    h1                         860           0.24          0.43              0.00     1.00 
    h5                         831           0.44          0.50              0.00     1.00 
    e6                        1,160          0.11          0.31              0.00     1.00 
    h8                         862           0.22          0.41              0.00     1.00 




 
Figure A1: Cumulative firm distribution for innovation variables by sector World Bank Enterprise Survey Database (ISIC Rev 3.1) 


[Figure: cumulative firm distribution (vertical axis from 0.00 to 100.00) for the innovation variables h5, h1, e6, and h8. Horizontal axis: sector codes 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 45 50 51 52 55 60 61 62 63 64 71 72 93 98.]

Source: Authors’ calculations using the World Bank Enterprise Survey. Numbers on the horizontal axis reflect 3‐digit ISIC 3.1 sectors. 



 
Annex B: Malaysian firm‐level data 
Table B1: Selected questions from Malaysian firm‐level survey  

Survey Question 9.12: Research and development expenditure 

        Refers to expenditure incurred on Research and Development (R&D) activities. R&D is the 
        systematic study of new processes, techniques, applications, and products. 

        (a) In‐House: The percentage of expenses incurred by the establishment itself for the 
            purposes of research and development. 
        (b) Outsource: The percentage of expenses paid to other establishments for the purposes of 
            research and development. 

Survey Section C2: Does your establishment have any Intellectual Property (IP) Protection System? 

        Intellectual Property is the exclusive rights provided by law for a certain period of time to 
        the creators of the works to control the use of their work. Intellectual property refers to 
        patents, trademarks (including: brand registered / insured), copyright and related rights and 
        others. 

        1. A patent is an exclusive right granted by the Government for a new invention, whether it 
           is a product or a process. A patent is protected for 20 years from the filing date. 
        2. A trademark may consist of words, logos, pictures, names, letters, numbers or a 
           combination of such elements. It is a marketing tool that allows users to recognize a 
           product and associate it with a particular trader. Also known as a mark, brand, or logo, 
           a trademark is a sign placed on goods or services produced by a manufacturer to identify 
           and distinguish them from goods or services produced by other parties.  
        3. Copyright: in Malaysia, a work is automatically protected when it meets the following 
           conditions: 
            ‐ Sufficient effort has been made to make the work original in nature; 
            ‐ The work has been written, recorded or made in material form, and the creator 
                is a qualified person; 
            ‐ The work was made in Malaysia, or was first published in Malaysia. 
        4. Industrial design is the overall exterior appearance of an item or product. Shape and 
           configuration are its three‐dimensional aspects, while patterns and ornamentation are 
           its two‐dimensional aspects. These two‐ or three‐dimensional features (or both), as 
           they appear on finished goods, must be applied through an industrial process. They 
           give an item or product a unique appearance. 
        5. Geographical indication is an indication which identifies any goods as originating 
           from a country or territory, or a region or a place in that country or territory, where 
           a given quality, reputation, or other characteristic of the goods is essentially 
           attributable to their geographical origin. Geographical indications can be used on 
           natural or agricultural produce or on products of handicraft or industry.  
        6. Layout design of an integrated circuit is the three‐dimensional arrangement of the 
           elements of an integrated circuit and some or all of its interconnections, or such a 
           three‐dimensional arrangement prepared for an integrated circuit intended to be 
           manufactured. The law that protects the layout design of integrated circuits is the 
           Layout‐Designs of Integrated Circuits Act 2000. 

Source: Economic Census Malaysia 2016, Department of Statistics Malaysia. 

 
Table B2: Firm distribution and frequency Malaysian firm‐level data by year 

    Year                                     Freq.                    Percent          Cum. 
    2010                                    34,896                      47.31          47.31 
    2015                                    38,861                      52.69         100.00 
    Total                                   73,757                     100.00    
 

 

 

 

 

 

 

 

 

 

Table B3: Summary statistics for the variables used for Malaysia (2010 & 2015) 

    Variable            Observations        Mean         Std. Dev.           Min     Max 
    Index * (D/T)         73,757            0.09           0.03              0.03    0.14 
    R&D                   73,757            0.07           0.26              0.00    1.00 
    Assets Purch.         73,757            0.04           0.20              0.00    1.00 
    Assets Used           73,757            0.00           0.05              0.00    1.00 
    Assets Prod.          73,757            0.00           0.05              0.00    1.00 
 

                                 




 
Annex C: Vietnam firm‐level data 
 
Table C1: Selected questions from Vietnamese firm‐level survey 

Section E: Technological and innovation capacity  

       Refers to the set of questions concerned with the innovative capacities and the 
       organization of technological progress in the respondent's enterprise. 

       Question 8.3:   Does your enterprise undertake research and development (R&D) activities 
                       in order to develop new technologies? Answers: 1. Yes | 2. No, if no skip to 
                       question 8.4 

                       The R&D activities are targeted at an innovation that is… (Circle the most 
                       suitable answer) Answers: 1. New to the enterprise | 2. New to the market | 
                       3. New to the world 

       Question 8.3:   How many national patents do you hold? Answers: 1. New in 2013 … | 2. 
                       Stock / total (the end of 2013) …  

       Question 8.4:  How many international patents do you hold? Answers: 1. New in 2013 … | 
                      2. Stock / total (the end of 2013) …  

       Question 8.5:   Are you currently involved in any research collaborations? Answers: 1. Yes, 
                       since … (year) | 2. No, skip to section 8.7 

Source: Survey Questionnaire Technology Use in Production, General Statistical Office Vietnam 

                                




 
Table C2: Firm distribution and frequency Vietnam firm‐level data by year 

     Year                                    Freq.                    Percent           Cum. 
     2010                                    7,890                      25.12           25.12 
     2011                                    8,292                      26.40           51.52 
     2012                                    7,577                      24.12           75.65 
     2013                                    7,649                      24.35          100.00 
     Total                                  31,408                     100.00    
 

  

 

 

 

 

 

 

 

Table C3: Summary statistics for the variables used for Vietnam (2010‐2013) 

     Variable           Observations        Mean         Std. Dev.            Min     Max 
     Index * (D/T)        31,397            0.07           0.03               0.03    0.16 
     R&D                  31,381            0.09           0.28               0.00    1.00 
     R&D new               2,700            0.57           0.50               0.00    1.00 
     Patent Nat.          31,397            0.01           0.10               0.00    1.00 
     Patent Int.          31,397            0.00           0.06               0.00    1.00 
     Collab.              31,247            0.01           0.10               0.00    1.00 
 

                                 




 
Annex D: China firm‐level data 
 

Table D1: Selected variables from the Thomson Reuters database 

    Variable  Description 
    R&D       Represents expenses for research and development of new products and services by a 
              company in order to obtain a competitive advantage. Reported in USD. 
               
    Intg.     Represents intangibles, defined as gross intangibles reduced by accumulated 
              intangible amortization and reported in USD. Intangibles consist of patents, 
              copyrights, franchises, goodwill, trademarks, trade names, secret processes, and 
              organization costs.  
 

                                  




 
Table D2: Firm distribution and frequency China firm‐level data by year 

    Year                                    Freq.                  Percent                Cum. 
    2009                                    4,237                     9.09                 9.09 
    2010                                    4,237                     9.09                18.18 
    2011                                    4,237                     9.09                27.27 
    2012                                    4,237                     9.09                36.36 
    2013                                    4,237                     9.09                45.45 
    2014                                    4,237                     9.09                54.55 
    2015                                    4,237                     9.09                63.64 
    2016                                    4,237                     9.09                72.73 
    2017                                    4,237                     9.09                81.82 
    2018                                    4,237                     9.09                90.91 
    2019                                    4,237                     9.09                  100 
    Total                                  46,607                      100    
 

 

 

 

 

 

Table D3: Summary statistics for the variables used for China (2009‐2019) 

    Variable               Observations              Mean     Std. Dev.           Min      Max 
    Index * (D/T)                46,607               0.07         0.03           0.02     0.12 
    R&D (0‐1)                    46,607               0.55         0.50           0.00     1.00 
    Intg. (0‐1)                  46,607               0.79         0.41           0.00     1.00 
    R&D                          25,708              15.45         1.44           2.96    21.53 
    Intg.                        36,592              16.20         1.66           4.97    23.14 
    R&D per empl.                24,560               8.05         1.18          ‐2.45    13.89 
    Intg. per empl.              34,686               8.83         1.21          ‐4.94    16.90 
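
A quick arithmetic check helps in reading Table D3. The note to Table 7 states that the dependent variables are transformed into logs, and a log is defined only for strictly positive values. If so, the share of firm‐year observations with a non‐missing R&D or Intg. value should match the mean of the corresponding 0‐1 indicator. The sketch below uses only figures reported in Table D3. Reading the missing observations as zeros or unreported values is an assumption, and the match is consistent with that reading without proving it.

```python
n_firm_years = 46_607  # observations for the 0-1 indicators in Table D3

reported = {
    # variable: (mean of the 0-1 indicator, observations of the non-binary variable)
    "R&D": (0.55, 25_708),
    "Intg.": (0.79, 36_592),
}
# Share of firm-year observations for which the non-binary variable is observed.
implied = {name: round(obs / n_firm_years, 2) for name, (_, obs) in reported.items()}

for name, (indicator_mean, obs) in reported.items():
    print(
        f"{name}: {obs:,} / {n_firm_years:,} = {implied[name]:.2f} "
        f"(reported indicator mean {indicator_mean:.2f})"
    )
```

Both implied shares round to the reported indicator means (0.55 and 0.79).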
 



