New Approaches to the Identification of Low-Frequency Drivers: An Application to Technology Shocks

This paper addresses the identification of low-frequency macroeconomic shocks, such as technology, in Structural Vector Autoregressions. Whilst identification issues with long-run restricted VARs are well documented, the recent attempts to overcome these issues using the Max-Share approach of Francis et al. (2014) and Barsky and Sims (2011) have their own shortcomings, primarily that they are vulnerable to bias from confounding non-technology shocks. A modification to the Max-Share approach and two further spectral methods are proposed to improve empirical identification. Performance directly hinges on whether these confounding shocks are of high or low frequency. Applied to US and emerging market data, spectral identifications are most robust across specifications, and non-technology shocks appear to be biasing traditional methods of identifying technology shocks. These findings also extend to the SVAR identification of dominant business-cycle shocks, which are shown to be a variance-weighted combination of shocks rather than a single structural driver.


Policy Research Working Paper 9047

Introduction
In this paper we revisit the use of Structural Vector Autoregressions (hereafter, SVAR) to identify shocks with highly persistent impacts, focusing on the specific application to the identification of technology shocks. We examine the recently proposed solutions to address known drawbacks of the traditional long-run restriction by Francis et al. (2014) and Barsky and Sims (2011), and unearth a key weakness of these methodologies that has so far gone unrecognized. We suggest further modifications to the existing methods and point out the features of the data-generating process that should guide the choice of one modification over another.
The use of SVARs to 'look through' cyclical changes in productivity and isolate structural developments, or changes in technology, can be traced back to Blanchard and Quah (1989). In their approach, long-run restrictions are imposed in a structural VAR to separate the effects of temporary 'demand' and permanent 'supply' shocks on GDP. This methodology was later adapted by Galí (1999) to specifically identify technology shocks in a two-variable VAR containing log-differences of productivity and hours worked.[1] Approaches using long-run restrictions are advantageous on a number of grounds, requiring only a parsimonious selection of variables and imposing no restriction on the short-run impacts of technology shocks; there is, however, some controversy associated with the latter.[2] Long-run restrictions have been criticized on two main grounds, one economic and the other econometric: first, it is restrictive to assume that technology is the only shock that can affect productivity in the long run;[3] and second, econometrically, that imposing long-run restrictions on a finite sample leads to biased and inefficient estimates.
It is the second strand of the literature we investigate more deeply in this paper, notably the pitfalls of using alternative medium-run restrictions to identify long-run shocks. The medium-run identification strategy is more robust to estimation on finite samples, identifying technology shocks as those which contribute the most to the forecast error variance decomposition (FEVD) of labor productivity at the 10-year horizon (Francis et al., 2014). We hereafter refer to this approach as the 'Max-Share' identification. This strategy has also been used to identify technology news shocks, which reflect changes in the perception of future long-run technology developments (Barsky and Sims, 2011). Here, 'news' shocks are identified as those that maximize the contribution to the productivity FEVD, but are orthogonal to 'surprise' technology shocks (using a utilization-adjusted technology series to capture these 'surprise' shocks).[4]

An important gap in the literature is how to identify technology shocks in the presence of confounding shocks, a key feature of the data. One significant exception to this is Chari et al. (2008), who find that long-run restrictions are only unbiased when technology shocks dominate as a driver of output and non-technology shocks play a small role in a DSGE setting. In circumstances where non-technology shocks play a material role in driving macroeconomic variables, long-run identifications will not only be biased, but will often show narrow confidence intervals around biased results: they will be 'confidently wrong' about the impact of technology shocks. There is evidence that non-technology shocks are likely to play a larger role in output fluctuations than commonly assumed. A spectral decomposition of the levels and differences of logged US labor productivity (hours-based) demonstrates the presence of a range of high, business, and low frequency shocks (Figure 1). In addition, the literature has found that technology shocks account for between 2 and 63% of output variability (Galí and Rabanal (2005), Christiano et al. (2003), Chari et al. (2008)). The issues raised by Chari et al. (2008) for long-run identifications have yet to be raised for newer alternatives.

[1] Christiano et al. (2004) challenge the inclusion of hours in first differences in Galí (1999), noting that the level of hours worked is likely to be a stationary variable, not requiring differencing.
[2] In a real-business cycle model, hours rise in response to a positive technology shock. In contrast, technology shocks cause total hours to fall in a New Keynesian model with sticky prices. The technology VAR literature is generally agnostic about the short-run impact of technology shocks, imposing only long or medium-run restrictions. However, Dedola and Neri (2007) have estimated a sign-restricted VAR based on impulse-response commonalities between an RBC and a New Keynesian model under a variety of parameterizations.
[3] Several strands of research have suggested other non-technology shocks that may also result in permanent effects on labor productivity. Mertens and Ravn (2013) find that changes in taxation can have long-running effects on productivity, which once controlled for, lead to different dynamics of macroeconomic variables in response to identified technology shocks. Fisher (2006) separates productivity shocks into those that specifically apply to investment goods (IST shocks), and neutral technology shocks that affect aggregate production. In addition, Francis and Ramey (2005) and Uhlig (2004) have debated the extent to which the presence of these additional shocks alters the literature's findings on the impact of technology shocks on hours worked.
We show that the Max-Share approach can be biased by confounding non-technology shocks just as long-run identifications can. We consider three alternative approaches to sharpen the identification of technology shocks in the presence of confounding shocks. The first is a modification to the Max-Share approach, which hinges on the choice of horizon used in estimation. We next propose a spectral identification approach whose rotation is performed in the frequency domain. Finally, we modify the spectral approach by substituting the long-run variance-covariance matrix with one that can be reasonably obtained in small samples.

[4] Ben Zeev and Khan (2015) have extended the news literature to cover IST shocks in addition to neutral technology shocks. They also note that identification based purely on long or medium-run restrictions can conflate technology surprise shocks with changes in expectations of future developments in technology.
When using Monte-Carlo simulations to validate our methodologies, we highlight that a key drawback of this exercise is that few DSGE calibrations contain non-technology shocks with a material impact on the variance of productivity. That is, in order to generate non-trivial confounding shocks, we need to incorporate certain kinds of stochastic processes not usually assumed for these models, but which are present in the data. As an alternative, we employ a simple two-variable model that provides greater transparency around the data-generating process than a DSGE model.
In summary, this paper makes the following contributions to the literature. First, we demonstrate that the Max-Share identification is susceptible to bias from confounding low and high-frequency shocks. Second, we propose three alternative identification methodologies. The first, Non-Accumulated Max-Share (NAMS), shows a lower degree of bias in the presence of low-frequency confounding non-technology shocks. The second and third, labeled the Spectral and Limited Spectral approaches, show less bias in the presence of high and medium-frequency non-technology shocks. Finally, we show that both the Max-Share and our improved alternative identifications are more robust than long-run restrictions to the problems of both confounding non-technology shocks and lag-truncation bias identified by Chari et al. (2008). Therefore, SVAR techniques can still play an important role in testing the predictions of macroeconomic models.
The remainder of the paper proceeds as follows. We first revisit existing technology identification methodologies, drawing attention to newly identified shortcomings of the Max-Share approach before outlining several proposed improvements. We then show the performance of the existing and new identifications in Monte Carlo simulations on DSGE-generated data and on data generated by a simple two-variable model. Finally, we take each specification to data for the United States and several major emerging market and developing economies (EMDEs), finding that there are several sources of contamination in traditional VAR approaches that can be addressed with our new methodologies.

Empirical Approaches
In this section, we briefly revisit the standard long-run and Max-Share identification of a technology shock. The potential for the Max-Share approach to be contaminated by other non-technology shocks is then explained, before we detail three approaches that can reduce contamination, depending on its source.

Long-Run Restrictions
We start with the simple and original approach that first introduced SVARs as a tool to disentangle technology shocks from general macroeconomic fluctuations (Galí, 1999). Isolating the long-run components of labor productivity ($prod_t$) and total hours worked ($hours_t$), labeled $LP^{LR}$ and $Hours^{LR}$, respectively, this methodology imposes the restriction that only the technology shock can impact labor productivity in the long run:

\[
\begin{bmatrix} LP^{LR} \\ Hours^{LR} \end{bmatrix}
=
\begin{bmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} \varepsilon^{tech}_t \\ \varepsilon^{non\text{-}tech}_t \end{bmatrix}
\tag{1}
\]

Assuming the structural AR matrix polynomial $A(L)Y_t = \varepsilon_t$, the long-run counterpart is $A(1) = \sum_j A_j$. In a stationary VAR containing the log-difference series of productivity and hours, the long-run effect of the technology shock on growth will dissipate. The long-run impact of each shock on the level of the target variable can be written as:

\[
C(1) A_0^{-1}, \quad \text{with } C(1) = [I - B(1)]^{-1},
\]

where $B(L)$ is the reduced-form VAR polynomial. Restricting the loading of the non-technology shock onto productivity to be zero can be accomplished by ensuring the long-run impact matrix is lower triangular. This is accomplished by solving for $A_0^{-1}$ as follows:

\[
A_0^{-1} = C(1)^{-1}\, \text{chol}\!\left( C(1)\, \Sigma_u\, C(1)' \right),
\]

where $\Sigma_u$ is the reduced-form variance-covariance matrix.
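As a concrete illustration, the rotation can be computed directly from estimated reduced-form quantities. The minimal numpy sketch below uses a bivariate VAR(1) with made-up coefficients (not estimates from this paper) and recovers $A_0^{-1}$ from a Cholesky factorization of the long-run covariance, so that the long-run impact matrix is lower triangular:

```python
import numpy as np

def long_run_rotation(B1, Sigma_u):
    """Long-run (Blanchard-Quah style) identification for a VAR(1), as a sketch.

    B1      : (n, n) reduced-form autoregressive coefficient matrix
    Sigma_u : (n, n) reduced-form innovation covariance
    Returns A0_inv such that u_t = A0_inv @ eps_t and the long-run
    impact matrix C(1) @ A0_inv is lower triangular.
    """
    n = B1.shape[0]
    C1 = np.linalg.inv(np.eye(n) - B1)       # long-run multiplier C(1)
    lr_cov = C1 @ Sigma_u @ C1.T             # long-run covariance
    chol = np.linalg.cholesky(lr_cov)        # lower-triangular factor
    return np.linalg.inv(C1) @ chol

# illustrative numbers: a persistent first variable, correlated innovations
B1 = np.array([[0.9, 0.0],
               [0.1, 0.5]])
Sigma_u = np.array([[1.0, 0.3],
                    [0.3, 1.0]])
A0_inv = long_run_rotation(B1, Sigma_u)
long_run_impact = np.linalg.inv(np.eye(2) - B1) @ A0_inv
```

By construction, `A0_inv @ A0_inv.T` reproduces the reduced-form covariance, and the (1,2) element of the long-run impact matrix is zero, so only the first (technology) shock moves the first variable in the long run.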

Max-Share
Long-run restrictions have come under fire for short-sample estimation problems and overly restrictive assumptions. The Max-Share identification instead assumes that technology shocks are the predominant driver of productivity around the 10-year horizon. In this identification, the technology shock is that which drives the largest proportion of the forecast error variance of labor productivity at this horizon, as in Francis et al. (2014). Francis et al. (2014) surmise that 10 years is longer than the period over which the business cycle occurs (typically assumed to be 2-8 years), but short enough to reduce challenges related to estimation on a finite sample. They impose this restriction in a VAR containing productivity, hours, consumption and investment as a share of GDP. The forecast error at horizon $k$ can be written:

\[
Y_{t+k} - E_t Y_{t+k} = \sum_{\tau=0}^{k} B_\tau A_0^{-1} \varepsilon_{t+k-\tau}.
\]

By defining an orthonormal matrix $A_0$ with columns $\alpha$, and $e_i$ as a selection vector (size $1 \times n$), we find the shock $j$ which maximizes the contribution to the total forecast error variance of variable $i$ at horizon $k$:

\[
\omega_i(\alpha) = \frac{ e_i' \left( \sum_{\tau=0}^{k} B_\tau A_0^{-1} \alpha \alpha' (A_0^{-1})' B_\tau' \right) e_i }{ e_i' \left( \sum_{\tau=0}^{k} B_\tau \Sigma_u B_\tau' \right) e_i }.
\]

The technology shock at this maximized value is then $\varepsilon^{tech}_t = \alpha^{*\prime} \varepsilon_t$. Following Uhlig (2003), identifying the structural shock that maximizes the contribution to the forecast error variance of productivity is solved by identifying the eigenvector associated with the maximum eigenvalue of $V_\tau$, where $V_\tau$ is the FEVD of the target variable based on reduced-form shocks, and the denominator of $\omega(\alpha)$.
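The eigenvalue computation can be sketched in a few lines of numpy. The MA coefficients below are toy inputs, not estimates from the paper; any factor of $\Sigma_u$ (here, its Cholesky factor) can serve as the initial $A_0^{-1}$:

```python
import numpy as np

def max_share(B_ma, Sigma_u, target=0, k=40):
    """Sketch of the Max-Share identification.

    B_ma    : list of (n, n) reduced-form MA coefficient matrices B_0..B_k
    Sigma_u : reduced-form innovation covariance
    Returns (alpha, V): the unit-length column alpha maximizing the share
    of the k-step forecast error variance of variable `target`, and the
    quadratic-form matrix V of the associated eigenvalue problem.
    """
    A0_inv = np.linalg.cholesky(Sigma_u)     # any factor of Sigma_u works
    e = np.zeros(B_ma[0].shape[0])
    e[target] = 1.0
    V = np.zeros_like(Sigma_u)
    for B in B_ma[:k + 1]:
        row = e @ B @ A0_inv                 # 1 x n impact row per horizon
        V += np.outer(row, row)              # accumulate the FEVD numerator
    vals, vecs = np.linalg.eigh(V)           # eigenvector of the largest
    alpha = vecs[:, np.argmax(vals)]         # eigenvalue maximizes a'Va
    return alpha, V

# illustrative MA coefficients: shock 1 persistent, shock 2 transitory,
# with the target variable loading on both
B_ma = [np.array([[0.95 ** j, 0.4 * 0.3 ** j],
                  [0.0, 0.3 ** j]]) for j in range(41)]
alpha, V = max_share(B_ma, np.eye(2), target=0, k=40)
# with Sigma_u = I, trace(V) equals the target's total forecast error variance
share = (alpha @ V @ alpha) / np.trace(V)
```

In this toy calibration the identified rotation loads almost entirely on the persistent shock, but not exactly: the transitory shock leaks into the rotation, which is precisely the contamination discussed in the next section.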

New Approaches
A drawback of the Max-Share approach is that in addition to capturing the long-run shock of interest it also captures aspects of other shocks in the data. We propose three alternative empirical approaches that offer reduced interference from these confounding shocks in different circumstances.
We first formally demonstrate the confounding nature of other shocks when using the Max-Share identification. We present the key equations of the eigenvalue-eigenvector problem and refer the reader to Appendix Section 9.1 for details of the proof.
Max-Share involves setting up the Lagrangian for $V_\tau$, whose first-order conditions reduce to solving for the eigenvector associated with the largest eigenvalue of $V_\tau$. Consider a simple two-variable VAR where the true structural coefficients use $l$ and $h$ to characterize the first shock as a low-frequency (technology) shock and the second as a high-frequency (business cycle) shock. Given the reduced-form MA coefficient counterparts, $B$, the solution for $\alpha$ takes a general form that depends on the initial impact of the respective shocks on the target variable and on their relative persistence.[5] In an attempt to isolate the low-frequency technology shock, we can clearly see the potential contamination coming from the high-frequency shock. Essentially, the derived shock will not be of a 'pure' form, but rather a combination of shocks, with the ratio dependent on their importance in driving the forecast error variance at the chosen horizon.
As the true form of A 0 is unobserved, the extent of this contamination in empirical applications is also unknown.
In general, there is no closed-form solution to the eigenvalue-eigenvector problem, and the above expression is for an extremely special case where such a solution exists.[6] However, we consider this illustrative, as the more complicated solution will exhibit the features of this restrictive expression.

Non-Accumulated Max-Share (NAMS)
The standard Max-Share approach takes cumulative forecast errors up to time $k$ (see equation 6). As indicated above, and expanded in the Appendix, we show that there may be instances where non-trivial proportions of the forecast error variance are driven by shocks of lower persistence than the shock of interest.
The first methodology we propose will sharpen the identification of shocks with long-run impacts by reducing the weight given to processes that are less persistent than technology. For example, a high-volatility AR(1) process with a small AR coefficient is also a predominantly low-frequency process, but has no meaningful lasting impact at the 10-year horizon. As an alternative to Max-Share, we propose finding the maximum of the square of the impulse response functions at a particular horizon $k$. Here we aim to find the shock which has the maximum effect at, say, period 40, ignoring all previous horizons. At this horizon, it may be expected that the effects of lower-persistence shocks will have dissipated. The NAMS approach is implemented in a similar way to the Max-Share approach, by solving:

\[
\alpha^* = \arg\max_\alpha \; e_i' B_k A_0^{-1} \alpha \alpha' (A_0^{-1})' B_k' e_i, \quad \text{subject to } \alpha'\alpha = 1,
\]

that is, maximizing the single-horizon term rather than the sum over horizons.

Kurmann and Sims (2017) also advocate reducing the impact of less persistent shocks in the Max-Share approach. In the Barsky and Sims (2011) identification, the forecast error variance under consideration is 'double-weighted': the maximization is applied to the summed forecast error variance from periods 1 to $k$ ($\max \sum_{i=0}^{k} \omega_i(\alpha)$, compared to the Francis et al. (2014) approach of $\max \omega_k(\alpha)$). Kurmann and Sims (2017) propose returning to the original identification of Francis et al. (2014), which maximizes a single forecast error variance at horizon $k$, finding that it helps sharpen the identification of technology news shocks. Our NAMS approach takes this to its logical conclusion, further reducing the distortion caused by transitory shocks.[7]

[5] To satisfy the unit length restriction we will need to further normalize this by the length of the eigenvector.
[6] For example, assuming the variance-covariance matrix is commutative at each horizon.
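A minimal numpy illustration of the single-horizon maximization (toy MA coefficients; one volatile but transitory shock and one persistent shock, neither taken from the paper):

```python
import numpy as np

def nams(B_ma, Sigma_u, target=0, k=40):
    """Non-Accumulated Max-Share: pick the rotation maximizing the squared
    impulse response of the target variable at horizon k only (sketch)."""
    A0_inv = np.linalg.cholesky(Sigma_u)
    e = np.zeros(B_ma[0].shape[0])
    e[target] = 1.0
    row = e @ B_ma[k] @ A0_inv          # impact row at horizon k alone
    V_k = np.outer(row, row)            # rank-one single-horizon quadratic form
    vals, vecs = np.linalg.eigh(V_k)
    return vecs[:, np.argmax(vals)]

# shock 1 is volatile but transitory on the target; shock 2 is persistent
B_ma = [np.array([[5 * 0.2 ** j, 0.98 ** j],
                  [0.1 ** j, 0.5 ** j]]) for j in range(41)]
alpha = nams(B_ma, np.eye(2), target=0, k=40)
```

At horizon 40 the transitory shock's impulse response has vanished, so the identified rotation loads almost entirely on the persistent shock; under the cumulative Max-Share criterion the volatile transitory shock would also receive substantial weight through its large early-horizon contributions.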

Spectral Identification
The NAMS approach deals with contamination from low-persistence, low-frequency processes. However, the Max-Share approach may also be contaminated by driving processes that occur at business-cycle and higher frequencies. Where the amplitude of these shocks is (coincidentally) high at the chosen target horizon ($k$), the NAMS approach may also be biased.
We investigate the use of identification in the frequency domain, which can maximize the share of variance explained only at frequencies that are of interest, excluding those that are not.
Identifying technology shocks through restrictions that explain the majority of low-frequency (long-term) volatility of productivity is a novel approach. However, this methodology has in the past been used to assess the types of shocks which drive the business cycle. For example, Angeletos et al. (2018) find that a single shock drives the majority of the variance of a range of macroeconomic variables at business cycle frequencies. DiCecio and Owyang (2010) use the spectral approach to conduct an identification of technology shocks that is similar to what we propose in this article, but do not evaluate its performance.[8] We effectively apply a band-pass filter to the reduced form coefficients of a VAR containing macroeconomic variables, identifying the spectral density of the variables within a particular frequency band.
We then identify the technology shock by maximizing the variance of productivity explained at the desired frequency.
The spectral density of series $Y$ at frequency $\omega$ can be written as a Fourier transform of its auto and cross covariances ($\gamma$):

\[
S_{YY}(\omega) = \frac{1}{2\pi} \sum_{\tau=-\infty}^{\infty} \gamma(\tau) e^{-i\omega\tau}.
\]

[7] Uhlig (2004) suggests an identification in which non-technology shocks will have no effect after 10 years, such that technology shocks can be identified by a restriction that only the technology shock has an impact on productivity at that horizon. Our approach differs in that we do not exclude other shocks from having an effect at this horizon, and instead look for the shock that dominates at 10 years.
[8] In addition, Christiano et al. (2006) find that by using a spectral estimator at frequency 0, long-run restriction estimates prove less biased following an MCMC assessment of VAR performance.
Therefore, once $\gamma(\tau)$ is known, the spectrum, $S_{YY}(\omega)$, can be straightforwardly calculated. The reverse also holds. That is, knowing the spectrum, $S_{YY}(\omega)$, leads to an easy computation of $\gamma(\tau)$ by using the inverse Fourier transform:

\[
\gamma(\tau) = \int_{-\pi}^{\pi} S_{YY}(\omega) e^{i\omega\tau} \, d\omega.
\]

Setting $\tau = 0$ gives the variance of the time series $Y$:

\[
\gamma(0) = \int_{-\pi}^{\pi} S_{YY}(\omega) \, d\omega.
\]
This means that the variance of Y is the sum of the spectrum over all frequencies, −π < ω < π. This further indicates that the spectrum decomposes the variance of Y into components from non-overlapping frequencies. Therefore, similar to the Max-Share identification, spectral analysis allows us to gauge the importance of cycles at different frequencies to the variance of the series of interest. And importantly, we can remove unwanted frequencies from the maximization problem.
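The variance-as-integral identity can be verified numerically. The sketch below uses the closed-form AR(1) spectrum with illustrative parameter values and checks that integrating it over $(-\pi, \pi)$ recovers the closed-form variance $\gamma(0) = \sigma^2/(1-\rho^2)$:

```python
import numpy as np

rho, sigma2 = 0.8, 1.0                  # illustrative AR(1) parameters
omega = np.linspace(-np.pi, np.pi, 200001)
d_omega = omega[1] - omega[0]

# closed-form AR(1) spectral density
S = sigma2 / (2 * np.pi * (1 - 2 * rho * np.cos(omega) + rho ** 2))

# Riemann-sum approximation of the integral of the spectrum
variance_from_spectrum = np.sum(S) * d_omega

# closed-form AR(1) variance, gamma(0)
variance_closed_form = sigma2 / (1 - rho ** 2)
```

On a sufficiently fine grid the two quantities agree to several decimal places, confirming that the spectrum decomposes the variance across frequencies.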
To employ this methodology we first need to uncover a VAR representation of the spectral density of $Y$. We start by writing the Wold representation of the VAR (assuming it is invertible):

\[
Y_t = D(L) u_t, \quad D(L) = \left(I - B_1 L - B_2 L^2 - \ldots - B_p L^p\right)^{-1}.
\]

By post-multiplying $Y_t$ by $Y_{t-\tau}'$ and summing across its lags (of $\tau$ periods), the series of auto and cross covariances is obtained. Then, by writing

\[
D(e^{-i\omega}) = \left(I - B_1 e^{-i\omega} - B_2 e^{-i2\omega} - \ldots - B_p e^{-ip\omega}\right)^{-1},
\]

the spectral density of $Y$ can be written as a function of the reduced-form VAR coefficients:

\[
S_{YY}(\omega) = \frac{1}{2\pi} D(e^{-i\omega})\, \Sigma_u\, D(e^{-i\omega})^{*}.
\]
To assess the spectral density within a frequency band, the spectral power can be summed between $\omega \in [\underline{\omega}, \bar{\omega}]$:

\[
S_{YY}(band) = \int_{\underline{\omega}}^{\bar{\omega}} S_{YY}(\omega) \, d\omega.
\]

As in the case of the Max-Share approach, the shock which maximizes the contribution to the variance of productivity over this band is the eigenvector associated with the largest eigenvalue of the matrix $S_{YY}(band)$. To identify technology, the band of interest is restricted to frequencies below (periods longer than) 10 years, to exclude business-cycle frequencies.
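Putting the pieces together, a hypothetical numpy sketch of the band-restricted maximization follows. The toy diagonal VAR(1), the frequency grid, and the band edges are all illustrative assumptions, not the paper's estimation settings:

```python
import numpy as np

def spectral_row(B_list, A0_inv, e, omega):
    """Row e' D(e^{-i w}) A0_inv: maps structural shocks into the target
    variable's spectrum at frequency omega (AR coefficients B_1..B_p)."""
    n = A0_inv.shape[0]
    A = np.eye(n, dtype=complex)
    for j, B in enumerate(B_list, start=1):
        A -= B * np.exp(-1j * j * omega)
    return e @ np.linalg.inv(A) @ A0_inv

def spectral_shock(B_list, Sigma_u, target=0, min_period=40, n_grid=200):
    """Rotation alpha maximizing the target's variance over frequencies with
    periods longer than `min_period` (sketch of the spectral identification)."""
    A0_inv = np.linalg.cholesky(Sigma_u)
    n = Sigma_u.shape[0]
    e = np.zeros(n)
    e[target] = 1.0
    V = np.zeros((n, n))
    # low-frequency band: (0, 2*pi/min_period]
    for omega in np.linspace(1e-4, 2 * np.pi / min_period, n_grid):
        row = spectral_row(B_list, A0_inv, e, omega)
        V += np.real(np.outer(row.conj(), row))   # Hermitian quadratic form
    vals, vecs = np.linalg.eigh(V)
    return vecs[:, np.argmax(vals)]

# toy VAR(1): variable 1 is dominated by a persistent (low-frequency) process
B1 = np.diag([0.95, 0.1])
alpha = spectral_shock([B1], np.eye(2), target=0)
```

Because only frequencies below the business-cycle band enter the quadratic form, the transitory process contributes essentially nothing, and the identified rotation isolates the persistent shock.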

Limited-Horizon Spectral Identification
One criticism that can be leveled against the spectral identification approach is that the long-run VAR representation used to calculate the spectrum of the endogenous variables may be biased when estimated on a short sample of data. In practice, a 'windowed' selection of autocorrelations is often used to estimate the spectrum of a series. The same principle can be used in the spectral VAR identification process.
For example, the infinite-MA representation, $D(e^{-i\omega})$, can be replaced with a limited-horizon series of impulse response coefficients.
One proposal is to use coefficients extending to 10 years (40 quarters) of data, as in the original Max-Share approach:

\[
\hat{D}(e^{-i\omega}) = \sum_{j=0}^{40} \Psi_j e^{-ij\omega},
\]

where $\Psi_j$ are the reduced-form impulse response (MA) coefficients.
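The truncation can be checked against the exact transfer function in a case where the latter is known. Below, hypothetical diagonal MA coefficients from two AR(1) processes are summed to 40 quarters and compared with the infinite-horizon transfer at a low frequency:

```python
import numpy as np

def truncated_transfer(Psi, omega):
    """Finite-horizon transfer function sum_{j=0}^{K} Psi_j e^{-i j omega},
    built from impulse-response (MA) coefficients Psi_0..Psi_K."""
    D = np.zeros_like(Psi[0], dtype=complex)
    for j, P in enumerate(Psi):
        D += P * np.exp(-1j * j * omega)
    return D

# MA coefficients of two independent AR(1) processes, truncated at 40 quarters
Psi = [np.diag([0.9 ** j, 0.2 ** j]) for j in range(41)]
D40 = truncated_transfer(Psi, omega=0.05)

# exact infinite-horizon transfer of each AR(1) at the same frequency
exact = np.diag([1 / (1 - 0.9 * np.exp(-1j * 0.05)),
                 1 / (1 - 0.2 * np.exp(-1j * 0.05))])
```

The truncation error shrinks as the process persistence falls or the horizon grows; here even the more persistent component (coefficient 0.9) is approximated to within a few percent.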

Model Evaluation
Traditionally, SVARs seeking to extract an unobservable shock, such as technology, have been evaluated on their performance using simulated data from a DSGE model. This is an intuitive evaluation, given that the true underlying shock is known by construction. In this section, we argue that many of these tests have been performed using DSGE models which fail to include the material confounding shocks needed to replicate key features of US macroeconomic data. We later show that existing methodologies perform poorly when these features are included in simulated data from a simple two-variable model that provides greater transparency over the data-generating process compared to a DSGE model.

DSGE Monte Carlo simulations
Each VAR specification is tested on a standard medium-scale New Keynesian model, of the type proposed by Christiano et al. (2005), and used by Francis et al. (2014) and Barsky and Sims (2011) to evaluate their respective implementations of the Max-Share identification.[9] This model contains features such as persistent consumption habits, investment adjustment costs, capital utilization, and partial price and wage indexation.
The model can be written as a planner's problem (full specification available in Appendix Section 9.2). The planner maximizes household utility subject to investment adjustment costs, which are affected by shocks $Z_t$; monetary policy follows an inertial Taylor rule; a government sector accounts for a stochastic proportion of output ($\omega_g$); and total output is a function of costly capital capacity utilization, $u_t$. Four shocks drive the model: technology, investment-adjustment cost, government spending, and monetary policy shocks. The dynamics of the monetary policy shock ($\varepsilon_{i,t}$) are governed by the monetary policy inertia parameter ($\rho_i$).

Monte-Carlo DSGE Results
A standard shock calibration is used to test each VAR specification (Table 1), based on Barsky and Sims (2011). However, we argue that this test may not be reflective of the types of factors that can influence the VAR results when they are applied to real data. The results below are based on 100 simulations of the DSGE model, generating 250 observations after a burn-in of 100 periods. Each VAR is estimated via a Gibbs sampling procedure with flat priors, saving 1,000 draws following a 500-draw burn-in. The same procedure is used for all simulations throughout the document.
A four-variable VAR is estimated using the level of productivity, total hours worked, and the share of investment and consumption in GDP. For the long-run identification, log-differenced productivity is used, as is standard.
Each VAR specification makes a relatively small error relative to the true DSGE IRF of the technology shock to labor productivity, although the long-run specification diverges over time. This can be seen in the bias of the median IRF for each identification relative to the true DSGE IRF (Figure 2). In addition, the shocks uncovered by the VARs have a high correlation with the true underlying technology shock (Table 2). In the case of the long-run identification, given that the technology shock is not a true unit-root process, it is not surprising that it performs poorly.
However, as we will show in subsequent simulations, the long-run specification is also outperformed in a variety of DSGE and non-DSGE settings where technology is a unit-root process. In addition, the NAMS specification has a modestly higher IRF bias than the Max-Share and Spectral specifications. This is in part due to the high persistence of the IRFs of non-technology shocks on labor productivity in the DSGE model.[10] This is a challenge to the NAMS framework, which aims to isolate processes that are still in the data at this horizon, assuming that business cycle factors should have faded at this point. This is one motivation to switch to a simpler and more transparent framework for testing the VARs than by using DSGE models. The high performance of the VARs is likely to be in part due to the shock processes driving the DSGE model. The highly persistent technology shock in this and other similar DSGE models tends to drive the vast majority of the variance of labor productivity. In this calibration, over 99% of the volatility of labor productivity is driven by $\varepsilon_a$, the technology shock.[11] Therefore, several of the methodologies find a similar result given how insignificant the non-technology driving processes are in the variation of productivity. In Francis et al. (2014), the technology shock was a true unit-root process, which dominates the variance of productivity by even more than the persistent AR(1) chosen here. This DSGE framework also does not accurately capture the data generating process embodied in the US data. A spectral decomposition of the US productivity data relative to that produced by the DSGE model shows an important discrepancy with respect to growth rates. DSGE-generated productivity growth rates are dominated by high and medium-frequency shocks, while the US data suggest low-frequency shocks with long-run effects drive an important component of productivity growth (Figure 3).
In addition, the similarities of the DSGE-generated data and the log-level of US productivity do not rule out the presence of large confounding non-technology low-frequency shocks, such as an AR(1) growth process.[12] Arguably, the standard DSGE specification does not adequately 'road-test' the performance of the VARs for real-world situations. In the following sections, we examine VAR performance in the event that larger confounding shocks drive a more material component of the data process than assessed in traditional DSGE models.

An Illustrative Two Variable Data-Generating Process
It is clear that in a DSGE model, driven largely by a single persistent low-frequency shock, the performance of each VAR specification, except the long-run identification, is broadly similar, as judged by their IRF bias and correlation with the underlying productivity shock. Here we examine how the VARs perform when adding additional features to the data generating process. It is useful to strip away the complexity of the dynamics of the data driven by the DSGE model so that we can clearly examine how different data processes can affect the results.
To that end, a simple two-variable model is used to generate the data. In the first instance, we build a model where technology shocks follow a highly persistent AR(1) process that is confounded by low or high-frequency non-technology shocks.
In the next section, we modify the model to allow for the presence of technology shocks with unit roots and additional components that affect the growth rate of productivity.
A simple two-variable data process is generated for labor productivity ($L$) and hours ($N$). Both processes are driven by a technology shock ($\varepsilon_z$) and a business-cycle shock ($\varepsilon_b$). This simple process is calibrated to replicate some of the features of a more complex model, while being more transparent. In the case of a technology shock, labor productivity rises persistently, while hours worked initially fall (as in the New Keynesian framework).
The advantage of this simple setup is that we can change the driving processes of the business cycle shock and easily understand how this changes the properties of the data and hence the estimation performance of the VAR specifications. Later a small modification to this model is made in order to cover the case of unit roots with a stochastic growth process, which appears to be present in the US data.
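Since the paper's exact calibration appears in Table 3, the sketch below is only a stand-in with made-up coefficients, illustrating the qualitative pattern described above: a persistent technology process that raises productivity while hours initially fall, plus a transitory business-cycle process:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 400
rho_z, rho_b = 0.98, 0.6               # made-up persistence values

eps_z = rng.normal(size=T)             # technology shock
eps_b = rng.normal(size=T)             # business-cycle shock

z = np.zeros(T)                        # persistent technology component
b = np.zeros(T)                        # transitory business-cycle component
for t in range(1, T):
    z[t] = rho_z * z[t - 1] + eps_z[t]
    b[t] = rho_b * b[t - 1] + eps_b[t]

L = z + 0.3 * b                        # labor productivity: mostly technology
N = -0.5 * eps_z + b                   # hours fall on impact after technology
```

In simulated data of this kind, productivity comoves positively with the technology shock while hours comove negatively on impact, matching the New Keynesian sign pattern the text describes.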

Motivating the Choice of Stochastic Processes
We choose our shock processes to examine two plausible scenarios in the detection of technology shocks:
1. There are confounding low-frequency but less persistent processes in the data other than technology;
2. There are high or business-cycle frequency processes in the data.
As described earlier, our NAMS approach has been designed to deal with the first case, while the spectral identifications are targeted at the second. Before we turn to the simulations, we briefly describe the specifications that can generate the two described processes. This includes covering the frequency domain properties of AR(1) and AR(2) processes.
Our choice of driving processes is motivated by the following well-known spectral density result. Consider the white noise process $\varepsilon_t$, with variance $\gamma(0) = \sigma^2$ and autocovariance function $\gamma(h) = 0$ for $h \neq 0$.
Therefore, the spectrum is flat:

\[
S(\omega) = \frac{\sigma^2}{2\pi}.
\]

Now consider the AR(1) process $v_t = \rho v_{t-1} + \varepsilon_t$, with autocovariance function $\gamma(h) = \sigma^2 \rho^{|h|}/(1-\rho^2)$. The associated spectrum is

\[
S(\omega) = \frac{\sigma^2}{2\pi} \cdot \frac{1}{1 - 2\rho\cos\omega + \rho^2}.
\]

Notice that when $\rho > 0$ the spectrum is dominated by low-frequency components, while in the case of negative autocorrelation, $\rho < 0$, the spectrum is dominated by high-frequency components. This simple result shows that a specification of a simple AR(1) process for $b$ allows us to generate a confounding low-frequency process.[13] A business-cycle frequency shock requires an AR(2) process.
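A quick numerical check of this property ($\sigma^2 = 1$ and $\rho = \pm 0.8$ are illustrative values): evaluating the AR(1) spectral density near frequency zero and near $\pi$ shows which end of the spectrum dominates for each sign of $\rho$:

```python
import numpy as np

def ar1_spectrum(omega, rho, sigma2=1.0):
    """Spectral density of the AR(1) process v_t = rho * v_{t-1} + eps_t."""
    return sigma2 / (2 * np.pi * (1 - 2 * rho * np.cos(omega) + rho ** 2))

low, high = 0.1, 3.0                  # near zero and near pi, respectively
pos = ar1_spectrum(np.array([low, high]), rho=0.8)    # positive autocorrelation
neg = ar1_spectrum(np.array([low, high]), rho=-0.8)   # negative autocorrelation
```

With $\rho = 0.8$ the spectral density at $\omega = 0.1$ exceeds that at $\omega = 3.0$ by well over an order of magnitude; flipping the sign of $\rho$ reverses the ranking.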
We replicate a sinusoidal business-cycle shock process with a specific frequency ($f$) using the following AR(2) process:

\[
b_t = \rho_{b,1} b_{t-1} + \rho_{b,2} b_{t-2} + \varepsilon_{b,t}, \quad \rho_{b,1} = 2\cos(2\pi f), \quad \rho_{b,2} = -1.
\]

Here we set $f$ such that the shock process for $b$ has a periodicity of 8 quarters (2 years).
In our application, to ensure the cyclical process degrades over time (and avoids a unit root), we multiply both coefficients ($\rho_{b,1}$ and $\rho_{b,2}$) by 0.9. We also calibrate the variance of the shocks to be different for a clear distinction between the driving processes. In both cases, the shock standard deviations are calibrated so that $\varepsilon_z$, the technology shock, explains the majority (just over 50%) of the FEVD at period 40. This allows us to demonstrate that even where the shock of interest is the dominant shock, the application of the Max-Share identification will still result in biased estimates.

[13] A negatively-signed AR coefficient would allow us to include a confounding high-frequency shock. For a more detailed discussion of the data-generating processes behind a range of spectral densities, see Medel (2014).
The two calibrations are shown in Table 3.
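As a numerical illustration of this calibration (a sketch only; the paper's exact shock variances are in Table 3), the following builds the AR(2) coefficients for an 8-quarter cycle, damps both coefficients by 0.9 as described above, and locates the spectral peak on a grid. The damping shifts the peak slightly away from exactly 8 quarters:

```python
import numpy as np

# Undamped coefficients placing the AR(2) cycle at frequency f = 1/8
f = 1.0 / 8.0                        # 8-quarter periodicity
rho1, rho2 = 2 * np.cos(2 * np.pi * f), -1.0

# Damp both coefficients by 0.9 (as in the text) to avoid a unit root
rho1, rho2 = 0.9 * rho1, 0.9 * rho2

def ar2_spectrum(omega, r1, r2, sigma2=1.0):
    """sigma^2 / (2*pi*|1 - r1 e^{-iw} - r2 e^{-2iw}|^2)."""
    z = np.exp(-1j * omega)
    return sigma2 / (2 * np.pi * np.abs(1 - r1 * z - r2 * z**2) ** 2)

omegas = np.linspace(0.01, np.pi, 2000)
peak = omegas[np.argmax(ar2_spectrum(omegas, rho1, rho2))]
print(2 * np.pi / peak)   # implied periodicity in quarters, close to 8
```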

SVAR Performance
Case 1: low-frequency confounding shocks: In the presence of an additional low-frequency, albeit less persistent shock, both the Spectral and NAMS identifications outperform the traditional Max-Share and long-run restriction approaches.
The NAMS approach is least affected by the confounding shock, consistent with its intended purpose.
As predicted, the IRF of the Max-Share identification is biased upwards by the higher-variance shock b (Figure 4), even though the targeted shock z explains the majority of the forecast error variance at the 10-year horizon. In the presence of confounding low-frequency shocks (the less persistent AR(1) process), the NAMS approach shows the least IRF bias and shows minimal bias in the estimation of the FEVD share of technology in productivity (Figure 5). This is to be expected: by design, the NAMS approach gives minimal weight to low-persistence processes.
The end result is a very high correlation between the estimated NAMS technology shocks and the true underlying shocks (Table 4). The Spectral approaches also show less IRF bias than the Max-Share approach.
This might seem unintuitive at first, as the confounding shock is also a low-frequency process, like the targeted technology shock. However, the lower persistence of the variable b relative to z lowers its contribution to the variance at low frequencies, reducing the bias. Referring to equation 30, it is clear that the contribution of the process b to the variance of productivity at low frequencies will be increasing in the size of the persistence parameter ρ.

Case 2: business-cycle frequency confounding shocks: In the presence of a confounding business-cycle frequency shock, the Spectral identifications achieve a higher median correlation (0.97) with the true underlying shock than the remaining VARs (Table 4). In addition, the Spectral identifications are less prone to overstate the FEVD share of the technology shock (Figure 5). Both the long-run and Max-Share specifications are prone to overstating the forecast error variance explained by the technology shock, capturing additional variance from the business-cycle shock. In contrast, the NAMS approach initially understates the contribution to the forecast error variance and has larger confidence bands than the spectral approaches.
Overall, in the presence of confounding business-cycle frequency shocks, the Spectral approaches have a clear advantage over the traditional Max-Share and long-run identifications, even where the technology shock dominates the forecast error variance at a standard target horizon (10 years). Appendix 9.3.2 shows the impact of increasing the target horizon to 15 years, finding that the results are robust to this change.

In the next scenario, the technology shock has a unit root and is subject to AR(1) shocks to its growth rate. This shock specification is also employed by Sala (2015) in order to calibrate a DSGE model that matches the spectral properties of the US data. The two-variable model now takes the form:

L_t = z_t + b_t,  with Δz_t = ρ_{z^g} Δz_{t−1} + ε^{z^g}_t and b_t = ρ_b b_{t−1} + ε^b_t.

The technology shock z^g_t is now a permanent shock to the level of productivity L, with persistent effects on its growth rate. b_t remains a temporary impact on the level of productivity. We keep the parameter values unchanged relative to scenario 1 of the previous model (σ_b = 2, ρ_b = 0.3, σ_{z^g} = 1), with the exception of the new parameter ρ_{z^g}, which is set to a reasonably persistent value of 0.8.14

All previous VARs have been estimated using the productivity data in levels (with the exception of the long-run restriction identification). Given that we are now interested in the particular case of a shock process driving the growth rate of productivity, we estimate each of the VARs in both levels and differences for productivity.

14 Lindé (2008) finds the persistence parameter to be low (0.14) but the variance of z^g to be high; he also finds that the model fit was very good when ρ_{z^g} was high but var(z^g) was low.
In the first case, where the VARs are estimated on the differenced labor productivity series (L), the Spectral approaches show the lowest level of bias in their IRFs (Figure 6).

Figure 6: IRF bias, differences estimation (left) and levels estimation (right).
Note: The long-run specification requires the productivity data to be estimated in log differences. The bias results for the long-run specification in the 'levels' plot therefore report the estimation in log differences for comparison purposes.
To see why this is the case, observe that the differenced series L is the sum of the differenced series z^l and b (ΔL_t = Δz^l_t + Δb_t). The first term is simply the low-frequency AR(1) growth process, while the second can be written as Δb_t = (ρ_b − 1)b_{t−1} + ε_{b,t}. As (ρ_b − 1) is negative, this second process is a mixture of high-frequency and white noise processes. This may contribute to the volatility of ΔL but does not have persistent low-frequency effects. The Max-Share identification is, therefore, less capable of distinguishing between this and the true persistent technology shock.
The Spectral approaches assign most weight to the low-frequency persistent shock, as does the NAMS approach, which 'looks through' the transitory white noise and high-frequency process resulting from differencing b.

Note: The long-run specification requires the productivity data to be estimated in differences. 5th and 95th percentiles shown in brackets.
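The claim that differencing a low-persistence AR(1) leaves mostly high-frequency content can be checked directly. Applying the difference filter |1 − e^{−iω}|² = 2 − 2cos(ω) to the AR(1) spectrum (here with ρ_b = 0.3 and σ_b = 2, matching the calibration above) moves the spectral peak from the lowest to the highest frequency. A minimal numpy sketch:

```python
import numpy as np

rho_b, sigma2 = 0.3, 4.0   # AR coefficient and shock variance (sigma_b = 2)
omegas = np.linspace(0.01, np.pi, 500)

# Spectrum of the level b_t (AR(1)): concentrated at low frequencies
f_b = sigma2 / (2 * np.pi * (1 - 2 * rho_b * np.cos(omegas) + rho_b**2))

# Differencing applies the filter |1 - e^{-iw}|^2 = 2 - 2cos(w), which
# removes power at w = 0 and amplifies the highest frequencies
f_db = (2 - 2 * np.cos(omegas)) * f_b

print(omegas[np.argmax(f_b)], omegas[np.argmax(f_db)])
```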
The ability of the Spectral and NAMS identifications to distinguish between these data generating processes also enables them to more accurately estimate the proportion of forecast error variance of productivity driven by the technology shock (Figure 7). When estimating the VAR in levels, we find that all approaches have a similar performance, with the exception of the long-run restriction, which is estimated in differences (Figure 7 and Table 5). This accuracy is driven by all approaches accurately estimating the initial variance of the technology shock. However, the IRFs under all approaches are less persistent relative to those estimated by the Spectral VARs on the differenced data (Figure 6). Effectively, the additional dynamics in productivity growth driven by the technology shock are obscured when estimating the VAR with the data in levels. As such, the IRFs prove less persistent than the estimates of the Spectral and NAMS VARs on differenced productivity, and further from the true persistence of the shock.
The poor performance of the long-run identification in detecting a unit root technology process is due to the presence of confounding shocks. As we show in Appendix 9.3.3, with little or no confounding shocks, the long-run restriction accurately estimates unit-root technology shocks with minimal IRF bias and close to 100% correlation with the underlying shock.

Summary of Identification Performance
Figure 8 provides a stylized example of the different forms a traditional technology shock can take, from a persistent AR(1) to a unit-root shock. We test both forms in the above scenarios. In contrast, non-technology shocks, such as less persistent AR(1) and cyclical business-cycle-related shocks to productivity, may also be in the data, driving a material proportion of the variance. In this model, there is no advantage to using the new specifications over the Max-Share identification, as the technology shock has a unit root and has no impact on the growth of productivity. However, we demonstrate that, compared to the long-run restriction, all other specifications show lower degrees of bias. Therefore, in contrast to the findings of CKM, with the right identifications, SVARs can prove useful in identifying the impacts of technology shocks in a range of DGPs.
Consumers' utility functions are given by

E₀ Σ_t β^t (1 + γ)^t U(c_t, l_t),

where c is consumption, l is per capita labor, β is the discount rate and γ the population growth rate. Consumers maximize utility subject to the budget constraint

c_t + (1 + τ_{x,t}) x_t = (1 − τ_{l,t}) w_t l_t + r_t k_t + T_t,  with (1 + γ) k_{t+1} = (1 − δ) k_t + x_t,

where τ_x is a tax on investment, k is the capital stock, δ the depreciation rate, w wages, r the rental rate on capital and T a lump-sum transfer. Firms face a resource constraint

c_t + x_t = y_t = k_t^α (z_t l_t)^{1−α}.

The technology and non-technology shocks evolve according to

log z_{t+1} = μ + log z_t + ε^z_{t+1},  τ_{l,t+1} = (1 − ρ_l) τ̄_l + ρ_l τ_{l,t} + ε^l_{t+1}.

This RBC model contains a unit root for technology, but also a highly persistent non-technology shock τ, where ρ_l is 0.95 in the standard calibration. Chari et al.
(2008) note that two sources of bias exist for the long-run SVAR methodology: non-technology shocks increase IRF bias as they drive a larger proportion of the variance of output; and a lag-truncation bias, where limited VAR lags result in a bias because the true specification of the VAR has an infinite-order representation. Turning to the first claim, the relative variance of the non-technology shock to the technology shock is adjusted. Each VAR is estimated using 4 lags. The long-run SVAR is estimated with log hours specified as h_t − αh_{t−1}, where α determines the degree of quasi-differencing, as in CKM. This allows the VAR to be estimated with total hours in both levels and a highly quasi-differenced form.16 All other SVAR specifications use hours in levels.

• The long-run IRF for productivity is 'confidently wrong' (as also demonstrated by Chari et al. (2008)), as the non-technology shock generates over 50% of the variance of output in the model in the quasi-differenced long-run specification.
The specification with hours in levels has the largest bias and confidence intervals of the remaining specifications. However, the Max-Share and our new approaches correctly display uncertainty in the identification of technology shocks, via wider error bands, as the non-technology shock variance increases.
16 See Christiano et al. (2007) for a discussion of large confidence intervals relative to the size of the bias in the context of Chari et al. (2008).
• Our alternative methodologies show lower bias than the long-run identification as the non-technology shock increases in size. However, the Max-Share identification is marginally more efficient than the spectral approach, showing slightly less bias at all horizons. The NAMS identification becomes more biased as the non-technology shock grows in importance: because the non-technology shock is calibrated by CKM to be highly persistent (0.95 AR coefficient), it continues to have a material effect on labor productivity even at the 10-year horizon.
In the second exercise, we examine the robustness of each methodology to lag-truncation bias. This is the bias caused by estimating the VARs using a finite number of AR coefficients when the true DSGE-generated data has an infinite lag order. There are two main findings from running each method on 100,000 simulated data points from the RBC model and varying the number of lags used in the estimation.
• At low lag levels, the alternative methods show lower initial bias relative to the long-run specification. Further out, at long horizons, the new specifications show a similar bias to the long-run identification.
• The NAMS specification continues to show more bias than the Spectral and Max-Share specifications on impact, due to bias stemming from the high-persistence non-technology shock.

Application to US Data
We find that when applied to the US data, each proposed new methodology offers qualitatively similar impulse responses (Figure 11). However, a closer examination of the forecast error variance, the persistence of the IRFs, and the results of estimating the VAR using productivity in differences is revealing about the likely presence of confounding shocks.

Figure 11: IRFs from estimation on data in levels.
Note: 16th and 84th percentile error bands. The long-run identification is estimated using differenced labor productivity and hours worked per capita data. All other identifications use both variables in levels.

The IRFs and forecast error variance decompositions suggest the presence of confounding shocks
We use a 6-variable VAR with 4 lags. Our results are robust to alternative lag specifications, an important check given the problems highlighted by Canova et al. (2010) and Chari et al. (2008) around the estimation of a process that may have an underlying representation of an AR process with infinite lags. The VAR contains logged labor productivity (output per hour), logged total hours worked per capita, the share of investment in total output (including consumer durables and excluding government investment), and the share of consumption in total output (excluding …).

There appear to be few confounding business-frequency shocks in the productivity levels data. In part, this is observable in the spectral densities shown earlier, which show an overwhelming domination of the variance of productivity at low frequencies (Figure 1). This can also be seen in the similarities between the Max-Share and limited spectral FEVD identifications. One way of thinking about the relationship between the two methodologies is that the limited spectral approach is simply a Fourier transform of the Max-Share identification, with the additional feature of excluding certain frequencies. The forecast error variance over which the maximization takes place for each can be written as

Σ_{i=0}^{k} e₁′ B_i Ã_0 q q′ Ã_0′ B_i′ e₁   and   ∫_{ω∈Ω} e₁′ B(e^{−iω}) Ã_0 q q′ Ã_0′ B(e^{iω})′ e₁ dω.

Therefore, medium- and high-frequency volatility must be a sufficiently small proportion of the variance at this horizon for the maximization problems to be essentially equivalent. Under the standard Spectral identification (using the long-run representation of VAR coefficients), confounding low-frequency shocks will be given less weight relative to highly persistent shocks such as technology, providing an explanation of the lower proportion of explained variance in the standard Spectral relative to the limited Spectral identification.
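The near-equivalence of the time-domain and frequency-domain objectives rests on the fact that a process's variance is the same whether summed over squared MA coefficients or integrated over its spectral density. A small sketch for an AR(1) (a simplified stand-in for the VAR's MA representation; parameters are illustrative):

```python
import numpy as np

rho, sigma2 = 0.9, 1.0

# Time domain: variance from squared MA(infinity) coefficients psi_j = rho^j
psi = rho ** np.arange(2000)
var_time = sigma2 * np.sum(psi**2)          # = sigma2 / (1 - rho^2)

# Frequency domain: integrate the spectral density over [-pi, pi]
omegas = np.linspace(-np.pi, np.pi, 200001)
f = sigma2 / (2 * np.pi * (1 - 2 * rho * np.cos(omegas) + rho**2))
var_freq = np.sum(f) * (omegas[1] - omegas[0])   # fine Riemann sum

print(var_time, var_freq)   # both approx sigma2 / (1 - rho^2)
```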
The growth rate of US productivity exhibits a wide range of frequencies driving its spectral density, unlike the level, suggesting an estimation of the VAR with productivity in log-differences will also be informative.

Spectral identifications best capture technology shocks with persistent effects on productivity growth
When estimating the VARs in differences (Figure 12), the Spectral approaches show a more persistent IRF than when estimated on the level of productivity, and a similar share of productivity's forecast error variance explained. This mirrors the model-based scenario in which the simulated technology process took the form of a unit root plus stochastic growth specification (Section 3.4). The differenced Spectral VARs are therefore likely producing the least biased IRFs for the response of productivity to technology.
Only the Spectral identifications produce consistent estimates when estimating on data in levels and differences

Comparing the impulse responses across the two specifications (productivity estimated in levels versus differences) shows that only the spectral estimators produce consistent impulse responses in each case (the long-run identification is always estimated in differences). In response to a positive technology shock, productivity, consumption, and investment rise, while hours, inflation, and the long rate fall. The confidence bands around the spectral estimates are also tighter than their counterparts. While the Max-Share impulses are mostly consistent, we see differences in the impulses of hours and long rates: both hours and the long rate are negative with productivity in levels and positive when estimated in differences.
NAMS produces contradictory or non-informative impulse responses across all variables except for consumption and investment when estimated on labor productivity in differences. Additionally, the accompanying wide confidence bands of the difference specification preclude us from drawing any concrete conclusions, and again manifest themselves in the very low explained FEVD (Table 6). This is to be expected, given that the impact of a technology shock on productivity growth after 10 years is understandably small and uncertain.
US productivity data is likely to be contaminated by sizable confounding shocks, making the Max-Share and long-run identifications unsuitable

In summary, the US data is likely to be contaminated by low-frequency shocks that affect its level (temporarily) but also contains stochastic growth shocks. Both these features of the data are poorly captured by the Max-Share approaches. Spectral estimators, on the other hand, can reliably handle this data generating process in both levels and differences for productivity. That said, the Spectral estimators are still somewhat biased by the presence of the confounding low-frequency shock. The NAMS approach reduces this interference, but at the cost of poorly capturing the true IRF of the technology shock in the presence of a unit root for technology with a stochastic growth shock.

To the extent that the Spectral estimators are best adapted to the probable DGP of the US productivity data, the spectral IRFs reinforce findings from both the technology-shock literature and the technology-news literature. In addition to the negative impact on hours, the IRFs show persistent negative effects on inflation and interest rates following a technology shock. Technology shocks also have a slower-building effect on productivity than in the traditional technology-shock literature, closer to the findings of the technology-news literature. This may be unsurprising to the extent that the news literature uses utilization-adjusted technology rather than labor productivity, which aims to remove some business-cycle variation from the data. In addition, the maximized shock is considered orthogonal to period-1 technology innovations.17 A second key difference is that our shock is not orthogonalized to contemporaneous innovations in productivity.

Applications to Emerging Markets
In the United States, data is of relatively good quality and volatility is relatively low. In emerging markets and developing economies (EMDEs), macroeconomic variables of interest are more volatile, and are often considered subject to larger business cycle fluctuations than in advanced economies (Neumeyer and Perri (2005), García-Cicco et al. (2010)). EMDEs are therefore ideal candidates for demonstrating differences across the proposed SVAR identifications.
EMDE data availability is limited. Few EMDEs publish quarterly data on hours worked or employment before 2000. Instead, we prioritize data span over data frequency, given that our identifications focus on the persistence of shocks. Using annual data from the World Bank's World Development Indicators (WDI), The Conference Board, and the Penn World Table 9.1, we estimate four-variable VARs consisting of productivity (output per employee), employment, the share of consumption in output, and the share of investment in output.
For illustrative purposes, we show results for three major EMDEs below, with time horizons restricted by data availability.
• Brazil, 1988–2017
• Indonesia, 1961–2017
• South Africa, 1961–2017

The first key finding is that the technology shock in the Max-Share identification always captures the largest proportion of forecast error variance in these EMDEs, particularly in the early stages of the estimation (Figure 13). This is consistent with previous simulation results showing that volatile but less persistent shocks can bias the results of the Max-Share identification. Both the Spectral and NAMS approaches are less influenced by these shocks and demonstrate a lower share of explained forecast error variance initially (note that 'periods' are now expressed in years rather than quarters). We note that in some cases the long-run restriction displays a lower share of FEVD than the other approaches (notably in Brazil), but in simulations it has shown a tendency to underestimate the FEVD share relative to the true shock (see Figure 7).

A second finding is that, as in the US example, the productivity IRFs are more persistent when productivity is included in the VAR in log-differences, particularly for the Spectral approaches. Previous simulations showed that the Spectral VARs are best able to capture technology shocks with a persistent effect on productivity growth, and are the least biased in the presence of confounding shocks (Figure 14).

17 The news literature follows a similar procedure to our identification. The notable differences are that the utilization-adjusted technology of Basu et al. (2006) is used in place of labor productivity in the VAR, and the shock that maximizes the FEVD of technology over the long run is orthogonalized from contemporaneous innovations in technology. In a similar approach, Beaudry and Portier (2006) apply the long-run restriction identification and orthogonalize the news shock from contemporaneous innovations in technology.

Detecting Business-Cycle Shocks
Before concluding, we briefly note that our findings have applications not just to the identification of low-frequency and persistent shocks such as technology. The Max-Share and the Spectral VAR methodologies have been applied to finding dominant business-cycle frequency shocks. For example, Angeletos et al. (2018) find a single primary driver of multiple macroeconomic variables at business cycle frequencies.
Levchenko and Pandalai-Nayar (2018) use the Max-Share methodology to identify the impact of changes in economic sentiment on US output.
We note that the same logic of confounding shocks will also apply in these cases. Most notably, using the Max-Share and Spectral identifications to find the shock that maximizes the forecast error variance, or the variance within a particular frequency band, does not necessarily identify a single structural shock. For example, Giannone et al. (2019) find a shock that maximizes the variance of unemployment at business-cycle frequencies, but note that this statistically derived identification will encompass a linear combination of structural shocks. We formalize this statement in this paper, finding that the shock will encompass a combination of structural shocks broadly in proportion to their relative contribution to the variance at that horizon or frequency band. We propose that the same logic outlined in Appendix section ?? would also hold in the case of targeting the shock that maximizes the variance within a business-cycle frequency band. Take, for example, two structural shock drivers affecting 2 endogenous variables at business-cycle frequencies, A_{BC−1} and A_{BC−2}, where A_{BC−1} drives the majority of the variance at these frequencies.

Note: Blue = estimated on log-productivity differences, Red = estimated on log-levels. Estimated using a four-variable VAR consisting of the log level of output per worker, log employment, the share of consumption in output, and the share of investment in output.
In the case where only these two drivers existed, column one of the identified rotation matrix would weight the two shocks according to their contribution to the variance at the desired frequency band ω and horizon k (40 quarters in the Limited Spectral approach and ∞ in the standard Spectral approach). It would not simply 'pick out' the dominant shock.
This will make it difficult to interpret IRFs of the identified shock in question, given they will represent a combination of drivers with only an a priori understanding of their relative weights.
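A stylized numeric sketch of this weighting, using hypothetical impact coefficients rather than estimates from the paper: with two orthogonal shocks whose impacts on the target variable are 2 and 1, the maximizing rotation vector loads on both shocks in a 2:1 ratio, rather than loading entirely on the dominant shock:

```python
import numpy as np

# Hypothetical impacts of two orthogonal shocks on the target variable:
# the "dominant" shock contributes std 2.0, the other std 1.0
a = np.array([2.0, 1.0])

# Variance matrix over which the maximization takes place: S = a a'
S = np.outer(a, a)

# The maximizing rotation vector is the top eigenvector of S
vals, vecs = np.linalg.eigh(S)
q = vecs[:, np.argmax(vals)]
q = q * np.sign(q[0])           # sign-normalize

# q mixes both shocks in proportion to their std contributions (2:1)
print(q, q[0] / q[1])
```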

Conclusion
In summary, this paper documents the biases that can be introduced into the long-run and the increasingly widely used Max-Share SVAR identification methodologies by the presence of confounding shocks. We show theoretically why this is the case, and why Monte Carlo DSGE simulations have previously failed to account for this issue, given the overwhelming dominance of technology shocks in such models. Three new SVAR identifications are proposed to deal with confounding shocks with different frequency-domain properties.
Using a simple two-variable model, we show that our new NAMS approach is less susceptible to confounding low-frequency shocks when trying to identify highly persistent technology shocks. And we propose two types of SVAR identifications in the frequency domain which show significantly reduced estimation bias in the presence of higher-frequency confounding shocks relative to technology. We demonstrate that the US productivity data is likely to be affected by confounding shocks that affect the level of productivity temporarily and that US productivity is likely to be a unit root process driven by stochastic technology growth shocks. We propose that the Spectral identifications are the most robust to the features of the data, although the NAMS approach has an advantage in being less affected by the confounding transitory (low-frequency) shocks to productivity. Finally, we show that similar issues affect EMDE data, which is often more volatile than advanced economy data.
9 Appendix

9.1 What shock is Max-Share capturing? Sources of bias

9.1.1 Low- and high-frequency drivers of forecast errors

In this Appendix section, we formally demonstrate that the Max-Share identification is prone to contamination from shocks of higher frequency or lower persistence than desired. Below, we show the contamination from a high-frequency shock when the econometrician attempts to identify a low-frequency shock. However, the same logic applies when a high-persistence shock is contaminated by a low-persistence shock.
A series Y is driven by two structural shocks, a low-frequency shock ε_l and a high-frequency shock ε_h, with cov(ε_l, ε_h) = 0. The forecast error at horizon k is a function of the structural impulse responses at each horizon, A_t = [A^l_t A^h_t]. The series A_t is formed of the reduced-form coefficients and the identification matrix:

F(Y)_k = Σ_{t=0}^{k} (A^l_t ε_{l,t} + A^h_t ε_{h,t}).

In turn, the forecast error variance of Y is

F²(Y)_k = Σ_{t=0}^{k} [(A^l_t)² σ²_l + (A^h_t)² σ²_h].

The proportion of F²(Y)_k explained by low-frequency shocks is increasing in the persistence of A^l relative to A^h as k increases. It is also increasing in the relative variance of the impact of the low-frequency shock, σ²_l/σ²_h. For the researcher looking to isolate the shock which dominates the low-frequency dynamics of a particular series, k must be set sufficiently high for the low-frequency shock to dominate the forecast error variance.
Where a series is driven by a combination of low- and high-frequency processes, the low-frequency shock will only account for the majority of the forecast error variance for sufficiently large k, such that

Σ_{t=0}^{k} (A^l_t)² σ²_l > Σ_{t=0}^{k} (A^h_t)² σ²_h.

By definition, the series of high-frequency shock coefficients will decline at a faster rate than the low-frequency coefficients. In some cases, the variance of the low-frequency shock will exceed the variance of the high-frequency shock and dominate the forecast error variance from the initial period.
The probability of mistaking a transitory shock for a persistent one will be low where the low-frequency shock takes the form of a unit root process. Take a limiting case in which Y_t is driven by a permanent I(1) shock and a transitory white noise process. Here, A^l_{0:k} = 1, while A^h_0 = 1 and A^h_{1:k} = 0. The low-frequency shock will dominate the forecast error variance for any k such that (k + 1)σ²_l > σ²_h. For other processes, there may be confounding shocks with similar frequencies to the shock of interest. For low-frequency shocks with less than infinite periodicities, and in the presence of persistent business-cycle shocks, the researcher is liable to identify a shock other than the low-frequency shock without sufficiently high k.
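The unit-root threshold can be illustrated with hypothetical variances (σ²_l = 1, σ²_h = 10): the permanent shock's cumulative forecast error variance grows linearly with the horizon, so it first strictly dominates the white-noise shock at k = 10:

```python
import numpy as np

sigma2_l, sigma2_h = 1.0, 10.0   # unit-root vs white-noise shock variances

# Unit-root IRF: A_l = 1 at all horizons; white noise: A_h = 1 at impact only
def fev_shares(k):
    fev_l = (k + 1) * sigma2_l   # sum of squared IRF coefficients x variance
    fev_h = sigma2_h             # impact-period contribution only
    return fev_l, fev_h

# Find the first horizon at which the permanent shock strictly dominates
k = 0
while fev_shares(k)[0] <= fev_shares(k)[1]:
    k += 1
print(k)   # first k with (k+1)*sigma2_l > sigma2_h
```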

Solving the maximization problem
Even where the low-frequency shock dominates the forecast error at horizon k, the shock captured by Max-Share will not consist only of this underlying shock. Here we show that the shock which maximizes the contribution to the forecast error variance of productivity is a combination of the low and high-frequency structural shock, rather than the dominant underlying structural shock.
Start with the definition of the true underlying structural shock, A_0^{−1} μ = ε, where ε refers to the low- and high-frequency shocks described in the previous example. In the Max-Share approach, the search for A_0 begins with Ã_0, a Cholesky decomposition of Σ_μ, and the impulse matrix Ã(L) = B(L)Ã_0. The 'true' structural shock is defined using an unknown orthonormal matrix Q, such that the true structural impulse matrix is A(L) = Ã(L)Q. Following Uhlig (2003), isolating the shock which explains the largest proportion of the forecast error variance reduces to an eigenvector decomposition problem. This means searching for an orthonormal vector q (a column of the matrix Q) which maximizes the forecast error variance of variable i.
Maximizing q′S(k)q subject to the constraint q′q = 1 gives the Lagrangian

L(q) = q′S(k)q − λ(q′q − 1),

whose first-order conditions reduce to solving for the eigenvector associated with the largest eigenvalue of S(k). Will this q be equivalent to the first column of the Q defining the true low- and high-frequency structural shocks, A_0 = Ã_0 Q?
We will demonstrate that, in an attempt to isolate the low-frequency shock, the q in the Max-Share rotation will still contain features of the high-frequency shock, the extent depending on the forecast horizon, the relative variances of the shocks, and the relative persistence of the shocks. While we demonstrate this for the simple 2×2 case, we argue that the results hold for higher-dimension VARs.
As in the previous example, the true underlying low- and high-frequency shocks can be written as

A_0 = [A^l_{11} A^h_{12}; A^l_{21} A^h_{22}].

Here, A^l_{11} and A^l_{21} refer to the period-0 impacts of the low-frequency shock on variables 1 and 2 respectively, while A^h_{12} and A^h_{22} refer to the period-0 impacts of the high-frequency shock on variables 1 and 2 respectively.
The total forecast error variance is equivalent whether computed with the structural shocks or the reduced form, since A_0 A_0′ = Ã_0 QQ′Ã_0′ = Ã_0 Ã_0′ = Σ_μ. The forecast error variance can be computed using the structural shocks and the reduced-form MA-representation coefficients B_i (2×2 matrices). We are only interested in the impacts on the first endogenous variable of the system (for example, productivity), and therefore only require the first row of B for our computation of the forecast error variance.
At k = 0, [B^{11}_0 B^{12}_0] = [1 0], since the second variable has no contemporaneous impact on the first. Therefore

S(0) = a a′, where a = [A_{11} A_{12}]′ is the first row of A_0.

As in Uhlig, finding the shock that maximizes the forecast error variance of variable i reduces to an eigenvector problem.
Returning to the Lagrangian, we maximize q′S(k)q subject to q′q = 1.
In the case of S(0) = FEV(0), the eigenvalue problem is simple. There are two eigenvalues, λ = [(A²_{11} + A²_{12}), 0]. The eigenvector associated with the largest (non-zero) eigenvalue is simply a ratio of the impacts of the two shocks on variable 1 at time zero. Normalizing the second element to 1, the un-normalized eigenvector can be written as

q̃ = [A_{11}/A_{12}, 1]′.

The normalized eigenvector q can be obtained by dividing each element of q̃ by its Euclidean length. This also demonstrates the generalized solution to the eigenvector problem for a symmetric 2×2 matrix as a function of the square roots of the diagonal elements of S: q̃ will always be proportional to the standard deviation of variable 1 driven by each shock. As this will vary over time, the identified shock will also vary over the timespan used to calculate the forecast error variance.
In period 1, the relevant reduced-form row is [B^{11}_1 B^{12}_1], and therefore S(1) accumulates the impacts over periods 0 and 1. The eigenvector is now a function of the impact of the low-frequency shock on variable 1 in periods 0 and 1 relative to the impact of the high-frequency shock in both periods:

q̃ = [ (A²_{11} + (B^{11}_1 A_{11} + B^{12}_1 A_{21})²)^{1/2} / (A²_{12} + (B^{11}_1 A_{12} + B^{12}_1 A_{22})²)^{1/2}, 1 ]′.

We can then generalize the form of the eigenvector as a function of the chosen forecast-error-variance horizon: the numerator, driven by the more persistent shock 1, will be increasing over time relative to the denominator.
As k increases, the ratio q̃_1/q̃_2 in q̃ = [q̃_1  q̃_2]′ will increase, placing more weight on the persistent shock. In all cases, however, the shock found will be a linear combination of the persistent and non-persistent shocks. The ratio will also depend on the initial variances of the respective shocks and on their relative persistence.
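A minimal numerical sketch of this weighting logic (not the paper's code; the unit impact coefficients and AR(1) persistences ρ₁ = 0.9 and ρ₂ = 0.3 are hypothetical values): the principal eigenvector of S(K) is proportional to the shocks' impacts at K = 0 and tilts toward the persistent shock as K grows.

```python
import numpy as np

def max_share_weights(a1, a2, rho1, rho2, K):
    """Principal eigenvector of S(K) = sum_{k=0}^{K} phi_k phi_k', where
    phi_k = [a1*rho1**k, a2*rho2**k] is an assumed AR(1) impulse response
    of variable 1 to the two structural shocks at horizon k."""
    S = np.zeros((2, 2))
    for k in range(K + 1):
        phi = np.array([a1 * rho1 ** k, a2 * rho2 ** k])
        S += np.outer(phi, phi)
    vals, vecs = np.linalg.eigh(S)
    q = vecs[:, np.argmax(vals)]  # eigenvector of the largest eigenvalue
    return q * np.sign(q[0])      # fix the sign convention

# At K = 0 the weights are proportional to the impact coefficients...
q0 = max_share_weights(1.0, 1.0, 0.9, 0.3, 0)
# ...but as K grows, weight shifts toward the persistent shock (shock 1).
q40 = max_share_weights(1.0, 1.0, 0.9, 0.3, 40)
```

Extending K further changes little once the geometric sums in S(K) have effectively converged, which is why the identified combination stabilizes at long horizons.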

DSGE Model Specification
This section sets out a New Keynesian model that includes capital, investment adjustment costs, wage indexation, a government sector, and habit formation in consumption. The model is calibrated based on Christiano, Eichenbaum and Evans (2005), and the notation draws heavily on notes by Sims (2014).
Households

Households supply labor and have habit formation in consumption, such that the planner's problem is as follows. Households also choose investment, and thus the capital stock in t+1, which is subject to adjustment costs scaled by τ. Shocks to investment costs are captured by Z_t.
Including a cost of capital utilization, the resource constraint and first-order conditions follow.

Firms

Final output Y is a CES aggregate of a continuum of intermediate goods. Demand for each intermediate is a downward-sloping function of its relative price, and the aggregate price index is a function of the elasticity of substitution, ε_p. Intermediate firms produce using a technology combining capital and labor, with each firm minimizing input costs subject to the constraint that production meets demand at the given price. The first-order conditions allow real wages and real capital rental costs to be written as functions of total marginal cost, φ_t/P_t.
Real profits are therefore a function of the price of, and demand for, product Y(j), and of the marginal cost of each input.
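For reference, the standard Dixit–Stiglitz objects this passage describes can be written out explicitly. This is a sketch using textbook forms consistent with the surrounding text; the production-function exponent α is our notation, not necessarily the paper's:

```latex
Y_t = \left( \int_0^1 Y_t(j)^{\frac{\epsilon_p - 1}{\epsilon_p}}\, dj \right)^{\frac{\epsilon_p}{\epsilon_p - 1}},
\qquad
Y_t(j) = \left( \frac{P_t(j)}{P_t} \right)^{-\epsilon_p} Y_t,
\qquad
P_t = \left( \int_0^1 P_t(j)^{1-\epsilon_p}\, dj \right)^{\frac{1}{1-\epsilon_p}},
\qquad
Y_t(j) = A_t K_t(j)^{\alpha} N_t(j)^{1-\alpha}.
```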

Price and Wage setting
Firms are able to change prices with probability 1 − φ_p each period. Firms that cannot reoptimize index their prices at a fixed proportion of inflation, ζ_p; if prices are unchanged for s periods, they are therefore indexed at the cumulative rate of inflation Π^{ζ_p}_{t−1,t+s−1}. After substituting in Y_t(j) = (P_t(j)/P_t)^{−ε_p} Y_t, the standard price-setting maximization problem implies that the optimal price is a constant markup over current and expected future marginal costs. Wage setting is likewise driven by a continuum of labor types, with demand for each type a downward-sloping function of its relative wage.
Wage setting is then a function of the disutility of labor and of the probability φ_w of not being able to reset wages in each period. As with consumer prices, where wages are not reset, nominal wages W are indexed to the previous period's inflation at rate ζ_w, so real indexed wages after s periods without adjustment follow accordingly. The problem can be written as a Lagrangian, with a solution taking a similar form to the price-setting equation for goods. The evolution of wages and prices is then determined by these equilibrium updating equations, together with prices set in the current period t.

Exogenous processes, adding-up, and parameterization

A government sector has output share ω_g in the steady state, but is subject to spending shocks.
Monetary policy follows an inertial Taylor rule to close the model. In addition to the shocks to monetary policy and government spending, there are two persistent shocks: to technology (A) and to investment costs (Z).
Finally, in aggregation, output is affected by price dispersion and utilization, where the Calvo price-dispersion term takes its standard form. For the deep parameters, we follow Christiano et al. (2005). For the parameterization of the shocks and shock processes, we initially assume ρ_a = 0.99, while the remaining shocks are less persistent (ρ_g = ρ_z = 0.8). In addition, we assume the variance of the productivity shock is 0.66 percent, while the variance of the remaining shocks is 0.15 percent, as in Barsky and Sims (2011).
As noted in the main text, the resulting model contains persistent effects on labor productivity not just of technology, but of the other shocks as well (Figure 15). Even where these shocks are calibrated with lower persistence parameters (ρ), their effects can easily last over 40 quarters in many standard specifications, with up to one-quarter of each shock's initial impact remaining in the data at this horizon.
Finally, we note that the SVAR identifications' performance in detecting the impact of technology shocks on productivity extends to their performance in detecting the impact on hours worked. The Max-Share and spectral approaches prove more accurate than NAMS, and the long-run restriction is outperformed by all other SVAR identifications (Figure 16). In Section 3.3.2, the ability of each SVAR identification to correctly estimate a technology shock in the presence of confounding low-frequency shocks is evaluated.
In this section, we evaluate the impact of making the confounding shock closer in persistence to the technology shock.
In the original scenario, the technology shock is assumed to have persistence 0.9, while the persistence coefficient of the confounding low-frequency shock is 0.3 (ρ_{b,1} = 0.3).
Here, we raise the persistence of the confounding shock to 0.6, closer to that of the technology shock (z). Figure 17 shows that the NAMS approach continues to have the lowest IRF bias even with the increased persistence of the confounding shock, while the spectral identification biases are the second lowest in the initial stages of the IRF, consistent with the original scenario.
In both cases, however, the IRF bias has increased relative to the main-text scenario.

Figure 17: Bias of technology shock IRF for labor productivity: increased persistence of confounding low-frequency shock. Note: Absolute bias of technology shock IRF for labor productivity compared with the 'true' impulse, based on 100 simulations of data (250 periods in each simulation).
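The intuition can be seen directly from AR(1) impulse responses; a minimal sketch using the persistence values from this section (0.3 and 0.6 for the confounding shock, 0.9 for technology):

```python
# AR(1) impulse responses at horizon h decay as rho**h. Raising the
# confounding shock's persistence from 0.3 to 0.6 leaves far more of
# its effect in the data at business-cycle horizons, making it harder
# to separate from the technology process (rho = 0.9).
def ar1_irf(rho, horizons):
    return [rho ** h for h in horizons]

horizons = [1, 4, 8]            # quarters after impact
irf_low = ar1_irf(0.3, horizons)   # original confounding shock
irf_high = ar1_irf(0.6, horizons)  # more persistent confounding shock
irf_tech = ar1_irf(0.9, horizons)  # technology shock
```

At a one-year horizon the more persistent confounding shock retains 0.6⁴ ≈ 13 percent of its impact, versus under 1 percent at ρ = 0.3, which is consistent with the larger IRF biases reported above.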

Changing the target horizon for Max-Share and NAMS
Simulations in this paper have used the standard Max-Share target horizon of 10 years over which to maximize the forecast error variance (as used by Francis et al. (2014) and Barsky and Sims (2011)). For consistency, we have adopted this horizon for the NAMS approach. In this section, we evaluate the effects of increasing the target horizon for both NAMS and Max-Share to 15 years. IRF biases remain similar in magnitude to the baseline case of a 10-year target horizon.

Figure 18: Bias of technology shock IRF for labor productivity: increased target horizon for Max-Share and NAMS to 15 years. Note: Absolute bias of technology shock IRF for labor productivity compared with the 'true' impulse, based on 100 simulations of data (250 periods in each simulation).

Where do long-run restrictions work well?
All simulations in this paper have included confounding non-technology shocks to demonstrate the ability of alternative identifications to accurately estimate technology in these circumstances. In this section, we show that the traditional long-run identification can be the best performing of all alternatives used throughout this paper in certain circumstances: notably, where the technology process has a unit root and confounding shocks are minimal.
Let the simulation model take the following simple form, where z_t is now a permanent unit-root shock to the level of productivity. Unlike previous simulations, there is no shock to labor productivity growth. b_t remains a temporary shock to the level of productivity and to hours worked, N_t. We show results for a case where the confounding shock has a minimal standard deviation of 0.05 (relative to 1 for the technology shock), and a case where the shock is much larger, with a standard deviation of 2 (Table 8).
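The simulation design can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the AR(1) persistence of b_t and its one-for-one loading on hours are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(sigma_b, T=250, rho_b=0.3):
    """Sketch: z_t is a permanent (unit-root) shock to the level of
    productivity; b_t is a stationary AR(1) that temporarily moves
    both productivity and hours."""
    z = rng.normal(0.0, 1.0, T)          # technology shock, sd = 1
    eps_b = rng.normal(0.0, sigma_b, T)  # confounding shock innovations
    b = np.zeros(T)
    for t in range(1, T):
        b[t] = rho_b * b[t - 1] + eps_b[t]
    productivity = np.cumsum(z) + b      # unit root plus temporary component
    hours = b                            # hours respond only to b_t (assumption)
    return productivity, hours

prod_small, _ = simulate(sigma_b=0.05)   # minimal confounding case
prod_large, _ = simulate(sigma_b=2.0)    # large confounding case
```

In the minimal-confounding case, productivity growth is essentially the technology shock itself, which is the environment where a long-run restriction should perform well.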
The long-run identification clearly outperforms the other specifications in the pure unit-root case when confounding shocks are minimal. Its performance deteriorates dramatically when confounding shocks are present, with a larger bias than all other specifications, consistent with the range of scenarios shown throughout the paper.

Figure 19: IRF bias where technology growth has a unit root: confounding and non-confounding cases.

Panels: minimal confounding; large confounding.
Note: The long-run specification is estimated using productivity and hours-worked data in differences. All other specifications use the data in levels.

CKM (2008) simulations
In this appendix section, we first show that our newly proposed specifications most accurately assess the technology shock's impact on hours as well as on productivity.
Second, we show that reducing the persistence of the non-technology shock from 0.95 (CKM's original parameterization) reduces the performance discrepancy between Max-Share and our proposed specifications: when these highly persistent shocks (with material effects on labor productivity at the 10-year horizon) are made less persistent, model performance is comparable.
In the case of the effect of a technology shock on hours, the Max-Share, Spectral, Limited Spectral, and NAMS approaches show minimal bias, although the spectral approach's bias does increase as technology shocks account for less than 50 percent of the variance of output (Figure 20). In the case of the long-run restriction, bias increases for lower influences of non-technology shocks.

Figure 20: CKM: Impact bias on hours from technology as the proportion of output driven by the non-technology shock is varied.

The weaker performance of the NAMS and spectral approaches in detecting the impact of technology on productivity in CKM is due to the high persistence of non-technology shocks in their model. As NAMS and the spectral methods are designed to capture highly persistent shocks, they are biased by the presence of multiple shocks with this characteristic. By reducing the persistence parameter of the non-technology shock from 0.95 to 0.7, the bias in both of these identifications falls considerably (Figure 21). The same change does not improve the performance of the long-run restriction.

Figure 21: CKM with lower-persistence non-technology shock: Impact bias on productivity as the proportion of output driven by the non-technology shock is varied. Note: The proportion of variance driven by the non-technology shock is calculated by simulating the model with one shock at a time, and then comparing the variance of the HP-filtered series for output from each simulation, as in CKM. In this simulation, the persistence parameter of the non-technology shock is reduced from 0.95 to 0.7, demonstrating a lower bias for the NAMS and spectral approaches than in the high-persistence case.