Finance and Economics Discussion Series: 2013-63

PRELIMINARY DRAFT

Abstract:

This paper develops a method to approximate arbitrage-free bond yields within a term structure model in which the short rate follows a Gaussian process censored at zero (a "shadow-rate model" as proposed by Black (1995)). The censoring ensures that model-implied yields are
constrained to be positive, but it also introduces non-linearity that renders standard bond pricing formulas inapplicable. In particular, yields are not linear functions of the underlying state vector as they are in affine term structure models (see Piazzesi, 2010). Existing approaches
to computing yields in shadow-rate models suffer from either high computational burden or low accuracy. In contrast, I show that the technique proposed in this paper is sufficiently fast for single-step estimation of a three-factor shadow-rate term structure model, and sufficiently accurate to
evaluate yields to within approximately half a basis point.

1 Introduction

In late 2008, short-term nominal interest rates in the U.S. fell to their effective "zero lower bound" (see Bernanke et al., 2004). Since standard Gaussian term structure models do not rule out the possibility of negative model-implied yields, they provide a poor approximation to the behavior of nominal yields when the lower bound is binding (Kim and Singleton, 2012; Christensen and Rudebusch, 2013; Bauer and Rudebusch, 2013). Kim and Singleton (2012) find that shadow-rate models in the spirit of Black (1995) successfully capture yield-curve properties observed near the zero lower bound. However, arbitrage-free multi-factor versions of these models tend to be computationally intractable (Christensen and Rudebusch, 2013). Gorovoi and Linetsky (2004) show that bond prices in a one-factor shadow-rate model can be computed analytically by an eigenfunction expansion, but their approach does not generalize to multiple dimensions. Kim and Singleton (2012) and Ichiue and Ueno (2007) successfully estimate shadow-rate models with up to two factors, but they compute bond prices using discretization schemes that are subject to the curse of dimensionality. Christensen and Rudebusch (2013) use a yield formula proposed by Krippner (2012) to estimate shadow-rate Nelson-Siegel models with up to three factors, but Krippner's (2012) derivation deviates from the usual no-arbitrage approach. Bauer and Rudebusch (2013) evaluate bond prices by Monte Carlo simulation for given model parameters from an unconstrained Gaussian term structure model, but they do not estimate a shadow-rate version of the model due to the computational burden.

This paper develops and applies a new technique for fast and accurate approximation of arbitrage-free zero-coupon bond yields in multi-factor Gaussian shadow-rate models of the term structure of interest rates. The computational complexity of the method does not increase with the number of yield curve factors, and, empirically, it produces yields that are accurate to within about half a basis point. The method is sufficiently fast to estimate a flexible, arbitrage-free, three-factor term structure model in which the shadow rate follows a Gaussian process. For illustration purposes, I estimate such a model by quasi-maximum likelihood on a sample of U.S. Treasury yields, using the unscented Kalman filter to account for the non-linear mapping between factors and yields.

2 Model

Consider first the standard, continuous-time N-factor Gaussian term structure model. In particular, let be -dimensional standard Brownian motion on a complete probability space with canonical filtration . Assume there is a pricing measure on that is equivalent to , and denote by Brownian motion under as derived from Girsanov's Theorem (Karatzas and Shreve, 1991). Suppose latent factors (or states) representing uncertainty underlying term-structure securities follow the multivariate Ornstein-Uhlenbeck process

where . Let the short rate be

Then by definition, the arbitrage-free time price of a zero-coupon bond maturing at time is given by

with associated zero-coupon bond yield

Bond prices (and hence yields) can equivalently be defined in terms of forward rates:

where denotes the instantaneous time forward rate effective at time .

Since is a Gaussian process (Karatzas and Shreve, 1991), it follows from (2) that the short rate takes on strictly negative values with strictly positive probability. To modify the model in a way that accounts for the zero lower bound on nominal yields, Black (1995) proposes to think of as a *shadow* short rate (and, analogously, of , and as shadow bond price, shadow yield, and shadow instantaneous forward rate, respectively) and to define the *observed* short rate as the shadow rate censored at zero:

With the observed short rate in place of the shadow rate , the observed bond price , yield , and instantaneous forward rate are then defined as in (3)-(5).

The theoretical argument for a lower bound *at zero* on the nominal short rate (and hence on nominal yields) is based on arbitrage between bonds and currency (Black, 1995). In practice, the two assets may not be perfect substitutes, for reasons such as convenience, default risk, or legal requirements. This may push the empirical lower bound into slightly negative or slightly positive territory. The derivations in Section 2 are easily modified to accommodate a lower bound at . In particular, suppose

The last term is equal to the expression for the yield when the lower bound is zero, except that is subtracted from the shadow rate. Therefore, since , when the lower bound is nonzero we can compute zero-coupon yields as if the bound were zero, with in place of , and then add to the final result. The lower bound can be set to a specific value based on a priori reasoning, or treated as a free parameter in estimation.
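The shift argument can be checked numerically. Below is a minimal sketch (Python with NumPy); the Gaussian shadow-rate distribution and the lower bound are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shadow-rate draws (Gaussian, as in the model); the lower
# bound b is illustrative, not an estimate from the paper.
s = rng.normal(loc=-0.01, scale=0.02, size=1_000_000)  # shadow short rate
b = 0.0025                                             # lower bound (25 bp)

# Censoring at b directly...
direct = np.maximum(s, b).mean()
# ...equals censoring the shifted shadow rate at zero, then adding b back.
shifted = b + np.maximum(s - b, 0.0).mean()

assert abs(direct - shifted) < 1e-9
```

The identity max(s, b) = b + max(s - b, 0) holds path by path, which is why the whole yield computation can be carried out with the bound normalized to zero.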

A central task in term-structure modeling is the (analytical and/or numerical) computation of arbitrage-free bond prices (and hence yields) based on equation (3). Section 4.1 reviews the standard approach using differential equations formalized by Duffie and Kan (1996), which is best suited to affine models. While it can be adapted to the shadow-rate framework, it loses much of its analytical tractability and becomes computationally infeasible as the number of factors increases. Section 4.2 discusses an alternative method proposed by Krippner (2012). It defines a pseudo-forward rate that satisfies the lower bound (though it differs from the arbitrage-free forward rate), and uses relationship (5) to approximate bond prices. Finally, Section 4.3 proposes a new approximation technique for yields in the shadow-rate model based on the expansion of a cumulant-generating function.

4.1 Partial Differential Equation

Like any conditional expectation of an -measurable random variable, (the time price of a zero-coupon bond maturing at time in the unconstrained Gaussian model, discounted to time 0) is a martingale under . This is an immediate consequence of the definition of a martingale, after an application of the Law of Iterated Expectations. Using Itô's Lemma and the Martingale Representation Theorem, we can therefore represent by the function , where solves the partial differential equation (PDE)

with boundary condition . Using a separation-of-variables argument, it can be verified that

solves (7), where and in turn solve ordinary differential equations (ODEs) in terms of the model parameters,

with , . Reducing problem (7) to the system of ODEs (9)-(10) simplifies the numerical computation of bond prices substantially, as it reduces the dimensionality of the problem from to 1 (the time dimension). Direct computation reveals that

assuming is invertible.
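For intuition on how the ODE system operates in the affine benchmark, the following sketch prices a zero-coupon bond in a one-factor Vasicek model, comparing the closed-form loadings against a numerical (Runge-Kutta) solution of the ODEs. All parameter values and symbol names are illustrative assumptions, not quantities from the paper:

```python
import numpy as np

# One-factor Vasicek benchmark (illustrative parameters):
# dr = kappa*(theta - r) dt + sigma dW under the pricing measure.
kappa, theta, sigma, r0, tau = 0.5, 0.04, 0.01, 0.02, 5.0

# Closed-form loading B and intercept a, i.e. the analytical solution
# of the ODE system in the one-factor case; P = exp(a - B*r).
B = (1 - np.exp(-kappa * tau)) / kappa
a = (theta - sigma**2 / (2 * kappa**2)) * (B - tau) - sigma**2 * B**2 / (4 * kappa)
yield_closed = -(a - B * r0) / tau

# The same loadings by integrating the ODEs numerically with RK4:
# dB/dtau = 1 - kappa*B,  da/dtau = 0.5*sigma^2*B^2 - kappa*theta*B.
def rhs(state):
    a_, B_ = state
    return np.array([0.5 * sigma**2 * B_**2 - kappa * theta * B_,
                     1 - kappa * B_])

state, n = np.zeros(2), 10_000
h = tau / n
for _ in range(n):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * h * k1)
    k3 = rhs(state + 0.5 * h * k2)
    k4 = rhs(state + h * k3)
    state = state + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
a_num, B_num = state
yield_ode = -(a_num - B_num * r0) / tau

assert abs(yield_closed - yield_ode) < 1e-8
```

In higher dimensions the closed form for the intercept may be unavailable, and numerical ODE integration of this kind is the usual fallback.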

A PDE analogous to (7) can be set up for the observed bond price in the shadow-rate model, . The only required modification is to replace the expression for the shadow short rate, , by that for the observed short rate, . Unfortunately, when this non-linearity is introduced, the separation-of-variables procedure no longer applies, and no solution as straightforward as (8) is available. It is possible to solve the modified version of (7) directly by numerical methods. This is the approach taken by Kim and Singleton (2012). It requires discretizing and on a multidimensional grid, which is computationally intensive and subject to the curse of dimensionality. Kim and Singleton (2012) therefore do not estimate models with more than two factors.

4.2 Forward Rate Approximation

Krippner (2012) proposes an alternative approach to computing yields in shadow-rate models, which is implemented empirically by Christensen and Rudebusch (2013). It is based on an approximation to the forward rate . Substituting for from the shadow-rate version of (3), and differentiating, we obtain

where , defined by the Radon-Nikodym derivative is referred to as the " -forward measure." Equation (12) says that today's time forward rate is equal to today's expectation under the -forward measure of the time short rate. It can be verified directly from (12) and the Law of Iterated Expectations that is a martingale under (note that by definition). Note also that (12) implies that the forward rate is subject to the same lower bound (6) as the short rate, by monotonicity of the mathematical expectation.

Analogously,

expresses the shadow forward rate as the corresponding expectation of the shadow short rate.

The distribution of the shadow forward rate under can be derived more explicitly. First, from (5) and (8),

Therefore, by Itô's Lemma and the Martingale Representation Theorem, is Gaussian under conditional on , with

The final equality uses the Itô isometry and (11).

Krippner (2012) takes advantage of this distributional property of . He defines a pseudo-forward rate as a hybrid between the observed forward rate (12) and the shadow forward rate (13):

This is the expectation of the censored shadow short rate, taken under the forward measure associated with the *shadow* (rather than the observed) bond price. It corresponds to formula (32) in Krippner (2012).

Zero-coupon bond prices and yields can be approximated by substituting from (17) into (5) and (4).

4.3 Cumulants

Since the PDE approach to pricing bonds in shadow-rate models becomes computationally intractable as the number of factors increases, and the approach proposed by Krippner (2012) relies on a forward rate that is not equal to the arbitrage-free forward rate, I propose a new cumulant-based technique for approximating yields in Gaussian shadow-rate models.

The quantity appearing in the shadow-rate version of (4) is the conditional cumulant-generating function^{4} under , evaluated at -1, of the random variable . It has the series representation

where is the th cumulant of under . An approximation to the zero-coupon yield can therefore be computed based on a finite number of terms in the series in (18). Below, I consider the first-order approximation

and the second-order approximation

where I make use of the fact that the first two cumulants of any random variable coincide with its first two centered moments.
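In notation that has to be partly reconstructed here (the inline symbols are dropped in this version), write $Z_\tau$ for the integrated observed short rate over the horizon $\tau$ and $\kappa_n$ for its $n$-th conditional cumulant under the pricing measure; the expansion (18) and the two approximations then take the form:

```latex
% Reconstruction under assumed notation: Z_tau is the integrated
% censored short rate, kappa_n its n-th conditional Q-cumulant.
y(t,\tau) = -\frac{1}{\tau}\,\ln \mathbb{E}^{\mathbb{Q}}_t\!\left[e^{-Z_\tau}\right]
          = \frac{1}{\tau}\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n!}\,\kappa_n,
\qquad
\hat{y}^{(1)} = \frac{\kappa_1}{\tau},
\qquad
\hat{y}^{(2)} = \frac{1}{\tau}\Bigl(\kappa_1 - \tfrac{1}{2}\kappa_2\Bigr).
```

The alternating signs in the series are what make the first-order approximation systematically overstate yields, as documented in Section 5.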

The first-order approximation (19) is equivalent to the method proposed independently and contemporaneously by Ichiue and Ueno (2013). I present it here both for comparison and to assess its relative performance in Section 5 below. I will, however, mostly focus on the second-order approximation (20), which I argue is particularly promising a priori because it is exact in the Gaussian benchmark case.^{6} It can therefore be expected to perform well both for short maturities (where the higher-order terms in (18) are relatively small) and for long maturities as long as is small for large (in which case will behave approximately like a Gaussian process over sufficiently long horizons). Indeed, empirically, the second-order approximation turns out to be highly accurate across maturities, both during normal times and when rates are low (see Section 5).

4.3.1 Computation of the First Moment

Evaluating the first- and second-order approximations (19) and (20) to zero-coupon yields requires computation of the first two cumulants (equivalently, centered moments) of . This subsection will be concerned with the first moment. As an initial step,

by an application of Fubini's Theorem. Since and with known expressions for and in terms of the model parameters (as shown in Appendix A), it follows from Lemma A.1 in Appendix A that

where and denote the standard normal probability density function (pdf) and cumulative distribution function (cdf), respectively. That is, we can compute analytically up to the standard normal cdf, which software such as Matlab is able to evaluate precisely and efficiently. The first cumulant of can then be computed by numerical integration of over the time dimension, as in (22).
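The formula at work here is the standard censored-Gaussian mean, E[max(X,0)] = mu*Phi(mu/sigma) + sigma*phi(mu/sigma) for X ~ N(mu, sigma^2). A quick numerical check, with illustrative values standing in for the conditional moments of the shadow rate at a given horizon:

```python
import numpy as np
from math import erf, exp, pi, sqrt

# Censored-Gaussian mean: E[max(X,0)] = mu*Phi(mu/sigma) + sigma*phi(mu/sigma).
# mu and sigma are illustrative assumptions.
mu, sigma = -0.005, 0.02

phi = lambda x: exp(-0.5 * x * x) / sqrt(2 * pi)
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))

analytical = mu * Phi(mu / sigma) + sigma * phi(mu / sigma)

# Monte Carlo check of the same quantity.
rng = np.random.default_rng(0)
draws = rng.normal(mu, sigma, size=2_000_000)
mc = np.maximum(draws, 0.0).mean()

assert abs(analytical - mc) < 1e-4
```

Note that the censored mean stays strictly positive even when the shadow mean is negative, which is exactly the mechanism that keeps model-implied yields above the bound.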

4.3.2 Computation of the Second Moment

Once we know the first moment of , it remains to evaluate

where the equality uses Fubini's Theorem and symmetry of the integrand. Since and , and are jointly normally distributed with mean , variances , and covariance (see Appendix A), we obtain from Lemma A.2 in Appendix A:

where , , , and denotes the decumulative bivariate normal cdf.

That is, we can compute analytically up to the *bivariate* normal cdf, and we can then integrate this expression numerically over and to obtain the second cumulant of .
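Lemma A.2 delivers this cross moment up to the bivariate normal cdf in full generality. In the standardized case (zero means, unit variances, correlation rho) there is a known closed form, which the sketch below verifies by Monte Carlo; the value of rho is illustrative:

```python
import numpy as np

# Cross-moment of two censored standard normals, E[max(X,0)*max(Y,0)],
# for correlation rho: a known closed form in the standardized case
# (general means/variances require the bivariate cdf, as in Lemma A.2).
rho = 0.6
closed = (np.sqrt(1 - rho**2) / (2 * np.pi)
          + rho / 4
          + rho * np.arcsin(rho) / (2 * np.pi))

# Monte Carlo check with correlated draws.
rng = np.random.default_rng(0)
z = rng.standard_normal((2_000_000, 2))
x = z[:, 0]
y = rho * z[:, 0] + np.sqrt(1 - rho**2) * z[:, 1]
mc = (np.maximum(x, 0) * np.maximum(y, 0)).mean()

assert abs(closed - mc) < 5e-3
```

Sanity limits: at rho = 0 the formula reduces to E[X+]E[Y+] = 1/(2*pi), and at rho = 1 it reduces to E[(X+)^2] = 1/2.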

The following steps summarize the approximation procedure for zero-coupon yields for a given set of parameters and state vector :

1. Compute the conditional mean and covariance matrix of , for , using (A.4) and (A.5).
2. Using the results from step 1, compute and , for , as described in Sections 4.3.1 and 4.3.2.
3. Integrate numerically to obtain , and integrate numerically to obtain .
4. Using the moments computed in step 3, approximate by or as defined in (19) and (20).

Note that the complexity of the algorithm does not depend on the number of yield curve factors , so it is not subject to the curse of dimensionality in the same way that some other methods are.^{9}
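The steps above can be sketched end-to-end in a one-factor Vasicek shadow-rate model. The sketch implements steps 1-3 for the first cumulant and the first-order approximation (19) only; the second cumulant would add a double integral involving the bivariate normal cdf. All parameter values are illustrative assumptions:

```python
import numpy as np
from math import erf, exp, sqrt, pi

# One-factor Vasicek shadow-rate sketch (first cumulant / first-order
# approximation only). Parameters are illustrative, not estimates.
kappa, theta, sigma, s0, tau = 0.3, 0.02, 0.015, -0.01, 5.0

phi = lambda x: exp(-0.5 * x * x) / sqrt(2 * pi)
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))

# Step 1: conditional mean and variance of the shadow rate at horizon u.
def moments(u):
    m = theta + (s0 - theta) * exp(-kappa * u)
    v = sigma ** 2 * (1 - exp(-2 * kappa * u)) / (2 * kappa)
    return m, v

# Step 2: censored mean E[max(s_u, 0)] via the Lemma A.1-type formula.
def censored_mean(u):
    m, v = moments(u)
    sd = sqrt(v) if v > 0 else 1e-12
    return m * Phi(m / sd) + sd * phi(m / sd)

# Step 3: integrate over the time dimension (trapezoid rule).
grid = np.linspace(0.0, tau, 501)
vals = np.array([censored_mean(u) for u in grid])
k1 = float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)))

# Step 4: first-order yield approximation (19).
y1 = k1 / tau

# Censoring can only raise the yield relative to the uncensored mean path.
shadow_mean = np.array([moments(u)[0] for u in grid])
y_shadow = float(np.sum(0.5 * (shadow_mean[1:] + shadow_mean[:-1])
                        * np.diff(grid))) / tau
assert y1 > 0.0 and y1 >= y_shadow
```

The cost is a one-dimensional quadrature regardless of the number of factors, which is the source of the dimension-independence noted above.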

For illustration purposes, in Appendix B I apply the second-order approximation method to estimate a three-factor shadow-rate model of the U.S. Treasury term structure.

5 Accuracy of Yield Approximation Methods

Section 4.3 argued intuitively that the second-order yield approximation (20) should be relatively precise. This section quantifies that claim. To get an initial sense of the relative numerical accuracy, I consider the stylized model used for illustration by Gorovoi and Linetsky (2004), and replicated for the same purpose in Krippner (2012). It is a one-factor model with , , , , and . Gorovoi and Linetsky (2004) derive model-implied yield curves for states corresponding to shadow short rates . The four panels of Figure 1 plot the model-implied yield curves for each initial state. Within each panel, I compare four different yield approximation
schemes: solving PDE (7) numerically (which, in a one-factor setting, is computationally feasible and can be considered the "exact" solution for comparison purposes), Krippner's (2012) approach described in Section 4.2, and the first- and second-order approximations proposed in Section 4.3. As the figure shows, the second-order approximation matches the exact PDE solution most closely, and consistently across states. The yield approximation error is uniformly less than one basis point. The first-order approximation generally overstates yields (an implication of the alternating nature of series expansion (18)), with approximation errors increasing both in yield maturity and in the level of the shadow short rate (in both cases, the first-order approximation is off by an increasingly large convexity adjustment arising from Jensen's inequality). Krippner's (2012) method generally undershoots yields, and is relatively more accurate when the shadow short rate is higher. The intuition can be seen by comparing the -measure expressions for the arbitrage-free forward rate (12) and Krippner's (2012) approximate forward rate (16): while both involve the same time short rate , Krippner's (2012) formula discounts by the *shadow* rate rather than the observed short rate. The discount factor therefore tends to be larger than it should be when is low, reducing the covariance between the discount factor and , and thus lowering the expectation of their product, .
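A one-factor version of the "exact" simulation benchmark is easy to sketch: draw exact-discretized OU paths for the shadow rate under the pricing measure, censor at zero, and average the resulting discount factors. All parameter values below are illustrative assumptions:

```python
import numpy as np

# Monte Carlo benchmark yield for a one-factor shadow-rate model:
# y(tau) = -(1/tau) * log E[exp(-int_0^tau max(s_u, 0) du)],
# with the shadow rate simulated as an exactly-discretized OU process.
kappa, theta, sigma, s0, tau = 0.3, 0.02, 0.015, -0.01, 5.0
n_paths, n_steps = 200_000, 250
h = tau / n_steps

rng = np.random.default_rng(0)
s = np.full(n_paths, s0)
integral = np.zeros(n_paths)
decay = np.exp(-kappa * h)
shock_sd = sigma * np.sqrt((1 - decay**2) / (2 * kappa))
for _ in range(n_steps):
    integral += np.maximum(s, 0.0) * h   # left-point rule for the time integral
    s = theta + (s - theta) * decay + shock_sd * rng.standard_normal(n_paths)

y_mc = -np.log(np.mean(np.exp(-integral))) / tau
assert 0.0 < y_mc < 0.05
```

The simulation error shrinks only at rate one over the square root of the number of paths, which is why simulation is useful as a benchmark but costly inside an estimation loop.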

To compare the relative performance of the different yield approximations in a more realistic empirical setting, I use the estimated model parameters and smoothed states from Appendix B.3 to compute model-implied yield curves for all dates in the sample.^{10} Since this is a three-factor model, solving PDE (7) numerically is no longer practicable. I therefore replace this benchmark by a simulated yield that consistently estimates the true yield based on randomly drawn short-rate paths per sample date.^{11},^{12} Table 1 shows the mean simulation error of , and the root-mean-square errors (RMSE) against of the yield computed by Krippner's (2012) method, the first-order approximation defined in (19), and the second-order approximation defined in (20). The table is divided into two panels. The top panel shows errors for the sub-sample Jan 1990-Nov 2008 (when interest rates were at normal levels), and the bottom panel shows errors for the sub-sample Dec 2008-Dec 2012 (when the lower bound on nominal yields was binding at the short end of the yield curve). All methods are generally more precise at shorter maturities. As the first column in both panels shows, the simulated yields are accurate to within approximately one fifth of a basis point at the ten-year maturity. As shown in the second column, Krippner's (2012) method produces ten-year yields that are accurate to about one basis point during normal times, and to within four basis points when the lower bound is binding. While the first-order method is more accurate when rates are low (the bottom panel), its errors remain substantial at the long end. The second-order method, the final column, produces ten-year yields that are accurate to approximately half a basis point, both during normal times and when the lower bound is binding.

To further illustrate the time-varying relative performance of the three approximation schemes, Figure 2 plots the difference over time between the simulated ten-year yield and the three approximated yields. Krippner's (2012) method and the second-order approximation appear to be equally precise in the first few years of the sample,

Table 1. Mean simulation error and RMSEs of approximated yields (basis points)

Maturity | Simulation error | Krippner (2012) | First-order | Second-order

*Sub-sample Jan 1990-Nov 2008*

6m | 0.04 | 0.04 | 0.05 | 0.04
1y | 0.06 | 0.06 | 0.18 | 0.06
2y | 0.09 | 0.10 | 0.88 | 0.10
3y | 0.12 | 0.14 | 2.26 | 0.13
4y | 0.14 | 0.18 | 4.22 | 0.15
5y | 0.16 | 0.27 | 6.60 | 0.17
7y | 0.19 | 0.47 | 12.22 | 0.21
10y | 0.21 | 0.93 | 21.81 | 0.35

*Sub-sample Dec 2008-Dec 2012*

6m | 0.01 | 0.01 | 0.01 | 0.01
1y | 0.02 | 0.04 | 0.04 | 0.02
2y | 0.05 | 0.19 | 0.33 | 0.05
3y | 0.07 | 0.51 | 1.07 | 0.07
4y | 0.10 | 0.94 | 2.31 | 0.09
5y | 0.12 | 1.42 | 3.99 | 0.12
7y | 0.15 | 2.43 | 8.38 | 0.23
10y | 0.17 | 3.87 | 16.63 | 0.52

with fluctuations presumably largely due to simulation error.^{13} The discrepancy between simulated yield and Krippner's (2012) method
increases over time as the level of yields declines, and exceeds five basis points by December 2012. The discrepancy between simulated yield and second-order approximation remains small and appears to show little systematic variation over time, perhaps trending up modestly towards the end of the
sample. The first-order approximation has a large negative discrepancy initially, which shrinks over time but remains at a high absolute level even at the end of the sample.

Figure 2 also confirms that, just like in the simple one-factor model in Figure 1, the approximation errors under Krippner's (2012) method and the first-order scheme are largely systematic (rather than mere noise), in that the first-order approximation overstates arbitrage-free yields while Krippner's (2012) method tends to understate them.

In sum, the analysis above suggests that, empirically, the second-order yield approximation is accurate to within about one half of a basis point at maturities up to ten years, both during normal times and when the lower bound is binding. The approximation error is one order of magnitude smaller than both the model-implied observation error in yields (see Table 3) and that of the next-best approximation method, proposed by Krippner (2012). In contrast, the first-order approximation is acceptable at most at the very short end of the yield curve.

To add perspective, the approximation error in is no greater than the fitting error commonly accepted in the derivation of constant-maturity zero-coupon bond yields from observed coupon-bearing Treasuries (e.g., Gürkaynak et al., 2006). This puts the second-order approximation roughly on par with the numerical accuracy achieved by "exact" bond pricing methods in standard affine models, to the extent that they rely on numerical methods (say, to solve the system of ODEs (9)-(10)).^{14}

6 Conclusion

This paper develops an approximation to arbitrage-free zero-coupon bond yields in Gaussian shadow-rate term structure models. The complexity of the scheme does not depend on the number of factors. Further, I demonstrate that the method is computationally feasible by estimating a three-factor shadow-rate model of the U.S. Treasury yield curve. Based on Monte Carlo simulation, I also show that the yield approximation is approximately as precise as conventional approaches that are considered to be "exact."

A. Mathematical Appendix

Consider the continuous-time stochastic process defined in (1). The following derivations hold under both the -measure and the -measure; hence, for notational simplicity, I suppress the dependence of moments and parameters on the measure. Since is a Gaussian process, all its finite-dimensional distributions are Gaussian (Karatzas and Shreve, 1991). In particular, for , the vectors are jointly conditionally Gaussian, with

If is invertible, the integral on the right-hand side of (A.2) can be evaluated analytically using integration by parts and formula (10.2.15) in Hamilton (1994),

where " " denotes the Kronecker sum. Since as defined in (2) is a linear function of the Gaussian random vector , it is itself Gaussian, with

and

This section derives two useful mathematical results involving the moments of censored Gaussian random variables.

Thus, it only remains to compute where is a standard normal random variable, for arbitrary . By direct computation of the integral defining the expectation,

where is the standard normal cdf. Further,

Recursively substituting from (A.8) into (A.7), and finally into (A.6), establishes the result.
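The recursion at work here is the standard integration-by-parts identity for upper-tail moments of a standard normal. A sketch with a Monte Carlo check (the function name and the evaluation point are illustrative):

```python
import numpy as np
from math import erf, exp, sqrt, pi

# Upper-tail moments M_n(a) = E[Z^n * 1{Z > a}] of a standard normal,
# via integration by parts:
#   M_0 = 1 - Phi(a),  M_1 = phi(a),
#   M_n = a^(n-1) * phi(a) + (n-1) * M_{n-2}.
phi = lambda x: exp(-0.5 * x * x) / sqrt(2 * pi)
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))

def tail_moments(a, n_max):
    M = [1 - Phi(a), phi(a)]
    for n in range(2, n_max + 1):
        M.append(a ** (n - 1) * phi(a) + (n - 1) * M[n - 2])
    return M

a = -0.4
M = tail_moments(a, 4)

# Monte Carlo check.
rng = np.random.default_rng(0)
z = rng.standard_normal(2_000_000)
for n in range(5):
    mc = np.where(z > a, z ** n, 0.0).mean()
    assert abs(M[n] - mc) < 0.03
```

Moments of the censored variable max(X, 0) follow by shifting and scaling, which is how the lemma's closed forms reduce everything to the univariate normal pdf and cdf.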

*If*

*where , , , denotes the univariate standard normal pdf, denotes the univariate standard normal cdf, denotes the bivariate normal pdf when both variables have zero means, unit variances, and correlation , and and denote the corresponding cumulative and decumulative bivariate Gaussian distribution functions, where in particular .*

The second and third terms can be evaluated using Lemma A.1. For the first term, it suffices to be able to compute for random variables and that are bivariate normal with zero means, unit variances, and correlation , and for arbitrary . Using the properties of , this expectation can be expanded as follows:

The first double integral is simply , the bivariate normal cdf. The second and third double integrals correspond to expected values of truncated bivariate normal random variables, and the last integral is the expected cross product of a truncated bivariate normal random vector. These expected values are known up to the univariate standard normal cdf and the bivariate normal cdf, respectively (see Rosenbaum, 1961). Using the formulas in Rosenbaum (1961) to evaluate these integrals and substituting back, we obtain the stated result after simplification.

B. Empirical Implementation

This appendix empirically estimates a three-factor Gaussian shadow-rate term structure model using the yield approximation methodology proposed in Section 4.3. The main purpose is to demonstrate the computational tractability of the method in the context of a realistic application. For more in-depth discussion and empirical analysis, see Kim and Priebsch (2013).

B.1 Data

I use end-of-month zero-coupon U.S. Treasury yields from January 1990 through December 2012, for maturities of 6 months, 1-5, 7, and 10 years. I derive the 6-month yield from the corresponding T-bill quote, while longer-maturity zero yields are extracted from the CRSP U.S. Treasury Database
using the unsmoothed Fama and Bliss (1987) methodology.^{15}

I augment the yield data with survey forecasts from Blue Chip, interpolated to constant horizons of 1-4 quarters (available monthly), as well as annually out to 5 years and for 5-to-10 years (available every six months).^{16} Model-implied survey forecasts are subject to the same lower-bound constraint as yields,^{17} but their computation is substantially simpler: forecasters report their expectation of the *arithmetic mean* of future observed short rates, . This is exactly (19) with the data-generating measure in place of the pricing measure . Therefore, the first-order method described in Section 4.3 produces *exact* model-implied survey forecasts. Intuitively, unlike yields, survey forecasts are not subject to compounding, so there are no higher-order Jensen's inequality terms to consider.

B.2 Filtering and Estimation

Since the statistical properties of the term structure model laid out in Section 2 are formulated in terms of the latent state vector , but the data actually observed by the econometrician consist of yields, , and survey expectations, (see Appendix B.1), I set up a joint estimation and filtering problem to obtain estimates of the model's parameters .^{18} When discretely sampled at intervals , the state vector follows a first-order Gaussian vector autoregression,

where , and , , and can be computed from (A.1) and (A.2). Equation (B.1) represents the *transition equation* of the filtering problem.
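The exact discretization can be sketched numerically. The mean-reversion matrix, long-run mean, and volatility matrix below are illustrative assumptions (the paper's symbols are dropped in this version), and the shock covariance uses the Kronecker-sum formula referenced in Appendix A:

```python
import numpy as np
from scipy.linalg import expm

# Exact discretization of an OU state process to the VAR(1) transition
# equation (B.1): X_{t+dt} = mu_d + Phi @ X_t + eps. Illustrative values.
dt = 1 / 12                                   # monthly sampling interval
K = np.array([[0.8, 0.0, 0.0],
              [0.3, 0.5, 0.0],
              [0.1, -0.2, 0.1]])              # mean-reversion matrix
theta = np.array([0.02, 0.0, -0.01])          # long-run mean
Sig = 0.01 * np.tril(np.ones((3, 3)))         # volatility matrix

Phi = expm(-K * dt)                           # autoregressive matrix
mu_d = (np.eye(3) - Phi) @ theta              # intercept

# Shock covariance Q = int_0^dt exp(-K s) Sig Sig' exp(-K' s) ds,
# in closed form via the Kronecker sum (cf. (A.2)).
V = Sig @ Sig.T
KS = np.kron(K, np.eye(3)) + np.kron(np.eye(3), K)   # Kronecker sum
vecQ = np.linalg.solve(KS, (np.eye(9) - expm(-KS * dt)) @ V.reshape(-1, order="F"))
Q = vecQ.reshape(3, 3, order="F")

# Cross-check against brute-force numerical integration (midpoint rule).
s_grid = np.linspace(0, dt, 2001)
Q_num = np.zeros((3, 3))
for s0, s1 in zip(s_grid[:-1], s_grid[1:]):
    E = expm(-K * 0.5 * (s0 + s1))
    Q_num += E @ V @ E.T * (s1 - s0)

assert np.allclose(Q, Q_num, atol=1e-10)
```

The closed form requires the Kronecker sum to be invertible, which holds here because the assumed mean-reversion matrix has strictly positive eigenvalues.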

Next, denote by the (non-linear) mapping from states and parameters to model-implied yields , and by the analogous mapping from states and parameters to model-implied survey forecasts . For estimation purposes, I compute through the second-order approximation (20), and through the exact first-order method discussed in Appendix B.1. To simplify notation, denote the stacked mapping by . If we assume that all yields and survey expectations are observed with iid additive Gaussian errors, we obtain the *observation equation*

Together, equations (B.1) and (B.2) form a non-linear filtering problem.

The simple (linear) Kalman filter--optimal when the transition and observation equations are linear and all shocks are Gaussian--has been modified in a number of ways to accommodate non-linearity as in (B.2). The *unscented Kalman filter*, proposed by Julier et al. (1995), aims to deliver improved accuracy and numerical stability relative to the more traditional *extended Kalman filter*, without substantially increasing the computational burden.^{19},^{20} The algorithm is described in detail in Wan and van der Merwe (2001). As a by-product of the filtering procedure, it conveniently produces estimates of the mean and covariance matrix of conditional on the econometrician's information set as of time . I use these to set up a quasi-maximum likelihood function based on (B.2),^{21} which I maximize numerically to obtain estimates of the parameters as well as their asymptotic standard errors (following Bollerslev and Wooldridge, 1992).
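At the heart of the filter is the unscented transform: deterministic sigma points that capture the mean and covariance of the state through a nonlinear mapping without differentiating it. A sketch using the common (alpha, beta, kappa) parameterization from the UKF literature; the linear-map check at the end exploits the fact that the transform is exact for linear mappings:

```python
import numpy as np

# Unscented transform: propagate mean m and covariance P of a Gaussian
# state through a (possibly nonlinear) mapping g via sigma points.
# The mapping below is an illustrative stand-in for the yield function.
def unscented_transform(m, P, g, alpha=0.1, beta=2.0, kappa_u=0.0):
    n = len(m)
    lam = alpha**2 * (n + kappa_u) - n
    L = np.linalg.cholesky((n + lam) * P)            # scaled matrix square root
    pts = [m] + [m + L[:, i] for i in range(n)] + [m - L[:, i] for i in range(n)]
    wm = np.full(2 * n + 1, 1 / (2 * (n + lam)))     # mean weights
    wc = wm.copy()                                   # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    Y = np.array([g(p) for p in pts])
    mean = wm @ Y
    dev = Y - mean
    cov = (wc[:, None] * dev).T @ dev
    return mean, cov

# Sanity check: for a linear mapping the transform is exact.
m = np.array([0.01, -0.02, 0.005])
P = np.diag([1e-4, 2e-4, 5e-5])
A = np.array([[1.0, 0.5, 0.2], [0.0, 1.0, -0.3]])
mean, cov = unscented_transform(m, P, lambda x: A @ x)
assert np.allclose(mean, A @ m)
assert np.allclose(cov, A @ P @ A.T)
```

For the nonlinear yield mapping the transform is no longer exact, but it matches moments to higher order than the first-derivative linearization used by the extended Kalman filter.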

B.3 Estimation Results

To achieve econometric identification of the parameters in light of invariant transformations resulting in observationally equivalent models with different parameters (see Dai and Singleton, 2000), I follow Joslin et al. (2011) and impose the normalizations , , is diagonal and therefore completely determined by its ordered eigenvalues , and is lower triangular.

I estimate the model on the data set described in Appendix B.1, using the quasi-maximum likelihood (QML) procedure discussed in Appendix B.2. Table 2 displays the estimated model parameters , as well as their asymptotic standard errors.

Table 2. QML parameter estimates (asymptotic standard errors in parentheses)

Parameter | Estimate | Std. error
| 0.0738 | (0.0043)
| 0.0010 | (0.0001)
vector, row 1 | -0.1038 | (0.0226)
vector, row 2 | -0.3566 | (0.1177)
vector, row 3 | -0.8574 | (0.2876)
vector, row 1 | -0.0193 | (0.0052)
vector, row 2 | -0.0099 | (0.0224)
vector, row 3 | 0.0278 | (0.0233)
matrix, row 1, column 1 | 0.0268 | (0.0084)
matrix, row 2, column 1 | -0.0324 | (0.0110)
matrix, row 2, column 2 | 0.0416 | (0.0295)
matrix, row 3, column 1 | 0.0068 | (0.0100)
matrix, row 3, column 2 | -0.0397 | (0.0302)
matrix, row 3, column 3 | -0.0090 | (0.0007)
matrix, row 1, column 1 | -0.4679 | (0.1574)
matrix, row 1, column 2 | -0.3415 | (0.1490)
matrix, row 1, column 3 | 0.3785 | (0.6798)
matrix, row 2, column 1 | -0.5752 | (0.6303)
matrix, row 2, column 2 | -1.1881 | (1.1335)
matrix, row 2, column 3 | -1.1875 | (0.5740)
matrix, row 3, column 1 | 0.8908 | (0.6982)
matrix, row 3, column 2 | 1.3060 | (1.0641)
matrix, row 3, column 3 | 0.3990 | (1.3561)

Table 3 shows the QML-estimated standard deviations of the measurement errors in yields and survey variables ( in equation (B.2)). The average yield error is 8 basis points, and the average error in surveys is 21 basis points. For both yields and surveys, errors follow a U-shaped pattern, being largest at the short and long ends.

Figure 3 plots the model-implied shadow short rate over the sample period, based on the states implied by the Kalman smoother (that is, incorporating all information up to December 2012, the end of the sample). The shadow rate turned negative in December 2008, after the FOMC established a target federal funds rate range of 0 to 0.25 percent and the effective lower bound became binding, and has stayed negative through the end of the sample.

Table 3. QML-estimated standard deviations of measurement errors

*Yields*

Maturity | Std. dev.
6m | 0.0017
1y | 0.0014
2y | 0.0006
3y | 0.0003
4y | 0.0004
5y | 0.0003
7y | 0.0006
10y | 0.0015
Average | 0.0008

*Survey forecasts*

Horizon | Std. dev.
1q | 0.0014
2q | 0.0002
3q | 0.0009
4q | 0.0014
2y | 0.0028
3y | 0.0026
4y | 0.0027
5y | 0.0031
5y-10y | 0.0034
Average | 0.0021

Bauer, M., and Rudebusch, G. (2013), "Monetary Policy Expectations at the Zero Lower Bound," Working Paper, Federal Reserve Bank of San Francisco

Bernanke, B., Reinhart, V., and Sack, B. (2004), "Monetary Policy Alternatives at the Zero Bound: An Empirical Assessment," Brookings Papers on Economic Activity, 2:1-100

Black, F. (1995), "Interest Rates as Options," Journal of Finance, 50(5):1371-1376

Bollerslev, T., and Wooldridge, J. (1992), "Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time-Varying Covariances," Econometric Reviews, 11(2):143-172

Chen, R. (1995), "A Two-Factor, Preference-Free Model for Interest Rate Sensitive Claims," Journal of Futures Markets, 15(3):345-372

Christensen, J., and Rudebusch, G. (2013), "Estimating Shadow-Rate Term Structure Models with Near-Zero Yields," Working Paper, Federal Reserve Bank of San Francisco

Christoffersen, P., Dorion, C., Jacobs, K., and Karoui, L. (2012), "Nonlinear Kalman Filtering in Affine Term Structure Models," CREATES Research Paper 2012-49, Aarhus University

Dai, Q., and Singleton, K. (2000), "Specification Analysis of Affine Term Structure Models," Journal of Finance, 55(5):1943-1978

Duan, J.-C., and Simonato, J.-G. (1999), "Estimating and Testing Exponential-Affine Term Structure Models by Kalman Filter," Review of Quantitative Finance and Accounting, 13:111-135

Duffie, D., and Kan, R. (1996), "A Yield-Factor Model of Interest Rates," Mathematical Finance, 6:369-406

Fama, E., and Bliss, R. (1987), "The Information in Long-Maturity Forward Rates," American Economic Review, 77(4):680-692

Genz, A. (2004), "Numerical Computation of Rectangular Bivariate and Trivariate Normal and t Probabilities," Statistics and Computing, 14:251-260

Gorovoi, V., and Linetsky, V. (2004), "Black's Model of Interest Rates as Options, Eigenfunction Expansions and Japanese Interest Rates," Mathematical Finance, 14(1):49-78

Hamilton, J. (1994), Time Series Analysis, Princeton University Press

Ichiue, H., and Ueno, Y. (2007), "Equilibrium Interest Rate and the Yield Curve in a Low Interest Environment," Bank of Japan Working Paper

Ichiue, H., and Ueno, Y. (2013), "Estimating Term Premia at the Zero Bound: An Analysis of Japanese, U.S., and U.K. Yields," Bank of Japan Working Paper No. 13-E-8

Joslin, S., Singleton, K., and Zhu, H. (2011), "A New Perspective on Gaussian Dynamic Term Structure Models," Review of Financial Studies, 24(3):926-970

Julier, S., Uhlmann, J., and Durrant-Whyte, H. (1995), "A New Approach for Filtering Nonlinear Systems," in Proceedings of the American Control Conference, pp. 1628-1632

Karatzas, I., and Shreve, S. (1991), Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics Series, Springer, London

Kim, D., and Orphanides, A. (2005), "Term Structure Estimation with Survey Data on Interest Rate Forecasts," Staff Working Paper 2005-48, Federal Reserve Board

Kim, D., and Priebsch, M. (2013), "Estimation of Multi-Factor Shadow-Rate Models," Working Paper in Progress, Federal Reserve Board, Washington D.C.

Kim, D., and Singleton, K. (2012), "Term Structure Models and the Zero Bound: An Empirical Investigation of Japanese Yields," Journal of Econometrics, 170(1):32-49

Krippner, L. (2012), "Modifying Gaussian Term Structure Models When Interest Rates are Near the Zero Lower Bound," Reserve Bank of New Zealand Discussion Paper 2012/02

Lund, J. (1997), "Non-Linear Kalman Filtering Techniques for Term Structure Models," Working Paper, Aarhus School of Business

Piazzesi, M. (2010), "Affine Term Structure Models," in Handbook of Financial Econometrics, ed. by Y. Ait-Sahalia and L. P. Hansen, North-Holland, Amsterdam

Rosenbaum, S. (1961), "Moments of a Truncated Bivariate Normal Distribution," Journal of the Royal Statistical Society, 23(2):405-408

Severini, T. (2005), Elements of Distribution Theory, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press

Wan, E., and van der Merwe, R. (2001), "The Unscented Kalman Filter," in Kalman Filtering and Neural Networks, ed. by S. Haykin, pp. 221-280

Wu, S. (2010), "Non-Linear Filtering in the Estimation of a Term Structure Model of Interest Rates," WSEAS Transactions on Systems, 9(7):724-733

* Federal Reserve Board, Washington D.C., `[email protected]`. The analysis and conclusions set forth in this paper are those of the author and do not indicate concurrence by other members of the research staff or the Board of Governors of the Federal Reserve System. Return to Text

1. The integral in (15) has the same form as (A.2) in Appendix A and therefore can be computed analytically as in (A.3). Return to Text

2. Krippner (2012) motivates his derivation in terms of options on shadow bonds. To derive (16) from his equations (12) and (13), replace the call option price by
Then interchange the limit operations and expectation, and evaluate. Return to Text

3. Krippner (2012) uses the parametrization proposed by Chen (1995) and hence obtains his formula (31) for as a special case of (15). Similarly, the results derived by Christensen and Rudebusch (2013) are (essentially) special cases of (15) and (17) under their Nelson-Siegel parametrization (where some of the derivations must be modified appropriately to account for the fact that their matrix is singular). Don Kim (personal communication) independently derives the general expression for in (15) by generalizing Krippner's (2012) computations directly. Return to Text

4. The cumulant-generating function of a random variable is defined as the logarithm of its moment-generating function (for example, see Severini, 2005). Return to Text
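In symbols (the notation K_X, M_X, and kappa_n below is the standard convention, not taken from the paper's own equations):

```latex
\[
  K_X(u) \;=\; \log M_X(u) \;=\; \log \mathbb{E}\!\left[e^{uX}\right],
  \qquad
  X \sim N(\mu,\sigma^2) \;\Longrightarrow\; K_X(u) \;=\; \mu u + \tfrac{1}{2}\sigma^2 u^2,
\]
```

so that the $n$-th cumulant is $\kappa_n = K_X^{(n)}(0)$; in the Gaussian case $\kappa_1 = \mu$, $\kappa_2 = \sigma^2$, and all cumulants of order three and higher vanish.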

5. Higher-order approximations following the same general logic are possible, but they are increasingly computationally costly while generating decreasing marginal benefits in terms of precision. Return to Text

6. The third- and higher-order cumulants of a Gaussian random variable are zero, so that (20) coincides with the usual affine-Gaussian yield formula in that case. Return to Text

7. Matlab's built-in function `mvncdf` uses adaptive numerical integration to compute the bivariate normal cdf. This is slower by several orders of magnitude than the algorithm I use. Return to Text
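For readers working outside Matlab, SciPy's multivariate normal cdf wraps Genz's routines and can be checked against the closed form Phi_2(0, 0; rho) = 1/4 + arcsin(rho)/(2*pi); this is an illustrative sketch, not the paper's implementation:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Bivariate standard normal cdf at the origin with correlation rho.
rho = 0.5
cov = np.array([[1.0, rho], [rho, 1.0]])
p = multivariate_normal.cdf([0.0, 0.0], mean=[0.0, 0.0], cov=cov)

# Closed form: 1/4 + arcsin(rho)/(2*pi), which equals 1/3 for rho = 0.5.
closed_form = 0.25 + np.arcsin(rho) / (2.0 * np.pi)
assert abs(p - closed_form) < 1e-4
```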

8. That is, without `for`-loops that grow with the number of quadrature points, states, dates, or maturities in the sample. Return to Text

9. The general approximation methodology I propose has its own curse of dimensionality in that the second-order approximation is substantially more computationally involved than the first-order approximation (and any higher-order approximation will be substantially more involved than the second-order approximation). In practice, the second-order approximation appears to strike an acceptable balance between precision and computational complexity for many cases of interest; see Section 5 below. Return to Text

10. The sample consists of end-of-month observations from January 1990 through December 2012. Return to Text

11. I simulate (1) under based on moments (A.1)-(A.2) on a uniformly-spaced grid with . For each simulated path , I compute by the trapezoidal method. I then define and . The simulation error for is computed as times the sample standard deviation, and the simulation error for is derived from the simulation error for by the delta method. Return to Text
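The simulation scheme described in this footnote can be sketched for a hypothetical one-factor shadow-rate model (the paper's model has three factors, and the parameters kappa, theta, sigma, and x0 below are made up for illustration):

```python
import numpy as np

# Simulate shadow-rate paths, censor at zero, integrate the short rate by the
# trapezoidal method, and average discount factors to price a zero-coupon bond.
rng = np.random.default_rng(0)
kappa, theta, sigma = 0.1, 0.03, 0.01      # hypothetical risk-neutral dynamics
x0, T, n_steps, n_paths = -0.01, 5.0, 500, 20_000
dt = T / n_steps

# Exact discretization of the Ornstein-Uhlenbeck shadow-rate process.
phi = np.exp(-kappa * dt)
cond_std = sigma * np.sqrt((1.0 - phi**2) / (2.0 * kappa))
x = np.empty((n_paths, n_steps + 1))
x[:, 0] = x0
for i in range(n_steps):
    x[:, i + 1] = theta + phi * (x[:, i] - theta) + cond_std * rng.standard_normal(n_paths)

r = np.maximum(x, 0.0)                                     # censor at zero
integral = 0.5 * dt * (r[:, :-1] + r[:, 1:]).sum(axis=1)   # trapezoidal rule
discount = np.exp(-integral)

price = discount.mean()                 # Monte Carlo bond price
y = -np.log(price) / T                  # continuously compounded yield
# Simulation error: n^{-1/2} times the sample std of the discount factors,
# mapped into yield units by the delta method (dy/dP = -1/(T*P)).
se_price = discount.std(ddof=1) / np.sqrt(n_paths)
se_yield = se_price / (T * price)
```

Censoring at zero guarantees a nonnegative yield here even though the shadow rate starts below zero, which is exactly the property that distinguishes the shadow-rate model from its unconstrained Gaussian counterpart.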

12. The simulation takes several hours to complete for the given parameter estimates and smoothed states. This approach would not, therefore, be feasible as part of an estimation strategy. Return to Text

13. Recall that when there is no lower bound, both methods produce yields equal to the exact arbitrage-free yield. In the early 1990s, the overall level of yields was sufficiently high for the effect of the lower bound to be negligible. Return to Text

14. Empirically, the second-order approximation is exact to an absolute tolerance of approximately (see Table 1). The default tolerance for numerical methods in Matlab is typically between and , depending on the complexity of the method. Return to Text

15. I am grateful to Anh Le for providing the code for this procedure. Return to Text

16. As discussed by Kim and Orphanides (2005), this potentially leads to more precise estimates of the parameters governing the data-generating distribution . Return to Text

17. This follows from equivalence of the measures and , and more fundamentally from the absence of arbitrage. Return to Text

18. See Duan and Simonato (1999) for an early reference discussing this approach towards term structure model estimation. Return to Text

19. A detailed treatment of the unscented Kalman filter, and a comparison to the extended Kalman filter, can be found in Wan and van der Merwe (2001). Return to Text
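The sigma-point idea at the heart of the unscented Kalman filter can be illustrated with a minimal unscented transform; the tuning parameters alpha, beta, and kappa follow the standard convention of Wan and van der Merwe (2001), and the linear test function is chosen only because the transform is exact for linear maps:

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate a Gaussian (mean, cov) through a nonlinearity f via sigma points."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)          # S @ S.T = (n+lam) * cov
    pts = np.vstack([mean, mean + S.T, mean - S.T])  # 2n+1 sigma points
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    y = np.array([f(p) for p in pts])                # push points through f
    y_mean = wm @ y
    d = y - y_mean
    y_cov = (wc[:, None] * d).T @ d
    return y_mean, y_cov

# Sanity check: for a linear map the transform reproduces the exact moments.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
m, C = unscented_transform(np.array([1.0, 2.0]), np.eye(2), lambda x: A @ x)
assert np.allclose(m, [2.0, 6.0], atol=1e-6)
assert np.allclose(C, A @ A.T, atol=1e-6)
```

In filtering applications, the same transform is applied to the state-transition and measurement functions in place of the extended Kalman filter's linearization.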

20. Christoffersen et al. (2012) and Wu (2010) confirm that the unscented Kalman filter performs better than the extended Kalman filter in the specific setting of term structure model estimation. Return to Text

21. This estimation approach is described and analyzed in Lund (1997). Return to Text