The Federal Reserve Board eagle logo links to home page

Skip to: [Printable Version (PDF)] [Bibliography] [Footnotes]
Finance and Economics Discussion Series: 2008-31 Screen Reader version

Zero Bound, Option-Implied PDFs, and
Term Structure Models

Don H. Kim *



Keywords: Zero bound, term structure of interest rates, options, probability density functions

Abstract:

This paper points out that several known ways of modeling non-negative nominal interest rates lead to different implications for the risk-neutral distribution of the short rate that can be checked with options data. In particular, Black's boundary models ("interest rates as options") imply a probability density function (pdf) that contains a Dirac delta function and a cumulative distribution function (cdf) that is nonzero at the zero boundary, while models like the CIR and positive-definite quadratic-Gaussian (QG) models have a zero cdf at the boundary. Eurodollar futures options data are found to favor Black's boundary models: the CIR/QG models, even multifactor versions, have difficulty capturing option prices accurately not only in low interest rate environments but also in higher interest rate environments, and data in early 2008 provide an almost tangible signature of the Dirac delta function in Black's boundary pdf models. Options data also contradict the prediction of well-known models whose cdf is zero at the zero boundary, namely that the risk-neutral pdf is always positively skewed.




1 Introduction

It is well known, at least since Breeden and Litzenberger (1978), that options at a broad range of strikes can provide information about the whole risk-neutral distribution of the underlying security prices, not just mean and variance. Policy makers and market participants have utilized this fact to deduce the risk-neutral distribution of short-term interest rates from eurodollar futures options or federal funds futures options, and many studies have explored techniques that are useful in this regard and discussed their applications to various settings.1However, this literature so far has not made much connection with the vast literature on dynamic term structure models, i.e., relatively little work has been done to investigate what the option-implied risk-neutral distribution tells about the underlying stochastic process of interest rates and whether known term structure models can capture the risk neutral distribution implicit in option prices.2 The present paper is a contribution toward filling this gap.

We shall be focusing in particular on the the zero boundary behavior of the short rate and its ramifications for term structure models. Perhaps the best known model that recognizes the presence of the zero bound is the CIR model, written down by Cox, Ingersoll, and Ross in their seminal paper (1985). This type of model has the feature that the cumulative distribution function (cdf) of the short rate vanishes as the short rate approaches zero. A different treatment of zero-boundary behavior was proposed by Fischer Black in his last paper (1995), in which the nominal short rate is modeled as  r_t=\max[x_t,0], where  x_t is a "shadow-rate" process that can go below zero. As we shall see, this type of model implies a short rate cdf that is nonzero at the zero boundary. Proper modeling of the behavior near the zero boundary (e.g., whether it is better described as CIR-like, Black-like, or something else) is a basic and historically important problem in term structure modeling, but it has significance beyond that, as different boundary behaviors could signify different underlying macroeconomic mechanism of the interest rate determination.

In the case of Japan, studies by Gorovoi and Linetsky (2004) and Ueno, Baba, and Sakurai (2006) using yields data give strong support for Black-type models. However, some might argue that this conclusion does not necessarily carry over to the postwar US economy.3 Indeed, non-negative interest rate modeling with US data has so far been predominantly in terms of models whose short rate cdf is zero at the zero boundary, e.g., the BGM model (Brace, Gutarek, and Musiela (1997)), the multifactor CIR model, the quadratic-Gaussian (QG) model, and the non-affine model of Ahn and Gao (1999).4As the US short rate in the past 50 years has not been low enough to see a significant effect of the zero bound on the yield curve,5the distinction between the two types of models is difficult to discern from yields data. A key point of the present paper is that data on out-of-money options can help, as the models may imply strong enough differences in risk-neutral distributions, especially near the zero bound. Recently (early 2008), short-term interest rates have come down to a fairly low level while uncertainty about interest rates has remained relatively high, thus it is particularly interesting to explore what the options data from this episode tell about term structure models. Beyond fundamental theoretical interest, modeling of risk-neutral distribution in such environment is an important practical problem for market participants as well.

This paper makes largely two contributions. The first is to explore the implications of well-known and tractable term structure models that respect the non-negativity condition for nominal interest rates (e.g., the multifactor positive-definite CIR/QG models) for the risk-neutral distribution of the short rate. In particular, this paper derives the asymptotic form of their risk-neutral probability density function (pdf) at vanishing interest rates (relevant for the discussion of the behavior near the zero boundary), which does not seem to have been explored in the literature. This paper also develops a method for fast computation of option prices for the risk-neutral distribution implied by the multifactor CIR model and (positive-definite) QG model, and explores how well these models perform in matching the observed out-of-money eurodollar futures option prices. In addition, this paper points out that known models which have zero cdf at the zero boundary imply a fairly strong restriction on the shape of the distribution, namely that the distribution is always positively skewed, and examines whether this prediction is born out in the options data. The second contribution is to examine the implications of Black-type boundary models for option prices and option-implied pdfs. After deriving the risk-neutral pdf of the simplest Black's boundary model ("Black-Vasicek model"), this paper develops flexible parametric models of risk-neutral pdf that are consistent with Black's boundary behavior and lead to convenient calculation of option prices, and explore their practical performance.

The main findings are as follows. A flexible parametric pdf model with Black's boundary behavior (normal-mixture shadow rate model) is found to perform quite well in matching the option prices across various strikes in a variety of interest rate environment from 1998 to 2008. The model produces implied prices of the 1-year-maturity 0.5%-strike eurodollar futures option in February and March of 2008 that agree fairly well with actual settlement prices. On the other hand, the traditional lognormal-mixture pdf model and 2-factor and 3-factor (positive-definite) CIR/QG models substantially underprice this "low-strike" put option. The CIR/QG models have difficulty matching the option prices accurately not only in low interest rate environments but also in higher interest rate environments, often generating large and biased pricing errors. In addition, options data contradict these models' prediction that the skewness of the pdf is always positive.

The remainder of this paper is organized as follows. Section 2 sets up the stage for the later parts of the paper, discussing empirically important zero boundary behaviors and reviewing the risk-neutral pdf extraction techniques. It also introduces a mathematical object called the Dirac  \delta-function, which seldom appears in finance literature but is needed for the discussion of the pdfs with Black's boundary behavior. Section 3 examines the pdfs and option prices in one-factor models, comparing Black's boundary model with CIR/QG models. Section 4 examines richer pdf models and develops useful techniques for them; the application of these models to actual options data and the empirical results are discussed in Section 5. Section 6 concludes.


2 Preliminary discussion

2.1 "Zero-cdf model" vs Black's model

Nominal interest rates cannot go below zero. The best known model that respects this condition is the CIR model. This model is nested by the so-called CEV model6

\displaystyle dr_t=\kappa (\theta - r_t)dt + \sigma r_t^{\alpha}dB_t, (1)

which admits two kinds of boundary behavior: when  \alpha > 1/2, the short rate  r_t does not reach zero, i.e., zero is an inaccessible boundary (also called unattainable boundary). When  \alpha < 1/2, the short rate  r_t can reach zero; in this case, zero is an accessible boundary. The CIR model (  \alpha=1/2) can display both behaviors depending on the parameters. More specifically, if the Feller condition
\displaystyle k\theta/\sigma^2 > 1/2 (2)

is satisfied, the zero boundary is inaccessible, and if not, it is accessible.

In the accessible boundary case (  \alpha < 1/2, or  \alpha=1/2 and  k\theta/\sigma^2 < 1/2), zero is a regular boundary (according to the terminology of Feller (1952)),7 and additional conditions at the boundary can be imposed. For example, requiring that  r_t stay at zero forever after it hits the zero boundary amounts to having an absorbing boundary. As the permanent stay of the short rate at zero would not be a promising description of the actual economy, we shall not consider the absorbing boundary scenario further in this paper, and instead focus on the regular "unrestricted boundary behavior", after Longstaff (1992).8

Most of the non-negative interest rate models applied to the US data have been either inaccessible boundary models or regular unrestricted boundary models. The former include the CIR model with  k\theta/\sigma^2 > 1/2, the BGM model, and Ahn and Gao (1999)'s non-affine model. The latter include the CIR model with  k\theta/\sigma^2 < 1/2 and (positive-definite) quadratic-Gaussian model (Beaglehole and Tenney (1992)). All these models (i.e., both the inaccessible boundary models and regular unrestricted boundary models) share the feature that the cdf of the conditional distribution of the short rate is zero at the zero boundary. Therefore, in this paper we shall refer to these models collectively as "zero-cdf models", for brevity.

Let us now discuss Black's boundary models. As the short rate in these models can stay at the zero boundary for an extended period, they have the feature that a finite probability mass is concentrated at the point  r=0; this means that the probability density function of the conditional distribution,  f(r), is infinite at  r=0. Mathematically,

\displaystyle {\rm Prob}(r=0)=\lim_{\epsilon\rightarrow 0^+}\int_{-\epsilon}^{\epsilon}f(r)dr= w > 0, \hspace{0.4cm} \Rightarrow f(r=0)=\infty. (3)

A natural way to describe such a distribution is to use the Dirac  \delta-function, which is defined as9
\displaystyle \delta(x)=\left\{ \begin{matrix}0 \hspace{0.3cm} {\rm if} \,\, x\neq 0 \\ \infty \hspace{0.3cm} {\rm if}\,\, x= 0, \end{matrix} \right. \hspace{1.5cm} \int_{-\infty}^{\infty}\delta(x)dx = 1. (4)

It has the property  \int_{-\infty}^{\infty} h(x) \delta(x-a) dx = h(a), where  h(\cdot) is an arbitrary function, and it can be thought of as the pdf of a normal distribution (or any other distribution) with a vanishing variance, i.e.,
\displaystyle \delta(x-a)=\lim_{\epsilon\rightarrow 0} \frac{1}{\sqrt{2\pi}\,\epsilon}\exp\left(-\frac{(x-a)^2}{2\epsilon^2}\right). (5)

Using the  \delta-function, the pdf of Black's boundary models can be expressed as10

\displaystyle f(r)= w \,\delta(r) + \tilde{f}(r) (6)

where  \tilde{f}(r) is an improper probability density function with support at  [0,\infty] (i.e.,
 \int_{-\infty}^{\infty}\tilde{f}(r)dr=\int_{0}^{\infty} \tilde{f}(r) dr=1-w < 1).

The cumulative distribution function of the short rate,  F(r)=\int_{-\infty}^r f(s)ds, takes the form

\displaystyle F(r)= w+\int_{0}^r \tilde{f}(s)ds (7)

(for  r > 0). Note that the presence of the  \delta-function in the pdf translates to a nonzero cdf at the zero boundary (more precisely,  \lim_{r\rightarrow 0^+}F(r) \neq 0).

2.2 Option-implied risk-neutral pdfs

Consider a eurodollar futures option with strike  K and maturity  \tau.11Let  \tilde{r} denote the futures rate at the maturity of the option, and let  f_{\tau}(\tilde{r}) be the risk-neutral pdf of  \tilde{r}. The put option price  P_{\tau}(K) and the call option price  C_{\tau}(K) can be expressed in terms of  f_{\tau} as

\displaystyle P_{\tau}(K) \displaystyle = \displaystyle e^{-r_f\tau}\int_{-\infty}^K (K-\tilde{r})f_{\tau} (\tilde{r}) d\tilde{r}, (8)
\displaystyle C_{\tau}(K) \displaystyle = \displaystyle e^{-r_f\tau}\int_{K}^{\infty} (\tilde{r}-K)f_{\tau}(\tilde{r}) d\tilde{r},  

where the prefactor  e^{-r_f\tau} denotes the price of a zero-coupon bond maturing  \tau periods hence.12In the case of the eurodollar futures options, the futures rate at the option maturity is the 3-month LIBOR. For simplicity, we shall treat this rate as the short rate of interest, and drop the "tilde" symbol from  \tilde{r}.

Taking successive derivatives of both sides of eq. (8) gives a simple formula for the pdf,

$\displaystyle f(K) = e^{r_f\tau}P''(K)= e^{r_f\tau} C''(K),$ (9)

where the double prime notation denotes second derivatives; the maturity index  \tau has been dropped for notational simplicity. In practice, options are available at discrete values of  K (say,  K_1^p,K_2^p,... for put options and  K_1^c,K_2^c,... for call options), and observed prices could contain some errors. Taking numerical second derivatives (with an interpolated function) to obtain  f(r) typically leads to unrobust results.13Therefore, a common practice is to assume a flexible functional form for  f(r;\gamma), where  \gamma collectively denotes the parameters of  f, and solve the least squares problem
\displaystyle \hat{\gamma} \displaystyle = \displaystyle {\rm argmin} \,\zeta(\gamma), (10)
\displaystyle \zeta(\gamma) \displaystyle = \displaystyle \sum_{i=1}^{n_p}(P^{ob}(K_i^p)-P(K_i^p))^2+ \sum_{i=1}^{n_c}(C^{ob}(K_i^c)-C(K_i^c))^2+ ({\rm F}^{ob} - {\rm F})^2  

where the superscript  ob denotes the observed values. Including the futures pricing error helps to pin down the mean of the distribution; note that the model-implied current futures value F is  E(r)= \int_{-\infty}^{\infty}r\, f(r;\gamma)dr. This least squares problem amounts to minimizing the pricing errors in absolute terms. Because options far out of money tend to have very low prices, this metric in effect substantially discounts the information in such options, as compared to near-the-money options. One could also consider minimizing the pricing errors in relative terms (i.e., replacing  (P^{ob}-P)^2 in eq. (10) by  (P-P^{ob})^2/P^{ob\, 2}, etc.), but this is not advisable, as the farther out-of-money options prices are expected to be less reliable, due to their lower liquidity and more pronounced market microstructural effects (minimum tick sizes, etc.).

There are many possibilities for parametrizing  f(r;\gamma). The general wisdom is that a flexible parametric form with 4 to 6 parameters is sufficient to match observed option prices at various strikes.14 A parametrization substantially richer than this can often lead to strange implications at minimal improvements in the fit of option prices (overfitting). Perhaps the most popular parametric form is the "mixture of lognormals":15

\displaystyle f(r) = \beta \frac{1}{\sqrt{2\pi}\,\sigma_1 r}\exp\left(-\frac{(\log r -\mu_1)^2}{2\sigma_1^2}\right) +(1-\beta) \frac{1}{\sqrt{2\pi}\,\sigma_2 r}\exp\left(-\frac{(\log r -\mu_2)^2}{2\sigma_2^2}\right). (11)

Note that if  \beta=1, eq. (11) becomes just the lognormal distribution, which is an implicit assumption behind the so-called Black implied-volatility formula (Black (1976)), which is often used to quote cap/floor and swaption prices. Parametric forms like  f(r) in eq. (11) cannot be derived from known classes of term structure models. However, certain parametric forms can be derived from a specific class of term structure models. In Section 4.1, we shall discuss such an example, where an implicit parametric form for  f(r) is derived from the multifactor CIR and quadratic-Gaussian models.

It is also worth noting that despite the typically good performance, forms like (11) do have a limitation that is particularly relevant to our problem (pdf modeling in the proximity of the zero bound). Because the lognormal distribution has the feature that  f(r) \rightarrow 0 as  r\rightarrow 0, the form (11) can have difficulty when the actual distribution contains a  \delta-function at  r=0. As noted earlier (eq. (5)), the delta function  \delta(r) can be approximated by a lognormal distribution with mean and variance close to zero, which can be achieved by having a small  \sigma_1 and a large and negative  \mu_1 in eq. (11), but in such a two-component mixture model this may incur a substantial cost in the fit of other aspects of the pdf. Section 4.2 develops parametric forms for  f(r) that can describe Black's boundary behavior more easily and naturally.

To have a manageable scope, this paper will be focusing only on the processes and distributions in the risk-neutral measure (which determines pricing), and not the physical (real world) measure.16Therefore, henceforth we shall often drop the adjective "risk-neutral" for brevity.


3 One-factor models

3.1 Black-Vasicek model

Let us now derive the pdf and option prices for the simplest Black's boundary model, which has the shadow rate described by the one-factor Vasicek process:

\displaystyle r_t \displaystyle = \displaystyle \max[0, x_t], (12)
\displaystyle dx_t \displaystyle = \displaystyle \kappa (\theta-x_t)dt + \sigma dB_t. (13)

This model, which we shall refer to as the "Black-Vasicek model," has been theoretically and empirically studied by Gorovoi and Linetsky (2004) and Ueno, Baba, and Sakurai (2006) and others, but its risk-neutral pdf has not been discussed in the literature, to my knowledge.17

The conditional distribution of this model, i.e., the transition density  f(r_{t+\tau}\vert {\mathcal F}_t), can be easily obtained if one thinks in terms of the shadow rate process  x_t rather than the  r_t process. Note that if  x_{t+\tau} > 0, we have  r_{t+\tau}=x_{t+\tau}; hence in this case, the distribution of  r_{t+\tau} equals that of  x_{t+\tau}. On the other hand, if  x_{t+\tau} \leq 0, we have  r_{t+\tau}=0. Thus all scenarios in which  x_{t+\tau} is non-positive maps to a single point  r_{t+\tau}=0. Therefore, we have

\displaystyle f(r_{t+\tau}\vert x_t) = f^S(x_{t+\tau}\vert x_t)\, {\rm I}(x_{t+\tau}) + w(x_t) \,\delta(x_{t+\tau}), (14)

where  f^S(x_{t+\tau}\vert x_t) denotes the transition density of the  x_t process,  w(x_t) is the weight of  f^S(x_{t+\tau}\vert x_t) in the nonpositive region, i.e.,
\displaystyle w(x_t)=\int_{-\infty}^0 f^S(x_{t+\tau}\vert x_t) \,dx_{t+\tau}, (15)

and  {\rm I}(x) is the indicator function (  {\rm I}(x)=1 for  x \geq 0 and  {\rm I}(x)=0 for  x < 0). It may be of interest to note that the distribution in eq. (14) can be thought of as a special case of the "mixture" distribution, consisting of a truncated normal distribution  \frac{1}{1-w(x_t)} f^S(x_{t+\tau}\vert x_t)\, {\rm I}(x_{t+\tau}) and an infinitely sharp distribution  \delta(x_{t+\tau}) with weights  1-w(x_t) and  w(x_t), respectively.

The transition density function  f^S(x_{t+\tau}\vert x_t) for the Vasicek process is well-known: Since the stochastic differential equation (13) has the solution

\displaystyle x_{t+\tau}= e^{-\kappa \tau} x_t + (1-e^{-\kappa\tau})\theta +\int_0^{\tau}e^{-\kappa (\tau-s)}\sigma dB_s (16)

the random variable  x=x_{t+\tau} is conditionally normally distributed18
\displaystyle x \sim N(\tilde{\mu},\tilde{\sigma}^2), (17)

with conditional mean and variance given by
\displaystyle \tilde{\mu} \displaystyle = \displaystyle e^{-\kappa \tau} x_t + (1-e^{-\kappa\tau})\,\theta, (18)
\displaystyle \tilde{\sigma}^2 \displaystyle = \displaystyle \frac{\sigma^2}{2\kappa}(1-e^{-2\kappa\tau}).  

Figure 1: Black-Vasicek model pdf (a) and cdf (b). The vertical line at 0 in Figure (a) pictures a delta function.
Figure 1. (a) The pdf of the Black-Vasicek model (thick solid line).  The x-axis is the short rate (in percentage units) and the y-axis is the pdf.  The thin dashed line shows the part of the shadow rate distribution that collapse to a delta function located at x=0. (b) The cdf of the Black-Vasicek model.  Note that the cdf at x=0 is nonzero.

Thus, suppressing time indices and simplifying notations, we have the conditional distribution of the short rate given by

\displaystyle f(r)= w\, \delta(r)+ \frac{1}{\tilde{\sigma}} \phi\left(\frac{r-\tilde{\mu}}{\tilde{\sigma}}\right) {\rm I}(r), (19)

where  w = \Phi(-\tilde{\mu}/\tilde{\sigma}), and  \phi(\cdot) and  \Phi(\cdot) are the standard normal pdf and cdf, respectively (i.e.,  \phi(x)=e^{-x^2/2}/\sqrt{2\pi},  \Phi(x)=\int_{-\infty}^x dt\, e^{-t^2/2}/\sqrt{2\pi}). Integrating eq. (19) gives the cdf,19
\displaystyle F(r)= [w +\Phi((r-\tilde{\mu})/\tilde{\sigma})-\Phi(-\tilde{\mu}/{\tilde{\sigma}})]{\rm I}(r)=\Phi((r-\tilde{\mu})/\tilde{\sigma}){\rm I}(r). (20)

The pdf and cdf of the short rate in the Black-Vasicek model are illustrated in Figure 1a,b. The dashed line in Figure 1b is the part of the shadow rate distribution that collapses onto the delta function at  r=0, denoted by a vertical line at zero in Figure (a). Note also that the cdf  F(r) approaches a finite number as  r\rightarrow 0^+ (Figure (b)).

The prices of  \tau-maturity put/call options with strike  K are straightforward to evaluate. Substituting eq. (19) into eq. (8), we have

\displaystyle P(K) \displaystyle = \displaystyle e^{-r_f\tau}[ w\, K + (K-\tilde{\mu})(\Phi(k) - \Phi(-\tilde{\mu}/\tilde{\sigma}))+ \tilde{\sigma}(\phi(k) - \phi(-\tilde{\mu}/\tilde{\sigma}))], (21)
\displaystyle C(K) \displaystyle = \displaystyle e^{-r_f\tau}[\tilde{\sigma}\phi(k) - (K-\tilde{\mu})(1-\Phi(k))], (22)

where  k\equiv (K-\tilde{\mu})/\tilde{\sigma}.

3.2 Two zero-cdf models

Let us now consider two well-known and tractable positive interest rate models: the CIR model and the QG model. The CIR model is given by

\displaystyle dr_t=\kappa (\theta - r_t)dt +\sigma\sqrt{r_t}dB_t. (23)

It is well known (CIR, 1985) that the random variable  r\equiv r_{t+\tau} in this model is conditionally distributed (at time  t) as
\displaystyle r \sim b \,\chi^2_{\nu,\lambda}, (24)

where  \chi^2_{\nu,\lambda} denotes the noncentral chi-squared random variable with  \nu degrees of freedom and noncentrality parameter  \lambda, and
\displaystyle b \displaystyle = \displaystyle \frac{\sigma^2(1-e^{-\kappa \tau})}{4\kappa} (25)
\displaystyle \nu \displaystyle = \displaystyle \frac{4\kappa\theta}{\sigma^2}  
\displaystyle \lambda \displaystyle = \displaystyle \frac{4\kappa} {\sigma^2(1-e^{-\kappa\tau})} e^{-\kappa\tau}r_t.  

Consider now the positive-definite 1-factor QG model

\displaystyle r_t \displaystyle = \displaystyle x_t^2, (26)
\displaystyle dx_t \displaystyle = \displaystyle \kappa (\theta-x_t)dt + \sigma dB_t.  

Since the random variable  x=x_{t+\tau} is conditionally normally distributed (eq. (17)), it can be seen immediately that the distribution of random variable  r=r_{t+\tau} is given by eq. (24), i.e., the same form as the CIR model, with
\displaystyle b \displaystyle = \displaystyle \tilde{\sigma}^2 (27)
\displaystyle \nu \displaystyle = \displaystyle 1  
\displaystyle \lambda \displaystyle = \displaystyle \tilde{\mu}^2/\tilde{\sigma}^2,  

where  \tilde{\mu} and  \tilde{\sigma} were given in eq. (18).

The pdf of the  r\sim b \chi^2_{\nu,\lambda} distribution (CIR/QG models) is given by

\displaystyle f(r)=\frac{1}{b} g\left(\frac{r}{b}\right)\, {\rm I}(r), (28)

with
\displaystyle g(x) = \frac{1}{2} e^{-(\lambda+x)/2}\left(\frac{x}{\lambda}\right)^{(\nu-2)/4} I_{(\nu-2)/2}(\sqrt{\lambda x}), (29)

where  I_a(\cdot) is the modified Bessel function of the first kind of order  a.20The option prices and cdf are not available in simple form for this distribution, but it is straightforward to compute them by one-dimensional numerical integrations with  f(r).

3.3 Comparing the model implications

To get a feel for the qualitative difference between Black's boundary models and zero-cdf (CIR/QG) models, it is instructive compare the behavior of the pdf  f(r) in the  r\rightarrow 0 limit.

For the CIR/QG model, expanding the expression (29) in powers of  r, using the well-known series expansion of the modified Bessel function of the first kind

\displaystyle I_a(z)= (\frac{1}{2}z)^a \sum_{j=0}^{\infty} \frac{(z^2/4)^j}{\Gamma(j+1)\Gamma(a+j+1)}, (30)

gives
\displaystyle f(r)= \frac{e^{-\lambda/2}}{(2b)^{\frac{\nu}{2}}\Gamma(\frac{\nu}{2})} \,r^{\frac{\nu}{2}-1} + \frac{e^{-\lambda/2}(\lambda-\nu)} {2 (2b)^{\frac{\nu}{2}+1}\Gamma(\frac{\nu}{2} +1)} \,r^{\frac{\nu}{2}} + O(r^{\frac{\nu}{2}+1}), (31)

where the notation  O(r^n) denotes terms of order  n or higher.21Thus, the cdf is given by
\displaystyle F(r)= \frac{e^{-\lambda/2}}{(2b)^{\frac{\nu}{2}}\frac{\nu}{2} \Gamma(\frac{\nu}{2})} \,r^{\frac{\nu}{2}} + \frac{e^{-\lambda/2}(\lambda-\nu)} {2 (2b)^{\frac{\nu}{2}+1}(\frac{\nu}{2} +1)\Gamma(\frac{\nu}{2} +1)} \, r^{\frac{\nu}{2}+1} + O(r^{\frac{\nu}{2}+2}). (32)

From eq. (31) it can be seen that depending on  \nu, we can expect qualitatively different behavior of  f(r):

\displaystyle \lim_{r\rightarrow 0} f(r)=\left\{ \begin{matrix}\infty,\hspace{1cm} \nu < 2\\ 0 ,\hspace{1.2cm}\nu > 2. \end{matrix}\right. (33)

Recall that, for the CIR model (eq. (23)),  \nu is given by eq. (25), therefore, the condition  \nu > 2 is the same as Feller condition (  \kappa \theta > \frac{1}{2}\sigma^2). It makes sense that accessible boundary case (Feller condition violated) shows more probability density in the low  r region than the inaccessible boundary case.

Note also that since  \nu=1 for the QG model, we have  \lim_{r\rightarrow 0} f(r)=\infty for the QG model. As the zero boundary is accessible in the QG model (since  x_t in eq. (26) is unrestricted, it can reach zero), it shows a qualitatively similar small- r behavior as the CIR model with violated Feller condition. It should be noted, however, that an accessible zero boundary does not always imply  \lim_{r\rightarrow 0} f(r)=\infty, as we shall see with the case of the 3-factor QG model (Sec. 4.1). Note also that even in the case  \lim_{r\rightarrow 0} f(r)=\infty (one-factor CIR model with violated Feller condition and the QG model), the cdf is zero at  r=0, since  F(r) \propto r^{\nu/2} in the  r\rightarrow 0 limit.

The small- r behavior of the pdf in eq. (31) implies that the put option price for small strike  K is given by

\displaystyle P(K)=e^{-r_f\tau}\int_0^K (K-r)f(r) dr=e^{-r_f\tau}\frac{e^{-\lambda/2}}{(2b)^{\frac{\nu}{2}}\Gamma(\frac{\nu}{2})} \frac{4}{\nu(\nu+2)}\, K^{\frac{\nu}{2}+1} + O(K^{\frac{\nu}{2}+1}). (34)

These asymptotic behaviors of CIR/QG models can be compared with the those of the Black-Vasicek model:

\displaystyle f(r) \displaystyle = \displaystyle w \, \delta(r) + \frac{\phi(-\tilde{\mu}/\tilde{\sigma})}{\tilde{\sigma}} + O(r^1) (35)
\displaystyle F(r) \displaystyle = \displaystyle w + \frac{\phi(-\tilde{\mu}/\tilde{\sigma})}{\tilde{\sigma}}r + O(r^2)  
\displaystyle P(K) \displaystyle = \displaystyle e^{-r_f\tau}w\, K + O(K^2).  

(for  r,K > 0), which are easily derived from the formulae in Section 3.1.
Figure 2: (a) The Black-Vasicek pdf (thick solid line) and inacessible boundary CIR/QG model pdf (thin dashed line). (b) Accessible boundary model pdf (thin solid line). (c) Put option prices for the Black-Vasicek model (thick solid line), accessible boundary CIR/QG model (thin solid line), and inaccessible boundary model (thin dashed line). (d) Same as (c).
Figure 2. (a) The Black-Vasicek pdf (thick solid line) and inaccessible boundary CIR/QG model pdf (thin dashed line).  The x-axis is the short rate and the y-axis is the pdf.  Note that the pdf is zero at x=0 in the inaccessible boundary model, while the pdf is infinity at x=0 in the Black-Vasicek model. (b) Accessible boundary CIR/QG model pdf (thin solid line).  Note that the pdf is infinity at x=0 in the accessible boundary model, as in the Black-Vasicek model. (c) The put option prices for the Black-Vasicek model (thick solid line), accessible boundary model (thin solid line), and inaccessible boundary model (thin dashed line).  The x-axis is the option strike, and the y-axis is the option price.   The option price for the Black-Vasicek model is notably larger than those of the accessible and inaccessible boundary CIR/QG models. (d) The put option prices shown for a broader range of strikes.  Note that the inaccessible and accessible boundary models produce similar option prices (beyond the region of very small strikes).

Figure 2 graphically illustrates the qualitative difference between Black's boundary models and CIR/QG models. Figure 2a plots, in thick solid line, the pdf of the Black-Vasicek model with parameters in eq. (19) given by  \tilde{\mu}=2,\tilde{\sigma}=1.2, which imply  w=0.0478. It also shows, in thin dashed line, the pdf of the  r\sim b \chi^2_{\nu,\lambda} model with  \nu=3 (corresponding to the CIR model satisfying the Feller condition), whose  b and  \lambda were determined such that the model matches the mean and variance of the Black-Vasicek model;22 requiring the means and variances of the distributions be similar amounts to having comparable at-the-money option prices. Figure 2b shows the pdf of the  r\sim b \chi^2_{\nu,\lambda} model with  \nu=1 (corresponding to the CIR model violating the Feller condition and the QG model) in thin solid line. As discussed with eq. (31), this pdf has a singularity of the form  r^{-1/2}, but it has a local minimum at a point very close to zero,23 and above that it looks similar to the pdf of the  r\sim b \chi^2_{\nu,\lambda} distribution with  \nu=3 (i.e., Figure 2a).

Despite the rough similarity in the shape of the pdf of the Black-Vasicek model and the  r\sim b\chi^2_{\nu=1,\lambda} model, the model-implied put prices are drastically different, with the Black-Vasicek model producing far larger numbers, as shown in Figure 2c. This is because the  r^{-1/2} singularity in the  r\sim b\chi^2_{\nu=1,\lambda} model is a much weaker singularity than the  \delta-function singularity in the Black-Vasicek model; to put it simply, Black's boundary model has a much larger probability weight at or near zero than the CIR/QG model. Note also that in the low interest rate region the Black-Vasicek model shows a linear dependence on  K, as predicted by eq. (35).

Although the  r, K\rightarrow 0 behavior of the pdfs and put option prices derived above for the one-factor Black-Vasicek and CIR/QG models are useful for getting a firm handle on the zero boundary behavior of these models, these predictions might not be cleanly testable empirically, as options might not exist (trade) at very low strikes, and even if they existed their quality may be in doubt. Furthermore, certain market realities (to be discussed at the end of Sec. 5.2) may blur the clean asymptotic behaviors. Still, even if one moves onto regions where the asymptotic behaviors like (34) and (35) are no longer accurate, the basic intuition would still carry over, and one would expect Black's boundary model to produce higher put option prices for small  K's than zero-cdf models (with comparable mean and variance) when the rates are low enough (or the distribution is wide enough) that the weight of the  \delta-function piece in Black's boundary model is non-negligible. This is illustrated in Figure 2d (which plots the same objects as Figure 2c but on a larger scale). It is also interesting to note that the put option prices for the  r\sim b \chi^2_{\nu=3,\lambda} model (inaccessible boundary) and the  r\sim b\chi^2_{\nu=1,\lambda} model (accessible boundary) are quite similar, beyond the very small  K region.


4 Richer parametric forms

The one-factor models considered in the previous section, though instructive, may be too parsimonious to capture option prices accurately for a broad range of strikes. Let us now therefore consider richer versions of zero-cdf models and Black's boundary models.

4.1 Multifactor zero-cdf models

A non-negative, multifactor version of the CIR model can be constructed by adding up independent CIR factors:

\displaystyle r_t \displaystyle = \displaystyle x_{1t}+x_{2t} + \cdots + x_{nt}, (36)
\displaystyle dx_{it} \displaystyle = \displaystyle \kappa_i(\theta_i - x_{it})dt +\sigma_i\sqrt{x_{it}}dB_{it}  

where the Brownian motions  B_{1t},  B_{2t},.. are independent of each other. This "multifactor CIR" model has been used in many settings, e.g., Duffie and Singleton (1997) and Feldh{\H{u\/}}\kern.05emtter and Lando (2007).24

Because  x_i's are independent, the results about the 1-factor CIR model carries over easily; it is straightforward to show that the distribution of  r=r_{t+\tau} is given by

\displaystyle r \sim b_1 \chi^2_{\nu_1,\lambda_1}+ b_2 \chi^2_{\nu_2,\lambda_2}+\cdots + b_n \chi^2_{\nu_n,\lambda_n}, \hspace{1cm} (b_i > 0), (37)

where  \chi^2_{\nu_i,\lambda_i}'s are independent noncentral chi-squared random variables with  \nu_i degrees of freedom and noncentrality parameter  \lambda_i, and
\displaystyle b_i \displaystyle = \displaystyle \frac{\sigma_i^2(1-e^{-\kappa_i \tau})}{4\kappa_i} (38)
\displaystyle \nu_i \displaystyle = \displaystyle \frac{4\kappa_i\theta_i}{\sigma_i^2}  
\displaystyle \lambda_i \displaystyle = \displaystyle \frac{4\kappa_i e^{-\kappa_i\tau }} {\sigma_i^2(1-e^{-\kappa_i\tau})} x_{it}.  

Let us now consider the general  n-factor positive-definite QG model,

\displaystyle r_t \displaystyle = \displaystyle ({\rm x}_t-\alpha)^{\rm T} \Psi ({\rm x}_t- \alpha), (39)
\displaystyle d{\rm x}_t \displaystyle = \displaystyle {\mathcal K}(\theta - {\rm x}_t)dt +\Sigma dB_t,  

where the superscript T denotes matrix transpose;  {\rm x}_t is an  n-dimensional vector  {\rm x}_t=[x_{1t},...,x_{nt}]^{\rm T};  {\mathcal K} and  \Sigma are  n\times n constant matrices;  \alpha and  \theta are  n-dimensional constant vectors, and  B_t is an  n-dimensional vector of standard Brownian motions. In order to insure the non-negativity of  r_t, we require  \Psi to be a symmetric positive-definite matrix.

It is straightforward to show that the conditional distribution of the short rate in the positive-definite QG model also takes the form (37),25 with

\displaystyle b_i \displaystyle = \displaystyle d_i \hspace{1cm}( > 0) (40)
\displaystyle \nu_i \displaystyle = \displaystyle 1  
\displaystyle \lambda_i \displaystyle = \displaystyle ([{\mathcal P}^{\rm T}C^{-1}(\tilde{\mu} -\alpha)]_i)^2,  

where  C,  \tilde{\mu} are given by
\displaystyle \tilde{\mu} \displaystyle = \displaystyle e^{-{\mathcal K}\tau}{\rm x}_t + (I-e^{-{\mathcal K}\tau})\theta, (41)
\displaystyle C \displaystyle = \displaystyle {\rm Chol}(\tilde{\Omega}), \hspace{1cm} (i.e., \,\,\tilde{\Omega}=C \,C^{\rm T}), (42)
\displaystyle \tilde{\Omega} \displaystyle = \displaystyle \int_0^{\tau}e^{-{\mathcal K}(\tau-s)}\Sigma\Sigma^{\rm T} e^{-{\mathcal K}^{\rm T}(\tau-s)}ds, (43)

(the notation  e^{X} denoting the matrix exponential), and  d_i and  {\mathcal P} are from the diagonalization of the symmetric positive-definite matrix  C^{\rm T}\Psi C:
\displaystyle C^{\rm T}\Psi C = {\mathcal P} D {\mathcal P}^{-1}={\mathcal P} D{\mathcal P}^{\rm T}, (44)

where  D is a diagonal matrix with diagonal elements  d_1,d_2,...,d_n (the eigenvalues of the  C^{\rm T}\Psi C matrix). Note that the dependence of  \lambda_i on  {\rm x}_t has been suppressed in the notation.

Models like the multifactor CIR model have been criticized for the restrictive nature of the factor correlation structure. However, the positive-definite QG models have been widely believed to accommodate a rich factor correlation structure, due to the ability to specify  \mathcal{K} and  \Sigma flexibly; see, e.g., Ahn, Dittmar, and Gallant (2002). Therefore, it bears emphasizing that not only the multifactor CIR model but also the positive-definite QG model has a short rate distribution given by a positive linear combination of independent noncentral  \chi ^2 random variables, and that this holds no matter how flexible the model is and no matter how many factors it has.

The pdf  f(r) and cdf  F(r) of the distribution for the multifactor CIR/QG model (eq. (37)) do not have simple closed-form expressions in general, but the characteristic function  \hat{f}(w)\equiv E(e^{i\omega\,r}) has a simple expression

\displaystyle \hat{f}(w)= \prod_{j=1}^n (1-2 b_j i\omega)^{-\nu_j/2}\exp\left(\frac{\lambda_j b_j i\omega}{1-2 b_j i\omega}\right), (45)

due to the independence of the noncentral  \chi ^2 variables in eq. (37). From this,  f(r) and  F(r) can be computed by evaluating the well-known Fourier-inversion integrals (see, e.g., Stuart and Ord (1994, Chap. 4)),
\displaystyle f(r) \displaystyle = \displaystyle \frac{1}{\pi}\int_{0}^{\infty} {\rm Re} (e^{-i\omega r}\hat{f}(\omega)) d\omega (46)
\displaystyle F(r) \displaystyle = \displaystyle \frac{1}{2} +\frac{1}{\pi}\int_0^{\infty} \frac{{\rm Im}(e^{i \omega r}\hat{f}(-\omega))}{\omega} d\omega. (47)

The option prices  P(K) and  C(K) for this model (integrals (8)) are also not available in closed-form, but they can be expressed in terms of certain cumulative distribution functions. For the computation of pdfs from options data (eq. (10)), it is desirable to have a formula that allows fast computation of option prices. Appendix B derives such a formula, using the the celebrated saddle-point formula of Lugannani and Rice (1980).

Let us now consider the small- r behavior of  f(r). Appendix A shows that

\displaystyle f(r) = \frac{e^{-\Sigma_i \lambda_i/2}}{\Pi_i (2 b_i)^{\nu_i/2}}\left(\frac{1}{\Gamma(\frac{1}{2}\Sigma_i\nu_i)} \,r^{\frac{1}{2}\Sigma_i\nu_i-1} + \frac{\Sigma_i(\lambda_i-\nu_i)/4b_i}{\Gamma(\frac{1}{2}\Sigma_i\nu_i+1)}\, r^{\frac{1}{2}\Sigma_i\nu_i} + \cdots\right). (48)

This form has an interesting dependence on the number of factors: the exponent in the leading term increases with number of factors. For the QG model, we have
\displaystyle f(r\rightarrow 0) = \left\{ \begin{matrix}\infty, \hspace{1cm} n=1,\\ {\rm finite}, \hspace{0.3cm} n=2, \\ 0, \hspace{1.1cm} n=3. \end{matrix}\right. (49)

This pattern is due to the fact that when there are more factors it is less likely for  r to approach 0 since all of the factors have to become small. Therefore, higher-factor QG models are even more different from Black's boundary models insofar as the zero-boundary behaviors are concerned.

To get further insights into the distribution described by eq. (37), it is useful to examine its cumulants (hence moments), which is straightforward to calculate from the cumulant generating function (  \log(\hat{f}(-i\omega))). The mean, variance, and the third central moment of this distribution are easily derived from the cumulants of the noncentral  \chi ^2 variables (see, e.g., Johnson, Kotz, Balakrishnan (1994, p447)):

\displaystyle E(r) \displaystyle = \displaystyle \Sigma_{i=1}^n b_i(\nu_i+\lambda_i) (50)
\displaystyle Var(r) \displaystyle = \displaystyle \Sigma_{i=1}^n 2\, b_i^2(\nu_i+2\lambda_i)  
\displaystyle E[(r-E(r))^3] \displaystyle = \displaystyle \Sigma_{i=1}^n 8\, b_i^3(\nu_i+3\lambda_i).  

It can be seen that because  b_i, \nu_i, \lambda_i > 0 for all  i, the skewness
 E[(r-E(r))^3]/Var(r)^{3/2} is always positive.

More generally speaking, one cannot create a negatively skewed distribution by forming a positive linear combination of independent, positive-skewed random variables. Therefore, a model like

\displaystyle r_t= b_1x_{1t} + b_2x_{2t}+ \cdots + b_nx_{nt}, \hspace{1cm} (b_i > 0), (51)

where  x_{it} is Ahn and Gao (1999)'s one-factor process for the short rate, would again have the feature that the short rate distribution is always positively skewed.26

4.2 Flexible pdfs with Black's boundary behavior

This section develops flexible parametric forms for the risk-neutral pdf that are consistent with Black's boundary behavior. The earlier discussion with the Black-Vasicek model points to the way: write down a flexible form for the shadow rate distribution  f^S that has some weight below zero, and take

\displaystyle f(r) = f^S(r) {\rm I}(r-\underline{r}) + w\, \delta(r-\underline{r}),\hspace{1cm} w= \int_{-\infty}^{\underline{r}}dx f^S(x). (52)

Here we have a slightly more general expression in which the lower boundary  \underline{r} is not necessarily zero.

The distribution  f^S can be modeled with similar degree of parsimony as the pdfs that are in use in the extant literature. A natural candidate for  f^S is the "mixture of normals" form, i.e.,

\displaystyle f^S(r)= \beta_1 \frac{1}{\sqrt{2\pi}\, \sigma_1} \exp\left(-\frac{(r-\mu_1)^2}{2\sigma_1^2}\right) + \beta_2 \frac{1}{\sqrt{2\pi}\, \sigma_2} \exp\left(-\frac{(r-\mu_2)^2}{2\sigma_2^2}\right), (53)

where  \beta_2=1-\beta_1. Another possibility is the Gram-Charlier/Edgeworth expansion
\displaystyle f^S(r)= \frac{1}{\sigma}\phi\left(\frac{r-\mu}{\sigma}\right)\left(1+\frac{\gamma_3}{6} H_3\left( \frac{r-\mu}{\sigma} \right) + \frac{\gamma_4}{24} H_4\left(\frac{r-\mu}{\sigma}\right) + \cdots \right), (54)

where  H_j are Hermite polynomials (  H_3(x)=x^3-3x,  H_4(x)=x^4-6x^2+3, etc.). This expansion differs from the Gram-Charlier/Edgeworth expansion commonly used in the option-implied pdf literature by the fact that it uses unit normal variable, rather than unit lognormal variable. Note that in both eq. (53) and (54),  f^S is allowed to have some probability weight below  r=0.

These forms, which are richer than that of the Black-Vasicek model (eq. (19)), can be viewed as accommodating a more complicated process for the shadow rate, e.g., a multifactor model with stochastic volatility.

The formula for  P(K),  C(K) and F for the mixture-of-normals shadow rate model (eq. (53)) and the Gram-Charlier/Edgeworth shadow rate model (eq. (54)) are provided in Appendix C.


5 Empirical evidence

5.1 Empirical strategy

To examine how well zero-cdf models perform in matching option prices, I have implemented two versions of pdf models, which we shall refer to as the QG3 model and the CIR2 model:

\displaystyle {\rm QG3}: \,   \displaystyle r \, {\mathbf \sim}\, b_1 \chi_{1,\lambda_1}^2+ b_2 \chi_{1,\lambda_2}^2+ b_3\chi_{1,\lambda_3}^2, (55)
\displaystyle {\rm CIR2}: \,   \displaystyle r \, {\mathbf \sim}\, b_1 \chi_{\nu_1,\lambda_1}^2 + b_2 \chi_{\nu_2,\lambda_2}^2, (56)

 (b_i,\nu_i,\lambda_i > 0). For Black's boundary model, I have implemented the model in which the shadow rate distribution is that of the mixture of normals, i.e., eq. (53), which we shall refer to as the B-MN model.27 For comparison with a popular benchmark pdf, I have also computed the pdf based on the mixture of lognormals, i.e., eq. (11), which we shall denote as the MLN model. These four models are summarized in Table 1.


Table: Summary of the models. The pdf parameters for the QG3 and CIR2 models are defined in eqs. (55) and (56). The pdf parameters for the MLN and B-MN models are defined in eqs. (11) and (53), respectively.
model pdf parameters description
QG3  b_1,b_2,b_3,\lambda_1,\lambda_2,\lambda_3 3-factor QG(CIR) models.
CIR2  b_1,b_2,\nu_1,\nu_2,\lambda_1,\lambda_2 2-factor CIR(QG) models.
B-MN  \beta_1,\mu_1,\mu_2,\sigma_1,\sigma_2 Black's boundary; mixture of normals.
MLN  \beta_1,\mu_1,\mu_2,\sigma_1,\sigma_2 Mixture of lognormals.

The QG3 model covers all possible 3-factor positive-definite QG models, but it also includes a special case of the 3-factor CIR model (  \kappa_i\theta_i=\frac{1}{4}\sigma_i^2). The CIR2 model covers all possible cases of the 2-factor CIR models, but it also nests the 2-factor QG model. This model also nests the Longstaff-Schwartz (1992) model, which was regarded by at least H{\H{o\/}}\kern.05emrdahl (2000) as a promising model for describing the short rate distribution. In terms of the number of parameters, the QG3 and CIR2 models are richer than the B-MN and MLN models by one parameter.

It is also worth noting that, because the QG3 and CIR2 models originate from specific term structure models, the pdf parameters  b_i,\nu_i,\lambda_i ( \nu_i=1 for QG3) are linked to the elementary parameters of the term structure model and the state variables  x_{it}.28Therefore the pdf parameters can be obtained from an estimated term structure model. However, there are many ways of estimating term structure models,29which could lead to different estimated elementary parameters and state variables, and in turn, different pdfs. Therefore, in this paper we shall simply treat the pdf parameters for the QG3 model and th