
A Note on the Coefficient of Determination in Models with Infinite Variance Variables1

Jeong-Ryeol Kurz-Kim2 - Mico Loretan3

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at http://www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from the Social Science Research Network electronic library at http://www.ssrn.com/.


Abstract:

Since the seminal work of Mandelbrot (1963), $ \alpha$-stable distributions with infinite variance have been regarded as a more realistic distributional assumption than the normal distribution for some economic variables, especially financial data. After providing a brief survey of theoretical results on estimation and hypothesis testing in regression models with infinite-variance variables, we examine the statistical properties of the coefficient of determination in models with $ \alpha$-stable variables. If the regressor and error term share the same index of stability $ \alpha<2$, the coefficient of determination has a nondegenerate asymptotic distribution on the entire [0, 1] interval, and the density of this distribution is unbounded at 0 and 1. We provide closed-form expressions for the cumulative distribution function and probability density function of this limit random variable. In contrast, if the indices of stability of the regressor and error term are unequal, the coefficient of determination converges in probability to either 0 or 1, depending on which variable has the smaller index of stability. In an empirical application, we revisit the Fama-MacBeth two-stage regression and show that in the infinite-variance case the coefficient of determination of the second-stage regression converges to zero in probability even if the slope coefficient is nonzero.

Keywords: Regression models, $ \alpha$-stable distributions, infinite variance, coefficient of determination, Fama-MacBeth regression, Monte Carlo simulation, signal-to-noise ratio, density transformation theorem.

JEL classification: C12, C13, C21, G12


1  Introduction

Granger and Orr (1972) begin their article, " 'Infinite variance' and research strategy in time series analysis," by questioning the uncritical use of the normal distribution assumption in economic modelling and estimation:

Due in part to the influential seminal work of Mandelbrot (1963), $ \alpha $-stable distributions are often considered to provide the basis for more realistic distributional assumptions for some economic data, especially for high-frequency financial time series such as those of exchange rate fluctuations and stock returns. Financial time series are typically fat-tailed and excessively peaked around their mean--phenomena that can be better captured by $ \alpha$-stable distributions with $ 1<\alpha<2$ rather than by the normal distribution, for which $ \alpha=2$.4 The $ \alpha$-stable distributional assumption with $ \alpha\le2$ is thus a generalization of rather than an alternative to the Gaussian distributional assumption. If an economic series fluctuates according to an $ \alpha$-stable distribution with $ \alpha<2$, it is known that many of the standard methods of statistical analysis, which often rest on the asymptotic properties of sample second moments, do not apply in the conventional way. In particular, as we demonstrate in this paper, the coefficient of determination--a standard criterion for judging goodness of fit in a regression model--has several nonstandard statistical properties if $ \alpha<2$.

The linear regression model is one of the most commonly used and basic econometric tools, not only for the analysis of macroeconomic relationships but also for the study of financial market data. Typical examples for the latter case are estimation of the ex-post version of the capital asset pricing model (CAPM) and the two-stage modelling approach of Fama and MacBeth (1973). Because of the prevalence of heavy-tailed distributions in financial time series, it is of interest to study how regression models perform when the data are heavy-tailed rather than normally distributed.

The first purpose of the present paper is to survey theoretical results of estimation and hypothesis testing in regression models with infinite-variance distributions, and the second is to establish that infinite variance of the regression variables has important consequences for the statistical properties of the coefficient of determination and tests of the hypothesis that this coefficient is equal to zero. Third, we revisit the Fama-MacBeth two-stage regression approach and demonstrate that infinite variance of the regression variables can affect decisively the interpretation of the empirical results.

The rest of our paper is structured as follows. In Section 2 we provide a brief summary of the properties of $ \alpha$-stable distributions and of aspects of estimation, hypothesis testing, and model diagnostic checking in regression models with infinite-variance regressors and disturbance terms. Section 3 provides a detailed analysis of the asymptotic properties of the coefficient of determination in regression models with infinite-variance variables. In our empirical application, presented in Section 4, we revisit the data used in Fama and French (1992), and we show that the statistical and/or economic interpretation of their findings can be quite different under the maintained assumption of $ \alpha$-stable distributions from an interpretation based on the assumption of normal distributions. Section 5 summarizes the paper and offers some concluding remarks.

2  Framework

2.1  $ \alpha$-stable distributions

A random variable $ X$ is said to have a stable distribution if, for any positive integer $ n>2$, there exist constants $ a_{n}>0$ and $ b_{n} \in\mathbb{R}$ such that $ X_{1} + \cdots+ X_{n} \overset{d}{=} a_{n} X+b_{n}$, where $ X_{1},\dots,X_{n}$ are independent copies of $ X$ and  $ \overset{d}{=}$ signifies equality in distribution. The coefficient $ a_{n}$ above is necessarily of the form $ a_{n}=n^{1/\alpha}$ for some $ \alpha\in(0,2]$ (see Feller, 1971, Section VI). The parameter $ \alpha$ is called the index of stability of the distribution, and a random variable $ X$ with index $ \alpha$ is called $ \alpha$-stable. An $ \alpha$-stable distribution is described by four parameters and will be denoted by $ S(\alpha, \beta, \gamma, \delta)$. Closed-form expressions for the probability density functions of $ \alpha $-stable distributions are known to exist only for three special cases.5 However, closed-form expressions for the characteristic functions of $ \alpha$-stable distributions are readily available. One parameterization of the logarithm of the characteristic function of $ S(\alpha, \beta, \gamma, \delta)$ is

$\displaystyle \ln\left( \E e^{i\tau X} \right) = i\delta\tau-\gamma^{\alpha }\vert\tau\vert^{\alpha}\bigl( 1+i\beta\, \mathrm{sign}(\tau)\, \omega(\tau ,\alpha)\bigr) \, ,$ (1)

where $ \mathrm{sign}(\tau)=-1$ for $ \tau<0$, $ \mathrm{sign}(\tau)=0$ for $ \tau=0$, and $ \mathrm{sign}(\tau)=+1$ for $ \tau>0$; and $ \omega(\tau,\alpha)= -\tan(\pi{\alpha/2})$ for $ \alpha\ne1$ and $ \omega(\tau,\alpha)= (2/\pi )\ln\vert\tau\vert$ for $ \alpha=1$.
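As a quick numerical check of equation (1), the following minimal sketch evaluates the log characteristic function and confirms that at $ \alpha=2$ it collapses to $ i\delta\tau-\gamma^{2}\tau^{2}$ regardless of $ \beta$. The function name `log_cf` and the test values are illustrative assumptions, not part of the paper.

```python
import math

def log_cf(tau, alpha, beta, gamma, delta):
    """Log characteristic function of S(alpha, beta, gamma, delta), equation (1)."""
    if tau == 0.0:
        return 0j
    sign = 1.0 if tau > 0 else -1.0
    if alpha == 1.0:
        omega = (2.0 / math.pi) * math.log(abs(tau))
    else:
        omega = -math.tan(math.pi * alpha / 2.0)
    # gamma^alpha * |tau|^alpha is written as (gamma * |tau|)^alpha
    scale_term = (gamma * abs(tau)) ** alpha
    return 1j * delta * tau - scale_term * (1.0 + 1j * beta * sign * omega)

# Illustrative values (assumptions): any beta should give the Gaussian form at alpha = 2.
tau, gamma, delta = 0.7, 1.3, 0.4
gaussian_form = 1j * delta * tau - gamma**2 * tau**2
gap = abs(log_cf(tau, 2.0, 0.5, gamma, delta) - gaussian_form)
print("distance from Gaussian log cf at alpha = 2:", gap)
```

The skewness term vanishes at $ \alpha=2$ because $ \tan(\pi)=0$, so the only discrepancy is floating-point rounding.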

The tail shape of an $ \alpha$-stable distribution is determined by its index of stability $ \alpha\in(0,2]$. Skewness is governed by $ \beta\in[-1,1]$; the distribution is symmetric about $ \delta$ if and only if $ \beta=0$. The scale and location parameters of $ \alpha$-stable distributions are denoted by $ \gamma>0$ and $ \delta\in\mathbb{R}$, respectively. When $ \alpha=2$, the log characteristic function given by equation (1) reduces to $ i\delta\tau-\gamma^{2}\tau^{2}$, which is that of a Gaussian random variable with mean $ \delta$ and variance  $ 2\gamma^{2}$. For $ \alpha<2$ and $ \vert\beta\vert<1$, the tail properties of an $ \alpha$-stable random variable $ X$ satisfy

$\displaystyle \lim_{x\to\infty} x^{\alpha}\, \P (X>x) = C(\alpha) \gamma^{\alpha}(1+\beta)/2$    and (2)

$\displaystyle \lim_{x\to\infty} x^{\alpha}\, \P (X<-x) = C(\alpha) \gamma^{\alpha}(1-\beta)/2,$ (3)

i.e., both tails of the probability density function (pdf) of $ X$ are asymptotically Paretian. For $ \alpha<2$ and $ \beta=+1$ ($ -1$), the distribution is maximally right-skewed (left-skewed) and only the right (left) tail is asymptotically Paretian.6 The term $ C(\alpha)$ in equations (2) and (3) is given by

$\displaystyle C(\alpha)= \frac{1-\alpha}{\Gamma(2-\alpha) \cos(\pi\alpha/2)}$    for $ \alpha\ne1$ (4)

and $ 2/\pi$ for $ \alpha=1$; see, e.g., Samorodnitsky and Taqqu (1994), p. 17. The function $ C(\alpha)$, which is shown in Figure 1, is continuous and strictly decreasing in $ \alpha\in(0,2)$, with $ \lim_{\alpha\downarrow 0}C(\alpha)=1$ and $ \lim_{\alpha\uparrow2} C(\alpha)=0$.7 In consequence, even though all stable distributions with $ \alpha<2$ have asymptotically Paretian tails, as $ \alpha\uparrow2$ proportionately less and less of the distribution's probability mass is located in the tail region. In addition, the density's tails decline at an increasingly rapid rate as $ \alpha\uparrow2$, thereby limiting the likelihood of observing very large draws conditional on the draw coming from the tail region. These observations explain why potentially very large sample sizes are required if one desires to estimate the index of stability with adequate precision if $ \alpha$ is close to but smaller than 2.
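The behaviour of $ C(\alpha)$ described above can be checked numerically. The sketch below evaluates equation (4) on a few illustrative values of $ \alpha$; the function name `tail_constant` and the chosen grid are assumptions made for this example only.

```python
import math

def tail_constant(alpha):
    """C(alpha) from equation (4); equals 2/pi at alpha = 1."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2)")
    if alpha == 1.0:
        return 2.0 / math.pi
    return (1.0 - alpha) / (math.gamma(2.0 - alpha) * math.cos(math.pi * alpha / 2.0))

# Illustrative grid (an assumption); it avoids values extremely close to 1,
# where the generic formula is a numerically unstable 0/0 expression.
grid = [0.1, 0.5, 0.9, 1.0, 1.1, 1.5, 1.9]
values = [tail_constant(a) for a in grid]
for a, c in zip(grid, values):
    print(f"alpha = {a:.1f}  C(alpha) = {c:.4f}")
```

On this grid the values should decline monotonically, in line with the claim that less and less probability mass sits in the Paretian tail region as $ \alpha\uparrow2$.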

Because $ \E \vert X\vert^{\xi}=\lim_{b\to\infty}\int_{0}^{b} \P (\vert X\vert^{\xi}>x)\d x$ , it follows that $ \E \vert X\vert^{\xi}< \infty$ for $ \xi\in(0,\alpha)$ and $ \E \vert X\vert^{\xi}= \infty$ for $ \xi\ge\alpha$ if $ X$ is $ \alpha$-stable with $ \alpha\in (0,2)$.8 Only moments of order up to but not including $ \alpha$ are finite if $ \alpha<2$, and a non-Gaussian stable distribution's index of stability is also equal to its maximal moment exponent.9 In particular, if $ \alpha\in(1,2)$, the variance is infinite but the mean exists. For $ \alpha>1$, it follows that $ \E (X)=\delta$; in addition, for $ \beta=0$, $ \delta$ is equal to the distribution's mode and median irrespective of the value of $ \alpha$, justifying the use of the term ``central location parameter'' for $ \delta$ in the finite-mean or symmetric cases. In addition, for $ \alpha\ne1$, one can show that $ S(\alpha,\beta, \gamma,\delta) \overset{d}{=} \gamma\cdot S(\alpha,\beta,1, \delta/\gamma)$ .10 We make use of this property below in the derivations of Theorem 1 and Remark 3.

The class of $ \alpha$-stable distributions is an interesting distributional candidate for disturbances in regression models because (i) it is able to capture the relative frequencies of extreme vs. ordinary observations in the economic variables, (ii) it has the convenient statistical property of closure under convolution, and (iii) only $ \alpha$-stable distributions can serve as limiting distributions of sums of independent and identically distributed (iid) random variables, as proven in Zolotarev (1986). The latter two properties are appealing for regression analysis, given that disturbances can be viewed as random variables which represent the sum of all external effects not captured by the regressors. For more details on the properties of $ \alpha $-stable distributions, we refer to Gnedenko and Kolmogorov (1954), Feller (1971), Zolotarev (1986), and Samorodnitsky and Taqqu (1994). The role of the $ \alpha$-stable distribution in financial market and econometric modelling is surveyed in McCulloch (1996) and Rachev et al. (1999).

2.2  Regression models with infinite-variance variables

Let $ X$ and $ Y$ be two jointly symmetric $ \alpha$-stable (henceforth, $ S\alpha S$) random variables with $ \alpha>1$, i.e., we require $ X$ and $ Y$ to have finite means. Our main reason for concentrating on the case $ \alpha>1$ lies in its empirical relevance. Estimated maximal moment exponents for most empirical financial data, such as exchange rates and stock prices, are generally greater than 1.5; see, for example, de Vries (1991) and Loretan and Phillips (1994). An econometric (purposeful) reason for studying the case $ \alpha>1$ is that, for $ \alpha$-stable distributions with $ \alpha>1$, regression analysis that is based on sample second moments, such as least squares, is still asymptotically consistent for the regression coefficients, even though the limit distributions of these regression coefficients are nonstandard.11 Suppose that the regression of a random variable $ Y$ on a random variable $ X$ is linear, i.e., there exists a constant $ \theta$ such that

$\displaystyle \E (\,Y\mid X\,)=\theta X \quad a.s.,$ (5)

with

$\displaystyle \theta=\frac{[Y,X]_{\alpha}}{{\gamma_{x}}^{\alpha}}, $

where  $ \gamma_{x}$ is the scale parameter of the $ S\alpha S$ variable $ X$ and $ [\cdot,\cdot]_{\alpha}$ in the numerator is covariation (covariance in the Gaussian case), which can be calculated as $ \E \bigl(XY^{<\xi-1>} \bigr) \big/ \E \bigl(\vert Y\vert^{\xi}\bigr)$ , for all $ \xi \in(1,\alpha)$ with $ a^{<\xi>}\equiv\vert a\vert^{\xi}\mathrm{sign}(a)$.

For estimation and diagnostics, the relation (5) can be written as a regression model with a constant term,

$\displaystyle y_{t} = c + \theta x_{t} + u_{t},$ (6)

where the maintained hypothesis is that $ u_{t}$ is iid $ S\alpha S$, with $ \alpha\in(1,2]$. The econometric issues of interest are to estimate $ \theta$ properly, to test the hypothesis of significance for the estimated parameter, usually based on the $ t$-statistic, as well as to compute model diagnostics, such as the coefficient of determination, the Durbin-Watson statistic, and the $ F$-test of parameter constancy across subsamples.

The effects of infinite variance in the regressor and disturbance term can be substantial. If the variables share the same index of stability $ \alpha$, the ordinary least squares (OLS) estimate of $ \theta$ is still consistent, but its asymptotic distribution is $ \alpha$-stable with the same $ \alpha$ as the underlying variables. Furthermore, the convergence rate to the true parameter is $ T^{(\alpha-1)/\alpha}$, smaller than the rate $ T^{1/2}$ which applies in the finite-variance case. If $ \alpha<2$, OLS loses its best linear unbiased estimator (BLUE) property, i.e., it is no longer the minimum-dispersion estimator in the class of linear estimators of $ \theta$. In addition, the asymptotic efficiency of the OLS estimator converges to zero as the index of stability $ \alpha$ declines to $ 1$. Blattberg and Sargent (1971) (henceforth, BS) derived the BLUE for $ \theta$ in (6) if the value of $ \alpha$ is known. The BS estimator is given by

$\displaystyle {_{\alpha}}\hat{\theta}_{BS} = \frac{\sum^{T}_{t=1} x_{t}^{<1/(\alpha-1)>} y_{t}} {\sum^{T}_{t=1} \vert x_{t}\vert^{\alpha/(\alpha-1)}}\,, \qquad1<\alpha\le2,$ (7)

which coincides with the OLS estimator if $ \alpha=2$. Kim and Rachev (1999) prove that the asymptotic distribution of the BS estimator is also $ \alpha $-stable. Samorodnitsky et al. (2007) consider an optimal power estimate based on the BS estimator for unknown $ \alpha$, and they also provide an optimal linear estimator of the regression coefficients for various configurations of the indices of stability of $ x_{t}$ and $ u_{t}$. Other efficient estimators of the regression coefficients have been studied as well; Kanter and Steiger (1974) propose an unbiased $ L_{1}$-estimator, which excludes very large shocks in its estimation to avoid excess sensitivity due to outliers. Using a weighting function, McCulloch (1998) considers a maximum-likelihood estimator which is based on an approximation to a symmetric stable density.
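To make the BS estimator in equation (7) concrete, the following sketch implements it directly from the formula, using the signed power $ a^{<p>}=\vert a\vert^{p}\mathrm{sign}(a)$ defined earlier. The function names and the toy data are illustrative assumptions; as in equation (7), no intercept is estimated.

```python
import numpy as np

def signed_power(a, p):
    """a^{<p>} = |a|^p * sign(a)."""
    return np.sign(a) * np.abs(a) ** p

def bs_estimator(x, y, alpha):
    """Blattberg-Sargent estimator of theta, equation (7), for 1 < alpha <= 2."""
    if not 1.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (1, 2]")
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    numerator = np.sum(signed_power(x, 1.0 / (alpha - 1.0)) * y)
    denominator = np.sum(np.abs(x) ** (alpha / (alpha - 1.0)))
    return numerator / denominator

# Toy data (an assumption for illustration): y = 2 x + noise, no intercept.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = 2.0 * x + 0.5 * rng.standard_normal(200)

ols_no_intercept = np.sum(x * y) / np.sum(x**2)
print("BS at alpha = 2  :", bs_estimator(x, y, 2.0))
print("OLS, no intercept:", ols_no_intercept)
print("BS at alpha = 1.5:", bs_estimator(x, y, 1.5))
```

For $ \alpha<2$ the exponent $ 1/(\alpha-1)$ exceeds 1, so large values of $ \vert x_{t}\vert$ receive more weight than under OLS; the exponent grows without bound as $ \alpha\downarrow1$, which can cause numerical overflow in practice.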

Hypothesis testing is also affected considerably when the regressors and disturbance terms have infinite-variance stable distributions. For example, the $ t$-statistic, commonly used to test the null hypothesis of parameter significance, no longer has a conventional Student-$ t$ distribution if $ \alpha<2$. Rather, as established by Logan et al. (1973), its pdf has modes at $ -1$ and $ +1$; for $ \alpha<1$ these modes are infinite. Kim (2003) provides empirical distributions of the $ t$-statistic for finite degrees of freedom and various values of $ \alpha$ by simulation. The usual applied goodness-of-fit test statistics, such as the likelihood ratio, Lagrange multiplier, and Wald statistics, also no longer have the conventional asymptotic $ \chi^{2}$ distribution, but have a stable $ \chi^{2}$ distribution, a term that was introduced by Mittnik et al. (1998).

In time series regressions with infinite-variance innovations, Phillips (1990) shows that the limit distributions of the augmented Dickey-Fuller tests for a unit root are functionals of Lévy processes, whereas they are functionals of Brownian motion processes in the finite-variance case. The $ F$-test statistic for parameter constancy that is based on the residuals from a sample split test has an $ F$-distribution in the conventional, finite-variance case. Kurz-Kim et al. (2005) obtain the limiting distribution of the $ F$-test if the random variables have infinite variance. As shown by the authors, as well as by Runde (1993), the limiting distribution of the $ F$-statistic for $ \alpha<2$ behaves completely differently from the Gaussian case: whereas in the latter case the statistic converges to 1 under the null as the degrees of freedom for both numerator and denominator of the statistic approach infinity, in the former case the statistic converges to a ratio of two independent, positive, and maximally right-skewed $ {\alpha/2}$-stable distributions. This result is used below to derive closed-form expressions for the pdf and cumulative distribution function (cdf) of the limiting distribution of the $ R^{2}$ statistic if the regressor and disturbance term share the same index of stability $ \alpha<2$.

Moreover, commonly used criteria for judging the validity of some of the maintained hypotheses of a regression model, such as the Durbin-Watson statistic and the Box-Pierce $ Q$-statistic, would be inappropriate if one were to rely on conventional critical values. Phillips and Loretan (1991) study the properties of the Durbin-Watson statistic for regression residuals with infinite variance, and Runde (1997) examines the properties of the Box-Pierce $ Q$-statistic for random variables with infinite variance. Loretan and Phillips (1994) and Phillips and Loretan (1994) establish that both the size of tests of covariance stationarity under the null and their rate of divergence under the alternative are strongly affected by failure of standard moment conditions; indeed, standard tests of covariance stationarity are inconsistent if population second moments do not exist.

3  Asymptotic properties of the coefficient of determination in models with $ \alpha$-stable regressors and error terms

3.1  Basic results

For the general asymptotic theory of stochastic processes with stable random variables, we refer to Resnick (1986) and Davis and Resnick (1985a, 1985b, 1986). Our results in this section are, in large part, an application of their work to the regression diagnostic context.

The maintained assumptions are:

  1. The relationship between the dependent and independent variable conforms to the classical bivariate linear regression model,
    $\displaystyle y_{t} = c + \theta x_{t} + u_{t},\quad t=1,\dots,T\,.$ (8)

  2. $ u_{t}$ is iid $ S\alpha S$ $ (\alpha_{u},0,\gamma_{u},0)$, with $ \alpha_{u} \in(1,2)$.
  3. $ x_{t}$ is exogenous and is also iid $ S\alpha S$ $ (\alpha_{x},0, \gamma_{x},0)$, with $ \alpha_{x} \in(1,2)$.
  4. The regressor and the error term have the same index of stability, i.e., $ \alpha_{x}=\alpha_{u}=\alpha$.
  5. The coefficients $ c$ and $ \theta$ are consistently estimated by $ \hat c$ and  $ \hat\theta$.12

The fourth assumption, that the regressor and the error term have the same index of stability, is rather strong, and its validity may be difficult to ascertain in empirical applications. In Corollary 2 below, we examine the consequences of having unequal values for the indices of stability for $ x_{t}$ and $ u_{t}$ for the asymptotic properties of the coefficient of determination.

The coefficient of determination measures the proportion of the total squared variation in the dependent variable that is explained by the regression:

$\displaystyle R^{2} = \frac{\text{Explained Sum of Squares}}{\text{Total Sum of Squares}} = \frac{\sum_{t=1}^{T}(\hat y_{t}-\bar y)^{2} }{ \sum_{t=1}^{T} (y_{t}-\bar y)^{2}}\,. $

Because $ \hat y_{t}-\bar y=\hat\theta(x_{t}-\bar x)$ and $ y_{t}-\bar y = \hat\theta(x_{t}-\bar x) +\hat u_{t}$, where $ \bar y$ and $ \bar x$ are the respective sample averages of $ y_{t}$ and $ x_{t}$, and because $ \sum_{t=1}^{T} (x_{t}-\bar x)\hat u_{t}=0$ by construction, the coefficient of determination may be written as

$\displaystyle R^{2} = \frac{\hat\theta^{2} \sum_{t=1}^{T} (x_{t}-\bar x)^{2}} {\hat\theta^{2} \sum_{t=1}^{T} (x_{t}-\bar x)^{2} +\sum_{t=1}^{T} \hat u_{t}^{2}} \,.$ (9)
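The equivalence between the textbook definition of $ R^{2}$ and the form in equation (9) can be verified numerically for an OLS fit with intercept. The sketch below is illustrative only; the function name `r2_two_ways` and the simulated data are assumptions.

```python
import numpy as np

def r2_two_ways(x, y):
    """Return (ESS/TSS, equation-(9) form) for an OLS fit of y on a constant and x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    theta_hat = np.sum(xc * (y - y.mean())) / np.sum(xc**2)
    c_hat = y.mean() - theta_hat * x.mean()
    y_hat = c_hat + theta_hat * x
    u_hat = y - y_hat                      # OLS residuals, orthogonal to xc
    ess_over_tss = np.sum((y_hat - y.mean())**2) / np.sum((y - y.mean())**2)
    signal = theta_hat**2 * np.sum(xc**2)  # explained sum of squares
    eq9 = signal / (signal + np.sum(u_hat**2))
    return ess_over_tss, eq9

# Simulated data (an assumption for illustration).
rng = np.random.default_rng(1)
x = rng.standard_normal(100)
y = 0.5 + 1.5 * x + rng.standard_normal(100)
definition, equation_9 = r2_two_ways(x, y)
print(definition, equation_9)
```

The two numbers agree up to rounding because the cross term $ \sum(x_{t}-\bar x)\hat u_{t}$ vanishes for OLS residuals; for other estimators of $ \theta$ the two expressions need not coincide.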

Since $ x_{t}^{2}$ and $ u_{t}^{2}$ are in the normal domain of attraction of a stable distribution with index of stability $ {\alpha/2}$, norming by $ T^{-2/\alpha}$ rather than by $ T^{-1}$ is required to obtain non-degenerate limits for the sums of the squared variables. Because $ \hat\theta\to_{p} \theta$ by the assumption of consistent estimation, an application of the law of large numbers to $ \bar x$, the continuous mapping theorem, and the results of Davis and Resnick (1985b) yield the following expression for the joint limiting distribution of the elements in equation (9):

$\displaystyle \Bigl( T^{-2/\alpha} \gamma^{-2}_{u} \sum^{T}_{t=1} \hat u^{2}_{t}, \hat\theta^{2} T^{-2/\alpha} \gamma^{-2}_{x} \sum^{T}_{t=1} (x_{t}-\bar x)^{2} \Bigr) \sim\Bigl( T^{-2/\alpha} \gamma^{-2}_{u} \sum ^{T}_{t=1} u^{2}_{t}, \theta^{2} T^{-2/\alpha} \gamma^{-2}_{x} \sum^{T}_{t=1} x^{2}_{t} \Bigr) = \Bigl( T^{-2/\alpha} \sum^{T}_{t=1} (u_{t}/\gamma_{u})^{2}, \theta^{2} T^{-2/\alpha} \sum^{T}_{t=1} (x_{t}/\gamma_{x})^{2} \Bigr) \to_{d} \left( S_{u},\theta^{2} S_{x}\right) .$ (10)

For $ \alpha<2$, the random variables $ S_{u}$ and $ S_{x}$ are independent, maximally right-skewed, and positive stable random variables with index of stability  $ {\alpha/2}<1$, $ \beta=+1$, $ \gamma=1$,13 $ \delta=0$, and log characteristic function

$\displaystyle \ln\E \left( e^{i\tau S_{x}}\right) = \ln\E \left( e^{i\tau S_{u}}\right) = -\vert\tau\vert^{\alpha/2} \bigl(1-i\,\mathrm{sign} (\tau)\,\tan(\pi\alpha/4)\bigr) \,.$ (11)

We therefore conclude that, under the five maintained assumptions of this section, the $ R^{2}$ statistic of the regression model (8) has the following asymptotic distribution.

Theorem 1   Under the maintained assumptions of the regression model in equation (8), the coefficient of determination is distributed asymptotically as
$\displaystyle R^{2} \to_{d} \frac{\theta^{2}\gamma_{x}^{2} S_{x}}{\theta^{2} \gamma_{x}^{2} S_{x}+\gamma_{u}^{2} S_{u}}=\frac{\eta S_{x}}{\eta S_{x}+S_{u}} =\frac{\eta Z}{\eta Z+1} = {\widetilde{R}}(\alpha,\eta)\,$ , say, (12)

where $ \eta=(\theta\gamma_{x}/\gamma_{u})^{2} \ge0$14 and $ Z=S_{x}/S_{u}$. For $ \alpha<2$, $ S_{x}$ and $ S_{u}$ are independent and are identically distributed with log characteristic functions given by equation (11).

Thus, for $ \alpha<2$ and $ \eta>0$, the coefficient of determination does not converge to a constant but has a nondegenerate asymptotic distribution on the interval $ [0,1]$. This contrasts starkly with the standard, finite-variance result, which is stated here for completeness.

Corollary 1   If $ \alpha=2$, and hence if $ x_{t}$ and $ u_{t}$ have finite variance, the limit variables $ S_{x}$ and $ S_{u}$ in Theorem 1 are non-random constants and are, in fact, equal to $ 2$.15 In the finite-variance case, then, the limit of $ R^{2}$ as $ T\to\infty$ is given by
$\displaystyle R^{2}\to_{p} \frac{\theta^{2} \sigma_{x}^{2}}{\theta^{2} \sigma_{x} ^{2}+\sigma_{u}^{2}}= \frac{\eta}{\eta+1}\,, $
where now $ \eta=(\theta\sigma_{x}/\sigma_{u})^{2}$.

In the finite-variance case, the model's asymptotic signal-to-noise ratio, $ \eta=(\theta\sigma_{x}/\sigma_{u})^{2}$, is constant, as is therefore the limit of the coefficient of determination. In contrast, in the infinite-variance case the model's limiting signal-to-noise ratio is given by $ \eta Z$, where $ \eta=(\theta\gamma_{x}/ \gamma_{u})^{2}$ and $ Z=S_{x}/S_{u}$, and is therefore a random variable even asymptotically; it is this feature that causes the randomness of  $ {\widetilde{R}}(\alpha,\eta)$. We postpone a fuller discussion of the intuition that underlies this result to the end of this section, after we provide a detailed analysis of the statistical properties of  $ {\widetilde{R}}$.
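The contrast between Theorem 1 and Corollary 1 can be illustrated with a small Monte Carlo sketch. The code below is not part of the paper's analysis. It assumes $ c=0$ and $ \theta=\gamma_{x}=\gamma_{u}=1$ (so that $ \eta=1$), draws symmetric stable variates with the Chambers-Mallows-Stuck method, and uses the fact that $ R^{2}$ in a bivariate regression with intercept equals the squared sample correlation. Function names, the sample size, and the number of replications are illustrative choices.

```python
import numpy as np

def sym_stable(alpha, size, rng):
    """Draw S(alpha, 0, 1, 0) variates (alpha != 1) via Chambers-Mallows-Stuck."""
    v = rng.uniform(-np.pi / 2.0, np.pi / 2.0, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def simulate_r2(alpha, theta, T, reps, rng):
    """R^2 from `reps` independent samples of y_t = theta * x_t + u_t."""
    x = sym_stable(alpha, (reps, T), rng)
    u = sym_stable(alpha, (reps, T), rng)
    y = theta * x + u
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    # Squared sample correlation = R^2 of the regression with intercept.
    return np.sum(xc * yc, axis=1) ** 2 / (np.sum(xc**2, axis=1) * np.sum(yc**2, axis=1))

def share_far_from(r2, centre=0.5, width=0.25):
    """Fraction of replications with |R^2 - centre| > width."""
    return float(np.mean(np.abs(r2 - centre) > width))

rng = np.random.default_rng(12345)
r2_heavy = simulate_r2(1.5, 1.0, T=2000, reps=400, rng=rng)   # alpha = 1.5
r2_gauss = simulate_r2(2.0, 1.0, T=2000, reps=400, rng=rng)   # alpha = 2 (Gaussian)

print("alpha=1.5: median", np.median(r2_heavy), "share far from 0.5", share_far_from(r2_heavy))
print("alpha=2.0: median", np.median(r2_gauss), "share far from 0.5", share_far_from(r2_gauss))
```

With $ \eta=1$ both medians should lie near $ \eta/(\eta+1)=0.5$, but only in the infinite-variance case should a sizeable share of replications fall far from that value, reflecting the randomness of $ Z$.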

Before doing so, however, we note that the fourth maintained assumption, i.e., that the indices of stability of the regressor and error term in (8) be the same, is crucial for obtaining the result that the asymptotic distribution of  $ {\widetilde{R}}$ is nondegenerate. Indeed, if the two indices of stability differ, the asymptotic properties of the $ R^{2}$ statistic are as follows.

Corollary 2   Suppose that the maintained assumptions of Theorem 1 apply except that $ \alpha_{x} \ne\alpha_{u}$, i.e., suppose that the indices of stability of the regressor and error term are unequal. Let $ \theta\ne0$ to rule out the trivial case from further consideration. Then,

Thus, $ R^{2}$ converges to $ 1$ in probability if $ \alpha_{x}<\alpha_{u}$, and it converges to 0 in probability if $ \alpha_{u}<\alpha_{x}$.

Proof. These results follow immediately from the fact that if $ \alpha_x \ne \alpha_u$, different norming factors, viz., $ T^{2/\alpha_x}$ and $ T^{2/\alpha_u}$, are needed in equation (10) to achieve joint convergence of the terms $ \hat{\theta}^2\sum^T_{t=1} (x_t-\bar x)^2$ and $ \sum^T_{t=1} \hat{u}_t^2$ to the limiting random variables $ S_x$ and $ S_u$. Whenever the two norming factors differ, the larger of the two factors dominates the ratio that defines $ R^2$ as $ T\to\infty$, and this statistic must therefore converge either to 0 or 1 in probability. Suppose first that $ \alpha_x<\alpha_u$; since $ T^{2/\alpha_x}> T^{2/\alpha_u}$, we find $ T^{-2/\alpha_x} \sum\hat u_t^2 = T^{2/\alpha_u- 2/\alpha_x} \left(T^{-2/ \alpha_u} \sum\hat u_t^2\right) = O_p \left( T^{2/\alpha_u- 2/\alpha_x} \right) = o_p(1)$ . Therefore,
$\displaystyle \begin{aligned} R^2 &= \frac{\hat{\theta}^2 T^{-2/\alpha_x}\sum(x_t-\bar x)^2}{ \hat{\theta}^2 T^{-2/\alpha_x}\sum(x_t-\bar x)^2 + T^{-2/\alpha_x} \sum \hat u_t^2 } \\ &= \frac{\hat{\theta}^2 T^{-2/\alpha_x}\sum(x_t-\bar x)^2}{ \hat{\theta}^2 T^{-2/\alpha_x}\sum(x_t-\bar x)^2 + O_p\left( T^{2/\alpha_u- 2/\alpha_x} \right)}\\ &\to_p 1, \end{aligned}$
because the leading term $ \hat{\theta}^2 T^{-2/\alpha_x}\sum(x_t-\bar x)^2$ converges in distribution to $ \theta^2\gamma_x^2 S_x$, which is positive a.s.

Similarly, if $ \alpha_u<\alpha_x$, $ T^{-2/\alpha_u} \sum(x_t-\bar x)^2 =O_p\left(T^{2/\alpha_x-2/\alpha_u}\right) = o_p(1)$ , and $ R^2\to_p 0$. $ \qedsymbol$

Heuristically, if $ \alpha_{x}\ne\alpha_{u}$ and $ \theta\ne0$, the limiting distribution of the $ R^{2}$ statistic is degenerate at 0 or 1 because the model's asymptotic signal-to-noise ratio is either zero (if $ \alpha_{u} < \alpha_{x}$) or infinite (if $ \alpha_{x} < \alpha_{u}$). From an examination of the proof of this corollary, we can also deduce that if $ \alpha_{x} \ne\alpha_{u}$, the fifth maintained assumption--that the regression coefficients are estimated consistently--could be relaxed, to require merely that an estimation method be employed that guarantees $ \hat{\theta}\ne o_{p}(1)$; the result that $ R^{2}$ converges either to 0 or 1 would continue to hold in this case.
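The degenerate limits in Corollary 2 can also be illustrated by simulation. The sketch below is an illustration under stated assumptions rather than a reproduction of any computation in the paper: it sets $ c=0$, $ \theta=\gamma_{x}=\gamma_{u}=1$, uses the Chambers-Mallows-Stuck method for symmetric stable draws, and picks widely separated indices (1.2 and 1.9) so that the effect is visible at a moderate sample size. All names and settings are hypothetical.

```python
import numpy as np

def sym_stable(alpha, size, rng):
    """Draw S(alpha, 0, 1, 0) variates (alpha != 1) via Chambers-Mallows-Stuck."""
    v = rng.uniform(-np.pi / 2.0, np.pi / 2.0, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def simulate_r2_unequal(alpha_x, alpha_u, theta, T, reps, rng):
    """R^2 from `reps` samples of y_t = theta * x_t + u_t with different indices."""
    x = sym_stable(alpha_x, (reps, T), rng)
    u = sym_stable(alpha_u, (reps, T), rng)
    y = theta * x + u
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    # Squared sample correlation = R^2 of the regression with intercept.
    return np.sum(xc * yc, axis=1) ** 2 / (np.sum(xc**2, axis=1) * np.sum(yc**2, axis=1))

rng = np.random.default_rng(2024)
# Regressor has the heavier tail (alpha_x < alpha_u): R^2 should pile up near 1.
r2_x_heavier = simulate_r2_unequal(1.2, 1.9, 1.0, T=5000, reps=200, rng=rng)
# Error term has the heavier tail (alpha_u < alpha_x): R^2 should pile up near 0.
r2_u_heavier = simulate_r2_unequal(1.9, 1.2, 1.0, T=5000, reps=200, rng=rng)

print("alpha_x < alpha_u: median R^2 =", np.median(r2_x_heavier))
print("alpha_u < alpha_x: median R^2 =", np.median(r2_u_heavier))
```

Convergence is governed by the factor $ T^{\vert 2/\alpha_{u}-2/\alpha_{x}\vert}$, so when the two indices are close the drift toward 0 or 1 can be very slow and may not be apparent in samples of realistic size.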

3.2  Qualitative properties of $ {\widetilde{R}}$

Returning to the main case of $ \alpha_{x}=\alpha_{u}=\alpha$, we note that the random variable  $ {\widetilde{R}}$ is defined for all values of $ \alpha \in(0,2)$, even though in a regression context one would typically assume that $ \alpha\in(1,2)$. We now establish some important qualitative properties of  $ {\widetilde{R}}$.

Remark 1   For $ \eta>0$, the median of $ {\widetilde{R}}$, $ m$, equals $ \eta/(\eta+1)$.
Proof. For $ \eta>0$, observe that
$\displaystyle \begin{aligned} \P\left({\widetilde{R}} \le \frac{\eta}{\eta+1}\right) &= \P\left(\frac{\eta S_x}{\eta S_x+S_u}\le \frac{\eta}{\eta+1}\right)\\ &= \P\left( S_x \le \frac{1}{\eta+1}\left(\eta S_x+S_u\right) \right)\\ &= \P\bigl( (\eta+1)S_x-\eta S_x \le S_u \bigr)\\ &= \P\left(S_x\le S_u\right)\,. \end{aligned}$

Because $ S_x$ and $ S_u$ are iid and have continuous cdfs, $ \P (S_x\le S_u)=0.5$ by an application of Fubini's Theorem. 16 $ \qedsymbol$

Thus, $ m$ is equal to the non-random limit of $ R^{2}$ in the finite-variance case. Since $ S_{x}$ and $ S_{u}$ are positive a.s., we also have $ \P (S_{x}/S_{u}\le1)\equiv\P (Z\le1)=0.5$, i.e., the median of $ Z$ is equal to 1, regardless of the value of $ \alpha$. As we will demonstrate rigorously later in this paper, the probability mass of $ Z$ is highly concentrated around 1 for values of $ \alpha$ close to 2. Conversely, for small values of $ \alpha$, $ Z$ is unlikely to be close to 1; instead, it is very likely that one will obtain a draw of $ Z$ that is either very small, i.e., close to 0, or very large. A small or large draw of $ Z$ has a crucial effect on the model's signal-to-noise ratio, $ \eta Z$, and therefore also on $ R^{2}$. This suggests that an informal measure of the effect of infinite variance in the regression variables on the value of $ R^{2}$ in a given sample may be based on the difference between the model's coefficient of determination and a consistent estimate of its median $ m$, say $ {\hat m}={\hat\eta}/ ({\hat\eta}+1)$, where $ {\hat\eta}=( {\hat\theta} {\hat\gamma_{x}}/ {\hat\gamma_{u}})^{2}$ . The larger the difference between $ R^{2}$ and $ \hat m$, the more important the effect is of having obtained a small (or large) value of $ Z$.

The following remark shows that a finite-variance property of  $ R^{2}(\eta)$ for $ \eta>0$, viz., $ R^{2}(1/\eta)=1-R^{2}(\eta)$, carries over in a natural way to  $ {\widetilde{R}}$.

Remark 2   For $ \eta>0$, the distribution of $ {\widetilde{R} }({\alpha},\eta)$ is skew-symmetric, viz.,
$\displaystyle {\widetilde{R}}({\alpha},\eta)\overset{d}{=} 1-{\widetilde{R}}({\alpha} ,1/\eta)$

or equivalently, $ {\widetilde{R}}({\alpha},m)\overset{d}{=} 1-{\widetilde{R}}({\alpha},1-m).$ The pdf of $ {\widetilde{R}}$ therefore satisfies
$\displaystyle f_{{\widetilde{R}}({\alpha},m)}(r)=f_{{\widetilde{R}}({\alpha},1-m)} (1-r)\quad\forall r \in[0,1].$

The distribution of $ {\widetilde{R}}$ is symmetric about $ 0.5$ for $ \eta=1$.
Proof. Recall that $ S_x$ and $ S_u$ are iid. Thus, for $ \eta>0$
$\displaystyle \begin{aligned} 1-{\widetilde{R}}(\alpha,1/\eta) &= 1-\frac{(1/\eta) S_x}{(1/\eta) S_x+S_u}\\ &= \frac{S_u}{(1/\eta) S_x+S_u}\\ &= \frac{\eta S_u}{\eta S_u +S_x}\\ &\overset{d}{=} \frac{\eta S_x}{\eta S_x+S_u} = {\widetilde{R}}(\alpha,\eta). \end{aligned}$

The symmetry of $ {\widetilde{R}}$ about $ 0.5$ for $ \eta=1$ follows immediately from this result and the fact that the distribution's support is the interval [0, 1].

$ \qedsymbol$

Next, as the following remark shows, the pdf of $ {\widetilde{R}}$ has infinite modes at 0 and $ 1$, i.e., at the endpoints of its support.

Remark 3 (i)   For $ \eta>0$, the pdf of $ {\widetilde{R}}$ is unbounded at 0 and $ 1$, i.e., $ f_{{\widetilde{R}}}(0)=f_{{\widetilde{R}}}(1)=\infty$. (ii) The cdf of $ {\widetilde{R}}$ is continuous on $ [0,1]$, and the distribution does not have atoms at 0 and $ 1$.
Proof. To demonstrate the validity of the first part of this remark, we apply a standard result for the pdf of the ratio of two random variables, 17 adapted to the present case where the random variables in the numerator and denominator are both strictly positive. For $ \eta>0$, set $ V=\eta S_x$ and $ W=\eta S_x+S_u$. We have
$\displaystyle f_{\wR }(r)=\int_0^\infty\! w f_{V,W}(rw,w)\d w, \qquad 0\le r\le 1,$

where the joint pdf $ f_{V,W}(\cdot,\cdot)$ is nonzero on $ \mathbb{R}^+ \times \mathbb{R}^+$. The case $ r=1$ can occur only if $ S_u=0$; if $ S_u=0$, however, the random variables $ V$ and $ W$ are perfectly dependent, their joint pdf is nonzero only on the positive $ 45^\circ$-halfline, and the joint pdf $ f_{V,V}(w,w)$ reduces to  $ (1/\sqrt{2})f_V(w)$, $ w\ge 0$. Hence, for $ r=1$ we find

$\displaystyle f_{\wR }(1)= \int_0^\infty\! w f_{V,V}(1\cdot w,w)\d w = \frac{1}{\sqrt{2}} \int_0^\infty\! w f_V(w) \d w = \frac{1}{\sqrt{2}} \E (\eta S_x) = \infty.$

By Remark 2, we have $ f_{\wR }(0)=\infty$ as well. The continuity of the cdf of $ \wR $ on $ [0,1]$ for $ \eta>0$ follows from the continuity of the cdfs of $ S_x$ and $ S_u$ on  $ \mathbb{R}^+$ and the fact that their pdfs are equal to zero at the origin. For example, one finds that $ \P (\wR =1)=\P (S_u=0)=0$; the result $ \P (\wR =0)=0$ then follows from Remark 2.

$ \qedsymbol$

The fact that the probability density function of $ {\widetilde{R}}$ has infinite singularities may seem unusual. However, the presence of singularities is a regular feature of pdfs that are based on ratios of stable random variables. For example, Logan et al. (1973) and Phillips and Hajivassiliou (1987) showed that if $ \alpha<1$, the density of the $ t$-statistic has infinite modes at $ -1$ and $ +1$; similarly, Phillips and Loretan (1991) demonstrated that if $ \alpha<2$, this feature is also present in the asymptotic distributions of the von Neumann ratio and the normalized Durbin-Watson test statistic.

3.3  The cdf and pdf of $ {\widetilde{R}}$

The remarks in the preceding subsection provide important qualitative information about some of the distributional properties of $ {\widetilde{R}}$. However, they do not address issues such as whether the distribution has modes beyond those at 0 and 1, whether the discontinuity of the pdf at the endpoints is simple or if $ f_{{\widetilde{R}}}(r)$ diverges (and, if so, at which rate) as $ r\downarrow0$ or $ r\uparrow1$, or how much of the distribution's mass is concentrated near the endpoints of the support. To examine these issues, we provide expressions for the cdf and pdf of $ {\widetilde{R}}$ in this subsection. It is possible to do so because $ {\widetilde{R}}$ is a continuously differentiable and invertible function of the ratio of two independent, maximally right-skewed, and positive $ \alpha$-stable random variables, and because closed-form expressions for the cdf and pdf of this ratio are known. The latter expressions are provided in the following proposition.

Proposition 1 (Zolotarev 1986, p. 205; Runde 1993, p. 11)   Let $ S_{1}$ and $ S_{2}$ be two iid positive $ \alpha$-stable random variables with common parameters $ {\alpha/2}\in(0,1)$, $ \beta=+1$, $ \gamma=1$, and $ \delta=0$. Set $ Z=S_{1}/S_{2}$. For $ z\ge0$, the cdf of $ Z$ is given by
$\displaystyle F_{Z}(z) = \P (Z\le z) = \frac{1}{\pi{\alpha/2}}\arctan\left( \frac{z^{{\alpha/2}}+ \cos(\pi{\alpha/2})} {\sin(\pi{\alpha/2})}\right) -\frac{1}{\alpha}+1\,.$ (13)

Differentiating this expression with respect to $ z$, the pdf of $ Z$ for $ z>0$ is obtained as
$\displaystyle f_{Z}(z) = \frac{\d }{\d z}F_{Z}(z) = \frac{\sin(\pi{\alpha /2})}{\pi z\bigl[ z^{-{\alpha/2}}+z^{{\alpha/2}}+ 2\cos(\pi{\alpha/2}) \bigr] }\,.$ (14)

As $ Z$ is a positive random variable, $ F_{Z}(z)=f_{Z}(z)=0$ for $ z<0$.
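Equations (13) and (14) are straightforward to evaluate numerically. The following is a minimal sketch using only the Python standard library; the function names `cdf_z` and `pdf_z` are our own labels, not notation from the paper. The printed checks confirm that the median of $ Z$ equals 1 and that a finite-difference derivative of (13) agrees with (14).

```python
import math


def cdf_z(z: float, alpha: float) -> float:
    """Equation (13): cdf of Z = S1/S2, where S1 and S2 have index alpha/2."""
    if z <= 0.0:
        return 0.0
    a = alpha / 2.0  # index of stability of S1 and S2, in (0, 1)
    arg = (z ** a + math.cos(math.pi * a)) / math.sin(math.pi * a)
    return math.atan(arg) / (math.pi * a) - 1.0 / alpha + 1.0


def pdf_z(z: float, alpha: float) -> float:
    """Equation (14): pdf of Z; zero for z < 0 and infinite at z = 0."""
    if z < 0.0:
        return 0.0
    if z == 0.0:
        return math.inf
    a = alpha / 2.0
    bracket = z ** -a + z ** a + 2.0 * math.cos(math.pi * a)
    return math.sin(math.pi * a) / (math.pi * z * bracket)


for alpha in (0.25, 1.0, 1.5, 1.98):
    # The median of Z should be 1 for every alpha in (0, 2).
    print(alpha, cdf_z(1.0, alpha))

# Central finite difference of the cdf compared with the closed-form pdf.
z0, h = 2.0, 1e-6
print((cdf_z(z0 + h, 1.5) - cdf_z(z0 - h, 1.5)) / (2.0 * h), pdf_z(z0, 1.5))
```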

The cdf of the random variable $ Z$ is shown in Figure 2 for various values of $ \alpha$ between 1.98 and 0.25.18 The random variable $ Z$ has several interesting properties. First, note that $ \lim_{z\downarrow0} f_{Z}(z)=\infty$ and that the rate of divergence to infinity of $ f_{Z}(z)$ as $ z\downarrow0$ is given by $ (1/z)^{1-\alpha/2}$; thus, the pdf of $ Z$ has a one-sided infinite singularity at 0. Second, as $ z\to\infty$, $ f_{Z} (z)\approx\kappa\cdot z^{-{\alpha/2}-1}$ for a suitable constant $ \kappa>0$. This result, along with $ \P (Z>0)=1$, implies that $ Z$ lies in the normal domain of attraction of a positive stable distribution, say $ Z^{\prime}$, with index of stability $ {\alpha/2}$ and $ \beta=+1$, the same parameters as those of the variables $ S_{1}$ and $ S_{2}$.19 Hence, the mean of $ Z$ is infinite for all values of $ \alpha<2$. Third, in the special case of $ \alpha=1$, $ S_{1}$ and $ S_{2}$ are each distributed as a Lévy $ \alpha$-stable random variable, which is well known to be equivalent to the inverse of a $ \chi^{2}(1)$ random variable. For $ \alpha=1$, then, the pdf of $ Z$ reduces to $ \bigl(\pi z^{1/2}(1+z) \bigr)^{-1}$, which is also the pdf of an $ F_{1,1}$ distribution; see Runde (1993).
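The three properties just listed can be checked numerically from equation (14). The sketch below, with helper names of our own choosing, estimates the log-log slope of $ f_{Z}$ near the origin and in the far right tail, which should be close to $ \alpha/2-1$ and $ -\alpha/2-1$, respectively, and compares $ f_{Z}$ at $ \alpha=1$ with the $ F_{1,1}$ density.

```python
import math


def pdf_z(z: float, alpha: float) -> float:
    """Equation (14), evaluated for z > 0."""
    a = alpha / 2.0
    bracket = z ** -a + z ** a + 2.0 * math.cos(math.pi * a)
    return math.sin(math.pi * a) / (math.pi * z * bracket)


def pdf_f11(z: float) -> float:
    """Density of the F(1,1) distribution, 1 / (pi * sqrt(z) * (1 + z))."""
    return 1.0 / (math.pi * math.sqrt(z) * (1.0 + z))


def loglog_slope(alpha: float, z_lo: float, z_hi: float) -> float:
    """Slope of log f_Z against log z between two points."""
    rise = math.log(pdf_z(z_hi, alpha)) - math.log(pdf_z(z_lo, alpha))
    return rise / (math.log(z_hi) - math.log(z_lo))


alpha = 1.5
print("slope near 0:  ", loglog_slope(alpha, 1e-12, 1e-10), "vs", alpha / 2.0 - 1.0)
print("slope in tail: ", loglog_slope(alpha, 1e10, 1e12), "vs", -alpha / 2.0 - 1.0)
for z in (0.1, 1.0, 7.5):
    print(z, pdf_z(z, 1.0), pdf_f11(z))
```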

As was noted earlier, the median of $ Z$ is equal to 1 for all values of $ \alpha\in(0,2)$. The regression model's signal-to-noise ratio is given by the random variable $ \eta Z$ if $ \alpha<2$, whereas it is given by the constant $ \eta$ in the standard, i.e., finite-variance case. The fact that the random variable which multiplies $ \eta$ has a median of 1 helps to develop further the intuition that underlies the result of Remark 1, viz., the median of $ {\widetilde{R}}$, $ \eta/(\eta+1)$, is the same in both the finite-variance and the infinite-variance cases. Finally, an inspection of equation (13) reveals that, for any fixed $ c\in(0,1)$, $ \lim_{\alpha\uparrow2} \P (Z<1-c) = 0$ and $ \lim_{\alpha\uparrow2} \P (Z>1+c) = 0$; put differently, $ Z$ converges in probability to 1 as $ \alpha\uparrow2$. The probability mass of $ Z$ therefore becomes concentrated in an arbitrarily small neighborhood of 1 as $ \alpha\uparrow2$, even though, of course, its mean remains infinite as long as $ \alpha<2$.
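To see how quickly the mass of $ Z$ gathers around its median as $ \alpha\uparrow2$, one can evaluate $ F_{Z}(1+c)-F_{Z}(1-c)$ from equation (13) for a fixed half-width. The sketch below uses $ c=0.5$, an arbitrary choice made for illustration; the function names are ours.

```python
import math


def cdf_z(z: float, alpha: float) -> float:
    """Equation (13): cdf of Z for 0 < alpha < 2."""
    if z <= 0.0:
        return 0.0
    a = alpha / 2.0
    arg = (z ** a + math.cos(math.pi * a)) / math.sin(math.pi * a)
    return math.atan(arg) / (math.pi * a) - 1.0 / alpha + 1.0


def central_mass(alpha: float, c: float = 0.5) -> float:
    """P(1 - c < Z <= 1 + c), the mass of Z within c of its median."""
    return cdf_z(1.0 + c, alpha) - cdf_z(1.0 - c, alpha)


for alpha in (0.25, 0.5, 1.0, 1.5, 1.9, 1.98, 1.999):
    print(f"alpha = {alpha:5.3f}   P(0.5 < Z <= 1.5) = {central_mass(alpha):.4f}")
```

The printed probabilities rise toward 1 as $ \alpha$ approaches 2 and are small for small $ \alpha$, in line with the discussion above.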

From Theorem 1, we have $ {\widetilde{R}}=\eta Z/(\eta Z+1)=g(Z)$, say. Note that $ Z\equiv S_{x}/S_{u}$ satisfies the conditions of Proposition 1 and that the function  $ Z=g^{-1}({\widetilde{R} })=(1/\eta)\bigl({\widetilde{R}}/(1-{\widetilde{R}})\bigr)$ is continuously differentiable and strictly increasing in the interior of its domain. We are therefore able to provide the following expressions for the cdf and pdf of  $ {\widetilde{R}}$ by an application of the density transformation theorem.20

Theorem 2   For $ r\in(0,1)$ and $ \eta>0$, set $ z=g^{-1}(r)=(1/\eta )\bigl(r/(1-r) \bigr)$, and let the cdf and pdf of $ Z$ be given by equations (13) and (14). The cdf of $ {\widetilde{R}}$ for $ r\in(0,1)$ is given by
$\displaystyle F_{{\widetilde{R}}}(r) = F_{Z}\bigl[ g^{-1}(r) \bigr]\,.$ (15)

Furthermore, $ F_{{\widetilde{R}}}(0)=0$ and $ F_{{\widetilde{R}}}(1)=1$.

The pdf of  $ {\widetilde{R}}$ for $ r\in(0,1)$ is given by

$\displaystyle f_{{\widetilde{R}}}(r) = \left\vert \frac{\d }{\d r} g^{-1}(r) \right\vert f_{Z}\bigl[g^{-1}(r)\bigr] = \frac{1}{\eta(1-r)^{2}} \cdot\frac{\sin(\pi{\alpha/2})}{\pi g^{-1} (r)\Bigl( [g^{-1}(r)]^{-{\alpha/2}}+[g^{-1}(r)] ^{{\alpha/2}}+ 2\cos (\pi{\alpha/2}) \Bigr) } = \frac{\sin(\pi{\alpha/2})}{\pi\,r(1-r)} \cdot\bigl[ z^{-{\alpha/2}} +z^{{\alpha/2}}+ 2\cos(\pi{\alpha/2})\bigr]^{-1}$    where $ z=r/\bigl(\eta(1-r)\bigr)$. (16)

As $ r\downarrow0$ or $ r\uparrow1$, $ f_{{\widetilde{R}}}(r)$ diverges to infinity at a rate proportional to $ (1/r)^{1-{\alpha/2}}$ and $ \bigl(1/(1-r)\bigr)^{ 1-{\alpha/2}}$, respectively.
Proof. The results stated in equations (15) and (16) follow immediately from Proposition 1 and the density transformation theorem. Because $ \lim_{r\downarrow 0}\d g^{-1}(r)/\d r=\eta^{-1}$, the rate of divergence of $ f_{{\widetilde{R}}}(r)$ as $ r\downarrow 0$ is equal, apart from the multiplicative constant $ \eta^{-1}$, to that of $ f_Z(z)$ as $ z\downarrow 0$, which is $ (1/z)^{1-{\alpha/2}}$. Finally, it follows from Remark 2 that as $ r\uparrow 1$ the pdf of $ {\widetilde{R}}$ also diverges to infinity at this rate. $ \qedsymbol$
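Equations (15) and (16) can be evaluated directly, which also gives a numerical check of Remark 2 and a rough answer to the question of how much mass sits near the endpoints. The following is a minimal sketch; the function names and the endpoint width of 0.01 are our own illustrative choices.

```python
import math


def cdf_z(z: float, alpha: float) -> float:
    """Equation (13)."""
    if z <= 0.0:
        return 0.0
    a = alpha / 2.0
    arg = (z ** a + math.cos(math.pi * a)) / math.sin(math.pi * a)
    return math.atan(arg) / (math.pi * a) - 1.0 / alpha + 1.0


def cdf_r(r: float, alpha: float, eta: float) -> float:
    """Equation (15): F_R(r) = F_Z(g^{-1}(r)) with g^{-1}(r) = r / (eta (1 - r))."""
    if r <= 0.0:
        return 0.0
    if r >= 1.0:
        return 1.0
    return cdf_z(r / (eta * (1.0 - r)), alpha)


def pdf_r(r: float, alpha: float, eta: float) -> float:
    """Equation (16), last form, for r in (0, 1)."""
    a = alpha / 2.0
    z = r / (eta * (1.0 - r))
    bracket = z ** -a + z ** a + 2.0 * math.cos(math.pi * a)
    return math.sin(math.pi * a) / (math.pi * r * (1.0 - r) * bracket)


def edge_mass(alpha: float, eta: float = 1.0, width: float = 0.01) -> float:
    """Probability that R falls within `width` of either endpoint."""
    return cdf_r(width, alpha, eta) + 1.0 - cdf_r(1.0 - width, alpha, eta)


eta = 3.0
print("cdf at eta/(eta+1):", cdf_r(eta / (eta + 1.0), 1.5, eta))
# Remark 2: F_R(r; eta) = 1 - F_R(1 - r; 1/eta).
print(cdf_r(0.2, 1.5, eta), 1.0 - cdf_r(0.8, 1.5, 1.0 / eta))
for alpha in (0.25, 1.0, 1.9):
    print(alpha, edge_mass(alpha))
```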

The probability density functions and cumulative distribution functions of  $ {\widetilde{R}}(\alpha,\eta)$ for values of $ \alpha$ between 0.25 and 1.98 are graphed in Figures 3 and 4. (In all cases, we have set $ \eta=1$.) The pdfs in Figure 3 are shown with a logarithmic scale on the ordinate. Since we know that $ f_{{\widetilde{R}}}(0)=f_{{\widetilde{R}}}(1)= \infty$, we graph the functions only for  $ r\in\bigl(10^{-13}, 1-10^{-13}\bigr)$. The graphs show that


A heuristic summary of these properties of $ {\widetilde{R}}$ is straightforward. We begin by recalling that the multiplicative term $ C(\alpha )$, shown in equation (4) and Figure 1, affects the probability of tail-region values of the random variables in question, and that the rate of decline in the tail areas of the density of $ \alpha$-stable random variables increases as $ \alpha\uparrow2$. Suppose first that $ \alpha$ is very close to 2; then, $ C(\alpha)$ is close to 0, and the fraction of observations of $ x_{t}$ and $ u_{t}$ that fall into the respective Paretian-tail regions is therefore very low; moreover, given the fairly rapid decay of the density's tails for $ \alpha$ close to 2, the likelihood of obtaining a very large draw, conditional on obtaining a draw from the Paretian tail area, is also low. As a result, the probability of observing large values of $ x_{t}$ and $ u_{t}$ is quite low. This, in turn, makes it unlikely that one will observe a very large draw of either $ S_{x}$ or $ S_{u}$, and thus a value of $ Z$ that is either close to 0 or very large. Therefore, if $ \alpha$ is very close to 2, $ Z$ is likely close to its median of 1, and most of the mass of $ {\widetilde{R}}$ is concentrated near its median, $ \eta/(\eta+1)$. Next, as $ \alpha$ moves down and away from 2, say to around 1.5, $ C(\alpha)$ increases rapidly, leading to a higher frequency of observing tail-region draws for $ x_{t}$ and $ u_{t}$. In addition, as the density in the tail region declines more slowly for smaller values of $ \alpha$, one is much more likely to obtain very large draws of the regressor and error term than if $ \alpha$ is close to 2. In consequence, if $ \alpha$ is around 1.5, one is quite likely to obtain draws of $ Z$ that are either very close to zero or very large, and thus more of the probability mass of $ {\widetilde{R}}$ is located near the edges of its support. Conversely, the interior mode of $ {\widetilde{R}}$ is considerably less pronounced than if $ \alpha$ is close to 2. Finally, as $ \alpha$ decreases further, $ C(\alpha)$ rises further, and both the frequency of tail observations and the likelihood that any draws from the tail areas will be very large increase. Therefore, it is very likely that the largest few observations of $ x_{t}$ or $ u_{t}$ will dominate the realization of $ Z$ and therefore the realization of $ {\widetilde{R}}$. As a result, if $ \alpha$ is small, the central mode of $ {\widetilde{R}}$ vanishes entirely and almost all of its probability mass is located very close to the endpoints of the distribution's support. In the limit, as $ \alpha\downarrow0$, $ {\widetilde{R}}$ converges to a Bernoulli random variable, for which all of the probability mass is located at 0 and 1.

4  An empirical application

Fama and MacBeth (1973) proposed the so-called Fama-MacBeth regression to test the hypothesis of a linear relationship between risk and risk premium in stock returns in a cross-sectional setting. Let $ r_{it}$ be the return on market portfolio $ i$ at time $ t$, where $ i=1,\dots,N$ and $ t=1,\dots,T$; denote the average return of portfolio $ i$ as $ \bar r_{i}=T^{-1} \sum^{T}_{t=1} r_{it}$; denote the average portfolio return at time $ t$ as $ R_{t}=N^{-1} \sum ^{N}_{i=1} r_{it}$; and denote the average portfolio return across all time periods by $ \mu_{R}=T^{-1}\sum^{T}_{t=1} R_{t}$. The first-stage Fama-MacBeth regression is an ex post CAPM,

$\displaystyle r_{it}=\theta_{0i}+\theta_{i} R_{t}+u_{t}, \quad t=1,\dots,T,$ (17)

where $ \E (u_{t})=0$, $ \E (u_{t}R_{t})=0$, and $ u_{t}$ is iid $ S\alpha S$ with the same index $ \alpha\in(1,2]$ as $ r_{it}$. We may assume that the distribution of $ \theta_{i}$ has a finite mean and variance, say, $ \E (\theta)$ and $ \mathrm{Var}(\theta)$. Denote the OLS estimates of the regression coefficients in equation (17) by $ \hat{\theta}_{0i}$ and $ \hat{\theta}_{i}$. The second-stage Fama-MacBeth regression is given by

$\displaystyle \bar r_{i}=\lambda_{0}+\lambda_{1}\hat{\theta}_{i} +{\varepsilon}_{i}, \qquad i=1,\dots,N\,,$ (18)

where $ {\varepsilon}_{i}$ is iid $ S\alpha S$ with the same index $ \alpha$ as $ r_{it}$, $ \E ({\varepsilon}_{i})=0$, and $ \E ({\varepsilon}_{i} \hat{\theta }_{i}) =0$.

The $ R^{2}$ statistic of the second-stage Fama-MacBeth regression is given by

$\displaystyle R^{2} = \frac{N^{-1} \hat\lambda_{1}^{2}\sum_{i=1} ^{N}\bigl( \hat{\theta}_{i}- \overline{\hat{\theta}}_{i} \bigr)^{2}} {N^{-1}\hat\lambda_{1}^{2}\sum_{i=1}^{N}\bigl( \hat{\theta}_{i} -\overline {\hat{\theta}}_{i} \bigr)^{2}+ N^{-1}\sum_{i=1}^{N}\hat{{\varepsilon}} _{i}^{2} }\,.$ (19)

This statistic has the following asymptotic properties.

Theorem 3   If the individual portfolio returns $ r_{it}$ follow an iid $ S\alpha S$ distribution with $ \alpha\in(1,2]$ and if $ \mu_{R}>0$, the coefficient of determination in (19) has the following limits as $ T\to\infty$ and $ N\to\infty$:

Thus, if $ \alpha<2$, $ R^{2}\to_{p} 0$, at a rate that is proportional to $ N^{1-2/\alpha}$.

Proof. The result for the finite-variance case follows immediately from Corollary 1. For $ \alpha<2$, observe that the normalized estimator of $ \theta_i$, $ T^{(\alpha-1)/\alpha}\bigl(\hat{\theta}_i -\E (\theta)\bigr)$, is in the domain of attraction of an $ \alpha$-stable distribution for fixed values of $ T$. As $ T\to\infty$, the dispersion of $ \hat{\theta}_i$ about $ \E (\theta)$ converges to 0, and the distributional properties of the estimated regressors $ \hat{\theta}_i$ converge to those of $ \theta_i$; by assumption, the variance of $ \theta_i$ is finite. Thus, as $ N\to\infty$ and $ T\to\infty$, the numerator in equation (19) converges to $ \lambda_1^2 \mathrm{Var}(\theta)$. In contrast, the second summand in the denominator of (19) requires norming by $ N^{2/\alpha}>N$ in order to attain a proper limit. The coefficient of determination therefore converges to 0 in probability as $ N\to\infty$ and $ T\to\infty$, at a rate of $ N^{1-2/\alpha}$. $ \qedsymbol$
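The mechanism in this proof can be illustrated with a small Monte Carlo sketch. It is not the Fama-MacBeth procedure itself: as a simplifying assumption, the regressor is drawn directly from a normal distribution (standing in for the $ T\to\infty$ limit of $ \hat{\theta}_i$), the errors are symmetric $ \alpha$-stable draws generated with the Chambers-Mallows-Stuck method, and all parameter values, function names, and seeds are hypothetical.

```python
import math
import random
import statistics


def sas_draw(alpha: float, rng: random.Random) -> float:
    """One symmetric alpha-stable draw with unit scale (Chambers-Mallows-Stuck, alpha != 1)."""
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = max(rng.expovariate(1.0), 1e-300)  # guard against a zero exponential draw
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def r_squared(x: list, y: list) -> float:
    """R^2 of an OLS regression of y on a constant and x (squared sample correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def median_r2(n: int, alpha: float, reps: int, rng: random.Random,
              lam0: float = 0.0, lam1: float = 1.0) -> float:
    """Median R^2 over `reps` simulated cross-sections of size n."""
    values = []
    for _ in range(reps):
        theta = [rng.gauss(1.0, 0.5) for _ in range(n)]  # finite-variance regressor
        y = [lam0 + lam1 * t + sas_draw(alpha, rng) for t in theta]
        values.append(r_squared(theta, y))
    return statistics.median(values)


rng = random.Random(12345)
for n in (100, 1000, 10000):
    print(n, median_r2(n, alpha=1.5, reps=21, rng=rng))
```

Under these assumptions the median $ R^{2}$ should shrink as $ N$ grows even though the slope $ \lambda_{1}$ is nonzero, which is the qualitative content of the $ N^{1-2/\alpha}$ rate.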

This result does not conflict with the one provided in Theorem 1, as the present case is one of an unbalanced regression design: the regressor has an asymptotically finite variance, whereas the error term has infinite variance, implying that the asymptotic signal-to-noise ratio is zero. Instead, this result is closely related to the one provided in Corollary 2, which examined the asymptotic limit of $ R^{2}$ if $ \alpha_{x}\ne\alpha_{u}$. We note that even if $ T$ is fixed (as is generally taken to be the case in Fama-MacBeth regressions), the dispersion of $ \hat{\theta}_{i}$ will likely be quite a bit smaller than that of  $ {\varepsilon}_{i}$, indicating that the model's signal-to-noise ratio, $ \eta$, and hence the median of $ R^{2}$, in the