
Board of Governors of the Federal Reserve System
International Finance Discussion Papers
Number 933, June 2008
NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at http://www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from the Social Science Research Network electronic library at http://www.ssrn.com/.
Abstract:
I test for stock return predictability in the largest and most comprehensive data set analyzed so far, using four common forecasting variables: the dividend- and earnings-price ratios, the short interest rate, and the term spread. The data contain over 20,000 monthly observations from 40 international markets, including 24 developed and 16 emerging economies. In addition, I develop new methods for predictive regressions with panel data. Inference based on the standard fixed effects estimator is shown to suffer from severe size distortions in the typical stock return regression, and an alternative robust estimator is proposed. The empirical results indicate that the short interest rate and the term spread are fairly robust predictors of stock returns in developed markets. In contrast, no strong or consistent evidence of predictability is found when considering the earnings- and dividend-price ratios as predictors.
Keywords: Cross-sectional dependence, panel data, pooled regression, predictive regression, stock return predictability
JEL classification: C22, C23, G12, G15
Our empirical knowledge regarding the predictability of stock returns by variables such as the dividend-price ratio has been subject to constant updating over time. Early work by Fama and French (1988, 1989) and Campbell and Shiller (1988) concluded that there is generally strong evidence of predictability. Recent studies that use more robust econometric methods, such as Campbell and Yogo (2006) and Lewellen (2004), still find evidence of predictability, but their results are much less conclusive than the earlier studies.
Despite the mixed evidence and uncertainty regarding stock return predictability, there have been surprisingly few attempts at furthering our understanding by using data other than that of the U.S. stock-market. Since the predictable component of stock returns must be small, if indeed one does exist, there seems to be little chance of reaching a decisive conclusion using U.S. data alone, which effectively provides only one time-series at the market level. There has, of course, been some analysis of predictability in international stock returns, but many of the results are based on relatively small data sets and non-robust econometric methods.1 In addition, most international results are based only on individual time-series regressions and very little analysis has been conducted with pooled panel data regressions. Yet, it is well known that pooling the data may lead to more powerful methods, which is particularly relevant when studying stock return predictability since any predictable component will always be small relative to the overall variance in the returns process.
The aim of this paper is twofold. First, by considering a large global data set, I provide the most comprehensive picture of stock return predictability to date. The data contain over 20,000 monthly observations from 40 countries, including markets in 24 developed economies.2 The longest data series is for the U.K. stock-market and dates back to 1836 while data for eight other markets date back to before 1935. Second, I develop and apply new results for pooled forecasting regressions, utilizing the panel structure of the data.
Since an international data set of stock returns and forecasting variables provides a panel, the theory part of this paper analyzes econometric inference in predictive regressions in a panel data setting, when the regressors are nearly persistent and endogenous.3 As is well known (Stambaugh (1999)), OLS inference in the corresponding time-series predictive regressions is generally biased and various bias and size correction procedures have been proposed.
In the panel case, it turns out that the pooled estimator is unbiased as long as no fixed effects are included. The intuition behind this is that when pooling the data, independent cross-sectional information dilutes the endogeneity effects that cause the Stambaugh bias in the time-series case. That is, the Stambaugh bias only arises when the predictors are both persistent and endogenous; by pooling the data, the endogeneity is, in a sense, removed, and hence also the bias. Furthermore, the standard pooled estimator has an asymptotically normal distribution and normal inference can therefore be performed.
The intuition just described for the standard pooled estimator no longer holds when fixed effects are allowed for, and the asymptotic properties of the pooled estimator with fixed effects are very different from those of the pooled estimator with a common intercept. The time-series demeaning of the data, which is implicit in a fixed effects estimation, causes the fixed effects estimator to suffer from a second order bias that invalidates inference from standard test statistics. When demeaning each time-series in the panel, information after time t is used to form the time t regressor, and information before time t is used to form the time t returns. This induces a correlation between the lagged value of the demeaned regressor and the error term in the forecasting equation, which gives rise to the second order bias in the fixed effects estimator. Thus, in contrast to the case with a common intercept, the regressors no longer act as if they were exogenous. To correct for this bias, I develop an estimator based on the idea of recursive demeaning (e.g. Moon and Phillips (2000), and Sul et al. (2005)). By demeaning the returns using only information dated after the regressor is observed, and using the non-demeaned regressor as an instrument, the distortive effects arising from standard demeaning are eliminated.
The overall conclusion from the theoretical results and the supporting Monte Carlo simulations is that, in the typical panel data case with fixed effects, persistent and endogenous regressors will cause standard inference to be biased. While this result is well established in the time-series case (e.g. Stambaugh (1999)), the results in this paper show that equal caution is required when working with panel data.
In the empirical analysis, I conduct time-series regressions for individual countries as well as pooled regressions. In both types of analyses, I estimate regressions for four of the most commonly used forecasting variables: the dividend- and earnings-price ratios, the short interest rate, and the term spread. In the pooled regressions, countries are either all grouped together in a global panel or split up into groups of developed and emerging markets.
The results indicate that the short interest rate and the term spread are both fairly robust predictors of (excess) stock returns in developed markets. The null of no predictability is clearly rejected in the pooled regressions for developed markets as well as in a number of individual time-series regressions. These results are generally in line with those found by Campbell and Yogo (2006) with U.S. data and with the limited international results of Ang and Bekaert (2007). In contrast to the interest rate variables, no strong or consistent evidence of predictability is found when considering the earnings- and dividend-price ratios as predictors. In particular, neither predictor yields any consistent predictive power for the developed markets and, as seen in plots of the regression coefficient over time, this is especially true for the dividend-price ratio.
The rest of the paper is organized as follows. Sections II and III describe the empirical model and derive the main asymptotic properties of the pooled estimators. The finite sample properties of the procedures developed in this paper are analyzed through Monte Carlo experiments in Section IV. The data are described in Section V and the empirical results, including out-of-sample exercises, are provided in Section VI. Section VII concludes and technical assumptions and proofs are found in the Appendix.
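Consider a panel model with dependent variables $y_{i,t}$, $i = 1, \dots, n$, $t = 1, \dots, T$, and the corresponding vector of regressors, $x_{i,t}$, where $x_{i,t}$ is an $m \times 1$ vector. In this paper, $y_{i,t}$ is the stock return in country $i$, and $x_{i,t}$ are the corresponding predictor variables. The behavior of $y_{i,t}$ and $x_{i,t}$ is modelled as follows:
$$y_{i,t} = \alpha_i + \beta_i' x_{i,t-1} + \gamma_i' f_t + u_{i,t}, \tag{1}$$
$$x_{i,t} = z_{i,t} + \Gamma_i' z_t, \tag{2}$$
$$z_{i,t} = A_i z_{i,t-1} + v_{i,t}, \tag{3}$$
$$z_t = A_z z_{t-1} + g_t. \tag{4}$$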
That is, stock returns $y_{i,t}$ are a function of the past values of the predictor variables plus two factors representing country specific ($u_{i,t}$) and global ($f_t$) innovations. In the typical time-series predictive regression using, for instance, aggregate U.S. data, these two error terms are generally not distinguishable, and in terms of econometric inference, it makes no difference whether the shocks are U.S. specific or global in some sense. However, when pooling data from several countries, it becomes important to control for whether innovations to returns are due to country specific shocks or shocks that are common to all countries in the sample. Intuitively, if one ignores the presence of common factors in the error terms, the total amount of (independent) variation in the pooled data is overstated, and the econometric inference will be biased.
The vector of predictor variables, $x_{i,t}$, is also assumed to be the sum of country specific ($z_{i,t}$) and global ($z_t$) terms. Both $z_{i,t}$ and $z_t$ follow AR(1) processes. More precisely, the auto-regressive roots of both of these processes are parameterized as being local-to-unity, such that $A_i = I + C_i/T$ and $A_z = I + C_z/T$, where both $C_i$ and $C_z$ are $m \times m$ matrices. This captures the near unit-root, or highly persistent, behavior of many predictor variables, but is less restrictive than a pure unit-root assumption. The near unit-root construction, where the autoregressive root drifts closer to unity as the sample size increases, is used as a tool to enable an asymptotic analysis where the persistence in the data remains large relative to the sample size, even when the sample size increases to infinity. That is, if the auto-regressive roots are treated as fixed and strictly less than unity, then as the sample size grows, the regressors will behave as strictly stationary processes asymptotically, and the standard first order asymptotic results will not provide a good guide to the actual small sample properties of the model. If the roots are exactly equal to unity, the usual unit-root asymptotics apply to the model, but this is clearly a restrictive assumption for most potential predictor variables. Instead, by using the near unit-root construction, the effects from the high persistence in the regressor will appear also in the asymptotic results, but without imposing the strict assumption of a unit root.
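As a purely illustrative sketch of the local-to-unity device (not part of the formal derivations), the snippet below considers a scalar root of the form $1 + c/T$. The weight that a shock still carries after a full sample of $T$ periods, $(1 + c/T)^T$, approaches $e^{c}$ as $T$ grows, so persistence stays large relative to the sample size, whereas for a fixed root below one that weight vanishes. The function names and the values $c = -5$ and $0.95$ are assumptions chosen only for illustration.

```python
import math


def local_to_unity_root(c, T):
    """Scalar AR root that drifts towards one as the sample size T grows."""
    return 1.0 + c / T


def full_sample_weight(root, T):
    """Weight that a shock still carries after T periods of an AR(1) process."""
    return root ** T


if __name__ == "__main__":
    c = -5.0  # illustrative local-to-unity parameter (assumption)
    for T in (100, 1_000, 10_000):
        near = full_sample_weight(local_to_unity_root(c, T), T)
        fixed = full_sample_weight(0.95, T)  # root held fixed below one
        print(f"T={T:>6}: local-to-unity {near:.5f}, fixed root {fixed:.2e}")
    print(f"limit exp(c) = {math.exp(c):.5f}")
```

With a fixed root the regressor looks stationary in large samples, which is why fixed-root asymptotics can be a poor guide when the data are highly persistent.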
Finally, the regressors $x_{i,t}$ can be endogenous in the sense that $u_{i,t}$ and $v_{i,t}$ are contemporaneously correlated; $f_t$ and $g_t$ may be contemporaneously correlated as well, and can, in fact, be identical. The model specification is completed in Appendix A with some additional formal assumptions. Unless otherwise noted, all variables appearing in the asymptotic distributions derived below are defined in Appendix A.
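To make the setup concrete, the following is a minimal simulation sketch of a simplified version of the model: a scalar predictor, no common factors, Gaussian innovations, and a zero initial value for the regressor. All of these simplifications, the function name `simulate_panel`, and the default parameter values are assumptions made for illustration; they are not taken from the paper's Monte Carlo design.

```python
import numpy as np


def simulate_panel(n, T, beta=0.05, c=-5.0, rho=-0.9, alpha=None, seed=0):
    """Simulate y[i,t] = alpha[i] + beta * x[i,t-1] + u[i,t],
    with x[i,t] = (1 + c/T) * x[i,t-1] + v[i,t] and corr(u, v) = rho."""
    rng = np.random.default_rng(seed)
    a = 1.0 + c / T                                   # local-to-unity AR root
    alpha = np.zeros(n) if alpha is None else np.asarray(alpha, dtype=float)
    cov = np.array([[1.0, rho], [rho, 1.0]])          # rho != 0 makes x endogenous
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=(n, T))
    u, v = shocks[..., 0], shocks[..., 1]
    x = np.zeros((n, T + 1))                          # x[:, 0] is the initial value
    for t in range(1, T + 1):
        x[:, t] = a * x[:, t - 1] + v[:, t - 1]
    y = alpha[:, None] + beta * x[:, :-1] + u         # returns load on lagged x
    return y, x


y, x = simulate_panel(n=10, T=120)
print(y.shape, x.shape)
```

The array `x` has one more column than `y` because the first column holds the initial value used as the lagged regressor for the first return.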
The theoretical part of this paper analyzes the pooled estimation of the slope coefficient in equation (1). That is, by pooling data from several countries, an estimate of a joint slope coefficient $\beta$ is obtained. If the individual slope coefficients are all identical, such that $\beta_i = \beta$ for all $i$, the pooled estimator will converge to this common parameter. In addition, the pooled estimator can either impose a common intercept $\alpha$, or allow for individual intercepts, or fixed effects, $\alpha_i$. When the restrictions $\beta_i = \beta$, and potentially $\alpha_i = \alpha$, hold for all $i$, pooling the data should lead to more precise estimates than time-series estimation of each individual $\beta_i$.
When the slope coefficients $\beta_i$ are not all identical, pooled estimation may still be useful. In this case, the pooled estimator will converge to a well-defined average slope coefficient. The pooled estimate, and related tests, thus make a statement about the average predictive relationship in the panel, which provides a useful tool for interpreting and understanding the empirical results, especially if the individual time-series regressions deliver mixed results. Furthermore, and just as importantly, the pooled estimate may in some respects provide at least as good an estimate of $\beta_i$ for a given $i$, by providing a possibly less noisy estimate than the time-series one. That is, if the $\beta_i$ are not identical, the pooled estimator will generally not provide an unbiased estimate for a given $\beta_i$, but in a bias-variance trade-off it may still dominate the time-series estimate of $\beta_i$. This bias-variance trade-off is illustrated by out-of-sample forecasts at the end of this paper, where it is shown that the forecasts based on the pooled estimator often dominate those based on the time-series estimates.
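One way to see what an "average slope coefficient" means in a finite sample is the sketch below. With a scalar regressor and individual intercepts, the pooled fixed effects estimate is algebraically a weighted average of the country-by-country OLS slopes, with weights given by each country's demeaned regressor sum of squares. The simulated data (random-walk regressors, the slope values, the noise level) and the function names are illustrative assumptions.

```python
import numpy as np


def time_series_slopes(y, x_lag):
    """Country-by-country OLS slopes (with intercepts) and their weights."""
    xd = x_lag - x_lag.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    sxx = (xd * xd).sum(axis=1)
    sxy = (xd * yd).sum(axis=1)
    return sxy / sxx, sxx


def fixed_effects_slope(y, x_lag):
    """Pooled slope with country-specific intercepts (scalar regressor)."""
    xd = x_lag - x_lag.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd * xd).sum()


rng = np.random.default_rng(0)
n, T = 8, 200
x_lag = np.cumsum(rng.normal(size=(n, T)), axis=1)    # persistent regressors
betas = rng.normal(0.05, 0.02, size=n)                # heterogeneous slopes (assumed)
y = betas[:, None] * x_lag + rng.normal(size=(n, T))

slopes, weights = time_series_slopes(y, x_lag)
pooled = fixed_effects_slope(y, x_lag)
print(pooled, np.average(slopes, weights=weights))    # the two numbers coincide
```

Because the weights are non-negative, the pooled estimate always lies between the smallest and largest individual slope estimates, which is the sense in which it trades some bias for lower variance.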
Is it likely, from the perspective of economic theory, that the $\alpha_i$ and $\beta_i$ are identical across $i$? That is, can one justify pooling the data from an economic perspective, and if so, should fixed effects be included?
Consider first the question of fixed effects. Under the null of no predictability, such that $\beta_i = 0$ for all $i$, the restriction of $\alpha_i = \alpha$ for all $i$ imposes the same expected excess return in all countries. Although it is very difficult to obtain precise estimates of average returns, detailed empirical studies such as Jorion and Goetzmann (1999) strongly suggest that the equity premium varies across countries. In addition, if an international world CAPM applies, identical $\alpha_i$ in the absence of predictability imply that the world CAPM beta for each country is identical. The restriction of identical CAPM betas is strongly rejected in previous studies such as Ferson and Harvey (1994), who report
international CAPM betas in the range from
to
1.3, and Harvey (1995), who shows that the
world CAPM betas for some emerging markets are negative. Although
the world CAPM does not offer a complete model of stock returns, it
does capture a sizeable amount of the variation in international
stock returns (Ferson and Harvey, 1994). Model predictions that
strongly contradict it, such as identical CAPM betas for all
countries, should thus be seen as a warning sign of
misspecification. Therefore, given the importance of having a model
that is correctly specified under the null hypothesis, fixed
effects should generally be included.4
In order to understand the economic constraints that are imposed by identical $\beta_i$, one needs to analyze a model that implies predictability in stock returns. Menzly et al. (2004) explicitly analyze cross-sectional differences in time-series return predictability. They use an external habit model similar to Campbell and Cochrane (1999) and show that the dividend-price ratio predicts excess stock returns. The slope coefficient in this predictive regression varies across assets as a function of the properties of the assets' cash-flow share of overall income; in an international asset pricing framework, with integrated markets, each country portfolio can be viewed as an individual asset, as in the international CAPM. The model in Menzly et al. (2004) thus implies that, in general, the slope coefficients $\beta_i$ in the predictive regression in equation (1) may not be identical across $i$. However, the model says little about how dispersed the slope coefficients actually are in practice. That is, even though it is unlikely that the $\beta_i$ are identical across countries or assets, as is true for most parameters that may be estimated in economics or finance, what is of primary importance for the empirical scope of this paper is whether they are similar enough that it may be beneficial from an econometric point of view to treat them as equal.5 From this empirical perspective, the implications of the Menzly et al. (2004) model are essentially silent, and it is unlikely that other models of return predictability would deliver any stronger practical implications.
The results in the current paper suggest that pooling the data, and thus imposing a common slope coefficient, is in fact quite often empirically justified, in the sense that the null hypothesis of a common slope coefficient can often not be rejected in formal statistical tests and the forecasts based on the pooled estimates often tend to outperform those based on the individual time-series estimates in out-of-sample exercises. Thus, even though economic theory does not generally predict that the $\beta_i$ are all identical, it cannot be a priori rejected that they are similar enough for there to be benefits from pooling the data, which ties back to the discussion on the practical motivations in the section above.
In general, it is quite reasonable to conjecture that countries that share many common characteristics are more likely to have similar predictability patterns than those that do not. One of the most natural splits along these lines in international data is to distinguish between developed and emerging markets. Previous literature, such as Harvey (1995), also shows that emerging markets tend to have different return characteristics than developed markets, and different patterns of predictability. To the extent that stock markets in different countries are more likely to have similar predictability if they are priced globally in integrated financial markets, rather than locally in segregated markets, the group of developed markets is also likely to better satisfy this requirement. The empirical analysis separately analyzes developed and emerging market panels, and includes a test of slope homogeneity that shows that these two groups of countries appear more homogenous than all countries combined.
To understand the basic properties of pooled estimators of a common slope coefficient $\beta$ in equation (1), it is instructive to start with analyzing the case when there are no common factors in the data. That is, let $\gamma_i = 0$ and $\Gamma_i = 0$, for all $i$. This assumption will be maintained throughout the remainder of Section II and the effects of common factors are analyzed in Section III. Unless otherwise noted, it is assumed that the slope coefficients $\beta_i$ are identical and equal to $\beta$ for all $i$.
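To estimate the parameter $\beta$, consider first the traditional pooled estimator when there are no individual effects, i.e. when $\alpha_i = \alpha$ for all $i$. Although the previous discussion strongly suggested the use of individual intercepts in the international analysis performed in the current paper, there may be other cases when a common intercept can be justified. In addition, a comparison of the pooled estimator with and without fixed effects highlights some important differences and helps form an understanding of the effects of pooling the data. The pooled estimator with a common intercept is given by
$$\hat{\beta}_{Pool} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{x}_{i,t-1}\tilde{x}_{i,t-1}'\right)^{-1}\left(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{y}_{i,t}\tilde{x}_{i,t-1}\right), \tag{5}$$
where $\tilde{y}_{i,t} = y_{i,t} - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T} y_{i,t}$ and
$$\tilde{x}_{i,t} = x_{i,t} - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T} x_{i,t}. \tag{6}$$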
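The pooled estimator with a common intercept can be sketched in a few lines. The function below stacks all country-time observations, removes the overall panel mean from returns and lagged predictors, and solves the resulting least-squares problem. The function name and the array layout (returns as an `(n, T)` array, lagged predictors as an `(n, T, m)` array) are assumptions made for this illustration.

```python
import numpy as np


def pooled_estimator(y, x_lag):
    """Pooled OLS slope with a common intercept.

    y:     (n, T) array of returns.
    x_lag: (n, T, m) array of lagged predictors aligned with y.
    """
    n, T, m = x_lag.shape
    X = x_lag.reshape(n * T, m)
    Y = y.reshape(n * T)
    Xd = X - X.mean(axis=0)          # demean with the overall panel mean
    Yd = Y - Y.mean()
    return np.linalg.solve(Xd.T @ Xd, Xd.T @ Yd)


# Usage on noise-free data, where the slope is recovered exactly.
rng = np.random.default_rng(0)
x_lag = rng.normal(size=(4, 30, 2))
beta = np.array([0.3, -0.1])
y = 2.0 + x_lag @ beta
print(pooled_estimator(y, x_lag))
```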
Following the work of Phillips and Moon (1999), asymptotic results for the panel estimators are derived using sequential limits, which implies first keeping the cross-sectional dimension, $n$, fixed and letting the time-series dimension, $T$, go to infinity, and then letting $n$ go to infinity. Such sequential convergence is denoted $(T, n \to \infty)_{\mathrm{seq}}$.6 As mentioned before, the definitions of the variables that appear in the theorems and derivations below are all found in Appendix A, unless otherwise noted.
| (7) |
The pooled estimator of $\beta$ is thus asymptotically normally distributed; summing up over the cross-section eliminates the usual near unit-root asymptotic distributions found in the time-series case. The rate of convergence is also faster in the pooled case ($\sqrt{n}T$) compared to the time-series case ($T$), which again is a result of
the additional cross-sectional information. The limiting
distribution depends on
and
and,
in order to perform inference, estimates of these parameters are
required. Let
,
, and
. The estimator
is thus the panel equivalent
of HAC (heteroskedasticity and autocorrelation consistent)
estimators for long-run variances.
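For readers unfamiliar with HAC estimation, the following sketch shows a generic Bartlett-kernel (Newey-West type) long-run variance estimator for a scalar series and a simple panel version that averages the country-level estimates. It is only meant to convey the idea; the kernel, the bandwidth argument, the averaging across countries, and the function names are assumptions and need not coincide with the exact estimators defined in the text.

```python
import numpy as np


def long_run_variance(w, bandwidth):
    """Bartlett-kernel long-run variance of a scalar series."""
    w = np.asarray(w, dtype=float)
    w = w - w.mean()
    T = w.shape[0]
    lrv = w @ w / T                                   # lag-0 autocovariance
    for lag in range(1, bandwidth + 1):
        gamma = w[lag:] @ w[:-lag] / T                # autocovariance at this lag
        lrv += 2.0 * (1.0 - lag / (bandwidth + 1.0)) * gamma
    return lrv


def panel_long_run_variance(w_panel, bandwidth):
    """Average of the country-level long-run variances (rows are countries)."""
    return float(np.mean([long_run_variance(w_i, bandwidth) for w_i in w_panel]))
```

With a bandwidth of zero the estimator reduces to the ordinary sample variance; positive bandwidths add down-weighted autocovariances, which matters when the series is serially correlated.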
Standard tests can now be performed. For instance, the null hypothesis $\beta_k = \beta_k^0$, for some $k$, where $\beta = (\beta_1, \dots, \beta_m)'$, can be tested using a $t$-test. Let
. Using the results derived above, it follows easily that under the
null-hypothesis,
| (8) |
as
,
where
is an
vector with
the
'th component equal to one and zero
elsewhere, and
is the
'th component of
. More general linear
hypotheses can be evaluated using a Wald test.
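Let $\underline{y}_{i,t}$ and $\underline{x}_{i,t}$ denote the time-series demeaned data. That is, $\underline{y}_{i,t} = y_{i,t} - \frac{1}{T}\sum_{t=1}^{T} y_{i,t}$ and $\underline{x}_{i,t} = x_{i,t} - \frac{1}{T}\sum_{t=1}^{T} x_{i,t}$. The fixed effects pooled estimator, which allows for individual intercepts, is then given by
$$\hat{\beta}_{FE} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T} \underline{x}_{i,t-1}\underline{x}_{i,t-1}'\right)^{-1}\left(\sum_{i=1}^{n}\sum_{t=1}^{T} \underline{y}_{i,t}\underline{x}_{i,t-1}\right), \tag{9}$$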
and
| (10) |
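A minimal sketch of the fixed effects pooled estimator is given below: each country's returns and lagged predictors are demeaned over time before the observations are pooled. The function name and the array layout are assumptions for illustration.

```python
import numpy as np


def fixed_effects_estimator(y, x_lag):
    """Pooled slope allowing for individual intercepts.

    y:     (n, T) array of returns.
    x_lag: (n, T, m) array of lagged predictors aligned with y.
    """
    yd = y - y.mean(axis=1, keepdims=True)            # time-series demeaning
    xd = x_lag - x_lag.mean(axis=1, keepdims=True)
    m = x_lag.shape[2]
    X = xd.reshape(-1, m)
    return np.linalg.solve(X.T @ X, X.T @ yd.reshape(-1))


# Usage on noise-free data with country-specific intercepts.
rng = np.random.default_rng(0)
x_lag = rng.normal(size=(5, 40, 2))
alpha = np.arange(5.0)
beta = np.array([0.3, -0.1])
y = alpha[:, None] + x_lag @ beta
print(fixed_effects_estimator(y, x_lag))
```

The demeaning removes the individual intercepts exactly, but it also uses every observation of a country to form each demeaned value, which is the source of the bias discussed next.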
Clearly, the estimator is still consistent. Its asymptotic
distribution, however, will be affected by the demeaning. For fixed
, as
,
| (11) |
where
and
are the limiting processes of
and
, respectively, as defined in Appendix A; the limiting
process for
is denoted
. Let
denote the average covariance vector between
and
, and observe that
|
| |
| ||
| (12) |
which is different from zero whenever
. Thus, as
,
| (13) |
and the estimator suffers from a second order bias from the demeaning process.7
The differences in sample properties between the standard pooled estimator with a common intercept and the fixed effects estimator are rather striking. Mechanically, the standard pooled estimator works well because for each $i$, the terms in the numerator of the estimator have mean zero and are independently distributed across $i$. As they are summed up over $i$, the central limit theorem applies and an asymptotically normally distributed estimator is obtained. More intuitively, when pooling the data, independent cross-sectional information dilutes the endogeneity effects that cause the Stambaugh (1999) bias in the time-series case. The same result does not hold for the fixed effects estimator because the numerator terms no longer have a zero mean as a consequence of the time-series demeaning of the data, which leads to a correlation between the innovation processes $u_{i,t}$ and the demeaned regressors $\underline{x}_{i,t-1}$ whenever the regressor $x_{i,t}$ is endogenous. Thus, unlike in the case with a common intercept, the pooling does not remove the endogeneity effects and the estimator suffers from a second order bias.
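The contrast between the two estimators can be illustrated with a small Monte Carlo sketch. It simulates a simplified version of the model (scalar regressor, no common factors, no predictability, strongly negatively correlated innovations) and compares the average estimation error of the pooled estimator with a common intercept to that of the fixed effects estimator. The design values (`n=20`, `T=100`, `c=-5`, `rho=-0.9`, 200 replications) and all function names are assumptions chosen for illustration, not the paper's simulation design.

```python
import numpy as np


def simulate(n, T, beta, c, rho, rng):
    """Draw one panel with a near unit-root, endogenous scalar regressor."""
    a = 1.0 + c / T
    cov = np.array([[1.0, rho], [rho, 1.0]])
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=(n, T))
    u, v = shocks[..., 0], shocks[..., 1]
    x = np.zeros((n, T + 1))
    for t in range(1, T + 1):
        x[:, t] = a * x[:, t - 1] + v[:, t - 1]
    x_lag = x[:, :-1]
    return beta * x_lag + u, x_lag


def pooled_slope(y, x_lag):
    xd, yd = x_lag - x_lag.mean(), y - y.mean()       # common intercept
    return (xd * yd).sum() / (xd * xd).sum()


def fixed_effects_slope(y, x_lag):
    xd = x_lag - x_lag.mean(axis=1, keepdims=True)    # individual intercepts
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd * xd).sum()


def mean_bias(n=20, T=100, beta=0.0, c=-5.0, rho=-0.9, reps=200, seed=0):
    """Average estimation error of both estimators across replications."""
    rng = np.random.default_rng(seed)
    pool, fe = [], []
    for _ in range(reps):
        y, x_lag = simulate(n, T, beta, c, rho, rng)
        pool.append(pooled_slope(y, x_lag) - beta)
        fe.append(fixed_effects_slope(y, x_lag) - beta)
    return float(np.mean(pool)), float(np.mean(fe))


pool_bias, fe_bias = mean_bias()
print(f"pooled bias {pool_bias:.4f}, fixed effects bias {fe_bias:.4f}")
```

In this design the fixed effects estimate is pushed away from zero on average even though there is no predictability, while the common-intercept estimate stays close to zero, in line with the argument above.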
More generally, from the perspective of panel data econometrics, the natural way of understanding the detrimental impact of fixed effects is to view them as an instance of the incidental parameter problem, which was originally raised by Neyman and Scott (1948) and discussed in a panel data context by Nickell (1981). That is, as the panel grows larger asymptotically, the number of (incidental) fixed effects that need to be estimated also goes to infinity, as the cross-sectional dimension grows. Thus, although more and more data becomes available asymptotically, the number of parameters to estimate also increases. In the traditional (dynamic) panel setup studied by Nickell (1981), where $T$ is fixed as $n \to \infty$, inclusion of fixed effects causes the standard estimator of the slope coefficient to become inconsistent. Here, where both $n$ and $T$ tend to infinity, the fixed effects estimator remains consistent but with a second order bias.
The second order bias in the fixed effects estimator arises because the demeaning process induces a correlation between the innovation processes $u_{i,t}$ and the demeaned regressors $\underline{x}_{i,t-1}$. Intuitively, $u_{i,t}$ and $\underline{x}_{i,t-1}$ are correlated because, in the demeaning of $x_{i,t-1}$, information available after time $t-1$ is used. Or, equivalently, because in the demeaning of the dependent variable, $y_{i,t}$, information before time $t$ is used. One solution is therefore to use recursive demeaning of $x_{i,t}$ and $y_{i,t}$ (e.g. Moon and Phillips (2000), and Sul et al. (2005)). In particular, I will consider a `forward demeaned' equation. That is, define
and
| (14) |
Observe that
and consider the following pooled estimator, using the recursively demeaned data,
| (15) |
In
, the non-demeaned
regressors
are used as instruments, and
the dependent variable,
, is formed using
data dated only after time
. Since
and
are now independent of each other, unlike
and
, the estimator
will not suffer from the
same second order bias as the standard fixed effects estimator.
This is stated formally in the following theorem.
| (16) |
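The idea behind the recursively demeaned estimator can be sketched as follows, for a scalar regressor. Returns and lagged predictors are forward demeaned, i.e. each observation has the mean of the current and all later observations subtracted, and the non-demeaned lagged predictor serves as the instrument. The exact index ranges of the forward means, the function names, and the array layout are assumptions made for this illustration and may differ in detail from the definitions in the text.

```python
import numpy as np


def forward_mean(z):
    """Mean of z[i, t], z[i, t+1], ..., z[i, T-1] for every t (0-indexed)."""
    T = z.shape[1]
    forward_sum = np.cumsum(z[:, ::-1], axis=1)[:, ::-1]
    return forward_sum / np.arange(T, 0, -1)


def recursive_demeaning_estimator(y, x_lag):
    """IV-type pooled slope using forward-demeaned data.

    y:     (n, T) array of returns.
    x_lag: (n, T) array with the lagged scalar predictor aligned with y.
    """
    y_dd = y - forward_mean(y)                        # forward-demeaned returns
    x_dd = x_lag - forward_mean(x_lag)                # forward-demeaned regressor
    # The non-demeaned lagged predictor acts as the instrument.
    return (x_lag * y_dd).sum() / (x_lag * x_dd).sum()
```

Forward demeaning still removes the individual intercepts, since a constant cancels against its own forward mean, but the instrument is dated before any of the innovations that enter the forward-demeaned error.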
To perform inference, let
,
, and
. The
test and Wald test based on
and
will
satisfy the usual properties. Observe that the forward demeaning of
the data introduces a moving average component in the returns
process, which is reflected in the limiting distribution derived in
the proof of Theorem 2. The
variance-covariance matrix estimator that was just proposed
automatically accounts for this by calculating the long-run
variance using the forward demeaned residuals and the panel
equivalent of a HAC estimator.
The recursive demeaning procedure gives up some efficiency by relying on a somewhat inefficient method for demeaning the data. However, there are no clear-cut alternatives in the general case when the autoregressive roots $A_i$ (or equivalently, $C_i$) are unknown. If the $C_i$ were known, the bias term in equation (13) could be directly estimated and a bias-corrected fixed effects estimator could be constructed. More ambitiously, for known $C_i$, a panel version of fully modified estimation could be considered, as suggested by Phillips and Moon (1999) in the pure unit-root case. However, although such procedures are likely more efficient than the recursive demeaning proposed here, they are not feasible in practice since the $C_i$ are unknown.8
So far, the focus has been on the problems raised by fixed effects. However, it is also possible that the slope coefficients may vary across $i$. In this section, I therefore discuss the properties of the pooled estimator when the $\beta_i$ are not identical.9 To start with, suppose $\beta_i = \beta + \theta_i$, where $\theta_i$ is iid with mean zero.
| (17) |
| (18) |
In the case where the distribution of the slope coefficients is independent of the regressors, it follows that the pooled estimator converges to the average parameter $\beta$. The rate of convergence is much slower than in the homogeneous case, however, and the fixed effects estimator no longer suffers from a small sample bias. These results stem from the fact that when the $\beta_i$ are non-identical, the residuals in the regression are now given by $u_{i,t} + \theta_i' x_{i,t-1}$, where $\theta_i = \beta_i - \beta$. Since $x_{i,t-1}$ is a near integrated process, it will dominate the asymptotic properties of the residuals, and will therefore slow down the rate of convergence and also render the second order bias term in the fixed effects estimator irrelevant. However, when the deviations $\theta_i$ are small, the second order bias
term is still a concern. Results from Monte Carlo simulations,
which are not presented here, show that for most potentially
relevant values of
and
in a stock return predictability
context, the bias in the fixed effects estimator is still highly
relevant. Likewise, when the deviations
are small, the slow-down in the
rate of convergence will not be as drastic as in Theorem 3.10
If the
are correlated with the
regressors, the pooled estimator does not converge to the average
slope coefficient
. However, as discussed
at length in Phillips and Moon (1999, 2000), the average of the
individual parameters
is not necessarily
the natural way of defining an average relationship between
and
.
Phillips and Moon note that in a framework with persistent
variables, one can define the individual regression coefficients
as
,
where
is the long-run variance for
and
is the long-run covariance
between
and
.
They then define the long-run average relationship between
and
as
, rather than
, and
show that the pooled estimator, with or without fixed effects, will
converge to
under very general conditions;
in the special case when
and
is independent of
, it follows that
. Thus,
and
converge to a well
defined average relationship under very general circumstances,
although not necessarily to
.
The analysis above shows that the pooled estimators are robust to deviations from the assumption of homogeneous slope coefficients, and will converge to a well-defined average coefficient when the $\beta_i$ are non-identical. In many cases, it is still of interest, however, to evaluate whether the slope coefficients are in fact all equal.
I adopt a version of a test originally proposed by Swamy (1970) and further developed by Pesaran (2007). The basic idea is to analyze a weighted sum of squared differences between the unrestricted time-series estimates of the individual $\beta_i$ and the fixed effects pooled estimate, which imposes a common slope coefficient.
Define the following weighted fixed effects estimator,
| (19) |
where
is an estimate of the
variance of
; the
standardization by
leads to a natural reduction in
nuisance parameters in the asymptotic distribution of the below
test statistic. Further, let
| (20) |
where
is the OLS estimate of the
slope coefficient for country
.
| (21) |
Given
and
,
provides an asymptotically
normally distributed test of slope homogeneity. Unfortunately,
and
are functions of the unknown
nuisance parameters
; they are
also functions of the average correlation
between the
innovations
and
, but
this value can easily be estimated.
Through simulations, it is easy to show that
changes fairly slowly with the values of the
, whereas
can vary substantially from small
changes in the
. In order to obtain a
feasible test with approximately correct size, I therefore propose
to use
, evaluated for a common value of
for all
,
where
is given by the average of the
median unbiased estimates of each
. As
originally shown by Stock (1991), median unbiased, although
inconsistent, estimates of each
can be
obtained by inverting a unit-root test statistic. Further,
is replaced by an empirical
estimate that is consistent under the null hypothesis of
for all
.
Write
where
represents the expression in
(20). From the proof
of Theorem 4, an estimate of
is obtained by calculating the
sample standard deviation of
. Under the
alternative, when the
are not all identical, this
estimate will be upward biased for
, and some power will therefore be
lost. However, given a lack of knowledge of the
, and the strong dependence of
on the values of the
, this seems like a preferable approach.
In terms of practical implementation, the median unbiased estimates for $C_i$ are obtained by inverting the DF-GLS unit-root test statistic, as described in detail in Campbell and Yogo (2006). In the case when $x_{i,t}$ is a vector, the same procedure can be applied to each of the component processes of $x_{i,t}$ and, with the extra restriction that $C_i$ is a diagonal matrix, one can proceed
exactly as in the scalar case. The variances
of
and the average correlation
between
and
are estimated from
the residuals of the time-series regressions of equations (1) and (2). The values for
are obtained by direct simulation of
the asymptotic expression given in the proof of Theorem 4; these
values are available from the author upon request. The null
hypothesis is rejected for large positive values of
; e.g. a five percent test
would reject for values larger than 1.65.
I now return to the general setup with common factors in the data. The following theorem summarizes the asymptotic properties of both the standard pooled estimator and the fixed effects estimator when there are common factors. Again, the $\beta_i$ are assumed to be identical unless otherwise noted.
| (22) |
| (23) |
Thus, in the presence of the general factor structure outlined
in Section II, the
standard pooled estimator exhibits a non-standard limiting
distribution, although it is still consistent; standard tests can
therefore not be used. Similarly, the limiting behavior of the
fixed effects estimator is determined by the bias term arising from
the time-series demeaning of the data, as well as an additional
term that stems from the common factors in the data. Note that the
term
is random and can take on both negative and positive values. Thus,
correcting for it will have an ambiguous effect on the outcome of
the estimation and test results.
Based on the methods of Pesaran (2006), I propose an estimator
that is more robust to cross-sectional dependence in the data.
Pesaran's (2006) idea is to project the data onto the space
orthogonal to the common factors, thereby removing the
cross-sectional dependence from the data used in the estimation.
However, since the factors are not observed in practice, an
indirect approach is required. Pesaran suggests using the
cross-sectional means of the dependent and independent variable as
proxies for the common factors. A similar approach is adopted
below, but only the cross-sectional means of the regressors are
used to control for the common factors. This is done because of the
different orders of integration between the error terms and the
regressors. For
, the stochastic behavior of
is dominated by that of
, and the matrix
would be asymptotically singular.
Thus, consider the following estimator of
,
| (24) |
where
denotes the
matrix of the observations for the dependent
variable and
the
matrix of regressor observations.
is a
matrix and
is the
matrix of observations of
![]()
The estimator
is obtained by
applying the pooled estimator to the residuals from a projection of
the original data onto the cross-sectional averages of the
regressors. The intuition behind this is that the cross-sectional
average of
is close to the common
stochastic trend
, since the cross-sectional
averages of the cross-sectionally independent data may be expected
to be close to zero; the projection onto the complement of the
cross-sectional means will therefore remove the effects of the
common factors in the regressors. As is shown in the proof of the
following theorem, it is sufficient to remove the factors from the
regressors (and not from the innovations to the regressand) in
order to achieve a mixed normal distribution.
| (25) |
The estimator
thus achieves a
convergence rate and an asymptotic
mixed normal distribution. The mixed normality in this case arises
from the common factors, which lead to a mixed normal distribution
rather than the normal distribution seen above in the no common
factors case. That is, the limiting distribution is effectively a
normal distribution with a random variance-covariance matrix that
is a function of the common shocks; conditional on the realization
of the common factors, the distribution is thus normal. For
practical purposes, the mixed normal distribution allows for
standard inference in that the
tests and Wald
tests will have asymptotically standard distributions. Allowing for
fixed effects in the arguments, it is easy to show the following
result.
| (26) |
| (27) |
| (28) |
| (29) |
and
thus provide pooled
estimators for predictive regressions that are asymptotically mixed
normally distributed in the presence of common factors and with the
allowance for fixed effects in the latter. Standard
tests and Wald tests can therefore be used; by simply using
the defactored data, the variance-covariance matrix can be
estimated in a manner analogous to that described above for the no
common factors case. The practical implementation of these
estimators is thus very simple: Premultiply the data by
, and use the
resulting variables in the original estimation procedures.
As shown in the simulations below, the
test of slope homogeneity
appears robust to the presence of factors, unlike the pooled
t-tests, and I do not attempt to modify it
to control for common factors.
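The defactoring step described above can be sketched in a few lines of Python. This is only an illustration for the single-regressor case, not the exact estimator of equation (24): the function names `defactor` and `pooled_slope`, the use of the lagged regressor's cross-sectional mean as the only column of the proxy matrix, and the no-intercept pooled slope are all assumptions made for the example.

```python
import numpy as np

def defactor(Y, X_lag):
    """Remove the span of the cross-sectional mean of the regressor.

    Y     : (T, n) array, dependent variable for n countries
    X_lag : (T, n) array, lagged regressor (single-regressor case)
    """
    T = X_lag.shape[0]
    # Proxy for the unobserved common factor: cross-sectional mean, shape (T, 1).
    H = X_lag.mean(axis=1, keepdims=True)
    # Projection onto the orthogonal complement of H: M = I - H (H'H)^{-1} H'.
    M = np.eye(T) - H @ np.linalg.solve(H.T @ H, H.T)
    return M @ Y, M @ X_lag

def pooled_slope(Y, X_lag):
    """Pooled OLS slope without intercepts (single regressor)."""
    return np.sum(X_lag * Y) / np.sum(X_lag * X_lag)

# Toy data (hypothetical): a common random-walk trend plus idiosyncratic noise.
rng = np.random.default_rng(0)
T, n = 200, 10
trend = np.cumsum(rng.standard_normal((T, 1)), axis=0)
X_lag = trend + rng.standard_normal((T, n))
Y = 0.5 * X_lag                      # noiseless, so the true slope is 0.5
Y_d, X_d = defactor(Y, X_lag)
beta_hat = pooled_slope(Y_d, X_d)
```

After premultiplication the cross-sectional mean of the regressor is zero in every period, which is the sense in which the common component has been removed; the same defactored arrays could then be passed to whichever pooled procedure is being used.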
To evaluate the small sample properties of the panel data
estimators proposed in this paper, a Monte Carlo study is
performed. In particular, I focus on the size and power properties
of the pooled t-tests. Equations (1) and (3) are simulated
for the case with a single regressor. The innovations
are drawn
from normal distributions with mean zero, unit variance, and
correlations δ = 0, -0.4, -0.7, and -0.95; there is no cross-sectional dependence. The
local-to-unity parameters
are drawn from
a uniform distribution with support [-20, -2]. In analyzing the
power properties, the slope coefficient β varies
between -0.05 and 0.05, and is
identical for all i. The sample size is given
by T = 100, n = 20. The intercepts
are normally distributed with
mean and standard deviation equal to 0.005. All
results are based on 10,000 repetitions. The
test based on the fixed effects estimator using standard
demeaning,
, and that based on the
recursively demeaned pooled estimator,
, are considered. Throughout
the simulation study, the normal distribution is used to determine
significance; i.e. the null is rejected for absolute test values
greater than 1.96.
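The simulation design just described can be sketched as follows. Equations (1) and (3) are not reproduced in this part of the text, so the sketch assumes the usual predictive-regression form, r_{i,t} = alpha_i + beta x_{i,t-1} + u_{i,t} with x_{i,t} = (1 + c_i/T) x_{i,t-1} + v_{i,t}, and a simple homoskedastic standard error for the fixed effects t-statistic; both are assumptions, and the recursively demeaned estimator is not implemented here. The function names are hypothetical, and the number of repetitions is kept far below 10,000 for speed.

```python
import numpy as np

def simulate_panel(T=100, n=20, beta=0.0, delta=-0.95, rng=None):
    """Simulate returns r (T, n) and lagged regressors x_lag (T, n)."""
    rng = np.random.default_rng() if rng is None else rng
    c = rng.uniform(-20.0, -2.0, size=n)          # local-to-unity parameters
    rho = 1.0 + c / T                             # assumed AR(1) roots
    alpha = rng.normal(0.005, 0.005, size=n)      # intercepts
    cov = np.array([[1.0, delta], [delta, 1.0]])  # corr(u, v) = delta
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=(T + 1, n))
    u, v = shocks[..., 0], shocks[..., 1]
    x = np.zeros((T + 1, n))
    for t in range(1, T + 1):
        x[t] = rho * x[t - 1] + v[t]
    r = alpha + beta * x[:-1] + u[1:]
    return r, x[:-1]

def fixed_effects_tstat(r, x_lag):
    """t-statistic from the standard (time-demeaned) fixed effects estimator."""
    T, n = r.shape
    rd = r - r.mean(axis=0)
    xd = x_lag - x_lag.mean(axis=0)
    sxx = np.sum(xd * xd)
    b = np.sum(xd * rd) / sxx
    resid = rd - b * xd
    sigma2 = np.sum(resid ** 2) / (n * T - n - 1)
    return b / np.sqrt(sigma2 / sxx)

def rejection_rate(delta, reps=200, seed=0):
    """Share of |t| > 1.96 under the null beta = 0."""
    rng = np.random.default_rng(seed)
    t = np.array([fixed_effects_tstat(*simulate_panel(delta=delta, rng=rng))
                  for _ in range(reps)])
    return np.mean(np.abs(t) > 1.96)

size_exog = rejection_rate(0.0)     # regressor innovations uncorrelated with u
size_endog = rejection_rate(-0.95)  # strongly endogenous regressor
```

Comparing `size_exog` with `size_endog` gives a quick view of how the second-order bias from time-series demeaning affects the size of the fixed effects test as the correlation moves away from zero.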
Panel A in Table 1 shows the average rejection rates for the nominal five percent two-sided t-tests under the null hypothesis of β = 0. Panels A1 and A2 in Figure 1 show the corresponding power curves of the tests for the cases of δ = 0 and δ = -0.95. Table 1 and the power curves in Figure 1 clearly show the effects of the second-order bias in the fixed effects estimator; the test based on the standard fixed effects estimator severely over-rejects under the null hypothesis for δ = -0.95. The test based on the recursively demeaned estimator has rejection rates close to the nominal size under the null, while maintaining decent power properties.11
In this section, I repeat the Monte Carlo experiment above for
the case when there is a common factor in the innovations. In
particular, equations (1)-(4) are now
simulated with a single regressor and a single common factor
, drawn from a standard normal
distribution. The factor loadings,
and
, are also normally distributed
with means of minus one and plus one, respectively, and standard
deviations equal to 2^(-1/2) in both cases. The
innovations in the returns and regressor processes are formed as
and
,
respectively, where
are drawn
from standard normal distributions; the scaling by 2^(-1/2) is performed in order to achieve an approximate unit
variance in the innovations, which enables easier comparison with
the cross-sectionally independent case. As before, the correlation
between
and
is set to δ = 0, -0.4, -0.7, and -0.95.
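One possible reading of this construction is sketched below. The exact expressions for the innovations are not reproduced in this part of the text, so the form used here, 2^(-1/2) times the sum of a loading-weighted factor and an idiosyncratic shock, is an assumption, as are the names `factor_innovations`, `load_u`, and `load_v`.

```python
import numpy as np

def factor_innovations(T=100, n=20, delta=-0.95, rng=None):
    """Draw innovations with one common factor (assumed functional form)."""
    rng = np.random.default_rng() if rng is None else rng
    f = rng.standard_normal((T, 1))                  # common factor
    load_u = rng.normal(-1.0, 2 ** -0.5, size=n)     # loadings in the return equation
    load_v = rng.normal(1.0, 2 ** -0.5, size=n)      # loadings in the regressor equation
    cov = np.array([[1.0, delta], [delta, 1.0]])     # idiosyncratic correlation delta
    idio = rng.multivariate_normal(np.zeros(2), cov, size=(T, n))
    # Assumed: scale the sum by 2^(-1/2) to keep the variance roughly near one.
    u = 2 ** -0.5 * (load_u * f + idio[..., 0])
    v = 2 ** -0.5 * (load_v * f + idio[..., 1])
    return u, v, f

u, v, f = factor_innovations(T=5000, n=50, rng=np.random.default_rng(0))
# Cross-sectional averages load on the common factor, unlike in the independent case.
corr_u_f = np.corrcoef(u.mean(axis=1), f[:, 0])[0, 1]
corr_v_f = np.corrcoef(v.mean(axis=1), f[:, 0])[0, 1]
```

Under this reading the innovations have a variance of roughly 1.25 rather than exactly one, which is consistent with the text's description of the unit variance as approximate.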
The results are shown in Panels B and C of Table 1 and in Panels
B1, B2, C1, and C2 in Figure 1. Panel B in
Table 1 and Panels B1
and B2 in Figure 1
show the outcomes of the Monte Carlo experiments when the model
generated with common factors is estimated using the estimators
and
, which do not control for
cross-sectional dependence. It is clear that when the common
factors are ignored in the estimation process, the actual size of
the corresponding
t-tests is very far from the
nominal size of five
percent, with rejection rates
above
percent under the null.
Panel C in Table 1
and Panels C1 and C2 in Figure 1 show the same
results for the estimators
and
, which do control for
the common factors. The
test based on the
recursively demeaned data,
, now possesses good
size and power properties. As before, the standard fixed effects
estimator exhibits a finite sample bias and extremely poor size
properties.
The final set of simulations analyzes the finite sample
properties of the
test of slope homogeneity.
The setup is identical to that used above, with the exception of
the slope coefficients. Three different scenarios are considered.
In the first, the null hypothesis is imposed and
. In the two other cases,
the slope coefficients exhibit heterogeneity and are generated as
, where
is a standard normal random
variable, independently distributed across
, and
is equal to
and
, respectively. The cases with and
without common factors are considered, although, as described
previously, no adjustment to the test is made when there are common
factors. The test is evaluated as a one-sided test with a nominal
size of five percent; i.e. the null is rejected when
is greater than 1.65.
Panel A of Table 2 shows the
results without any common factors and Panel B shows the results
with common factors. Under the null, the test is somewhat
undersized, and marginally more so when there are common factors. Unlike
the
tests analyzed above, the test of slope
homogeneity is thus not particularly sensitive to common factors in
the data. Under the first alternative, with
, the power of the test is around
45 percent without common factors, and
around 35 percent with common factors. Under the
second alternative, with
, the power rises to
above 90 percent in both cases. The relatively low
power under the first alternative reflects the fact that it is
difficult to distinguish between such small absolute
differences between the
, even though the relative
differences are reasonably large; one should keep in mind that the
test compares across the cross-section of the data, which in the
simulations only amounts to
observations.
Nevertheless, the test can serve as a useful diagnostic of panel
homogeneity.
All of the data come from the Global Financial Data database and are at a monthly frequency. Total returns, including direct returns from dividends, on market-wide indices in 40 countries were obtained, as well as the corresponding dividend- and earnings-price ratios and measures of the short and long interest rates.
With the exception of Spain, the dividend-price ratio data are available over the same sample period as the total stock returns. However, the other predictor variables are typically not available during the whole sample of total stock returns. Due to the two world wars, France, Germany, Japan, and the U.K. have some years during which no observations are available. Further, Spain's total returns data start in 1940, but no dividend data are available during 1968-1983. Thus, in the time-series analysis, separate regressions are fitted for each sample period for these five countries, and in the pooled estimation separate intercepts are estimated. In Table 3, which presents the pooled results, the row listing the number of 'countries' in each panel can therefore include more than one count of some countries.
As is conventional in the literature, the dividend-price ratio is defined as the sum of dividends during the past year divided by the current price, and the earnings-price ratio is defined as the latest 12 months of available earnings divided by the current price. Short interest rate measures come from Global Financial Data and use rates on 3-month T-bills when available or, otherwise, private discount rates or interbank rates. The long rate is measured by the yield on long-term government bonds. When available, a 10-year bond is used; otherwise, I use that with the closest maturity to 10 years. The term spread is defined as the log difference between the long and short rates. Excess stock returns are defined as the return on stocks, in the local currency, over the local short rate. This provides the international analogue of the typical forecasting regressions estimated for U.S. data. All regressions are run at the one-month frequency using log-transformed variables with the log excess returns over the domestic short rate as the dependent variable.
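The variable definitions above can be illustrated with a short Python sketch for a single country. The function name `build_variables`, the input names, and the unit conventions are assumptions made for the example: interest rates are taken to be annualized decimals, the log transformation is applied as log(1 + rate), and the monthly risk-free return is approximated by dividing the annual log rate by 12. The text does not specify these details.

```python
import numpy as np

def build_variables(tr_index, price, div_12m, earn_12m, short_rate, long_rate):
    """Build log predictors and log excess returns from monthly 1-D arrays.

    Assumed units (not stated in the text): rates are annualized decimals,
    e.g. 0.05 for five percent.
    """
    log_dp = np.log(div_12m / price)     # dividends over past year / current price
    log_ep = np.log(earn_12m / price)    # latest 12 months of earnings / current price
    log_short = np.log1p(short_rate)
    term_spread = np.log1p(long_rate) - log_short
    log_ret = np.diff(np.log(tr_index))  # monthly log total return
    log_rf = log_short[:-1] / 12.0       # assumed monthly risk-free log return
    excess = log_ret - log_rf
    # Predictors dated t-1 are aligned with excess returns dated t.
    return {
        "excess_ret": excess,
        "log_dp": log_dp[:-1],
        "log_ep": log_ep[:-1],
        "short_rate": log_short[:-1],
        "term_spread": term_spread[:-1],
    }

# Hypothetical three-month example.
data = build_variables(
    tr_index=np.array([100.0, 101.0, 103.02]),
    price=np.array([50.0, 50.0, 51.0]),
    div_12m=np.array([2.0, 2.0, 2.04]),
    earn_12m=np.array([5.0, 5.0, 5.1]),
    short_rate=np.array([0.012, 0.012, 0.012]),
    long_rate=np.array([0.03, 0.03, 0.03]),
)
```

Each returned array has one fewer element than the inputs, since the first month is used only to form the lagged predictors and the first return.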
Countries are pooled into a global panel, as well as developed and emerging stock market panels, according to the MSCI classifications.12
In the empirical analysis, I conduct pooled regressions as well
as time-series regressions for individual countries. The results
from the pooled regressions and summaries of the time-series
results are presented in Table 3. The time-series
results for individual countries are given in Table 4. Each table
contains multiple panels, which correspond to the different
forecasting variables. For the pooled regressions, results from
both the standard fixed effects estimator,
, and the corresponding
statistic,
, as
well as the estimator using recursively demeaned data,
and
,
are documented. Separate results are shown for the case when common
factors are controlled for and when they are not. As discussed at
length above, the
test is not robust to
the endogeneity and persistence of the regressors, but provides an
interesting illustration of the potential pitfalls of not
addressing these issues. The short interest rate and the term
spread are generally less endogenous and inference based on the
fixed effects estimator for these two variables will be fairly
accurate (see also the discussion on this topic in Campbell and
Yogo (2006)); however, the fixed effects
test
will tend to greatly over-reject the null for the dividend- and
earnings-price ratios. In Table 3, significant
results at the one-sided five percent level based on robust
test-statistics from which proper inference can be drawn, i.e. the
statistic corresponding to
, are indicated with a
.13
The results from the individual time-series regressions, shown
in Table 4, are
presented in a similar manner to the pooled regressions. Since
normal inference based on the OLS
statistic
will generally be biased, inference based on a robust 90 percent confidence interval for
, using the methods of Campbell and Yogo (2006), is
also provided. If viewed as a test, this confidence interval can be
seen as a five percent one-sided test and a rejection of the null
hypothesis of no predictability is indicated with a
next to the coefficient estimate; for brevity, the
actual confidence intervals are not shown. In Table 3, the number of
individual time-series regressions that yield significant
coefficients according to the Campbell and Yogo test is indicated
in the column labeled CY
.
Before considering the empirical results for the different
predictor variables, it is useful to briefly analyze the
homogeneity of the slope coefficients in the pooled predictive
regressions. Table 3
shows the outcome of the
test of slope homogeneity; it
is a one-sided test, and a value greater than 1.65 indicates that the null hypothesis of
for all
is rejected at the five percent level. As is seen, slope
homogeneity can always be rejected for the global panel. For the
developed panel, homogeneity is only rejected in the dividend-price
ratio regression using the full sample, which spans a much larger
range of time than the other panels since the dividend-price ratio
is available further back than any of the other predictor
variables; when data before 1950 is dropped, slope homogeneity can
no longer be rejected. For the emerging market panels, the null of
slope homogeneity is only rejected in the regression with the short
interest rate. These results thus support the notion that countries
within the groups of developed and emerging markets tend to be more
homogeneous in terms of predictability than countries across these
groups.
As shown previously, the pooled analysis is also valid when the slope coefficients are not homogeneous. However, the estimates from the global panels, and in a couple of instances from the developed and emerging panels, are best interpreted as average relationships, and the corresponding tests as tests of whether there is predictability on average in the data.
The results for the earnings-price ratio are presented in Panel A of Tables 3 and 4. There is minimal evidence of a positive predictive relationship. Specifically, pooling the data at either the global or developed market levels does not yield a significant coefficient, regardless of whether one controls for common factors; however, there is evidence of a predictive relationship when pooling at the emerging market level and controlling for common factors. To ensure that the developed market results are not driven by the longer earnings-price ratio time-series available for the U.K. and the U.S., I also estimate these pooled regressions when restricting the sample to observations after 1950. The individual country time-series results confirm the lack of evidence of a predictive relationship in the pooled regressions. In particular, in the post-1950 sample, only four of the 38 time-series regressions (Argentina, Jordan, South Africa, and the U.K.) yield any significant coefficients.
There is thus rather weak evidence that the earnings-price ratio predicts stock returns; the majority of evidence that does exist is for emerging economies. It is noteworthy that the null of no predictability would have been rejected in all of the pooled regressions if one relied on non-robust methods that fail to control for the endogeneity and persistence of the regressors, as well as common factors in the data. Controlling for common factors appears to be of potentially great importance. It is interesting to note that doing so does not necessarily weaken the results.
Panel B in Table 3 shows the results from pooled regressions with the dividend-price ratio as the regressor. The results are generally somewhat stronger than for the earnings-price ratio. Specifically, when controlling for common factors, the coefficient is significant when pooling both at the post-1950 global level and the developed market level, as well as in the full-sample emerging panel. The overall picture depicted by the individual time-series regressions shown in Panel B of Table 4, however, is still fairly weak, although evidence of predictability is observed for post-1950 Australia, Chile, post-WWII Japan, Jordan, Mexico, Taiwan, the U.K., and the post-1950 U.S. The time-series evidence thus presents no clear pattern of predictability and the evidence that exists is distributed fairly equally between developed and emerging markets. As in the case of the earnings-price ratio, the null of no predictability would often have been rejected when using non-robust tests.
In light of the empirical evidence of a predictive relationship seen in U.S. data, one would expect there to be a negative relationship between the current short rate and future stock returns. The data used in all interest rate regressions are restricted to start in 1952 or after, following the convention used in studies with U.S. data.14 The pooled results for the short interest rate are presented in Panel C of Table 3.
The null of no predictability is strongly rejected in the pooled sample of developed markets. In contrast to this strongly significant negative relationship, the pooled relationship in the emerging markets is not significant. Given the rather capricious character of interest rates in many emerging economies (e.g. Argentina), I focus strictly on the developed market results. As seen in Panel C of Table 4, this finding of predictability is supported by the results of the individual time-series regressions for the developed markets. In particular, a significant predictive relationship is found in eight out of 23 developed markets: Canada, Germany, the Netherlands, New Zealand, Portugal, Spain, Switzerland, and the U.S. In addition, a closer look at the individual country level results further strengthens this pattern; in particular, it reveals that the estimates for 15 of the developed markets are more than one standard deviation away from zero while the slope coefficient estimate is negative for 20 of the 23 countries.
Based on the U.S. experience, one would expect there to be a positive predictive relationship, if any, between the term spread and stock returns. As in the case of the short interest rate, I find a positive significant predictive relationship only in developed markets. As shown in Panel D of Table 3, there is strong evidence of a predictive relationship when pooling the developed economies. As this relationship is not evident for the emerging markets, I once again focus on the results for the developed markets. As seen in Panel D of Table 4, this finding of predictability is supported strongly by the results of the individual time-series regressions for the developed markets. For 10 of 23 individual time-series regressions, there is a positive and significant predictive relationship: Canada, France, Germany, Italy, the Netherlands, New Zealand, Norway, Spain, Switzerland, and the U.S. Furthermore, 14 countries have a coefficient that is more than one standard deviation from zero.
How robust are these patterns of predictability to different sample periods? To analyze this, I consider pooled regressions with expanding windows of observations for the developed markets. I focus on the developed panel since the time-series in that panel typically have longer samples available. A new country is added to the expanding window regression when there are five years of observations available; no 'old' observations are ever dropped from the estimation window and the estimates at each point in time are thus based on all observations available up to that date.15 Confidence intervals, with a nominal 90 percent coverage rate, are calculated in a manner analogous to the test statistics shown in Table 3, based on the normal distribution. The left column of Figure 2 shows the results from using the standard fixed effects estimator, without controlling for common factors. The right column shows the results from the estimator using recursively demeaned data and controlling for common factors. The confidence intervals in the left column are thus typically biased and will generally not have an actual coverage rate of 90 percent; however, these results further illustrate the importance of controlling for endogeneity and common factors.
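The expanding-window scheme can be sketched as follows. For simplicity the sketch uses the standard fixed effects (time-demeaned) slope within each window and omits both the recursive demeaning and the confidence intervals; the function name `expanding_window_estimates`, the NaN convention for missing months, and the toy data are assumptions made for the example.

```python
import numpy as np

def expanding_window_estimates(r, x_lag, min_obs=60):
    """Pooled fixed effects slope at each date, using all data up to that date.

    r, x_lag : (T, n) arrays with NaN where a country has no observation.
    A country enters once it has min_obs observations (five years of months).
    """
    T, n = r.shape
    valid = ~np.isnan(r) & ~np.isnan(x_lag)
    path = np.full(T, np.nan)
    for t in range(T):
        num = den = 0.0
        for i in range(n):
            mask = valid[: t + 1, i]
            if mask.sum() < min_obs:
                continue                      # country not yet in the panel
            ri = r[: t + 1, i][mask]
            xi = x_lag[: t + 1, i][mask]
            rd, xd = ri - ri.mean(), xi - xi.mean()
            num += xd @ rd
            den += xd @ xd
        if den > 0:
            path[t] = num / den
    return path

# Toy panel (hypothetical): the second country's data start 50 months late.
rng = np.random.default_rng(1)
x = rng.standard_normal((150, 2))
r = np.array([0.01, -0.02]) + 0.3 * x         # noiseless, true slope 0.3
r[:50, 1] = np.nan
x[:50, 1] = np.nan
path = expanding_window_estimates(r, x)
```

Because no observations are ever dropped, each entry of `path` is based on every observation available up to that date, and the estimates are undefined until the first country has accumulated five years of data.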
The results presented in Figure 2 mostly reflect
those discussed above based on the complete sample. However, the
results for the dividend-price ratio
, which generally appeared
somewhat stronger than those for the earnings-price ratio
, now appear very weak
when viewed over time. Overall, the support of any stable
predictive ability in either of these two variables is very weak.
The term spread
coefficient
fluctuates around zero until the late 1970s, after which the lower
bound of the confidence interval typically hovers above zero,
a