Abstract:
A model for the multivariate distribution of the returns on large collections of financial assets is a crucial component in modern risk management and asset allocation. Modelling high-dimensional distributions, however, is not an easy task and only a few models are typically used in high dimensions, most notably the Normal distribution, which is still widely used in practice and academia despite its notorious limits, for example, thin tails and zero tail dependence.
This paper provides a new approach for constructing and estimating high-dimensional distribution models. Our approach builds on two active areas of recent research in financial econometrics. First, high frequency data has been shown to be superior to daily data for measuring and forecasting variances and covariances, see Andersen, et al. (2006) for a survey of this very active area of research. This implies that there are gains to be had by modelling linear dependence, as captured by covariances, using high frequency data. Second, copula methods have been shown to be useful for constructing flexible distribution models in high dimensions, see Christoffersen, et al. (2013), Oh and Patton (2013) and Creal and Tsay (2014). These two findings naturally lead to the question of whether high frequency data and copula methods can be combined to improve the modelling and forecasting of high-dimensional return distributions.
Exploiting high frequency data in a lower frequency copula-based model is not straightforward as, unlike variances and covariances, the copula of low frequency (say daily) returns is not generally a known function of the copula of high frequency returns. Thus the link between high frequency volatility measures (e.g., realized variance and covariance) and their low frequency counterparts cannot generally be exploited when considering dependence via the copula function. We overcome this hurdle by decomposing the dependence structure of low frequency asset returns into linear and nonlinear components. We then use high frequency data to accurately model the linear dependence, as measured by covariances, and a new class of copulas to capture the remaining dependence in the low frequency standardized residuals.
The difficulty in specifying a copula-based model for standardized, uncorrelated residuals is that the distribution of the residuals must imply an identity correlation matrix. Independence is sufficient, but not necessary, for uncorrelatedness, and we wish to allow for possible nonlinear dependence between these linearly unrelated variables. Among existing work, only the multivariate Student's t distribution has been used for this purpose, as an identity correlation matrix can be directly imposed on this distribution. We dramatically increase the set of possible models for uncorrelated residuals by proposing methods for generating "jointly symmetric" copulas. These copulas can be constructed from any given (possibly asymmetric) copula, and when combined with any collection of (possibly heterogeneous) symmetric marginal distributions they guarantee an identity correlation matrix. Evaluation of the density of our jointly symmetric copulas turns out to be computationally difficult in high dimensions, but we show that composite likelihood methods (see Varin, et al. 2011 for a review) may be used to estimate the model parameters and undertake model selection tests.
This paper makes four main contributions. First, we propose a new class of "jointly symmetric" copulas, which are useful in multivariate density models that contain a covariance matrix model (e.g., GARCH-DCC, HAR, stochastic volatility, etc.) as a component. Second, we show that composite likelihood methods may be used to estimate the parameters of these new copulas, and in an extensive simulation study we verify that these methods have good finite-sample properties. Third, we propose a new and simple model for high-dimensional covariance matrices drawing on ideas from the HAR model of Corsi (2009) and the DCC model of Engle (2002), and we show that this model outperforms the familiar DCC model empirically. Finally, we present a detailed empirical application of our model to 104 individual U.S. equity returns, showing that our proposed approach significantly outperforms existing approaches both in-sample and out-of-sample.
Our methods and application are related to several existing papers. Most closely related is the work of Lee and Long (2009), who also consider the decomposition into linear and nonlinear dependence, and use copula-based models for the nonlinear component. However, Lee and Long (2009) focus only on bivariate applications, and their approach, which we describe in more detail in Section 2, is computationally infeasible in high dimensions. Our methods are also clearly related to copula-based density models, some examples of which are cited above. However, in those approaches only the variances are modelled prior to the copula stage, meaning that the copula model must capture both the linear and nonlinear components of dependence. This makes it difficult to incorporate high frequency data into the dependence model. Papers that employ models for the joint distribution of returns that include a covariance modelling step include Chiriac and Voev (2011), Jondeau and Rockinger (2012), Hautsch, et al. (2013), and Jin and Maheu (2013). As models for the standardized residuals, those papers use the Normal or Student's t distributions, both of which are nested in our class of jointly symmetric models, and which we show are significantly beaten in our application to U.S. equity returns.
The paper is organized as follows. Section 2 presents our approach for modelling high-dimensional distributions. Section 3 presents multi-stage, composite likelihood methods for model estimation and comparison, which are studied via simulations in Section 4. Section 5 applies our model to daily equity returns and compares it with existing approaches. Section 6 concludes. An appendix contains all proofs, and a web appendix contains additional details, tables and figures.
We construct a model for the conditional distribution of the N-vector $$ \mathbf{r}_{t}$$ as follows:
$$\displaystyle \mathbf{r}_{t}=\mathbf{\mu }_{t}+\mathbf{H}_{t}^{1/2}\mathbf{e}_{t} \qquad (1)$$
where
$$\displaystyle \mathbf{e}_{t}\sim iid~\mathbf{F}\left( \mathbf{\cdot };\mathbf{\eta }\right) \qquad (2)$$
In existing approaches, see Chiriac and Voev (2011), Jondeau and Rockinger (2012), Hautsch, et al. (2013), and Jin and Maheu (2013) for example, $$ \mathbf{F}$$ would be assumed multivariate Normal (which reduces to independence, given that $$ \mathbf{e}_{t}$$ has identity covariance matrix) or Student's $$ t,$$ and the model would be complete. Instead, we consider the decomposition of the joint distribution $$ \mathbf{F}$$ into marginal distributions $$ F_{i}$$ and copula $$ \mathbf{C}$$ using Sklar's (1959) theorem:
$$\displaystyle \mathbf{e}_{t}\sim \mathbf{F}\left( \mathbf{\cdot };\mathbf{\eta }\right) =\mathbf{C}\left( F_{1}\left( \mathbf{\cdot };\mathbf{\eta }\right) ,...,F_{N}\left( \mathbf{\cdot };\mathbf{\eta }\right) ;\mathbf{\eta }\right) \qquad (3)$$
$$\displaystyle \mathbf{f}_{t}\left( \mathbf{r}_{t}\right) =\det \left( \mathbf{H}_{t}^{-1/2}\right) \times \mathbf{c}\left( F_{1}\left( e_{1t}\right) ,...,F_{N}\left( e_{Nt}\right) \right) \times \prod\nolimits_{i=1}^{N}f_{i}\left( e_{it}\right) \qquad (4)$$
Lee and Long (2009) were the first to propose decomposing dependence into linear and nonlinear components, and we now discuss their approach in more detail. They proposed the following model:
$$\displaystyle \mathbf{r}_{t}=\mathbf{\mu }_{t}+\mathbf{H}_{t}^{1/2}\mathbf{\Sigma }^{-1/2}\mathbf{w}_{t} \qquad (5)$$
where $$ \mathbf{w}_{t}\sim iid~\mathbf{G}\left( \mathbf{\cdot };\mathbf{\eta }\right) =\mathbf{C}_{\mathbf{w}}\left( G_{1}\left( \mathbf{\cdot };\mathbf{\eta }\right) ,...,G_{N}\left( \mathbf{\cdot };\mathbf{\eta }\right) ;\mathbf{\eta }\right) $$ and $$ \mathbf{\Sigma }\equiv Cov\left[ \mathbf{w}_{t}\right] .$$
We next describe how we propose modelling the uncorrelated residuals, $$ \mathbf{e}_{t},$$ and then we turn to models for the covariance matrix $$ \mathbf{H}_{t}.$$
Research on forecasting models for multivariate covariance matrices with low-frequency data is pervasive, see Andersen, et al. (2006) for a review, and research on forecasting models using high frequency data is growing, e.g. Chiriac and Voev (2011), Noureldin, et al. (2012) among others. There are two major concerns about forecasting models for multivariate covariance matrices: parsimony and positive definiteness. Keeping these two concerns in mind, we combine the essential ideas of the DCC model of Engle (2002) and the heterogeneous autoregressive (HAR) model of Corsi (2009) to obtain a simple and flexible new forecasting model for covariance matrices. Following the DCC model, we estimate the variances and correlations separately, to reduce the computational burden. We use the HAR model structure, which is known to successfully capture the long-memory behavior of volatility in a simple autoregressive way.
Let $$ \Delta $$ be the sampling frequency (e.g., 5 minutes), which yields $$ 1/\Delta $$ observations per trade day. The $$ N\times N$$ realized covariance matrix for the interval $$ \left[ t-1,t\right] $$ is defined by
$$\displaystyle RVarCov_{t}^{\Delta }=\sum_{j=1}^{1/\Delta }\mathbf{r}_{t-1+j\cdot \Delta }\mathbf{r}_{t-1+j\cdot \Delta }^{\prime } \qquad (8)$$
$$\displaystyle RVarCov_{t}^{\Delta }=\sqrt{RVar_{t}^{\Delta }}\cdot RCorr_{t}^{\Delta }\cdot \sqrt{RVar_{t}^{\Delta }} \qquad (9)$$
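Equations (8) and (9) can be illustrated with a few lines of NumPy. The following is a minimal sketch, not the authors' code: the function names, the 78 intraday periods, and the simulated returns are all assumptions made for the example.

```python
import numpy as np

def realized_covariance(intraday_returns):
    """Equation (8): sum of outer products of the intraday return vectors.

    intraday_returns has shape (M, N): M = 1/Delta intraday periods, N assets.
    """
    r = np.asarray(intraday_returns, dtype=float)
    return r.T @ r

def variances_and_correlations(rcov):
    """Equation (9): split RVarCov into realized variances and correlations."""
    rvar = np.diag(rcov).copy()
    inv_sd = 1.0 / np.sqrt(rvar)
    rcorr = rcov * np.outer(inv_sd, inv_sd)
    return rvar, rcorr

# Hypothetical data: 78 intraday returns on 3 assets (values are made up).
rng = np.random.default_rng(0)
returns = 0.001 * rng.standard_normal((78, 3))

rcov = realized_covariance(returns)
rvar, rcorr = variances_and_correlations(rcov)

# Recombining the two pieces as in equation (9) recovers RVarCov.
sd = np.sqrt(rvar)
rebuilt = sd[:, None] * rcorr * sd[None, :]
```

Storing the variances as a vector rather than a diagonal matrix is only a convenience; the product in equation (9) is the same.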
We propose to first apply the HAR model to each (log) realized variance:
$$\displaystyle \begin{aligned} \log RVar_{ii,t}^{\Delta } &= \phi _{i}^{\left( const\right) }+\phi _{i}^{\left( day\right) }\log RVar_{ii,t-1}^{\Delta }+\phi _{i}^{\left( week\right) }\frac{1}{4}\sum\nolimits_{k=2}^{5}\log RVar_{ii,t-k}^{\Delta } \\ &\quad +\phi _{i}^{\left( month\right) }\frac{1}{15}\sum\nolimits_{k=6}^{20}\log RVar_{ii,t-k}^{\Delta }+\xi _{it},\quad i=1,2,...,N. \end{aligned} \qquad (10)$$
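As an illustration of equation (10), the sketch below builds the daily, weekly (lags 2 to 5) and monthly (lags 6 to 20) regressors for one asset and fits the coefficients by ordinary least squares. OLS is used here purely for illustration, and the function names and the simulated series are assumptions, not part of the paper.

```python
import numpy as np

def har_design(log_rv):
    """Regressors and target of equation (10) for one series of log realized variances."""
    x = np.asarray(log_rv, dtype=float)
    rows, target = [], []
    for t in range(20, len(x)):
        day = x[t - 1]                    # lag 1
        week = x[t - 5:t - 1].mean()      # average of lags 2..5
        month = x[t - 20:t - 5].mean()    # average of lags 6..20
        rows.append([1.0, day, week, month])
        target.append(x[t])
    return np.array(rows), np.array(target)

def fit_har(log_rv):
    """Least-squares estimates of (const, day, week, month) coefficients."""
    X, y = har_design(log_rv)
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    return phi

# Hypothetical log realized variance series (placeholder values, not real data).
rng = np.random.default_rng(1)
log_rv = -9.0 + 0.5 * rng.standard_normal(500)
phi = fit_har(log_rv)
```

The first 20 observations are used only as lags, so a series of length T yields T - 20 usable rows.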
Next, we propose a model for realized correlations, using the vech operator. Consider the following HAR-type model for correlations:
$$\displaystyle \begin{aligned} vech\left( RCorr_{t}^{\Delta }\right) &= vech\left( \overline{RCorr_{T}^{\Delta }}\right) \left( 1-a-b-c\right) +a\cdot vech\left( RCorr_{t-1}^{\Delta }\right) \\ &\quad +b\cdot \frac{1}{4}\sum\nolimits_{k=2}^{5}vech\left( RCorr_{t-k}^{\Delta }\right) +c\cdot \frac{1}{15}\sum\nolimits_{k=6}^{20}vech\left( RCorr_{t-k}^{\Delta }\right) +\mathbf{\xi }_{t} \end{aligned} \qquad (11)$$
Let $$ \widehat{RVarCov_{t}^{\Delta }}$$ denote a forecast of the covariance matrix based on equations (10) and (11) and estimated parameters. The theorem below provides conditions under which $$ \widehat{RVarCov_{t}^{\Delta }}$$ is guaranteed to be positive definite.
Our forecasting model for the realized covariance matrix is simple and fast to estimate, and positive definiteness is ensured by Theorem 2. We note that this theorem is robust to misspecification of the return distribution, i.e., Theorem 2 holds regardless of whether the return distribution follows the model specified by equations (1)-(2).
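The sketch below shows a one-step correlation forecast built from the daily, weekly and monthly averages used in the HAR-type correlation model. It assumes nonnegative coefficients with a + b + c < 1, which is a sufficient condition chosen for this illustration and not a restatement of the theorem: under it the forecast is a positively weighted average of positive definite matrices and is therefore positive definite. The coefficient values and simulated matrices are hypothetical.

```python
import numpy as np

def har_corr_forecast(history, long_run_mean, a, b, c):
    """One-step forecast from the HAR-type correlation recursion.

    history: array (T, N, N) of realized correlation matrices, most recent last, T >= 20.
    Working on full matrices is equivalent to working on their vech, element by element.
    """
    h = np.asarray(history, dtype=float)
    day = h[-1]                        # lag 1
    week = h[-5:-1].mean(axis=0)       # average of lags 2..5
    month = h[-20:-5].mean(axis=0)     # average of lags 6..20
    return (1.0 - a - b - c) * long_run_mean + a * day + b * week + c * month

# Hypothetical realized correlation matrices from simulated intraday returns.
rng = np.random.default_rng(2)
history = np.array([
    np.corrcoef(rng.standard_normal((78, 4)), rowvar=False) for _ in range(60)
])
long_run_mean = history.mean(axis=0)

# Hypothetical coefficients, assumed nonnegative with a + b + c < 1.
forecast = har_corr_forecast(history, long_run_mean, a=0.1, b=0.3, c=0.4)
min_eig = np.linalg.eigvalsh(forecast).min()
```

Because every input has a unit diagonal and the weights sum to one, the forecast also has a unit diagonal and is a valid correlation matrix.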
This section proposes a composite likelihood approach to estimate models from the class of jointly symmetric copulas proposed in Theorem 1, and then describes corresponding methods for model comparison tests of copula models specified and estimated in this way. Finally, we present results on how to handle the estimation error for the complete model, taking into account the multi-stage nature of the proposed estimation methods.
The proposed method to construct jointly symmetric copulas in Theorem 1 requires $$ 2^{N}$$ evaluations of the given original copula density. Even for moderate dimensions, say N=20, the likelihood may be too slow to evaluate. We illustrate this using a jointly symmetric copula based on the Clayton copula, which has a simple closed-form density and requires just a fraction of a second for a single evaluation. The first row of Table 1 shows that as the dimension, and thus the number of rotations, increases, the computation time for a single evaluation of the jointly symmetric Clayton copula grows from less than a second to several minutes to many years.
For high dimensions, ordinary maximum likelihood estimation (MLE) is not feasible for our jointly symmetric copulas. A composite likelihood (Lindsay, 1988) consists of combinations of the likelihoods of submodels or marginal models of the full model, and under certain conditions maximizing the composite likelihood (CL) can be shown to generate parameter estimates that are consistent for the true parameters of the model. The essential intuition behind CL is that since submodels include partial information on the parameters of the full model, by properly using that partial information we can estimate the parameters of the full model, although of course subject to some efficiency loss.
The composite likelihood can be defined in various ways, depending on which sub-models of the full model are employed. In our case, the use of bivariate sub-models is particularly attractive, as a bivariate sub-model of the jointly symmetric copula generated using equation (6) requires only four rotations. This is easily shown using some copula manipulations, and we summarize this result in the proposition below.
$$\displaystyle \mathbf{c}_{ij}^{JS}\left( u_{i},u_{j}\right) =\frac{1}{4}\left \{ \mathbf{c}_{ij}\left( u_{i},u_{j}\right) +\mathbf{c}_{ij}\left( 1-u_{i},u_{j}\right) +\mathbf{c}_{ij}\left( u_{i},1-u_{j}\right) +\mathbf{c}_{ij}\left( 1-u_{i},1-u_{j}\right) \right \}$$
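The four-rotation formula above is easy to evaluate directly. The sketch below applies it to the standard bivariate Clayton density; the function names and the parameter value are assumptions made for illustration. It also checks numerically that the resulting density is symmetric under each reflection, and that the cross moment of the centered uniforms vanishes on a symmetric grid, which is what delivers zero correlation.

```python
import numpy as np

def clayton_density(u, v, theta):
    """Bivariate Clayton copula density, theta > 0."""
    return ((1 + theta) * (u * v) ** (-(1 + theta))
            * (u ** -theta + v ** -theta - 1) ** (-(2 + 1 / theta)))

def js_clayton_density(u, v, theta):
    """Jointly symmetric density: equal-weight average of the four rotations."""
    return 0.25 * (clayton_density(u, v, theta)
                   + clayton_density(1 - u, v, theta)
                   + clayton_density(u, 1 - v, theta)
                   + clayton_density(1 - u, 1 - v, theta))

theta = 2.0  # hypothetical parameter value

# Midpoint grid on (0, 1)^2, symmetric around 1/2 in both coordinates.
g = (np.arange(200) + 0.5) / 200
U, V = np.meshgrid(g, g)

# Grid approximation of E[(U - 1/2)(V - 1/2)]; the reflections cancel term by term.
cross_moment = np.mean((U - 0.5) * (V - 0.5) * js_clayton_density(U, V, theta))
```

The original Clayton copula is asymmetric, with dependence concentrated in the lower tail, so the averaged density places equal mass in all four corners.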
Thus while the full model requires $$ 2^{N}$$ rotations of the original density, bivariate marginal models require only $$ 2^{2}$$ rotations. Similar to Engle, et al. (2008), we consider CL based on all pairs of variables, only adjacent pairs of variables, or only the first pair of variables:
$$\displaystyle CL_{all}\left( u_{1},\ldots ,u_{N}\right) =\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\log \mathbf{c}_{i,j}\left( u_{i},u_{j}\right) \qquad (12)$$
$$\displaystyle CL_{adj}\left( u_{1},\ldots ,u_{N}\right) =\sum_{i=1}^{N-1}\log \mathbf{c}_{i,i+1}\left( u_{i},u_{i+1}\right) \qquad (13)$$
$$\displaystyle CL_{first}\left( u_{1},\ldots ,u_{N}\right) =\log \mathbf{c}_{1,2}\left( u_{1},u_{2}\right) \qquad (14)$$
While there are many different ways to construct composite likelihoods, they all have some common features. First, they are valid likelihoods since the likelihoods of the sub-models are themselves valid likelihoods. Second, the joint model implied by taking products of densities of sub-models (i.e., imposing an incorrect independence assumption) causes misspecification and the information matrix equality will not hold. Third, the computation of the composite likelihood is substantially faster than that of the full likelihood. In our application the computational burden is reduced from $$ \mathcal{O}\left( 2^{N}\right) $$ to $$ \mathcal{O}\left( N^{2}\right) ,~\mathcal{O}\left( N\right) $$ or $$ \mathcal{O}\left( 1\right) $$ when we use all pairs, only adjacent pairs, or only the first pair of variables, respectively. The bottom three rows in Table 1 show the computation gains from using a composite likelihood based on one of the three combinations in equations (12)-(14) compared with using the full likelihood.
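The three composite likelihoods differ only in which index pairs enter the sum. A minimal sketch, assuming a jointly symmetric Clayton pair density and uniform placeholder data (all names and values here are hypothetical), also counts the density evaluations each variant needs per observation:

```python
import numpy as np
from itertools import combinations

def log_js_clayton(u, v, theta):
    """Log of the four-rotation jointly symmetric Clayton pair density."""
    def c(x, y):
        return ((1 + theta) * (x * y) ** (-(1 + theta))
                * (x ** -theta + y ** -theta - 1) ** (-(2 + 1 / theta)))
    return np.log(0.25 * (c(u, v) + c(1 - u, v) + c(u, 1 - v) + c(1 - u, 1 - v)))

def pair_sets(n):
    """Index pairs used by the all-pairs, adjacent-pairs and first-pair CLs."""
    return {
        "all": list(combinations(range(n), 2)),
        "adj": [(i, i + 1) for i in range(n - 1)],
        "first": [(0, 1)],
    }

def composite_loglik(u, theta, pairs):
    """Composite log-likelihood summed over observations (rows of u) and pairs."""
    return float(sum(log_js_clayton(u[:, i], u[:, j], theta).sum() for i, j in pairs))

# Placeholder probability integral transforms: 200 observations, 6 variables.
rng = np.random.default_rng(3)
u = rng.uniform(0.01, 0.99, size=(200, 6))

sets = pair_sets(6)
values = {name: composite_loglik(u, 1.0, p) for name, p in sets.items()}
# Four rotations per pair, versus 2**6 = 64 rotations for the full density.
counts = {name: 4 * len(p) for name, p in sets.items()}
```

With N = 6 the saving is modest, but the full likelihood's cost doubles with every added variable while the adjacent-pairs count grows by only four.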
Let us define maximum composite likelihood estimation (MCLE) as based on:
$$\displaystyle \mathbf{\hat{\theta}}_{MCLE}=\arg \max_{\mathbf{\theta }}\sum_{t=1}^{T}CL\left( u_{1t},..,u_{Nt};\mathbf{\theta }\right) \qquad (15)$$
$$\displaystyle \sqrt{T}\left( \mathbf{\hat{\theta}}_{MCLE}-\mathbf{\theta }_{0}\right) \overset{d}{\longrightarrow }N\left( 0,\mathcal{H}_{0}^{-1}\mathcal{J}_{0}\mathcal{H}_{0}^{-1}\right) \qquad (16)$$
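The maximization in equation (15) can be sketched for a one-parameter copula with a plain grid search. This is an illustrative example under stated assumptions, not the estimation code used in the paper: the data are simulated from a jointly symmetric Clayton copula with a hypothetical true parameter of 1, the adjacent-pairs CL is used, and the grid bounds and helper names are made up. The sandwich variance in equation (16) is not computed here.

```python
import numpy as np

def sample_js_clayton(n_obs, dim, theta, rng):
    """Clayton draws (Marshall-Olkin method), each coordinate reflected with probability 1/2."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n_obs, 1))
    e = rng.exponential(scale=1.0, size=(n_obs, dim))
    u = (1.0 + e / v) ** (-1.0 / theta)
    z = rng.integers(0, 2, size=(n_obs, dim))
    return z * u + (1 - z) * (1 - u)

def log_js_clayton(u, v, theta):
    """Log of the four-rotation jointly symmetric Clayton pair density."""
    u = np.clip(u, 1e-10, 1 - 1e-10)   # guard against exact 0 or 1
    v = np.clip(v, 1e-10, 1 - 1e-10)
    def c(x, y):
        return ((1 + theta) * (x * y) ** (-(1 + theta))
                * (x ** -theta + y ** -theta - 1) ** (-(2 + 1 / theta)))
    return np.log(0.25 * (c(u, v) + c(1 - u, v) + c(u, 1 - v) + c(1 - u, 1 - v)))

def adjacent_cl(u, theta):
    """Adjacent-pairs composite log-likelihood summed over all observations."""
    n = u.shape[1]
    return float(sum(log_js_clayton(u[:, i], u[:, i + 1], theta).sum()
                     for i in range(n - 1)))

def mcle_grid(u, grid):
    """Grid-search maximizer of the composite likelihood."""
    values = np.array([adjacent_cl(u, th) for th in grid])
    return float(grid[np.argmax(values)])

rng = np.random.default_rng(4)
u = sample_js_clayton(3000, 6, theta=1.0, rng=rng)   # hypothetical true theta = 1
grid = np.arange(0.2, 3.01, 0.05)
theta_hat = mcle_grid(u, grid)
```

A grid search is used only to keep the example free of optimizer dependencies; a numerical optimizer would normally replace it.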
The identification condition required for CL estimation comes from the first-order condition implied by the optimization problem. Specifically, it is required that
$$\displaystyle E\left[ \frac{\partial }{\partial \mathbf{\theta }}CL\left( u_{1t},..,u_{Nt};\mathbf{\theta }\right) \right] ~~\left\{ \begin{array}{c} =\mathbf{0}\text{ \ for }\mathbf{\theta }=\mathbf{\theta }_{0} \\ \neq \mathbf{0}\text{ \ for }\mathbf{\theta }\neq \mathbf{\theta }_{0}\end{array}\right. \qquad (17)$$
We next consider in-sample and out-of-sample model selection tests when composite likelihood is involved. The tests we discuss here are guided by our empirical analysis in Section 5, so we only consider the case where composite likelihoods with adjacent pairs are used. We first define the composite Kullback-Leibler information criterion (cKLIC) following Varin and Vidoni (2005).
$$\displaystyle I_{c}\left( \mathbf{g,h}\right) =E_{\mathbf{g}\left( \mathbf{z}\right) }\left[ \log \prod\limits_{i=1}^{N-1}\mathbf{g}_{i}\left( z_{i},z_{i+1}\right) -\log \prod\limits_{i=1}^{N-1}\mathbf{h}_{i}\left( z_{i},z_{i+1}\right) \right]$$
We focus on the CL using adjacent pairs, but other cKLICs can be defined similarly. Note that the composite log-likelihood for the joint distribution can be decomposed using Sklar's theorem (equations 3-4) into the marginal log-likelihoods and the copula composite log-likelihood. We use this expression when comparing our joint density models in our empirical work below.
$$\displaystyle \begin{aligned} CL_{h} &\equiv \sum_{i=1}^{N-1}\log \mathbf{h}\left( z_{i},z_{i+1}\right) \\ &= \log h_{1}\left( z_{1}\right) +\log h_{N}\left( z_{N}\right) +2\sum_{i=2}^{N-1}\log h_{i}\left( z_{i}\right) +\sum_{i=1}^{N-1}\log \mathbf{c}\left( H_{i}\left( z_{i}\right) ,H_{i+1}\left( z_{i+1}\right) \right) \end{aligned} \qquad (18)$$
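The second line of equation (18) follows from applying Sklar's theorem to each adjacent pair and collecting the marginal terms. A short derivation, in which $$ h_{i}$$ denotes the marginal density and $$ H_{i}$$ the marginal distribution function of $$ z_{i}$$:

$$\displaystyle \begin{aligned} \log \mathbf{h}\left( z_{i},z_{i+1}\right) &= \log h_{i}\left( z_{i}\right) +\log h_{i+1}\left( z_{i+1}\right) +\log \mathbf{c}\left( H_{i}\left( z_{i}\right) ,H_{i+1}\left( z_{i+1}\right) \right) , \\ \sum_{i=1}^{N-1}\left[ \log h_{i}\left( z_{i}\right) +\log h_{i+1}\left( z_{i+1}\right) \right] &= \log h_{1}\left( z_{1}\right) +\log h_{N}\left( z_{N}\right) +2\sum_{i=2}^{N-1}\log h_{i}\left( z_{i}\right) . \end{aligned}$$

Each interior marginal $$ h_{2},\ldots ,h_{N-1}$$ belongs to two adjacent pairs and is therefore counted twice, while $$ h_{1}$$ and $$ h_{N}$$ are each counted once.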
$$\displaystyle I_{c}\left( \mathbf{g,h}\right) =\sum_{i=1}^{N-1}E_{\mathbf{g}\left( \mathbf{z}\right) }\left[ \log \frac{\mathbf{g}_{i}\left( z_{i},z_{i+1}\right) }{\mathbf{h}_{i}\left( z_{i},z_{i+1}\right) }\right] =\sum_{i=1}^{N-1}E_{\mathbf{g}_{i}\left( z_{i},z_{i+1}\right) }\left[ \log \frac{\mathbf{g}_{i}\left( z_{i},z_{i+1}\right) }{\mathbf{h}_{i}\left( z_{i},z_{i+1}\right) }\right] \qquad (19)$$
We may also wish to select the best model in terms of out-of-sample (OOS) forecasting performance measured by some scoring rule, $$ \mathcal{S},$$ for the model. Gneiting and Raftery (2007) define "proper" scoring rules as those which satisfy the condition that the true density always receives a higher score, in expectation, than other densities. Gneiting and Raftery (2007) suggest that the "natural" scoring rule is the log density, i.e. $$ \mathcal{S}\left( \mathbf{h}\left( \mathbf{Z}\right) \right) =\log \mathbf{h}\left( \mathbf{Z}\right) ,$$ and it can be shown that this scoring rule is proper. We may consider a similar scoring rule based on the log composite density:
$$\displaystyle \mathcal{S}\left( \mathbf{h}\left( \mathbf{Z}\right) \right) =\sum_{i=1}^{N-1}\log \mathbf{h}_{i}\left( Z_{i},Z_{i+1}\right) \qquad (20)$$
$$\displaystyle E\left[ \sum_{i=1}^{N-1}\log \mathbf{h}_{i}\left( Z_{i},Z_{i+1}\right) \right] \leq E\left[ \sum_{i=1}^{N-1}\log \mathbf{g}_{i}\left( Z_{i},Z_{i+1}\right) \right] \qquad (21)$$
This theorem allows us to interpret OOS tests based on CL as being related to the cKLIC, analogous to OOS tests based on the full likelihood being related to the KLIC. In our empirical analysis below we employ a Giacomini and White (2006) test based on an OOS CL scoring rule.
We next consider multi-stage estimation of models such as those defined by equations (1)-(3). We consider general parametric models for the conditional mean and covariance matrix:
$$\displaystyle \mathbf{\mu }_{t}\equiv \mathbf{\mu }\left( \mathbf{Y}_{t-1};\mathbf{\theta }^{mean}\right) ,\quad \mathbf{Y}_{t-1}\in \mathcal{F}_{t-1} \qquad (22)$$
$$\displaystyle \mathbf{H}_{t}\equiv \mathbf{H}\left( \mathbf{Y}_{t-1};\mathbf{\theta }^{var}\right)$$
The standardized uncorrelated residuals in equation (3) follow a parametric distribution:
$$\displaystyle \mathbf{e}_{t}\sim iid~\mathbf{F}=\mathbf{C}\left( F_{1}\left( \cdot ;\mathbf{\theta }_{1}^{mar}\right) ,...,F_{N}\left( \cdot ;\mathbf{\theta }_{N}^{mar}\right) ;\mathbf{\theta }^{cop}\right) \qquad (23)$$
The covariance model proposed in Section 2.2 allows for the separate estimation of the conditional variances and the conditional correlation matrix, similar to the DCC model of Engle (2002) which we also consider in our empirical application below. Thus we can decompose the parameter $$ \mathbf{\theta }^{var}$$ into $$ \left[ \mathbf{\theta }_{1}^{var},\ldots ,\mathbf{\theta }_{N}^{var},\mathbf{\theta }^{corr}\right] ,$$ and then represent the complete set of unknown parameters as
$$\displaystyle \mathbf{\theta }\equiv \left[ \begin{array}{cccccccc} \mathbf{\theta }_{1}^{var} & \ldots & \mathbf{\theta }_{N}^{var} & \mathbf{\theta }^{corr} & \mathbf{\theta }_{1}^{mar} & \ldots & \mathbf{\theta }_{N}^{mar} & \mathbf{\theta }^{cop}\end{array}\right] . \qquad (24)$$
We estimate these parameters in stages:
$$\displaystyle \begin{aligned} \mathbf{\hat{\theta}}_{i}^{var} &\equiv \arg \max_{\mathbf{\theta }_{i}^{var}}\sum_{t=1}^{T}\log l_{it}^{var}\left( \mathbf{\theta }_{i}^{var}\right) ,\quad i=1,\ldots ,N \\ \mathbf{\hat{\theta}}^{corr} &\equiv \arg \max_{\mathbf{\theta }^{corr}}\sum_{t=1}^{T}\log l_{t}^{corr}\left( \mathbf{\hat{\theta}}_{1}^{var},\ldots ,\mathbf{\hat{\theta}}_{N}^{var},\mathbf{\theta }^{corr}\right) \\ \mathbf{\hat{\theta}}_{i}^{mar} &\equiv \arg \max_{\mathbf{\theta }_{i}^{mar}}\sum_{t=1}^{T}\log l_{it}^{mar}\left( \mathbf{\hat{\theta}}_{1}^{var},\ldots ,\mathbf{\hat{\theta}}_{N}^{var},\mathbf{\hat{\theta}}^{corr},\mathbf{\theta }_{i}^{mar}\right) ,\quad i=1,\ldots ,N \\ \mathbf{\hat{\theta}}^{cop} &\equiv \arg \max_{\mathbf{\theta }^{cop}}\sum_{t=1}^{T}\log l_{t}^{cop}\left( \mathbf{\hat{\theta}}_{1}^{var},\ldots ,\mathbf{\hat{\theta}}_{N}^{var},\mathbf{\hat{\theta}}^{corr},\mathbf{\hat{\theta}}_{1}^{mar},\ldots ,\mathbf{\hat{\theta}}_{N}^{mar},\mathbf{\theta }^{cop}\right) \end{aligned} \qquad (25)$$
In words, the first stage estimates the N individual variance models based on QMLE; the second stage uses the standardized returns to estimate the correlation model, using QMLE or a composite likelihood method (as in Engle, et al., 2008); the third stage estimates the N marginal distributions of the estimated standardized uncorrelated residuals; and the final stage estimates the copula of the standardized residuals based on the estimated "probability integral transforms." This final stage may be maximum likelihood (if the copula is such that this is feasible) or composite likelihood, as described in Section 3.1. We denote the complete vector of estimated parameters obtained from these four stages as $$ \mathbf{\hat{\theta}}_{MSML}.$$
As is clear from the above, later estimation stages depend on previously estimated parameters, and the accumulation of estimation error must be properly incorporated into standard error calculations for $$ \mathbf{\hat{\theta}}_{MSML}$$ . Multi-stage ML estimation (and, in particular, multi-stage ML with a composite likelihood stage) can be viewed as a form of multi-stage GMM estimation, and under standard regularity conditions, it can be shown (see Newey and McFadden, 1994, Theorem 6.1) that
$$\displaystyle \sqrt{T}\left( \mathbf{\hat{\theta}}_{MSML}-\mathbf{\theta }^{\ast }\right) \overset{d}{\rightarrow }N\left( 0,V_{MSML}^{\ast }\right) \text{ as }T\rightarrow \infty \qquad (26)$$
In this section we use simulations to study the efficiency loss from maximum composite likelihood estimation (MCLE) relative to MLE, and we compare the efficiency of the three composite likelihoods presented in equations (12)-(14), namely "all pairs," "adjacent pairs," and "first pair."
We specify the data generating process as follows, based on some copula $$ \mathbf{C}$$ and a set of independent Bernoulli random variables:
$$\displaystyle \tilde{u}_{it}=Z_{it}u_{it}+\left( 1-Z_{it}\right) \left( 1-u_{it}\right) ,\quad t=1,2,...,T \qquad (27)$$
where $$ \left[ u_{1t},...,u_{Nt}\right] \equiv \mathbf{u}_{t}\sim iid~\mathbf{C}\left( \theta \right) ,$$ and $$ Z_{it}\sim iid~Bernoulli\left( 1/2\right) $$ with $$ Z_{it}\perp Z_{jt}~\forall ~i\neq j.$$
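The data generating process in equation (27) can be simulated in a few lines. The sketch below assumes a Clayton copula sampled with the Marshall-Olkin (gamma frailty) method; the sampler choice, function names, sample size and parameter value are assumptions for illustration.

```python
import numpy as np

def sample_clayton(n_obs, dim, theta, rng):
    """Exchangeable Clayton copula draws via the Marshall-Olkin method (theta > 0)."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n_obs, 1))   # shared frailty
    e = rng.exponential(scale=1.0, size=(n_obs, dim))
    return (1.0 + e / v) ** (-1.0 / theta)

def sample_jointly_symmetric(n_obs, dim, theta, rng):
    """Equation (27): reflect each coordinate independently with probability 1/2."""
    u = sample_clayton(n_obs, dim, theta, rng)
    z = rng.integers(0, 2, size=(n_obs, dim))    # independent Bernoulli(1/2) draws
    return z * u + (1 - z) * (1 - u)

rng = np.random.default_rng(5)
u_tilde = sample_jointly_symmetric(20000, 3, theta=1.0, rng=rng)  # hypothetical theta

# The reflected variables remain uniform, and their sample correlations are close to zero.
means = u_tilde.mean(axis=0)
corr = np.corrcoef(u_tilde, rowvar=False)
```

The random reflections remove linear correlation but not dependence: joint extremes in any corner remain more likely than under independence.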
We consider four different estimation methods: MLE, MCLE with all pairs (equation 12), MCLE with adjacent pairs (equation 13), and MCLE with the first pair (equation 14). MLE is not computationally feasible for $$ N>10$$ , but the MCLEs are feasible for all dimensions considered. We report estimated run times for MLE for $$ N\geq 20$$ to provide an indication of how long MLE would take to complete in those dimensions.
Table 2 presents the simulation results for the Clayton copula, and the web appendix presents corresponding results for the Gumbel copula. The average biases for all dimensions and for all estimation methods are small relative to the standard deviations. The standard deviations show, unsurprisingly, that MLE is more accurate than the three MCLEs; the efficiency loss of MCLE with "all pairs" relative to MLE ranges from 5% to 37%. Among the three MCLEs, MCLE with all pairs has the smallest standard deviations and MCLE with the first pair has the largest, as expected. Comparing MCLE with adjacent pairs to MCLE with all pairs, we find that the loss in efficiency is 23% for N=10 and 5% for N=100, while computation is two times faster for N=10 and 70 times faster for N=100. For high dimensions, this confirms that MCLE with adjacent pairs performs quite well compared to MCLE with all pairs in terms of accuracy and computation time, which is similar to results in Engle, et al. (2008) on the use of adjacent pairs in the estimation of the DCC model.
In sum, MCLE is less efficient than MLE but still approximately unbiased and very fast for high dimensions. The accuracy of MCLE based only on adjacent pairs is similar to that of MCLE with all pairs, especially for high dimensions, and the gains in computation time are large. For this reason, we use MCLE with adjacent pairs for our empirical analysis in Section 5.
Next we study multi-stage estimation for a representative model for daily asset returns. We assume:
$$\displaystyle \mathbf{r}_{t}=\mathbf{H}_{t}^{1/2}\mathbf{e}_{t} \qquad (28)$$
$$\displaystyle \mathbf{H}_{t}\equiv Cov\left[ \mathbf{r}_{t}\vert\mathcal{F}_{t-1}\right]$$
$$\displaystyle \mathbf{e}_{t}\sim iid~\mathbf{F}=\mathbf{C}\left( F_{1}\left( \cdot ;\nu _{1}\right) ,...,F_{N}\left( \cdot ;\nu _{N}\right) ;\mathbf{\varphi }\right)$$
We estimate the model using the multi-stage estimation described in Section 3.3. The parameters of the GARCH model for each variable are estimated via QML in the first stage, and the parameters of the DCC model are estimated via variance targeting and composite likelihood with adjacent pairs, see Engle, et al. (2008) for details. We use ML to estimate the marginal distributions of the standardized residuals, and finally we estimate the copula parameters using MCLE with adjacent pairs as explained in Section 3.1. We repeat this scenario 500 times with time series of length T=1000 and cross-sectional dimensions of N=10, 50, and 100. Table 3 reports all parameter estimates except $$ \overline{\mathbf{Q}}$$ . The columns for $$ \psi _{i},\kappa _{i},\lambda _{i}$$ and $$ \nu _{i}$$ report the summary statistics obtained from $$ 500\times N$$ estimates since those parameters are the same across all variables.
Table 3 reveals that the estimated parameters are centered on the true values, with the average estimated bias being small relative to the standard deviation. As the dimension increases, the copula model parameters are more accurately estimated, which was also found in the previous section. Since this copula model keeps the dependence between any two variables identical, the amount of information on the unknown copula parameter increases as the dimension grows. The average computation time is reported in the bottom row of each panel, and it indicates that multi-stage estimation is quite fast: for example, it takes five minutes for the 100-dimensional model, in which the total number of parameters to estimate is more than 5000.
To see the impact of estimation errors from the earlier stages on copula estimation, we compare the standard deviations of the estimated copula parameters in Table 3 with the corresponding results in Table 2. The standard deviation increases by about 30% for N=10, and by about 19% for N=50 and 100. The loss of accuracy caused by having to estimate the parameters of the marginals is relatively small, given that more than 5000 parameters are estimated in the earlier stages. We conclude that multi-stage estimation with composite likelihood results in a large reduction in the computational burden (indeed, it makes this estimation problem feasible using current computing power) and yields reliable parameter estimates.
In this section we apply our proposed multivariate distribution model to equity returns over the period January 2006 to December 2012, a total of T=1761 trade days. We study every stock that was ever a constituent of the S&P 100 equity index during this sample, and which traded for the full sample period, yielding a total of N=104 assets. The web appendix contains a table with the names of these 104 stocks. We obtain high frequency transaction data on these stocks from the NYSE TAQ database, and clean these data following Barndorff-Nielsen, et al. (2009), see Bollerslev, et al. (2014) for details. We adjust prices affected by splits and dividends using "adjustment" factors from CRSP. Daily returns are calculated using the log-difference of the close prices from high frequency data. For high frequency returns, log-differences of five minute prices are used and overnight returns are treated as the first return in a day.
Table 4 presents the summary statistics of the data and the estimates of the conditional mean model. The top panel presents unconditional sample moments of the daily returns for each stock. Those numbers broadly match values reported in other studies; for example, they show strong evidence of fat tails. The lower panel reports formal tests for zero skewness and zero excess kurtosis. The tests show that only 3 stocks out of 104 have significant skewness, and all stocks have significant excess kurtosis. For reference, we also test for zero pair-wise correlations, and we reject the null for all pairs of asset returns. The middle panel shows the estimates of the parameters of AR(1) models. Constant terms are estimated to be around zero and estimates of the AR(1) coefficients are slightly negative, both consistent with values in other studies.
We estimate two different models for the conditional covariance matrix: the HAR-type model described in Section 2.2 and a GJR-GARCH-DCC model. The latter model uses daily returns, and the former exploits 5-minute intra-daily returns; both models are estimated using quasi-maximum likelihood. The estimates of the HAR variance models are presented in Panel A of Table 5, and are similar to those reported in Corsi (2009): coefficients on past daily, weekly, and monthly realized variances are around 0.38, 0.31 and 0.22. For the HAR-type correlation model, however, the coefficient on past monthly correlations is the largest, followed by weekly and daily. The parameter estimates for the DCC model presented in Panel B are close to those in other studies of daily stock returns, indicating volatility clustering, asymmetric volatility dynamics, and highly persistent time-varying correlations. The bootstrap standard errors described in Section 3.3 are provided for the correlation models, and they take into account the estimation errors of earlier stages.
The standardized residuals are constructed as $$ \mathbf{\hat{e}}_{t,M}\equiv \mathbf{\hat{H}}_{t,M}^{-1/2}\left( \mathbf{r}_{t}-\mathbf{\hat{\mu}}_{t}\right) $$ where $$ M\in \left \{ HAR,DCC\right \} .$$ We use the spectral decomposition rather than the Cholesky decomposition to compute the square-root matrix due to the former's invariance to the order of the variables. Summary statistics on the standardized residuals are presented in Panels A and B of Table 6.
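The following sketch illustrates the symmetric inverse square root obtained from the spectral decomposition and checks its invariance to the ordering of the variables. The function names are illustrative and the example covariance matrix is made up.

```python
import numpy as np

def inv_sqrt_spectral(H):
    """Symmetric inverse square root of a positive definite matrix H,
    computed from its eigendecomposition H = Q diag(lam) Q'."""
    lam, Q = np.linalg.eigh(H)
    return Q @ np.diag(lam ** -0.5) @ Q.T

def standardized_residuals(r, mu, H):
    """e = H^{-1/2} (r - mu) for one day's return vector r."""
    return inv_sqrt_spectral(H) @ (np.asarray(r) - np.asarray(mu))

# Reordering the assets simply reorders the residuals.
H = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
r = np.array([0.4, -0.2, 0.1])
mu = np.zeros(3)
perm = [2, 0, 1]
P = np.eye(3)[perm]
e = standardized_residuals(r, mu, H)
e_perm = standardized_residuals(P @ r, P @ mu, P @ H @ P.T)
same_up_to_order = np.allclose(e_perm, e[perm])
```

A Cholesky factor, by contrast, depends on which variable is listed first, so the resulting residuals are not merely reordered when the assets are permuted.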
Our proposed approach for modelling the joint distribution of the standardized residuals is based on a jointly symmetric distribution, and thus a critical first step is to test for univariate symmetry of these residuals. We do so in Panel D of Table 6. We find that we can reject the null of zero skewness for only 4/104 and 6/104 series based on the HAR and DCC models, respectively. Thus the assumption of symmetry appears reasonable for this data set.14 We also test for zero excess kurtosis and we reject it for all 104 series for both volatility models. These two test results motivate our choice of a standardized Student's t distribution for the marginal distributions of the residuals. Finally, as a check of our conditional covariance models, we also test for zero correlations between the residuals. We find that we can reject this null for 9.2% and 0.0% of the 5356 pairs of residuals, using the HAR and DCC models, respectively. Thus both models provide a reasonable estimate of the time-varying conditional covariance matrix, although by this metric the DCC model would be preferred over the HAR model.
Panel C of Table 6 presents the cross-sectional quantiles of 104 estimated degrees of freedom parameters of standardized Student's t distributions. These estimates range from 4.1 (4.2) at the 5% quantile to 6.9 (8.3) at the 95% quantile for the HAR (DCC) model. Thus both sets of standardized residuals imply substantial kurtosis, and, interestingly for the methods proposed in this paper, substantial heterogeneity in kurtosis. A simple multivariate t distribution could capture the fat tails exhibited by our data, but it imposes the same degrees of freedom parameter on all 104 series. Panel C suggests that this restriction is not supported by the data, and we show in formal model selection tests below that this assumption is indeed strongly rejected.
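To give a sense of what these degrees of freedom estimates imply, the sketch below uses the standard result that a Student's t distribution with $$ \nu >4 $$ has kurtosis $$ 3+6/\left( \nu -4\right) $$. Standardizing to unit variance does not change the kurtosis. The function name is illustrative.

```python
def t_kurtosis(nu):
    """Population kurtosis of a Student's t distribution; finite only for nu > 4."""
    if nu <= 4:
        raise ValueError("kurtosis is not finite for nu <= 4")
    return 3.0 + 6.0 / (nu - 4.0)

# Quantiles of the estimated degrees of freedom reported in the text.
implied = {nu: t_kurtosis(nu) for nu in (4.1, 4.2, 6.9, 8.3)}
```

The implied kurtosis is very sensitive to the degrees of freedom near 4, so even modest cross-sectional differences in $$ \nu _{i} $$ translate into large differences in tail thickness.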
We next present the most novel aspect of this empirical analysis: the estimation results for a selection of jointly symmetric copula models. Parameter estimates and standard errors for these models are presented in Table 7. We consider four jointly symmetric copulas based on the t, Clayton, Frank, and Gumbel copulas. The jointly symmetric copulas based on Clayton, Frank and Gumbel are constructed using Theorem 1, and the jointly symmetric t copula is obtained simply by imposing an identity correlation matrix for that copula.15 We compare our jointly symmetric specifications with two well-known benchmark models: the independence copula and the multivariate Student's t distribution. The independence copula is a special case of a jointly symmetric copula, and there is no parameter to estimate. The multivariate t distribution is what would be obtained if our jointly symmetric t copula and all 104 univariate t distributions had the same degrees of freedom parameter, and in this case there would be no gains to using Sklar's theorem to decompose the joint distribution of the residuals into marginal distributions and the copula. Note that while the independence copula imposes a stronger condition on the copula specification than the multivariate t distribution, it does allow each of the marginal distributions to be possibly heterogeneous Student's t distributions, and so the ordering of these two specifications is not clear ex ante. This table also reports bootstrap standard errors which incorporate accumulated estimation errors from former stages. We follow steps explained in Section 3.3 to obtain these standard errors. The average block length for the stationary bootstrap is set to 100.
The log-likelihoods of the complete model for all 104 daily returns are reported for each of the models in Table 7, along with the rank of each model according to its log-likelihood, out of the twelve competing specifications presented here. Comparing the values of the log-likelihoods, we draw two initial conclusions. First, copula methods (even the independence copula) outperform the multivariate t distribution, which imposes strong homogeneity on the marginal distributions and the copula. Second, high frequency data improves the fit of all models relative to the use of daily data: the six best-performing models are those based on the HAR specification.
We next study the importance of allowing for nonlinear dependence. The independence copula assumes no nonlinear dependence, and we can test for the presence of nonlinear dependence by comparing the remaining specifications with the independence copula. Since the four jointly symmetric copulas and the multivariate t distribution all nest the independence copula,16 we can implement this test as a simple restriction on an estimated parameter. The t-statistics for those tests are reported in the bottom row of each panel of Table 7. Independence is strongly rejected in all cases, and we thus conclude that there is substantial nonlinear cross-sectional dependence in daily returns. While linear correlation and covariances are important for describing this vector of asset returns, these results reveal that these measures are not sufficient to completely describe their dependence.
Our model for the joint distribution of returns invokes an assumption that while linear dependence, captured via the correlation matrix, is time-varying, nonlinear dependence, captured through the distribution of the standardized residuals, is constant. We test this assumption by estimating the parameters of this distribution (the copula parameter, and the parameters of the 104 univariate Student's t marginal distributions) separately for the first and second half of our sample period, and then test whether they are significantly different. We find that 16 (19) of the HAR (DCC) marginal distribution parameters are significantly different at the 5% level, but none of the copula parameters are significantly different. Importantly, when we implement a joint test for a change in the entire parameter vector, we find no significant evidence (the p-values are both 0.99), and thus overall we conclude that this assumption is consistent with the data.17
We now turn to formal tests to compare the remaining, mostly non-nested, models. We consider both in-sample and out-of-sample tests.
As discussed in Section 3.2, the composite likelihood KLIC (cKLIC) is a proper scoring rule and can be represented as a linear combination of bivariate KLICs, allowing us to use existing in-sample model selection tests, such as those of Rivers and Vuong (2002).
In a Rivers and Vuong test comparing two models, A and $$ B $$, the null and alternative hypotheses are:
$$ \begin{aligned} H_{0} &: E\left[ CL_{t}^{A}\left( \theta _{A}^{\ast }\right) -CL_{t}^{B}\left( \theta _{B}^{\ast }\right) \right] =0 \\ \text{vs. } H_{1} &: E\left[ CL_{t}^{A}\left( \theta _{A}^{\ast }\right) -CL_{t}^{B}\left( \theta _{B}^{\ast }\right) \right] >0 \\ H_{2} &: E\left[ CL_{t}^{A}\left( \theta _{A}^{\ast }\right) -CL_{t}^{B}\left( \theta _{B}^{\ast }\right) \right] <0 \end{aligned} \tag{29} $$
The test statistic is asymptotically standard Normal under the null:
$$ \frac{\sqrt{T}\left\{ \overline{CL}_{T}^{A}\left( \hat{\theta}_{A}\right) -\overline{CL}_{T}^{B}\left( \hat{\theta}_{B}\right) \right\} }{\hat{\sigma}_{T}}\rightarrow N\left( 0,1\right) \text{ under } H_{0} \tag{30} $$
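A minimal sketch of the t-statistic in equation (30) is given below. It takes as input the per-period composite log-likelihoods of the two models evaluated at their estimated parameters. The choice of a Newey-West (Bartlett kernel) estimator for $$ \hat{\sigma}_{T} $$ and the default lag length are assumptions made for illustration; the function names are hypothetical.

```python
import numpy as np

def newey_west_variance(d, lags):
    """Bartlett-kernel long-run variance of the series d (assumed estimator)."""
    d = np.asarray(d, dtype=float)
    T = d.size
    e = d - d.mean()
    var = e @ e / T
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)
        var += 2.0 * weight * (e[k:] @ e[:-k]) / T
    return var

def model_comparison_tstat(cl_a, cl_b, lags=10):
    """sqrt(T) * mean(CL^A_t - CL^B_t) / sigma_hat, as in equation (30).

    cl_a, cl_b : per-period composite log-likelihoods of models A and B
    """
    d = np.asarray(cl_a, dtype=float) - np.asarray(cl_b, dtype=float)
    sigma_hat = np.sqrt(newey_west_variance(d, lags))
    return np.sqrt(d.size) * d.mean() / sigma_hat
```

A large positive value favours model A and a large negative value favours model B; swapping the two inputs flips the sign of the statistic.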
Table 8 presents t-statistics from Rivers and Vuong (2002) model comparison tests. A positive t-statistic indicates that the model above beats the model to the left, and a negative one indicates the opposite. We first examine the bottom row of the upper panel to see whether the copula-based models outperform the multivariate t distribution. The multivariate t distribution is widely used as an alternative to the Normal distribution not only in the literature but also in practice due to its thick tails and non-zero tail dependence. We observe that all t-statistics in that row are positive and larger than 18, indicating strong support in favor of the copula-based models. This outperformance is also achieved when the GARCH-DCC model using daily data is used (see the right half of the bottom row of the lower panel).
Next we consider model comparisons for the volatility models, to see whether a covariance matrix model that exploits high frequency data provides a better fit than one based only on daily data. The diagonal elements of the left half of the lower panel present these results, and in all cases we find that the model based on high frequency data significantly out-performs the corresponding model based on lower-frequency data. In fact, all t-statistics in the left half of the lower panel are positive and significant, indicating that the worst high frequency model is better than the best daily model. This is strong evidence of the gains from using high frequency data for capturing dynamics in conditional covariances.
Finally, we identify the best-fitting model of all twelve models considered here. The fact that all t-statistics in Table 8 are positive indicates that the first model listed in the top row is the best, and that is the model based on the jointly symmetric t copula. This model significantly beats all alternative models. (The second-best model is based on the jointly symmetric Clayton copula.) In Figure 3 we present the model-implied conditional correlation and the 1% quantile dependence, a measure of lower-tail dependence,18 for one pair of assets in our sample, Citigroup and Goldman Sachs, using the best model. The plot shows that the correlation between this pair ranges from 0.25 to around 0.75 over this sample period. The lower-tail dependence implied by the jointly symmetric t copula ranges from 0.02 to 0.34, with the latter indicating very strong lower-tail dependence.
We next investigate the out-of-sample (OOS) forecasting performance of the competing models. We use the period from January 2006 to December 2010 $$ \left( R=1259\right) $$ as the in-sample period, and January 2011 to December 2012 $$ \left( P=502\right) $$ as the out-of-sample period. We employ a rolling window estimation scheme, re-estimating the model each day in the OOS period. We use the Giacomini and White (2006) test to compare models based on their OOS composite likelihood. The implementation of these tests is analogous to the Rivers and Vuong test described above. We note here that the Giacomini and White test punishes complicated models that provide a good (in-sample) fit but are subject to a lot of estimation error. This feature is particularly relevant for comparisons of our copula-based approaches, which have 104 extra parameters for the marginal distribution models, with the multivariate t distribution, which imposes that all marginal distributions and the copula have the same degrees of freedom parameter.19
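The rolling window scheme can be sketched as an index generator. This is illustrative only: it assumes a fixed window of R observations that moves forward one day at a time, and the function name is hypothetical.

```python
def rolling_windows(T, R):
    """Yield (start, end, target): the model is estimated on observations
    start, ..., end - 1 (a window of length R) and evaluated on day `target`."""
    for t in range(R, T):
        yield t - R, t, t

# Sample sizes from the text: T = 1761 days, R = 1259 in-sample days.
windows = list(rolling_windows(T=1761, R=1259))
P = len(windows)  # number of out-of-sample days
```

With these sample sizes the generator produces one window per out-of-sample day, which matches the reported P=502 since R+P=T.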
Table 9 presents t-statistics from these pair-wise OOS model comparison tests, with the same format as Table 8. The OOS results are broadly similar to the in-sample results, though with somewhat lower power. We again find that the multivariate t distribution is significantly beaten by all competing copula-based approaches, providing further support for the models proposed in this paper. We also again find strong support for the use of high frequency data for the covariance matrix model, with the HAR-type models outperforming the daily GARCH-DCC models.
Comparing the independence copula with the jointly symmetric copulas we again find that the independence copula is significantly beaten, providing evidence for the out-of-sample importance of modeling dependence beyond linear correlation. One difference in Table 9 relative to Table 8 is in the significance of the difference in performance between the four jointly symmetric copulas: we find that the jointly symmetric Gumbel copula is significantly beaten by the t and the Clayton, but neither of these latter two significantly beats the other, nor the Frank copula. The jointly symmetric t remains the model with the best performance, but it is not significantly better than the jointly symmetric Clayton or Frank models out of sample.
This paper proposes a new general model for high-dimensional distributions of asset returns that utilizes mixed frequency data and copulas. We decompose dependence into linear and nonlinear components, and exploit recent advances in the analysis of high frequency data to obtain more accurate models for linear dependence, as measured by the covariance matrix, and propose a new class of copulas to capture the remaining dependence in the low frequency standardized residuals. By assigning two different tasks to high frequency data and copulas, we obtain significantly improved models for joint distributions. Our approach for obtaining jointly symmetric copulas generates a rich set of models for studying the dependence of uncorrelated, but dependent, variables. The evaluation of the density of our jointly symmetric copulas turns out to be computationally difficult in high dimensions, but we show that composite likelihood methods may be used to estimate the parameters of the model and undertake model selection tests.
We employ our proposed models to study daily return distributions of 104 U.S. equities over the period 2006 to 2012. We find that our proposed models significantly outperform existing alternatives both in-sample and out-of-sample. The improvement in performance can be attributed to three main sources. Firstly, the use of a copula-based approach allows for the use of heterogeneous marginal distributions, relaxing a constraint of the familiar multivariate t distribution. Secondly, the use of copula models that allow for dependence beyond linear correlation, which relaxes a constraint of the Normal copula, leads to significant gains in fit. Finally, consistent with a large extant literature, we find that linear dependence, as measured by the covariance matrix, can be more accurately modelled by using high frequency data than using daily data alone.
The following two lemmas are needed to prove Lemma 2.
$$ \mathbf{F}\left( a_{1}+x_{1},..,a_{i}+x_{i},..,a_{N}+x_{N}\right) =\mathbf{F}\left( a_{1}+x_{1},..,\infty ,..,a_{N}+x_{N}\right) -\mathbf{F}\left( a_{1}+x_{1},..,a_{i}-x_{i},..,a_{N}+x_{N}\right) ~\forall i \tag{31} $$
$$ \Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}-a_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] =\Pr \left[ X_{1}-a_{1}\leq x_{1},..,a_{i}-X_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] \tag{32} $$
$$ \begin{aligned} &\Pr \left[ X_{1}-a_{1}\leq x_{1},..,a_{i}-X_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] \\ &\quad =\Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}\leq \infty ,..,X_{N}-a_{N}\leq x_{N}\right] -\Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}\leq a_{i}-x_{i},..,X_{N}-a_{N}\leq x_{N}\right] \\ &\quad =\mathbf{F}\left( a_{1}+x_{1},..,\infty ,..,a_{N}+x_{N}\right) -\mathbf{F}\left( a_{1}+x_{1},..,a_{i}-x_{i},..,a_{N}+x_{N}\right) \end{aligned} \tag{33} $$
$$ \Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}-a_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] =\mathbf{F}\left( a_{1}+x_{1},..,a_{i}+x_{i},..,a_{N}+x_{N}\right) $$
$$ \left( \Leftarrow \right) $$ Equation (31) can be written as
$$ \begin{aligned} &\Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}-a_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] \\ &\quad =\Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}\leq \infty ,..,X_{N}-a_{N}\leq x_{N}\right] -\Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}\leq a_{i}-x_{i},..,X_{N}-a_{N}\leq x_{N}\right] ~\forall i \end{aligned} $$
$$ \Pr \left[ X_{1}-a_{1}\leq x_{1},..,X_{i}-a_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] =\Pr \left[ X_{1}-a_{1}\leq x_{1},..,a_{i}-X_{i}\leq x_{i},..,X_{N}-a_{N}\leq x_{N}\right] ~\forall i $$
Equation (31) provides a definition of joint symmetry for general CDFs. The corresponding definition for copulas is given below.
$$ \mathbf{C}\left( u_{1},..,u_{i},..,u_{N}\right) =\mathbf{C}\left( u_{1},..,1,..,u_{N}\right) -\mathbf{C}\left( u_{1},..,1-u_{i},..,u_{N}\right) ~\forall ~i \tag{34} $$
$$ \begin{aligned} &\mathbf{C}\left( F_{1}\left( a_{1}+x_{1}\right) ,..,F_{i}\left( a_{i}+x_{i}\right) ,..,F_{N}\left( a_{N}+x_{N}\right) \right) \\ &\quad =\mathbf{C}\left( F_{1}\left( a_{1}+x_{1}\right) ,..,1,..,F_{N}\left( a_{N}+x_{N}\right) \right) -\mathbf{C}\left( F_{1}\left( a_{1}+x_{1}\right) ,..,F_{i}\left( a_{i}-x_{i}\right) ,..,F_{N}\left( a_{N}+x_{N}\right) \right) ~\forall i \end{aligned} $$
$$ \mathbf{C}\left( u_{1},..,u_{i},..,u_{N}\right) =\mathbf{C}\left( u_{1},..,1,..,u_{N}\right) -\mathbf{C}\left( u_{1},..,1-u_{i},..,u_{N}\right) ~\forall i $$
$$ \left( \Leftarrow \right) $$ Following the reverse steps to above, equation (34) becomes equation (31), and the proof is done by Lemma 3.
$$ \mathbf{C}^{JS}\left( u_{1},..,u_{i},..,u_{N}\right) =\mathbf{C}^{JS}\left( u_{1},..,1,..,u_{N}\right) -\mathbf{C}^{JS}\left( u_{1},..,1-u_{i},..,u_{N}\right) ~\forall i $$
$$ \mathbf{C}^{JS}\left( u_{1},..,u_{N}\right) =\frac{1}{2^{N}}\left[ \mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},u_{N}\right) -\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1-u_{N}\right) +\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1\right) \right] $$
where $$ \mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},u_{N}\right) =\sum_{k_{1}=0}^{2}\cdots \sum_{k_{N-1}=0}^{2}\left( -1\right) ^{R_{\left( -N\right) }}\cdot \mathbf{C}\left( \widetilde{u}_{1},..,\widetilde{u}_{N-1},u_{N}\right) $$, $$ R_{\left( -N\right) }\equiv \sum_{i=1}^{N-1}1\left\{ k_{i}=2\right\} $$ and $$ \widetilde{u}_{i}=\begin{cases} 1 & \text{for } k_{i}=0 \\ u_{i} & \text{for } k_{i}=1 \\ 1-u_{i} & \text{for } k_{i}=2 \end{cases} $$
$$ \begin{aligned} &\mathbf{C}^{JS}\left( u_{1},..,u_{N-1},1\right) -\mathbf{C}^{JS}\left( u_{1},..,u_{N-1},1-u_{N}\right) \\ &\quad =\frac{1}{2^{N}}\left[ \mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1\right) -\underset{=0}{\underbrace{\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},0\right) }}+\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1\right) \right] \\ &\qquad -\frac{1}{2^{N}}\left[ \mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1-u_{N}\right) -\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},u_{N}\right) +\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1\right) \right] \\ &\quad =\frac{1}{2^{N}}\left[ \mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},u_{N}\right) -\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1-u_{N}\right) +\mathbf{C}_{\left( -N\right) }\left( u_{1},..,u_{N-1},1\right) \right] \\ &\quad =\mathbf{C}^{JS}\left( u_{1},..,u_{N}\right) \end{aligned} $$
$$ \mathbf{F}\left( a+x_{1},..,a+x_{i},..,a+x_{N}\right) =\mathbf{F}\left( a+x_{1},..,\infty ,..,a+x_{N}\right) -\mathbf{F}\left( a+x_{1},..,a-x_{i},..,a+x_{N}\right) ~\forall i $$
$$ \begin{aligned} \mathbf{G}\left( a+x_{1},..,a+x_{i},..,a+x_{N}\right) &=\sum\limits_{s=1}^{S}\omega _{s}\mathbf{F}_{s}\left( a+x_{1},..,a+x_{i},..,a+x_{N}\right) \\ &=\sum\limits_{s=1}^{S}\omega _{s}\mathbf{F}_{s}\left( a+x_{1},..,\infty ,..,a+x_{N}\right) -\sum\limits_{s=1}^{S}\omega _{s}\mathbf{F}_{s}\left( a+x_{1},..,a-x_{i},..,a+x_{N}\right) ~\forall ~i \\ &\equiv \mathbf{G}\left( a+x_{1},..,\infty ,..,a+x_{N}\right) -\mathbf{G}\left( a+x_{1},..,a-x_{i},..,a+x_{N}\right) ~\forall ~i \end{aligned} $$
To prove Theorem 2, we need the following lemma. Below we use M to denote the number of daily observations for the DCC model, and the total number of intra-daily observations for the HAR-type model.
$$ \alpha _{1}\mathbf{y}_{1}+...+\alpha _{M}\mathbf{y}_{M}=\mathbf{x}. $$
Premultiplying by $$ \mathbf{x}^{\prime } $$ gives $$ \alpha _{1}\mathbf{x}^{\prime }\mathbf{y}_{1}+...+\alpha _{M}\mathbf{x}^{\prime }\mathbf{y}_{M}=\mathbf{x}^{\prime }\mathbf{x}. $$
$$ \widehat{RCorr_{t}^{\Delta }}=\overline{RCorr_{T}^{\Delta }}\left( 1-\hat{a}-\hat{b}-\hat{c}\right) +\hat{a}\cdot RCorr_{t-1}^{\Delta }+\hat{b}\cdot \frac{1}{4}\sum_{k=2}^{5}RCorr_{t-k}^{\Delta }+\hat{c}\cdot \frac{1}{15}\sum_{k=6}^{20}RCorr_{t-k}^{\Delta } $$
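The fitted HAR-type correlation equation above can be evaluated as in the following sketch. The function name and the array layout (history ordered from oldest to most recent, ending at day t-1) are assumptions made for illustration.

```python
import numpy as np

def har_corr_forecast(rcorr, rcorr_bar, a, b, c):
    """One-step fitted value of the HAR-type correlation model.

    rcorr     : history of realized correlations up to day t-1, most recent
                last; shape (T,) for one pair or (T, N, N) for matrices
    rcorr_bar : sample average of the realized correlations
    a, b, c   : coefficients on the daily, weekly and monthly terms
    """
    rcorr = np.asarray(rcorr, dtype=float)
    if rcorr.shape[0] < 20:
        raise ValueError("need at least 20 past observations")
    day = rcorr[-1]                       # lag 1
    week = rcorr[-5:-1].mean(axis=0)      # average of lags 2 to 5
    month = rcorr[-20:-5].mean(axis=0)    # average of lags 6 to 20
    return rcorr_bar * (1.0 - a - b - c) + a * day + b * week + c * month

# Coefficient estimates reported in Panel A of Table 5; the flat history
# of 0.4 is a made-up input.
fitted = har_corr_forecast(np.full(25, 0.4), 0.4, 0.1224, 0.3156, 0.3778)
```

Because the weekly and monthly terms use non-overlapping lags, the four weights on the right-hand side sum to one, so a history that is constant at the sample average reproduces that average.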
$$ \mathbf{C}_{ij}\left( u_{i},u_{j}\right) =\mathbf{C}\left( 1,...,1,u_{i},u_{j},1,...,1\right) \tag{35} $$
$$ \begin{aligned} \mathbf{C}_{ij}^{JS}\left( u_{i},u_{j}\right) &=\mathbf{C}^{JS}\left( 1,...,1,u_{i},u_{j},1,...,1\right) \\ &=\frac{1}{2^{N}}\sum_{k_{1}=0}^{2}\cdots \sum_{k_{N}=0}^{2}\left( -1\right) ^{R}\cdot \mathbf{C}\left( \widetilde{u}_{1},..,\widetilde{u}_{N}\right) \\ &=\frac{1}{2^{N}}\sum_{k_{1}=0}^{1}\cdots \sum_{k_{i}=0}^{2}\sum_{k_{j}=0}^{2}\cdots \sum_{k_{N}=0}^{1}\left( -1\right) ^{R}\cdot \mathbf{C}\left( \widetilde{u}_{1},..,\widetilde{u}_{N}\right) \end{aligned} $$
$$ \mathbf{C}_{ij}^{JS}\left( u_{i},u_{j}\right) =\frac{1}{2^{N}}2^{N-2}\sum_{k_{i}=0}^{2}\sum_{k_{j}=0}^{2}\left( -1\right) ^{R}\cdot \mathbf{C}_{ij}\left( \widetilde{u}_{i},\widetilde{u}_{j}\right) =\frac{1}{4}\sum_{k_{i}=0}^{2}\sum_{k_{j}=0}^{2}\left( -1\right) ^{R}\cdot \mathbf{C}_{ij}\left( \widetilde{u}_{i},\widetilde{u}_{j}\right) $$
$$ \mathbf{C}_{ij}^{JS}\left( u_{i},u_{j}\right) =\frac{1}{4}\left\{ 2u_{i}+2u_{j}-1+\mathbf{C}_{ij}\left( u_{i},u_{j}\right) -\mathbf{C}_{ij}\left( u_{i},1-u_{j}\right) -\mathbf{C}_{ij}\left( 1-u_{i},u_{j}\right) +\mathbf{C}_{ij}\left( 1-u_{i},1-u_{j}\right) \right\} , $$
$$ \mathbf{c}_{ij}^{JS}\left( u_{i},u_{j}\right) \equiv \frac{\partial ^{2}\mathbf{C}_{ij}^{JS}\left( u_{i},u_{j}\right) }{\partial u_{i}\partial u_{j}}=\frac{1}{4}\left\{ \mathbf{c}_{ij}\left( u_{i},u_{j}\right) +\mathbf{c}_{ij}\left( u_{i},1-u_{j}\right) +\mathbf{c}_{ij}\left( 1-u_{i},u_{j}\right) +\mathbf{c}_{ij}\left( 1-u_{i},1-u_{j}\right) \right\} $$
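The bivariate formulas above can be illustrated with the Clayton copula as the base copula. The sketch below uses the standard bivariate Clayton CDF and density for $$ \theta >0 $$; the function names are illustrative, and arguments are assumed to lie strictly inside the unit interval.

```python
import numpy as np

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula CDF, theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def clayton_density(u, v, theta):
    """Bivariate Clayton copula density, theta > 0."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-(1.0 + theta)) * s ** (-(2.0 + 1.0 / theta))

def js_cdf(u, v, theta):
    """Jointly symmetric bivariate CDF built from the Clayton copula."""
    return 0.25 * (2.0 * u + 2.0 * v - 1.0
                   + clayton_cdf(u, v, theta)
                   - clayton_cdf(u, 1.0 - v, theta)
                   - clayton_cdf(1.0 - u, v, theta)
                   + clayton_cdf(1.0 - u, 1.0 - v, theta))

def js_density(u, v, theta):
    """Jointly symmetric bivariate density: equal-weight average of the
    base density evaluated at the four reflections of (u, v)."""
    return 0.25 * (clayton_density(u, v, theta)
                   + clayton_density(u, 1.0 - v, theta)
                   + clayton_density(1.0 - u, v, theta)
                   + clayton_density(1.0 - u, 1.0 - v, theta))
```

By construction `js_density(u, v, theta)` is unchanged when `u` is replaced by `1 - u` or `v` by `1 - v`, and `js_cdf` satisfies the bivariate version of equation (34), namely `js_cdf(u, v) + js_cdf(u, 1 - v) = u`. Only four evaluations of the base density are needed per pair, which is what makes pairwise composite likelihood cheap.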
$$ \begin{aligned} \sum_{i=1}^{N-1}E_{\mathbf{g}\left( \mathbf{z}\right) }\left[ \log \frac{\mathbf{h}_{i}\left( Z_{i},Z_{i+1}\right) }{\mathbf{g}_{i}\left( Z_{i},Z_{i+1}\right) }\right] &\leq \sum_{i=1}^{N-1}\left( E_{\mathbf{g}\left( \mathbf{z}\right) }\left[ \frac{\mathbf{h}_{i}\left( Z_{i},Z_{i+1}\right) }{\mathbf{g}_{i}\left( Z_{i},Z_{i+1}\right) }\right] -1\right) \\ &=\sum_{i=1}^{N-1}\left( E_{\mathbf{g}_{i}\left( z_{i},z_{i+1}\right) }\left[ \frac{\mathbf{h}_{i}\left( Z_{i},Z_{i+1}\right) }{\mathbf{g}_{i}\left( Z_{i},Z_{i+1}\right) }\right] -1\right) \\ &\equiv \sum_{i=1}^{N-1}\left( \int \mathbf{g}_{i}\left( z_{i},z_{i+1}\right) \frac{\mathbf{h}_{i}\left( z_{i},z_{i+1}\right) }{\mathbf{g}_{i}\left( z_{i},z_{i+1}\right) }dz_{i}dz_{i+1}-1\right) =0 \end{aligned} $$
Table 1: Computation times for jointly symmetric copulas
Note: Computation times for one evaluation of the density of the jointly symmetric copula based on the Clayton copula. These times are based on actual computation times for a single evaluation of an N-dimensional Clayton copula, multiplied by the number of rotations required to obtain the jointly symmetric copula likelihood $$ \left( 2^{N}\right) $$ or the composite likelihood based on all pairs $$ \left( 2N\left( N-1\right) \right) $$, adjacent pairs $$ \left( 4\left( N-1\right) \right) $$, or a single pair $$ \left( 4\right) $$.
Method | N=10 | N=20 | N=30 | N=50 | N=100 |
---|---|---|---|---|---|
Full likelihood | 0.23 sec | 4 min | 70 hours | $$ 10^{6}$$ years | $$ 10^{17}$$ years |
Composite likelihood using all pairs | 0.05 sec | 0.21 sec | 0.45 sec | 1.52 sec | 5.52 sec |
Composite likelihood using adjacent pairs | 0.01 sec | 0.02 sec | 0.03 sec | 0.06 sec | 0.11 sec |
Composite likelihood using first pair | 0.001 sec | 0.001 sec | 0.001 sec | 0.001 sec | 0.001 sec |
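The evaluation counts quoted in the note to Table 1 can be reproduced with a few lines. The function name is illustrative; the counts follow the formulas in the note (four rotations per pair, and $$ N\left( N-1\right) /2 $$ pairs in total).

```python
def density_evaluations(N):
    """Number of base-copula density evaluations per observation."""
    return {
        "full likelihood": 2 ** N,
        "all pairs": 2 * N * (N - 1),      # 4 rotations x N(N-1)/2 pairs
        "adjacent pairs": 4 * (N - 1),     # 4 rotations x (N-1) pairs
        "first pair": 4,
    }

counts = {N: density_evaluations(N) for N in (10, 20, 30, 50, 100)}
```

The full likelihood grows exponentially in N, whereas the composite likelihoods grow at most quadratically, which explains the gap between the first row of Table 1 and the remaining rows.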
Table 2: Simulation results for a jointly symmetric copula based on the Clayton copula
Note: This table presents the results from 500 simulations of a jointly symmetric copula based on the Clayton copula with true parameter 1. The sample size is T=1000. Four different estimation methods are used: MLE, MCLE with all pairs, MCLE with adjacent pairs, and MCLE with the first pair. MLE is infeasible for N>10 and so no results are reported in those cases. The first four columns report the average difference between the estimated parameter and its true value. The next four columns report the standard deviation of the estimated parameters. The last four columns present the average run time of each estimation method. The reported run times for MLE for N>10 are based on actual single function evaluation times and on an assumption of 40 function evaluations to reach the optimum.
N | Bias MLE | Bias $$ \underset{all}{MCLE}$$ | Bias $$ \underset{adj}{MCLE}$$ | Bias $$ \underset{first}{MCLE}$$ | Std dev MLE | Std dev $$ \underset{all}{MCLE}$$ | Std dev $$ \underset{adj}{MCLE}$$ | Std dev $$ \underset{first}{MCLE}$$ | Average Run Time (in sec) MLE | Average Run Time (in sec) $$ \underset{all}{MCLE}$$ | Average Run Time (in sec) $$ \underset{adj}{MCLE}$$ | Average Run Time (in sec) $$ \underset{first}{MCLE}$$ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | -0.0027 | -0.0027 | -0.0027 | -0.0027 | 0.1176 | 0.1176 | 0.1176 | 0.1176 | 0.12 | 0.12 | 0.12 | 0.12 |
3 | -0.0019 | -0.0028 | -0.0031 | -0.0027 | 0.0798 | 0.0839 | 0.0917 | 0.1176 | 0.42 | 0.50 | 0.24 | 0.12 |
5 | -0.0014 | -0.0022 | -0.0016 | -0.0027 | 0.0497 | 0.0591 | 0.0713 | 0.1176 | 1.96 | 1.49 | 0.43 | 0.12 |
10 | -0.0051 | -0.0047 | -0.0039 | -0.0027 | 0.0293 | 0.0402 | 0.0495 | 0.1176 | 116 | 7 | 1 | 0.12 |
20 | | -0.0018 | -0.0021 | -0.0027 | | 0.0365 | 0.0405 | 0.1176 | $$ 2\times 10^{5}$$ | 27 | 2 | 0.12 |
30 | | -0.0036 | -0.0037 | -0.0027 | | 0.0336 | 0.0379 | 0.1176 | $$ 3\times 10^{8}$$ | 63 | 3 | 0.12 |
40 | | -0.0028 | -0.0037 | -0.0027 | | 0.0311 | 0.0341 | 0.1176 | $$ 4\times 10^{11}$$ | 117 | 5 | 0.12 |
50 | | -0.0011 | -0.0014 | -0.0027 | | 0.0298 | 0.0329 | 0.1176 | $$ 5\times 10^{14}$$ | 192 | 6 | 0.12 |
60 | | -0.0007 | -0.0006 | -0.0027 | | 0.0314 | 0.0332 | 0.1176 | $$ 7\times 10^{17}$$ | 256 | 7 | 0.12 |
70 | | -0.0013 | -0.0013 | -0.0027 | | 0.0306 | 0.0324 | 0.1176 | $$ 8\times 10^{20}$$ | 364 | 8 | 0.12 |
80 | | -0.0039 | -0.0041 | -0.0027 | | 0.0309 | 0.0332 | 0.1176 | $$ 9\times 10^{23}$$ | 471 | 9 | 0.12 |
90 | | 0.0012 | 0.0013 | -0.0027 | | 0.0312 | 0.0328 | 0.1176 | $$ 9\times 10^{26}$$ | 611 | 11 | 0.12 |
100 | | -0.0006 | -0.0003 | -0.0027 | | 0.0290 | 0.0305 | 0.1176 | $$ 1\times 10^{30}$$ | 748 | 12 | 0.12 |
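A small-scale version of this simulation design can be sketched as follows. Several choices here are assumptions made for illustration rather than the procedure used for Table 2: the Clayton copula is simulated with the Marshall-Olkin (gamma frailty) algorithm, the jointly symmetric copula is simulated by reflecting each coordinate independently with probability one half (a draw from the equal-weight mixture of all rotations), and the composite likelihood over adjacent pairs is maximized by a grid search instead of a numerical optimizer. All function names are hypothetical.

```python
import numpy as np

def simulate_js_clayton(T, N, theta, rng):
    """Draw T observations from an N-dimensional jointly symmetric Clayton copula."""
    V = rng.gamma(shape=1.0 / theta, scale=1.0, size=(T, 1))   # frailty
    E = rng.exponential(scale=1.0, size=(T, N))
    U = (1.0 + E / V) ** (-1.0 / theta)                        # Clayton draw
    flip = rng.random((T, N)) < 0.5                            # random reflections
    return np.where(flip, 1.0 - U, U)

def clayton_density(u, v, theta):
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-(1.0 + theta)) * s ** (-(2.0 + 1.0 / theta))

def js_pair_loglik(u, v, theta, eps=1e-10):
    """Log-likelihood of one pair under the bivariate jointly symmetric density."""
    u = np.clip(u, eps, 1.0 - eps)   # guard against values at the boundary
    v = np.clip(v, eps, 1.0 - eps)
    dens = 0.25 * (clayton_density(u, v, theta)
                   + clayton_density(u, 1.0 - v, theta)
                   + clayton_density(1.0 - u, v, theta)
                   + clayton_density(1.0 - u, 1.0 - v, theta))
    return np.log(dens).sum()

def mcle_adjacent(U, grid):
    """Maximum composite likelihood estimate using adjacent pairs only."""
    N = U.shape[1]
    values = [sum(js_pair_loglik(U[:, i], U[:, i + 1], th) for i in range(N - 1))
              for th in grid]
    return float(grid[int(np.argmax(values))])

rng = np.random.default_rng(0)
U = simulate_js_clayton(T=1000, N=5, theta=1.0, rng=rng)
theta_hat = mcle_adjacent(U, np.linspace(0.2, 3.0, 57))
```

The grid search keeps the example free of external optimizers; its resolution (0.05 here) limits the precision of `theta_hat`, so it is not suitable for reproducing the bias and standard deviation figures in Table 2.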
Table 3: Simulation results for multi-stage estimation
Note: This table presents the results from 500 simulations of multi-stage estimation of the model described in Section 3.3. The sample size is T=1000 and cross-sectional dimensions are N=10, 50, and 100. The first row of each panel presents the average difference between the estimated parameter and its true value. The second row presents the standard deviation in the estimated parameters. The third, fourth and fifth rows present the 50th, 90th and 10th percentiles of the distribution of estimated parameters, and the final row presents the difference between the 90th and 10th percentiles.
Time per replication: 54 sec for N=10, 138 sec for N=50, and 329 sec for N=100.
Statistic | Variance Const $$ \psi _{i}$$ | Variance ARCH $$ \kappa _{i}$$ | Variance GARCH $$ \lambda _{i}$$ | Correlation DCC $$ \alpha $$ | Correlation DCC $$ \beta $$ | Marginal t dist $$ \nu _{i}$$ | Copula JS Clayton $$ \varphi $$ |
---|---|---|---|---|---|---|---|
True value | 0.05 | 0.10 | 0.85 | 0.02 | 0.95 | 6.00 | 1.00 |
N=10 Bias | 0.0123 | 0.0007 | -0.0162 | -0.0012 | -0.0081 | 0.1926 | -0.0122 |
N=10 Std | 0.0442 | 0.0387 | 0.0717 | 0.0060 | 0.0277 | 1.1023 | 0.0650 |
N=10 Median | 0.0536 | 0.0959 | 0.8448 | 0.0184 | 0.9459 | 5.9837 | 0.9920 |
N=10 90% | 0.1027 | 0.1478 | 0.9015 | 0.0263 | 0.9631 | 7.5215 | 1.0535 |
N=10 10% | 0.0271 | 0.0580 | 0.7619 | 0.0119 | 0.9196 | 5.0559 | 0.9165 |
N=10 90-10 Diff | 0.0756 | 0.0898 | 0.1397 | 0.0144 | 0.0435 | 2.4656 | 0.1370 |
N=50 Bias | 0.0114 | 0.0012 | -0.0149 | -0.0018 | -0.0051 | 0.1880 | -0.0136 |
N=50 Std | 0.0411 | 0.0412 | 0.0687 | 0.0040 | 0.0111 | 1.0936 | 0.0390 |
N=50 Median | 0.0529 | 0.0958 | 0.8454 | 0.0179 | 0.9458 | 6.0000 | 0.9880 |
N=50 90% | 0.1019 | 0.1499 | 0.9025 | 0.0234 | 0.9580 | 7.5223 | 1.0312 |
N=50 10% | 0.0268 | 0.0567 | 0.7615 | 0.0135 | 0.9313 | 5.0454 | 0.9413 |
N=50 90-10 Diff | 0.0751 | 0.0931 | 0.1410 | 0.0098 | 0.0267 | 2.4769 | 0.0899 |
N=100 Bias | 0.0119 | 0.0017 | -0.0158 | -0.0020 | -0.0041 | 0.1813 | -0.0133 |
N=100 Std | 0.0419 | 0.0404 | 0.0691 | 0.0034 | 0.0094 | 1.0748 | 0.0362 |
N=100 Median | 0.0533 | 0.0966 | 0.8440 | 0.0177 | 0.9467 | 6.0002 | 0.9886 |
N=100 90% | 0.1025 | 0.1504 | 0.9022 | 0.0223 | 0.9566 | 7.4963 | 1.0244 |
N=100 10% | 0.0270 | 0.0576 | 0.7607 | 0.0139 | 0.9337 | 5.0492 | 0.9432 |
N=100 90-10 Diff | 0.0756 | 0.0928 | 0.1415 | 0.0084 | 0.0229 | 2.4471 | 0.0811 |
Table 4: Summary statistics and conditional mean estimates
Note: Panel A presents summary statistics on the daily equity returns used in the empirical analysis. The columns present the mean and quantiles from the cross-sectional distribution of the measures listed in the rows. Panel B presents the parameter estimates for AR(1) models of the conditional means of returns. Panel C shows the number of rejections, at the 5% level, for tests of zero skewness, zero excess kurtosis, and zero cross-correlation for the 104 stocks. (The total number of pairs of stocks is 5356.)
Panel A: Summary statistics
Statistic | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
Mean | 0.0002 | -0.0006 | 0.0001 | 0.0002 | 0.0004 | 0.0006 |
Std dev | 0.0219 | 0.0120 | 0.0159 | 0.0207 | 0.0257 | 0.0378 |
Skewness | -0.0693 | -0.6594 | -0.3167 | -0.0318 | 0.1823 | 0.5642 |
Kurtosis | 11.8559 | 6.9198 | 8.4657 | 10.4976 | 13.3951 | 20.0200 |
Corr | 0.4666 | 0.3294 | 0.4005 | 0.4580 | 0.5230 | 0.6335 |
Panel B: Conditional mean
Parameter | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
Constant | 0.0002 | -0.0006 | 0.0000 | 0.0002 | 0.0004 | 0.0006 |
AR(1) | -0.0535 | -0.1331 | -0.0794 | -0.0553 | -0.0250 | 0.0105 |
Panel C: Test for skewness, kurtosis, and correlation
Null hypothesis | # of rejections |
---|---|
$$ H_{0}:Skew\left[ r_{it}\right] =0$$ | 3 out of 104 |
$$ H_{0}:Kurt\left[ r_{it}\right] =3$$ | 104 out of 104 |
$$ H_{0}:Corr\left[ r_{it},r_{jt}\right] =0$$ | 5356 out of 5356 |
Table 5: Conditional covariance model parameter estimates
Note: Panel A presents summaries of the estimated HAR-type models described in Section 2.2 using 5-minute returns. Panel B presents summaries of the estimated GJR-GARCH-DCC models using daily returns. The parameter estimates for the variance models are summarized by the mean and quantiles of the cross-sectional distributions of the estimates. The estimates for the correlation models are reported with bootstrap standard errors, which reflect accumulated estimation error from the earlier stages.
Panel A: HAR-type models based on 5-min returns
Variance models | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
Constant $$ \phi _{i}^{\left( const\right) }$$ | -0.0019 | -0.0795 | -0.0375 | -0.0092 | 0.0207 | 0.1016 |
HAR day $$ \phi _{i}^{\left( day\right) }$$ | 0.3767 | 0.3196 | 0.3513 | 0.3766 | 0.3980 | 0.4414 |
HAR week $$ \phi _{i}^{\left( week\right) }$$ | 0.3105 | 0.2296 | 0.2766 | 0.3075 | 0.3473 | 0.3896 |
HAR month $$ \phi _{i}^{\left( month\right) }$$ | 0.2190 | 0.1611 | 0.1959 | 0.2146 | 0.2376 | 0.2962 |
Correlation model | Est | Std Err |
---|---|---|
HAR day $$ \left( a\right) $$ | 0.1224 | 0.0079 |
HAR week $$ \left( b\right) $$ | 0.3156 | 0.0199 |
HAR month $$ \left( c\right) $$ | 0.3778 | 0.0326 |
Panel B: DCC models based on daily returns
Variance models | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
Constant $$ \psi _{i}\times 10^{4}$$ | 0.0864 | 0.0190 | 0.0346 | 0.0522 | 0.0811 | 0.2781 |
ARCH $$ \kappa _{i}$$ | 0.0252 | 0.0000 | 0.0079 | 0.0196 | 0.0302 | 0.0738 |
Asym ARCH $$ \zeta _{i}$$ | 0.0840 | 0.0298 | 0.0570 | 0.0770 | 0.1015 | 0.1535 |
GARCH $$ \lambda _{i}$$ | 0.9113 | 0.8399 | 0.9013 | 0.9228 | 0.9363 | 0.9573 |
Correlation model | Est | Std Err |
---|---|---|
DCC ARCH $$ \left( \alpha \right) $$ | 0.0245 | 0.0055 |
DCC GARCH $$ \left( \beta \right) $$ | 0.9541 | 0.0119 |
Table 6: Summary statistics and marginal distributions for the standardized residuals
Note: Panel A presents summary statistics of the uncorrelated standardized residuals obtained from the HAR-type model, and Panel B presents corresponding results based on the GARCH-DCC model. Panel C presents the estimates of the parameters for the marginal distribution of standardized residuals, obtained from the two volatility models. Panel D reports the number of rejections, at the 5% level, for tests of zero skewness, zero excess kurtosis, and zero cross-correlation.
Panel A: HAR standardized residuals
Statistic | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
Mean | 0.0023 | -0.0122 | -0.0042 | 0.0016 | 0.0076 | 0.0214 |
Std dev | 1.0921 | 0.9647 | 1.0205 | 1.0822 | 1.1423 | 1.2944 |
Skewness | -0.1613 | -1.5828 | -0.4682 | -0.0837 | 0.3420 | 0.7245 |
Kurtosis | 13.1220 | 5.0578 | 6.8422 | 9.8681 | 16.0303 | 32.7210 |
Correlation | 0.0026 | -0.0445 | -0.0167 | 0.0020 | 0.0209 | 0.0502 |
Panel B: GARCH-DCC standardized residuals
Statistic | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
Mean | 0.0007 | -0.0155 | -0.0071 | 0.0004 | 0.0083 | 0.0208 |
Std dev | 1.1871 | 1.1560 | 1.1737 | 1.1859 | 1.2002 | 1.2240 |
Skewness | -0.1737 | -1.4344 | -0.5293 | -0.0307 | 0.2628 | 0.7920 |
Kurtosis | 12.6920 | 5.0815 | 6.7514 | 10.1619 | 15.9325 | 28.8275 |
Correlation | -0.0011 | -0.0172 | -0.0073 | -0.0008 | 0.0053 | 0.0145 |
Panel C: Marginal t distribution parameter estimates
Volatility model | Mean | 5% | 25% | Median | 75% | 95% |
---|---|---|---|---|---|---|
HAR | 5.3033 | 4.1233 | 4.7454 | 5.1215 | 5.8684 | 6.8778 |
DCC | 6.0365 | 4.2280 | 5.0314 | 5.9042 | 7.0274 | 8.2823 |
Panel D: Test for skewness, kurtosis, and correlation
Null hypothesis | # of rejections HAR | # of rejections DCC |
---|---|---|
$$ H_{0}:Skew\left[ e_{it}\right] =0$$ | 4 out of 104 | 6 out of 104 |
$$ H_{0}:Kurt\left[ e_{it}\right] =3$$ | 104 out of 104 | 104 out of 104 |
$$ H_{0}:Corr\left[ e_{it},e_{jt}\right] =0$$ | 497 out of 5356 | 1 out of 5356 |
Table 7: Estimation results for the copula models
Note: This table presents the estimated parameters of four different jointly symmetric copula models based on t, Clayton, Frank, and Gumbel copulas, as well as the estimated parameter of the (standardized) multivariate t distribution as a benchmark model. The independence copula model has no parameter to estimate. Bootstrap standard errors are reported in parentheses. Also reported is the log-likelihood from the complete distribution model formed by combining the copula model with the HAR or DCC volatility model. (The MV t distribution is not based on a copula decomposition, but its joint likelihood may be compared with those from copula-based models.) The bottom row of each panel reports t-statistics for a test of no nonlinear dependence. $$ ^{* }$$ The parameter of the multivariate t distribution is not a copula parameter, but it is reported in this row for simplicity.
Volatility model / statistic | t | Clayton | Frank | Gumbel | Indep | MV t dist |
---|---|---|---|---|---|---|
HAR $$ \underset{\text{(s.e.)}}{\text{Est.}}$$ | $$ \underset{\left( 4.3541\right) }{39.4435}$$ | $$ \underset{\left( 0.0087\right) }{0.0876}$$ | $$ \underset{\left( 0.0942\right) }{1.2652}$$ | $$ \underset{\left( 0.0038\right) }{1.0198}$$ | - | $$ \underset{\left( 0.1405\right) }{6.4326^{* }}$$ |
HAR log L | -282491 | -282500 | -282512 | -282533 | -282578 | -284853 |
HAR Rank | 1 | 2 | 3 | 4 | 5 | 6 |
HAR t-test of indep | 8.45 | 10.07 | 13.43 | 5.25 | - | 45.72 |
DCC $$ \underset{\text{(s.e.)}}{\text{Est.}}$$ | $$ \underset{\left( 5.4963\right) }{28.2068}$$ | $$ \underset{\left( 0.0155\right) }{0.1139}$$ | $$ \underset{\left(0.1540\right) }{1.5996}$$ | $$ \underset{\left(0.0071\right) }{1.0312}$$ | - | $$ \underset{\left( 0.3586\right) }{7.0962^{* }}$$ |
DCC log L | -289162 | -289190 | -289217 | -289255 | -289404 | -291607 |
DCC Rank | 7 | 8 | 9 | 10 | 11 | 12 |
DCC t-test of indep | 6.13 | 7.36 | 10.36 | 4.40 | - | 17.80 |
Table 8: t-statistics from in-sample model comparison tests
Note: This table presents t-statistics from pair-wise Rivers and Vuong (2002) model comparison tests introduced in Section 3.2. A positive t-statistic indicates that the model above beat the model to the left, and a negative one indicates the opposite. tJS, CJS, FJS, and GJS stand for jointly symmetric copulas based on t, Clayton, Frank, and Gumbel copulas respectively. "Indep" is the independence copula. MV t is the multivariate Student's t distribution. The upper panel includes results for models that use 5-min data and the HAR-type covariance model introduced in Section 2.2, the lower panel includes results for models based on a GARCH-DCC covariance model. $$ ^{\ast }$$ The comparisons of jointly symmetric copula-based models with the independence copula, reported in the penultimate row of the top panel, and the right half of the penultimate row of the lower panel, are nested comparisons and the Rivers and Vuong (2002) test does not apply. The t-statistics here are the same as those in Table 7. $$ ^{* }$$ The MV t density is nested in the density based on the jointly symmetric t copula, and so strictly the Rivers and Vuong (2002) test does not apply, however it is computationally infeasible to implement the formal nested test; we report the Rivers and Vuong t-statistic here for ease of reference. $$ ^{** }$$ The MV t density and the density based on the independence copula are nested only at a single point, and we apply the Rivers and Vuong (2002) test here.
Model | HAR model tJS | HAR model CJS | HAR model FJS | HAR model GJS | HAR model Indep | HAR model MV t | GARCH-DCC model tJS | GARCH-DCC model CJS | GARCH-DCC model FJS | GARCH-DCC model GJS | GARCH-DCC model Indep |
---|---|---|---|---|---|---|---|---|---|---|---|
HAR model tJS | - | | | | | | | | | | |
HAR model CJS | 2.92 | - | | | | | | | | | |
HAR model FJS | 2.16 | 1.21 | - | | | | | | | | |
HAR model GJS | 5.38 | 6.02 | 1.75 | - | | | | | | | |
HAR model Indep* | 8.45 | 10.07 | 13.43 | 5.25 | - | | | | | | |
HAR model MV t | 19.70$$ ^{* }$$ | 19.52 | 19.45 | 19.23 | 18.40$$ ^{** }$$ | - | | | | | |
GARCH-DCC model tJS | 7.86 | 7.85 | 7.85 | 7.84 | 7.82 | 6.92 | - | | | | |
GARCH-DCC model CJS | 7.86 | 7.86 | 7.85 | 7.85 | 7.83 | 6.93 | 4.48 | - | | | |
GARCH-DCC model FJS | 7.85 | 7.85 | 7.84 | 7.83 | 7.82 | 6.91 | 2.69 | 1.27 | - | | |
GARCH-DCC model GJS | 7.88 | 7.87 | 7.87 | 7.86 | 7.84 | 6.94 | 6.74 | 7.47 | 1.74 | - | |
GARCH-DCC model Indep* | 7.90 | 7.90 | 7.90 | 7.89 | 7.87 | 6.97 | 6.13 | 7.36 | 10.36 | 4.40 | - |
GARCH-DCC model MV t | 8.95 | 8.95 | 8.94 | 8.94 | 8.92 | 8.03 | 18.50$$ ^{* }$$ | 18.11 | 17.94 | 17.60 | 15.69$$ ^{** }$$ |
Table 9: t-statistics from out-of-sample model comparison tests
Note: This table presents t-statistics from pair-wise comparisons of the out-of-sample likelihoods of competing density forecasts based on the test of Giacomini and White (2006). A positive t-statistic indicates that the model above beat the model to the left, and a negative one indicates the opposite. tJS, CJS, FJS, and GJS stand for jointly symmetric copulas based on t, Clayton, Frank, and Gumbel copulas respectively. "Indep" is the independence copula. MV t is the multivariate Student's t distribution. The upper panel includes results for models that use 5-min data and the HAR-type covariance model introduced in Section 2.2, the lower panel includes results for models based on a GARCH-DCC covariance model.
Model | HAR model tJS | HAR model CJS | HAR model FJS | HAR model GJS | HAR model Indep | HAR model MV t | GARCH-DCC model tJS | GARCH-DCC model CJS | GARCH-DCC model FJS | GARCH-DCC model GJS | GARCH-DCC model Indep |
---|---|---|---|---|---|---|---|---|---|---|---|
HAR model tJS | - | | | | | | | | | | |
HAR model CJS | 1.50 | - | | | | | | | | | |
HAR model FJS | 0.89 | 0.44 | - | | | | | | | | |
HAR model GJS | 2.88 | 3.09 | 1.21 | - | | | | | | | |
HAR model Indep | 2.57 | 2.60 | 2.34 | 1.84 | - | | | | | | |
HAR model MV t | 10.75 | 10.63 | 10.65 | 10.48 | 10.00 | - | | | | | |
GARCH-DCC model tJS | 5.23 | 5.23 | 5.23 | 5.23 | 5.22 | 4.55 | - | | | | |
GARCH-DCC model CJS | 5.23 | 5.23 | 5.23 | 5.23 | 5.22 | 4.55 | 1.55 | - | | | |
GARCH-DCC model FJS | 5.23 | 5.22 | 5.23 | 5.22 | 5.21 | 4.55 | 1.79 | 1.34 | - | | |
GARCH-DCC model GJS | 5.24 | 5.24 | 5.24 | 5.23 | 5.22 | 4.56 | 2.96 | 3.31 | 0.01 | - | |
GARCH-DCC model Indep | 5.24 | 5.24 | 5.24 | 5.23 | 5.22 | 4.56 | 3.10 | 3.12 | 2.38 | 2.44 | - |
GARCH-DCC model MV t | 6.05 | 6.05 | 6.05 | 6.05 | 6.04 | 5.41 | 14.65 | 14.33 | 14.56 | 13.88 | 12.80 |
Figure 1: Iso-probability contour plots of joint distributions with standard Normal margins and various copulas: the Clayton copula $$ \left( \theta =2\right) $$ and its 90-, 180-, and 270-degree rotations (upper panel), and an equal-weighted average of the four Clayton copulas (lower panel).
Figure 2: Iso-probability contour plots of joint distributions with standard Normal margins and various jointly symmetric copulas.
Figure 3: Model-implied linear correlation (upper panel) and 1% quantile dependence (lower panel) for daily returns on Citi Group and Goldman Sachs, based on the HAR-type model for the conditional covariance matrix, and the jointly symmetric t copula model.
The DCC model of Engle (2002) decomposes the conditional covariance matrix $$ \mathbf{H}_{t}$$ as:

$$ \mathbf{H}_{t}=\mathbf{D}_{t}\mathbf{R}_{t}\mathbf{D}_{t} \tag{2}$$

where

$$ \mathbf{D}_{t}=diag\left( \left\{ \sqrt{\sigma _{i,t}^{2}}\right\} _{i=1}^{N}\right) \tag{3}$$

$$ \mathbf{R}_{t}=diag\left( \mathbf{Q}_{t}\right) ^{-1/2}\mathbf{Q}_{t}diag\left( \mathbf{Q}_{t}\right) ^{-1/2} \tag{4}$$

where

$$ \mathbf{Q}_{t}=\left( 1-\alpha -\beta \right) \overline{\mathbf{Q}}+\alpha \left( \mathbf{\varepsilon }_{t-1}\mathbf{\varepsilon }_{t-1}^{\prime }\right) +\beta \mathbf{Q}_{t-1} \tag{5}$$

and

$$ \mathbf{\varepsilon }_{t}=\mathbf{D}_{t}^{-1}\left( \mathbf{r}_{t}-\mathbf{\mu }_{t}\right) \tag{6}$$

$$ \sigma _{i,t}^{2}=\psi _{i}+\kappa _{i}\left( r_{i,t-1}-\mu _{i,t-1}\right) ^{2}+\zeta _{i}\left( r_{i,t-1}-\mu _{i,t-1}\right) ^{2}1_{\left\{ \left( r_{i,t-1}-\mu _{i,t-1}\right) <0\right\} }+\lambda _{i}\sigma _{i,t-1}^{2} \tag{7}$$
Engle (2002) suggests estimating the model above using Gaussian quasi-maximum likelihood, and we follow this for the volatility estimation stage. For the DCC estimation stage, Engle, et al. (2008) find that when $$ N$$ is large the estimates of $$ \alpha $$ and $$ \beta $$ may be biased, owing to estimation error in $$ \overline{\mathbf{Q}}$$, and they suggest a composite likelihood estimator built from bivariate likelihoods. We follow their suggestion and use composite likelihood for this stage in Sections 4.2 and 5.
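As a rough illustration of equations (2) to (7), the following Python sketch filters GJR-GARCH variances and DCC correlations for given parameter values. It is not the estimation code used in the paper: the function names, the simulated returns, the initialisation at the sample variance, and the use of the sample correlation of $$ \mathbf{\varepsilon }_{t}$$ for $$ \overline{\mathbf{Q}}$$ are all assumptions made for the example, and the parameter values are simply the cross-sectional means and point estimates from Table 5, used as plausible inputs.

```python
import numpy as np

def gjr_garch_variance(resid, psi, kappa, zeta, lam):
    """Equation (7) for one asset; resid holds the demeaned returns r_t - mu_t."""
    T = resid.shape[0]
    sigma2 = np.empty(T)
    sigma2[0] = resid.var()  # assumption: start the recursion at the sample variance
    for t in range(1, T):
        e2 = resid[t - 1] ** 2
        sigma2[t] = (psi + kappa * e2
                     + zeta * e2 * (resid[t - 1] < 0)  # extra weight on negative shocks
                     + lam * sigma2[t - 1])
    return sigma2

def dcc_correlations(eps, alpha, beta):
    """Equations (4)-(5): update Q_t, then rescale it to a correlation matrix R_t."""
    T, N = eps.shape
    Q_bar = np.corrcoef(eps, rowvar=False)  # assumption: sample correlation as Q-bar
    Q = Q_bar.copy()
    R = np.empty((T, N, N))
    for t in range(T):
        if t > 0:
            Q = ((1 - alpha - beta) * Q_bar
                 + alpha * np.outer(eps[t - 1], eps[t - 1])
                 + beta * Q)
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)  # diag(Q)^{-1/2} Q diag(Q)^{-1/2}
    return R

# Simulated demeaned returns stand in for real data (illustration only).
rng = np.random.default_rng(0)
resid = 0.02 * rng.standard_normal((500, 3))

sigma2 = np.column_stack([
    gjr_garch_variance(resid[:, i], psi=8.64e-6, kappa=0.0252, zeta=0.0840, lam=0.9113)
    for i in range(resid.shape[1])
])
eps = resid / np.sqrt(sigma2)                      # equation (6)
R = dcc_correlations(eps, alpha=0.0245, beta=0.9541)
D_last = np.diag(np.sqrt(sigma2[-1]))              # equation (3)
H_last = D_last @ R[-1] @ D_last                   # equation (2)
```

In an actual application the parameters would be estimated in stages, as described above, rather than fixed in advance.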
For added intuition, consider the bivariate case. Theorem 1 then shows that the jointly symmetric copula CDF is:
$$\begin{aligned}
\mathbf{C}^{JS}\left( u_{1},u_{2}\right) &=\frac{1}{4}\sum_{k_{1}=0}^{2}\sum_{k_{2}=0}^{2}\left( -1\right) ^{R}\cdot \mathbf{C}\left( \widetilde{u}_{1},\widetilde{u}_{2}\right) \\
&=\frac{1}{4}[\mathbf{C}\left( 1,1\right) +\mathbf{C}\left( 1,u_{2}\right) -\mathbf{C}\left( 1,1-u_{2}\right) \\
&\qquad +\mathbf{C}\left( u_{1},1\right) +\mathbf{C}\left( u_{1},u_{2}\right) -\mathbf{C}\left( u_{1},1-u_{2}\right) \\
&\qquad -\mathbf{C}\left( 1-u_{1},1\right) -\mathbf{C}\left( 1-u_{1},u_{2}\right) +\mathbf{C}\left( 1-u_{1},1-u_{2}\right) ] \\
&=\frac{1}{4}[\mathbf{C}\left( u_{1},u_{2}\right) -\mathbf{C}\left( u_{1},1-u_{2}\right) -\mathbf{C}\left( 1-u_{1},u_{2}\right) +\mathbf{C}\left( 1-u_{1},1-u_{2}\right) +2u_{1}+2u_{2}-1]
\end{aligned}$$

The last line uses $$ \mathbf{C}\left( 1,1\right) =1$$ and $$ \mathbf{C}\left( u,1\right) =\mathbf{C}\left( 1,u\right) =u$$. The corresponding density is:

$$ \mathbf{c}^{JS}\left( u_{1},u_{2}\right) =\frac{1}{4}\left[ \mathbf{c}\left( u_{1},u_{2}\right) +\mathbf{c}\left( 1-u_{1},u_{2}\right) +\mathbf{c}\left( u_{1},1-u_{2}\right) +\mathbf{c}\left( 1-u_{1},1-u_{2}\right) \right]$$
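The bivariate density formula above is easy to check numerically. The sketch below builds a jointly symmetric density from the Clayton copula density, using the standard closed form for the latter, and confirms that the result is unchanged when either argument is reflected about 1/2. The function names `clayton_pdf` and `js_pdf` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def clayton_pdf(u1, u2, theta):
    """Bivariate Clayton copula density for theta > 0."""
    return ((1 + theta) * (u1 * u2) ** (-(1 + theta))
            * (u1 ** -theta + u2 ** -theta - 1) ** (-(2 + 1 / theta)))

def js_pdf(pdf, u1, u2, **params):
    """Equal-weighted average of a copula density and its three rotations."""
    return 0.25 * (pdf(u1, u2, **params)
                   + pdf(1 - u1, u2, **params)
                   + pdf(u1, 1 - u2, **params)
                   + pdf(1 - u1, 1 - u2, **params))

u1, u2, theta = 0.3, 0.8, 2.0
base = js_pdf(clayton_pdf, u1, u2, theta=theta)
reflected = [
    js_pdf(clayton_pdf, 1 - u1, u2, theta=theta),
    js_pdf(clayton_pdf, u1, 1 - u2, theta=theta),
    js_pdf(clayton_pdf, 1 - u1, 1 - u2, theta=theta),
]
print(base, reflected)  # all four values coincide up to rounding
```

The Clayton density itself is asymmetric (it places more mass near the lower-left corner than near the upper-right), so the symmetry comes entirely from averaging over the rotations.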
The CDF of a jointly symmetric copula constructed via rotations can also be expressed more compactly using the multinomial formula. (We thank Bruno Rémillard for suggesting the following.)
Let

$$ \left( \mathbf{u}_{\mathcal{A}}\right) _{i}=\left\{ \begin{array}{cc} u_{i}, & i\in \mathcal{A}^{c} \\ 1-u_{i}, & i\in \mathcal{A}\end{array}\right. ,\qquad i=1,2,...,N$$

Then

$$\begin{aligned}
\mathbf{C}^{JS}\left( \mathbf{u}\right) &=\frac{1}{2^{N}}\sum\limits_{\mathcal{A}\subseteq \left\{ 1,...,N\right\} }\Pr \left[ \mathbf{U}_{\mathcal{A}}\leq \mathbf{u}\right] =\frac{1}{2^{N}}\sum\limits_{\mathcal{A}}E\left[ \prod\limits_{i\in \mathcal{A}^{c}}\mathbf{1}\left\{ U_{i}\leq u_{i}\right\} \prod\limits_{i\in \mathcal{A}}\left( 1-\mathbf{1}\left\{ U_{i}\leq 1-u_{i}\right\} \right) \right] \\
&=\frac{1}{2^{N}}\sum\limits_{\mathcal{A}}\sum\limits_{\mathcal{B}\subseteq \mathcal{A}}\left( -1\right) ^{\left\vert \mathcal{B}\right\vert }E\left[ \prod\limits_{i\in \mathcal{A}^{c}}\mathbf{1}\left\{ U_{i}\leq u_{i}\right\} \prod\limits_{i\in \mathcal{B}}\mathbf{1}\left\{ U_{i}\leq 1-u_{i}\right\} \right] \\
&=\frac{1}{2^{N}}\sum\limits_{\mathcal{A}}\sum\limits_{\mathcal{B}\subseteq \mathcal{A}}\left( -1\right) ^{\left\vert \mathcal{B}\right\vert }\mathbf{C}\left( \mathbf{u}_{\mathcal{B},\mathcal{A}}\right)
\end{aligned}$$

where

$$ \left( \mathbf{u}_{\mathcal{B},\mathcal{A}}\right) _{i}=\left\{ \begin{array}{cc} 1-u_{i}, & i\in \mathcal{B} \\ u_{i}, & i\in \mathcal{A}^{c} \\ 1, & i\in \mathcal{A}\backslash \mathcal{B}\end{array}\right. ,\qquad i=1,2,...,N.$$
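The double sum over subsets can be evaluated directly for small $$ N$$. The sketch below does so for an $$ N$$-dimensional Clayton copula (standard closed-form CDF) and compares the $$ N=2$$ result with the bivariate expression given earlier. The helper names `clayton_cdf`, `subsets` and `js_cdf` are hypothetical, and the brute-force enumeration is meant only as a check of the formula, not as an efficient implementation.

```python
from itertools import combinations
import numpy as np

def clayton_cdf(u, theta):
    """N-dimensional Clayton copula CDF for theta > 0; entries equal to 1 drop out."""
    u = np.asarray(u, dtype=float)
    return (np.sum(u ** -theta) - u.size + 1) ** (-1 / theta)

def subsets(items):
    """All subsets of a sequence, as tuples."""
    items = list(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)

def js_cdf(cdf, u, **params):
    """Jointly symmetric copula CDF via the double sum over A and B (B a subset of A)."""
    N = len(u)
    total = 0.0
    for A in subsets(range(N)):
        for B in subsets(A):
            # u_{B,A}: 1 - u_i on B, 1 on A \ B, u_i on the complement of A
            v = [1.0 - u[i] if i in B else (1.0 if i in A else u[i]) for i in range(N)]
            total += (-1) ** len(B) * cdf(v, **params)
    return total / 2 ** N

theta = 2.0
u1, u2 = 0.3, 0.8
C = lambda a, b: clayton_cdf([a, b], theta)
closed_form = 0.25 * (C(u1, u2) - C(u1, 1 - u2) - C(1 - u1, u2)
                      + C(1 - u1, 1 - u2) + 2 * u1 + 2 * u2 - 1)
general = js_cdf(clayton_cdf, [u1, u2], theta=theta)
print(closed_form, general)  # the two agree up to rounding
```

Written this way the sum has $$ 3^{N}$$ terms, since each index lies in $$ \mathcal{A}^{c}$$, $$ \mathcal{B}$$ or $$ \mathcal{A}\backslash \mathcal{B}$$, so direct evaluation is only practical in low dimensions.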
Table A1: 104 Stocks used in the empirical analysis
Note: This table presents the ticker symbols and names of the 104 stocks used in the empirical analysis of this paper.
Ticker | Name |
---|---|
AA | Alcoa |
AAPL | Apple |
ABT | Abbott Lab. |
AEP | American Elec |
ALL | Allstate Corp |
AMGN | Amgen Inc. |
AMZN | Amazon.com |
AVP | Avon |
APA | Apache |
AXP | American Ex |
BA | Boeing |
BAC | Bank of Am |
BAX | Baxter |
BHI | Baker Hughes |
BK | Bank of NY |
BMY | Bristol-Myers |
BRKB | Berkshire Hath |
C | Citi Group |
CAT | Caterpillar |
CL | Colgate |
CMCSA | Comcast |
COF | Capital One |
COP | ConocoPhillips |
COST | Costco |
CPB | Campbell |
CSCO | Cisco |
CVS | CVS |
CVX | Chevron |
DD | DuPont |
DELL | Dell |
DIS | Walt Disney |
DOW | Dow Chem |
DVN | Devon Energy |
EBAY | Ebay |
EMC | EMC |
EMR | Emerson Elec |
ETR | Entergy |
EXC | Exelon |
F | Ford |
FCX | Freeport |
FDX | Fedex |
GD | General Dynam |
GE | General Elec |
GILD | Gilead Science |
GOOG | Google Inc |
GS | Goldman Sachs |
HAL | Halliburton |
HD | Home Depot |
HNZ | Heinz |
HON | Honeywell |
HPQ | HP |
IBM | IBM |
INTC | Intel |
JNJ | Johnson & Johnson |
JPM | JP Morgan |
KFT | Kraft |
KO | Coca Cola |
LLY | Eli Lilly |
LMT | Lockheed Martin |
LOW | Lowe's |
MCD | McDonald's |
MDT | Medtronic |
MET | Metlife Inc. |
MMM | 3M |
MO | Altria Group |
MON | Monsanto |
MRK | Merck |
MS | Morgan Stanley |
MSFT | Microsoft |
NKE | Nike |
NOV | National Oilwell |
NSC | Norfolk South |
NWSA | News Corp |
ORCL | Oracle |
OXY | Occidental Petrol |
PEP | Pepsi |
PFE | Pfizer |
PG | Procter Gamble |
QCOM | Qualcomm Inc |
RF | Regions Fin |
RTN | Raytheon |
S | Sprint |
SBUX | Starbucks |
SLB | Schlumberger |
SLE | Sara Lee Corp. |
SO | Southern Co. |
SPG | Simon Property |
T | AT&T |
TGT | Target |
TWX | Time Warner |
TXN | Texas Inst |
UNH | UnitedHealth |
UNP | Union Pacific |
UPS | United Parcel |
USB | US Bancorp |
UTX | United Tech |
VZ | Verizon |
WAG | Walgreen |
WFC | Wells Fargo |
WMB | Williams Co |
WMT | WalMart |
WY | Weyerhaeuser |
XOM | Exxon |
XRX | Xerox |
Table A2: Simulation results for a jointly symmetric copula based on the Gumbel copula
Note: This table presents the results from 500 simulations of a jointly symmetric copula based on the Gumbel copula with true parameter 2. The sample size is T=1000. Four different estimation methods are used: MLE, MCLE with all pairs, MCLE with adjacent pairs, and MCLE with the first pair. MLE is infeasible for N>10, and so no bias or standard deviation results are reported in those cases. The first four columns report the average difference between the estimated parameter and its true value. The next four columns report the standard deviation of the estimated parameters. The last four columns present the average run time of each estimation method. The reported run times for MLE for N>10 are based on actual single function evaluation times and on an assumption of 40 function evaluations to reach the optimum.
N | Bias MLE | Bias $$ \underset{all}{MCLE}$$ | Bias $$ \underset{adj}{MCLE}$$ | Bias $$ \underset{first}{MCLE}$$ | Std dev MLE | Std dev $$ \underset{all}{MCLE}$$ | Std dev $$ \underset{adj}{MCLE}$$ | Std dev $$ \underset{first}{MCLE}$$ | Average Run Time (in sec) MLE | Average Run Time (in sec) $$ \underset{all}{MCLE}$$ | Average Run Time (in sec) $$ \underset{adj}{MCLE}$$ | Average Run Time (in sec) $$ \underset{first}{MCLE}$$ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | -0.0016 | -0.0016 | -0.0016 | -0.0016 | 0.0757 | 0.0757 | 0.0757 | 0.0757 | 0.30 | 0.13 | 0.13 | 0.13 |
3 | -0.0021 | -0.0018 | -0.0023 | -0.0016 | 0.0484 | 0.0508 | 0.0583 | 0.0757 | 0.71 | 0.43 | 0.29 | 0.13 |
5 | -0.0041 | -0.0025 | -0.0025 | -0.0016 | 0.0368 | 0.0409 | 0.0470 | 0.0757 | 3.52 | 1.31 | 0.53 | 0.13 |
10 | -0.0021 | -0.0023 | -0.0016 | -0.0016 | 0.0245 | 0.0328 | 0.0369 | 0.0757 | 153 | 6 | 1 | 0.13 |
20 | | -0.0019 | -0.0021 | -0.0016 | | 0.0285 | 0.0312 | 0.0757 | $$ 3\times 10^{5}$$ | 25 | 2 | 0.13 |
30 | | -0.0019 | -0.0022 | -0.0016 | | 0.0277 | 0.0297 | 0.0757 | $$ 5\times 10^{8}$$ | 61 | 4 | 0.13 |
40 | | -0.0019 | -0.0022 | -0.0016 | | 0.0270 | 0.0285 | 0.0757 | $$ 7\times 10^{11}$$ | 97 | 5 | 0.13 |
50 | | -0.0024 | -0.0027 | -0.0016 | | 0.0269 | 0.0283 | 0.0757 | $$ 7\times 10^{14}$$ | 166 | 7 | 0.13 |
60 | | -0.0021 | -0.0023 | -0.0016 | | 0.0267 | 0.0282 | 0.0757 | $$ 9\times 10^{17}$$ | 236 | 8 | 0.13 |
70 | | -0.0022 | -0.0024 | -0.0016 | | 0.0264 | 0.0276 | 0.0757 | $$ 1\times 10^{21}$$ | 326 | 9 | 0.13 |
80 | | -0.0022 | -0.0023 | -0.0016 | | 0.0262 | 0.0272 | 0.0757 | $$ 1\times 10^{24}$$ | 435 | 11 | 0.13 |
90 | | -0.0021 | -0.0022 | -0.0016 | | 0.0262 | 0.0272 | 0.0757 | $$ 1\times 10^{27}$$ | 509 | 11 | 0.13 |
100 | | -0.0020 | -0.0021 | -0.0016 | | 0.0261 | 0.0272 | 0.0757 | $$ 1\times 10^{30}$$ | 664 | 13 | 0.13 |
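To make the three MCLE variants in Table A2 concrete, the sketch below enumerates the pairs of variables each one would use and sums a bivariate log-density over them. The names `pair_sets` and `composite_loglik` are hypothetical. Reading "adjacent pairs" as $$ \left( i,i+1\right) $$ and "first pair" as the first two variables is an assumption made for this illustration, as is the independence-copula placeholder for the bivariate log-density.

```python
from itertools import combinations
import numpy as np

def pair_sets(N):
    """Index pairs used by each composite likelihood variant (assumed definitions)."""
    return {
        "all": list(combinations(range(N), 2)),      # N(N-1)/2 pairs
        "adj": [(i, i + 1) for i in range(N - 1)],   # N-1 pairs
        "first": [(0, 1)],                           # a single pair
    }

def composite_loglik(log_pdf2, U, pairs, theta):
    """Sum of bivariate log-likelihoods over the chosen pairs.

    U is a T x N array of probability integral transforms; log_pdf2(u_i, u_j, theta)
    returns the T log-density values of the bivariate copula for one pair.
    """
    return sum(log_pdf2(U[:, i], U[:, j], theta).sum() for i, j in pairs)

counts = {name: len(p) for name, p in pair_sets(100).items()}
print(counts)

# Placeholder bivariate log-density (independence copula: log c = 0 everywhere).
indep_log_pdf = lambda a, b, theta: np.zeros_like(a)
U = np.random.default_rng(0).uniform(size=(1000, 5))
value = composite_loglik(indep_log_pdf, U, pair_sets(5)["adj"], theta=2.0)
```

The number of bivariate likelihood evaluations grows quadratically in $$ N$$ with all pairs, linearly with adjacent pairs, and not at all with the first pair, in line with the run times reported in Table A2.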