The Usefulness of Cross-sectional Dispersion for Forecasting Aggregate Stock Price Volatility ∗
Sungje Byun†

November, 2014

Abstract

Does cross-sectional dispersion in the returns of different stocks help forecast aggregate stock volatility? This paper develops a model of stock returns where dispersion in returns across different stocks is modeled jointly with aggregate volatility. Although specifications that allow for feedback from cross-sectional dispersion to aggregate volatility have a better fit in sample, they prove not to be robust for purposes of out-of-sample forecasting. Using a full cross-section of stock returns jointly, however, I find that use of cross-sectional dispersion can help improve parameter estimates of a GARCH process for aggregate volatility to generate better forecasts both in sample and out of sample. Given this evidence, I conclude that cross-sectional information helps predict market volatility indirectly rather than directly entering in the data-generating process.

Keywords: Stock market volatility, Cross-sectional dispersion, Estimation of large panel data, Forecasting accuracy

∗ I am very grateful to James D. Hamilton, Allan Timmermann, Valerie Ramey, Alexis Toda and Thomas Baranga for their helpful comments and suggestions. I also thank the participants of the UCSD macroeconomic workshop and summer empirical macroeconomics lunch seminar.
† Economics Department, University of California at San Diego, [email protected]

1 Introduction

Modeling and forecasting volatility is an important task and a popular research agenda in financial markets. Volatility models play key roles in the academic literature for testing the fundamental tradeoff between risk and return of financial assets and for investigating the causes and consequences of volatility dynamics in the economy. Volatility forecasts have many practical applications as well.
For example, volatility forecasts are used for market timing decisions, portfolio selection, risk management and the pricing of financial derivatives such as options and forward contracts, since risk is measured by the volatility of financial asset returns. Given its importance, there is a growing number of models and approaches for forecasting the volatility of financial assets. Since Engle (1982) and Bollerslev (1986), Autoregressive Conditional Heteroskedasticity (ARCH) models have been the most popular; they formulate the volatility forecast of a return as a function of known variables. By adopting specific functional forms and/or alternative explanatory variables in the volatility forecasts, numerous extensions of ARCH models have focused on characteristics such as volatility persistence, asymmetry, long memory and the leptokurtic distribution of financial asset returns.

While earlier researchers used variation over time in the variables of interest, cross-sectional information began to be recognized as an important source for improving volatility forecasts. Campbell et al. (2001) and Connor et al. (2006) suggested cross-sectional dispersion across individuals as an important source of individual stock volatility.1 In a similar context, Hwang and Satchell (2005) tested whether cross-sectional dispersion helps forecast the volatility of individual stock returns using the so-called GARCH-X models, where X refers to "cross-sectional dispersion". Although the GARCH-X models were better specified, they found only a trivial improvement in out-of-sample volatility forecasts, concluding that GARCH-X models do not necessarily outperform GARCH models in forecasting individual stock volatility.

1 The contribution of cross-sectional dispersion is referred to as firm-specific (idiosyncratic) volatility in Campbell et al. (2001) and as common heteroscedasticity in asset-specific returns in Connor et al. (2006).
In this paper, I investigate potential channels by which cross-sectional information might help predict aggregate volatility. I develop a model of individual returns that could be applied to the study of volatility in any financial asset, though the interest in this paper is in stock-market volatility. The approach includes as a special case a GARCH-X model in which measures of cross-sectional dispersion appear in the equation for predicting aggregate volatility, an approach previously investigated by Hwang and Satchell (2005). I find that although such GARCH-X specifications can improve in-sample forecasting accuracy, cross-sectional dispersion does not appear to be useful for out-of-sample forecasting. Next, I investigate another channel in which cross-sectional information helps predict aggregate volatility by providing more accurate parameter estimates. After jointly modeling the full cross-section of individual stock returns, I estimate population parameters in the bivariate GARCH process for aggregate volatility and cross-sectional dispersion. Although this model shares the same GARCH process for aggregate volatility as the univariate GARCH, I find improvements in forecasting accuracy that are statistically significant both in sample and out of sample for all nine loss criteria considered. Using the conditional predictive ability tests of Giacomini and White (2006), I show that jointly utilizing cross-sectional information also provides more accurate out-of-sample volatility forecasts in times of recession as well as during bear markets. I conclude that cross-sectional information helps predict market volatility indirectly insofar as it helps to obtain accurate parameter estimates for volatility forecasts.

This paper is organized as follows. The following section describes a model of stock returns and clarifies the relation between models of aggregate stock volatility and the volatility of individual returns.
In Section 3, I investigate two potential channels whereby cross-sectional information could improve volatility forecasts of the market index return: 1) cross-sectional dispersion as an additional explanatory variable as in GARCH-X, and 2) cross-sectional dispersion as an aid in parameter estimation when individual stock returns are jointly modeled. Section 4 provides robustness checks using the measure of cross-sectional dispersion of Hwang and Satchell (2005). I also consider alternative measures of cross-sectional dispersion within the GARCH-X model, all confirming the results in Section 3. Lastly, Section 5 concludes.

2 Model

Let $r_{i,t}$ denote the monthly return on individual stock $i$ measured in percent. For example, $r_{i,t} = -1.5$ means that stock $i$ fell 1.5% from month $t-1$ to $t$. My interest is in characterizing aggregate market volatility as measured by some weighted average of individual returns,2

$$ r_t = \sum_{i=1}^{N_{t-1}} w_{i,t-1} r_{i,t} \tag{1} $$

where $w_{i,t-1}$ is a predetermined weight of stock $i$'s return in the evolution of the stock index return in period $t$. For example, $w_{i,t-1} = N_{t-1}^{-1}$ for an equal-weighted index and $w_{i,t-1} = p_{i,t-1} s_{i,t-1} / \sum_{j=1}^{N_{t-1}} p_{j,t-1} s_{j,t-1}$ for a value-weighted index, where $p_{i,t-1}$ is stock $i$'s price and $s_{i,t-1}$ is its number of outstanding shares at time $t-1$.

2 When a stock $i$ is newly added to the stock index in period $t$, its return $r_{i,t}$ makes no contribution to the stock index return $r_t$.

One approach would be to fit a univariate GARCH-X(1,1) model to the aggregate return:

$$ r_t = \phi_0 + \phi_1 r_{t-1} + u_t, \tag{2} $$

$$ u_t = \sigma_t \cdot \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } (0,1), \tag{3} $$

$$ \sigma_t^2 = \varpi + \alpha u_{t-1}^2 + \beta \sigma_{t-1}^2 + \pi x_{t-1}. \tag{4} $$

Here $x_{t-1}$ is a measure of the cross-sectional dispersion of stocks at time $t-1$. Note that equation (4) includes the standard univariate GARCH(1,1) as a special case when $\pi = 0$. One question is what measure of dispersion to use for $x_{t-1}$ and what kind of model for individual stock returns would be consistent with a process like (4) for aggregate returns.
Having an explicit answer to the latter question will also clarify the way in which data on individual stock returns might be helpful for estimating the parameters of equation (4). Consider an AR(1) forecasting model of stock $i$'s return of the form

$$ r_{i,t} = \phi_{0i} + \phi_{1i} r_{i,t-1} + u_{i,t}. \tag{5} $$

Let $v_t$ be a shock to the level of stock returns, distributed as $N(0, \sigma_t^2)$ conditional on available observations. Denoting by $\kappa_t$ a separate shock, the forecasting error of stock $i$'s return ($u_{i,t}$) is modeled as

$$ u_{i,t} = \lambda_{i,t-1} \left[ v_t + \kappa_t \eta_{i,t} \right], \tag{6} $$

where $\lambda_{i,t-1}$ denotes a predetermined loading of stock $i$ on the aggregate shock $v_t$, $\kappa_t$ governs the cross-section dispersion of stock returns,3 and $\eta_{i,t}$ is stock $i$'s idiosyncratic forecasting error, a martingale difference sequence with unit variance, $E(\eta_{i,t}^2 \mid F_{t-1}) = 1$, where $F_{t-1}$ denotes all observed variables through $t-1$.

3 Connor et al. (2006) and Jones (2001) suggested a strong commonality in asset-specific volatilities, so that the average squared asset-specific return across a large number of stocks varies over time.

One special case of interest comes from the idea that small stocks tend to be more risky, for which I might specify $\lambda_{i,t-1}$ as

$$ \lambda_{i,t-1} = \frac{\lambda_i}{N_{t-1} w_{i,t-1}} \tag{7} $$

where $\lambda_i$ captures the different degree to which stock $i$ responds to the two shocks $v_t$ and $\kappa_t$. Conditional on $\lambda_i$, a stock with below-average weight ($w_{i,t-1} < 1/N_{t-1}$) is treated in (7) as being more exposed to the aggregate shock as well as having a higher weight on the dispersion shock $\kappa_t$. Support for such a specification can be found, for example, in the results of Schwert and Seguin (1990) and Ang et al. (2006).

Since time variation in the cross-sectional dispersion in this model is driven by the value of $\kappa_t$, I would use $\kappa_{t-1}^2$ in place of $x_{t-1}$ in (4) if I had direct observations available on $\kappa_{t-1}$. Note that the aggregate forecasting error $u_t$ is related to the errors in forecasting individual returns $u_{i,t}$ through the identity

$$ u_t = \sum_{i=1}^{N_{t-1}} w_{i,t-1} u_{i,t} \tag{8} $$

$$ \phantom{u_t} = v_t \sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1} + \kappa_t \sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1} \eta_{i,t}. $$

For $N_t$ large, it is reasonable to assume that $\sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1} \to 1$. For example, with a constant-sized sample of equal-weighted stocks, $w_{i,t} = 1/N$ and $N^{-1} \sum_{i=1}^{N} \lambda_i = 1$ by construction,4 in which case $\sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1}$ would always exactly equal 1. Likewise for $N_{t-1}$ large, it would typically be the case that $\sum_{i=1}^{N_{t-1}} (w_{i,t-1} \lambda_{i,t-1})^2 \to 0$; for the above example $\sum_{i=1}^{N_{t-1}} (w_{i,t-1} \lambda_{i,t-1})^2 = N^{-2} \sum_{i=1}^{N} \lambda_i^2$, which goes to zero as long as $N^{-1} \sum_{i=1}^{N} \lambda_i^2$ converges to some finite constant $\bar{\lambda}^2$. Hence

$$ E\left( \sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1} \eta_{i,t} \right)^2 = \sum_{i=1}^{N_{t-1}} (w_{i,t-1} \lambda_{i,t-1})^2 \to 0, $$

implying $\kappa_t \sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1} \eta_{i,t} \overset{p}{\to} 0$. Thus when $N_{t-1}$ is large the aggregate forecasting error $u_t$ gives a direct observation on the common shock $v_t$:

$$ \operatorname*{plim}_{N_t \to \infty} u_t = v_t. \tag{9} $$

4 Consider the simple case of the equal-weighted index return ($w_{i,t} = N^{-1}$ for all $i$ and all $t$). Then a consistent estimate of $\lambda_i$ for each $i$ can be obtained by a univariate regression of stock $i$ on the equal-weighted index, $u_{i,t} = \lambda_i \cdot u_t + e_{i,t}$, estimated by OLS for $t = 1, \ldots, T$. It is clear that $\hat{\lambda}_i^{OLS} = \left( \sum_{t=1}^{T} u_t^2 \right)^{-1} \sum_{t=1}^{T} u_{i,t} u_t$, satisfying $N^{-1} \sum_{i=1}^{N} \hat{\lambda}_i^{OLS} = 1$. The modification for the value-weighted index is the same except that a consistent estimate of $\lambda_i$ is obtained by regressing $N_{t-1} w_{i,t-1} u_{i,t}$ on $u_t$.

Note further from (6) that

$$ \sum_{i=1}^{N_{t-1}} w_{i,t-1} (u_{i,t} - \lambda_{i,t-1} v_t)^2 = \kappa_t^2 \sum_{i=1}^{N_{t-1}} w_{i,t-1} (\lambda_{i,t-1} \eta_{i,t})^2 \overset{p}{\to} \kappa_t^2 \cdot \bar{\lambda}^2 $$

provided $\sum_{i=1}^{N_{t-1}} w_{i,t-1} \lambda_{i,t-1}^2 \overset{p}{\to} \bar{\lambda}^2$. Hence under these conditions, I could use the magnitude

$$ c_{t-1}^2 = \sum_{i=1}^{N_{t-2}} w_{i,t-2} (u_{i,t-1} - \lambda_{i,t-2} u_{t-1})^2 \tag{10} $$

directly for $x_{t-1}$ in (4) to explore whether cross-sectional dispersion at time $t-1$, as summarized by the value of $\kappa_{t-1}^2$, contributes to aggregate market volatility $\sigma_t^2$.
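As a minimal sketch of how the magnitude in (10) could be computed for a single month, the following Python function takes the individual forecasting errors, the predetermined weights and the loadings as plain lists. The function name, argument names and the three-stock numbers are illustrative assumptions, not part of the paper.

```python
def cross_sectional_dispersion(u_i, w, lam, u_agg=None):
    """Weighted squared deviation of individual errors, as in equation (10).

    u_i   : individual forecasting errors u_{i,t-1}
    w     : predetermined weights w_{i,t-2} (assumed to sum to 1)
    lam   : loadings lambda_{i,t-2}
    u_agg : aggregate error u_{t-1}; if omitted it is built from identity (8)
    """
    if u_agg is None:
        u_agg = sum(wi * ui for wi, ui in zip(w, u_i))
    return sum(wi * (ui - li * u_agg) ** 2
               for wi, ui, li in zip(w, u_i, lam))


# Made-up three-stock example (hypothetical numbers, in percent).
u_i = [2.0, -1.0, 0.5]
w = [0.5, 0.3, 0.2]
lam = [1.0, 1.0, 1.0]          # the simplifying case lambda_i = 1
c2 = cross_sectional_dispersion(u_i, w, lam)
```

With all loadings equal to one, the measure reduces to the weighted variance of the individual errors around their weighted mean.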
Note that the above cross-sectional dispersion differs from the cross-sectional market volatility of Hwang and Satchell (2005) in two ways.5 First, I use individual forecasting errors instead of individual stock returns. More importantly, this formulation allows heterogeneous responses of individual stocks to the common market shock through the term $\lambda_{i,t-2}$.

5 Two minor differences are 1) the time-varying $N_{t-2}$ and 2) the predetermined individual weights $w_{i,t-2}$ as opposed to $w_{i,t-1}$ in Hwang and Satchell (2005). Section 4.1 provides robustness checks using the cross-sectional market volatility of Hwang and Satchell (2005).

One can fit a GARCH-X model to the aggregate return with the measure of cross-sectional dispersion given in (10). This provides a natural test of whether cross-sectional dispersion directly enters into the data-generating process for aggregate volatility.

3 Results

In this section, I provide empirical evidence for the role of cross-sectional dispersion in predicting market volatility. After describing the sample, I provide empirical evidence of improved specifications in GARCH-X models that allow for feedback from cross-sectional dispersion to aggregate volatility, showing better in-sample forecasting performance. While GARCH-X with cross-sectional dispersion proves not to be robust for purposes of out-of-sample forecasts, I show that using individual stock returns jointly allows cross-sectional dispersion to improve the parameter estimates of a GARCH process for aggregate volatility, generating better forecasts both in sample and out of sample.

3.1 Additional explanatory variable in GARCH

The dataset contains the individual stocks in the CRSP value-weighted index for three major U.S. markets: NYSE, ASE and NASDAQ. I use prices and the number of outstanding shares of individual stocks for calculating returns and weights of individual stocks.
Since individual returns do not include cash dividends, the aggregate volatility of interest is equivalent to the volatility of the monthly CRSP value-weighted index without cash dividends6 (henceforth, value-weighted index). There are 21,523 individual stocks in total in the dataset, while the number of stocks in the value-weighted index ($N_t$) varies over time.7 Summary statistics of individual stock returns and historical changes in the number of stocks are provided in Appendix A.1. I also use daily CRSP value-weighted returns for calculating realized variance as a proxy for latent volatility, which is discussed later in this section. The time period for the empirical analysis ranges from February 1954 ($t = 1$) to December 2013 ($t = 719$), corresponding to 719 monthly periods for volatility forecasts.

6 In the CRSP dataset, the acronyms are "vwretx" for the value-weighted index return and "retx" for an individual stock return. More precisely, I use the adjusted stock price and the adjusted number of shares outstanding for calculating individual stock returns, in order to capture the effects of stock events other than cash dividends.
7 The number of stocks in the value-weighted index is 8,402 (maximum), 5,093 (median), and 989 (minimum).

To begin with, I describe the empirical procedure for constructing cross-sectional dispersion using individual stocks. First, for each individual stock $i$, I calculate a monthly return $r_{i,t}$ for $t = 1, \ldots, 719$, and estimate an individual forecasting error $u_{i,t}$ by regressing stock $i$'s return ($r_{i,t}$) on a constant and its lagged return ($r_{i,t-1}$) for $t = 2, \ldots, 719$. Second, stock $i$'s weight in period $t-1$ is calculated as $w_{i,t-1} = p_{i,t-1} s_{i,t-1} / \sum_{j=1}^{N_{t-1}} p_{j,t-1} s_{j,t-1}$, where $p_{i,t-1}$ is the price and $s_{i,t-1}$ is the number of outstanding shares of stock $i$ in period $t-1$ for $t = 1, \ldots, 719$. Third, an aggregate forecasting error $u_t$ is calculated from (8) given $u_{i,t}$ and $w_{i,t-1}$ for $t = 1, \ldots, 719$ and for all $i$.
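The three steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names and the toy numbers are hypothetical, a single stock's return history is used for the AR(1) regression, and entries and exits of stocks from the index are ignored.

```python
def ar1_residuals(r):
    """Step 1: OLS of r_t on a constant and r_{t-1}.

    Returns (phi0, phi1, residuals); the residuals play the role of u_{i,t}.
    """
    y, x = r[1:], r[:-1]
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    phi1 = sxy / sxx
    phi0 = y_bar - phi1 * x_bar
    resid = [yi - phi0 - phi1 * xi for xi, yi in zip(x, y)]
    return phi0, phi1, resid


def value_weights(prices, shares):
    """Step 2: market-capitalization weights from lagged prices and shares."""
    caps = [p * s for p, s in zip(prices, shares)]
    total = sum(caps)
    return [c / total for c in caps]


def aggregate_error(weights, errors):
    """Step 3: aggregate forecasting error from identity (8)."""
    return sum(w * u for w, u in zip(weights, errors))


# Toy inputs (hypothetical): one stock's monthly returns, two-stock index.
phi0, phi1, resid = ar1_residuals([1.0, 2.0, 1.5, 3.0, 2.5])
weights = value_weights(prices=[10.0, 20.0], shares=[3.0, 1.0])
u_t = aggregate_error(weights, [1.0, -1.0])
```

Because the regression includes a constant, the residuals from step 1 sum to zero for each stock by construction.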
Assuming $\lambda_i = 1$ for all $i$ for simplicity,8 I obtain a measure of cross-sectional dispersion from (10). Figure 1 plots historical cross-sectional volatility, that is, the square root of $c_t^2$ in (10). Notice that cross-sectional volatility measures the average percentage deviation of individual forecasting errors from their aggregate, and reflects the heterogeneity of individual stocks in two ways. First, it takes into account the different degrees to which individual stocks respond to the aggregate shock through the term $\lambda_{i,t-1}$. Second, each stock's squared deviation contributes to the evolution of cross-sectional volatility in proportion to its relative share in the stock index, given by $w_{i,t-1}$. The shaded areas represent the ten NBER recession periods in the sample. Three points are worth noting.9 First, cross-sectional dispersion itself is time-varying and also exhibits high persistence with a few clusterings, as documented in Hwang and Satchell (2005) and Connor et al. (2006).

8 This restriction will be relaxed in the next revision. In principle, $\lambda_i$ is obtained from a univariate regression of $N_{t-1} w_{i,t-1} u_{i,t}$ on $u_t$ for $t = 1, \ldots, 719$. One empirical difficulty associated with time-varying $N_t$ is that the average of the estimated $\lambda_i$ is not necessarily equal to 1, violating the condition for internal consistency. To deal with this empirical issue, one could normalize the average $\lambda_i$ at each period $t$. After obtaining OLS estimates of $\lambda_i$, I define $\lambda_{i,t}^{*} = \lambda_i$ if stock $i$ appears in the index and 0 otherwise, and replace $\lambda_i$ in (7) by $\tilde{\lambda}_{i,t} = \lambda_{i,t}^{*} / \left( N_t^{-1} \sum_{i=1}^{N_t} \lambda_{i,t}^{*} \right)$ so that internal consistency is guaranteed by construction.
9 Though not reported, the cross-sectional volatility measured from the CRSP equal-weighted index return exhibits an upward trend, consistent with the observation in Campbell et al. (2001).
Second, I find increasing cross-sectional dispersion during recession periods, indicating its potential role in helping to predict market volatility during recessions. Lastly, cross-sectional dispersion is larger in magnitude than time-series market volatility (see Figure 2 for realized volatility), confirming the observation in Hwang and Satchell (2005). The contemporaneous correlation between cross-sectional volatility and realized volatility is 0.3318, providing a rationale for considering the GARCH-X model with cross-sectional dispersion.

Given cross-sectional dispersion, I fit a GARCH-X model to the CRSP value-weighted index return as described by (2), (3) and (4). The second column of Table 1 displays maximum likelihood estimates and asymptotic standard errors10 for the model parameters. For comparison, the third column reports those from a univariate GARCH(1,1) model with $\pi = 0$ in (4). In the numerical estimation procedure, the first observations ($r_1$ and $c_1^2$) are taken as given and the initial value for aggregate volatility ($\sigma_1^2$) is estimated jointly with the other model parameters, although it is not reported. While all parameter estimates for GARCH-X except $\pi$ are statistically significant at any conventional size, the statistical significance of the coefficient estimate on cross-sectional dispersion ($\pi$) is low: the p-value for the two-sided hypothesis test is 0.47, indicating that the contribution of cross-sectional dispersion to predicting market volatility is rather low. The likelihood ratio test of $H_0: \pi = 0$ has a p-value of 0.42,11 confirming the weak evidence for the role of cross-sectional dispersion in predicting aggregate market volatility.

10 Asymptotic standard errors are estimated by approximating the second derivative of the log-likelihood function at the maximum likelihood estimates. See Hamilton (1994), pp. 133-148, for details on numerical maximum likelihood estimation and the calculation of asymptotic standard errors.
11 The likelihood ratio test statistic with 1 degree of freedom is 2 × {−2,040.41 − (−2,040.73)} = 0.6391.

Using in-sample volatility forecasts implied by the parameter estimates in Table 1, I evaluate forecasting performance by comparing the forecasting accuracies of volatility forecasts from the GARCH-X and GARCH models. I adopt realized variance as a proxy for latent volatility following Brailsford and Faff (1996), Hansen and Lunde (2006) and Patton (2011).12 Denoting by $\sigma_{RV,t}^2$ the realized variance in period $t$, it is measured by aggregating squared daily index returns within month $t$ as

$$ \sigma_{RV,t}^2 = \sum_{d=1}^{m_t} r_{d,t}^2 \tag{11} $$

where $r_{d,t}$ is the daily CRSP value-weighted index return on day $d$ of month $t$ and $m_t$ is the number of trading days in month $t$. Figure 2 plots realized volatility, which is the square root of realized variance in (11). Since observations on historical realized variance (or volatility) are widely documented in the earlier literature,13 I suppress further explanation and proceed to the evaluation of volatility forecasts using realized variance.

Table 2 reports average losses under nine loss criteria that have been used in the literature. I provide the definition of each loss criterion in the second column. Columns 3 and 4 report the average losses of in-sample volatility forecasts from the GARCH-X and GARCH models. The last column reports the percentage differences in forecasting accuracy of GARCH-X relative to GARCH, where for each loss function a negative difference implies that GARCH-X has smaller average losses than GARCH. I find that GARCH-X yields lower average losses under eight out of nine loss criteria, and the largest improvement in forecasting accuracy is found under the MSE-LOG loss function: GARCH-X yields 2.50% smaller average losses than the univariate GARCH model.

12 Appendix C.1 provides the forecasting performance evaluation using an alternative proxy for latent volatility, the squared return. In general, the results are similar to those with realized variance.
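Equation (11) and the percentage comparison of average losses can be illustrated with a short Python sketch. The daily returns and loss values below are made up, and the formula used for the percentage difference (loss of the candidate model relative to the benchmark) is an assumption about how the last column of Table 2 is built, since the text does not spell it out.

```python
import math


def realized_variance(daily_returns):
    """Sum of squared daily index returns within one month, equation (11)."""
    return sum(r ** 2 for r in daily_returns)


def pct_difference(avg_loss_model, avg_loss_benchmark):
    """Assumed definition: percentage gap of the model's average loss
    relative to the benchmark; negative means the model has smaller losses."""
    return 100.0 * (avg_loss_model - avg_loss_benchmark) / avg_loss_benchmark


# Hypothetical month with four trading days (returns in percent).
daily = [0.5, -1.0, 0.2, 0.8]
rv = realized_variance(daily)
realized_vol = math.sqrt(rv)

# Hypothetical average losses for a candidate model and a benchmark.
gap = pct_difference(0.975, 1.0)
```

Under this assumed definition, a value of about −2.5 corresponds to the "2.50% smaller average losses" wording used for the MSE-LOG criterion.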
13 For example, Poon and Granger (2003) explain market volatility, including its definition, measurement and stylized facts about financial market volatility.

To investigate whether cross-sectional dispersion is useful for volatility forecasting in practice, I proceed to the evaluation of out-of-sample volatility forecasts. This is intended to address potential over-fitting, insofar as the improved forecasting accuracy of GARCH-X may simply come from introducing an additional parameter into the data-generating process. I adopt a rolling fixed estimation period method14 following Brownlees et al. (2012). With a 6-year estimation window, I fit the models to a sample of 6 years, generate one-step-ahead volatility forecasts, and drop the oldest observation from the sample when adding the new data. I repeat this process and evaluate the performance of 645 monthly out-of-sample forecasts from April 1960 to December 2013. Similarly, I also generate out-of-sample forecasts using a 12-year estimation window, evaluating 573 monthly forecasts from April 1966 to December 2013.

Table 3 compares the out-of-sample forecasting accuracies of the GARCH-X and GARCH models under estimation windows of 6 and 12 years. For each loss function, columns 2 and 3 (5 and 6) report average losses for out-of-sample volatility forecasts from GARCH-X and GARCH when one-step-ahead out-of-sample forecasts are obtained from parameter estimates using 6 years (12 years) of observations. Columns 4 and 7 report the percentage differences in forecasting accuracy of GARCH-X relative to the univariate GARCH model. Here, the results are contrary to the comparison of in-sample volatility forecasts in Table 2. Average losses of GARCH-X are larger than those of GARCH for eight out of nine loss functions with the 6-year estimation window, and for all loss functions with the 12-year estimation window.
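The rolling fixed-window scheme can be sketched generically as follows. The `fit` and `forecast` callables stand in for the model estimation and one-step-ahead forecasting steps; they, and the toy series, are placeholders for illustration rather than the GARCH estimation used in the paper.

```python
def rolling_forecasts(series, window, fit, forecast):
    """One-step-ahead forecasts from a fixed-length rolling window.

    For each forecast origin, the model is re-fitted on the most recent
    `window` observations; the oldest observation drops out as a new one
    is added.
    """
    out = []
    for end in range(window, len(series)):
        sample = series[end - window:end]       # observations end-window .. end-1
        params = fit(sample)                    # re-estimate on this window only
        out.append(forecast(params, sample))    # forecast for period `end`
    return out


# Placeholder "model": forecast the next value by the window mean.
series = [float(v) for v in range(10)]
preds = rolling_forecasts(
    series,
    window=4,
    fit=lambda s: sum(s) / len(s),
    forecast=lambda params, s: params,
)
```

In the paper's setting the window would be 72 or 144 months; the number of forecasts then depends on how initial observations are used for lags, which this sketch does not model.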
14 Many researchers have documented merits of the rolling fixed estimation period method relative to the alternatives, beyond the ease of statistical inference. For example, Dunis et al. (2001), Giacomini and White (2006) and Brownlees et al. (2012) noted that a rolling estimation period method is robust in the presence of nonstationarity. West and Cho (1995) showed that its forecasting accuracy is no worse than that of an expanding sample window method.

For statistical inference on average loss differentials, I perform the tests for equal forecasting accuracy suggested by Diebold and Mariano (1995) and West (1996) (henceforth, DMW). Denoting by $d_t$ the loss differential between competing forecasts in period $t$, an asymptotic pairwise test statistic for the null hypothesis of no difference in forecasting accuracy is given by

$$ DMW = \frac{\bar{d}}{\sqrt{\operatorname{avar}(\bar{d})}} $$

where $\bar{d} = T^{-1} \sum_{t=1}^{T} d_t$ is the sample mean loss differential and $\operatorname{avar}(\bar{d})$ is its asymptotic variance. Following standard practice, I obtain a consistent estimate of $\operatorname{avar}(\bar{d})$ by taking a weighted sum of the available sample autocovariances using a Bartlett kernel. For each loss function, the statistical significance of the DMW test statistic is denoted by an asterisk on the percentage differences in columns 4 and 7. With the 6-year estimation window, I find that average losses of GARCH-X are larger than those of GARCH for eight out of nine loss functions, and the average loss differentials for five loss functions (MSE, MSE-SD, MSE-prop, MAE and MAE-SD) are statistically significant at 10%. With the 12-year estimation window, the results become even worse: average losses of GARCH-X are larger than those of GARCH for all loss criteria, and six loss differentials are statistically significantly larger at 10%.

So far, I have investigated one potential channel by which cross-sectional information might help predict aggregate volatility.
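As a hedged illustration of the DMW statistic described in this subsection, the sketch below computes the mean loss differential and a Bartlett-kernel (Newey-West type) estimate of its variance. The truncation lag `lags` is an assumed tuning choice that the text does not report, and the loss differentials are made-up numbers.

```python
import math


def dmw_statistic(d, lags):
    """DMW statistic for H0: E[d_t] = 0.

    d    : loss differentials d_1..d_T between two competing forecasts
    lags : Bartlett truncation lag (an assumption; not given in the paper)
    """
    T = len(d)
    d_bar = sum(d) / T
    dev = [x - d_bar for x in d]

    def gamma(k):
        # Sample autocovariance of the loss differential at lag k.
        return sum(dev[t] * dev[t - k] for t in range(k, T)) / T

    # Long-run variance: Bartlett weights decline linearly in the lag.
    lrv = gamma(0)
    for k in range(1, lags + 1):
        lrv += 2.0 * (1.0 - k / (lags + 1)) * gamma(k)

    return d_bar / math.sqrt(lrv / T)


# Hypothetical loss differentials.
d = [1.0, 2.0, 3.0, 4.0]
stat_no_lags = dmw_statistic(d, lags=0)
stat_one_lag = dmw_statistic(d, lags=1)
```

The statistic is compared with standard normal critical values; a positive value here would indicate that the first forecast has larger average losses than the second.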
As an additional explanatory variable in the GARCH process, cross-sectional dispersion helps predict aggregate volatility in sample under some loss criteria, although the statistical significance of the coefficient estimate on cross-sectional dispersion ($\pi$) is low. However, I find that this improvement in volatility forecasts is not robust for purposes of out-of-sample forecasting, indicating that cross-sectional dispersion does not enter the data-generating process directly.

3.2 Aid in parameter estimation

Next, I investigate an alternative way in which cross-sectional information might improve volatility forecasts. A convenient model for incorporating cross-sectional information is the factor-ARCH model developed in Engle et al. (1990). Although this model has been used in hundreds of studies, it has not been successfully applied to a cross-section of thousands of stocks due to computational difficulties.15 In this section, I model a bivariate GARCH process for the aggregate volatility ($\sigma_t^2$) and the cross-sectional dispersion ($\kappa_t^2$), and estimate the model parameters by jointly using a full cross-section of stock returns. Using the same dataset containing prices and outstanding shares of stocks in the monthly CRSP value-weighted index, I estimate the parameters of the following model:

$$ r_{i,t} = \phi_{0i} + \phi_{1i} r_{i,t-1} + u_{i,t}, \tag{12} $$

$$ u_{i,t} = \frac{1}{N_{t-1} w_{i,t-1}} \sigma_t \varepsilon_t + \frac{\kappa_t}{N_{t-1} w_{i,t-1}} \eta_{i,t}, \qquad \varepsilon_t, \eta_{i,t} \sim \text{i.i.d. } (0,1), \tag{13} $$

$$ \begin{bmatrix} \sigma_t^2 \\ \kappa_t^2 \end{bmatrix} = \begin{bmatrix} \varpi_1 \\ \varpi_2 \end{bmatrix} + \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} \begin{bmatrix} u_{t-1}^2 \\ c_{t-1}^2 \end{bmatrix} + \begin{bmatrix} \beta_{11} & 0 \\ 0 & \beta_{22} \end{bmatrix} \begin{bmatrix} \sigma_{t-1}^2 \\ \kappa_{t-1}^2 \end{bmatrix} \tag{14} $$

where $\varpi_1, \varpi_2 > 0$ and $\alpha_{11}, \alpha_{12}, \alpha_{21}, \alpha_{22}, \beta_{11}, \beta_{22} \geq 0$ are model parameters, and $c_{t-1}^2$ is the lagged cross-sectional dispersion measure given in (10).

15 To my knowledge, the largest number of cross-sectional observations used within the factor-ARCH model is 50, in Engle and Sheppard (2008), where the authors evaluate the performance of a class of covariance models including factor GARCH, restricted vector GARCH, dynamic conditional correlation GARCH models and extensions of these models.
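The recursion in (14) can be written out directly. The sketch below only filters the two conditional variance paths for given parameter values; it does not perform the joint maximum likelihood estimation described in Appendix B.1. Parameter names, starting values and inputs are illustrative assumptions.

```python
def bivariate_garch_path(u, c2, w1, w2, a11, a12, a21, a22, b11, b22,
                         sigma2_init, kappa2_init):
    """Filter sigma_t^2 and kappa_t^2 through the recursion in (14).

    u  : aggregate forecasting errors u_1..u_T
    c2 : cross-sectional dispersion measures c_1^2..c_T^2 from (10)
    Returns two lists of length T + 1; the last entries are the
    one-step-ahead values.
    """
    sigma2, kappa2 = [sigma2_init], [kappa2_init]
    for u_prev, c2_prev in zip(u, c2):
        s_prev, k_prev = sigma2[-1], kappa2[-1]
        sigma2.append(w1 + a11 * u_prev ** 2 + a12 * c2_prev + b11 * s_prev)
        kappa2.append(w2 + a21 * u_prev ** 2 + a22 * c2_prev + b22 * k_prev)
    return sigma2, kappa2


# Made-up inputs and parameter values, for illustration only.
sigma2, kappa2 = bivariate_garch_path(
    u=[1.0, -2.0], c2=[4.0, 9.0],
    w1=0.5, w2=1.0,
    a11=0.1, a12=0.05, a21=0.0, a22=0.3,
    b11=0.8, b22=0.6,
    sigma2_init=2.0, kappa2_init=5.0,
)
```

Setting `a12 = 0` leaves the first equation as a standard GARCH(1,1) recursion, while a nonzero `a12` plays the role of $\pi$ in (4).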
Four points are worth noting. First, equation (13) is a special case of the earlier model given by (6) and (7) for the purpose of estimating the common volatility across individuals. With the particular restriction $\lambda_i = 1$ for all $i$ in (13), individual forecasting errors are treated as if they were affected differently by the aggregate market and idiosyncratic shocks only through weight differentials.16 Second, the aggregate volatility process in (14) is the same as the GARCH-X model in (4) with $\varpi_1 = \varpi$, $\alpha_{11} = \alpha$, $\alpha_{12} = \pi$ and $\beta_{11} = \beta$. Hence a statistical test of the direct role of cross-sectional dispersion in predicting aggregate volatility is given by testing $H_0: \alpha_{12} = 0$. Third, the cross-sectional dispersion process is estimated along with the aggregate volatility process, where $\alpha_{12}$ and $\alpha_{21}$ capture the dynamic dependence between the two volatility processes. Lastly, the above bivariate GARCH model includes the Panel-ARCH of Byun and Jo (2014) as a special case: with $\kappa_t = \tau$ for all $t$ in (13), Panel-ARCH estimates a univariate ARCH process for the aggregate volatility with $\alpha_{12} = \beta_{11} = 0$ in (14). In that case, $\tau$ captures the time-series average of cross-sectional dispersion across a large number of cross-sectional observations. See also Byun and Jo (2014) for the estimation of quarterly profit uncertainty using industry-level sales revenues.

16 This restriction will be relaxed in the next revision.

Appendix B.1 describes the empirical procedure for estimating the parameters of the above model. Table 4 displays maximum likelihood estimates and asymptotic standard errors for the model parameters. Parameter estimates from the bivariate GARCH process (14) are reported in the second column (Full Model). For comparison, the remaining columns report parameter estimates from restricted models with $\alpha_{21} = 0$ (Model 1) and $\alpha_{12} = \alpha_{21} = 0$ (Model 2).
Parameter estimates for individual stocks, such as $\phi_{0i}$ and $\phi_{1i}$, are not reported as they are obtained separately by a univariate regression for each $i$.

Four points are worth noting. First, the parameter estimates describing the aggregate volatility process ($\varpi_1$, $\alpha_{11}$ and $\beta_{11}$) are quantitatively similar to those of the univariate GARCH reported in the third column of Table 1. Second, the statistical significance of the coefficient estimate on cross-sectional dispersion ($\alpha_{12}$) is low: the p-value for the two-sided hypothesis test is 0.44, confirming the weak evidence for a direct contribution of cross-sectional dispersion to predicting aggregate market volatility. Combined with the statistically insignificant $\alpha_{21}$,17 this further implies weak dynamic dependence between the two volatility processes. Third, the cross-sectional volatility process is shown to be non-stationary in sample: the persistence of the process implied by the coefficient estimates for $\alpha_{22}$ and $\beta_{22}$ is 1.0695, which is greater than 1. Lastly, given the weak dynamic dependence, Model 2 is sufficient for describing the bivariate GARCH process for aggregate volatility and cross-sectional dispersion: the likelihood ratio test statistic with 1 degree of freedom for $H_0: \alpha_{12} = 0$ (Model 2 against Model 1) is 0.6804, with a p-value of 0.41. In other words, neither the Full Model nor Model 1 improves the specification of the bivariate GARCH model in a statistical sense.

Next, I evaluate forecasting performance by comparing the forecasting accuracies of volatility forecasts from the univariate GARCH and the above bivariate GARCH models. More specifically, I compare the forecasting accuracies of volatility forecasts from Model 1 and Model 2 with those from the univariate GARCH.
Although it does not provide additional predictive power in sample, I include Model 1 in the comparison of forecasting performance in order to capture the possibility that cross-sectional dispersion could improve volatility forecasts out of sample when cross-sectional stock returns are jointly used for parameter estimation, which differs from the previous exercise using GARCH-X. Furthermore, it also enables me to infer the relative size of the direct contribution of cross-sectional dispersion in Model 1 by comparing the forecasting performance of Model 1 with that of Model 2. This is because the improved forecasting accuracy of Model 2 relative to the univariate GARCH can be viewed as being obtained indirectly by using cross-sectional stock returns jointly.

17 For testing $H_0: \alpha_{21} = 0$, the p-value is 1 from the two-sided hypothesis test as well as from the likelihood ratio test comparing the maximized log-likelihood values of the Full Model and Model 1.

Table 5 compares the forecasting accuracies of the two bivariate GARCH models. For comparison, column 2 reports the average losses of in-sample volatility forecasts from the univariate GARCH displayed in Table 2. Columns 3 and 5 report average losses from Model 1 and Model 2 respectively, and the adjacent columns report the percentage differences in forecasting accuracy for Model 1 (column 4) and Model 2 (column 6). There are two lines of empirical evidence supporting improved in-sample forecasting accuracy from utilizing cross-sectional information. First, I find improved forecasting accuracies from both Model 1 and Model 2 across all loss criteria. In particular, the improved forecasting accuracy of Model 2 provides evidence of the indirect contribution of cross-sectional information when cross-sectional stock returns are jointly used for estimating model parameters.
Second, I find that Model 1 provides larger improvements in forecasting accuracy than Model 2, confirming that directly using cross-sectional dispersion as an additional explanatory variable in the aggregate volatility process enhances in-sample forecasting accuracy. For Model 1, the largest improvement is found under the MSE-SD loss function, with average losses 6.49% smaller than those of univariate GARCH. Under the MSE-LOG criterion, for which GARCH-X provides its largest improvement of 2.50% (see the last column of Table 2), Model 1 has 5.57% smaller average losses than univariate GARCH, a larger gain than that of GARCH-X.

Table 6 compares the accuracy of one-step-ahead out-of-sample volatility forecasts from the two models when using a 6-year estimation window. Again, I report the percentage differences relative to forecasts from GARCH, and denote the statistical significance of the DMW equal predictability test statistics with asterisks. In contrast to the failure of GARCH-X in Table 3, I find improved forecasting accuracy from the bivariate GARCH models. By jointly using cross-sectional stock returns for estimating model parameters, Model 2 has statistically significantly smaller average losses than GARCH at the 10% level for all loss functions [18]. When Model 2 is extended by including cross-sectional dispersion in the aggregate volatility process, however, Model 1 shows statistically significant improvements under only three loss functions: QLIKE, MSE-prop and MAE-prop [19].

Table 7 provides the comparison of out-of-sample forecasting accuracy under a 12-year estimation window. While confirming the improved forecasting accuracy from jointly using cross-sectional stock returns, I find that the improvement originates from both the direct and the indirect contribution of cross-sectional dispersion.
To see this, recall that GARCH-X fails to provide accurate out-of-sample volatility forecasts when using a 12-year estimation window in Table 3. By jointly using cross-sectional stock returns for estimating the parameters of the volatility process, Model 2 yields statistically significantly smaller average losses than GARCH under seven of the nine loss functions. Furthermore, additional improvements are obtained from Model 1 by including cross-sectional dispersion as an additional explanatory variable in the aggregate volatility process. Model 1 has smaller average losses (equivalently, larger differences in absolute value) than Model 2 under most loss criteria, the exceptions being QLIKE and MSE-prop. Here Model 1 has statistically significantly smaller average losses than GARCH under eight of the nine loss functions. These results contrast with the failure of GARCH-X to deliver accurate volatility forecasts out of sample.

[18] One exception is the MAE-LOG loss function, where Model 2 still has smaller average losses than GARCH.

[19] A potential explanation lies in the non-stationarity of cross-sectional dispersion during the sample period. Although cross-sectional dispersion is moderately correlated with aggregate market volatility (the correlation is 0.3 in the sample), the non-stationarity undermines its explanatory power, especially when using a short estimation window.

I further explore the improved forecasting ability provided by cross-sectional information by testing whether the bivariate GARCH models also provide more accurate volatility forecasts in particular periods when accurate volatility forecasts are of great interest. More specifically, I focus on the second channel, the indirect contribution from using a full cross-section of stock returns, and test whether Model 2 outperforms univariate GARCH by more during recessions.
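The unconditional equal-predictability comparisons behind the asterisks in Tables 6 and 7 can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes the loss differential has no serial correlation (so the plain sample variance is used, with no HAC correction), and it assumes asterisks are assigned by comparing the absolute value of the statistic with the critical values quoted in the table captions. The function names are hypothetical.

```python
import math
import statistics

# Critical values quoted in the table captions: 1.28 (90%), 1.65 (95%), 2.33 (99%).
CRITICAL_VALUES = ((2.33, "***"), (1.65, "**"), (1.28, "*"))


def dmw_statistic(loss_model, loss_benchmark):
    """t-type statistic for H0: E[d_t] = 0, where d_t = loss_model_t - loss_benchmark_t."""
    d = [a - b for a, b in zip(loss_model, loss_benchmark)]
    mean_d = statistics.fmean(d)
    # Assumption: sample variance of d_t without any autocorrelation correction.
    std_error = math.sqrt(statistics.variance(d) / len(d))
    return mean_d / std_error


def significance_stars(stat):
    """Map a test statistic to asterisks using the captions' critical values."""
    for critical_value, stars in CRITICAL_VALUES:
        if abs(stat) > critical_value:
            return stars
    return ""


if __name__ == "__main__":
    # Toy losses for illustration only; these are not data from the paper.
    stat = dmw_statistic([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 2.5, 3.0])
    print(round(stat, 4), significance_stars(stat))
```

A negative statistic indicates that the candidate model has smaller average losses than the benchmark, matching the sign convention of the percentage differences in the tables.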
Let $d_t$ be the loss differential between Model 2 and GARCH for the one-step-ahead out-of-sample volatility forecast in period $t$. Using an indicator variable for NBER recession periods, $I_t^R$, I perform the test of conditional predictive ability developed in Giacomini and White (2006) for $H_0: E[d_t \mid \mathcal{F}_{t-1}] = 0$, in contrast to $H_0: E[d_t] = 0$ in the DMW test of equal (unconditional) predictability. Let $h_t$ be a vector of variables that are thought to be important for relative forecast performance; in this case $h_t^R \equiv (1, d_t, I_t^R)'$. Given the conditional moment restriction, the Wald-type test statistic (GW) with 3 degrees of freedom is given by

\[ GW = (T-1)\, Z' \hat{\Omega}^{-1} Z \]

where $Z \equiv (T-1)^{-1} \sum_{t=1}^{T-1} h_t^R d_{t+1}$ and $\hat{\Omega} \equiv (T-1)^{-1} \sum_{t=1}^{T-1} (h_t^R d_{t+1})(h_t^R d_{t+1})'$ is a $3 \times 3$ matrix that consistently estimates the variance of $h_t^R d_{t+1}$. Under the null hypothesis, the test statistic is asymptotically chi-squared distributed with 3 degrees of freedom.

Table 8 reports GW statistics using $h_t^R$ with 6-year (column 2) and 12-year (column 3) estimation windows. During NBER recession periods, I find that Model 2 provides more accurate out-of-sample volatility forecasts with a 6-year estimation window (column 2), and the improvement is statistically significant under eight loss criteria, the exception being MSE-prop. With a larger estimation window of 12 years (column 3), Model 2 provides statistically significantly more accurate forecasts under seven loss criteria, the exceptions being MSE and MSE-prop. The last two columns report GW statistics for testing whether Model 2 outperforms univariate GARCH by more during bear markets. I use another indicator variable, $I_t^N$, for periods with negative market returns, and calculate GW statistics with $h_t^N \equiv (1, d_t, I_t^N)'$. Results are similar to those using NBER recession periods.
I find that Model 2 yields more accurate forecasts during periods of negative stock returns, with conditional improvements in forecasting accuracy that are statistically significant under most loss criteria.

Although the forecasting equation is the same as in univariate GARCH, I find improved forecasting performance from jointly using cross-sectional information: volatility forecasts from Model 2 are more accurate than those from univariate GARCH both in sample and out of sample. In particular, Model 2 provides more accurate out-of-sample volatility forecasts in times of recession as well as during bear markets. A potential explanation for this improvement comes from the basic insight in Stock and Watson (2002) that, when the number of cross-sectional observations is large, any aggregate factors can be uncovered essentially perfectly using the cross section. By jointly using the full cross section of stock returns, one can obtain better estimates of the population parameters. In other words, cross-sectional dispersion might help to estimate the parameters of the aggregate volatility process.

To see this, consider the log-likelihood function of the bivariate GARCH described by (12), (13) and (14). Let $\tilde{w}_{i,t-1} \equiv N_{t-1} w_{i,t-1}^2$, with $w_{i,t-1}$ being stock $i$'s weight in (1). For expositional simplicity, I define an alternative measure of cross-sectional dispersion using $\tilde{w}_{i,t-1}$, in parallel with (10). In period $t$ it becomes

\[ \tilde{c}_t^2 = \sum_{i=1}^{N_{t-1}} \tilde{w}_{i,t-1} \left( u_{i,t} - \frac{1}{N_{t-1} w_{i,t-1}} u_t \right)^2 \tag{15} \]

which is a special case of (10) with $\lambda_i = 1$ for all $i$ and with the weights on the squared individual deviations replaced by $\tilde{w}_{i,t-1}$. Using (8) and (15), the closed-form log-likelihood of the bivariate GARCH is given by

\[ \mathcal{L} = \sum_{t=1}^{T} \left\{ -\frac{N_{t-1}}{2} \log 2\pi - \frac{1}{2} \log J_t - \frac{N_{t-1}}{2} \left[ \frac{1}{\kappa_t^2} \times \tilde{c}_t^2 + \frac{1}{\kappa_t^2 + N_{t-1}\sigma_t^2} \times u_t^2 \right] \right\} \tag{16} \]

where $J_t \equiv (\kappa_t^2 + N_{t-1}\sigma_t^2) \times (\kappa_t^2)^{N_{t-1}-1} \times \prod_{i=1}^{N_{t-1}} (w_{i,t-1} N_{t-1})^{-2}$ is the determinant of the $N_{t-1} \times N_{t-1}$ variance matrix of individual forecasting errors.
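To make the structure of the closed-form likelihood concrete, the sketch below evaluates one period's contribution in two ways and checks that they agree. It rests on assumptions that are not stated explicitly in this section: the individual forecasting errors are taken to be jointly Gaussian with variance matrix $D(\kappa_t^2 I + \sigma_t^2 \iota\iota')D$, where $D = \mathrm{diag}(1/(N_{t-1} w_{i,t-1}))$, a form inferred from the determinant $J_t$, and the weighting $\tilde{w}_{i,t-1} = N_{t-1} w_{i,t-1}^2$ is used in the dispersion measure. All function names and the numerical inputs are hypothetical.

```python
import math


def closed_form_loglik(u_i, w, kappa2, sigma2):
    """One-period log-likelihood via the dispersion measure and the aggregate error."""
    n = len(u_i)
    u = sum(wi * ui for wi, ui in zip(w, u_i))  # aggregate error u_t
    # Dispersion with weights N * w_i^2 (assumed form of the tilde weights).
    c2 = sum(n * wi ** 2 * (ui - u / (n * wi)) ** 2 for wi, ui in zip(w, u_i))
    log_j = (math.log(kappa2 + n * sigma2) + (n - 1) * math.log(kappa2)
             - 2.0 * sum(math.log(wi * n) for wi in w))
    return (-0.5 * n * math.log(2.0 * math.pi) - 0.5 * log_j
            - 0.5 * n * (c2 / kappa2 + u ** 2 / (kappa2 + n * sigma2)))


def direct_loglik(u_i, w, kappa2, sigma2):
    """Same quantity from the full Gaussian density, using a Cholesky factor."""
    n = len(u_i)
    d = [1.0 / (n * wi) for wi in w]
    # Assumed variance matrix: D (kappa2 * I + sigma2 * 11') D.
    v = [[d[i] * d[j] * (sigma2 + (kappa2 if i == j else 0.0)) for j in range(n)]
         for i in range(n)]
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = v[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    # Forward substitution: solve chol z = u_i, so that z'z is the quadratic form.
    z = []
    for i in range(n):
        z.append((u_i[i] - sum(chol[i][k] * z[k] for k in range(i))) / chol[i][i])
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(n))
    return (-0.5 * n * math.log(2.0 * math.pi) - 0.5 * log_det
            - 0.5 * sum(zi * zi for zi in z))


if __name__ == "__main__":
    # Toy inputs for illustration only.
    errors, weights = [1.2, -0.5, 0.3, 2.0], [0.4, 0.3, 0.2, 0.1]
    print(closed_form_loglik(errors, weights, 2.0, 1.5))
    print(direct_loglik(errors, weights, 2.0, 1.5))
```

The closed form avoids inverting an $N_{t-1} \times N_{t-1}$ matrix each period, which matters when the cross section contains thousands of stocks.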
The log-likelihood function (16) shows that cross-sectional information enters the log-likelihood through $\tilde{c}_t^2$, which complements the squared errors ($u_t^2$) in estimating the model parameters. In particular, $\tilde{c}_t^2$ is weighted by $1/\kappa_t^2$, which is proportional to the inverse of the conditional cross-sectional volatility. Lastly, I provide the log-likelihood of univariate GARCH, in which only squared forecasting errors are used for parameter estimation:

\[ \mathcal{L} = \sum_{t=1}^{T} \left[ -\frac{1}{2} \log 2\pi - \frac{1}{2} \log \sigma_t^2 - \frac{1}{2} \times \frac{u_t^2}{\sigma_t^2} \right] \tag{17} \]

4 Robustness checks

In this section, I provide robustness checks for the empirical analysis in Section 3.1. First, I revisit the tests for the role of cross-sectional dispersion in predicting aggregate volatility using 1) the GARCH-cross-sectional (GARCH-XC) model and 2) the cross-sectional market volatility measure, both suggested in Hwang and Satchell (2005) [20]. Next, I address concerns about highly dispersed cross-sectional returns and the corresponding cross-sectional kurtosis by considering alternative cross-sectional dispersion measures.

4.1 Revisit Hwang and Satchell (2005)

Hwang and Satchell (2005) investigate whether cross-sectional dispersion can improve conditional heteroskedastic models for the volatility of individual stock returns. Because squared market returns are too noisy to be used in the GARCH forecasting model, they propose to use the dispersion of individual stock returns around the market return, namely the cross-sectional market volatility [21]:

\[ \sigma^2_{C,mt-1} = \sum_{i=1}^{N_{t-1}} w_{i,t-2} (r_{i,t-1} - r_{t-1})^2 \tag{18} \]

[20] For expositional simplicity, I deviate from Hwang and Satchell (2005): GARCH-XC in this paper corresponds to GARCHX in Hwang and Satchell (2005). For notational consistency throughout this paper, I also use $\varpi$, $\alpha$, $\beta$ and $\pi$ in place of $\alpha_{i,0}$, $\alpha_{i,1}$, $\alpha_{i,2}$ and $\alpha_{i,3}$ respectively.
[21] Relative to the cross-sectional market volatility in Hwang and Satchell (2005), I make two adjustments: one reflects the time-varying number of stocks ($N_{t-1}$), and the other makes the weights of individual stocks predetermined ($w_{i,t-2}$) in period $t-1$. While the former adjustment is crucial in the analysis, I find that the effect of the latter adjustment is trivial.

where $w_{i,t-2}$ is the weight of individual stock $i$ in period $t-2$.

Figure 3 plots the historical cross-sectional market volatility, that is, the square root of $\sigma^2_{C,mt-1}$ in (18). Note that the cross-sectional market volatility is smaller in size than the cross-sectional dispersion in Figure 1. This is mainly due to the misspecification in (18): for large $N_t$, it fails to take into account the different degrees to which individual stocks respond to the aggregate market return. Furthermore, the cross-sectional market volatility peaks during the burst of the dot-com bubble in early 2001. This contrasts with the earlier observation in Connor et al. (2006), which documents heightened idiosyncratic volatility during the stock market crash of October 1987.

To address the restrictive nature of the GARCH-X model [22], Hwang and Satchell (2005) propose the GARCH-cross-sectional (GARCH-XC) model, which excludes the constant coefficient from GARCH-X in (4). After replacing $x_{t-1}$ by $\sigma^2_{C,mt-1}$, it becomes

\[ \sigma_t^2 = \alpha u_{t-1}^2 + \beta \sigma_{t-1}^2 + \pi \sigma^2_{C,mt-1} \tag{19} \]

where $\pi > 0$ and $\alpha, \beta \geq 0$.

With the cross-sectional market volatility and the GARCH-XC model proposed by Hwang and Satchell (2005), I test the direct role of cross-sectional market volatility in predicting aggregate volatility under two specifications. After replacing $x_{t-1}$ by the cross-sectional market volatility $\sigma^2_{C,mt-1}$, I fit a GARCH-X to the CRSP value-weighted index return as described by (2), (3) and (4), and I also fit a GARCH-XC given by (2), (3) and (19).
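The two ingredients of this exercise, the cross-sectional market volatility in (18) and the variance recursion in (19), can be sketched in a few lines. The code is illustrative and makes assumptions beyond the text: the GARCH-X equation (4) is taken to have the form $\sigma_t^2 = \varpi + \alpha u_{t-1}^2 + \beta\sigma_{t-1}^2 + \pi x_{t-1}$, so that setting `varpi=0.0` yields GARCH-XC, and the initial variance `sigma2_init` is supplied by the user. Function and argument names are hypothetical.

```python
def cross_sectional_market_variance(weights_lagged, stock_returns, market_return):
    """Equation (18): weighted squared deviations of stock returns from the market return.

    weights_lagged holds the predetermined weights w_{i,t-2}; stock_returns holds
    r_{i,t-1}; market_return is r_{t-1}.
    """
    return sum(w * (r - market_return) ** 2
               for w, r in zip(weights_lagged, stock_returns))


def garch_x_variance(u, x, alpha, beta, pi, sigma2_init, varpi=0.0):
    """Conditional variance path; varpi = 0 corresponds to GARCH-XC in (19).

    u[t] and x[t] are the forecasting error and the covariate observed in period t;
    the returned list starts with sigma2_init and has len(u) + 1 entries.
    """
    sigma2 = [sigma2_init]
    for u_prev, x_prev in zip(u, x):
        sigma2.append(varpi + alpha * u_prev ** 2 + beta * sigma2[-1] + pi * x_prev)
    return sigma2


if __name__ == "__main__":
    # Toy numbers for illustration only.
    x0 = cross_sectional_market_variance([0.5, 0.5], [1.0, 3.0], 2.0)
    print(x0)
    print(garch_x_variance([1.0, 2.0], [0.5, 1.0], alpha=0.1, beta=0.8, pi=0.2,
                           sigma2_init=1.0))
```

Because GARCH-XC has no constant, the positivity of the variance path relies on $\pi > 0$ together with a strictly positive covariate.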
Coefficient estimates are qualitatively similar to those reported in Table 1. In particular, the coefficient estimates for the cross-sectional market volatility ($\pi$) are statistically significant at the 1% level under both the GARCH-XC and the GARCH-X model, confirming the improved volatility specification due to the cross-sectional market volatility.

[22] The non-negativity conditions for the GARCH-X model are $\varpi > 0$ and $\alpha, \beta, \pi \geq 0$, which are frequently violated in empirical applications. Even when $\varpi \leq 0$, Hwang and Satchell (2005) note that $T^{-1} \sum_{t=1}^{T} (\varpi + \pi\sigma^2_{C,mt-1})$ is likely to be positive, and find that the conditional volatility process is always positive under this condition.

Table 9 reports average losses of volatility forecasts from the GARCH-X and GARCH-XC models when using cross-sectional market volatility. Columns 2 and 3 report average in-sample losses from the two models, while columns 4 and 5 (columns 6 and 7) report average losses of out-of-sample volatility forecasts from the GARCH-X and GARCH-XC models when using a 6-year (12-year) estimation window. Boldface entries represent more accurate volatility forecasts from GARCH-X (or GARCH-XC) than from GARCH for each loss function. For out-of-sample volatility forecasts, the statistical significance of the equal predictive ability (DMW) test is denoted by asterisks. Results are consistent with those in Table 2 and Table 3. When including cross-sectional market volatility in the aggregate volatility process, I find improved forecasting accuracy in sample from both the GARCH-X and GARCH-XC models. However, I confirm the weak evidence for the role of cross-sectional market volatility in predicting aggregate volatility out of sample.

4.2 Alternative measures

In this section, I provide robustness checks by using alternative cross-sectional dispersion measures within the GARCH-X model.
This is motivated by the observation in Hwang and Satchell (2005) that cross-sectional returns are highly dispersed, especially when a large number of stocks is considered. To investigate whether the empirical results are affected by cross-sectional kurtosis, I consider three alternative measures: 1) a 5% trimmed estimator, 2) cross-sectional dispersion across the 10 CRSP Cap-Based Portfolios and 3) the squared interquartile range. While cross-sectional dispersion across the 10 CRSP Cap-Based Portfolios is self-explanatory, the other two alternatives are constructed as follows. Let $y_{i,t-1} \equiv w_{i,t-2}^{1/2} (u_{i,t-1} - \lambda_{i,t-1} u_{t-1})$ for $t = 2, \ldots, T$, and rewrite the equation for cross-sectional dispersion in (10) as

\[ c_{t-1}^2 = \sum_{i=1}^{N_{t-1}} y_{i,t-1}^2 \]

Then the 5% trimmed estimator is obtained from the above equation by discarding the 5% extreme $y_{i,t-1}$ from both tails in each period $t-1$. Similarly, the squared interquartile range estimator is calculated by squaring the interquartile range of $y_{i,t-1}$ in each period $t-1$.

Figure 4 plots cross-sectional volatility from the three alternative measures, displayed as the square root of cross-sectional dispersion. It confirms that the alternative measures are less volatile than the baseline measure in Figure 1. I estimate the GARCH-X model again, this time replacing $c_{t-1}^2$ in (10) by each of the three alternatives. To conserve space, I do not report the parameter estimates, which are generally similar to those reported in Table 1. However, it is worth noting that none of the coefficient estimates of cross-sectional dispersion ($\pi$) is statistically significant at the 10% level when using the three alternative measures [23]. Table 10 reports average losses of in-sample volatility forecasts; boldface entries indicate smaller average losses than univariate GARCH for each loss function.
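The trimmed and interquartile-range measures described above can be sketched as follows for a single period, taking the weighted deviations $y_{i,t-1}$ as given. Two details are assumptions rather than statements from the text: trimming removes 5% of the observations from each tail (rounded down), and the quartiles follow the default convention of Python's `statistics.quantiles`. The function names are hypothetical.

```python
import statistics


def baseline_dispersion(y):
    """Sum of squared weighted deviations, as in the rewritten form of (10)."""
    return sum(v * v for v in y)


def trimmed_dispersion(y, trim_percent=5):
    """Dispersion after dropping trim_percent% of observations from each tail (assumed)."""
    ordered = sorted(y)
    k = len(ordered) * trim_percent // 100  # observations removed per tail
    kept = ordered[k:len(ordered) - k]
    return sum(v * v for v in kept)


def squared_iqr(y):
    """Squared interquartile range; the quartile convention is an assumption."""
    q1, _median, q3 = statistics.quantiles(y, n=4)
    return (q3 - q1) ** 2


if __name__ == "__main__":
    # Toy deviations for illustration only.
    y = list(range(-10, 10))
    print(baseline_dispersion(y), trimmed_dispersion(y), squared_iqr(y))
```

Both alternatives downweight extreme deviations, which is why they produce smoother series than the baseline measure.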
Using the alternative cross-sectional dispersion measures, I find smaller average losses under all loss criteria, except for MSE-prop when using the 5% trimmed estimator, confirming the improved in-sample forecasting accuracy in Table 2.

[23] Using the 5% trimmed estimator, the coefficient estimate of cross-sectional dispersion ($\pi$) is 0.0012 with an asymptotic standard error of 0.0015. With the estimators using the 10 CRSP Cap-Based Portfolios and the squared interquartile range, the estimates are close to 0.

Table 11 reports average losses of out-of-sample forecasts using the three alternative measures. As before, boldface denotes out-of-sample volatility forecasts with smaller average losses than univariate GARCH, and asterisks denote the statistical significance of the DMW test statistics. Under a 6-year estimation window, I find statistically significantly smaller losses from the 5% trimmed estimator for MAE-prop, and from the interquartile range estimator for four loss functions (QLIKE, MSE-LOG, MAE-LOG and MAE-prop), among ten cases with smaller losses than univariate GARCH. Under a 12-year estimation window, however, the evidence for improved forecasting accuracy becomes weaker: the CRSP Cap-Based Portfolio estimator yields statistically significantly smaller losses only for MAE-LOG, and the interquartile range estimator does so for two loss functions, MSE-LOG and MAE-LOG.

To sum up, I confirm the improved forecasting performance in sample when using the three alternative measures. However, I also confirm the weak evidence for cross-sectional dispersion being helpful in forecasting volatility out of sample, implying that cross-sectional dispersion does not directly enter the data-generating process given by GARCH-X.

5 Conclusion

This paper investigates the role of cross-sectional information in predicting aggregate volatility.
Given a large number of individual stocks, I develop a model of stock returns reflecting the natural idea that individual stocks respond to the common aggregate shock to different degrees. The model is simple, but it also provides a natural measure of cross-sectional dispersion, whose effects on stock market volatility and on cyclical variations in macroeconomic variables have been popular research topics.

Using individual stocks in the CRSP value-weighted index from 1954 to 2013, I test the direct contribution of cross-sectional dispersion to predicting stock market volatility. Although helpful for in-sample volatility forecasts, GARCH-X with cross-sectional dispersion fails to provide more accurate out-of-sample volatility forecasts than GARCH. In other words, I provide empirical evidence that cross-sectional dispersion does not enter the data-generating process for market volatility.

I further explore another possibility: that cross-sectional dispersion contributes to more accurate estimates of the model parameters. Using the full cross-section of individual stocks jointly, I estimate the parameters of the bivariate GARCH model of aggregate volatility and cross-sectional dispersion. I find that cross-sectional dispersion improves the accuracy of aggregate volatility forecasts both in sample and out of sample, and for all nine loss criteria. Furthermore, out-of-sample volatility forecasts from the bivariate GARCH model are shown to be more accurate than those from GARCH in times of NBER recessions as well as during periods with negative stock index returns.

Given the improved forecasting accuracy when using cross-sectional stock returns jointly, I conclude that cross-sectional dispersion does help predict volatility indirectly, by helping to estimate parameters. On the other hand, the empirical evidence from GARCH-X indicates that it does not enter the data-generating process directly.

References

Ang, A., R. J. Hodrick, Y. Xing, and X.
Zhang (2006): “The cross-section of volatility and expected returns,” Journal of Finance, 61, 259–299.
Bollerslev, T. (1986): “Generalized autoregressive conditional heteroskedasticity,” Journal of Econometrics, 31, 307–327.
Brailsford, T. J. and R. W. Faff (1996): “An evaluation of volatility forecasting techniques,” Journal of Banking and Finance, 20, 419–438.
Brownlees, C., R. Engle, and B. Kelly (2012): “A practical guide to volatility forecasting through calm and storm,” Journal of Risk, 14, 3.
Byun, S. and S. Jo (2014): “Heterogeneity in the dynamic effects of uncertainty on investment,” University of California, San Diego, working paper.
Campbell, J. Y., M. Lettau, B. G. Malkiel, and Y. Xu (2001): “Have individual stocks become more volatile? An empirical exploration of idiosyncratic risk,” Journal of Finance, 56, 1–43.
Connor, G., R. A. Korajczyk, and O. Linton (2006): “The common and specific components of dynamic volatility,” Journal of Econometrics, 132, 231–255.
Day, T. E. and C. M. Lewis (1992): “Stock market volatility and the information content of stock index options,” Journal of Econometrics, 52, 267–287.
Diebold, F. X. and R. S. Mariano (1995): “Comparing predictive accuracy,” Journal of Business and Economic Statistics, 13, 253–263.
Dunis, C. L., J. Laws, and S. Chauvin (2001): “The use of market data and model combination to improve forecast accuracy,” Development in Forecasts Combination and Portfolio Choice (Wiley, Oxford).
Engle, R. and K. Sheppard (2008): “Evaluating the specification of covariance models for large portfolios,” New York University, working paper.
Engle, R. F. (1982): “Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation,” Econometrica, 987–1007.
Engle, R. F., V. K. Ng, and M. Rothschild (1990): “Asset pricing with a factor-ARCH covariance structure: Empirical estimates for treasury bills,” Journal of Econometrics, 45, 213–237.
Franses, P. H. and D.
Van Dijk (1996): “Forecasting stock market volatility using (nonlinear) GARCH models,” Journal of Forecasting, 229–235.
Giacomini, R. and H. White (2006): “Tests of conditional predictive ability,” Econometrica, 74, 1545–1578.
Hamilton, J. D. (1994): Time series analysis, vol. 2, Princeton University Press, Princeton.
Hansen, P. R. and A. Lunde (2006): “Consistent ranking of volatility models,” Journal of Econometrics, 131, 97–121.
Hwang, S. and S. E. Satchell (2005): “GARCH model with cross-sectional volatility: GARCHX models,” Applied Financial Economics, 15, 203–216.
Jones, C. S. (2001): “Extracting factors from heteroskedastic asset returns,” Journal of Financial Economics, 62, 293–325.
Pagan, A. R. and G. W. Schwert (1990): “Alternative models for conditional stock volatility,” Journal of Econometrics, 45, 267–290.
Patton, A. J. (2011): “Volatility forecast comparison using imperfect volatility proxies,” Journal of Econometrics, 160, 246–256.
Poon, S.-H. and C. W. Granger (2003): “Forecasting volatility in financial markets: A review,” Journal of Economic Literature, 41, 478–539.
Schwert, G. W. and P. J. Seguin (1990): “Heteroskedasticity in stock returns,” Journal of Finance, 45, 1129–1155.
Stock, J. H. and M. W. Watson (2002): “Forecasting using principal components from a large number of predictors,” Journal of the American Statistical Association, 97, 1167–1179.
West, K. D. (1996): “Asymptotic inference about predictive ability,” Econometrica, 1067–1084.
West, K. D. and D. Cho (1995): “The predictive ability of several models of exchange rate volatility,” Journal of Econometrics, 69, 367–391.

Figure 1: Cross-sectional dispersion
[Plot: cross-sectional volatility (%) against time period]
Figure 1 plots historical cross-sectional volatility, which is the square root of cross-sectional dispersion across individual stocks following (10). Shaded areas represent NBER recession periods.
The cross-sectional dispersion is shown to be time-varying and highly persistent. It peaks during the stock market crash of October 1987, a feature that is also commonly observed in the alternative measures considered in the earlier literature.

Figure 2: Realized volatility
[Plot: realized volatility (%) against time period]
Figure 2 plots historical realized volatility, which is the square root of the realized variance calculated by aggregating CRSP daily value-weighted index returns following (11). Shaded areas represent NBER recession periods.

Figure 3: Cross-sectional market volatility
[Plot: cross-sectional market volatility (%) against time period]
Figure 3 plots the cross-sectional market volatility proposed by Hwang and Satchell (2005). The plotted series is the square root of the cross-sectional market volatility calculated from (18). Shaded areas represent NBER recession periods.

Figure 4: Alternative cross-sectional dispersion
[Plot in three panels: 5% Trimmed; CRSP Cap-Based Portfolios; Interquartile range]
Figure 4 plots the alternative cross-sectional dispersion measures. The top panel displays cross-sectional volatility after removing 5% extreme observations from both tails. The middle panel displays cross-sectional volatility constructed from the 10 CRSP Cap-Based Portfolios. The bottom panel displays the interquartile range in (10).

Table 1: Parameter estimates

                 GARCH-X                    GARCH
Parameters       MLE          (s.e.)        MLE          (s.e.)
$\varpi$         1.1848       (0.5104)      1.0554       (0.4355)
$\alpha$         0.1070       (0.0306)      0.1176       (0.0287)
$\beta$          0.8206       (0.0412)      0.8318       (0.0354)
$\pi$            0.0004       (0.0005)
$\phi_0$         0.7197       (0.1500)      0.7298       (0.1485)
$\phi_1$         0.0629       (0.0402)      0.0634       (0.0402)
Likelihood       −2,040.41                  −2,040.73

Table 1 reports MLE estimates (asymptotic standard errors) of model parameters in GARCH-X and GARCH models.
Asymptotic standard errors are estimated by approximating the second derivative of the log-likelihood functions at the MLE estimates. The last row reports the maximized log-likelihood values under the two volatility forecasting models.

Table 2: Comparison of in-sample forecasting accuracy

Criteria     $L(\sigma_{RV}^2, \sigma^2)$                                       GARCH-X     GARCH       Difference (%)
MSE          $(\sigma_{RV}^2 - \sigma^2)^2$                                     1,303.23    1,312.00    −0.67
QLIKE        $\sigma_{RV}^2/\sigma^2 - \log(\sigma_{RV}^2/\sigma^2) - 1$        3.77        3.78        −0.19
MSE-LOG      $(\log\sigma_{RV}^2 - \log\sigma^2)^2$                             0.81        0.84        −2.50
MSE-SD       $(\sigma_{RV} - \sigma)^2$                                         4.03        4.12        −2.19
MSE-prop     $(\sigma_{RV}^2/\sigma^2 - 1)^2$                                   2.76        2.75        0.42
MAE          $|\sigma_{RV}^2 - \sigma^2|$                                       13.57       13.74       −1.19
MAE-LOG      $|\log\sigma_{RV}^2 - \log\sigma^2|$                               0.75        0.75        −1.05
MAE-SD       $|\sigma_{RV} - \sigma|$                                           1.41        1.43        −1.23
MAE-prop     $|\sigma_{RV}/\sigma - 1|$                                         0.69        0.69        −0.87

Table 2 provides the comparison of in-sample volatility forecasts from GARCH-X and GARCH models. As a proxy for the unobservable volatility, historical realized variance is calculated from the daily CRSP value-weighted index return without cash dividends. Column 2 provides the definitions of the loss functions used for measuring volatility forecasting accuracy. The next two columns report average losses of in-sample volatility forecasts from the GARCH-X and GARCH models. The last column reports the percentage differences in forecasting accuracy of GARCH-X relative to GARCH, where a negative difference implies that GARCH-X has smaller average losses than GARCH.
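The mechanics behind the comparison in Table 2 can be sketched as below for the six criteria whose definitions are unambiguous in column 2 (MSE, MSE-LOG, MSE-SD, MAE, MAE-LOG and MAE-SD). The percentage-difference formula is inferred from the table entries, for example $100 \times (1{,}303.23 / 1{,}312.00 - 1) \approx -0.67$, and the function names are hypothetical; this is an illustration, not the paper's code.

```python
import math
import statistics

# Loss functions taking a realized variance (rv) and a variance forecast (s2).
LOSSES = {
    "MSE": lambda rv, s2: (rv - s2) ** 2,
    "MSE-LOG": lambda rv, s2: (math.log(rv) - math.log(s2)) ** 2,
    "MSE-SD": lambda rv, s2: (math.sqrt(rv) - math.sqrt(s2)) ** 2,
    "MAE": lambda rv, s2: abs(rv - s2),
    "MAE-LOG": lambda rv, s2: abs(math.log(rv) - math.log(s2)),
    "MAE-SD": lambda rv, s2: abs(math.sqrt(rv) - math.sqrt(s2)),
}


def average_loss(name, realized_var, forecast_var):
    """Average loss of a sequence of variance forecasts under the named criterion."""
    loss = LOSSES[name]
    return statistics.fmean(loss(rv, s2) for rv, s2 in zip(realized_var, forecast_var))


def pct_difference(loss_model, loss_benchmark):
    """Percentage difference of a model's average loss relative to the benchmark."""
    return 100.0 * (loss_model / loss_benchmark - 1.0)


if __name__ == "__main__":
    # Average MSE losses of GARCH-X and GARCH as printed in Table 2.
    print(round(pct_difference(1303.23, 1312.00), 2))
```

A negative value means the candidate model has the smaller average loss, matching the sign convention in the table.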
Table 3: Comparison of out-of-sample forecasting accuracy

             6 year window                              12 year window
Criteria     GARCH-X     GARCH       Difference (%)    GARCH-X     GARCH       Difference (%)
MSE          1,556.23    1,419.27    9.65***           1,743.40    1,593.86    9.38*
QLIKE        3.96        3.95        0.24              4.04        4.00        0.81*
MSE-LOG      0.96        0.95        1.25              0.88        0.85        3.20
MSE-SD       5.30        4.85        9.26***           5.57        5.02        10.88***
MSE-prop     6.37        5.04        26.43*            4.97        3.60        38.10*
MAE          16.26       15.40       5.56**            17.42       16.29       6.99**
MAE-LOG      0.80        0.80        −0.05             0.78        0.77        0.99
MAE-SD       1.60        1.56        2.30*             1.66        1.60        3.58*
MAE-prop     0.87        0.87        0.06              0.83        0.81        2.92

Table 3 provides the comparison of out-of-sample volatility forecasts from GARCH-X and GARCH models. A rolling scheme with a fixed estimation period was used for calculating out-of-sample volatility forecasts under two estimation window sizes: 6 years (columns 2-3) and 12 years (columns 5-6). Columns 4 and 7 report the percentage differences in forecasting accuracy of GARCH-X relative to univariate GARCH. A negative difference implies that GARCH-X has smaller average losses than univariate GARCH, and asterisks represent the statistical significance of the DMW equal predictability test suggested by Diebold and Mariano (1995) and West (1996). Given critical values of 1.28 (90%), 1.65 (95%) and 2.33 (99%), */**/*** represent statistical significance at 90%, 95% and 99% respectively.

Table 4: Parameter estimates

                 Full Model                     Model 1                        Model 2
Parameters       MLE            (s.e.)          MLE            (s.e.)          MLE            (s.e.)
$\varpi_1$       1.2661         (0.5288)        1.2662         (0.5287)        1.1594         (0.4725)
$\varpi_2$       98.5860        (0.5757)        98.5860        (0.5329)        98.5858        (0.5329)
$\alpha_{11}$    0.1043         (0.0338)        0.1043         (0.0338)        0.1207         (0.0296)
$\alpha_{12}$    0.0004         (0.0005)        0.0004         (0.0005)        −              −
$\alpha_{21}$    0.0000         (0.0319)        −              −               −              −
$\alpha_{22}$    0.2933         (0.0014)        0.2933         (0.0012)        0.2933         (0.0012)
$\beta_{11}$     0.8145         (0.0412)        0.8145         (0.0412)        0.8211         (0.0381)
$\beta_{22}$     0.7762         (0.0009)        0.7762         (0.0008)        0.7762         (0.0008)
Likelihood       −24,164,572.03                 −24,164,572.03                 −24,164,572.37

Table 4 reports MLE estimates (asymptotic standard errors) of the bivariate GARCH model.
Asymptotic standard errors are estimated by approximating the second derivative of the log-likelihood functions at the MLE estimates. For comparison, the table also reports estimation results under two restricted models: $\alpha_{21} = 0$ (Model 1) and $\alpha_{12} = \alpha_{21} = 0$ (Model 2). The last row reports the maximized log-likelihood values under the three models.

Table 5: Comparison of in-sample forecasting accuracy

Criteria     GARCH       Model 1     Difference (%)    Model 2     Difference (%)
MSE          1,312.00    1,265.80    −3.52             1,282.10    −2.28
QLIKE        3.78        3.76        −0.55             3.77        −0.29
MSE-LOG      0.84        0.79        −5.57             0.81        −2.78
MSE-SD       4.12        3.86        −6.49             3.98        −3.55
MSE-prop     2.75        2.70        −2.10             2.74        −0.49
MAE          13.74       13.20       −3.88             13.44       −2.17
MAE-LOG      0.75        0.73        −2.98             0.74        −1.72
MAE-SD       1.43        1.38        −3.72             1.40        −2.11
MAE-prop     0.69        0.67        −3.23             0.68        −1.86

Table 5 provides the comparison of in-sample volatility forecasts from univariate GARCH and the two bivariate GARCH models provided in Section 3.2. As a proxy for the unobservable volatility, historical realized variance is calculated from the daily CRSP value-weighted index return without cash dividends. Column 2 reports average losses of in-sample volatility forecasts from GARCH, as provided in Table 2. Columns 3 and 5 report average losses of in-sample volatility forecasts from the two bivariate GARCH models, and the adjacent columns report the percentage differences in forecasting accuracy of the two bivariate GARCH models relative to the univariate GARCH model, where a negative difference implies that the bivariate GARCH model has smaller average losses than GARCH.
Table 6: Comparison of out-of-sample volatility forecasts (6 years)

Criteria     Model 1     Difference (%)    Model 2     Difference (%)
MSE          1,388.92    −2.14             1,358.07    −4.31***
QLIKE        3.92        −0.84**           3.92        −0.94***
MSE-LOG      0.95        −0.55             0.92        −3.73*
MSE-SD       4.76        −1.74             4.58        −5.59***
MSE-prop     4.03        −20.15**          4.07        −19.17**
MAE          15.45       0.27              15.02       −2.51***
MAE-LOG      0.80        0.23              0.79        −1.07
MAE-SD       1.57        0.24              1.54        −1.77**
MAE-prop     0.82        −5.55***          0.83        −5.32***

Table 6 provides average losses of out-of-sample volatility forecasts from the two bivariate GARCH models introduced in Section 3.2. Given out-of-sample volatility forecasts using a 6-year estimation window, column 2 (column 4) reports average losses of one-month-ahead volatility forecasts from Model 1 (Model 2). Column 3 (column 5) reports the percentage differences in forecasting accuracy of Model 1 (Model 2) relative to univariate GARCH reported in Table 3. A negative difference implies that Model 1 (Model 2) has smaller average losses than univariate GARCH, and asterisks represent the statistical significance of the DMW equal predictability test suggested by Diebold and Mariano (1995) and West (1996). Given critical values of 1.28 (90%), 1.65 (95%) and 2.33 (99%), */**/*** represent statistical significance at 90%, 95% and 99% respectively.

Table 7: Comparison of out-of-sample volatility forecasts (12 years)

Criteria     Model 1     Difference (%)    Model 2     Difference (%)
MSE          1,505.45    −5.55**           1,520.94    −4.58*
QLIKE        3.97        −0.78**           3.97        −0.90**
MSE-LOG      0.82        −3.81**           0.83        −3.03**
MSE-SD       4.68        −6.77**           4.75        −5.38*
MSE-prop     3.39        −5.92             3.14        −12.83
MAE          15.60       −4.21**           15.94       −2.11
MAE-LOG      0.75        −3.08***          0.76        −1.41*
MAE-SD       1.54        −3.73**           1.57        −1.77*
MAE-prop     0.75        −7.30***          0.75        −6.45**

Table 7 provides average losses of out-of-sample volatility forecasts from the two bivariate GARCH models introduced in Section 3.2.
Given out-of-sample volatility forecasts using a 12-year estimation window, column 2 (column 4) reports average losses of one-month-ahead volatility forecasts from Model 1 (Model 2). Column 3 (column 5) reports the percentage differences in forecasting accuracy of Model 1 (Model 2) relative to univariate GARCH reported in Table 3. A negative difference implies that Model 1 (Model 2) has smaller average losses than univariate GARCH, and asterisks represent the statistical significance of the DMW equal predictability test suggested by Diebold and Mariano (1995) and West (1996). Given critical values of 1.28 (90%), 1.65 (95%) and 2.33 (99%), */**/*** represent statistical significance at 90%, 95% and 99% respectively.

Table 8: Tests for conditional predictive ability

             Recession                  Bear markets
Criteria     6 years      12 years      6 years      12 years
MSE          11.95***     5.62          7.28*        5.50
QLIKE        11.21**      7.06*         11.26**      7.18*
MSE-LOG      12.42***     12.35***      8.12**       9.66**
MSE-SD       19.13***     8.59**        19.71***     8.52**
MSE-prop     5.29         2.75          5.09         6.31*
MAE          8.36**       14.26***      8.20**       12.10***
MAE-LOG      25.33***     20.14***      22.99***     19.26***
MAE-SD       20.07***     18.33***      19.88***     18.64***
MAE-prop     11.32**      9.71**        11.35***     9.06**

Table 8 reports test statistics for conditional predictive ability proposed by Giacomini and White (2006). Under the null hypothesis of no conditional loss differentials, the test statistic is asymptotically chi-squared distributed with 3 degrees of freedom. Columns 2 and 3 report results for testing whether Model 2 outperforms univariate GARCH during recessions. Columns 4 and 5 report results for testing whether Model 2 outperforms univariate GARCH conditional on a negative return on the value-weighted stock market index. Given critical values with 3 degrees of freedom of 6.25 (90%), 7.81 (95%) and 11.34 (99%), */**/*** represent statistical significance at 90%, 95% and 99% respectively.
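The Wald-type statistic reported in Table 8 can be sketched as follows, using the instrument vector $h_t = (1, d_t, I_t)'$ and the products $h_t d_{t+1}$ for $t = 1, \ldots, T-1$. This is a minimal, dependency-free illustration under those definitions: the function names are hypothetical, the small linear solver stands in for a matrix inverse, and the inputs in the usage example are toy numbers rather than data from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gw_statistic(d, indicator):
    """GW = (T-1) Z' Omega^{-1} Z with h_t = (1, d_t, I_t)' and z_t = h_t * d_{t+1}."""
    n = len(d) - 1  # number of usable (h_t, d_{t+1}) pairs, i.e. T - 1
    z = [[d[t + 1], d[t] * d[t + 1], indicator[t] * d[t + 1]] for t in range(n)]
    z_bar = [sum(row[k] for row in z) / n for k in range(3)]
    omega = [[sum(row[i] * row[j] for row in z) / n for j in range(3)]
             for i in range(3)]
    return n * sum(zb * s for zb, s in zip(z_bar, solve_linear_system(omega, z_bar)))


def gw_stars(stat):
    """Asterisks from the chi-squared(3) critical values quoted in the Table 8 caption."""
    for critical_value, stars in ((11.34, "***"), (7.81, "**"), (6.25, "*")):
        if stat > critical_value:
            return stars
    return ""


if __name__ == "__main__":
    # Toy loss differentials and indicator values for illustration only.
    d = [0.3, -0.1, 0.5, 0.2, -0.4, 0.1, 0.6, -0.2, 0.35, -0.15]
    recession = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1]
    stat = gw_statistic(d, recession)
    print(stat, gw_stars(stat))
```

Swapping the recession indicator for an indicator of negative market returns gives the bear-market version of the test with no other change.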
Table 9: Robustness checks - Hwang and Satchell

            In-sample               6 years                 12 years
Criteria    GARCH-X     GARCH-XC    GARCH-X     GARCH-XC    GARCH-X     GARCH-XC
MSE         1,276.87    1,273.91    1,541.30    1,549.40    1,679.30    1,677.90
QLIKE       3.75        3.74        3.98        3.98        4.05        4.05
MSE-LOG     0.74        0.73        0.92***     0.92***     0.87        0.87
MSE-SD      3.78        3.81        5.04        5.09        5.28        5.27
MSE-prop    2.71        2.59        7.40        7.44        5.82        5.86
MAE         13.03       13.22       15.41       15.44       16.59       16.58
MAE-LOG     0.72        0.71        0.78***     0.78***     0.78        0.77
MAE-SD      1.35        1.36        1.55*       1.54*       1.62        1.61
MAE-prop    0.66        0.66        0.91        0.90        0.87        0.86

Table 9 reports average losses of volatility forecasts from GARCH-X and GARCH-XC. Following Hwang and Satchell (2005), I consider the cross-sectional market volatility calculated from (18), and the GARCH-cross-sectional (GARCH-XC) model specified as in (19). Using the cross-sectional market volatility as the measure of cross-sectional dispersion, columns 2 and 3 report average in-sample losses from the GARCH-X and GARCH-XC models. Columns 4 and 5 (columns 6 and 7) report average losses of out-of-sample volatility forecasts from the GARCH-X and GARCH-XC models when using a 6-year (12-year) estimation window. For each loss function, boldface entries represent more accurate volatility forecasts from GARCH-X (or GARCH-XC) than GARCH. For out-of-sample volatility forecasts, the statistical significance of the equal predictive ability test (DMW) is denoted by asterisks. With critical values of 1.28 (90%), 1.65 (95%) and 2.33 (99%), */**/*** denote statistical significance at the 90%, 95% and 99% levels, respectively.

Table 10: Robustness checks - alternative measures (in-sample)

Criteria    5%          Cap         IQR
MSE         1,304.17    1,311.92    1,311.92
QLIKE       3.77        3.78        3.78
MSE-LOG     0.81        0.83        0.83
MSE-SD      4.04        4.12        4.12
MSE-prop    2.79        2.75        2.75
MAE         13.58       13.73       13.73
MAE-LOG     0.75        0.75        0.75
MAE-SD      1.41        1.43        1.43
MAE-prop    0.69        0.69        0.69

Table 10 reports robustness checks for in-sample volatility forecasts from GARCH-X.
Given concerns that the cross-sectional dispersion measure is noisy when all individual stock returns are used, we consider three alternative measures as the covariate in the GARCH-X model. In (10), we calculate 1) the trimmed cross-sectional dispersion (5%), where 5% of extreme observations are removed from both tails at each t, 2) the cross-sectional dispersion using the 10 CRSP Cap-Based Portfolios (Cap), and 3) the squared interquartile range (IQR). Boldface entries have lower average losses than the univariate GARCH for each loss function. Here, we find smaller average losses with the alternative measures under all loss criteria except MSE-prop when using the 5% trimmed cross-sectional dispersion.

Table 11: Robustness checks - alternative measures (out-of-sample)

            6 years                             12 years
Criteria    5%          Cap         IQR         5%          Cap         IQR
MSE         1,583.42    1,865.05    1,517.78    1,780.54    1,956.36    1,602.34
QLIKE       3.95        3.96        3.94***     4.03        4.02        4.01
MSE-LOG     0.95        0.91        0.93***     0.88        0.84        0.83*
MSE-SD      5.31        5.69        5.05        5.63        5.85        5.03
MSE-prop    5.72        5.49        5.38        4.45        4.29        3.82
MAE         16.31       16.73       15.74       17.57       17.60       16.22
MAE-LOG     0.80        0.77        0.79**      0.78        0.75***     0.76*
MAE-SD      1.60        1.57        1.56        1.67        1.62        1.58
MAE-prop    0.86**      0.90        0.86*       0.83        0.82        0.82

Table 11 reports robustness checks for out-of-sample volatility forecasts from GARCH-X using three alternative cross-sectional dispersion measures. Boldface entries have lower average losses than the univariate GARCH for each loss function, and the statistical significance of the DMW test statistics is denoted by asterisks. With critical values of 1.28 (90%), 1.65 (95%) and 2.33 (99%), */**/*** denote statistical significance at the 90%, 95% and 99% levels, respectively.

Appendix

A.1. Summary Statistics

In this section, I provide summary statistics of the individual stock returns that are used for constructing cross-sectional dispersion across stock returns following (10).
To describe the cross-sectional distribution of historical individual stock returns, I first construct time-series summary statistics for each individual stock, and then compute cross-sectional summary statistics across the stocks in the universe. Table A1 reports the cross-sectional distribution of each individual time-series summary statistic, summarized by range statistics (Min, Q1, Median, Q3 and Max), the cross-sectional average and the cross-sectional standard deviation. On average, individual stocks have appeared in the monthly stock index for 13 years (158 months), while the median corresponds to about 10 years (117 months). Of note, there are only 118 individual stock returns spanning the 60 years of the total sample period, since firms listed on NASDAQ began to be included in the CRSP database at the beginning of 1973. Next, positive historical average returns (16,775 stocks or 77.94%) and excess kurtosis (20,611 stocks or 95.76%) are common across individual stock returns. On the other hand, the cross-sectional distributions of returns and kurtosis exhibit large heterogeneity across individual stock returns. For example, while the historical individual stock return is about 0.89% on average (median of about 1.03%), there are a few firms with either large positive or large negative historical average returns. Lastly, there are 18,310 firms (85.07%) with positively skewed historical returns, which is seemingly inconsistent with the frequently documented negative skewness of stock index returns. Though interesting, I do not pursue this finding further as it is not relevant to the goal of this paper.
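The two-step construction described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used for Table A1: the function names and the toy panel are hypothetical, moments are computed with population (divide-by-n) formulas, kurtosis is the raw fourth standardized moment (so 3 corresponds to the normal distribution), and quartiles use the default method of Python's `statistics.quantiles`.

```python
import statistics as st


def time_series_stats(returns):
    """Step 1: time-series summary statistics for one stock."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    m2 = sum(d ** 2 for d in dev) / n  # population variance (assumption)
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    return {
        "periods": n,
        "mean": mean,
        "stdev": m2 ** 0.5,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # raw kurtosis, not excess kurtosis
    }


def cross_sectional_summary(values):
    """Step 2: distribution of one statistic across all stocks."""
    q1, median, q3 = st.quantiles(values, n=4)
    return {
        "Min": min(values), "Q1": q1, "Median": median, "Q3": q3,
        "Max": max(values), "Mean": st.mean(values), "Stdev": st.stdev(values),
    }


# Hypothetical unbalanced panel of monthly returns (in percent).
panel = {
    "A": [1.0, -2.0, 3.0, 0.5],
    "B": [0.2, 0.1, -0.3, 0.4, 0.6],
    "C": [5.0, -4.0, 2.0],
}
per_stock = {name: time_series_stats(r) for name, r in panel.items()}
summary = cross_sectional_summary([s["mean"] for s in per_stock.values()])
print(summary)
```

Each row of a table like Table A1 would correspond to one call of `cross_sectional_summary` on a different per-stock statistic (number of periods, mean, standard deviation, skewness, kurtosis).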
Table A1: Descriptive Statistics

                      Min       Q1       Median   Q3       Max       Mean      Stdev
Time periods          36        68       117      204      720       157.92    125.91
Mean                  −14.00    0.13     1.03     1.81     27.63     0.89      1.82
Standard deviation    0.23      10.12    14.88    20.87    204.78    16.66     9.75
Skewness              −8.14     0.27     0.77     1.44     14.81     0.99      1.29
Kurtosis              1.81      4.23     5.67     8.75     248.63    8.54      10.40

Table A1 provides the summary statistics for individual stock returns used for constructing a series of the monthly stock index returns.

A.2. Changes in stock index universe

Let $N_t$ be the total number of stocks, and let $N_t^A$ and $N_t^D$ be the numbers of stocks that are added to and dropped from the stock index universe in period $t$, respectively. Denoting by $N_t^R$ the number of stocks that are carried over from $t-1$ to $t$, the laws of motion are given by 1) $N_t = N_t^R + N_t^A$ and 2) $N_t^R = N_{t-1} - N_{t-1}^D$. The former indicates that the stocks in the stock index universe in period $t$ are either those remaining from the previous period $t-1$ or those newly introduced into the universe. The latter indicates that the stocks carried over from the previous period are those left after excluding the stocks dropped at the end of period $t-1$. Figure A1 plots the transition of the stock index universe. There are two periods in which a large number of stocks were added to the universe: 737 stocks (39.29%) were newly introduced in August 1962 and 2,281 stocks (46.51%) were added in January 1973. Apart from these two events, the total number of stocks has changed gradually over time.

Figure A1: The dynamics of the stock index universe
[Figure: vertical axis "number of stocks"; horizontal axis "time period"; series: Total number of stocks ($N_t$), Added stocks ($N_t^A$), Dropped stocks ($N_t^D$)]
Note: This figure plots the total number of stocks (solid line), the number of stocks added to (long dashed line) and dropped from (short dashed line) the stock index universe from February 1954 to December 2013.

B.1.
Empirical Procedure

In this section, I describe the parameter estimation procedure of the proposed model, which was omitted from Section 3.2. To clarify the dimensions of vectors and matrices, I use a single underline below a variable to denote a vector and a double underline to denote a matrix. To begin with, consider the individual forecasting errors obtained from the AR(1) forecasting model of returns (12): for each stock $i$, the individual forecasting error $u_{i,t}$ is obtained by regressing $r_{i,t}$ on a constant and its lagged return $r_{i,t-1}$ for $t = 2, \ldots, T$. Let $\underline{u}_t \equiv [u_{1,t}, \ldots, u_{N_t,t}]'$ be the collection of individual forecasting errors at period $t$. Denoting by $\underline{\underline{\Omega}}_t \equiv E[\underline{u}_t \underline{u}_t' \mid \mathcal{F}_{t-1}]$ the $N_t \times N_t$ variance-covariance matrix of individual forecasting errors in period $t$, the joint log-likelihood function of individual stock returns becomes

$$L = \sum_{t=1}^{T} \left[ -\frac{N_{t-1}}{2} \log 2\pi - \frac{1}{2} \log \left| \underline{\underline{\Omega}}_t \right| - \frac{1}{2}\, \underline{u}_t' \, \underline{\underline{\Omega}}_t^{-1} \underline{u}_t \right]$$

In general, numerical maximization of the above log-likelihood by iterative methods can be quite costly, since it requires an inversion and a determinant calculation of the $N_t \times N_t$ matrix $\underline{\underline{\Omega}}_t$ for each period $t$. Here I overcome this empirical intractability issue by modeling each individual forecasting error with the factor structure provided by equation (13). Since $\underline{\underline{\Omega}}_t$ is a symmetric matrix that is factored by a vector of individual weights, the analytical forms of its determinant and inverse are given by

$$\left| \underline{\underline{\Omega}}_t \right| = \left(\kappa_t^2\right)^{(N_{t-1}-1)} \cdot \left(\kappa_t^2 + N_{t-1}\sigma_t^2\right) \cdot \prod_{i=1}^{N_{t-1}} \frac{1}{N_{t-1}^2 w_{i,t-1}^2}$$

$$\underline{\underline{\Omega}}_t^{-1}(i,j) = \begin{cases} \dfrac{N_{t-1}^2 w_{i,t-1}^2 \cdot \left(\kappa_t^2 + (N_{t-1}-1)\sigma_t^2\right)}{\kappa_t^2 \cdot \left(\kappa_t^2 + N_{t-1}\sigma_t^2\right)} & \text{for } j = i \\[2ex] -\dfrac{N_{t-1}^2 w_{i,t-1} w_{j,t-1} \sigma_t^2}{\kappa_t^2 \cdot \left(\kappa_t^2 + N_{t-1}\sigma_t^2\right)} & \text{for } j \neq i \end{cases}$$

where $\underline{\underline{\Omega}}_t^{-1}(i,j)$ is the $(i,j)$th element of the inverse matrix $\underline{\underline{\Omega}}_t^{-1}$. Then, the closed-form log-likelihood of Panel-GARCH is given by

$$L = \sum_{t=1}^{T} \left\{ -\frac{N_{t-1}}{2} \log 2\pi - \frac{1}{2} \log J_t - \frac{N_{t-1}^2}{2\kappa_t^2} \left[ \sum_{i=1}^{N_t} \left(w_{i,t-1} u_{i,t}\right)^2 - \frac{\sigma_t^2}{\kappa_t^2 + N_{t-1}\sigma_t^2} \left( \sum_{i=1}^{N_t} w_{i,t-1} u_{i,t} \right)^2 \right] \right\}$$

where $J_t \equiv \left(\kappa_t^2 + N_{t-1}\sigma_t^2\right) \times \left(\kappa_t^2\right)^{(N_{t-1}-1)} \times \prod_{i=1}^{N_{t-1}} \left(w_{i,t-1} N_{t-1}\right)^{-2}$ is the determinant of the $N_{t-1} \times N_{t-1}$ variance matrix of individual forecasting errors. This log-likelihood can be evaluated numerically along with the bivariate GARCH process, providing maximum likelihood estimates of the parameters.

C.1. Squared return proxy

In this subsection, I compare forecasting accuracy across models using squared returns instead of realized variance. For evaluating the performance of volatility forecasting models, monthly squared returns have been widely adopted as a proxy for latent volatility. See, for example, Pagan and Schwert (1990), Day and Lewis (1992) and Franses and Van Dijk (1996) for previous applications using squared returns. The squared return proxy is obtained by squaring the residuals from the AR(1) forecasting model of the aggregate return in (2). Given estimates of $\phi_0$ and $\phi_1$, one can obtain squared returns as follows: for each $t$,

$$u_t^2 = \left( r_t - \hat{\phi}_0 - \hat{\phi}_1 r_{t-1} \right)^2$$

Using the squared return proxy, I compare forecasting accuracy across the univariate GARCH, GARCH-X and the two bivariate models proposed in Section 3.2. The results parallel Tables 2 and 5 (Table C1, in-sample), Table 3 (columns 2 - 4) and Table 6 (Table C2, out-of-sample using a 6-year estimation window), and Table 3 (columns 5 - 7) and Table 7 (Table C3, out-of-sample using a 12-year estimation window).
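The construction of the squared return proxy can be sketched in a few lines. This is only an illustration: it assumes the AR(1) coefficients are estimated by ordinary least squares (the text does not state the estimator), and the function name `ar1_squared_residuals` and the toy return series are hypothetical.

```python
def ar1_squared_residuals(r):
    """Fit r_t = phi0 + phi1 * r_{t-1} + u_t by OLS; return (phi0, phi1, u_t^2).

    Assumption: plain OLS on the aggregate return series; the first
    observation is lost to the lag, so the proxy has len(r) - 1 entries.
    """
    y, x = r[1:], r[:-1]
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    phi1 = sxy / sxx
    phi0 = y_bar - phi1 * x_bar
    u2 = [(yi - phi0 - phi1 * xi) ** 2 for xi, yi in zip(x, y)]
    return phi0, phi1, u2


# Hypothetical monthly aggregate returns (in percent).
returns = [1.2, -0.8, 2.5, 0.3, -3.1, 1.9, 0.7, -1.4]
phi0, phi1, proxy = ar1_squared_residuals(returns)
print(phi0, phi1)
print(proxy)  # squared AR(1) residuals used as the volatility proxy
```

The resulting series plays the role of the target against which each model's variance forecast is scored in Tables C1 to C3.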
Table C1: Comparison of in-sample forecasting accuracy

            GARCH       GARCH-X          Model 1          Model 2
Criteria    Average     Difference (%)   Difference (%)   Difference (%)
MSE         1,345.70    0.13             −0.67            −0.68
QLIKE       3.86        −0.02            −0.14            −0.13
MSE-LOG     6.44        −0.16            −0.62            −0.48
MSE-SD      8.38        −0.03            −1.18            −1.07
MSE-prop    3.81        0.56             3.17             3.61
MAE         18.87       −0.13            −1.10            −0.92
MAE-LOG     1.77        −0.10            −0.57            −0.49
MAE-SD      2.29        −0.15            −1.00            −0.84
MAE-prop    1.01        0.04             −0.08            −0.12

Table C1 provides the comparison of in-sample volatility forecasts from GARCH, GARCH-X, and the two bivariate GARCH models provided in Section 3.2. As a proxy for unobservable volatility, the squared return is used. Column 2 reports average losses of in-sample volatility forecasts from GARCH, and the following three columns report the percentage differences in forecasting accuracy of GARCH-X and the two bivariate GARCH models relative to the univariate GARCH model. A negative difference implies that the considered model has smaller average losses than GARCH.

Table C2: Comparison of out-of-sample forecasting accuracy (6 years)

            GARCH       GARCH-X          Model 1          Model 2
Criteria    Average     Difference (%)   Difference (%)   Difference (%)
MSE         1,586.90    8.68***          −0.47            −1.54
QLIKE       3.99        0.22             −0.69**          −0.65**
MSE-LOG     6.21        1.24***          1.40*            0.87
MSE-SD      9.61        6.43***          −0.06            −1.79
MSE-prop    5.44        9.53             −13.55***        −11.61***
MAE         20.88       4.38***          1.15*            −0.95
MAE-LOG     1.77        0.43             0.54             −0.25
MAE-SD      2.41        1.86**           0.76             −0.78
MAE-prop    1.17        0.47             −4.13***         −3.74***

Table C2 provides the comparison of out-of-sample volatility forecasts from GARCH, GARCH-X, and the two bivariate GARCH models provided in Section 3.2. One-month-ahead volatility forecasts are calculated from parameter estimates using a 6-year estimation window. As a proxy for unobservable volatility, the squared return is used.
Column 2 reports average losses of out-of-sample volatility forecasts from GARCH, and the following three columns report the percentage differences in forecasting accuracy of GARCH-X and the two bivariate GARCH models relative to the univariate GARCH model. A negative difference implies that the considered model has smaller average losses than GARCH.

Table C3: Comparison of out-of-sample forecasting accuracy (12 years)

            GARCH       GARCH-X          Model 1          Model 2
Criteria    Average     Difference (%)   Difference (%)   Difference (%)
MSE         1,687.60    4.86***          −1.44***         −0.85***
QLIKE       4.06        −0.20            −0.70            −0.86*
MSE-LOG     6.57        −0.19            0.30             −0.40*
MSE-SD      10.03       4.20***          −1.09*           −1.51***
MSE-prop    5.51        3.48             −2.20            −17.47**
MAE         21.63       3.43**           0.29             0.25
MAE-LOG     1.80        −0.11            0.01             0.01
MAE-SD      2.49        1.17             0.03             0.00
MAE-prop    1.14        −1.87            −4.38***         −4.04***

Table C3 provides the comparison of out-of-sample volatility forecasts from GARCH, GARCH-X, and the two bivariate GARCH models provided in Section 3.2. One-month-ahead volatility forecasts are calculated from parameter estimates using a 12-year estimation window. As a proxy for unobservable volatility, the squared return is used. Column 2 reports average losses of out-of-sample volatility forecasts from GARCH, and the following three columns report the percentage differences in forecasting accuracy of GARCH-X and the two bivariate GARCH models relative to the univariate GARCH model. A negative difference implies that the considered model has smaller average losses than GARCH.
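The table notes do not spell out how the percentage differences are computed. A natural reading, stated here as an assumption, is the relative change in average loss with respect to the univariate GARCH benchmark, where $\bar{L}_m$ denotes the average loss of model $m$ (notation introduced only for this illustration):

$$\text{Difference}_m\,(\%) = 100 \times \frac{\bar{L}_m - \bar{L}_{\text{GARCH}}}{\bar{L}_{\text{GARCH}}}$$

Under this reading, the MSE row of Table C3 (GARCH average of 1,687.60 and a GARCH-X difference of 4.86%) would imply an average GARCH-X loss of roughly $1{,}687.60 \times 1.0486 \approx 1{,}769.6$, consistent with the statement that a positive difference means larger average losses than GARCH.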