Finite sample performance of small versus large scale dynamic factor models

Rocio Alvarez
Maximo Camacho†
Gabriel Perez-Quiros
Universidad de Alicante
Universidad de Murcia
Banco de España and CEPR
[email protected]
[email protected]
[email protected]
Abstract

We examine the finite-sample performance of small versus large scale dynamic factor models. Monte Carlo analysis reveals that small scale factor models outperform large scale models in factor estimation and forecasting for high levels of cross-correlation across the idiosyncratic errors of series from the same category, for oversampled categories, and especially for high persistence in either the common factor series or the idiosyncratic errors. In these situations, pre-selecting the series to be used in a small scale dynamic factor model is more accurate, even for arbitrary selections of the original set of indicators.

Keywords: Business Cycles, Output Growth, Time Series.
JEL Classification: E32, C22, E27.
Maximo Camacho thanks Fundacion Ramon Areces for financial support. The views in this paper are those of the authors and do not represent the views of the Bank of Spain or the Eurosystem.
† Corresponding Author: Universidad de Murcia, Facultad de Economia y Empresa, Departamento de Metodos Cuantitativos para la Economia y la Empresa, 30100, Murcia, Spain. E-mail: [email protected].
1 Introduction
Two versions of dynamic factor models have received growing attention in the recent forecasting literature. On the one hand, forecasts have been computed from different enlargements of the Stock and Watson (1991) single-index small scale dynamic factor model (SSDFM). Examples are Mariano and Murasawa (2003), Nunes (2005), Aruoba, Diebold and Scotti (2009), and Camacho and Perez Quiros (2010), whose strict factor models are estimated by maximum likelihood using the Kalman filter under the assumption of non-cross-correlated idiosyncratic errors. On the other hand, forecasts have been computed from different sophistications of the seminal Stock and Watson (2002a) principal components estimator, which combines the information of many predictors. Examples of forecasts from the so-called large scale dynamic factor models (LSDFM) are Forni, Hallin, Lippi and Reichlin (2005), Giannone, Reichlin and Small (2008), Angelini et al (2008), etc. These approximate factor models lead to asymptotically consistent estimates when the number of variables and observations tends to infinity, under the assumptions that the cross-correlation of the idiosyncratic components is weak and that the variability of the common component is not too small.
Much theoretical attention has been devoted to large scale factor models by stressing that strict factor models rely on the tight assumption that the idiosyncratic components are cross-sectionally orthogonal. However, including additional time series in empirical applications frequently carries non-negligible costs as well. According to Boivin and Ng (2006), the large data sets used by LSDFM are typically drawn in practice from a small number of broad categories (such as industrial production, or monetary and price indicators). Since the idiosyncratic errors of time series belonging to a particular category are expected to be highly correlated, the assumption of weak correlation among the idiosyncratic components is more likely to fail as the number of time series from this category increases. In addition, the good asymptotic properties suggested by the theory may not hold in many empirical applications when the number of variables and observations is relatively small.¹
¹ Recently, Boivin and Ng (2006) for the US and Banbura and Runstler (2007) for the Euro area show that the predictive content of empirical large scale factor models is contained in the factors extracted from as few as about 40 series.

The impact of this potential confrontation between asymptotic theory and empirical applications has rarely been addressed. Among the exceptions, Stock and Watson (2002b) find a deterioration in the performance of (static) large scale factor models when the degree of serial correlation and (to a lesser extent) heteroskedasticity among the idiosyncratic errors is large and when the serial correlation of the factors is high. Boivin and Ng (2006) use (static) large scale factor models to show that including series that are highly correlated with those of the same category does not necessarily outperform models that exclude these series. Boivin and Ng (2006) for the US and Caggiano, Kapetanios, and Labhard (2009) for some Euro area countries estimate large scale (static) factor models of different dimensions to show that factors extracted from pre-screened series often yield satisfactory or even better results than using larger sets of series. Their preferred data sets sometimes include one fifth of the original set of indicators. Bai and Ng (2008) find improvements over a baseline large scale (static) factor model by estimating the factors using fewer but informative predictors. Finally, Banbura and Runstler (2007) use a (dynamic) large scale model to show that forecast weights are concentrated among a relatively small set of Euro area indicators.

Of all these previous works, the one closest to our approach is Boivin and Ng (2006). However, even though we think that their analysis is very complete, we share the view of Aruoba, Diebold and Scotti (2009), who conclude that comparative assessments of the small sample properties of factor estimation and forecasting from "small data" versus "big data" dynamic factor models are a good place to develop further empirical analyses. In that sense, we depart from Boivin and Ng (2006), first, because our purpose is not to determine the optimal number of variables in a large dataset but to shed some light on the dilemma of which is the optimal strategy when dealing with a forecasting problem: to start from a simple small model and enlarge it if necessary, or to go directly to a large scale model and try to eliminate the redundant information.² Second, Boivin and Ng (2006) consider static models, while we compare dynamic specifications. In particular, we take as our framework for the large scale model the dynamic model of Giannone, Reichlin and Small (2008), while they consider the static model of Stock and Watson (2002a). This is an important feature of our analysis because we will address the question of how persistence in the factors or in the idiosyncratic shocks affects how appropriate the different specifications are. Third, even though they mention the word "categories" in the motivation of their work, referring to the different sectors or types of data in the economy (prices, production, etc.), they classify the data in general according to their correlation or their heteroskedastic behavior. We concentrate in detail on considering and simulating the effects of having different sectors in the economy, considering cross-correlation across sectors, inside each sector, and both jointly, and its effects on the estimation of the factors.

² The estimation strategy also determines the techniques to be used in the estimation. Most large scale model techniques require a sufficiently large number of series to have good properties. Therefore, a small scale model cannot be estimated as a particular case of the large scale specification but requires a different estimation strategy.
Specifically, in this paper we develop simulations in which we try to mimic different empirical forecasting scenarios. The first scenario is the case in which an analyst uses a SSDFM to estimate the factors and to compute the forecasts from a small number of pre-screened series, which are the main (least noisy) indicators of the different categories of data. In the second scenario, a SSDFM is again used for factor extraction and forecast computation, but from a less accurate pre-screening which includes a small number of noisier indicators: the middle series of each category when sorted by increasing variance. In the final scenario, a large scale data set is generated by including additional series in each category under the assumption that the additional series are finer disaggregations of the main indicator, with which they are correlated. From this large set of indicators, a LSDFM is used to compute the forecasts.
Using averaged squared errors, we evaluate the accuracy of these three forecasting proposals in estimating the factors and in computing out-of-sample forecasts of a target variable. We find that adding data that bear little information about the factor components does not necessarily lead large scale dynamic factor models to improve upon the forecasts of small scale dynamic factor models. In fact, we show that when the additional data are too correlated with data from some categories which are already included in the factor estimation, forecasting with many predictors performs worse than forecasting from a reasonably pre-screened dataset, especially when the categories are not highly correlated. We also address the role of the persistence of the factor in determining the best forecasting method.

This paper proceeds as follows. Section 2 describes both small and large scale dynamic factor models. Section 3 presents the design details of the simulation exercise, i.e., how to generate the main series of each category and the finer disaggregations. Section 4 shows the main findings in the comparison between SSDFM and LSDFM for different parameter values. Section 5 concludes.
2 Dynamic factor models
Large and small scale factor models can be represented in a similar general framework. Let $y_t$ be a scalar time series variable to be forecasted and let $X_t = (X_{1t},\dots,X_{Nt})'$, with $t = 1,\dots,T$, be the observed stationary time series which are candidate predictors of $y_t$. If we are interested in one-step-ahead predictions, the baseline model can be stated as

$$y_{t+1} = \beta_0 + \beta' X_t + \sum_{j=1}^{p} \gamma_j y_{t-j+1} + \varepsilon^y_{t+1}, \qquad (1)$$

where $\beta = (\beta_1,\dots,\beta_N)'$, and $\varepsilon^y_{t+1}$ is a zero mean white noise.
Since estimating this expression becomes impractical as the number of predictors increases, it is standard to assume that each predictor $X_{it}$ has zero mean and admits a factor structure:

$$X_{it} = \lambda_i' F_t + \varepsilon_{it}, \qquad (2)$$

for the $i$th cross-section unit at time $t$, $i = 1,\dots,N$, and $t = 1,\dots,T$. In this framework the $r \times 1$ vector $F_t$ contains the $r$ common factors, $\lambda_i = (\lambda_{i1},\dots,\lambda_{ir})'$ the $r$ factor loadings, $\lambda_i' F_t$ the common components, and $\varepsilon_{it}$ the idiosyncratic errors. In vector notation the model can be written as

$$X_t = \Lambda F_t + \varepsilon_t, \qquad (3)$$

where $\Lambda = (\lambda_{ij})$ is the $N \times r$ matrix of factor loadings and $\varepsilon_t$ is the vector of $N$ idiosyncratic shocks. In the related literature, it is standard to assume that the vectors $F_t$ and $\varepsilon_t$ are
serially and cross-sectionally uncorrelated unobserved stationary processes.³ In contrast to static factor models, the dynamics of the common factors are supposed to follow a $VAR(1)$ process

$$F_t = A F_{t-1} + u_t, \qquad (4)$$

where $A$ is the $r \times r$ matrix of coefficients, with $E[u_t] = 0$ and $E[u_t u_t'] = \Sigma_u$. In addition, $\varepsilon_t$ is assumed to follow a stationary $VAR(1)$ process with mean zero:

$$\varepsilon_t = C \varepsilon_{t-1} + v_t, \qquad (5)$$

where $v_t$ is independent with $E[v_t] = 0$ and $E[v_t v_t'] = \Sigma_v$.⁴ Then, the objective variable $y_t$ can be forecasted through the common factors by using the expression

$$y_{t+1} = \beta_0 + \beta' F_t + \sum_{j=1}^{p} \gamma_j y_{t-j+1} + e^y_{t+1}. \qquad (6)$$

Finally, let us call the model a small scale dynamic factor model (SSDFM) when $N$ is fixed and small and $T$ is large, and a large scale dynamic factor model (LSDFM) when both $N$ and $T$ are large.
2.1 Small scale dynamic factor models
The baseline model is the single-index dynamic factor model of Stock and Watson (1991), which can be written in state-space form. Accordingly, the autoregressive parameter $A$, the vector of the $N$ factor loadings $\Lambda$, and the $(N \times N)$ covariance matrix of the idiosyncratic shocks, $\Sigma_v$, can be estimated by maximum likelihood via the Kalman filter.⁵ Let $h_t$ be the $(N+1) \times 1$ vector $h_t = (F_t, \varepsilon_t')'$, let $I_j$ be the identity matrix of dimension $j$, and let $0_j$ be the vector of $j$ zeroes. Hence, the measurement equation can be defined as

$$X_t = H h_t + e_t, \qquad (7)$$

where

$$H = \begin{pmatrix} \Lambda & I_N \end{pmatrix}, \qquad (8)$$

and $e_t$ is a vector of $N$ zeroes. In addition, the transition equation can be stated as

$$h_{t+1} = F h_t + w_t, \qquad (9)$$

where the $(N+1) \times (N+1)$ matrix $F$ is

$$F = \begin{pmatrix} A & 0_N' \\ 0_N & C \end{pmatrix}, \qquad (10)$$

and $w_t = (u_t, v_t')'$ with zero mean and covariance matrix

$$Q = \begin{pmatrix} \Sigma_u & 0 \\ 0 & \Sigma_v \end{pmatrix}. \qquad (11)$$

In the standard way, the Kalman filter also produces filtered and smoothed inferences of the common factor, $\{F^s_{t|t}\}_{t=1}^T$ and $\{F^s_{t|T}\}_{t=1}^T$. These inferences can be used in the prediction equation (6) to compute OLS forecasts of the variable $y_{t+1}$.

³ In this framework the common factor is supposed to generate most of the cross-correlation between the series of the data set $\{X_{it}\}_{i=1}^N$.
⁴ Although assuming $VAR(p)$ dynamics for the factors and the idiosyncratic components is straightforward, it would complicate the notation.
⁵ As usual, $\Sigma_u$ is assumed to be one for identification purposes.
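As an illustration, the filtered factor inference for the single-index case ($r = 1$) can be sketched as follows. This is our own minimal sketch, not the authors' code: the function name, the diffuse initialization $P_0 = I$, and the use of a pseudo-inverse (the measurement equation has no noise, so the innovation covariance can be ill-conditioned) are our choices.

```python
import numpy as np

def kalman_filter_factor(X, Lam, A, C, Sigma_v):
    """Filtered inference of the single-index factor from the state-space
    form (7)-(11). Lam: N-vector of loadings, A: factor AR coefficient,
    C: diagonal of the idiosyncratic AR matrix, Sigma_v: idiosyncratic
    innovation covariance; Sigma_u = 1 for identification."""
    T, N = X.shape
    # Transition matrix F = blockdiag(A, C) and innovation covariance Q
    F = np.zeros((N + 1, N + 1))
    F[0, 0] = A
    F[1:, 1:] = np.diag(C)
    Q = np.zeros((N + 1, N + 1))
    Q[0, 0] = 1.0
    Q[1:, 1:] = Sigma_v
    # Measurement matrix H = (Lam, I_N); e_t = 0 (no measurement noise)
    H = np.hstack([Lam.reshape(-1, 1), np.eye(N)])
    h = np.zeros(N + 1)          # state h_t = (F_t, eps_t')'
    P = np.eye(N + 1)            # state covariance
    factor = np.empty(T)
    for t in range(T):
        # Prediction step
        h = F @ h
        P = F @ P @ F.T + Q
        # Update step
        S = H @ P @ H.T
        K = P @ H.T @ np.linalg.pinv(S)   # Kalman gain
        h = h + K @ (X[t] - H @ h)
        P = P - K @ H @ P
        factor[t] = h[0]                  # filtered F_{t|t}
    return factor
```

The smoothed inferences $F_{t|T}$ would require an additional backward pass (e.g. the Rauch-Tung-Striebel recursions), which we omit here.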
2.2 Large scale dynamic factor models
To estimate the factors in the large scale framework, we use the quasi-maximum likelihood approach suggested by Doz, Giannone and Reichlin (2007). In this method, the estimates of the parameters are obtained by maximizing the likelihood via the EM algorithm, which consists of an iterative two-step estimator. In the first step, the algorithm computes an estimate of the parameters given an estimate of the common factor. In the second step, the algorithm uses the estimated parameters to approximate the common factor by the Kalman smoother. At each iteration, the algorithm is guaranteed to increase the log-likelihood of the estimated common factor, so the process is assumed to converge when the increase between two consecutive log-likelihood values is lower than a threshold.⁶
Using an initial set of time series $\{X_{it}\}_{i=1}^N$, the $(i+1)$-th iteration of the algorithm is defined as follows. Let us assume that $\Lambda^i$, $A^i$ and $\Sigma_x^i$ are known. Let $F_t^i$ be the common factor which is the output of the Kalman filter from the $i$-th iteration. The updated estimates of $\Lambda$, $A$, and $\Sigma_x$ can be obtained from

$$\Lambda^{i+1} = \widehat{E}[X_t F_t^{i\prime}]\left(\widehat{E}[F_t^i F_t^{i\prime}]\right)^{-1}, \qquad (12)$$

$$A^{i+1} = \widehat{E}[F_t^i F_{t-1}^{i\prime}]\left(\widehat{E}[F_{t-1}^i F_{t-1}^{i\prime}]\right)^{-1}, \qquad (13)$$

$$\Sigma_x^{i+1} = \widehat{E}[X_t \varepsilon_t^{i\prime}]. \qquad (14)$$

⁶ In practice, we consider a threshold of $10^{-4}$.
The estimates of the expectations can be obtained from

$$\widehat{E}[X_t F_t'] = \frac{1}{T}\sum_{t=1}^{T} X_t F_t^{i\prime}, \qquad (15)$$

where the series $\{F_t^i\}_{t=1}^T$ is the one estimated at iteration $i$. In addition, since $E[F_t F_t'] = E[F_t^i F_t^{i\prime}] + E[\{F_t - F_t^i\}\{F_t - F_t^i\}']$, and $E[\{F_t - F_t^i\}\{F_t - F_t^i\}']$ is the variance of the estimated common factor, then denoting these variances by $\{V_t\}_{t=1}^T$, the expectation $E[F_t F_t']$ can be estimated by

$$\widehat{E}[F_t F_t'] = \frac{1}{T}\sum_{t=1}^{T}\left(F_t^i F_t^{i\prime} + V_t\right). \qquad (16)$$

Following a similar reasoning, $E[F_t F_{t-1}'] = E[F_t^i F_{t-1}^{i\prime}] + E[\{F_t - F_t^i\}\{F_{t-1} - F_{t-1}^i\}']$, and the last expectation, which we denote as $\{C_t\}_{t=2}^T$, can be estimated by the Kalman filter. Then, the expectation $E[F_t F_{t-1}']$ can be estimated by

$$\widehat{E}[F_t F_{t-1}'] = \frac{1}{T}\sum_{t=2}^{T}\left(F_t^i F_{t-1}^{i\prime} + C_t\right). \qquad (17)$$

The matrix $\Sigma_x$ is estimated as the diagonal matrix whose principal diagonal is that of the matrix given by

$$\widehat{\Sigma}_x = \mathrm{diag}\left(\frac{1}{T}\sum_{t=1}^{T} X_t\left(X_t - \Lambda^i F_t^i\right)'\right). \qquad (18)$$

These estimates can be used again in the Kalman filter to compute the factors $F_t^{i+1}$. The algorithm, which starts with the static principal components estimates of the common factors $F_t^0$ and their factor loadings $\Lambda^0$, is repeated until the quasi-maximum likelihood estimates of the parameters are obtained. These can easily be used to compute the estimates of the common factor $\{F_{t|T}\}_{t=1}^T$ using the Kalman smoother, treating the idiosyncratic errors as uncorrelated both in time and in the cross section.⁷ Finally, as in the case of the SSDFM, the forecasts of $y_{t+1}$ are estimated by OLS regressions on (6).
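For intuition, the parameter-update step of one EM iteration, equations (12)-(18), can be sketched for a single factor as follows. This is our simplified illustration under our own naming conventions: the smoothed factor and its moments $V_t$ and $C_t$ are taken as given from the Kalman smoother rather than computed here.

```python
import numpy as np

def em_update(X, F_i, V, Cov):
    """One EM parameter update, equations (12)-(18), for a single factor.
    F_i: smoothed factor (T,), V: its variances (T,), Cov: lag-one
    cross-covariances (T,) with Cov[0] unused.
    Returns (Lam, A, diagonal of Sigma_x)."""
    T, N = X.shape
    # (15): E[X_t F_t'], and (16): E[F_t F_t'] including filter uncertainty
    EXF = X.T @ F_i / T
    EFF = np.mean(F_i ** 2 + V)
    # (17): E[F_t F_{t-1}'] including the smoother cross-covariance
    EFF1 = np.mean(F_i[1:] * F_i[:-1] + Cov[1:])
    EF1F1 = np.mean(F_i[:-1] ** 2 + V[:-1])
    Lam = EXF / EFF                      # (12)
    A = EFF1 / EF1F1                     # (13)
    # (18): diagonal of the idiosyncratic covariance
    resid = X - np.outer(F_i, Lam)
    Sigma_x = np.mean(X * resid, axis=0)
    return Lam, A, Sigma_x
```

In the full algorithm these updated parameters would be fed back into the Kalman smoother to produce $F_t^{i+1}$, iterating until the log-likelihood gain falls below the threshold.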
3 Designing the simulation study

Given the estimation of the dynamic factor models described in the previous section, it is reasonable to think that empirical applications using these factor models will perform worse than expected when facing data problems that invalidate the assumptions warranted by the theory. In the case of the SSDFM, the larger the covariance among the idiosyncratic errors, the less accurate the estimates are expected to be. With respect to the empirical performance of the LSDFM, Boivin and Ng (2006) stressed that it can be worse when the average size of the common component falls, when the number of observations is not large either in the cross-section or in the time dimension, and when the possibility of correlated errors increases as more series are included in the model. This situation is very common in practice, since the data are usually drawn from a small number of broad categories (such as industrial production, money indicators or prices). In this case, if the series are ordered within each category by the importance of their common component, expanding the dataset with series from each category will frequently lead to larger cross-correlations than assumed by the theory. In that sense, any analysis that bases the asymptotic properties of large scale models on the law of large numbers, as if all the series were alike or had the same properties, is fundamentally flawed.⁸

In this section, we perform Monte Carlo simulations to assess the extent to which the violation of the theoretical assumptions behind the SSDFM and the LSDFM affects both the consistency of factor estimation and the accuracy of forecasts. To analyze under which circumstances it is worth reducing the influence of noisy predictors, the simulations are designed to replicate two competing forecasting schemes. The first scheme mimics the case of forecasters who develop a reasonable pre-screening of the set of indicators and apply a SSDFM to obtain predictions from a reduced number of indicators. In this case, the analyst searches for the representative indicators of each economic category by screening out those time series with high correlation with the main indicators. However, also in this case, we contemplate the possibility of choosing just one indicator of each category without previous pre-screening. The second scheme mimics the case of forecasters who include a large number of indicators of each category and apply a LSDFM to compute predictions. In this case, the additional indicators are assumed to be correlated with the representative indicators of each category. Finally, the goodness of fit in estimating the factors and the forecast accuracy of these methods are examined by means of their Mean Squared Error (MSE).

⁷ The algorithm requires a small number of iterations to converge. In our simulations, only 3 or 4 iterations were required.
⁸ To our knowledge, only some recent papers seriously analyze the different characteristics of the series when the information set is increased. These are the so-called dynamic hierarchical factor models of Moench, Ng and Potter (2009). The comparison between this type of model and the traditional large and small scale models is left for further research.
3.1 Generating small data sets
The small data set, $\{X_{it}^s\}_{i,t=1}^{N,T}$, with $N = 10$, is generated from one common factor only. First, given the parameters $A$ and $\Sigma_u$, we generate the series of the common factor $\{F_t\}_{t=1}^T$ by using the expression

$$F_t = A F_{t-1} + u_t. \qquad (19)$$

In this case, $\{u_t\}_{t=1}^T$ are random numbers drawn from a normal distribution with zero mean and variance $\Sigma_u = 1$. To examine the dependence of the results on the persistence of the factor, we allow for different values of the parameter: $A = 0.1$, $0.5$, and $0.75$.
Second, we assume that the idiosyncratic errors follow autoregressive processes. For particular values of the coefficient matrix $C$ and of $\Sigma_v$, we generate the series $\varepsilon_t = (\varepsilon_{1t},\dots,\varepsilon_{Nt})'$ from

$$\varepsilon_t = C \varepsilon_{t-1} + v_t. \qquad (20)$$

In this case, $v_t = (v_{1t},\dots,v_{Nt})'$, and $\{v_{it}\}_{i,t=1}^{N,T}$ are random numbers drawn from a normal distribution with zero mean and variance-covariance matrix $\Sigma_v$. To simplify the simulations, the autoregressive coefficient matrix $C$ will be diagonal with two possible values, $c = 0.1$ and $c = 0.75$, in the main diagonal. In addition, to examine the effects of the errors' cross-correlation, the covariance matrix will take different values across the simulations. In particular, let us consider a given value of the parameter $\rho_s$ and generate the vector $\omega_s = (1, \rho_s, \rho_s^2, \dots, \rho_s^9)$. Then, the matrix $\Sigma_v$ can be viewed as the Toeplitz matrix constructed from the vector $\omega_s$ as

$$\Sigma_v = \begin{pmatrix} 1 & \rho_s & \rho_s^2 & \cdots & \rho_s^9 \\ \rho_s & 1 & \rho_s & \cdots & \rho_s^8 \\ \rho_s^2 & \rho_s & 1 & \cdots & \rho_s^7 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_s^9 & \rho_s^8 & \rho_s^7 & \cdots & 1 \end{pmatrix}. \qquad (21)$$

As can be deduced from this expression, the parameter $\rho_s$ represents the maximum correlation between the error terms of two series and controls the correlation across categories of data. In the simulations, the values of this parameter will be $\rho_s = 0$, $0.1$, $0.5$, and $0.75$. Finally, in the simulations $\Lambda$ will be a column vector of $N$ ones. Then, $\{F_t\}_{t=1}^T$ and $\{\varepsilon_t\}_{t=1}^T$ are used in

$$X_t^s = \Lambda F_t + \varepsilon_t \qquad (22)$$

to obtain simulations of $X_t^s$, with $X_t^s = \{X_{it}^s\}_{t=1}^T$ for $i = 1,\dots,10$.

Therefore, the 10 series $\{X_{it}^s\}_{t=1}^T$ could be intuitively interpreted as 10 economic sectors that depend on the same business cycle $F_t$, which has different levels of persistence measured by $A = 0.1$, $0.5$, and $0.75$, and on 10 sectoral shocks $\varepsilon_t = (\varepsilon_{1t},\dots,\varepsilon_{Nt})'$, which also have different levels of persistence, $c = 0.1$ and $c = 0.75$, and cross-correlation $\rho_s = 0$, $0.1$, $0.5$, and $0.75$.⁹

⁹ For simplicity and clarity in the exposition we are going to assume that only one factor exists, because we think that the number of cases that we contemplate is already large enough. Considering more than one factor is straightforward, but the computation time for the Monte Carlo simulations increases dramatically and the results are of the same nature. We address the possibility of estimating more than one factor, even though the data are generated by one factor, in the next section.
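The data generating process of this subsection can be sketched as follows. This is our illustrative code, not the authors' implementation: the function name, defaults, and the small jitter added before the Cholesky factorization are our own choices.

```python
import numpy as np

def simulate_small_dataset(T=50, A=0.5, c=0.1, rho_s=0.5, N=10, seed=0):
    """Generate the N = 10 'main indicator' series of equation (22):
    X_t^s = Lambda F_t + eps_t, with an AR(1) factor (19), VAR(1)
    idiosyncratic errors (20), and the Toeplitz covariance (21)."""
    rng = np.random.default_rng(seed)
    # Toeplitz covariance Sigma_v built from omega_s = (1, rho_s, ..., rho_s^9)
    Sigma_v = rho_s ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
    L = np.linalg.cholesky(Sigma_v + 1e-12 * np.eye(N))
    F = np.zeros(T)
    eps = np.zeros((T, N))
    for t in range(1, T):
        F[t] = A * F[t - 1] + rng.standard_normal()            # (19), Sigma_u = 1
        eps[t] = c * eps[t - 1] + L @ rng.standard_normal(N)   # (20)
    X = F[:, None] + eps                                       # (22), Lambda = 1_N
    return X, F, eps
```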
3.2 Generating large data sets
As mentioned above, for the large data set $\{X_{jt}^l\}_{j,t=1}^{M,T}$, with $M = 100$, we assume that the ten series generated in the previous section, $X_t^s$, represent the main indicators of each of the ten different categories of data. Accordingly, we add an error term representing the idiosyncratic error of the specific series to each of the ten time series $\{X_{it}^s\}_{i,t=1}^{N,T}$ for $N = 10$. The new errors are called $\{w_{ikt}\}_{i,k,t=1}^{10,10,T}$, where $i$ represents the sector and $k$ the series within the sector, and they are assumed to be serially correlated and cross-correlated with all the series existing within their respective category. Hence, the large data set is generated by using

$$X_{ikt}^l = X_{it}^s + w_{ikt}, \qquad (23)$$

where $i = 1,\dots,10$, $k = 1,\dots,10$, and $w_{it} = (w_{i1t},\dots,w_{i10t})'$ is the vector of idiosyncratic errors, which is generated by

$$w_{it} = C w_{it-1} + e_{it}^l. \qquad (24)$$

In this expression, $\{e_{ikt}^l\}_{i,k,t=1}^{10,10,T}$ are random numbers drawn from a normal distribution with zero mean and a covariance matrix which is the Toeplitz matrix constructed from the vector $\omega_l$ as in (21), where $\rho_l = 0$, $0.1$, $0.5$, and $0.75$. Therefore, the parameter $\rho_l$ controls the correlation within each of the categories of data. Again, the autoregressive coefficient matrix $C$ is diagonal with constant values of $c = 0.1$ and $c = 0.75$ in the main diagonal.

According to expressions (22), (23), and (24), each series of the large data set can be decomposed as follows:

$$X_{ikt}^l = \lambda_i F_t + \varepsilon_{ikt}^l, \qquad (25)$$

where $\varepsilon_{ikt}^l = \varepsilon_{it} + w_{ikt}$. Then, the idiosyncratic components $\varepsilon_{ikt}^l$ are composed of a common error inside the categories, $\varepsilon_{it}$, which could be cross-correlated among different categories, and a specific error term, $w_{ikt}$, which could be correlated with series from the same category.

Finally, putting together the series along all the categories, we have the large data set

$$X_t^l = \left(X_{1,1,t}^l, X_{1,2,t}^l, \dots, X_{1,10,t}^l, X_{2,1,t}^l, \dots, X_{2,10,t}^l, \dots, X_{10,1,t}^l, \dots, X_{10,10,t}^l\right)'. \qquad (26)$$
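The disaggregation step in (23)-(24) can be sketched as follows, again under our own naming conventions and with a jitter term added for the Cholesky factorization:

```python
import numpy as np

def expand_to_large_dataset(Xs, c=0.1, rho_l=0.5, K=10, seed=0):
    """Build the M = 100 series of equations (23)-(24): each of the ten
    main indicators in Xs (T x 10) is disaggregated into K series by
    adding serially correlated, within-category-correlated noise w_ikt."""
    rng = np.random.default_rng(seed)
    T, N = Xs.shape
    # Within-category Toeplitz covariance from omega_l = (1, rho_l, ..., rho_l^{K-1})
    Sigma = rho_l ** np.abs(np.subtract.outer(np.arange(K), np.arange(K)))
    L = np.linalg.cholesky(Sigma + 1e-12 * np.eye(K))
    Xl = np.empty((T, N * K))
    for i in range(N):
        w = np.zeros((T, K))
        for t in range(1, T):
            w[t] = c * w[t - 1] + L @ rng.standard_normal(K)   # (24)
        Xl[:, i * K:(i + 1) * K] = Xs[:, [i]] + w              # (23)
    return Xl
```

The columns are stacked category by category, matching the ordering of (26).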
As in the previous case, the intuition behind the data generating process is the same as before, but adding the fact that the series-specific shocks can also be autocorrelated, $c = 0.1$ and $c = 0.75$, and cross-correlated, $\rho_l = 0$, $0.1$, $0.5$, and $0.75$.

3.3 Generating the target series
Finally, we generate the series to be predicted in a simple scenario. To simplify the simulations, we consider that forecasting with the factor and one lagged value of the time series is dynamically complete. Hence, the series $y_t$ is generated from

$$y_{t+1} = \beta F_t + \gamma y_t + e_t^y, \qquad (27)$$

where $\beta$ is one, $e_t^y$ is a white noise process with $\sigma_{e^y} = 1$, and $\gamma$ takes on the values 0, 0.3, 0.5 and 0.8.
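Given a simulated factor path, the target series of (27) can be generated as in the following sketch (ours; the function name and defaults are illustrative):

```python
import numpy as np

def simulate_target(F, gamma=0.5, beta=1.0, seed=0):
    """Generate the target series of equation (27):
    y_{t+1} = beta * F_t + gamma * y_t + e_t^y, with e^y ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    T = len(F)
    y = np.zeros(T + 1)      # y[0] = 0 initializes the recursion
    for t in range(T):
        y[t + 1] = beta * F[t] + gamma * y[t] + rng.standard_normal()
    return y[1:]
```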
4 Simulation results
In each replication $j$, we estimate the small and large scale factor models and compute the accuracy of these models in inferring the factor by using the Mean Squared Error over the $J = 1000$ replications,

$$MSE^i = \frac{1}{J}\sum_{j=1}^{J}\frac{1}{T}\sum_{t=1}^{T}\left(F_{jt} - Q F_{jt|T}^i\right)^2, \qquad (28)$$

for $i = s$ in the case of the small data set and $i = l$ in the case of the large data set. In this expression, $Q$ is the projection matrix of the true common factor on the estimated common factor.¹⁰ In addition, we compare the out-of-sample forecasting accuracy of the SSDFM and the LSDFM by computing the errors in forecasting the generated target series one step ahead. Let $\widehat{\beta}$ and $\widehat{\gamma}$ be the OLS estimates of the parameters given by equation (27) using the common factor series and the past values of $y$ up to period $T - 1$. Then we construct the one-step-ahead forecast of $y_{jT+1}$ by using the relation $\widehat{y}_{jT+1}^i = \widehat{\beta} F_{jT|T}^i + \widehat{\gamma} y_{jT}$. In this way, one can define the Mean Squared one-step-ahead Forecast Error of model $i$ as

$$MSFE^i = \frac{1}{J}\sum_{j=1}^{J}\left(y_{jT+1} - \widehat{y}_{jT+1}^i\right)^2. \qquad (29)$$

¹⁰ We need the projection matrix since the common factors are estimated up to a sign transformation.
However, this experiment could lead to unrealistic results in favor of the SSDFM, since we are implicitly assuming that in the pre-screening of the indicators the researcher would always find the main indicator in each category of data. To overcome this potential bias in favor of small scale factor models, we consider in the simulation exercises an additional case in which the researcher estimates a SSDFM but arbitrarily uses the fifth noisiest series from each category. Accordingly, we call $MSE_r^s$, $MSE_n^s$, $MSE^l$, $MSFE_r^s$, $MSFE_n^s$, and $MSFE^l$ the means across replications of the $MSE$ and $MSFE$ computed from the SSDFM with the 10 representative series of each category (superscript $s$, subscript $r$), from the SSDFM with the 10 noisier series of each category (superscript $s$, subscript $n$), and from the LSDFM applied to the 100 time series of the large scale simulation exercise (superscript $l$).
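For a single replication with one factor, the projection adjustment in (28) reduces to an OLS regression coefficient of the true factor on the estimated one; a minimal sketch (our code, illustrative only):

```python
import numpy as np

def factor_mse(F_true, F_est):
    """MSE of equation (28) for one replication: project the true factor
    on the estimated one (the factor is identified only up to sign and
    scale), then average the squared deviations."""
    # OLS projection coefficient Q of F_true on F_est
    Q = (F_est @ F_true) / (F_est @ F_est)
    return np.mean((F_true - Q * F_est) ** 2)
```

By construction, an estimate that differs from the truth only by sign or scale yields a zero MSE, which is the point of the projection.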
4.1 Factor estimates
Let us start the analysis of the simulations by comparing the accuracy of the models in inferring the factors (using $MSE$s). To facilitate understanding, let us describe how the results are presented in the tables. First, the results in Tables 1 to 3 are classified according to different values of the autoregressive coefficient of the common factor series (coefficient $A$). Hence, this coefficient takes the value of 0.1 (low correlation) in Table 1, 0.5 (medium correlation) in Table 2, and 0.75 (high correlation) in Table 3. Second, each of these tables shows the accuracy of the models for different values of the cross-correlation within and across categories. The first block of results refers to the case when the only cross-correlation present in the idiosyncratic components is due to series that belong to the same category, $\rho_s = 0$, while the following blocks of results examine the effects of progressively increasing the correlation across categories to 0.1, 0.5 and 0.75. Within each of these blocks, the tables report the models' accuracy in inferring the common factor when the correlation within categories, which is measured by $\rho_l$, increases from 0 to 0.1, 0.5 and 0.9. Third, the first three columns of the tables refer to MSEs from the reasonably pre-screened SSDFM, the arbitrarily chosen (fifth noisiest) SSDFM, and the LSDFM, respectively. Fourth, a common problem in large scale models is that there is not always the same number of series in each category; some categories might be over-represented. We address the effects of oversampling in the last two columns of these tables. For this purpose, we simulate ten categories of data but include 20 series instead of 10 in the first category, and 5 series instead of 10 in the second and third categories. All the other 7 categories are represented by 10 series.¹¹ Fifth, in Tables 1 to 3, the idiosyncratic errors are assumed to have low serial correlation ($c = 0.1$), the sample is small ($T = 50$), and we assume that there is only one common factor in the estimation. The robustness of the results to allowing for higher serial correlations, to using larger samples, and to permitting the LSDFM to select the number of common factors as in Bai and Ng (2002) is analyzed in Tables A1 to A4 in the appendix.
A brief summary of the main results is the following. It can be seen in all the tables that the reasonably pre-screened SSDFM presents a smaller MSE than all the other specifications ($MSE_r^s < MSE_n^s$ and $MSE_r^s < MSE^l$). This is an important point. From this result we learn that a good preselection within the categories could make the model impossible to beat even if we add a lot of information. This result holds for all the possible assumptions about the dynamics of the shocks, of the factors, and of the cross-correlations. However, even in the case in which the econometrician is not extremely careful with the selection of the variables, there can still be some gain from estimating a SSDFM. In the comparison of the SSDFM with an arbitrary selection (the fifth series of each category) and the LSDFM, the relative performance of these two models depends on the autocorrelation of the factor (as can be seen in the comparison of Tables 1 and 3) and, obviously, on the cross-correlation within and across categories, $\rho_l$ and $\rho_s$.

In general, our main results are in concordance with those obtained by Boivin and Ng (2006) from large scale static factor models using sets of different numbers of indicators,¹² although, as we said in the introduction, they do not address the topic of small versus large scale estimation; they concentrate on choosing the optimal number of variables in a large scale model.
¹¹ The accuracy of the SSDFM from reasonably pre-screened series does not depend on the number of series included in each category, because we just take the representative series of each category. Hence, we only show $MSE_n^s$ and $MSE^l$ in the tables. In this case, $MSE_n^s$ represents the fifth noisiest series in each category.
¹² They suggest that the large scale factor estimates are adversely affected by cross-correlation in the errors and by oversampling.
We are also in line with the findings of Stock and Watson (2002b). They find (using static large scale factor models) some deterioration in the quality of the factor estimates, and this deterioration occurs when the degree of serial correlation in the factor and in the idiosyncratic errors is high, even when the number of variables and observations is large, exactly as we show in Tables 2 and 3. Going into the details of our results, we can observe that increasing inertia in the simulated common factor, with $A$ ranging from 0.1 (almost no serial correlation) in Table 1 to 0.5 (moderate correlation) in Table 2 and 0.75 (high correlation) in Table 3, confirms the deterioration in factor estimation for all the factor models, although the relative losses are not uniformly distributed across the models. When the serial correlation of the factor increases, the relative gains of reasonably over arbitrarily pre-screening the series in the SSDFM still hold at similar rates, except for the case of very large correlation across categories, where the relative gains attenuate. Notably, the MSEs also highlight the significant losses in the relative accuracy of the LSDFM with respect to the SSDFM as the inertia of the common factor increases. In fact, when $A = 0.75$, in all scenarios, the SSDFM from arbitrarily chosen series outperforms the LSDFM.
It is also important to point out the results displayed in columns 4 and 5 of Tables
1, 2 and 3, which refer to the effects of oversampling some of the categories. All
the other columns are calculated assuming that the user of large scale models includes
the same number of series in each category. However, practitioners usually work with
an unbalanced number of time series in each category; see, for example, Angelini et al. (2008)
or Giannone et al. (2008).13 To examine the effect of using oversampled categories in
factor analysis, the last two columns of Tables 1, 2 and 3 report the MSEs of the
SSDFM from arbitrarily chosen noisy series and of an LSDFM which uses 10 unbalanced
categories chosen with the procedure explained before. Overall, the large scale factor model
with unbalanced categories performs worse than in the case of balanced categories, especially
when the correlation across categories is small. Obviously, the relatively better accuracy of
the noisy SSDFM with respect to the oversampled LSDFM is then even more evident, and it
becomes critical when low correlation across categories is combined with high correlation
within categories.

13 For example, the number of time series of disaggregated industrial production indicators is typically much higher than the number of series included in other categories.
The tables that address the robustness of our results to different assumptions
are included in the appendix. Tables A1 and A2 examine the effects of increasing the
serial correlation of the idiosyncratic components on the performance of the factor models. In
particular, the serial correlation is assumed to grow from c = 0.1 to c = 0.75 when the
serial correlation of the factor is low (A = 0.1 in Table A1) and when it is high (A = 0.75
in Table A2), which leads to the following results. First, the serial correlation in the errors
deteriorates the overall performance of the models even more than the serial correlation
in the factor does. For example, when ρl = 0, ρs = 0.75, and A = c = 0.1, the MSErs
is 0.35, and it increases to 0.50 when c = 0.75 but only to 0.40 when A = 0.75. Second,
the advantage of the reasonably pre-screened SSDFM over both the arbitrarily chosen SSDFM
and the LSDFM is larger when there is serial correlation in the idiosyncratic components. In
that sense, the model most negatively affected by the serial correlation is the LSDFM.
Third, these results are magnified in the case of oversampled categories in factor analysis.
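As an illustration of this error structure, the following minimal sketch (our own, not the authors' exact data generating process; the function name and parameter defaults are hypothetical) draws AR(1) idiosyncratic errors whose innovations have correlation ρl within each category and ρs across categories:

```python
import numpy as np

def simulate_errors(T=50, n_cat=10, per_cat=10, c=0.75, rho_l=0.9, rho_s=0.0, seed=0):
    """Idiosyncratic errors e_t = c * e_{t-1} + u_t, where the innovations u_t have
    cross-correlation rho_l within each category and rho_s across categories."""
    N = n_cat * per_cat
    cov = np.full((N, N), rho_s)               # across-category correlation
    for k in range(n_cat):
        block = slice(k * per_cat, (k + 1) * per_cat)
        cov[block, block] = rho_l              # within-category correlation
    np.fill_diagonal(cov, 1.0)                 # unit variances
    rng = np.random.default_rng(seed)
    u = rng.multivariate_normal(np.zeros(N), cov, size=T)
    e = np.zeros((T, N))
    e[0] = u[0]
    for t in range(1, T):
        e[t] = c * e[t - 1] + u[t]             # AR(1) serial correlation
    return e

errors = simulate_errors()
```

Raising c toward 0.75 strengthens the serial correlation that, as the tables show, penalizes the LSDFM the most.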
In Tables A3 and A4, we examine the role of the number of observations in the performance
of factor models under different values of A. According to the theory, the larger
the number of time series and observations, and in the absence of the typical data problems
which are accounted for by our simulations, the better the performance of the LSDFM with
respect to the SSDFM. This is documented in Table A3, where the reported MSEs show
that under low serial correlation of the factor and low correlation of the idiosyncratic
errors, the advantage of the SSDFM from reasonably pre-screened indicators over the
LSDFM diminishes, and the LSDFM outperforms the SSDFM from arbitrarily selected
indicators.14 However, when the serial correlation of the factor is high (Table A4), the
SSDFM still clearly outperforms the LSDFM under both variable-selection scenarios.
In addition, the tables show that the relative losses in accuracy due to oversampling
in the LSDFM are still large, which mitigates the expected asymptotic benefits of large
scale factor models.
As a last remark, it is worth noting that the number of factors has been restricted
to one, in accordance with the data generating process. However, the generation of time
series in categories with high within-category and across-category correlation may make this
assumption too restrictive.15 To evaluate the effect of this potential restriction on
the accuracy of the LSDFM in factor estimation, we let the large scale model select the
number of factors according to the procedure described in Bai and Ng (2002), where the
maximum number of factors is 11. Tables A5 and A6 report the MSEl and the average
number of estimated factors across the 1000 replications. The main results of this exercise
are the following. First, there is no gain for the LSDFM from estimating more than one
factor relative to the model in which we specify just one factor. Second, the higher the
correlation within categories, the larger the number of estimated factors, since the high
correlation in each category is interpreted as if the series belonging to that category shared
a common factor; in this case, the performance of the LSDFM improves significantly,
although it still does not improve on the results of the SSDFM.

14 Note that the SSDFM applied to arbitrarily selected indicators is contaminated by data problems, just as the LSDFM is.
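As a concrete sketch of this selection step, the following implements the IC_p2 variant of the Bai and Ng (2002) criteria with principal-components factor estimation; the standardization choices and the choice of variant are our assumptions, not necessarily those used in the paper:

```python
import numpy as np

def bai_ng_ic(X, kmax=11):
    """Select the number of factors with the IC_p2 criterion of Bai and Ng (2002).
    X is a T x N panel; factors are estimated by principal components."""
    T, N = X.shape
    X = (X - X.mean(0)) / X.std(0)                 # standardize each series
    eigval, eigvec = np.linalg.eigh(X @ X.T / (T * N))
    order = np.argsort(eigval)[::-1]               # eigenvalues in descending order
    ic = []
    for k in range(1, kmax + 1):
        F = np.sqrt(T) * eigvec[:, order[:k]]      # T x k principal-component factors
        loadings = F.T @ X / T                     # OLS loadings given F'F/T = I
        resid = X - F @ loadings
        V = (resid ** 2).sum() / (N * T)           # average squared residual
        penalty = k * (N + T) / (N * T) * np.log(min(N, T))
        ic.append(np.log(V) + penalty)
    return int(np.argmin(ic)) + 1

# Usage on a simulated one-factor panel:
rng = np.random.default_rng(1)
common = rng.standard_normal((100, 1))
panel = common @ rng.standard_normal((1, 60)) + 0.5 * rng.standard_normal((100, 60))
k_hat = bai_ng_ic(panel, kmax=8)
```

On a clean one-factor panel the criterion should pick a small number of factors, while strong within-category correlation pushes the estimated number upward, in line with Tables A5 and A6.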
4.2 Forecasting accuracy
The ability of factor models in one-step-ahead out-of-sample forecasting is examined in
Table 4.16 As in the case of factor estimates, we perform the analysis under different
situations. The tables allow for different values of the autoregressive coefficient of the
common factor series, which is 0.1, 0.5 and 0.75,17 and for different degrees of cross-correlation
across (ρs from 0 to 0.5) and within (ρl from 0 to 0.9) categories. In addition, they show
the extent to which the forecasting performance of dynamic factor models depends on the
inertia of the series to be forecasted. For this purpose, the forecasted series are simulated
with values of γ ranging from 0 (no inertia) to 0.8 (high degree of time series dependence).

As in the common factor estimates, the persistence of the common factor series
has no effect on the MSFErs and MSFEns, even in the case of one oversampled category.
However, it has an important effect on the MSFEl, as it does in the common factor series
estimates.

15 Datasets generated from one factor but in ten categories of highly correlated indicators could need one factor in each category.

16 In-sample forecasting analyses were also developed, with similar results. These results, which have been omitted to save space, are available from the authors upon request.

17 To save space, we only present the results for A = 0.1 (the equivalent of Table 1). The other tables are also available.
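The forecasting experiment can be sketched as follows. This is a stylized illustration only: the model matches the note to Table 4, yt+1 = βFt + γyt + et+1, but the common factor is taken as observed rather than estimated from a panel, and all parameter defaults are our assumptions:

```python
import numpy as np

def one_step_msfe(A=0.1, beta=1.0, gamma=0.3, T=50, n_rep=500, seed=0):
    """Monte Carlo MSFE of the one-step forecast from y_{t+1} = beta*F_t +
    gamma*y_t + e_{t+1}, with an AR(1) common factor F_t = A*F_{t-1} + v_t.
    The factor is taken as observed here, a simplification of the paper's setup."""
    rng = np.random.default_rng(seed)
    sq_errors = []
    for _ in range(n_rep):
        v = rng.standard_normal(T + 1)
        e = rng.standard_normal(T + 1)
        F = np.zeros(T + 1)
        y = np.zeros(T + 1)
        for t in range(1, T + 1):
            F[t] = A * F[t - 1] + v[t]
            y[t] = beta * F[t - 1] + gamma * y[t - 1] + e[t]
        # OLS of y_{t+1} on (F_t, y_t) over the estimation sample
        Z = np.column_stack([F[:T - 1], y[:T - 1]])
        coef, *_ = np.linalg.lstsq(Z, y[1:T], rcond=None)
        y_hat = coef[0] * F[T - 1] + coef[1] * y[T - 1]
        sq_errors.append((y[T] - y_hat) ** 2)
    return float(np.mean(sq_errors))

msfe = one_step_msfe()
```

The resulting MSFEs sit close to the unit innovation variance, the same order of magnitude as the entries of Table 4.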
As expected, the inclusion of past values of the main series is not relevant to the
relative one-step-ahead out-of-sample forecasting performance of the different models. The
models are therefore ranked in the same way as when measured according to factor
estimates. Hence, there is no case in which the MSFEl is lower than the MSFErs. Overall,
the strategy of reasonably pre-selecting the predictors and using them in an
SSDFM almost unambiguously outperforms the LSDFM and the SSDFM from arbitrarily
chosen series, as we showed in the factor estimation section.

Comparing MSFEl with MSFEns, the former is lower than the latter when A = 0.1
and the cross-correlation across categories and within categories of the idiosyncratic errors
is low.18
5 Empirical exercise
As an empirical exercise, we consider the one presented in Stock and Watson (2002b), where
we use the data set to forecast the Industrial Production index growth rate 12 months ahead.
That is, our objective series to forecast is given by yt+12 = ln(IPt+12/IPt), where
IPt is the index of industrial production at date t. The data set is composed of the
same series as in Stock and Watson (2002b), although the time period is from 1997:01 to
2010:05. The 12-months-ahead forecasts are constructed starting at 2005:01: using the data
available at this date, we compute a forecast of the objective series at 2006:01. The last
forecast is for 2010:05, using data up to 2009:05.
The first forecast, at 2006:01, is computed as follows. The common factor series and
the unknown parameters are estimated from the data up to 2005:01. Once we have an
estimate of the common factor series, we run a regression of the objective series up to
2005:01 on the common factor series, yt+12 = βF̂t + ut, which gives us an estimate of
the parameter β. Finally, the 12-month forecast at 2006:01 is computed as ŷ2005:01+12 =
β̂F̂2005:01. Proceeding similarly through the rest of the period, we get the 12-months-ahead
forecasts of yt+12.

18 All the tables, with the same structure as those in the factor estimation section, including the ones in the appendix, are available for the forecasting exercise. In order to save space, we have not included them in the text, but they are available from the authors upon request.
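The recursive procedure above can be sketched as follows. This is a minimal stand-in, not the paper's implementation: the first principal component replaces the estimated common factor, the data are simulated rather than the Stock and Watson (2002b) panel, and the function and argument names are hypothetical:

```python
import numpy as np

def direct_h_step_forecasts(X, y, h=12, first_origin=96):
    """Recursive direct h-step forecasts: at each origin t0, extract the first
    principal component of X up to t0, regress y_{t+h} on F_t over the
    estimation sample, and forecast y[t0 + h] = beta_hat * F[t0]."""
    fc, actual = [], []
    for t0 in range(first_origin, len(y) - h):
        Xs = X[:t0 + 1]
        Xs = (Xs - Xs.mean(0)) / Xs.std(0)          # standardize the panel
        _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
        F = Xs @ Vt[0]                              # first principal component
        beta = F[:-h] @ y[h:t0 + 1] / (F[:-h] @ F[:-h])
        fc.append(beta * F[-1])                     # forecast of y[t0 + h]
        actual.append(y[t0 + h])
    return np.array(fc), np.array(actual)

# Usage on simulated data (a stand-in for the real panel):
rng = np.random.default_rng(2)
T, N = 160, 20
f = np.zeros(T)
for t in range(1, T):
    f[t] = 0.5 * f[t - 1] + rng.standard_normal()
X = np.outer(f, rng.standard_normal(N)) + 0.5 * rng.standard_normal((T, N))
y = np.zeros(T)
y[12:] = f[:-12] + 0.5 * rng.standard_normal(T - 12)
fc, actual = direct_h_step_forecasts(X, y)
msfe = float(np.mean((fc - actual) ** 2))
```

In the paper the SSDFM factor is instead estimated by maximum likelihood and the LSDFM by the Doz, Giannone and Reichlin (2007) approach; only the recursive direct-regression logic is illustrated here.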
We repeat the real-time forecasting exercise under different specifications for the number
of factors, using Doz, Giannone and Reichlin (2007). We use the common factor to
forecast yt+12, including a large number of series (111 series), assuming 1 or 2 factors and
also estimating the optimal number of factors. The data set considered is composed of 111
of the 149 series used by Stock and Watson (2002b); they are the freely accessible series
available on the internet. The MSFEs for these three specifications are displayed in the
first three lines of Table 5.
We also estimate the factor including a small number of series. We use the series
proposed by Stock and Watson (1991) and estimate the model by maximum likelihood.
In particular, we use industrial production, real personal income, employment in the
non-agricultural sector, and manufacturing and trade sales. The results are displayed in the
fourth line of Table 5.
Finally, just for comparison purposes, the table shows in the fifth line a simple autoregressive model for the annual growth rates of IP.
As can be seen in the table, the lowest MSFE is achieved when the 12-month forecast
is performed using the common factor series from the 4 main indicators. In that sense,
the empirical exercise confirms the results obtained in the simulation study conducted
throughout the paper. A well-specified small scale model is difficult to beat even when
large amounts of information are added to the specification.
6 Conclusions
In this paper, we address the research question posed in Aruoba, Diebold and Scotti
(2009) about the performance of large versus small scale factor models for the estimation
of common factors and the forecasting of a set of goal variables. We propose simulations
which mimic different scenarios of empirical forecasting, where the list of series is fixed
(rather than tending to infinity) and where cross-correlation and serial correlation among
the idiosyncratic components may appear that are greater than those warranted by the
theory. The Monte Carlo analysis allows for indicators which belong to different
categories of data and whose idiosyncratic components show cross-correlation within and
across categories in addition to serial correlation. We also allow for categories which are
oversampled. Finally, the simulations examine the accuracy of small versus large data sets
under different degrees of serial correlation in the factor.

We find that adding data that bear little information about the factor components does
not necessarily lead large scale dynamic factor models to improve upon the forecasts of
small scale dynamic factor models. In fact, we show that when the additional data are too
correlated with data from categories which are already included in factor estimation,
forecasting with many predictors performs worse than forecasting from a reasonably
pre-screened dataset, especially when the categories are not highly correlated. This result is
stronger in the case of high persistence of the common factor, high serial correlation of the
idiosyncratic components, noisy series, and oversampled categories. In these cases, even
arbitrarily selecting one time series from each category and using the resulting dataset in
a small scale dynamic factor model outperforms the forecasts from large scale dynamic
factor models. In these situations, we can be better off throwing away some redundant
data even if they are available.
References
[1] Angelini, E., Camba-Mendez, G., Giannone, D., Reichlin., L., and Rünstler, G. 2008.
Short-term Forecasts of Euro Area GDP Growth. CEPR working paper 6746.
[2] Aruoba, B., Diebold, F., and Scotti, C. 2009. Real-time measurement of business
conditions. Journal of Business and Economic Statistics 27: 417-427.
[3] Bai, J., and Ng, S. 2002. Determining the number of Factors in approximate factor
models. Econometrica 70: 191-221.
[4] Bai, J., and Ng, S. 2006. Evaluating latent and observed factors in macroeconomics
and finance. Journal of Econometrics 131: 507-537.
[5] Banbura, M., and Runstler, G. 2007. A look into the factor model black box.
Publication lags and the role of hard and soft data in forecasting GDP. ECB working
paper 751.
[6] Boivin, J., and Ng, S. 2006. Are more data always better for factor analysis? Journal
of Econometrics 132: 169-194.
[7] Caggiano, G., Kapetanios, G., and Labhard, V. 2009. Are more data always better
for factor analysis? Results for the Euro area, the six largest Euro area countries and
the UK. European Central Bank Working Paper 1051.
[8] Camacho, M., and Perez Quiros, G. 2010. Introducing the Euro-STING: Short Term
INdicator of Euro Area Growth. Journal of Applied Econometrics, forthcoming.
[9] Doz, C., Giannone, D., and Reichlin, L. 2007. A quasi-maximum likelihood approach
for large approximate dynamic factor models. ECB working paper 674.
[10] Forni, M., Hallin, M., Lippi, M., and Reichlin, L. 2005. The generalized dynamic factor model: one-sided estimation and forecasting. Journal of the American Statistical
Association 100: 830-840.
[11] Giannone, D., Reichlin, L., and Small, D. 2008. Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics 55: 665-676.
[12] Mariano, R., and Murasawa, Y. 2003. A new coincident index of business cycles based
on monthly and quarterly series. Journal of Applied Econometrics 18: 427-443.
[13] Moench, E., Ng, S., and Potter, S. 2009. Dynamic Hierarchical Factor Models. Federal
Reserve Bank of New York Staff Reports 412, December.
[14] Nunes, L. 2005. Nowcasting quarterly GDP growth in a monthly coincident indicator
model. Journal of Forecasting 24: 575-592.
[15] Stock, J., and Watson, M. 1991. A probability model of the coincident economic indicators. In Leading Economic Indicators: New Approaches and Forecasting Records,
edited by K. Lahiri and G. Moore. Cambridge University Press.
[16] Stock, J., and Watson, M. 2002a. Macroeconomic forecasting using diffusion indexes.
Journal of Business and Economic Statistics 20: 147-162.
[17] Stock, J., and Watson, M. 2002b. Forecasting using principal components from a large
number of predictors. Journal of the American Statistical Association 97: 1167-1179.
Table 1. Simulations common factor estimator (T=50, c=0.1, A=0.1)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.101      0.195      0.124               0.191      0.149
0.1              0.101      0.192      0.125               0.191      0.151
0.5              0.101      0.196      0.139               0.190      0.166
0.9              0.101      0.195      0.185               0.192      0.320

Correlation across categories ρs=0.1
0                0.116      0.207      0.139               0.204      0.159
0.1              0.116      0.205      0.141               0.204      0.162
0.5              0.116      0.205      0.152               0.203      0.175
0.9              0.116      0.206      0.197               0.202      0.310

Correlation across categories ρs=0.5
0                0.223      0.289      0.236               0.285      0.235
0.1              0.223      0.286      0.239               0.284      0.234
0.5              0.223      0.286      0.246               0.284      0.243
0.9              0.223      0.287      0.281               0.284      0.300

Correlation across categories ρs=0.75
0                0.350      0.383      0.350               0.382      0.346
0.1              0.350      0.380      0.349               0.383      0.344
0.5              0.350      0.381      0.359               0.376      0.346
0.9              0.350      0.377      0.376               0.378      0.376

Notes: The values of ρs determine the cross-correlation of the idiosyncratic shocks between series from
different categories, and the values of ρl determine the cross-correlation of the idiosyncratic shocks
between series from the same category. T is the sample size. Parameters A and c measure the serial
correlation of the factor and the idiosyncratic shocks, respectively. MSErs refers to the Mean Squared
Error of the estimation with the 10 representative series of each category, MSEns to the model with 10
arbitrarily chosen series, and MSEl to the model with 100 series.
Table 2. Simulations common factor estimator (T=50, c=0.1, A=0.5)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.100      0.191      0.175               0.190      0.202
0.1              0.100      0.190      0.175               0.188      0.200
0.5              0.100      0.192      0.190               0.188      0.217
0.9              0.100      0.191      0.236               0.187      0.350

Correlation across categories ρs=0.1
0                0.115      0.204      0.191               0.201      0.207
0.1              0.115      0.203      0.191               0.201      0.208
0.5              0.115      0.204      0.206               0.200      0.229
0.9              0.115      0.203      0.250               0.199      0.340

Correlation across categories ρs=0.5
0                0.227      0.293      0.294               0.290      0.290
0.1              0.227      0.291      0.297               0.288      0.289
0.5              0.227      0.291      0.305               0.290      0.304
0.9              0.227      0.291      0.343               0.288      0.368

Correlation across categories ρs=0.75
0                0.372      0.399      0.414               0.403      0.409
0.1              0.372      0.400      0.415               0.405      0.415
0.5              0.372      0.407      0.430               0.402      0.422
0.9              0.372      0.400      0.450               0.402      0.448

Notes: See notes of Table 1.
Table 3. Simulations common factor estimator (T=50, c=0.1, A=0.75)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.097      0.182      0.382               0.180      0.395
0.1              0.097      0.182      0.384               0.180      0.427
0.5              0.097      0.183      0.397               0.181      0.429
0.9              0.097      0.182      0.444               0.180      0.525

Correlation across categories ρs=0.1
0                0.112      0.195      0.398               0.192      0.417
0.1              0.112      0.195      0.400               0.193      0.421
0.5              0.112      0.196      0.413               0.194      0.428
0.9              0.112      0.195      0.459               0.191      0.559

Correlation across categories ρs=0.5
0                0.230      0.290      0.510               0.289      0.515
0.1              0.230      0.291      0.512               0.286      0.506
0.5              0.230      0.291      0.524               0.288      0.524
0.9              0.232      0.289      0.565               0.286      0.574

Correlation across categories ρs=0.75
0                0.406      0.425      0.644               0.432      0.650
0.1              0.406      0.425      0.646               0.430      0.652
0.5              0.406      0.425      0.655               0.426      0.680
0.9              0.406      0.425      0.688               0.427      0.711

Notes: See notes of Table 1.
Table 4. Forecasting accuracy (T=50, c=0.1, A=0.1)

                          Same number of series in each category   Oversampling one category
ρl      γ      MSFErs     MSFEns     MSFEl                         MSFEns     MSFEl

Correlation across categories ρs=0
0       0      1.107      1.215      1.140                         1.171      1.161
0       0.3    1.101      1.202      1.161                         1.218      1.199
0       0.8    1.086      1.172      1.099                         1.178      1.173
0.9     0      1.107      1.354      1.341                         1.378      1.558
0.9     0.3    1.101      1.146      1.129                         1.166      1.286
0.9     0.8    1.086      1.273      1.235                         1.318      1.459

Correlation across categories ρs=0.5
0       0      1.197      1.280      1.198                         1.371      1.342
0       0.3    1.200      1.248      1.237                         1.324      1.321
0       0.8    1.154      1.222      1.156                         1.288      1.238
0.9     0      1.197      1.324      1.314                         1.248      1.239
0.9     0.3    1.200      1.320      1.300                         1.441      1.425
0.9     0.8    1.154      1.320      1.319                         1.394      1.407

Notes: ρl is the correlation within categories and γ the persistence of the target series. The estimated
model is yt+1 = βFt + γyt + ey,t+1.
Table 5. Simulated Out-of-Sample Forecasting Results
Industrial Production, 12-Month Horizon. Sample period 1997:01 to 2010:05, out-of-sample
forecast period 2006:01 to 2010:05.

Forecast method                            MSFE
LSDFM, r=1                                 0.0038
LSDFM, r=2                                 0.0047
LSDFM, r*                                  0.0100
SSDFM, LI, r=1                             0.0033
AR, annual growth                          0.0037
VAR, annual growth and LSDFM, r*           0.0084
VAR, annual growth and SSDFM, LI, r=1      0.0043

Note: The parameter r determines the number of common factor series estimated.
APPENDIX
Table A1. Simulations from estimating the common factor (T=50, c=0.75, A=0.1)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.169      0.265      0.317               0.272      0.382
0.9              0.172      0.265      0.518               0.263      0.604

Correlation across categories ρs=0.75
0                0.503      0.506      0.538               0.508      0.534
0.9              0.503      0.505      0.591               0.510      0.596

Notes: See notes of Table 1.
Table A2. Simulations from estimating the common factor (T=50, c=0.75, A=0.75)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.251      0.470      0.507               0.482      0.558
0.9              0.251      0.461      0.696               0.496      0.833

Correlation across categories ρs=0.75
0                0.751      0.820      0.885               0.843      0.874
0.9              0.749      0.819      0.964               0.845      0.961

Notes: See notes of Table 1.
Table A3. Simulations from estimating the common factor (T=150, c=0.1, A=0.1)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.095      0.175      0.108               0.176      0.134
0.9              0.094      0.176      0.161               0.175      0.333

Correlation across categories ρs=0.75
0                0.350      0.376      0.340               0.375      0.333
0.9              0.350      0.377      0.370               0.375      0.364

Notes: See notes of Table 1.
Table A4. Simulations from estimating the common factor (T=150, c=0.1, A=0.75)

                 Same number of series in each category    Oversampling one category
ρl               MSErs      MSEns      MSEl                MSEns      MSEl

Correlation across categories ρs=0
0                0.092      0.168      0.195               0.168      0.218
0.9              0.092      0.169      0.252               0.169      0.314

Correlation across categories ρs=0.75
0                0.409      0.427      0.487               0.427      0.477
0.9              0.409      0.428      0.531               0.429      0.523

Notes: See notes of Table 1.
Table A5. Simulations from estimating the common factor (T=50, c=0.1, A=0.1). The
number of common factors is selected as in Bai and Ng (2002).

                 Same number of series in each category    Oversampling one category
ρl               r̂          MSEl                           r̂          MSEl

Correlation across categories ρs=0
0                2.31        0.121                          1           0.147
0.9              10.89       0.140                          1.84        0.196

Correlation across categories ρs=0.75
0                2.60        0.326                          1.20        0.350
0.9              10.89       0.288                          2.04        0.363

Notes: The values of r̂ are the average number of estimated factors across replications. See notes of
Table 1.
Table A6. Simulations from estimating the common factor (T=50, c=0.1, A=0.75). The
number of common factors is selected as in Bai and Ng (2002).

                 Same number of series in each category    Oversampling one category
ρl               r̂          MSEl                           r̂          MSEl

Correlation across categories ρs=0
0                2.39        0.380                          1           0.404
0.9              10.88       0.403                          1.89        0.455

Correlation across categories ρs=0.75
0                2.58        0.621                          1.24        0.643
0.9              10.86       0.587                          2.06        0.667

Notes: The values of r̂ are the average number of estimated factors across replications. See notes of
Table 1.