Dynamic Factor Model for Heterogeneous Data
One of the major limitations of classical econometric and multivariate time series models is that the number of parameters to estimate grows with the square of the dimension of the vector of time series, with the consequent loss of degrees of freedom. When addressing empirical issues, it is crucial to find simplified structures that can be correctly estimated. As a solution to the problem of dimensionality, factor models have become one of the most useful tools among researchers and practitioners. Chapter 1 reviews the main characteristics of this methodology and its different specifications in the literature.
The first applications of Dynamic Factor Models (DFM) to macroeconomic series were proposed by Geweke (1977) and Sargent et al. (1977) as an extension of classical static factor models to the field of time series, and were initially known as index models. Since then, an extensive theoretical and empirical literature on them has developed. The main idea behind factor decomposition in time series analysis is that the co-movements of a high-dimensional vector of observed variables, $y_t$, are driven by two mutually orthogonal components: a small number of latent dynamic factors, $F_t$, and a vector of mean-zero idiosyncratic disturbances, $e_t$, that are specific to each individual series. Let us consider the following factor model representation for observation $y_{it}$, with i = 1, ..., N, where N is the number of cross-section units, and t = 1, ..., T, where T is the number of time series observations:
$y_{it} = p_i' F_t + e_{it}$, where each component of the (r x 1) vector $p_i$, given by $p_{ij}$ for i = 1, ..., N and j = 1, ..., r, is known as a factor loading, r is the number of latent factors, and $p_i' F_t$ is the common component of that observation. It is usually assumed that the latent factors follow autoregressive dynamics of order p.
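As an illustration of the representation above, the following minimal sketch simulates data from a factor model with autoregressive factors. The function name `simulate_dfm`, the AR(1) coefficient `phi`, the Gaussian shocks and the unit-variance idiosyncratic errors are all assumptions made for this example; they are not taken from the thesis.

```python
import numpy as np

def simulate_dfm(N=20, T=200, r=2, phi=0.7, seed=0):
    """Simulate y_it = p_i' F_t + e_it with AR(1) factors.

    phi, the Gaussian shocks and the unit-variance errors are
    illustrative assumptions, not values taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(size=(N, r))        # row i holds the loadings p_i'
    F = np.zeros((T, r))               # row t holds the factors F_t'
    shocks = rng.normal(size=(T, r))
    F[0] = shocks[0]
    for t in range(1, T):
        # AR(1) dynamics, the simplest case of autoregressive order p
        F[t] = phi * F[t - 1] + shocks[t]
    common = F @ P.T                   # common component p_i' F_t (T x N)
    E = rng.normal(size=(T, N))        # idiosyncratic disturbances e_it
    return common + E, F, P, common

Y, F, P, common = simulate_dfm()
print(Y.shape)                         # dimensions (T, N) of the panel
print(np.linalg.matrix_rank(common))   # rank of the common component
```

The common component has rank r even though the panel has N columns, which is the dimension reduction that the factor structure provides.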
Given the advantages of these models for dimension reduction, the literature on DFM has distinguished different versions and implementations. Following the literature, this thesis considers two possible classifications of the DFM: one depending on the number of observable time series, N, used to estimate the latent factors; and the other depending on the assumptions made on $F_t$ for it to be common and on $e_t$ for it to be idiosyncratic.
In Chapter 2 a criterion for estimating the number of static factors is proposed. It can be seen as a test, as Ahn and Horenstein (2013) call it, or as a criterion or rule, as Lam and Yao (2012) call it. The proposed criterion is based on eigenvalue ratios; it combines the advantages of those proposed by Ahn and Horenstein (2013) and Lam and Yao (2012) (AH and LY from now on) and adds four others. First, it is based on correlation instead of covariance matrices and, therefore, the test is robust to a few atypical series with large variance that can dominate the results of a test based on the eigenvalues of covariance matrices. Second, the new test uses all the available information about the dependency among the series, as it incorporates both the lag-zero dependency (as in the AH test) and the dependency at positive lags (as in the LY criterion). Third, instead of simply adding the lagged covariance matrices, the lagged matrices are combined with weights that depend on the estimation precision of each matrix. Fourth, theoretical reasons are given to justify that, when the series are heteroscedastic, ratios of eigenvalues of correlation matrices are expected to be more powerful for detecting the number of factors than those of covariance matrices.
The proposed eigenvalue test is based on a weighted combination of the correlation matrices of the observed data, called the combined correlation matrix. Different weights can be considered, but a simple solution is to use the asymptotic variance of the autocorrelation and cross-correlation coefficients of a stationary white-noise process.
Let $\alpha_1 \ge \alpha_2 \ge \dots \ge \alpha_N$ be the ordered estimated eigenvalues of the combined correlation matrix. The test selects the number of factors, r, as $r = \arg\max_{1 \le i \le r^*} \alpha_i / \alpha_{i+1}$, for some $r^* = \alpha N$.
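The eigenvalue-ratio rule can be sketched in a few lines. This is only a rough illustration under stated assumptions: the exact form of the combined correlation matrix and of its weights is not given here, so the sketch assumes a weighted sum of R(k) R(k)' over lags k = 0, ..., k0 with weights proportional to T - k. The function names, `k0`, the weights, the upper bound `r_star = N // 2` and the simulated data are all hypothetical choices for the example.

```python
import numpy as np

def lagged_correlation(Y, k):
    """Lag-k sample cross-correlation matrix of the columns of Y (T x N)."""
    T = Y.shape[0]
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    return Z[k:].T @ Z[:T - k] / T

def combined_correlation(Y, k0=2):
    """Weighted sum of R(k) R(k)' for k = 0..k0 (assumed form, see lead-in)."""
    T, N = Y.shape
    # Assumed weights proportional to T - k, a rough proxy for the
    # precision with which each lagged matrix is estimated.
    w = np.array([T - k for k in range(k0 + 1)], dtype=float)
    w /= w.sum()
    M = np.zeros((N, N))
    for k in range(k0 + 1):
        R = lagged_correlation(Y, k)
        M += w[k] * (R @ R.T)          # symmetric, positive semi-definite
    return M

def eigenvalue_ratio_estimate(M, r_star):
    """Return argmax over 1 <= i <= r_star of alpha_i / alpha_{i+1}."""
    alpha = np.sort(np.linalg.eigvalsh(M))[::-1]   # descending eigenvalues
    ratios = alpha[:r_star] / alpha[1:r_star + 1]
    return int(np.argmax(ratios)) + 1, ratios

# Hypothetical data: 2 AR(1) factors, series rescaled to unequal variances.
rng = np.random.default_rng(1)
N, T, r = 20, 300, 2
P = rng.normal(size=(N, r))
F = np.zeros((T, r))
for t in range(1, T):
    F[t] = 0.7 * F[t - 1] + rng.normal(size=r)
scales = rng.uniform(0.5, 3.0, size=N)
Y = (F @ P.T + rng.normal(size=(T, N))) * scales

r_hat, ratios = eigenvalue_ratio_estimate(combined_correlation(Y), r_star=N // 2)
print(r_hat)
```

Because the matrix is built from correlations, rescaling any individual series leaves it unchanged, which is the source of the robustness to a few series with large variance.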
The test proposed in this chapter has been evaluated in different scenarios depending on the idiosyncratic error structure. It has shown a better overall performance than the ones proposed by Ahn and Horenstein (2013) and Lam et al. (2011). The advantages of the test appear mostly under a realistic error structure that includes heteroscedasticity in the series and allows the errors to present cross-sectional and serial correlations. Also, it has been illustrated in a real example that this test is less affected by atypical series with large variability and, therefore, has clear advantages in empirical applications.
Chapter 3 extends the method introduced in Chapter 2, which is based on correlation matrices, to the estimation of the factor space. This chapter focuses on the estimation of the DFM by means of non-parametric statistical tools. The best-known technique in this area is Principal Component Analysis (PCA), which takes into account contemporaneous information in the data. To our knowledge, little attention has been given to estimating the common component taking past information into account; previous works on this topic are Peña and Box (1987) and Lam et al. (2011). The interest is to analyse how, and to what degree, different idiosyncratic error structures, which are more realistic than the classical scalar error structure, may affect the estimation of the DFM. The effect of different error structures is compared across three estimators: the PC estimator considered in Stock and Watson (2002), Bai and Ng (2002) and Bai and Ng (2006), among others; the pooling lagged estimator proposed in Lam et al. (2011), called the LY estimator in what follows; and the one proposed in this chapter based on lagged correlation matrices, called CP in what follows. The main contribution is a Monte Carlo analysis of different error structures in finite samples, where the exact and the approximate DFM are examined in depth.
This chapter evaluates the finite-sample performance of the principal component estimator, the estimator based on the eigenvectors of the pooling lagged matrices and the proposed estimator based on the eigenvectors of the combined dynamic correlation matrix. Simulation experiments have been conducted to analyse for which sample sizes, T, and dimensions of the time series, N, it is more advantageous to use the classical principal component estimator or the lagged estimators. Simulation results comparing the three methodologies, under different idiosyncratic error structures, show that the Relative Precision Growth rate from using the CP procedure with the combined correlation matrix can be up to 140%, whereas the losses are at most 3%. Furthermore, these gains are obtained under a more realistic error structure than the classical one with homoscedastic errors, given that errors may present some degree of serial and cross-sectional dependence.
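The kind of comparison described above can be sketched as follows. The example estimates the loading space with the principal component estimator (eigenvectors of the lag-zero covariance matrix) and with a pooled lagged estimator in the spirit of Lam et al. (2011) (eigenvectors of the sum of Γ(k) Γ(k)' over positive lags). It does not implement the CP estimator itself, and the helper names, the `space_fit` measure, `k0 = 2` and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def lagged_cov(Y, k):
    """Lag-k sample autocovariance matrix of the columns of Y (T x N)."""
    T = Y.shape[0]
    Z = Y - Y.mean(axis=0)
    return Z[k:].T @ Z[:T - k] / T

def top_eigenvectors(M, r):
    """Orthonormal eigenvectors of symmetric M for its r largest eigenvalues."""
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argsort(vals)[::-1][:r]]

def space_fit(P, V):
    """Share of the loading matrix P captured by span(V); 1 is exact recovery."""
    proj = V @ (V.T @ P)
    return np.sum(proj ** 2) / np.sum(P ** 2)

# Hypothetical data: 2 AR(1) factors, homoscedastic white-noise errors.
rng = np.random.default_rng(2)
N, T, r, k0 = 20, 300, 2, 2
P = rng.normal(size=(N, r))
F = np.zeros((T, r))
for t in range(1, T):
    F[t] = 0.7 * F[t - 1] + rng.normal(size=r)
Y = F @ P.T + rng.normal(size=(T, N))

# Principal component estimator: contemporaneous information only.
V_pc = top_eigenvectors(lagged_cov(Y, 0), r)

# Pooled lagged estimator: information from lags 1..k0.
M_lag = sum(lagged_cov(Y, k) @ lagged_cov(Y, k).T for k in range(1, k0 + 1))
V_lag = top_eigenvectors(M_lag, r)

# Estimated factors (up to rotation) from the PC loading space.
F_hat = (Y - Y.mean(axis=0)) @ V_pc

fit_pc, fit_lag = space_fit(P, V_pc), space_fit(P, V_lag)
print(fit_pc, fit_lag)
```

A Monte Carlo study would repeat this over many replications and error structures (heteroscedastic, serially or cross-sectionally correlated) and compare the resulting fit measures across estimators.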
Chapter 4 examines the disparities in the evolution of business cycle synchronization across the members of the Euro Area (EA) by proposing the following two-step procedure. In the first step, EA and country-specific measures of aggregate economic activity are obtained by constructing a large dataset of cross-country series from several macroeconomic categories, whose co-movements are captured by a Dynamic Factor Model with Cluster Structure (DFMCS).
In the second step, the measures of aggregate economic activity obtained in the factor analysis are used in the Markov-switching framework developed by Leiva-Leon (2017) to draw inferences about the synchronization of business cycles across the EA members. In contrast to other standard approaches, which summarize the overall level of synchronization in a single number for the entire sample period, this multivariate Markov-switching approach makes it possible to compute a measure of pairwise synchronization at each time observation in the sample. Therefore, the evolution of the time-varying dynamic interactions across the business cycles of the EA members can be examined. In the proposal, the DFM of Kose et al. (2003) and Crucini et al. (2011) is used because it makes it possible to distinguish between common sources of variation in the Union and nation-specific factors.
Using a recent dataset, which encompasses the financial and the sovereign debt crises, it is found that, overall, the degree of synchronization of the EA members remained stable until the financial crisis, which brought a dramatic reduction in the degree of synchronization due to the different effects of this shock on each country. Thereafter, all the countries showed a progressive recovery in synchronization towards pre-crisis levels. Notably, there exist significant discrepancies in the recovery paths. Some countries have been able to return to their pre-crisis level of synchronization very quickly, and some have even improved on their initial levels. However, some EA members are still far from recovering their pre-crisis degrees of business cycle synchronization.
In Chapter 5 the interest is to analyse the existence of common factors that describe a global behaviour in the international energy market, together with group-specific factors explaining energy prices related to regions, countries or industrial sectors.
To our knowledge, it is a novelty to analyse the co-movements of international energy prices in a big-data scenario of 30 countries and 12 industrial sectors. The data set is from Sato et al. (2019). The Dynamic Factor Model with Cluster Structure (DFMCS) is used for the analysis. This model makes it possible to investigate whether there is a group structure among international energy prices, to characterize the heterogeneity of the global energy market based on industry, country or region, to quantify the extent to which “crisis” affected global energy prices, and to identify the sources that explain the cross-section variations in energy prices through country-specific control variables.
An extension of the methodology proposed in Alonso et al. (2020) is presented in order to study the effect of country-specific control variables on energy prices. Also, a Monte Carlo simulation is provided to evaluate the finite-sample performance of the clustering procedure of Alonso et al. (2020) when control variables are taken into account. Results provide useful interpretations about the existence of trading groups of countries in the global energy market.
Results from the application to international energy prices provide useful interpretations about the existence of co-movements between energy prices related to groups of countries rather than to groups of industrial sectors. Country connections within groups may also be explained by the high price of a specific fuel type. This analysis gives new insights for public policy decision making, for formulating and implementing environmental policies, and for energy market investors seeking to diversify their portfolios.
Chapter 6 presents the conclusions. This research aimed to improve the way dynamic factor models (DFMs) are built and to show their applications to large databases. Results from this thesis may be useful for economic policy decisions and for analysing high-dimensional heterogeneous dynamic data.
The main contributions of the thesis are as follows. First, this thesis presents a new approach for finding the number of factors in a DFM and estimating them. Second, it extends the methodology proposed by Alonso et al. (2020) for building DFMs with cluster structure by introducing the effect of macroeconomic variables. Third, it shows how DFMs can contribute to the analysis of macroeconomic variables that are representative of the business cycle, to the study of CO2 emissions, to the evaluation of the synchronization of Euro Area business cycles and to the investigation of co-movements between international energy prices.
The first drawback that researchers encounter when estimating the DFM is to select the number of common factors. In this thesis a new eigenvalue ratio test is proposed and theoretical reasons for the advantages of the proposal are presented, especially when the error structures include heteroscedasticity and serial and cross-sectional dependencies. These properties have been confirmed by Monte Carlo simulations and by an application to real macroeconomic data.
Furthermore, the proposed approach is extended to the estimation of the common component of the model. Monte Carlo simulations show improvements in the estimation of the common factors with respect to alternative methods, especially when the errors are heteroscedastic and present serial and cross-sectional correlation. An application to real data on CO2 emissions shows that the proposal provides interpretable results that are more meaningful than those of alternative methods.
Next, the usefulness of DFMs with cluster structure (DFMCS) is analysed in two empirical applications: one on the synchronization of Euro Area business cycles and the other on international energy prices.
The first application, on the synchronization of Euro Area business cycles, shows the advantages of considering the DFMCS when analysing economic relations between member countries, evaluating the effect of expansionary monetary policies and studying the effect of the financial crisis in 2008 and the European sovereign debt crisis in 2011. Results show that, although the countries experienced a generalized fall in synchronization during the financial crisis, they recovered the levels of synchronization that characterized the pre-recession period. Furthermore, results support the presence of a two-speed Europe after the financial crisis in terms of economic synchronization.
Finally, an extension of the methodology proposed by Alonso et al. (2020) is included. In the thesis a penalized regression is proposed to estimate the coefficients associated with the explanatory variables. The effect of this estimation method on the group factor structure has been evaluated with Monte Carlo simulations under different data generating processes. This new proposal has been applied to a large data set of international energy prices with country-specific explanatory variables. Results from the analysis identify the existence of co-movements between energy prices related to groups of countries and highlight the effect of some macroeconomic variables.
Future extensions of this research are: first, to develop the theoretical framework behind the estimation of the factor space for the proposed new approach; second, to build up the theoretical assumptions associated with the effect of including country-specific exogenous variables in the DFMCS; third, to evaluate the performance of the new approach under different model specifications for the common latent factors; fourth, to extend the new approach to the analysis of nonstationary data and time-varying parameters; and fifth, to evaluate the potential of the proposed estimation method for forecasting with large databases.