Abstract
Gross domestic product (GDP) is one of the key economic variables observed to assess a country’s overall economy. India is the third largest economy in the world but lags in the quality and timeliness of GDP reporting. The USA, UK, Euro zone and most developed countries provide better quality and more timely information on GDP. Nowcasting is defined as the estimation of the very recent past, the immediate present and the very recent future (Giannone, Reichlin, & Small, 2008). Much of the work on GDP nowcasting uses pseudo real time data, whereas our research uses a real time dataset for both the dependent and independent variables to nowcast Indian GDP in real time. However, real time datasets have issues of data revisions and biases, which are handled in this article using a factor modelling approach with a bridge model and a vector autoregression model. We also explore the impact of within-quarter information flow, which provides an opportunity to improve nowcasting accuracy by using the most recent information.
Introduction
The state of the real economy is continuously evolving. Policy makers and businesses require an accurate and timely understanding of real economic activity for demand estimation, inventory management, investment decisions and policy formulation. Information on the current gross domestic product (GDP) growth rate is of great importance for understanding the overall state of the underlying economy. Quarterly GDP data for India have been available since 1996 from the Central Statistical Organization (CSO) under the Ministry of Statistics and Programme Implementation (MOSPI). Indian quarterly GDP data are released with a substantial time lag of 45–60 days after the end of the reference quarter (Bhadury, 2018). Hence, for almost the first 2 months of the next quarter, we do not have actual GDP data for either the last or the current quarter. In 2015, MOSPI revised the base year to 2011–2012, and the back series became publicly available only after much delay, in November 2018.
Nowcasting is defined as the estimation of the very recent past, the immediate present and the very recent future (Giannone, Reichlin, & Small, 2008). GDP nowcasting is a relatively new field in economics. Until recently, it had received very little attention as a research area (Bernanke & Boivin, 2003). However, various relevant key economic indicators, such as the index of industrial production (IIP), inflation index, imports, exports, railway, money flow, etc., are available at higher frequency and in a timely fashion. The information content of these high-frequency variables can be used to estimate the quarterly GDP growth rate well before its official release by the CSO.
A nowcasting exercise is a short-term prediction comprising last and current quarter estimations; hence there is a great advantage in using within-quarter information flow. Each month, some new information on relevant monthly variables becomes available through either a fresh release or a revision of already released data. Empirical studies have shown that with every intra-quarter release of information, the current quarter GDP nowcasts become progressively more accurate. Our modelling strategy incorporates any new information in the selected monthly independent variables by revising the implicit weights or coefficients in the nowcasting models.
In real time, available economic information is either an estimate or provisional and is subject to future revisions after its initial release. Revisions in economic variables happen at different intervals of time and are of different magnitudes (Zellner, 1958). This study uses real time (vintage) data series for quarterly GDP and eighteen other independent variables in nowcast modelling. To the best of our knowledge and according to the available literature, our study is the first attempt to nowcast Indian GDP using real time (vintage) data series.
We used a bridge model and a vector autoregressive (VAR) model with factors extracted as principal components from eighteen independent monthly variables. Principal component analysis (PCA) was first used by Karl Pearson in 1901 and later independently by Harold Hotelling in 1933. Principal components are used to exploit the information available in a large number of variables by identifying a relatively small number of components (Giannone, Reichlin, & Small, 2008). The principal components are extracted using all variables, and the decision on the final number of principal components for nowcast modelling is based jointly on an eigenvalue criterion and on each component's incremental contribution to total variance. The principal components cancel out the impact of data revisions and biases across real time data series and hence are robust to both issues (Aastveit & Trovik, 2013; Giannone, Reichlin, & Small, 2008). Therefore, nowcasting models using principal components as factors can extract the information content of a large number of variables and also provide a remedy for data revisions and biases in a real time dataset.
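The joint selection rule described above can be sketched in a few lines of Python. This is only an illustration: the article does not report its cutoffs, so the eigenvalue threshold of one and the 10 per cent minimum incremental share are assumptions, and the eigenvalues in the example are hypothetical rather than taken from the study's data.

```python
from typing import List, Sequence


def select_num_components(
    eigenvalues: Sequence[float],
    min_eigenvalue: float = 1.0,          # assumed cutoff, not from the article
    min_incremental_share: float = 0.10,  # assumed cutoff, not from the article
) -> int:
    """Count leading components that pass both the eigenvalue and the share rule."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    kept = 0
    for value in ordered:
        share = value / total  # incremental contribution to total variance
        if value > min_eigenvalue and share >= min_incremental_share:
            kept += 1
        else:
            break  # components are ordered, so later ones fail as well
    return kept


def cumulative_shares(eigenvalues: Sequence[float]) -> List[float]:
    """Cumulative share of total variance explained by the leading components."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    shares = []
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares


# Hypothetical eigenvalues of a 6-variable correlation matrix (they sum to 6).
example_eigenvalues = [3.0, 1.5, 0.9, 0.3, 0.2, 0.1]
n_components = select_num_components(example_eigenvalues)
explained = cumulative_shares(example_eigenvalues)[n_components - 1]
```

In this made-up example two components are retained and together account for about three quarters of total variance. In a real time setting the rule would be re-applied at every month end, since the eigenvalues change with each new vintage.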
As a convention, each model's performance is compared with a univariate benchmark model. Multiplicative seasonal autoregressive integrated moving average (MSARIMA) modelling of the individual series was carried out at the end of each month to estimate missing values up to the current quarter. MSARIMA belongs to the category of univariate time series models. We also used the Survey of Professional Forecasters (SPF) by the Reserve Bank of India (RBI) as a benchmark model. Angel, Geert, and Min (2007) suggest that survey forecasts outperform others because they draw on the training, expertise and experience of professional forecasters. Our results show the superior performance of the factor VAR model over all three alternatives, namely the univariate benchmark, the factor bridge model and the SPF, in nowcasting the quarterly GDP growth rate of the Indian economy in real time.
The rest of the article is structured as follows. The next section reviews the literature, followed by the objective and rationale of the study. The main sections discuss methodology, data and analysis. We also present the managerial implications and future scope of this research work.
Review of Literature
Nowcasting in economics is used for those macroeconomic variables that are available at low frequency, typically quarterly and released with a publication lag. Literature survey evaluates different nowcasting techniques, challenges in using real time dataset and elaborates implications of data revision on nowcasting models.
Box and Jenkins (1970) popularized autoregressive integrated moving average (ARIMA) models for economic forecasting, which are simple and perform well (Stock & Watson, 2002). The ARIMA model uses the information incorporated in the existing series for forecasting (Ghosh, 2009). These models are linear and include lags of the variable of interest. ARIMA models are used as benchmark or rival models for comparing the performance of other nowcasting models because of their simplicity and their independence from other series for estimation.
Further literature focused on multivariate modelling techniques for nowcasting quarterly GDP growth rate using monthly or high-frequency variables. Statistical agencies or central banks release information on quarterly GDP growth rate with a publication lag. However, information on various leading and coincident monthly variables relevant to quarterly GDP growth rate is available early and in a timely fashion (Baffigi, Golinelli, & Parigi, 2004; Diron, 2006; Kitchen & Monaco, 2003). Therefore, multivariate techniques can take advantage of the information content of relevant monthly variables to nowcast quarterly GDP growth rate in real time. For example, the index of industrial production (IIP) directly measures certain components of quarterly GDP and hence contains a strong signal for the GDP growth rate nowcast. Monthly IIP data are also available shortly after the reference month. This gives an opportunity to update the GDP nowcast every month, that is, three times within a reference quarter.
One of the early approaches that used quarterly and monthly variables simultaneously was bridge modelling (Baffigi, Golinelli, & Parigi, 2004). Bridge models are widely used in policy institutions and central banks for obtaining an early estimate of quarterly GDP using the information available in selected monthly variables. Bridging here implies linking monthly variables with quarterly variables: the bridge equation links quarterly GDP with aggregated monthly variables to nowcast quarterly GDP.
Unlike bridge models, the VAR is a dynamic system exploiting the interrelationships among all the variables. It is an approach that makes no or minimal assumptions based on underlying economic theory (Sims, 1980). However, a VAR model with too many lags or variables performs poorly because of spurious fit or overfitting in the estimation period. Also, a VAR nowcast assumes that the past relationship holds in the future and is therefore susceptible to structural breaks. The dimensionality problem can be overcome by using principal components or factors in the VAR model, which not only reduces the number of variables in the VAR but also takes care of data revision and bias.
Nowcasting performance depends not on the absolute value of individual data revisions but on their relative size. The relative size of data revision is measured by the noise–signal ratio, which varies across economic variables (Aruoba, 2008). Further, if data revisions are small and random, they do not matter much for nowcasting accuracy. Therefore, the use of many variables in place of a single variable or a few variables, as in the case of principal components, provides a statistical method to take care of data revision through the pooling of random errors (Giannone, Reichlin, & Small, 2008; Stock & Watson, 1999, 2002). Second, the use of growth rates instead of levels helps in reducing the impact of data revision, since levels are more sensitive to data revision (Stock & Watson, 1989). Data revision influences the nowcast in three ways: (a) a direct channel because of revision in variable values, (b) an indirect channel because of change in estimated coefficients and (c) change in overall model specification (Stark & Croushore, 2002). Real time data nowcasting therefore requires re-estimation of the model with each new piece of information. A recursive modelling approach can help to overcome the issue of data revision by updating the nowcast model in real time.
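The pooling argument can be made explicit with a stylized calculation. The setup is an assumption introduced here for illustration and is not the article's model: each of n variables equals a common signal plus an independent revision error of equal variance, and a simple cross-sectional average stands in for a principal component.

```latex
x_{i,t} = s_t + e_{i,t}, \qquad
\operatorname{Var}(e_{i,t}) = \sigma_e^2, \qquad
\operatorname{cov}(e_{i,t}, e_{j,t}) = 0 \ \text{for } i \neq j

\bar{x}_t = \frac{1}{n}\sum_{i=1}^{n} x_{i,t}
          = s_t + \frac{1}{n}\sum_{i=1}^{n} e_{i,t}, \qquad
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} e_{i,t}\right) = \frac{\sigma_e^2}{n}

\frac{\sigma_e^2 / n}{\sigma_s^2} = \frac{1}{n}\cdot\frac{\sigma_e^2}{\sigma_s^2}
```

Under these assumptions the variance-based noise-to-signal ratio of the pooled series is 1/n times that of a single variable, so with eighteen variables it would be one eighteenth. Correlated or systematically biased revisions would weaken this result.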
Objectives of the Study
The objective of the study is to nowcast the Indian quarterly GDP growth rate using a large number of high-frequency monthly variables in real time.
Rationale of the Study
There is no real time study of quarterly GDP growth rate for India. Our study has used the real time dataset compiled by the author for Indian economy. We also evaluate the impact of information flow within quarter by nowcasting GDP growth rate at end of each successive month using all available information up to that point in time. The nowcasting horizon in our study for quarterly GDP growth rate is current quarter and in case of non-availability of previous quarter information we also nowcast GDP growth rate for previous quarter. The nowcast horizon up to current quarter additionally ensures that the nowcast model performance is judged on its modelling characteristics rather than influenced by estimation accuracy of independent variables for future horizon.
Methodology and Data
DATA
Statistical agencies often release data to the public well before their samples have been completed. Over time they gather more information and improve upon their initial estimates of economic activity. Economic variables, including quarterly GDP growth rate, follow this pattern of data revision and become more accurate over time. The implication of data revision is that the nowcast will be a function of the dataset available at the time of observation (Croushore & Stark, 2001). Kishor (2011) studied the data revision of quarterly GDP growth rate for the Indian economy and concluded that the revisions are significant and are neither pure news nor pure noise.
Real time data on economic indicators for developed countries are readily available from various sources such as the Federal Reserve Bank of Philadelphia, the Federal Reserve Bank of St. Louis, the EABCN Real Time Database (RTDB) and the Federal Reserve Bank of Dallas. A real time dataset for Indian economic variables is not publicly available in compiled form. We have compiled the individual data series for all eighteen economic variables and quarterly GDP (Table 1).
Table 1. Economic Variable and Source
The real time historical data series of each variable was compiled from past data in its original source of publication. The complete dataset runs from April 1996 to December 2013, a total of 213 months, and comprises 4,047 unique time series. This real time dataset perfectly mimics the information flow available to a forecaster in real life. The first vintage we use is for December 2009 and the last is for December 2013. Descriptive statistics are available in Table 2.
Table 2. Descriptive Statistics of Original Series
Unit root testing and the appropriate data transformation are conducted at each month end after considering all available information in real time. Non-stationary variables are made stationary through an appropriate transformation of the individual time series. This exercise has been done for all 19 series for all 48 months from December 2009 to December 2013. The degree of information availability varies within a month because of differences in publication dates and missing values across variables. Missing values are estimated using univariate models, generally autoregressive models (Angelini, Camba-Méndez, Giannone, Reichlin, & Rünstler, 2011). The nowcasting literature on quarterly GDP growth rate aggregates the monthly variables to quarterly frequency, so that the left- and right-hand side variables of the nowcasting equation have the same frequency (Foroni & Marcellino, 2014).
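As a rough illustration of how missing end-of-sample months might be filled with a univariate autoregressive model, the sketch below fits an AR(1) by ordinary least squares and iterates it forward. The AR(1) is a deliberate simplification chosen here (the article reports MSARIMA models for this step), the series is assumed to be stationary already, and the function names and numbers are hypothetical.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """OLS estimates (c, phi) for y_t = c + phi * y_{t-1} + e_t."""
    lagged = series[:-1]
    current = series[1:]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def fill_ragged_edge(series: List[float], n_missing: int) -> List[float]:
    """Append n_missing iterated AR(1) forecasts to the observed series."""
    c, phi = fit_ar1(series)
    filled = list(series)
    for _ in range(n_missing):
        filled.append(c + phi * filled[-1])
    return filled


# Hypothetical monthly series with the last two months of the quarter unpublished.
observed = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25]
completed = fill_ragged_edge(observed, n_missing=2)
```

The example series is constructed to follow y_t = 1 + 0.5 y_{t-1} exactly, so the two appended values are 2.125 and 2.0625. With real data the fit would be repeated on each new vintage.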
Bridge Model
Generally, information on high-frequency variables is available earlier and in a more timely manner than information on quarterly variables. The bridge model regresses quarterly GDP on a set of high-frequency indicators that are temporally aggregated to quarterly frequency, so that both sides of the equation have the same frequency (Angelini, Camba-Méndez, Giannone, Reichlin, & Rünstler, 2011). The missing within-quarter values of the high-frequency variables are estimated using univariate methods before aggregation (Antipa, Barhoumi, Brunhes, & Darné, 2012). Temporal aggregation of a high-frequency variable depends on the nature of the variable, that is, whether it is a stock or a flow variable. A stock variable is temporally aggregated through skip sampling and a flow variable by summing, with no overlap in the sums.
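The two aggregation rules can be sketched as follows. The function name is hypothetical, the series is assumed to start in the first month of a quarter, and skip sampling is taken here to mean keeping the last month of each quarter, which is one common convention rather than something the article specifies.

```python
from typing import List


def aggregate_to_quarterly(monthly: List[float], kind: str) -> List[float]:
    """Aggregate complete quarters of a monthly series.

    kind="stock": skip sampling, keep the last month of each quarter (assumed).
    kind="flow":  sum the three months of each quarter, with no overlap.
    """
    if kind not in ("stock", "flow"):
        raise ValueError("kind must be 'stock' or 'flow'")
    n_quarters = len(monthly) // 3  # an incomplete trailing quarter is dropped
    quarterly = []
    for q in range(n_quarters):
        block = monthly[3 * q: 3 * q + 3]
        quarterly.append(block[-1] if kind == "stock" else sum(block))
    return quarterly


# Seven hypothetical monthly observations: two full quarters plus one extra month.
monthly_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
stock_quarterly = aggregate_to_quarterly(monthly_values, "stock")
flow_quarterly = aggregate_to_quarterly(monthly_values, "flow")
```

Here the stock rule gives [3.0, 6.0] and the flow rule gives [6.0, 15.0]. In the bridge setting the trailing incomplete quarter would first be completed with univariate forecasts rather than dropped.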
Bridge models can only be used with a limited number of regressors because of challenges with degrees of freedom and multicollinearity (Diron, 2006; Kitchen & Monaco, 2003). Giannone, Reichlin, and Small (2008) proposed a parsimonious way to exploit the large amount of information available in monthly indicators to forecast quarterly GDP, using a few common factors as regressors in the bridge model:
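A bridge equation consistent with the symbols defined next could be written as below. The coefficient names α, β_i and δ_{j,l}, the error term ε_t, the use of y_t for quarterly GDP growth and the inclusion of the contemporaneous term l = 0 are notation assumed here, and X_{j,t} denotes the j-th explanatory variable after aggregation to quarterly frequency.

```latex
y_t = \alpha + \sum_{i=1}^{m} \beta_i \, y_{t-i}
      + \sum_{j=1}^{q} \sum_{l=0}^{k} \delta_{j,l} \, X_{j,t-l}
      + \varepsilon_t
```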
In the above equation, m is the number of autoregressive terms, q is the number of monthly explanatory variables $X_t$ and k is the number of lags of the monthly explanatory variables.
Vector Auto Regression (VAR) Model
Following Sims (1980), the VAR approach is a reduced-form dynamic system of n equations in n variables, in which each variable enters every equation with p lags. A VAR of order p, that is, VAR(p), can be represented as
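In its standard textbook form, written with the symbols that the next paragraph defines, this is:

```latex
Y_t = C + A_1 Y_{t-1} + A_2 Y_{t-2} + \dots + A_p Y_{t-p} + u_t
```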
where $Y_t$ is an (n × 1) vector of stationary endogenous variables, C is an (n × 1) vector of constant terms, $A_1, \dots, A_p$ are (n × n) coefficient matrices and $u_t$ is an (n × 1) error term uncorrelated with all the right-hand side variables. The error terms are uncorrelated with their own lagged values, $\operatorname{cov}(u_t, u_{t-i}) = 0$ for $i \neq 0$, but may be contemporaneously correlated. Even in the presence of contemporaneously correlated errors, the ordinary least squares estimator is efficient because all VAR equations have the same regressors.
In expanded form, a VAR(p) is written as below, where the lag order p is based on an information criterion. Various information criteria, such as the Akaike information criterion (AIC), the Schwarz information criterion (SIC) and the Hannan–Quinn criterion (HQ), can be used to select the appropriate lag order for the VAR. The information criteria measure the distance between the observations and model classes. The smaller the information criterion value, the smaller the distance, and hence the better the chosen model describes the data generating process.
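Equation by equation, and writing $a_{ij}^{(l)}$ for the (i, j) element of $A_l$ (notation assumed here), the expanded form would read as in the first display below. The second display gives the common textbook definitions of the three criteria for a VAR with n variables, T observations and estimated residual covariance matrix $\hat{\Sigma}_u(p)$; the article does not state which formulas it used, so these are offered only as a reference.

```latex
y_{i,t} = c_i + \sum_{l=1}^{p} \sum_{j=1}^{n} a_{ij}^{(l)} \, y_{j,t-l} + u_{i,t},
\qquad i = 1, \dots, n

\mathrm{AIC}(p) = \ln\left|\hat{\Sigma}_u(p)\right| + \frac{2\,p\,n^2}{T}, \qquad
\mathrm{SIC}(p) = \ln\left|\hat{\Sigma}_u(p)\right| + \frac{\ln T}{T}\,p\,n^2, \qquad
\mathrm{HQ}(p)  = \ln\left|\hat{\Sigma}_u(p)\right| + \frac{2\ln\ln T}{T}\,p\,n^2
```

Each criterion trades goodness of fit (the first term) against the number of estimated lag coefficients, $p\,n^2$, and the lag order with the smallest value is selected.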
The nowcast modelling exercise includes estimation of the VAR order p, followed by model estimation and checks of the standard assumptions of residual normality and no autocorrelation (Lutkepohl, 1991). The final estimated model is used to nowcast the current and last quarter GDP growth rate.
In the VAR with principal components, $PC_i$ and $\varepsilon_{PC_i,t}$ denote the principal components and the corresponding error terms, where i = 1, 2, …, N. N is the number of principal components selected for each nowcasting period based on the eigenvalue and variance-contribution criteria.
Analysis
The predictive accuracy of a nowcast model is important not only for nowcasting but also for the selection of an appropriate model for policy analysis. The impact of data revision on real time nowcasting models can be substantial (Croushore & Stark, 2001). The nowcast models are evaluated against the univariate benchmark model and the RBI's Survey of Professional Forecasters. A nowcast model is judged on how successful it is in nowcasting quarterly GDP growth rate in real time. Comparing nowcast accuracy requires a specific loss function to be minimized. The mean square error (MSE) is the most commonly used loss function for evaluating the predictive accuracy of a model (Granger & Pesaran, 1999). One of the objectives of our study is to analyse the impact of information flow during the quarter on the nowcasting performance of the models. Therefore, we look separately at nowcast performance during the first, second and third month of a quarter. All the nowcast models are evaluated using the out-of-sample relative MSE (RMSE), measured against the rival or benchmark model, from December 2009 to December 2013. Each nowcast model is then ranked based on its relative performance.
The univariate model and the Survey of Professional Forecasters are the two benchmark models. By the definition of relative MSE (RMSE) used here, a value greater than one is good, and the higher the value, the better the relative performance of the respective nowcast model for quarterly GDP growth rate.
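A small sketch of the measure follows. Because values above one are described as good, the ratio is assumed here to be the benchmark MSE divided by the model MSE; the article does not print the formula, and the function names and numbers are hypothetical.

```python
from typing import Sequence


def mse(actual: Sequence[float], nowcast: Sequence[float]) -> float:
    """Mean square error of a sequence of nowcasts."""
    if len(actual) != len(nowcast):
        raise ValueError("actual and nowcast must have the same length")
    return sum((a - f) ** 2 for a, f in zip(actual, nowcast)) / len(actual)


def relative_mse(
    actual: Sequence[float],
    benchmark_nowcast: Sequence[float],
    model_nowcast: Sequence[float],
) -> float:
    """Benchmark MSE divided by model MSE (assumed orientation).

    Values above one mean the model beats the benchmark.
    """
    return mse(actual, benchmark_nowcast) / mse(actual, model_nowcast)


# Hypothetical quarterly GDP growth rates and two sets of nowcasts.
actual_growth = [5.0, 6.0, 7.0, 8.0]
benchmark = [6.0, 7.0, 6.0, 9.0]   # every error is 1.0 in absolute value
candidate = [5.5, 6.5, 6.5, 8.5]   # every error is 0.5 in absolute value
ratio = relative_mse(actual_growth, benchmark, candidate)
```

In this example the benchmark MSE is 1.0 and the candidate MSE is 0.25, so the ratio is 4.0. Comparing the benchmark with itself returns one, which matches the statement that the benchmark's relative MSE is always one.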
The relative MSE of the benchmark model is always one (Table 3). In the first month, the relative MSE of all models except the Survey of Professional Forecasters is more than one, which signifies that these models perform better than the ARIMA model in the first month. In the second month, the performance of ARIMA improves because the last quarter GDP growth rate becomes available. However, the other models still perform better than the benchmark model. In the third month, the VAR performs better than the benchmark model. The nowcast from the Survey of Professional Forecasters is available only once a quarter, so increasing the frequency of the survey would help improve its overall performance.
Table 3. Relative MSE with ARIMA as Benchmark Model
The significance of the difference in nowcasting performance between two alternative models cannot be completely judged by traditional evaluation criteria based on RMSE alone (Antipa, Barhoumi, Brunhes, & Darné, 2012). Diebold and Mariano (1995) developed a model-free test of nowcast (forecast) accuracy, known as the Diebold–Mariano test (DM-test). The DM-test determines whether the difference in nowcast performance between two models is significant. It is robust in the sense that it is applicable to non-quadratic loss functions and to nowcast errors that are non-normal, have non-zero mean and are serially and contemporaneously correlated. Table 4 provides pairwise DM-test statistics with the associated probabilities in brackets. All pairwise DM-tests reject the null hypothesis of equal nowcast accuracy.
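The mechanics of the test can be sketched as below under two simplifications that are assumptions of this example, not the article's setup: the loss is squared error, and the horizon is one step, so the variance of the loss differential is estimated without autocovariance terms. A fuller implementation would use a long-run variance estimate to allow for serial correlation.

```python
import math
from typing import Sequence, Tuple


def diebold_mariano(
    errors_a: Sequence[float], errors_b: Sequence[float]
) -> Tuple[float, float]:
    """DM statistic and two-sided normal p-value under squared-error loss.

    Simplified version: the variance of the loss differential is its sample
    variance, with no correction for serial correlation.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error sequences must have the same length")
    diffs = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / n
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical nowcast errors of two competing models.
errors_model_a = [1.0, 2.0, 1.0, 2.0]
errors_model_b = [0.0, 1.0, 0.0, 1.0]
dm_stat, dm_p = diebold_mariano(errors_model_a, errors_model_b)
```

A positive statistic means model A has the larger average loss. With only a handful of quarterly observations the normal approximation is rough, so the p-value should be read with caution.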
Table 4. DM-test
Conclusion
This research work compiles a real time dataset for the Indian economy and is the first real time nowcasting study of the Indian quarterly GDP growth rate. The modelling techniques used in this work for nowcasting quarterly GDP growth rate are principal components, seasonal ARIMA, the bridge model and the VAR model. First, we used a real time dataset for the Indian economy for the first time. Second, we used principal components to simultaneously exploit the information in a large number of relevant high-frequency variables. Third, we used bridge and VAR models for nowcasting quarterly GDP growth rate. Finally, nowcasting performance was evaluated against the benchmark model and the RBI's Survey of Professional Forecasters in real time. The nowcasting models also allow us to evaluate the impact of information flow within the quarter by nowcasting quarterly GDP growth rate at the end of each successive month. The month-on-month performance of all models shows improvement and supports the approach of using real time information and recursive nowcasting at every month end. The VAR model with principal components outperformed the Survey of Professional Forecasters and the benchmark model. The approach can be applied beyond GDP nowcasting to other macroeconomic variables.
Managerial Implications
Nowcasting of quarterly GDP growth rate plays a prominent role in policy and decision-making. Organizations and governments across the world are interested in knowing about current economic activity. Real time knowledge of economic activity is imperfect for the present and even for the recent past, because data on many key economic variables are released with a long publication delay and are subsequently revised. Real time nowcasting of quarterly GDP growth rate uses the most recent available information and helps businesses and policy makers in decision-making. The methodology used in our research work can be extended to nowcast other low-frequency variables and to study the business cycle in real time (Daniel & Fosten, 2016).
Future Research
Future research could develop nowcasting models using big data and a larger number of economic variables. One can even incorporate sentiment factors on current economic conditions from newspaper articles, Twitter, Facebook posts, etc. to improve the accuracy of GDP nowcasts. Empirical work has shown the impact of sentiment on the Indian stock market and economy (Jana, 2016). In future, with improved availability of real time datasets for the Indian economy, one can study nowcasting models at frequencies higher than monthly. Arora and Kalsie (2017) studied the impact of the US financial crisis on the GDP of the BRIC nations. Hence, future studies may include international variables that affect the Indian GDP growth rate.
Acknowledgements
The author is grateful to the anonymous referees of the journal for their extremely useful suggestions to improve the quality of the article. Usual disclaimers apply.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author received no financial support for the research, authorship and/or publication of this article.
