Short‐term forecasting of the US unemployment rate

Published date: 01 April 2020
DOI: http://doi.org/10.1002/for.2630
Author: Benedikt Maas
Received: 17 April 2019 Revised: 6 August 2019 Accepted: 28 October 2019
RESEARCH ARTICLE
Benedikt Maas
Department of Economics, University of
Hamburg, Hamburg, Germany
Correspondence
Benedikt Maas, Department of Economics,
University of Hamburg, Von-Melle-Park 5,
20146 Hamburg, Germany.
Email: Benedikt.Maas@uni-hamburg.de
Abstract
This paper aims to assess whether Google search data are useful when predicting
the US unemployment rate among other more traditional predictor variables.
A weekly Google index is derived from the keyword “unemployment” and is
used in diffusion index variants along with the weekly number of initial claims
and monthly estimated latent factors. The unemployment rate forecasts are
generated using MIDAS regression models that take into account the actual
frequencies of the predictor variables. The forecasts are made in real time, and
the best forecasting models outperform, for the most part, two benchmarks in
terms of root mean squared forecast error. However, as the forecasting horizon
increases, the forecasting performance of the best diffusion index variants
deteriorates, which suggests that the forecasting methods proposed in
this paper are most useful in the short term.
KEYWORDS
MIDAS, Google data, forecast comparison, US unemployment
1 INTRODUCTION
In general, traditional labor statistics are available with
at least a 1-month lag. However, a more timely estimate
of the unemployment rate is desirable for investors and
policymakers, especially in times of economic uncertainty.
An accurate prediction of the US unemployment rate has
become even more important after the 2008–09 recession,
especially since the Federal Reserve announced in December
2012 a shift of its monetary policy to a specific unemployment
rate threshold. The so-called “Evans rule” stated
that “the Committee decided to keep the target range for
the federal funds rate at 0 to 1/4 percent and currently
anticipates that this exceptionally low range for the federal
funds rate will be appropriate at least as long as the
unemployment rate remains above 6-1/2 percent.”1
1 For the official statement of the Federal Reserve's Open Market Committee
see https://www.federalreserve.gov/newsevents/pressreleases/monetary20121212a.htm.
This paper investigates whether or not the information
given in Google searches is useful to predict the US unemployment
rate. The idea behind using search engine data
is that, if an increase in searches is observed in connection
with unemployment, then this could give an early indication
of an increasing unemployment rate. The potential
predictive power of Google search is used alongside other
more traditional predictors. One of these is the number
of initial claims (IC). IC is widely used in the literature as a predictor
variable in unemployment rate forecasts.2 The current
state of the economy is also considered as a predictor in the
forecasts. The state of the economy has a major impact on
the unemployment rate: During a recession, an increase
in the unemployment rate is expected, whereas, during an
upswing and prosperity phase, a decrease in the unemployment
rate is expected. To take into account the current
2 See, for example, Montgomery, Zarnowitz, Tsay, and Tiao (1998).
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivsLicense, which permits use and distribution in any medium,
provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2019 The Authors. Journal of Forecasting published by John Wiley & Sons, Ltd.
Journal of Forecasting. 2020;39:394–411. wileyonlinelibrary.com/journal/for
state of the economy, unobserved latent factors are derived
from a macroeconomic database by principal components,
as suggested in Stock and Watson (2002). These factors are
intended to establish a link between the economic situation
and the unemployment rate in the forecasting exercise
in this paper.
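As a minimal sketch of the principal-components step described above, the following Python snippet extracts factors from a simulated panel of series. The function name `estimate_factors`, the simulated data, and the choice of two factors are illustrative assumptions; they are not the paper's implementation or its macroeconomic database.

```python
import numpy as np


def estimate_factors(panel: np.ndarray, n_factors: int) -> np.ndarray:
    """Estimate latent factors from a T x N panel by principal components."""
    # Standardize each series to zero mean and unit variance.
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)
    # The principal components are the left singular vectors of the
    # standardized panel, scaled by their singular values.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_factors] * s[:n_factors]


# Simulated panel (assumption): 30 series driven by one common factor plus noise.
rng = np.random.default_rng(0)
T, N = 120, 30
true_factor = rng.standard_normal(T)
loadings = rng.standard_normal(N)
panel = np.outer(true_factor, loadings) + 0.1 * rng.standard_normal((T, N))

factors = estimate_factors(panel, n_factors=2)
```

In this simulated setting the first estimated factor should track the common driver closely, up to sign and scale, which is the sense in which such factors can summarize the state of the economy.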
Given that the Google and IC data are available at a
weekly frequency, this paper uses weekly data to forecast
the monthly US unemployment rate with three diffusion index (DI)
variants after Stock and Watson (2002) based on the mixed-data
sampling (MIDAS) regression model introduced by Ghysels,
Santa-Clara, and Valkanov (2006) and Ghysels, Sinko, and Valkanov (2007). In
addition, factor-augmented versions, where the monthly
unobserved latent factors and the weekly data are combined,
are also estimated. In general, the MIDAS framework
allows us to combine variables of mixed frequencies
in a regression model. Related studies have used monthly
averages of the Google and IC data (D'Amuri & Marcucci,
2017), but, as empirically shown in Smith (2016), who
applies the MIDAS approach to forecasting the unemployment
rate in the UK, there is no need to adjust the frequency
to that of the target variable and thus lose valuable information.
As stated generally in Andreou, Ghysels, and Kourtellos
(2010), there is no reason to ignore the fact that the variables
involved in empirical models are generated by processes
of mixed frequencies, or to estimate econometric
models with an aggregation scheme of equal weights,
because an equal weighting scheme can lead to information
losses and thus to inefficient or biased estimates.
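To illustrate why equal-weight aggregation is a restrictive special case, the sketch below compares a flat monthly average with a parametric lag-weighting scheme. The exponential Almon form used here is one weighting function commonly found in the MIDAS literature; this section does not state which form the paper adopts, so the function names, parameter values, and weekly numbers are assumptions for illustration only.

```python
import numpy as np


def exp_almon_weights(n_lags: int, theta1: float, theta2: float) -> np.ndarray:
    """Normalized exponential Almon weights for lags 1..n_lags."""
    k = np.arange(1, n_lags + 1)
    raw = np.exp(theta1 * k + theta2 * k**2)
    return raw / raw.sum()


def aggregate(weekly: np.ndarray, weights: np.ndarray) -> float:
    """Weighted aggregate of weekly observations (most recent week first)."""
    return float(weights @ weekly)


# Hypothetical weekly values within one month, most recent week first.
weekly = np.array([4.0, 3.0, 2.0, 1.0])

# theta1 = theta2 = 0 gives flat weights, i.e. the plain monthly average.
equal = exp_almon_weights(4, 0.0, 0.0)
# A negative theta1 puts more weight on the most recent weeks.
decaying = exp_almon_weights(4, -0.5, 0.0)

flat_value = aggregate(weekly, equal)
tilted_value = aggregate(weekly, decaying)
```

With both parameters set to zero the scheme collapses to the simple average, so letting the data choose the parameters can only widen the set of aggregation schemes considered.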
The forecasts in this paper are conducted in real time, and
almost all outperform an autoregressive benchmark at each
forecast horizon. However, the results show a mixed picture,
in which a combination of predictor variables is most
favorable because the best empirical results change from
horizon to horizon. Compared with the forecasts of D'Amuri
and Marcucci (2017), which are based on monthly averages of
an alternative Google index, the MIDAS short-term forecasts
presented here obtain a lower root mean squared forecast
error (RMSFE) for the shortest forecast horizons.
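The comparison against a benchmark can be made concrete with a short sketch of the root mean squared forecast error (RMSFE) and its ratio to a benchmark's RMSFE. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from the paper.

```python
import numpy as np


def rmsfe(actual, forecast) -> float:
    """Root mean squared forecast error of a forecast series."""
    errors = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean(errors**2)))


# Hypothetical unemployment rates and two competing forecast series.
actual = [5.0, 5.2, 5.1, 4.9]
model = [5.1, 5.1, 5.0, 5.0]
benchmark = [5.3, 4.9, 5.4, 4.6]

# A ratio below 1 means the model has the lower RMSFE.
relative = rmsfe(actual, model) / rmsfe(actual, benchmark)
```

A relative RMSFE below one indicates that the candidate model forecasts more accurately than the benchmark over the evaluation sample.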
The rest of this paper is organized as follows. Section 2
gives a compact overview of the related literature dealing
with the use of Internet data to forecast economic
variables. Moreover, it focuses on potential pitfalls when
using Internet data and the choice of keyword to obtain the
Google index. Section 3 explains the econometric framework.
Section 4 describes the data, the forecasting models,
and the real-time forecasting design. Section 5 presents the
empirical results, and Section 6 concludes.
2 THE USE OF INTERNET SEARCH DATA IN FORECASTING
2.1 Related literature
Internet search data have been used in a number of different
research topics. In economics, Choi and Varian
(2012) show that Google Trends data can help to forecast
near-term values of economic indicators, such as automobile
sales, travel destinations, consumer confidence
and initial claims for unemployment benefits. Their paper
inspired many economists to use Google Trends data to
predict a variable that can be linked to the behavior of
households. For example, Vosen and Schmidt (2011) forecast
consumption of goods, whereas Bangwayo-Skeete and
Skeete (2015) and Yang, Pan, Evans, and Lv (2015) use
Google data to predict future tourism demand. Wu et al.
(2015) predict US housing prices and sales. Using a Markov
switching framework, Chen et al. (2015) use Google search
data to improve the timeliness of business cycle turning
point identification, and they successfully nowcast the
peak date within a month of the turning point's occurrence.
In their analysis, they use the three keywords “recession,”
“foreclosure help,” and “layoff,” which represent
the aggregated economy, the credit market, and the labor
market, respectively. Liu, Xu, and Fan (2018) also use
Internet search behavior to forecast Chinese GDP. However,
because Google is not prevalent in China, the authors
use data from its Chinese counterpart, Baidu.
Considering inflation expectations, Guzmán (2011) proposes
a real-time measure using search queries obtained
from Google. She demonstrates that higher frequency
measures tend to outperform standard lower frequency
measures such as the Survey of Professional Forecasters (SPF)
in tests of accuracy, predictive
power and out-of-sample forecasts.
Dergiades, Milas, and Panagiotidis (2014) analyze
whether Google and social media data influence European
financial markets. They find that the data provide significant
short-run information for the Greek–German and
Irish–German government bond yield differential.
Considering the unemployment rate, McLaren and
Shanbhogue (2011) analyze the labor market in the
UK and compare standard autoregressive (AR) models
with those augmented with Internet data, finding
that the augmented models outperform the autoregressive
benchmarks. Askitas and Zimmermann (2009)
demonstrate strong correlations between Google keyword
searches and unemployment rates for Germany,
and Fondeur and Karamé (2013) find that including
Google data improves youth unemployment predictions
in France. Vicente, López-Menéndez, and Pérez (2015)
investigate the unemployment rate in Spain using autoregressive
integrated moving average (ARIMA) models with
an included explanatory variable that is derived from
the Google search term “job offers.” They find that sig-