Forecasting Australia's real house price index: A comparison of time series and machine learning methods

Author: George Milunovich
Published date: 01 November 2020
DOI: http://doi.org/10.1002/for.2678
Received: 2 July 2019 Revised: 8 December 2019 Accepted: 22 February 2020
RESEARCH ARTICLE
George Milunovich
Department of Actuarial Studies and
Business Analytics, Macquarie University,
Sydney, New South Wales, Australia
Correspondence
George Milunovich, Department of
Actuarial Studies and Business Analytics,
Macquarie University, Sydney, NSW 2109,
Australia.
Email: george.milunovich@mq.edu.au
Funding information
Australian Research Council,
Grant/Award Number: DP190102049
Abstract
We employ 47 different algorithms to forecast Australian log real house
prices and growth rates, and compare their ability to produce accurate
out-of-sample predictions. The algorithms, which are specified in both single-
and multi-equation frameworks, consist of traditional time series models,
machine learning (ML) procedures, and deep learning neural networks. A
method is adopted to compute iterated multistep forecasts from nonlinear ML
specifications. While the rankings of forecast accuracy depend on the length
of the forecast horizon, as well as on the choice of the dependent variable
(log price or growth rate), a few generalizations can be made. For one- and
two-quarter-ahead forecasts we find a large number of algorithms that out-
perform the random walk with drift benchmark. We also report several such
outperformances at longer horizons of four and eight quarters, although these
are not statistically significant at any conventional level. Six of the eight top
forecasts (4 horizons × 2 dependent variables) are generated by the same algorithm,
namely a linear support vector regressor (SVR). The other two highest ranked
forecasts are produced as simple mean forecast combinations. Linear autoregressive
moving average and vector autoregression models produce accurate
one-quarter-ahead predictions, while forecasts generated by deep learning nets
rank well across medium and long forecast horizons.
KEYWORDS
Australian real house price index, autoregression, forecasting, machine learning, neural networks,
time series
1 INTRODUCTION
Residential real estate provides shelter, preserves wealth,
and is one of the main drivers of the Australian economy
via construction and finance. Having access to accurate
house price predictions is therefore equally important to
central banks, financial supervision authorities, investors,
and home owners. In this article we explore the ability
of increasingly popular machine learning (ML) and deep
learning (DL) algorithms to forecast quarterly real house
prices in Australia, and compare their performance against
a number of traditional time series models. More specifi-
cally, we employ 47 algorithms to generate out-of-sample
predictions of log house prices and growth rates across
the forecast horizons of one, two, four, and eight quarters.
We then measure forecast accuracy according to the mean
squared error (MSE), and evaluate each forecast relative
to a benchmark. Statistical tests provide further evidence
regarding any such outperformance. We conclude by
producing a ranking of the forecasting algorithms that is
informative concerning their ability to produce accurate
house price predictions.
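As a minimal illustration of this kind of evaluation, the sketch below computes the MSE of a candidate forecast relative to a random walk with drift benchmark. The function names and the toy numbers are hypothetical and are not taken from the paper. A ratio below one indicates that the candidate forecast has a smaller MSE than the benchmark.

```python
import numpy as np


def rw_drift_forecast(y_train, h):
    """h-step-ahead forecast from a random walk with drift.

    The drift is estimated as the mean first difference of the
    training sample (an assumption of this sketch).
    """
    y_train = np.asarray(y_train, dtype=float)
    drift = (y_train[-1] - y_train[0]) / (len(y_train) - 1)
    return y_train[-1] + h * drift


def mse(actual, forecast):
    """Mean squared forecast error."""
    err = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return float(np.mean(err ** 2))


def relative_mse(actual, model_forecast, benchmark_forecast):
    """MSE of the candidate model divided by MSE of the benchmark."""
    return mse(actual, model_forecast) / mse(actual, benchmark_forecast)


# Toy usage with made-up numbers (not data from the paper).
toy_log_prices = [1.0, 2.0, 3.0, 4.0]
two_step_benchmark = rw_drift_forecast(toy_log_prices, h=2)
ratio = relative_mse([5.0, 6.0], [5.0, 6.5], [5.5, 6.5])
```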
© 2020 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/for Journal of Forecasting. 2020;39:1098–1118.
Forecasting national house price indices is challenging
for a number of reasons. The main issue is a relatively short
length of available time series data. House price indices
are typically constructed at monthly and quarterly fre-
quencies, which limits the length of computed indices and
makes model building and testing difficult. The analysis is
further complicated by asymmetric boom and bust cycles
that introduce nonlinear effects (Miles, 2008). As discussed
by Genesove and Mayer (2001) and Engelhardt (2003),
high levels of loss aversion associated with home owner-
ship cause house prices to adjust more quickly when they
rise towards equilibrium value than when they fall. Lastly,
large transaction costs prevalent in the housing markets
tend to amplify any nonlinear dynamics that are present in
the data (Muellbauer & Murphy, 1997).
Despite the modeling difficulties discussed above, there
is a substantial literature on forecasting house prices.
A key finding is that house price growth rates exhibit
positive serial dependence and predictability over short
horizons (see, e.g., Case & Shiller, 1989; 1990; Gau, 1984;
1985; McIntosh & Henderson, 1989; Schindler, 2013). The
evidence, however, is somewhat mixed regarding the
ability of nonlinear models to outperform linear autore-
gressive moving average (ARMA) specifications. For
instance, Crawford and Fratantoni (2003) reported
that while regime-switching models outperformed
ARMA in-sample, they fell behind when forecasting
out-of-sample. Similarly, Balcilar, Gupta, and Miller (2015)
found that smooth-transition autoregressive models failed
to outperform linear AR specifications over short hori-
zons. These findings are puzzling given that the studies
mentioned above found substantial evidence of nonlinear
behavior in house prices. The literature also investigates
the ability of economic factors to improve the accuracy of
house price forecasts. Rapach and Strauss (2007) found
that autoregressive distributed lag models outperformed
univariate autoregressions in forecasting state house price
indices in the USA. In a similar setting, Bork and Møller
(2015) reported that the best forecasting model varied
over time as well as across US states. Gupta, Kabundi,
and Miller (2011) forecast aggregate US real house price
growth using two sets of explanatory variables: a small
set containing 10 economic factors and a large set of 120
variables. They reported that the small-scale model out-
performed all other specifications, including the random
walk. Bork and Møller (2018) employed factor analysis
with 128 economic time series and showed that macroeco-
nomic fundamentals had strong predictive power for US
house price returns. Their model outperforms predictions
generated from the price–rent ratio, autoregressive benchmarks,
and regression models based on smaller data sets.
Further studies are surveyed by Ghysels, Plazzi, Valkanov,
and Torous (2013).
This paper contributes to the existing literature in
several ways. First, we compare a large number of forecast-
ing algorithms according to their accuracy in predicting
Australian real house prices and growth rates. In con-
trast, the existing studies typically consider one model at
a time and do not conduct forecast comparisons (see, e.g.,
Abelson, Joyeux, Milunovich, & Chung, 2005; Bourassa &
Hendershott, 1995). Our predictions are generated from
a set of algorithms that include traditional time series
models as well as alternative machine and deep learning
specifications. This is of interest for at least two reasons.
Many of the ML and DL algorithms (i) have the ability
to capture nonlinear relationships, and (ii) employ
cross-validation rather than information criteria for model
selection. Whether such departures from traditional
methods make ML more suitable for forecasting house
prices is an open empirical question. Second, we extend
the idea of multiequation models from linear vector
autoregressions (VARs) to all of our machine learning
specifications. For instance, we construct multiequation
support vector regressors and multiequation random
forests. This is needed in order to compute iterated
multistep-ahead forecasts in a multivariate context. An
advantage of this approach is that only one model needs
to be specified in order to make predictions for all forecast
horizons. Lastly, as is well known, there are complications
in constructing iterated multistep-ahead predictions from
nonlinear models. We address this issue by adopting a
residual-based procedure of Brown and Mariano (1989) to
generate unbiased multistep forecasts from nonlinear ML
and DL algorithms.
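The complication can be seen in a small sketch. For a nonlinear one-step model y_{t+1} = g(y_t) + e_{t+1}, plugging the one-step forecast back into g ignores the future shock, and the expectation of g(g(y_t) + e_{t+1}) generally differs from g(g(y_t)). A residual-based predictor approximates that expectation by averaging over in-sample residuals. The code below is an illustrative Monte Carlo variant under these assumptions: the function g, the residual values, and all function names are hypothetical, and the code is not the paper's implementation.

```python
import numpy as np


def naive_iterated(g, y_last, h):
    """Plug each forecast back into g, ignoring future shocks."""
    y = float(y_last)
    for _ in range(h):
        y = g(y)
    return y


def residual_based(g, y_last, residuals, h, n_paths=2000, seed=0):
    """Average simulated h-step paths whose intermediate shocks are
    resampled (with replacement) from in-sample residuals.

    g must accept NumPy arrays. No shock is added at the final step,
    because the target is the conditional mean of y at horizon h.
    """
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals, dtype=float)
    y = np.full(n_paths, float(y_last))
    for step in range(h):
        y = g(y)
        if step < h - 1:
            y = y + rng.choice(residuals, size=n_paths, replace=True)
    return float(y.mean())


# Hypothetical convex one-step map and toy residuals.
def g_toy(y):
    return 0.5 * y + 0.1 * y ** 2


toy_residuals = [-1.0, 1.0]
naive_2 = naive_iterated(g_toy, 1.0, h=2)
resid_2 = residual_based(g_toy, 1.0, toy_residuals, h=2)
```

With a convex g, as in this toy case, the naive iterated forecast lies below the residual-based average (Jensen's inequality). For h = 1 the two coincide.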
Our data set contains an Australian real house price
index collected at the quarterly frequency over the
1972:Q3–2017:Q3 period. In addition, we obtain data on
seven economic factors that may help improve house price
forecasts. These include inflation, exchange and unem-
ployment rates, house rents, a stock price index, gross
disposable income per capita, and mortgage rates. We
choose the seven predictors largely in consideration of
the existing literature. For instance, a large number of
studies suggest that income often plays a key role in fore-
casting house price growth rates (see, e.g., Case & Shiller,
1990; Favilukis, Ludvigson, & Van Nieuwerburgh, 2017;
Malpezzi, 1999). Similarly, rent is employed as a relevant
predictor by Plazzi, Torous, and Valkanov (2010) and
DiPasquale and Wheaton (1994), and interest rates in
Muellbauer and Murphy (1997). In further literature,
inflation and equities help explain Australian house prices
in Abelson et al. (2005), while employment features in
Abraham and Hendershott (1996).
We set up our investigation in a fixed estimation window
scheme, which facilitates comparing forecasts from nested
models via the Diebold and Mariano (1995; DM) test.
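For orientation, a textbook form of the DM statistic under squared-error loss can be sketched as follows. This is a minimal sketch and not necessarily the exact variant used in the paper: the function name, the rectangular-kernel variance estimate with h − 1 autocovariances, and the normal approximation for the p-value are assumptions of this example.

```python
import math

import numpy as np


def diebold_mariano(e_bench, e_model, h=1):
    """DM statistic for equal MSE of two forecast error series.

    Positive values indicate that the benchmark has the larger
    squared errors, i.e. they favor the candidate model.
    """
    d = np.asarray(e_bench, dtype=float) ** 2 - np.asarray(e_model, dtype=float) ** 2
    n_obs = d.size
    d_bar = d.mean()
    dc = d - d_bar
    # Long-run variance of the loss differential: autocovariances up to
    # lag h - 1 with a rectangular kernel (can be non-positive in small samples).
    lrv = dc @ dc / n_obs
    for k in range(1, h):
        lrv += 2.0 * (dc[k:] @ dc[:-k]) / n_obs
    if lrv <= 0:
        raise ValueError("non-positive long-run variance estimate")
    stat = d_bar / math.sqrt(lrv / n_obs)
    # Two-sided p-value from the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Toy forecast errors (made up for illustration).
bench_errors = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]
model_errors = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
dm_stat, dm_p = diebold_mariano(bench_errors, model_errors, h=1)
```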
