Predictability of commodity futures returns with machine learning models
Published date | 01 February 2024 |
Author | Shirui Wang, Tianyang Zhang |
DOI | http://doi.org/10.1002/fut.22471 |
Received: 26 November 2022 | Accepted: 19 October 2023
DOI: 10.1002/fut.22471
RESEARCH ARTICLE
Predictability of commodity futures returns with machine learning models
Shirui Wang¹ | Tianyang Zhang²
¹ Independent Researcher, Beijing, China
² Wenlan School of Business, Zhongnan University of Economics and Law, Wuhan, Hubei, China
Correspondence
Tianyang Zhang, Wenlan School of
Business, Zhongnan University of
Economics and Law, 182 Nanhu Ave,
East Lake High‐tech Development Zone,
430073 Wuhan, Hubei, China.
Email: tyzhangecon@gmail.com
Abstract
We use prevailing machine learning models to investigate the predictability of
futures returns in 22 commodities with commodity‐specific and macroeconomic
factors as predictors. Out‐of‐sample prediction errors for the
majority of futures contracts are lowered compared with those obtained by
the baseline models of AR(1) and forecast combinations. Using Shapley values
to explain feature importance, we identify dominant predictors for each
commodity. A long–short portfolio strategy based on monthly light gradient‐boosting
machine predictions outperforms the benchmark linear models in
terms of annual return, Sharpe ratio, and max drawdown.
KEYWORDS
commodity futures, machine learning, return predictability
JEL CLASSIFICATION
G12, Q02, C53
1 | INTRODUCTION
The predictability of futures returns has been one of the key issues in commodity markets in recent decades. Although the
commodity futures market was traditionally viewed as a market where commodity producers and purchasers
use futures contracts to hedge spot positions, speculative activity has increased since the
“financialization” of commodity markets (Tang & Xiong, 2012). Because commodities are an asset class distinct from
traditional financial assets such as equities, the factors that price the traditional asset classes may not be able to
price commodities (Giampietro et al., 2018; Gorton & Rouwenhorst, 2006). The previous literature has introduced many
factors for commodity markets, such as carry, momentum, and hedge ratio (Bakshi et al., 2019; Boons & Prado, 2019;
Gorton et al., 2013; Szymanowska et al., 2014; Yang, 2013). Besides these commodity‐specific factors, commodity
markets are also well known to be affected by macroeconomic factors. Bakas and Triantafyllou (2019) find that the
latent macroeconomic uncertainty measure is a common volatility forecasting factor for commodity markets. However,
the predictive power of risk factors is still debated, since previous studies report mixed results; examples
include Daskalaki et al. (2014) and Fernandez‐Perez et al. (2018), among others.
Moreover, previous studies, such as Daskalaki et al. (2014), focus on predicting all commodity classes with common
factors. However, commodities in different sectors are influenced by different sources of information. For example, as
consumer goods, feeder cattle and live cattle have futures prices that are more likely to be affected by macroeconomic
predictors, such as inflation rates. Hence, it is important to investigate the factors' predictive power for each
commodity individually. Our results show that the dominant predictor of each futures contract is indeed distinct, which
partly explains why there is no agreement on the return predictability of commodity futures with common factors.
J Futures Markets. 2024;44:302–322. wileyonlinelibrary.com/journal/fut | © 2023 Wiley Periodicals LLC.
Most studies of commodity markets use traditional asset pricing models, such as linear regression
models. However, the underlying data‐generating process of asset returns is likely nonlinear, as predictions are
typically improved by adding nonlinearities and accounting for overfitting. We employ machine learning methods that
allow for nonlinear forms of predictors that are not captured by traditional linear pricing models. Our study also
compares the performance of linear and nonlinear models.
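To make the linear benchmark concrete, the following is a minimal sketch of an AR(1) forecast evaluated out of sample, the kind of baseline the abstract names. The function names, the single train/test split, and the choice to hold the fitted coefficients fixed over the test period are illustrative assumptions, not the authors' procedure.

```python
def fit_ar1(returns):
    """OLS estimates (a, b) of r_t = a + b * r_{t-1} + e_t."""
    x = returns[:-1]          # lagged returns
    y = returns[1:]           # next-period returns
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def out_of_sample_mse(returns, split):
    """Fit on returns[:split], then forecast each later return one step ahead.

    Assumption: coefficients are estimated once and not re-estimated.
    """
    a, b = fit_ar1(returns[:split])
    errors = [
        returns[t] - (a + b * returns[t - 1])
        for t in range(split, len(returns))
    ]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical monthly returns, for illustration only.
monthly_returns = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03, 0.02, 0.01]
print(out_of_sample_mse(monthly_returns, split=6))
```

A nonlinear model would be judged on the same test observations, so that any reduction in mean squared error is measured against this baseline.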
Machine learning methods have been applied to many areas of economics and finance. They are particularly attractive for
asset pricing for three reasons. First, the risk premium is the conditional
expectation of a future realized return, so the measurement of the risk premium of an asset is naturally a prediction
problem, and machine learning methods are well suited for predictive tasks. The standard pipeline of predictive tasks
has been well understood and implemented by machine learning practitioners, including feature selection, parameter
tuning, and choosing the most suitable models. Second, the set of market‐related and macroeconomic factors is
large and the factors are highly correlated; machine learning methods handle both problems by
reducing dimensions and condensing redundant variation among predictors (Gu et al., 2020). Third, although machine
learning algorithms have been criticized for their opaqueness and lack of interpretability, modern techniques such as
partial dependence, local surrogate models, and Shapley values have been developed for model‐agnostic interpretation.
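As an illustration of how Shapley values attribute a prediction to individual predictors, the sketch below computes exact Shapley values for one observation by enumerating all coalitions of features. The toy model, the predictor names, and the convention of replacing absent features with a baseline value are assumptions made for this example; practical work with many predictors would rely on approximations rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for the prediction at x.

    Assumption: a feature outside the coalition takes its baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical nonlinear return model with an interaction term."""
    basis, momentum, inflation = z
    return 0.5 * basis + 0.2 * momentum * inflation


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the additive term is credited entirely to the first predictor (about 0.5), while the interaction term is split equally between the other two (about 0.6 each). The attributions sum to the difference between the prediction at `x` and the prediction at the baseline.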
Our paper makes several contributions. First, we apply modern machine learning techniques to select and interpret
factors, and find that the significant predictive factors of each commodity are distinct, which explains why there is no
agreement on the common predictive factors in the commodity markets based on linear relationships. This indicates
that it is critical to focus on each contract instead of running a cross‐sectional test, as most previous research
does. Second, we implement a comprehensive comparison of models and contracts, and find that the light
gradient‐boosting machine (LightGBM) performs best in predicting futures returns. Third, we construct a practical
long–short portfolio strategy based on return predictions that outperforms strategies based on AR(1) and forecast
combinations.
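The mechanics of such a strategy and of the three performance measures named in the abstract can be sketched as follows. The equal weighting, the number of contracts per leg, the zero risk‐free rate in the Sharpe ratio, and all names and numbers are assumptions for illustration; the construction actually used in the paper is described in its later sections.

```python
from math import sqrt
from statistics import mean, stdev


def long_short_return(predicted, realized, k):
    """Long the k contracts with the highest predicted returns, short the k lowest.

    Assumption: equal weights within each leg; return is long leg minus short leg.
    """
    ranked = sorted(predicted, key=predicted.get)   # ascending by prediction
    shorts, longs = ranked[:k], ranked[-k:]
    long_leg = mean(realized[c] for c in longs)
    short_leg = mean(realized[c] for c in shorts)
    return long_leg - short_leg


def annualized_return(monthly):
    """Geometric annualized return from a list of monthly returns."""
    growth = 1.0
    for r in monthly:
        growth *= 1.0 + r
    return growth ** (12 / len(monthly)) - 1.0


def sharpe_ratio(monthly):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(monthly) / stdev(monthly) * sqrt(12)


def max_drawdown(monthly):
    """Largest peak-to-trough loss of cumulative wealth, as a positive fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in monthly:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst


# Hypothetical one-month predictions and realized returns.
predicted = {"gold": 0.02, "corn": -0.01, "crude": 0.03, "wheat": -0.02}
realized = {"gold": 0.01, "corn": -0.02, "crude": 0.04, "wheat": 0.00}
print(long_short_return(predicted, realized, k=2))
```

Repeating `long_short_return` each month yields a return series that can be passed to the three metric functions and compared with the series produced by a benchmark model.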
The remainder of the paper proceeds as follows. Section 2 reviews the previous work in this field. Section 3
presents the motivation of this study. The data and machine learning models used are introduced in Section 4.
Section 5 shows the performance of different predictive models and feature importance results for each commodity.
Section 6 analyzes the performance of the constructed portfolios. Section 7 concludes.
2 | RELATED LITERATURE
The predictability of risk factors in commodity futures markets has long been debated in the literature.
Daskalaki et al. (2014) include macroeconomic, tradable, and commodity‐specific factors and explore the common
factors in the cross‐section of commodity futures returns. Their results show that no asset pricing factor‐motivated
model can explain futures returns well. Yang (2013) builds a factor calculated as the difference between high‐ and
low‐basis portfolio returns, which can explain most of the average excess returns of commodity futures portfolios sorted
by basis. Ahmed and Tsvetanov (2016) construct two risk factors: the equally weighted average excess return on long
positions in futures contracts and the return difference between the high‐ and low‐basis portfolios. They find no
significant evidence that the factor models forecast better than the random walk with drift
model. The open interest in commodity futures markets has been found to predict commodity returns after controlling
for other known predictors (Hong & Yogo, 2012). Basu and Miffre (2013) construct a factor that captures the hedging
pressure risk premium of commodity futures; the predictive power of hedging pressure for future returns is distinct
from the forecasting power of past returns and the slope of the term structure. The factor constructed as the skewness
of commodity futures returns is found to have significant power to predict futures returns (Fernandez‐Perez
et al., 2018). Besides these futures‐related factors, Zhang (2019) shows that commodity options markets can have a
large effect on the underlying futures. We therefore include the commodity call‐to‐put volume ratio, which has been
studied in the stock markets (Pan & Poteshman, 2006). Guidolin and Pedio (2021) show that three commodity‐specific
factors (basis, hedging pressure, and momentum) contribute little to predicting futures returns when
macroeconomic factors are also included.
The commodity markets are well known to be affected by macroeconomic conditions. Commodity futures play an
important role in the global economy, allowing companies to ensure the future value of their outputs or inputs
(Bhardwaj et al., 2015). For example, the inflation rate has been shown to affect individual commodity futures
(Anzuini et al., 2012). Global economic policy uncertainty has been found to have power to predict return variance