Using social media mining technology to improve stock price forecast accuracy
Author | Jia‐Yen Huang, Jin‐Hao Liu |
DOI | http://doi.org/10.1002/for.2616 |
Date | 01 January 2020 |
Published date | 01 January 2020 |
RESEARCH ARTICLE
Department of Information Management,
National Chin‐Yi University of
Technology, Taichung City, Taiwan, ROC
Correspondence
Jia‐Yen Huang, Department of
Information Management, National
Chin‐Yi University of Technology, No. 57,
Sec. 2, Zhongshan Rd., Taiping Dist.,
Taichung City 41170, Taiwan, ROC.
Email: jygiant@ncut.edu.tw
Abstract
Many stock investors base their investment decisions on stock‐price‐related
chip indicators. However, in addition to quantified data, financial news often
has a nonnegligible impact on stock prices. Because new reviews are
posted on social media every day, web opinions may be valuable for
improving the performance of stock price prediction. To this end, we use logistic
regression to screen the chip indicators and establish a basic stock price
prediction model. We then employ text mining to quantify the unstructured
data of social media opinions on stock‐related news into sentiment
scores, which are found to correlate significantly with the magnitude of
stock price changes. Based on the finding that the higher the sentiment scores,
the lower the prediction accuracy of the logistic regression model, we propose
an improved prediction approach that integrates sentiment scores into the
logistic regression model. Our results show that the proposed model can
improve the prediction accuracy for stock prices and can thus provide a new
reference for the investment strategies of stock investors.
KEYWORDS
chip indicators, logistic regression model, prediction accuracy, sentiment scores, text mining
1 | INTRODUCTION
To make the best investment decisions, stock
investors generally collect related information from
sources such as TV, newspapers, magazines, and
the Internet. However, faced with such a variety of information,
investors often cannot identify which information is most
important. How to build high‐yield investment strategies
with the help of quantitative indicators and stock
price forecasting tools is therefore a topic of concern to
investors.
Stock price fluctuations are dynamic, nonlinear, nonstationary,
and noisy, which makes the
stock price difficult to forecast. Stock price forecasting
has been a widely studied research topic in academia
and the financial domain. Early research was mainly
based on random walk theory (RWT) and the efficient market
hypothesis (EMH). RWT argues that stock price fluctuations
are random and, hence, the next step of the stock
price is as irregular as a person walking on a square.
EMH suggests that designing a system based on any
information to predict stock price changes is impossible
because all information has already been reflected in
the existing stock prices.
According to previous EMH research, stock prices are
mainly driven by new information rather than current
and past prices. Since news is unpredictable, the stock
price will follow the random walk model, and its prediction
accuracy cannot exceed 50% (Qian & Rasheed, 2007).
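The 50% ceiling implied by the random walk model can be illustrated with a small simulation. The sketch below is not taken from the article. It assumes a symmetric walk with independent, equally likely up and down moves, and the function name `directional_accuracy` and the naive "repeat the last move" forecaster are hypothetical choices made only for illustration.

```python
import random


def directional_accuracy(n_steps: int, seed: int = 0) -> float:
    """Score a naive direction forecaster on a simulated symmetric random walk.

    Assumption (illustrative only): each price move is an independent,
    equally likely up (+1) or down (-1) step.
    """
    rng = random.Random(seed)
    steps = [rng.choice((1, -1)) for _ in range(n_steps)]
    # Naive rule: predict that the next move repeats the current move.
    hits = sum(1 for prev, nxt in zip(steps, steps[1:]) if prev == nxt)
    return hits / (n_steps - 1)


accuracy = directional_accuracy(100_000)
print(f"directional accuracy: {accuracy:.3f}")
```

Under these assumptions the hit rate stays close to 0.5 for any rule that uses only past moves, because past steps carry no information about the next one. Real price series need not satisfy the independence assumption.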
However, many studies have shown that stock prices
are not random, but can indeed be predicted to a certain
extent. News may be unpredictable, but early metrics can
Received: 9 November 2018 Revised: 10 May 2019 Accepted: 16 May 2019
DOI: 10.1002/for.2616
Journal of Forecasting. 2020;39:104–116. wileyonlinelibrary.com/journal/for © 2019 John Wiley & Sons, Ltd.