Using social media mining technology to improve stock price forecast accuracy

Author: Jia-Yen Huang, Jin-Hao Liu
DOI: http://doi.org/10.1002/for.2616
Date: 01 January 2020
Published date: 01 January 2020
RESEARCH ARTICLE
Using social media mining technology to improve stock
price forecast accuracy
Jia-Yen Huang | Jin-Hao Liu
Department of Information Management,
National Chin-Yi University of
Technology, Taichung City, Taiwan, ROC
Correspondence
Jia-Yen Huang, Department of
Information Management, National
Chin-Yi University of Technology, No. 57,
Sec. 2, Zhongshan Rd., Taiping Dist.,
Taichung City 41170, Taiwan, ROC.
Email: jygiant@ncut.edu.tw
Abstract
Many stock investors make investment decisions based on stock-price-related
chip indicators. However, in addition to quantified data, financial news often
has a non-negligible impact on stock prices. Because new reviews are
posted daily on social media, there may be value in using web opinions to
improve the performance of stock price prediction. To this end, we use logistic
regression to screen the chip indicators and establish a basic stock price
prediction model. We then employ text mining technology to quantify the
unstructured data of social media opinions on stock-related news into sentiment
scores, which are found to correlate significantly with the magnitude of
the stock price change. Based on the finding that the higher the sentiment scores,
the lower the prediction accuracy of the logistic regression model, we propose
an improved prediction approach that integrates sentiment scores into the
logistic regression model. Our results show that the proposed model
improves the prediction accuracy for stock prices and can thus provide a new
reference for stock investors' investment strategies.
KEYWORDS
chip indicators, logistic regression model, prediction accuracy, sentiment scores, text mining
1 | INTRODUCTION
In order to make the best investment decisions, stock
investors generally collect related information from
sources such as TV media, newspapers, magazines and
the Internet. However, faced with such varied information,
investors often cannot identify which information is most
important. How to build high-yield investment strategies
with the help of quantitative indicators and stock
price forecasting tools is clearly a topic of concern to
investors.
Stock price fluctuations are dynamic, nonlinear, and nonstationary,
and they carry a lot of noise, which makes the
stock price difficult to forecast. Stock price forecasting
has long been a research topic of wide concern in academia
and the financial domain. Early research was mainly
based on random walk theory (RWT) and the efficient market
hypothesis (EMH). RWT argues that stock price fluctuations
are random and, hence, the next step of the stock
price is as irregular as a person walking on a square.
EMH suggests that designing a system to predict stock price
changes from any information is impossible
because all information has already been reflected in
the existing stock prices.
According to previous EMH research, stock prices are
mainly driven by new information rather than current
and past prices. Since news is unpredictable, the stock
price will follow the random walk model, and prediction
accuracy cannot exceed 50% (Qian & Rasheed, 2007).
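The 50% ceiling can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes that daily price moves are independent, equally likely up or down steps (a symmetric random walk), and it scores a hypothetical "tomorrow repeats today's direction" rule. The function name `momentum_accuracy`, the rule itself, and the seed are illustrative assumptions only.

```python
import random


def momentum_accuracy(n_steps: int, seed: int = 0) -> float:
    """Directional accuracy of a 'repeat yesterday's move' rule.

    Assumption: each daily move is an independent fair coin flip
    (+1 = up, -1 = down), i.e. a symmetric random walk.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    moves = [rng.choice((-1, 1)) for _ in range(n_steps)]
    # The rule predicts that the next move equals the previous one.
    hits = sum(1 for prev, nxt in zip(moves, moves[1:]) if prev == nxt)
    return hits / (n_steps - 1)


if __name__ == "__main__":
    accuracy = momentum_accuracy(100_000)
    print(f"directional accuracy under a random walk: {accuracy:.3f}")
```

Under these assumptions the measured accuracy stays close to 0.5 regardless of the seed, because past moves carry no information about the next one. Any other rule based only on past moves would behave the same way in this setting.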
However, many studies have shown that stock prices
are not random, but can indeed be predicted to a certain
extent. News may be unpredictable, but early metrics can
Received: 9 November 2018 Revised: 10 May 2019 Accepted: 16 May 2019
DOI: 10.1002/for.2616
Journal of Forecasting. 2020;39:104–116. wileyonlinelibrary.com/journal/for © 2019 John Wiley & Sons, Ltd.
