Measuring large‐scale market responses and forecasting aggregated sales: Regression for sparse high‐dimensional data

Published date	01 August 2019
Date	01 August 2019
Author	Nobuhiko Terui,Yinxing Li
DOI	http://doi.org/10.1002/for.2574

RESEARCH ARTICLE

Nobuhiko Terui | Yinxing Li

Graduate School of Economics and

Management, Tohoku University, Sendai,

Japan

Correspondence

Nobuhiko Terui, Graduate School of

Economics and Management, Tohoku

University, Kawauchi Aoba‐ku, Sendai

980‐8756, Japan.

Email: terui@tohoku.ac.jp

Abstract

In this article, we propose a regression model for sparse high‐dimensional data

from aggregated store‐level sales data. The modeling procedure comprises two

sub‐models: a topic model and a hierarchical factor regression. These are

applied in sequence to accommodate high dimensionality and sparseness and

facilitate managerial interpretation.

First, the topic model is applied to aggregated data to decompose the daily

aggregated sales volume of a product into sub‐sales for several topics by allocating

each unit sale (“word” in text analysis) in a day (“document”) into a topic

based on joint‐purchase information. This stage reduces the dimensionality of

data inside topics because the topic distribution is nonuniform and product

sales are mostly allocated into smaller numbers of topics. Next, the market

response regression model for the topic is estimated from information about

items in the same topic. The hierarchical factor regression model we introduce,

based on canonical correlation analysis for original high‐dimensional sample

spaces, further reduces the dimensionality within topics. Feature selection is

then performed on the basis of the credible interval of the parameters' posterior

density.

Empirical results show that (i) our model allows managerial implications from

topic‐wise market responses according to the particular context, and (ii) it performs

better than conventional category regressions in both in‐sample and

out‐of‐sample forecasts.

KEYWORDS

dimension reduction, feature selection, hierarchical factor regression, high‐dimensional sparse data,

topic model

1 | INTRODUCTION

Disaggregated scanner panel records from stores have

been analyzed for various purposes using a variety of

models. To model consumer heterogeneity as predicted

by microeconomic theory, heterogeneous choice models

with hierarchical Bayes modeling were proposed by

Rossi, McCulloch, and Allenby (1996). These have been

widely and successfully applied to understand individual

customers and explore targeted or one‐to‐one marketing

strategies, as is discussed in Rossi, Allenby, and

McCulloch (2005) and associated references. Terui, Ban,

and Allenby (2011) discussed the effectiveness of TV

advertising in relation to the formation of consideration

sets using the single source of disaggregated purchases

with individual TV exposure records. Hasegawa, Terui,

and Allenby (2012) explored the mechanisms of consumer

satiation with products by modeling product attributes

using dynamic heterogeneous‐choice models.

Received: 20 November 2017 | Revised: 14 September 2018 | Accepted: 3 January 2019 | DOI: 10.1002/for.2574

On the other hand, sales data are automatically accumulated

at customer check‐out points by the point‐of‐sale

(POS) terminals in most retail locations. These data are

extremely important for developing promotional programs,

even if the store does not use a customer loyalty

program. Most traditional methods of analyzing the

aggregated store data specify the range of products by

category; that is, they use category regression. Category regression

as discussed by Hanssen, Parsons, and Shultz (2001)

assumes that the number of products is smaller than the

number of days for which sales data are available. This

approach is useful when applied to products from well‐recognized

categories with high frequency of sales. However,

the approach cannot be applied to all products in a

store, particularly to products that are purchased infrequently

over the period of observation.

By contrast, recent advances in network technology

have made clear that scanning entire databases can

uncover unexpected hidden patterns of joint purchases.

This big‐data approach could offer insights for marketing

managers that help them to understand the shopping

contexts of their customers. The aggregated scanner data

from POS terminals contain records of a huge number of

sales, prices, and promotions, as the store in the case

study below has about 8,000 products. These variables

cannot be used directly as covariates because the size of

the covariate matrix in the market response function is

intractably large. Even when a model can be estimated,

overfitting occurs because of the so‐called N < P problem,

where N and P are the numbers of samples and covariates,

respectively. Thus we need to generate smaller datasets by

decomposing the larger dataset or reducing the dimension

of the data.
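
The N < P problem can be illustrated with a small simulation. The sketch below is not taken from the case study: the sizes (30 samples, 100 covariates) and the pure-noise data are assumptions chosen only to show that least squares fits the training sample exactly when covariates outnumber samples, even though there is nothing to learn.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_covariates = 30, 100          # N < P (hypothetical sizes)

# Covariates and response are independent noise: no true relationship exists.
X = rng.normal(size=(n_samples, n_covariates))
y = rng.normal(size=n_samples)

# With N < P, lstsq returns the minimum-norm solution among infinitely many.
beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
in_sample_mse = np.mean((y - X @ beta) ** 2)

# Fresh data drawn from the same (noise) process.
X_new = rng.normal(size=(1000, n_covariates))
y_new = rng.normal(size=1000)
out_sample_mse = np.mean((y_new - X_new @ beta) ** 2)

print(f"rank of design matrix: {rank}")
print(f"in-sample MSE:     {in_sample_mse:.3e}")
print(f"out-of-sample MSE: {out_sample_mse:.3f}")
```

The in-sample error is zero up to rounding because the design matrix has rank N, while the out-of-sample error is not, which is the overfitting that motivates reducing the dimension before estimation.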

These high‐dimensional datasets are also sparse; that is,

they contain many zeros. This is the case with the aggregated scanner

data in our study: the number of product

items purchased on a given day is considerably smaller than the number of

items displayed in the store, so the sales data for many

items are zero. Information about the

price and promotional variables of those items is not recorded either,

which produces further entries with zero value.

In this study, we include all products in

our analysis without assuming categories. The proposed

model is composed of two sub‐models. The first model

reduces the dimension of the original data space by

decomposing it into several sub‐datasets. Hidden structure

within the aggregated scanner data is uncovered by

applying a topic model (Blei, Ng, & Jordan, 2003) to

aggregated data. The second model solves the N < P

problem by reducing the dimensionality of the covariate

space. A hierarchical factor regression model is proposed

for this purpose, to estimate market structure in the lower

dimensional space between the dependent variable and

covariates. The market response functions defined by

regression can then be estimated in the usual manner, because the data in the

reduced‐dimensional space do not include many zeros.

Finally, the market response in terms of regression coefficients

in the high‐dimensional original data space is

recovered by converting the estimated structure in the

reduced‐dimensional space into the original space. An

overview of the proposed model is shown in Figure 1.
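
The idea of estimating in a reduced space and then converting back to the original covariates can be sketched with a simpler stand-in. The example below uses principal component regression, not the hierarchical factor regression proposed here (which is based on canonical correlation analysis and is specified in Section 3). The data, the sizes, and the names `V_r`, `F`, `gamma`, and `beta` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days, n_products, n_factors = 60, 200, 5   # N < P, hypothetical sizes

# Sparse covariates: roughly 90% of the entries are zero.
mask = rng.random((n_days, n_products)) < 0.1
X = rng.normal(size=(n_days, n_products)) * mask
y = rng.normal(size=n_days)

# Center so that no intercept is needed in the reduced regression.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Step 1: reduce the covariate space to a few factors (principal components).
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
V_r = Vt[:n_factors].T            # loadings, shape (P, r)
F = Xc @ V_r                      # factor scores, shape (N, r)

# Step 2: ordinary regression in the low-dimensional space (r << N).
gamma, *_ = np.linalg.lstsq(F, yc, rcond=None)

# Step 3: map the factor coefficients back to one coefficient per product.
beta = V_r @ gamma                # shape (P,)

print(beta.shape)
```

Because `F = Xc @ V_r`, the fitted values `F @ gamma` and `Xc @ beta` coincide, so `beta` expresses the low-dimensional fit as a response coefficient for every original covariate.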

Our paper contributes to the theory of marketing

modeling and forecasting by the Big Data approach. More

specifically, we apply a topic model to aggregated sales

data in a new manner to reduce dimensionality and

uncover shopping contexts. This approach is next

combined with a regression model for sparse high‐

dimensional data by proposing hierarchical factor regression

to find effective covariates and measure their market

responses. These models are verified with an empirical

study that unveils actionable insights for retail managers.

We derive unusual findings from these response functions

by scanning the whole database, which contributes

to managerial practice usually based on category‐specific

analysis.

In the next section, we apply the topic model to aggregated

sales data and generate sub‐datasets for topics that

can be interpreted as shopping contexts. These sub‐datasets

already have a lower dimension than the

original dataset, since product sales are not likely to

be allocated to the topics evenly. In Section 3, for

the sub‐datasets of the respective topics, we apply a

second model to further reduce the data dimensionality

to estimate topic‐wise market response regressions. In

Section 4, we report an empirical study of aggregated

scanner data. The concluding remarks are given in

Section 5.
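
The decomposition of daily sales into topic-wise sub-sales can be sketched as follows. This is a minimal illustration that assumes a standard latent Dirichlet allocation model fitted by collapsed Gibbs sampling, with each day treated as a document and each unit sold as a word; the exact specification used in Section 2 may differ. The function name `decompose_sales_by_topic`, the hyperparameters `alpha` and `beta`, and the toy chocolate, card, and flour data are hypothetical.

```python
import numpy as np

def decompose_sales_by_topic(counts, n_topics, n_iter=200,
                             alpha=0.1, beta=0.01, seed=0):
    """Allocate every unit sale in a day-by-product count matrix to a topic.

    Returns an integer array sub_sales[k, d, w]: units of product w sold on
    day d that the final Gibbs sweep assigns to topic k.
    """
    rng = np.random.default_rng(seed)
    n_days, n_products = counts.shape

    # Expand the count matrix into one token (day, product) per unit sold.
    days, products = np.nonzero(counts)
    reps = counts[days, products]
    tok_day = np.repeat(days, reps)
    tok_prod = np.repeat(products, reps)
    z = rng.integers(n_topics, size=tok_day.size)   # random initial topics

    # Sufficient statistics of the collapsed sampler.
    n_dk = np.zeros((n_days, n_topics))
    n_kw = np.zeros((n_topics, n_products))
    np.add.at(n_dk, (tok_day, z), 1)
    np.add.at(n_kw, (z, tok_prod), 1)
    n_k = n_kw.sum(axis=1)

    for _ in range(n_iter):
        for i in range(tok_day.size):
            d, w, k = tok_day[i], tok_prod[i], z[i]
            # Remove the token, resample its topic, and add it back.
            n_dk[d, k] -= 1
            n_kw[k, w] -= 1
            n_k[k] -= 1
            p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + n_products * beta)
            k = rng.choice(n_topics, p=p / p.sum())
            z[i] = k
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    sub_sales = np.zeros((n_topics, n_days, n_products), dtype=int)
    np.add.at(sub_sales, (z, tok_day, tok_prod), 1)
    return sub_sales

# Toy data, columns: chocolate, card, flour (hypothetical products).
rng = np.random.default_rng(42)
gift_days = rng.poisson([4, 3, 0.2], size=(10, 3))
baking_days = rng.poisson([3, 0.2, 4], size=(10, 3))
counts = np.vstack([gift_days, baking_days])

n_topics = 2
sub_sales = decompose_sales_by_topic(counts, n_topics, n_iter=50)
print(sub_sales.sum(axis=(1, 2)))   # total units allocated to each topic
```

By construction the sub-sales add up to the observed sales for every day and product, so the topic-wise sub-datasets partition the original data rather than duplicate it.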

2 | DIMENSION REDUCTION USING TOPIC MODEL

2.1 | Decomposing aggregated sales into sub‐sales by shopping context

Consumer motivations for making purchases of products

are hidden in aggregated sales data. For example, out of

50 sales of a chocolate, 15 could be purchased for consumption,

25 to be given away, as indicated by a joint purchase

with a card, and 10 for

cooking, as indicated by a simultaneous purchase of flour.

The decomposition of total sales into several

topics allows a better understanding of the market and
