Using Computational Linguistics to Identify Competitors and Competitive Interactions.

Author: Phillips, Gordon M.

Identifying competitors and analyzing competitive interactions is difficult in many markets. For well-defined markets with well-defined products, many examinations of competitors and markets can be done with traditional methods. However, firms increasingly operate in multiple markets. A given firm's products may also differ sharply, both in their attributes and consumers, within markets. In addition, some firms may offer customized products or offer services along with physical products, increasing complexity. A firm's product choice thus can involve multiple dimensions such as product differentiation and product quality. For all these reasons, identification of any given firm's competitors and markets has become increasingly difficult.

Gerard Hoberg and I take a nonconventional approach to identifying and examining firm competitors and firm organization. We use natural language processing (NLP) of text to calculate firm pair-by-pair product similarity scores to build a new spatial, text-based network industry classification (TNIC). (1) This new spatial representation can capture both horizontal and vertical industry connections among firms. Using these new text-based competitor and industry classifications, we, along with other coauthors, examine mergers and acquisitions, vertical integration, entry threats by new firms, covariation in the stock market, and competition among patenting firms.

In a sequence of articles, we use multiple sources of text, including the business and product descriptions in firms' 10-K annual reports filed with the Securities and Exchange Commission, product text in the input-output classifications from the Bureau of Economic Analysis (BEA), and patent text from US Patent and Trademark Office filings. Additional sources of text could also be incorporated into our network.

Text to Determine Competitors and Merger Synergies

We examine merging firms and their competitors in our early computational linguistics research. (2) We take an agnostic view of markets and examine firms' pairwise 10-K text-based product similarities to identify rival and complementary firms. Using NLP, we compute the product market similarity of each pair of firms from the product description text in firms' 10-Ks and produce a ranked list of competitors for each firm. Our text-based similarity measure gives a continuous relatedness score indicating the actual degree of product word similarity. The relatedness score changes each year as the firms' product descriptions change. Thus, the similarity scores are dynamic and continuous, capturing the degree of relatedness of two firms each year rather than just a "Yes/No" relatedness measure.
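To make the idea of a continuous pairwise score concrete, here is a minimal sketch in Python. It assumes cosine similarity over binary word-presence vectors, a common choice for this kind of comparison; the text above does not specify the exact formula, so this should not be read as the authors' implementation. The firm names, descriptions, stopword list, and function names are all hypothetical and chosen only for illustration.

```python
import math
import re

# Illustrative stopword list (an assumption; a real pipeline would use a
# much larger list and more careful word filtering).
STOPWORDS = {"the", "and", "of", "a", "for", "to", "in", "we", "our"}


def word_set(text):
    """Lowercase the text, split it into alphabetic tokens, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def similarity(words_a, words_b):
    """Cosine similarity between two binary word-presence vectors.

    For binary vectors this reduces to |A & B| / sqrt(|A| * |B|),
    which always lies between 0 and 1.
    """
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / math.sqrt(len(words_a) * len(words_b))


def ranked_competitors(descriptions, firm):
    """Return (other_firm, score) pairs for `firm`, most similar first."""
    vocab = {name: word_set(text) for name, text in descriptions.items()}
    scores = [
        (other, similarity(vocab[firm], vocab[other]))
        for other in vocab
        if other != firm
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical product descriptions standing in for 10-K text.
descriptions = {
    "AlphaSoft": "We develop cloud software and database tools for enterprise customers.",
    "BetaData": "Our database software and analytics tools serve enterprise customers.",
    "GammaFoods": "We produce packaged snacks and beverages for grocery retailers.",
}

ranking = ranked_competitors(descriptions, "AlphaSoft")
for other, score in ranking:
    print(f"{other}: {score:.3f}")
```

Rerunning the same computation on each year's descriptions would yield scores that change over time, which is the sense in which the measure is dynamic rather than a fixed "Yes/No" industry assignment.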

These relatedness measures are much better on multiple dimensions than Standard Industrial Classification (SIC) or North American Industry Classification System (NAICS) codes used extensively in economics, finance, and management. Hoberg and I show in our paper that in regressions of firm characteristics on industry grouping characteristics, our network codes can explain many accounting characteristics and outcomes such as merger relatedness significantly better than SIC and NAICS codes. (3)

There are two reasons SIC codes and NAICS codes can have severe misclassification problems. First, SIC and NAICS codes do not capture how related firms are because the codes are primarily based on how a product is made, rather than the end customer of that product. Second, the codes are updated infrequently and may be based on historical designations. Our measure, by contrast, is a continuous relatedness measure that allows product similarity and differentiation to be measured. It is updated each year and is firm specific.

These product market similarities enable us to rank a given firm's closest product competitors to understand the incidence and outcomes of mergers and acquisitions. We show that firm pairs with very low or very high product similarity are both less likely to merge. The lower likelihood that we find for high-similarity firms possibly reflects rivals capturing some of the merger gains or antitrust concerns. Firms that are...
