Debunking NLP: Detecting Products, Brands, Companies and Industries

September 28, 2017

Clarabridge Analytics
Text Analytics
Artificial Intelligence
Machine Learning
Natural Language Processing
Natural Language Understanding

In my last entry in the Debunking NLP blog series, we talked about the value of named entities in customer feedback data and the challenges that Natural Language Processing (NLP) engines face in identifying them in text. Finding named entities in your customer feedback data opens the doors to new kinds of business questions and answers. However, finding them in text is only half the battle. The other challenge is in determining what type of named entity you have at hand – whether it’s a product, brand, company, industry, or another type of proper noun.

Take the word “Georgia,” for example. Georgia could be a state, a country, a city, a woman, a brand of coffee, a university, a movie, a song, or even a battleship. Telling these apart is often difficult even for humans: “I love Georgia” could legitimately refer to any of them. For some NLP systems, the distinction between one named entity type and another is irrelevant. But for a CX NLP engine like Clarabridge, the distinction matters because it can completely change how a business understands its customers’ needs and preferences.

To differentiate these types, Clarabridge combines linguistic and knowledge-based approaches. We look for clues such as a suffix (Inc, LLC, PLC) that indicates an organization, or a version number that indicates a product. We combine those clues with dictionaries of known products, brands and organizations, and tie them all together with a hierarchy. As a result, if a consumer mentions trying to integrate their iPhone with their car, we also want to reflect that as a mention of the Apple brand, Apple Inc and the Computer Software industry.
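To make the idea of combining clues concrete, here is a minimal Python sketch. It is not Clarabridge's implementation: the patterns, the tiny `KNOWN_ENTITIES` dictionary and the function name `guess_entity_type` are all assumptions made up for illustration. It checks a dictionary first, then falls back to the suffix and version-number clues described above.

```python
import re

# Hypothetical clue patterns for illustration only.
# An organization suffix (Inc, LLC, PLC) at the end of the mention.
ORG_SUFFIX = re.compile(r"\b(?:Inc|LLC|PLC)\.?$", re.IGNORECASE)
# A trailing version number such as "8.1" or "v2".
VERSION = re.compile(r"\b(?:v\d+(?:\.\d+)*|\d+(?:\.\d+)+)$", re.IGNORECASE)

# Toy dictionary of known entities (assumed data, lowercased keys).
KNOWN_ENTITIES = {"iphone": "product", "apple": "brand"}


def guess_entity_type(mention: str) -> str:
    """Return a best-guess type for a named entity mention."""
    text = mention.strip()
    # Knowledge-based clue: exact dictionary lookup.
    known = KNOWN_ENTITIES.get(text.lower())
    if known is not None:
        return known
    # Linguistic clue: a corporate suffix suggests a company.
    if ORG_SUFFIX.search(text):
        return "company"
    # Linguistic clue: a version number suggests a product.
    if VERSION.search(text):
        return "product"
    return "unknown"
```

A mention like “Georgia” comes back as `unknown` here, which is the honest answer when no clue is present. A real engine would need context from the surrounding sentence to go further.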

In academia, there are four standard types of named entities: persons, locations, organizations, and products. However, when Clarabridge approached the challenge of interpreting named entities, we found that this four-type classification was too narrow to meet the needs of those analyzing customer feedback.

So, we started by expanding the dual classification of product and organization entities to a four-part hierarchy that links products to brands, brands to companies, and companies to industries. Why? Because many of our customers and their competitors own multiple products under multiple brands that roll up to a single organization.

Take General Mills, for example, which owns dozens of brands under the General Mills label. It would want to track Yoplait, Pillsbury, Larabar and Old El Paso, to name a few. Most consumers would not even mention “General Mills” when talking about their moldy Yoplait yogurt or their delicious Larabar. We had to bridge the gap between what consumers say and how businesses think about their offerings.
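One simple way to picture the product-to-brand-to-company-to-industry roll-up is a table of parent links. The sketch below is an assumption-laden illustration, not the actual Clarabridge data model: the `HIERARCHY` table and the `roll_up` function are hypothetical, and the entries come only from the examples in this post.

```python
from typing import Dict, Optional, Tuple

# Toy hierarchy: entity name -> (level, parent entity or None).
# Entries are taken from the examples in this post; the structure is assumed.
HIERARCHY: Dict[str, Tuple[str, Optional[str]]] = {
    "iPhone": ("product", "Apple"),
    "Apple": ("brand", "Apple Inc"),
    "Apple Inc": ("company", "Computer Software"),
    "Computer Software": ("industry", None),
    "Yoplait": ("brand", "General Mills"),
    "Larabar": ("brand", "General Mills"),
    "General Mills": ("company", None),  # industry deliberately left unset
}


def roll_up(entity: str) -> Dict[str, str]:
    """Walk parent links upward and collect one value per level."""
    attributes: Dict[str, str] = {}
    current: Optional[str] = entity
    while current is not None and current in HIERARCHY:
        level, parent = HIERARCHY[current]
        attributes[level] = current
        current = parent
    return attributes
```

With this table, `roll_up("iPhone")` fills in all four levels, while `roll_up("Yoplait")` yields the brand and the company even though the consumer never typed “General Mills”.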

Here at Clarabridge, we feel strongly that we should put our NLP to work for you, which is why we have four attributes that populate automatically with business named entities every time you load data: Product, Brand, Company and Industry. These attributes provide immediate value: you can see the top products or brands mentioned in your data within seconds of the data loading, with no category model or keyword search required!
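To give a feel for what a “top brands” view computes, here is a small sketch that counts how many feedback documents mention each brand. Everything in it is a stand-in: the `BRANDS` list, the `top_brands` function and the plain dictionary matching are assumptions for illustration, and a production engine would rely on proper entity detection rather than a regular expression.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical brand dictionary, using brands named in this post.
BRANDS = ["Yoplait", "Pillsbury", "Larabar", "Old El Paso"]
CANONICAL = {brand.lower(): brand for brand in BRANDS}

# Match any known brand as a whole word or phrase, ignoring case.
BRAND_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(b) for b in BRANDS) + r")\b",
    re.IGNORECASE,
)


def top_brands(documents: Iterable[str], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n brands mentioned in the most documents."""
    counts: Counter = Counter()
    for doc in documents:
        # Use a set so each brand counts once per document.
        found = {
            CANONICAL[match.group(0).lower()]
            for match in BRAND_PATTERN.finditer(doc)
        }
        counts.update(found)
    return counts.most_common(n)
```

Counting once per document rather than once per mention is a design choice; it keeps one very repetitive review from dominating the ranking.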

How are these semantic attributes best leveraged? Here are the top use cases for these features:

  1. Competitor Analysis

Search your data for mentions of your competitors. Look at corresponding volume and sentiment to assess market share and health. Look at these mentions with a customer journey model to determine areas in which your competition shines and areas in which they fail. Or, search for sentences which mention both you and your competition comparatively. Armed with this information, you can design your business strategy to capitalize on your advantages.

  2. Brand or Product Analysis

Identify references to your brands and your products in your data. Analyze these mentions with sentiment, emotion and effort to determine how your customers relate to your offerings. Look at these mentions with filters for “suggestions” or “requests” to identify areas of opportunity to improve your offerings.

  3. External Brand Analysis

Your customers’ world is bigger than just you (sorry to burst your bubble!). Identify mentions of other non-competitor brands in your data that may be affecting your customers’ experience with your offerings. Cross-analyze these with mentions of effort or top topics to find intersections between offerings. Seek to capitalize on your customers’ preferences or biases for other brands that will help them adopt your products more seamlessly.

We’ve found great value in the use cases above, but we are continually impressed by the new ways in which our customers use these attributes. The three listed here only scratch the surface.

This week we discussed just part of the world of named entities. Next week we will turn our attention to the techniques required to identify people mentioned in your data. Can you think of other names that present a challenge for named entity recognition? Tweet @ellenfalci with your ideas! I’ll share the best ones in my next post.


To read the previous blog posts in this series, please visit:

Debunking NLP: Introduction

Debunking NLP: Translation

Debunking NLP: Named Entities