Debunking NLP: Detecting People, Phone Numbers and Email Addresses

October 30, 2017

This blog post is part 3 of our Debunking Natural Language Processing (NLP) series. Throughout this series, Ellen will highlight several features that help Clarabridge users go beyond simple topic analysis. This series will show you how new types of analysis aren’t so farfetched after all!


“A Rose by any other name would smell as sweet.” Or would she? Was Juliet simply referring to the flower, or was “Rose” a subtle reference to a woman? Another lover, perhaps? A literary scholar or even a high school English student would surely conclude the former. However, if you give this task to a computer, the answer may not be so obvious.

In my last two posts, we discussed the challenges that machines face in detecting the subtleties of named entities in text and the importance of being able to interpret them. The difference between the generic and specific versions of a brand or product name, like “apple” and “Apple,” is significant, and the same is true of human names like “rose” and “Rose.” In the CX industry, it would be misleading to misinterpret a name like “Bill” and catastrophic to misinterpret a name like “Sue.” I don’t think it’s an overstatement to say that the ability to both identify and classify these words correctly is mission-critical in CX.

The four-part classification of named entities common in academia (persons, locations, organizations and products) doesn’t quite align with the needs of a CX program. To help our users answer questions about their business, Clarabridge offers four hierarchical business entities: Product > Brand > Company > Industry. (See my last post for more details on these business entities!) Similarly, we built an expanded set of functionality around the “person” named entity classification. Why? Two reasons. First, the success and failure of every organization are tied to its employees. Second, the livelihood of every organization is tied to its customers. Without a doubt, people are the essential cogs in the business workflow; we designed our Person detection functionality to cater directly to this keystone.

Clarabridge offers three attributes that help identify individuals and that populate automatically when you load data: Person, Phone Number and Email Address. These attributes provide immediate value: you can determine which of your employees are mentioned in your data and identify which customers are seeking engagement by providing their contact information.

In the past, identifying these kinds of attributes was extremely difficult in Clarabridge. Users would add the top male and female names from the previous census to a category node. But, as you can imagine, names like Mark, Sue, Will, Bill, April, May, June, Ray, Angel, Guy, Bob, Virginia, Rose, Ruby, Grace, Dawn, Amber, Joy, Terry, Penny, Kay, Violet, Daisy and Barb cause a real headache! Similarly, there was no simple way to query for all of the different variations in the structure of phone numbers or to get a list of all email addresses mentioned. The power of a mature NLP engine is that we can use techniques that go far beyond simple keyword matches to find these kinds of entities. The result is higher precision and higher recall of detection of names and contact information.
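To make the headache concrete, here is a minimal Python sketch of the old keyword approach. It is only an illustration, not Clarabridge functionality: the tiny CENSUS_NAMES set and the naive_name_hits helper are hypothetical stand-ins for a census-derived name list loaded into a category node.

```python
import re

# Hypothetical, tiny stand-in for a census-derived first-name list.
CENSUS_NAMES = {"mark", "sue", "will", "bill", "may", "rose", "joy", "grace"}


def naive_name_hits(text):
    """Return every word that matches the name list, ignoring case and context."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [tok for tok in tokens if tok.lower() in CENSUS_NAMES]


feedback = [
    "Sue at the front desk was wonderful.",
    "I will sue if my bill is not corrected by May.",
]

for record in feedback:
    # The second record contains no person at all, yet it yields four "names".
    print(naive_name_hits(record), "<-", record)
```

The first record is a genuine mention of an employee. The second contains a verb, a noun and a month that all happen to be common first names, and a keyword match cannot tell the difference. Resolving that ambiguity takes context, which is what an NLP engine brings to the task.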

How are these semantic attributes best leveraged? Here are the top use cases for these features:

1. Employee Praise and Critique

Discover which of your employees are mentioned in your data using the Person attribute. The praise or criticism that customers provide about an associate can be used to reward or modify performance internally. The Person attribute is highly valuable for operational use cases where “on the ground” employees frequently interact with customers.

2. Customer Engagement

Quickly identify customers who are seeking engagement with your brand by finding all records that contain contact information using our Phone Number and Email Address attributes. Customers may or may not be explicit when soliciting a response. Picking up on subtle cues like the presence of a phone number or email address in text and then proactively calling them back is sure to make a very positive impression.

3. Influencer Identification

Not only will Clarabridge pick up on the names of your employees, it will also find names of other individuals in the data. For example, it’s not uncommon to see names of performers, politicians, business executives and other celebrities bubble up in reports for Person. It’s incredibly insightful to discover these external influences on your customers’ perceptions. When analyzing Las Vegas casino reviews, I actually found lots of negative mentions of Elton John. Turns out that he had been canceling lots of shows and customers were not pleased!
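For the Customer Engagement use case, the idea of flagging records that contain contact information can be sketched with pattern matching in Python. This is an assumption-laden illustration rather than how Clarabridge detects these attributes: EMAIL_RE, PHONE_RE, contact_info and wants_follow_up are hypothetical names, and the phone pattern only covers common North American formats.

```python
import re

# Simplified email pattern: local part, "@", then a domain with at least one dot.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Simplified North American phone pattern: optional +1 prefix, optional
# parentheses around the area code, and space, dot or hyphen separators.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)


def contact_info(text):
    """Collect the email addresses and phone numbers found in one record."""
    return {"emails": EMAIL_RE.findall(text), "phones": PHONE_RE.findall(text)}


def wants_follow_up(text):
    """Flag a record when the customer left any contact details."""
    info = contact_info(text)
    return bool(info["emails"] or info["phones"])


records = [
    "Please call me at (555) 123-4567 about my order.",
    "Reach me at jane.doe@example.com or 555.123.4567",
    "Order 1234567890123 arrived late.",
]

for record in records:
    print(wants_follow_up(record), contact_info(record))
```

Even this short pattern has to juggle parentheses, separators and country prefixes, and it deliberately refuses to match inside longer digit strings such as order numbers. International formats, extensions and numbers written out in words would all need more work.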

We’ve found great value in the above-mentioned use cases, but we are continually impressed by the new ways in which our customers use these attributes. The examples here only scratch the surface.

This post wraps up our discussion about named entities. I hope you found it eye-opening to explore the value behind these special elements. We will turn our attention next to a slightly bigger space by moving from a discussion of the meaning of words to the meaning of complete sentences. Stay tuned!