Analyzing Customer Feedback Data: Manual Analysis vs NLP

By: Clarabridge Team

September 5, 2017

Organizations have two kinds of customer feedback data that they measure, store, and analyze: structured and unstructured data.

Structured data is information that is clearly defined and easy to report on. It is the kind of data that is generally found in a survey and can be organized in a spreadsheet: name, location, age, and rating (3 out of 5 stars, for example, or a 10 for “most satisfied” versus a 1 for “least satisfied”).

Unstructured data is mostly free text, although it can also include other media such as audio, photos, or video. It can be captured in an email, the “additional comments” section of a survey, a voice recording of a customer interaction, a post on a customer review site or social media, call center notes, chat transcripts, and dozens of other places.
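To make the distinction concrete, here is a minimal sketch of a single feedback record that carries both kinds of data. The class name, field names, and sample values are made up for illustration; they do not come from any particular survey tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FeedbackRecord:
    # Structured fields: clearly defined and easy to put in a spreadsheet.
    name: str
    location: str
    age: int
    rating: int  # e.g. 1 ("least satisfied") to 10 ("most satisfied")
    # Unstructured field: free text from an "additional comments" box.
    comment: Optional[str] = None


# Hypothetical sample data, for illustration only.
records = [
    FeedbackRecord("A. Customer", "Reston", 34, 9, "Great service, will come back."),
    FeedbackRecord("B. Customer", "Austin", 51, 3),
]


def average_rating(items: List[FeedbackRecord]) -> float:
    """Structured data can be summarized with simple arithmetic."""
    return sum(r.rating for r in items) / len(items)


def comments_needing_analysis(items: List[FeedbackRecord]) -> List[str]:
    """Unstructured data has to be read (by a person or by software) first."""
    return [r.comment for r in items if r.comment]


print(average_rating(records))
print(comments_needing_analysis(records))
```

The rating can be averaged in one line, while the comment only yields insight once something has interpreted the text.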


Challenges of customer feedback analysis:

Analyzing all this data correctly is critical, because it reveals everything from buying trends to product flaws and provides a significant business advantage. Considering that 95% of customer feedback data is unstructured, there is significant opportunity to uncover customer, product, competitive, research, and marketing insights. Organizations are able to better serve customers, control costs and risk, compete more effectively, and drive profitability. That’s why so many companies have explored so many different approaches to extracting these valuable insights.

Organizations often struggle to do this analysis, however, because unstructured data is significantly harder to categorize and report on than structured data. It can be hard to parse due to grammatical errors or slang, it frequently contains multiple unrelated ideas, and it can represent various levels of sentiment related to each idea (for example, “I absolutely loved the food, but the waiter was rude and finding parking was impossible.”).
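The restaurant comment above shows why a single score per comment is not enough. The following toy sketch splits a comment into clauses and scores each one against a tiny hand-made word list. The word lists, the splitting rule, and the function names are illustrative assumptions, and this is far simpler than a real NLP engine.

```python
import re

# Tiny hypothetical sentiment word lists, for illustration only.
POSITIVE = {"loved", "great", "friendly"}
NEGATIVE = {"rude", "impossible", "slow"}


def split_clauses(text):
    """Split on commas and on the conjunctions "but" and "and"."""
    parts = re.split(r",|\bbut\b|\band\b", text)
    return [p.strip(" .") for p in parts if p.strip(" .")]


def score_clause(clause):
    """Label a clause by counting positive and negative words."""
    words = re.findall(r"[a-z']+", clause.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comment = ("I absolutely loved the food, but the waiter was rude "
           "and finding parking was impossible.")

for clause in split_clauses(comment):
    print(f"{score_clause(clause):8s} | {clause}")
```

Even this crude approach separates the praise for the food from the complaints about the waiter and the parking. It would fail quickly on slang, misspellings, or negation ("not great"), which is exactly the difficulty described above.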


Manual analysis:

Many approaches to text analytics have been tried and found lacking. The most basic way to understand feedback text is simply to have someone read it, note the contents, and categorize it. Market researchers, for example, often categorize, or “code,” the free-text responses in surveys. If an organization is only receiving a few dozen surveys a month, manual review and coding could suffice. However, this option doesn’t scale well for several reasons:

1. Cost: Assuming minimum wage and that the average person can process 50 items of unstructured data an hour, reading and categorizing one million items takes 20,000 hours of work and costs $145,000. To put this in perspective, Verizon is analyzing 700,000 post-call surveys per month, and that’s only one of their data sources.

2. Time: At the same rate of 50 items an hour, one person can process 400 unstructured data elements in an eight-hour day. Going through one million comments would take 2,500 days, or just under seven years. (To add perspective, Yelp users post 37,987,200 reviews per day. TripAdvisor has 200 million posts and counting.) By the time the data is actually analyzed, it is often too late to act on it in any meaningful way.

3. Errors: People make mistakes and are less accurate at coding than one may expect. Studies suggest that humans have an average accuracy rate of just 80%.

4. Inflexibility: Manual processes can’t easily accommodate new categories or codes, and are even less likely to include “why” and “what if” analysis.
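The cost, time, and error figures in the list above can be reproduced with a few lines of arithmetic. The sketch below assumes an hourly wage of $7.25 (the US federal minimum wage, which is the rate implied by the $145,000 figure but not stated in the text) and one coder working eight hours every calendar day. The function and constant names are made up for this illustration.

```python
ITEMS = 1_000_000        # comments to code
ITEMS_PER_HOUR = 50      # manual processing rate from the list above
HOURS_PER_DAY = 8        # one eight-hour working day
HOURLY_WAGE = 7.25       # assumption: US federal minimum wage, in dollars
CODER_ACCURACY = 0.80    # average human coding accuracy from the list above


def manual_coding_estimate(items=ITEMS):
    """Estimate the effort for one person to code `items` comments by hand."""
    hours = items / ITEMS_PER_HOUR
    days = hours / HOURS_PER_DAY
    return {
        "hours": hours,
        "cost_usd": hours * HOURLY_WAGE,
        "days": days,
        "years": days / 365,  # assumes work on every calendar day
        "miscoded_items": round(items * (1 - CODER_ACCURACY)),
    }


estimate = manual_coding_estimate()
for key, value in estimate.items():
    print(f"{key}: {value:,.2f}")
```

Under these assumptions the estimate comes to 20,000 hours, $145,000, 2,500 days (about 6.85 years), and roughly 200,000 miscoded items. With a five-day work week instead of work on every calendar day, the elapsed time would be longer still.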

When these challenges are combined with the growing volume of unstructured data, as well as multiple languages, it is easy to understand why automated processes have been explored for text analysis.



Natural Language Processing:

Natural Language Processing (NLP) automates the reading of text using sophisticated algorithms. Fast, consistent, and programmable, NLP engines identify words and grammar to find meaning in large amounts of text.

Clarabridge has developed a proprietary NLP engine that uses both linguistic and statistical algorithms. This hybrid framework makes it straightforward to use without extensive understanding of statistics, comprehensive domain knowledge, or linguistic expertise.
