

Improved Topic Sentiment Analysis using Discourse Segmentation

Text iQ’s powerful text analysis capabilities allow you to quickly assign topics to the customer feedback (i.e., survey responses) you receive, perform sentiment analysis, and report results in powerful widgets. However, in the real world, much of the feedback received is very noisy. Users may quickly jot down their thoughts, and their feedback is prone to grammatical errors or incoherent organization. When you factor in other forms of unstructured feedback such as social media, reviews, or conversational data, the task of topic detection and sentiment analysis becomes even more challenging. Consider the example below.

"I visited Pizza Planet for the first time, great salad selection and pies. I also loved the number of draft beers, however there was no full bar."

Topic | Sentiment
Salads | Very Positive
Pies | Very Positive
Draft Beers | Very Positive
Full bar | Negative

The feedback response above captures several aspects of a user's experience; however, not all of them express the same sentiment. While the customer praises the salads, pies, and beer selection, they were disappointed by the lack of a full bar. Our legacy segmentation system is largely rules-based and places a greater emphasis on punctuation-based structure. This legacy paradigm, however, does not easily extend to non-English languages, especially those that follow different structures. As a result, topic sentiment accuracy is lower.

Over the course of the past year, we have developed a sophisticated machine learning technique called Discourse Segmentation to handle this kind of feedback data. As a follow-up to our previous work on improving sentiment prediction, we will demonstrate how Discourse Segmentation lets us segment non-standard human feedback text, identify salient segments (i.e., topics), and predict the sentiment expressed on those topics more accurately than legacy modeling approaches, while extending seamlessly across different languages.

Out with the old

Like many Natural Language Processing (NLP) systems, our legacy segmentation system was largely rules-based. In a rules-based system, certain grammatical structures, such as punctuation, largely guide how the system splits text into segments. As pointed out in the example above, much textual feedback data does not conform to strict grammar and punctuation. In addition, our legacy system was harder to scale across many languages, since the amount of shared structure becomes increasingly sparse across different languages.

I visited Pizza Planet for the first time, great salad selection and pies.  I also loved the number of draft beers, however there was no full bar.

In the above example, our legacy segmentation system correctly captures “great salad selection and pies” as a single segment. However, if we add a comma after “selection,” the sentiment for pies is lost, and our legacy system marks the sentiment for pies as “neutral,” which is incorrect.

In with the new

When we pass the same example through our new Discourse Segmentation system, it correctly identifies segments relating to the visit overall, salad and pizza, draft beer selection, and the full bar, as shown below.

If we add some punctuation noise to the example text, our new system is robust enough to discount the noise and still segment correctly based on information saliency.

With outstanding results: A huge increase in topic sentiment accuracy

To measure the topic sentiment accuracy of the new Discourse Segmentation system against our legacy system, we created a statistically significant sample of data with many targets, distributed uniformly across three languages (English, Spanish, and Japanese), and performed a detailed analysis of the outputs of both systems.

Overall, the new Discourse Segmentation model shows a 41 percentage point increase in F1 score (i.e., the harmonic mean of precision and recall) on Japanese responses. Furthermore, for responses with overall neutral or mixed sentiment, target-based sentiment shows increases of 56 and 44 percentage points in F1, respectively. This is largely attributed to shorter, more precise segments.
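For readers less familiar with the metric, the following sketch shows how an F1 score is computed and how a percentage point change differs from a relative percent change. The function names and the numbers in the example are illustrative assumptions only; they are not our evaluation code or our measured results.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def percentage_point_change(old_f1: float, new_f1: float) -> float:
    """Absolute difference between two scores, expressed in percentage points."""
    return (new_f1 - old_f1) * 100


def relative_percent_change(old_f1: float, new_f1: float) -> float:
    """Change relative to the old score, expressed in percent."""
    return (new_f1 - old_f1) / old_f1 * 100


# Made-up scores for illustration: an F1 of 0.50 improving to 0.75.
old_f1, new_f1 = 0.50, 0.75
print(percentage_point_change(old_f1, new_f1))  # 25.0 percentage points
print(relative_percent_change(old_f1, new_f1))  # 50.0 percent relative
```

All improvements reported in this post are percentage point changes, that is, absolute differences between the two systems' F1 scores.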


When benchmarked on English and Spanish feedback responses, we saw a 9 percentage point increase in performance, averaged across the sample data. For neutral topic sentiment, the performance increases range from 16 to 30 points, and for mixed topic sentiment, from 14 to 25 points.

We are excited to announce that the new Discourse Segmentation model (for the English language) is now available to all brands, so they can reap the benefits of improved topic sentiment accuracy for feedback responses.


Kudos to the team: Kristjan Arumae, Yashmeet Gambhir, Susan Rios, Tony Gao, Ahmed Abdelmotaleb, Amy McNamara, Nikhil Kamath, Kevin Toledo, Zhengzheng Xing, Samir Joshi, Denise Diaz, Naomi Leong, Aaron Colak
