Detecting Customer Complaint Escalation with Recurrent Neural Networks and Manually-Engineered Features
By Cindy Zhang
This article is based on the presentation of Luchen Tan at the Second Annual AI² Forum where she explained how RSVP.ai, a startup that aims to build deep natural language understanding, developed an algorithm to detect customer complaint escalation in real time for a Chinese commerce website.
Problem Overview
Fundamentally, the goal of the algorithm is to identify customer concerns in online conversations between customers and customer service representatives that remain unresolved and are therefore likely to be escalated to government agencies. To do so, RSVP.ai employed a recurrent neural network to categorize the conversations and manually engineered features to fine-tune the algorithm.
Baseline Model
The model (fig. 1) concatenates recent dialogue between the customer and the agent and represents it through word embeddings. These word embeddings are then fed into a long short-term memory (LSTM) layer, which encodes the dialogue in a way relevant to the detection objective. Unlike basic feed-forward neural networks, LSTM networks are a special kind of RNN whose cells allow information to persist as the data vectors are processed. Finally, the encoded representation is passed to an attention mechanism that compares the processed states against all word embeddings to capture information from the original dialogue.
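The attention step above can be sketched in a few lines. The snippet below is a minimal illustration, not RSVP.ai's implementation: it assumes the LSTM has already produced one hidden-state vector per token, scores each timestep with a (normally learned) vector `w`, and pools the states into a single dialogue summary.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(hidden_states, w):
    """Weight each timestep's hidden state by a score and sum.

    hidden_states: (T, d) array of LSTM outputs, one row per token.
    w:             (d,) scoring vector (learned in a real model).
    """
    scores = hidden_states @ w          # one scalar score per timestep
    weights = softmax(scores)           # normalize into an attention distribution
    return weights @ hidden_states      # (d,) weighted summary of the dialogue

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))            # 5 tokens, 8-dim states (stand-in for LSTM output)
context = attention_pool(H, rng.normal(size=8))
print(context.shape)                   # (8,)
```

The pooled `context` vector is what the downstream classifier consumes in place of the full variable-length dialogue.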
This flat attention network serves as the baseline model for the experiment.
External Features
In addition to the encoded dialogue representation, external features that the RNN cannot capture on its own can be added. In the context of online conversations between customers and agents, these include emojis, special punctuation, sentence length, and customer-service-specific tokens such as the customer service phone number. Since these are not taken into account by the RNN but can help indicate whether an issue will be escalated, it is important to highlight them alongside the encoded dialogue representation.
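A feature extractor for these signals might look like the sketch below. The exact feature set and the service phone number are hypothetical; the talk only names the categories (emojis, special punctuation, sentence length, service-specific tokens).

```python
import re

SERVICE_PHONE = "400-123-4567"  # placeholder; the real number is not published

def external_features(utterance: str) -> dict:
    """Hand-engineered signals from one customer/agent utterance."""
    return {
        # crude emoji check over the main Unicode emoji blocks
        "has_emoji": bool(re.search(r"[\U0001F300-\U0001FAFF]", utterance)),
        # count both ASCII and full-width (Chinese) punctuation
        "exclamations": utterance.count("!") + utterance.count("！"),
        "question_marks": utterance.count("?") + utterance.count("？"),
        "length": len(utterance),
        "mentions_service_phone": SERVICE_PHONE in utterance,
    }

feats = external_features("Why is my refund still pending?! Call 400-123-4567")
```

Each feature becomes one extra dimension concatenated onto the model's input, letting the classifier use cues the embedding layer discards.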
Additionally, tf-idf vectors are extracted from the dialogue. Tf-idf (term frequency-inverse document frequency) weights how often each word appears in a dialogue against how common that word is across all dialogues. These vectors play a large role in many other NLP tasks, indicating their potential usefulness in this model. Since they capture different information than word embeddings, tf-idf vectors can be used as another feature to help categorize online dialogue.
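As a toy illustration of the weighting, here is a bare-bones tf-idf over a list of dialogues, assuming whitespace tokenization (a real Chinese-text pipeline would need a proper segmenter, and libraries such as scikit-learn provide production implementations):

```python
import math
from collections import Counter

def tfidf(dialogues):
    """Return one {word: tf-idf weight} dict per dialogue."""
    n = len(dialogues)
    docs = [Counter(d.split()) for d in dialogues]
    df = Counter(w for doc in docs for w in doc)   # how many dialogues contain each word
    return [
        {w: (c / sum(doc.values())) * math.log(n / df[w]) for w, c in doc.items()}
        for doc in docs
    ]

vecs = tfidf(["refund not received", "refund processed", "order not shipped"])
```

Words that appear in many dialogues ("refund", "not") get a low weight, while words distinctive to one conversation ("processed") score high.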
The combination of the encoded dialogue representation, external features, and tf-idf vectors makes up the full model framework.
Evaluating the Model
The different components of the model (fig. 2) are concatenated and fed into a softmax layer, and the algorithm outputs a prediction score.
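The combination step can be sketched as follows. The dimensions and weights here are illustrative stand-ins; the real model's sizes and trained parameters are not published.

```python
import numpy as np

def predict(encoded, external, tfidf_vec, W, b):
    """Concatenate all feature groups and score with a softmax layer."""
    x = np.concatenate([encoded, external, tfidf_vec])  # one flat feature vector
    logits = W @ x + b
    e = np.exp(logits - logits.max())
    return e / e.sum()          # [P(no escalation), P(escalation)]

rng = np.random.default_rng(1)
probs = predict(rng.normal(size=16),        # encoded dialogue (from LSTM + attention)
                rng.normal(size=5),         # external features
                rng.normal(size=10),        # tf-idf vector
                rng.normal(size=(2, 31)),   # softmax weights, 31 = 16 + 5 + 10
                np.zeros(2))                # bias
```

Because all three feature groups enter the same softmax layer, the classifier can trade off neural and hand-engineered evidence jointly at training time.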
The model was evaluated by comparing the number of detected customer complaints within a pool of K customers to the number of actual customer complaints. With the external features added, the full model outperformed all baseline models, including logistic regression and simple neural networks. Its performance also remained consistent over a week's time and on both offline and online data.
Lessons Learned
- Start with simple models that allow faster training and faster predictions.
- Don’t start from scratch or try to reinvent the wheel; reuse existing solutions whenever possible.
- If neural networks alone cannot capture enough signal, try adding manually engineered features.