“OK Google, play the Rolling Stones on Spotify.” “Alexa, what is the weather like in Paris today?” “Siri, who is the French president?”
If you have ever used a voice assistant, you have indirectly used some Natural Language Understanding (NLU) processes. The same logic applies to chatbot assistants and the automated routing of tickets in customer service. NLU has been part of our everyday life for some time now, and that is not about to change.
By automating the extraction of customer intent, for example, NLU can help us answer our clients’ requests faster and more accurately. That is why every large company has embarked on the development of its own solution. Yet, with all the libraries and models existing in the NLU field, all claiming state-of-the-art performance or easy-to-obtain results, it is sometimes complicated to find one’s way around. Having experimented with various libraries in our NLU projects at Artefact, we wanted to share our results and help you get a better understanding of the current tools in NLU.
What is NLU?
Natural Language Understanding (NLU) is defined by Gartner as “the comprehension by computers of the structure and meaning of human language (e.g., English, Spanish, Japanese), allowing users to interact with the computer using natural sentences”. In other words, NLU is a subdomain of artificial intelligence that enables the interpretation of text by analysing it, converting it into computer language and producing an output in an understandable form for humans.
If you look closely at how chatbots and virtual assistants work, from your request to their answer, NLU is the layer that extracts your main intent and any information important to the machine, so that it can best answer your request. Say you call your favourite brand’s customer service to find out whether your dream bag is finally available in your city: NLU tells the assistant that you have a product availability request, so it can look up that particular item in the product database and check whether it is available at your desired location. Thanks to NLU, we have extracted an intent, a product name and a location.
(Above: Illustration of a customer intent and several entities that are extracted from a conversation)
Natural language pervades most companies’ data and, with the recent breakthroughs in this field, the democratisation of NLU algorithms, and access to more computing power and more data, a lot of NLU projects have been launched. Let’s look at one of them.
A typical project using NLU is, as mentioned before, helping call centre advisors answer customers’ requests more easily as the conversation goes. This requires us to perform two different tasks:
- Understand the customer’s intent during the call (i.e. text classification)
- Catch the important elements that would make it possible to answer the customer’s request (i.e. named-entity recognition), for example contract numbers, product type, product color, etc.
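The two tasks above can be sketched with a deliberately simple toy example. This is not a real NLU model (the frameworks benchmarked later learn these mappings from data); the intent labels, keyword lists and regex patterns below are hypothetical, chosen only to make the bag-availability example from earlier concrete.

```python
import re

# Hypothetical intent labels and keywords, for illustration only.
INTENT_KEYWORDS = {
    "product_availability": ["available", "in stock", "availability"],
    "order_status": ["order", "shipped", "tracking"],
}

# Hypothetical entity patterns, for illustration only.
ENTITY_PATTERNS = {
    "product": re.compile(r"\b(bag|shoes|jacket)\b", re.IGNORECASE),
    "location": re.compile(r"\bin ([A-Z][a-z]+)\b"),
}

def classify_intent(utterance: str) -> str:
    """Task 1 - text classification: map the whole utterance to one intent."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "unknown"

def extract_entities(utterance: str) -> dict:
    """Task 2 - named-entity recognition: pull out the slots needed to answer."""
    entities = {}
    for label, pattern in ENTITY_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            entities[label] = match.group(1)
    return entities

utterance = "Is the leather bag available in Paris?"
print(classify_intent(utterance))   # product_availability
print(extract_entities(utterance))  # {'product': 'bag', 'location': 'Paris'}
```

In a real project, both functions would be replaced by trained models; the point here is only that the two tasks take the same input (the customer’s utterance) but produce different outputs: a single label versus a set of typed spans.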
When we first looked at the simple, off-the-shelf solutions released for both of these tasks, we found more than a dozen frameworks, some developed by the GAFAM, some by open-source contributors. It was impossible to know which one to choose for our use case, or how each of them would perform on a concrete project with real data, in our case call centre audio conversations transcribed into text. That is why we decided to share our performance benchmark, with some tips as well as pros and cons for each solution we tested.
It is important to note that this benchmark was done with English data and transcribed speech, and is therefore less reliable as a reference for other languages or for applications working directly on written text, e.g. chatbot use cases.