Natural Language Processing (NLP) has advanced to the point where it can recognize patterns in human speech, based on historical data, and respond to them. However, the same technology still struggles to ‘understand’ human language. There is now significant commercial interest in the field of Natural Language Understanding (NLU) because of its potential applications in automated reasoning, question answering, voice activation, text categorization, and large-scale content analysis.

Difference between NLU, NLG, and NLP

Natural language understanding (NLU) is the computer’s ability to comprehend the meaning and context of human language, which allows users to interact more naturally with the computer. It is considered an AI-hard problem. Natural language generation (NLG) is the task of generating written or spoken narrative from a set of data.

NLP is the wider field concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze natural language data.

NLU and NLG are subsets of NLP. NLU is one of the crucial processes to master on the way to complete NLP: the machine must be able to understand the language in order to process it and generate new text. NLU usually refers only to the computer’s ability to understand language. Anything beyond that, such as responding or making decisions based on the text, falls under the broader umbrella of NLP.

What is NLU

The first step in NLU is converting spoken natural language into text. Speech recognition systems are most commonly based on statistical models called Hidden Markov Models (HMMs). The HMM breaks human speech down into small units called phonemes. It then looks at the sequence of phonemes and statistically determines what was said, with the output in text.
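The statistical decoding step can be illustrated with a minimal sketch of the Viterbi algorithm, a standard way to find the most likely hidden-state sequence in an HMM. The article does not say which decoding method any particular system uses, so this is an assumption for illustration. The phoneme states, the acoustic labels ("burst", "vowel", "stop"), and all probabilities below are hypothetical toy values, not taken from a real recognizer.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy model: three phonemes and three coarse acoustic labels.
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {
    "k": {"k": 0.3, "ae": 0.6, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.3, "t": 0.6},
    "t": {"k": 0.1, "ae": 0.1, "t": 0.8},
}
emit_p = {
    "k": {"burst": 0.7, "vowel": 0.1, "stop": 0.2},
    "ae": {"burst": 0.1, "vowel": 0.8, "stop": 0.1},
    "t": {"burst": 0.3, "vowel": 0.1, "stop": 0.6},
}

phonemes = viterbi(["burst", "vowel", "stop"], states, start_p, trans_p, emit_p)
print(phonemes)  # ['k', 'ae', 't'], i.e. the word "cat"
```

A production recognizer works with far larger models and with log-probabilities to avoid numerical underflow; the sketch only shows the principle of choosing the statistically most likely phoneme sequence.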

Different NLP systems use various techniques to understand the words, and they typically have a lexicon and grammar rules built in to help. The ‘understanding’ part begins with a process called Part-of-Speech (POS) tagging, in which the algorithm tries to determine whether each word is a noun, verb, adjective, and so on. Based on the lexicon and the encoded grammar rules, the system then uses statistical machine learning to determine what was said.
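As a rough illustration of how a lexicon and a grammar rule can work together in POS tagging, here is a minimal sketch. The tiny lexicon, the tag names, and the single disambiguation rule are all hypothetical and chosen only for the example. Real taggers learn such preferences statistically from annotated text rather than relying on one hand-written rule.

```python
# Hypothetical toy lexicon: each word maps to the tags it may take.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "i": {"PRON"},
    "read": {"VERB"},
    "flight": {"NOUN"},
    "good": {"ADJ"},
    "book": {"NOUN", "VERB"},  # ambiguous: "a book" vs. "book a flight"
}


def tag(tokens):
    """Assign one POS tag per token using the lexicon plus one context rule."""
    tags = []
    for token in tokens:
        # Assumption for this sketch: unknown words default to NOUN.
        candidates = LEXICON.get(token.lower(), {"NOUN"})
        if len(candidates) == 1:
            choice = next(iter(candidates))
        else:
            previous = tags[-1] if tags else None
            # Toy grammar rule: after a determiner or adjective, expect a noun.
            if previous in {"DET", "ADJ"} and "NOUN" in candidates:
                choice = "NOUN"
            elif "VERB" in candidates:
                choice = "VERB"
            else:
                choice = sorted(candidates)[0]
        tags.append(choice)
    return list(zip(tokens, tags))


print(tag("Book a flight".split()))
print(tag("I read the book".split()))
```

In the first sentence "Book" is tagged as a verb, while in the second "book" follows a determiner and is tagged as a noun, which is the kind of decision the lexicon and grammar rules are meant to support.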

What makes NLU so complex

The emotional ambiguity and context-specific nature of human language make Natural Language Understanding (NLU) a particularly complex undertaking. When faced with new information, neural networks in machine learning systems use pattern recognition based on historical data and past examples to understand it. They apply to text the techniques that work well on spatial data. However, in human language, meaning is derived from context, relationships, and emotion, so it is harder for computers to understand using such methods. James Allen, Professor of Computer Science at the University of Rochester and an expert on NLP/NLU, is renowned for his studies on the overlap between NLU and reasoning. He writes, “While most of the NLP field has moved to statistical learning methods as the paradigm for language processing, I believe that deep language understanding can only currently be achieved by significant hand-engineering of semantically-rich formalisms coupled with statistical preferences.”

Challenges faced in NLU 

Several challenges are likely to come up during Automatic Speech Recognition (ASR) and POS tagging, for example when different words have similar meanings. All these variables must be encoded into the system to train it to respond accurately.

An NLU module often needs to map many different surface texts onto the same meaning, while the same surface expression can mean different things in different contexts.
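Both directions of this mapping can be sketched in a few lines. The intent names, the regular-expression patterns, and the context labels below are hypothetical and exist only for illustration. A real NLU module would normally use a trained model rather than a handful of hand-written patterns.

```python
import re

# Hypothetical patterns: many surface texts map onto one meaning.
INTENT_PATTERNS = {
    "get_weather": [
        r"\bweather\b",
        r"\bis it (raining|sunny)\b",
        r"\bdo i need an umbrella\b",
    ],
}

# Hypothetical readings: one surface text maps onto different meanings.
CONTEXT_READINGS = {
    "music": "increase_volume",
    "thermostat": "increase_temperature",
}


def interpret(utterance, context=None):
    """Return an intent label for the utterance, or "unknown"."""
    text = utterance.lower()
    for intent, patterns in INTENT_PATTERNS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return intent
    # "Turn it up" is ambiguous until the context says what "it" refers to.
    if re.search(r"\bturn it up\b", text):
        return CONTEXT_READINGS.get(context, "unknown")
    return "unknown"


print(interpret("What's the weather like?"))
print(interpret("Do I need an umbrella today?"))
print(interpret("Turn it up", context="music"))
print(interpret("Turn it up", context="thermostat"))
```

The first two calls yield the same intent despite different wording, while the last two yield different intents for identical wording, depending on the context passed in.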

For example, someone could say, “You know, you can’t have it all,” and the NLU module may need to find a specific interpretation for expressions like ‘it all’ within that context: does it mean a whole box of apples or something else altogether? Human language is extremely complex and nuanced, and it can be hard to find a general-purpose semantic representation that is practical for use in your specific domain. Therefore, NLU builders must often create domain-specific systems.

James Allen and his colleagues at the University of Rochester have worked on these problems. As they write, “Deep linguistic processing is essential for spoken dialogue systems designed to collaborate with users to perform collaborative tasks. As we develop the system, we are constantly balancing two competing needs – deep semantic accuracy and portability, the need to reuse our grammar, lexicon, and discourse interpretation processes across domains.”

For NLU to improve, experts suggest that AI should shift from simply converting natural language to data, for example through statistics-based logic, to using linguistic structure and principles to understand language.

NLU helps AI work better

NLU solutions need to be domain-specific and require substantial subject matter expertise to train the system. More products and services are now embedding NLU into their offerings, such as voice-driven assistants, natural language search, chatbots, sentiment analysis for trading, business intelligence, and social media analytics. As AI systems become more sophisticated, NLU is likely to play a much bigger role in improving human-computer interaction.