With advances in technologies such as cognitive computing and intelligent systems, NLP needs to satisfy the demands of spoken dialog systems and intelligent digital assistants. Such systems should converse with humans in natural language as fluently as a native speaker would. Natural language generation (NLG) technology strives to express information that is stored and modeled in software in a form suitable for communicating with humans.

NLG is the automatic generation of human language by a computer program to render a thought or stored data. Because the primary intention of such systems is to communicate a required piece of information, the core functionality is to decide what to convey and then to organize the theme by selecting suitable sentences and arranging them rhetorically so that they fit a grammar [3].

There are two main tasks of NLG. (i) The first is to decide and choose the content to be conveyed as text. This task is performed by a phase generally known as the document planner, which delivers the chunks of information required to structure the output text. (ii) The second is to realize the actual text from the output of the document planner as a coherent discourse. The primary function of the realizer is to determine the syntactic structure of the content to be generated and to present it in a grammatically and syntactically correct way.
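The two phases described above can be sketched as a tiny pipeline. This is only an illustrative sketch under stated assumptions: the `Message` class, the weather-style record, its field names, and the template-based realizer are all hypothetical and are not taken from the source or from any NLG library.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """One chunk of information selected by the document planner."""
    subject: str
    verb: str
    value: str


def plan_document(record: dict) -> list[Message]:
    """Document planner: decide what to say and in which order."""
    messages = []
    # Hypothetical content-selection rule: report only the fields that are present.
    if "temperature_c" in record:
        value = f"{record['temperature_c']} degrees Celsius"
        messages.append(Message("the temperature", "is", value))
    if "condition" in record:
        messages.append(Message("the sky", "is", record["condition"]))
    return messages


def realize(city: str, messages: list[Message]) -> str:
    """Realizer: turn the planned messages into one grammatical sentence."""
    if not messages:
        return ""
    clauses = [f"{m.subject} {m.verb} {m.value}" for m in messages]
    return f"In {city}, {' and '.join(clauses)}."


record = {"city": "Pune", "temperature_c": 31, "condition": "clear"}
print(realize(record["city"], plan_document(record)))
```

Keeping the planner and the realizer as separate functions mirrors the division of labor in the text: the planner can change what is reported without touching how sentences are formed.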

Cognitive approaches can be applied in solving several research issues in NLP. A few of them are listed as follows.

  • Word sense disambiguation
  • Coreference resolution
  • Discourse coherence
  • Semantic role labeling
  • Named entity recognition
  • Co-occurrence, that is, how words occurring together form a relationship
  • Paraphrase generation, that is, automatic paraphrase generation of a text

Paraphrase generation: rewording a passage to express its idea concisely


Original: A total of 3000 kg of solid waste has been collected from the world’s highest peak, Mt. Everest, since April 14, when Nepal launched an ambitious clean-up campaign aimed at bringing back tonnes of trash left behind by climbers.

Paraphrase: Since April 14, 3000 kg of solid waste left by climbers has been removed from Mt. Everest.

Text entailment: This can be thought of as capturing relationships of the form t ⇒ h, where t is some natural language text and h is some hypothesis, also expressed in natural language. The task is to decide whether h can be inferred from the given text t.

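Example:
Text: He is snoring.
Entailed text: He is sleeping.

Example:
Premise: A large grey elephant walked beside a herd of zebras.
Hypothesis: The elephant was lost.
Here the premise gives no evidence that the elephant was lost, so the hypothesis is not entailed.

Metonymy resolution: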

Metaphor: a verbal figurative expression

A metaphor describes a word or phrase in a different, symbolic way.

Example: "Life is a roller-coaster" means that life has ups and downs.

Metonymy: a figurative expression in which something is referred to by the name of a thing closely associated with it

Examples: Wall Street, Silicon Valley, White House, Hollywood, Silverfox

Metonymy resolution, text entailment, and paraphrase generation are complex research issues in NLP, where human common-sense reasoning is needed to extract the exact semantics. In the future, cognitive self-learning algorithms with a rich, context-aware knowledge base could be used to simulate human thought processes. Such cognitive systems can rely on deep learning or on statistical approaches, depending on the NLP issue being addressed, and can thereby advance the field.


The typical applications of NLP include the following.

  • Question answering: A system that automatically answers questions posed to it. It is a form of information retrieval system.
  • Machine translation: One of the oldest and most useful applications of NLP. It automatically translates text from one language to another by considering the syntax, semantics, etc., of both languages.
  • Text summarization: It takes a document or piece of text as input and generates a compressed form that retains the essential content without changing the meaning.
  • Optical character recognition: Extracts the text from an image that contains text.
  • Text similarity and clustering of documents: Finding similar texts helps in building relationships quickly. Consider the pairs (Man, Woman) and (Boy, Girl); the words in each pair are not the same, but they have some similarities. These types of relationships are identified by measuring the text similarity between two words.

Levenshtein distance: The similarity of two strings is calculated by counting the minimum number of editing operations (insertion, deletion, substitution) required to convert one string into the other. The Levenshtein distance of the pair (hook, hack) is 2: the first “o” is replaced with “a” and the second “o” is replaced with “c” to get the string “hack.” Two substitution operations are required to convert “hook” to “hack,” so the Levenshtein distance is 2.
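The edit-distance computation described above can be sketched with the standard dynamic-programming approach. The function name `levenshtein` is chosen here for illustration; this is a minimal sketch that assumes all three operations cost 1, not a tuned implementation.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `source` into `target` (each operation costs 1)."""
    # previous[j] is the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("hook", "hack"))  # 2, matching the worked example
```

Only two rows of the table are kept at a time, so memory grows with the length of `target` rather than with the product of both lengths.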

Cosine similarity: After the texts are converted to vectors, their similarity can be measured as the cosine of the angle between the two vectors.
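As a rough illustration, the sketch below turns each text into a bag-of-words count vector and computes cos(θ) = (A · B) / (|A| |B|). The whitespace tokenization and raw word counts are assumptions made here for brevity; the source does not say how texts are vectorized, and real systems often use TF-IDF weights or learned embeddings instead.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts using word-count vectors."""
    # Assumption: lowercase whitespace tokens are the vector dimensions.
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())

    # Only words present in both texts contribute to the dot product.
    shared = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[word] * vec_b[word] for word in shared)

    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


print(cosine_similarity("the cat sat on the mat", "the cat lay on the rug"))
```

With non-negative counts the score lies between 0 (no words in common) and 1 (identical word distributions).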

Phonetic similarity: Voice-to-text converter applications use phonetic matching. They try to find a word in the dictionary that is phonetically similar to the input.
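The source does not name a specific phonetic algorithm. As one possible illustration, the sketch below uses the classic American Soundex code, which maps similar-sounding English words to the same four-character key. The helper `phonetic_matches` and the sample word list are hypothetical, and the code assumes English words in the basic Latin alphabet.

```python
# Soundex digit for each consonant; vowels, h, w, and y have no digit.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Four-character American Soundex key, e.g. 'Robert' -> 'R163'."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    last_code = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != last_code:
            key += code
        last_code = code  # a vowel resets this, so a repeated code counts again
    return (key + "000")[:4]  # pad with zeros, then truncate to 4 characters


def phonetic_matches(word: str, dictionary: list[str]) -> list[str]:
    """Dictionary words that share the Soundex key of `word`."""
    key = soundex(word)
    return [entry for entry in dictionary if soundex(entry) == key]


print(phonetic_matches("Smyth", ["Smith", "Schmidt", "Jones"]))
```

Soundex is coarse and English-specific; it is used here only because it is short enough to show the idea of matching by sound rather than by spelling.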
