Graph-based word sense disambiguation
Graph-based WSD algorithms try to identify the actual sense of a word in a determined phrase, exploiting the information derived from its context. They are gaining much attention in the NLP community. This is because graph theory is a powerful tool that can be employed for different purposes, from the organization of the contextual information to the computation of the relations among word senses. These kind of approaches construct a graph collecting all the possible senses of the words in a text and represent them as nodes. Then, using connectivity measures can identify the most relevant word senses in the graph [SIN 07, NAV 07a].
Navigli and Lapata [NAV 07a] conducted an extensive analysis of graph connectivity measures for unsupervised WSD. Their approach uses a knowledge base, such as WordNet, to collect and organize all the possible senses of the words to be disambiguated in a graph structure, then uses the knowledge base to search for a path between each pair of senses in the graph and if it exists, it adds all the nodes and edges on this path to the graph. These measures analyze local and global properties of the graph. The results of the study indicate that local measures outperform global measure and, in particular, degree centrality and PageRank [PAG 99] achieve the best results.
PageRank [PAG 99] is one of the most popular algorithms for WSD; in fact, it has been implemented in several different ways by the research community [MIH 04, HAV 02, AGI 14, DEC 10]. It represents the senses of the words in a text as nodes of a graph. It uses a knowledge base to collect the senses of the words in a text and represents them as nodes of a graph. The structure of the knowledge base is used to connect each node with its related senses in a directed graph. The main idea of this algorithm is that whenever a link from one node to another node exists, a vote is produced, increasing the rank of the voted node. It works by counting the number and quality of links to a node to determine an estimation of how important the node is in the network. The underlying assumption is that more important nodes are likely to receive more links from other nodes [PAG 99]. Exploiting this idea, the ranking of the nodes in the graph can be computed iteratively with the following equation:
where M is the transition matrix of the graph, v is a N x 1 vector representing a probability distribution and c is the so-called damping factor that represents the chance that the process stops, restarting from a random node. At the end of the process, each word is associated with the most important concept related to it. One problem of this framework is that the labeling process is not assumed to be consistent.
An algorithm, which tries to improve centrality algorithms, is SUDOKU, introduced by Manion and Sainudiin [MAN 14]. It is an iterative approach that simultaneously constructs the graph and disambiguates the words using a centrality function. It starts inserting in the graph the nodes corresponding to the senses of the words with low polysemy. The advantages of this method are that it uses small graphs at the beginning of the process, reducing the complexity of the problem; furthermore, it can be used with different centrality measures.
Recently, a model based on an undirected graphical model has been introduced by [CHA 15]. This method interprets the WSD problem as a maximum a posteriori query on a Markov random field [JOR 02]. The graph is constructed using the content words of a sentence as nodes and connecting them with edges if they share a relation, determined using a dependency parser. The values that each node in the graphical model can take include the senses of the corresponding word. The senses are collected using a knowledge base and weighted using a probability distribution based on the frequency of the senses in the knowledge base. Furthermore, the senses between two related words are weighted using a similarity measure. The goal of this approach is to maximize the joint probability of the senses of all the words in the sentence, given the dependency structure of the sentence, the frequency of the senses and the similarity among them.
A new graph-based, semi-supervised approach, introduced to deal with multilingual WSD [NAV 12b] and entity linking problems, is Babelfy [MOR 14]. Multilingual WSD is an important task because traditional WSD algorithms and resources are mainly focused on English language. It exploits the information from large multilingual knowledge bases, such as BabelNet [NAV 12a] to perform this task. Babelfy creates the semantic signature of each word to be disambiguated, that consists of collecting, from a semantic network, all the nodes related to a particular concept, exploiting the global structure of the network. This process leads to the construction of a graph- based representation of the whole text. Then, it applies Random Walk with
Restart [TON 06] to find the most important nodes in the network, solving the WSD problem.
Approaches that are more similar to ours in the formulation of the problem have been described by Araujo [ARA 07] and a specific evolutionary approach to WSD has been introduced by Menai [MEN 14]. It uses genetic algorithms [HOL 75] and memetic algorithms [MOS 89] to improve the performances of a gloss-based method. It assumes that there is a population of individuals, represented by all the senses of the words to be disambiguated and that there is a selection process that chooses the best candidates in the population. The selection process is defined as a sense similarity function that gives a higher score to candidates with specific features, increasing their fitness. This process is repeated until the fitness level of the population regularizes and, at the end, the candidates with higher fitness are selected as solutions of the problem.