Ontology Repair with PatOMat

PatOMat is a pattern-based ontology transformation framework designed specifically for OWL ontologies [23]. By applying transformations, it enables a designer to modify the structure of an ontology, or of its fragments, to make it more suitable for a target application. While it can adapt any aspect of an ontology (logical, structural, naming or annotation), within the context of the LOD2 project PatOMat focuses on the naming aspect.

During the decades of knowledge engineering research, there has been a recurrent dispute on how natural language structure influences the structure of formal knowledge bases and vice versa. A large part of the community seems to recognise that content expressed in formal representation languages, such as those of the semantic web, should be accessible not only to logical reasoning machines but also to humans and NLP procedures, and should thus resemble natural language as much as possible [17].

Often, an ontology naming practice can be captured as a naming pattern. For instance, it is quite common in ontologies that a subclass has the same head noun as its parent class (Non-Matching Child Pattern).[1] An earlier study [22] estimated that in ontologies for technical domains this simple pattern holds for 50–80 % of class-subclass pairs in which the subclass name consists of multiple tokens. This share increases further if one considers thesaurus correspondence (synonymy and hypernymy) rather than literal string equality. In fact, the set-theoretic nature of a taxonomic path entails that the correspondence of head nouns along the path should in principle be close to 100 %; the only completely innocent deviations should be those caused by incomplete thesauri. In other words, any violation of head noun correspondence may indicate a (smaller or greater) problem in the ontology. Prototypical situations are:

Inadequate use of the class-subclass relationship, typically in place of a whole-part or class-instance relationship, i.e., a conceptualisation error frequently occurring in novice ontologies.

Name shorthanding, typically manifested by the use of an adjective, such as “StateOwned” (subclass of “Company”).

While the former requires complex refactoring of the ontology fragment, the latter can be repaired by propagating the parent name down to the child name. In the biomedical field there have already been efforts in naming analysis, e.g., in [6, 19]; naming in the broad field of linked data vocabularies (where domain-specific heuristics cannot be applied), however, has rarely been addressed.
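The head noun correspondence discussed above can be illustrated with a minimal Python sketch. It is not part of PatOMat; the function names and the toy class-subclass pairs are hypothetical, and it assumes CamelCase class names, literal string equality (no thesaurus lookup), and that the head noun is the last token, which, as the footnote notes, does not always hold.

```python
import re


def tokens(name):
    """Split a CamelCase name into tokens, keeping acronyms together."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*|[a-z0-9]+", name)


def head_noun(name):
    # Simplifying assumption: the head noun is the last token.
    # This is wrong for prepositional names such as "HeadOfDepartment".
    return tokens(name)[-1].lower()


def head_noun_match_rate(pairs):
    """Share of (subclass, parent) pairs with matching head nouns.

    Only pairs whose subclass name has more than one token are counted,
    mirroring the restriction described in the text.
    """
    multi = [(c, p) for c, p in pairs if len(tokens(c)) > 1]
    if not multi:
        return None
    hits = sum(1 for c, p in multi if head_noun(c) == head_noun(p))
    return hits / len(multi)


# Hypothetical toy taxonomy given as (subclass, parent) pairs.
pairs = [
    ("PosterContribution", "Contribution"),  # head nouns match
    ("InvitedTalk", "Talk"),                 # head nouns match
    ("StateOwned", "Company"),               # name shorthanding
    ("Poster", "Contribution"),              # single token, not counted
]

rate = head_noun_match_rate(pairs)
print(f"{rate:.2f}")  # prints 0.67
```

A thesaurus-based variant would replace the equality test in `head_noun_match_rate` with a synonymy or hypernymy lookup, which is left out here.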

A pattern in the PatOMat framework, called a transformation pattern, consists of three parts: two ontology patterns (source OP and target OP) and the description of the transformation between them, called the pattern transformation (PT). A naming pattern, such as the non-matching child pattern, can be captured by specifying the violation of the naming pattern to be detected (i.e. the source OP) and its refactored variant (e.g. the non-matching child pattern as the target OP). Transformation patterns can be designed directly as XML files or by using a graphical editor. For general usage, the framework can be applied directly from code by importing the PatOMat Java library[2] or by using the Graphical User Interface for Pattern-based Ontology Transformation [21].

Naming issue detection and repair is supported by integrating the PatOMat framework into the ORE. The whole process consists of three consecutive steps, all of them visualized in a single view shown in Fig. 10. First, the user selects a naming pattern in the leftmost list (1). PatOMat then detects instances of the selected pattern in the currently loaded ontology, e.g. [?OP1_P = Contribution; ?OP1_A = Poster] (2). For the selected pattern instances, the user is provided with a list of renaming instructions (see 3), for example to rename the class Poster to PosterContribution, which can then be used to transform the ontology and resolve the detected naming issues.
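The three steps can be mimicked on a toy taxonomy with the following Python sketch. It does not use the PatOMat library or the ORE; every name in it (`PatternInstance`, `PATTERNS`, the helper functions and the example classes other than Poster and Contribution) is hypothetical. It assumes the ontology is reduced to a child-to-parent mapping of CamelCase class names and that repair simply appends the parent name to the child name.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternInstance:
    parent: str  # binding of the parent placeholder, e.g. Contribution
    child: str   # binding of the child placeholder, e.g. Poster


def head_noun(name):
    # Assumption: the head noun is the last CamelCase token.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*|[a-z0-9]+", name)
    return parts[-1].lower()


def detect_non_matching_children(subclass_of):
    """Step 2: report subclasses whose head noun differs from the parent's."""
    return [
        PatternInstance(parent=p, child=c)
        for c, p in subclass_of.items()
        if head_noun(c) != head_noun(p)
    ]


def renaming_instructions(instances):
    """Step 3a: propose propagating the parent name into the child name."""
    return [(i.child, i.child + i.parent) for i in instances]


def apply_renaming(subclass_of, instructions):
    """Step 3b: return a renamed copy of the taxonomy."""
    mapping = dict(instructions)
    return {mapping.get(c, c): mapping.get(p, p) for c, p in subclass_of.items()}


# Step 1: the available naming patterns; the user would pick one of them.
PATTERNS = {"non-matching child": detect_non_matching_children}

ontology = {
    "Poster": "Contribution",
    "FullPaperContribution": "Contribution",
}

detect = PATTERNS["non-matching child"]
instances = detect(ontology)
instructions = renaming_instructions(instances)
repaired = apply_renaming(ontology, instructions)
print(instructions)  # prints [('Poster', 'PosterContribution')]
```

In an interactive setting, the user would confirm or deselect individual instances between detection and renaming; the sketch applies all proposed instructions at once.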

Fig. 10. Screenshot of naming pattern detection and repair view in the ORE.

  • [1] The head noun is typically, but not always, the last token; exceptions arise in particular from prepositional constructions, as in “HeadOfDepartment”.
  • [2] owl.vse.cz:8080/