Schema Enrichment


The Semantic Web has recently seen a rise in the availability and usage of knowledge bases, as can be observed within the Linking Open Data Initiative, the TONES and Protégé ontology repositories, or the Watson search engine. Despite this growth, there is still a lack of knowledge bases that combine sophisticated schema information with instance data adhering to this schema. Several knowledge bases, e.g. in the life sciences, consist only of schema information, while others are, to a large extent, a collection of facts without a clear structure, e.g. information extracted from databases or texts. The combination of sophisticated schema and instance data would allow powerful reasoning, consistency checking, and improved querying. Schema enrichment makes it possible to create a sophisticated schema from existing data (sometimes referred to as a “grass roots” approach or “after the fact” schema creation).

Example 1. Consider a knowledge base containing a class Capital and instances of this class, e.g. London, Paris, Washington, Canberra, etc. A machine learning algorithm could then suggest that the class Capital may be equivalent to one of the following OWL class expressions in Manchester OWL syntax[1]:

Both suggestions could be plausible: the first one is more general and includes cities that are capitals of states, whereas the second is stricter and limits the instances to capitals of countries. Since the approach is semi-automatic, a knowledge engineer decides which one is more appropriate, and the machine learning algorithm should guide this decision by pointing out which suggestion fits the existing instances better.
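How an algorithm might point out which suggestion fits the existing instances better can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by ORE: the toy facts are invented, the two candidates are assumed to have the form `City and isCapitalOf some X`, and the class name GeopoliticalRegion for the more general candidate as well as the helpers `members` and `fit` are hypothetical.

```python
# Toy facts; everything below is invented for illustration.
types = {
    "London": {"City", "Capital"},
    "Paris": {"City", "Capital"},
    "Canberra": {"City", "Capital"},
    "Sacramento": {"City"},  # capital of a state, not asserted as Capital
    "UnitedKingdom": {"Country", "GeopoliticalRegion"},
    "France": {"Country", "GeopoliticalRegion"},
    "Australia": {"Country", "GeopoliticalRegion"},
    "California": {"GeopoliticalRegion"},
}
is_capital_of = {
    "London": "UnitedKingdom",
    "Paris": "France",
    "Canberra": "Australia",
    "Sacramento": "California",
}


def members(filler):
    """Individuals satisfying the assumed form `City and isCapitalOf some <filler>`."""
    return {
        ind
        for ind, target in is_capital_of.items()
        if "City" in types[ind] and filler in types[target]
    }


def fit(filler, target_class="Capital"):
    """Count how a candidate agrees with the asserted instances of the class."""
    asserted = {ind for ind, ts in types.items() if target_class in ts}
    covered = members(filler)
    return {
        "covered": len(asserted & covered),  # asserted instances the candidate explains
        "missed": len(asserted - covered),   # asserted instances it fails to explain
        "extra": len(covered - asserted),    # individuals it would add to the class
    }


report = {filler: fit(filler) for filler in ("GeopoliticalRegion", "Country")}
for filler, counts in report.items():
    print(f"City and isCapitalOf some {filler}: {counts}")
```

On this toy data both candidates cover every asserted capital, but the more general one would also pull Sacramento into the class, so the stricter candidate agrees more closely with the existing instances.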

Assuming the knowledge engineer opts for the latter, an algorithm can show the user whether there are instances of the class Capital which are neither instances of City nor related via the property isCapitalOf to an instance of Country.[2] The knowledge engineer can then look at those instances and assign them to a different class as well as provide more complete information, thus improving the quality of the knowledge base. After the definition of Capital has been added, an OWL reasoner can compute further instances of the class which had not been explicitly assigned before.
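The two checks described here, finding asserted capitals that the data does not support and finding individuals that could be added to the class, can be sketched as follows. The facts and the helper `satisfies_definition` are hypothetical, and the check simply looks at recorded facts; an OWL reasoner works under the open world assumption, so a missing fact flags an individual for review rather than proving an error.

```python
# Hypothetical facts: class assertions and isCapitalOf links.
types = {
    "London": {"City", "Capital"},
    "Washington": {"City", "Capital"},  # no isCapitalOf fact recorded
    "Berlin": {"City"},                 # not asserted as Capital
    "UnitedKingdom": {"Country"},
    "Germany": {"Country"},
}
is_capital_of = {"London": {"UnitedKingdom"}, "Berlin": {"Germany"}}


def satisfies_definition(ind):
    """Check `City and isCapitalOf some Country` against the recorded facts only."""
    if "City" not in types.get(ind, set()):
        return False
    return any(
        "Country" in types.get(target, set())
        for target in is_capital_of.get(ind, set())
    )


asserted = {ind for ind, ts in types.items() if "Capital" in ts}
matching = {ind for ind in types if satisfies_definition(ind)}

# Asserted capitals that the recorded facts do not support: candidates for review.
to_review = sorted(asserted - matching)
# Individuals that match the definition but were never asserted as Capital.
inferable = sorted(matching - asserted)

print("review:", to_review)
print("inferable:", inferable)
```

In this toy example Washington is flagged because its isCapitalOf fact is missing, and Berlin is the kind of individual a reasoner could add to Capital once the definition is in place.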

Fig. 7. Screenshot of the enrichment view for SPARQL knowledge bases.

Support in ORE

The enrichment view for SPARQL knowledge bases (see Fig. 7) can be subdivided into two main parts. The first part, on the left side (1), is used to configure the enrichment process, e.g. to specify for which entity and for which axiom types ORE will search for schema axioms. The second part, on the right side (2), shows the generated axiom suggestions together with their confidence scores, in one table per chosen axiom type. Additionally, more details about a confidence score can be obtained by clicking on the question mark symbol (?). This opens a new dialog, as shown in Fig. 8. The dialog gives a natural-language explanation of the F-score, depending on the axiom type. Moreover, positive and negative examples for the axiom (if any exist) are shown, giving more detailed insight into how well the axiom fits the data of the knowledge base.
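The kind of information such a dialog presents can be illustrated with a small sketch. It assumes the common definition of the F-score as the harmonic mean of precision and recall, and it treats individuals that agree with a suggested class definition as positive examples and those that disagree as negative examples. How ORE actually computes the score for each axiom type is not described here; the function `explain_axiom` and its field names are hypothetical.

```python
def explain_axiom(expected, covered):
    """Summarise how well a suggested class definition fits the data.

    expected: individuals asserted to belong to the class being described
    covered:  individuals that satisfy the suggested definition
    """
    positives = expected & covered  # individuals that agree with the axiom
    negatives = (expected - covered) | (covered - expected)  # individuals that disagree
    precision = len(positives) / len(covered) if covered else 0.0
    recall = len(positives) / len(expected) if expected else 0.0
    if precision + recall == 0:
        f_score = 0.0
    else:
        f_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "positive_examples": sorted(positives),
        "negative_examples": sorted(negatives),
    }


# Invented example data for a suggested definition of Capital.
expected = {"London", "Paris", "Washington", "Canberra"}
covered = {"London", "Paris", "Canberra", "Berlin"}
details = explain_axiom(expected, covered)
print(details)
```

Listing the examples next to the score lets a user see which individuals lower the score, here one asserted capital that the definition misses and one additional individual that it covers.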

  • [1] For details on Manchester OWL syntax (e.g. used in Protégé, OntoWiki) see
  • [2] This is not an inconsistency under the standard OWL open world assumption, but rather a hint towards a potential modelling error.
