The Semantic Web has recently seen a rise in the availability and usage of knowledge bases, as can be observed within the Linking Open Data Initiative, the TONES and Protégé ontology repositories, or the Watson search engine. Despite this growth, there is still a lack of knowledge bases that consist of sophisticated schema information and instance data adhering to this schema. Several knowledge bases, e.g. in the life sciences, consist only of schema information, while others are, to a large extent, a collection of facts without a clear structure, e.g. information extracted from databases or texts. The combination of sophisticated schema and instance data would allow powerful reasoning, consistency checking, and improved querying possibilities. Schema enrichment allows the creation of a sophisticated schema based on existing data (sometimes referred to as a “grass roots” approach or “after the fact” schema creation).
Example 1. Consider a knowledge base containing a class Capital and instances of this class, e.g. London, Paris, Washington, Canberra, etc. A machine learning algorithm could then suggest that the class Capital may be equivalent to one of the following OWL class expressions in Manchester OWL syntax:
Both suggestions could be plausible: the first one is more general and includes cities that are capitals of states, whereas the latter is stricter and limits the instances to capitals of countries. A knowledge engineer can decide which one is more appropriate, i.e. a semi-automatic approach is used, and the machine learning algorithm should guide the user by pointing out which one fits the existing instances better.
If the knowledge engineer chooses the latter, an algorithm can show the user whether there are instances of the class Capital that are not instances of City or are not related via the property isCapitalOf to an instance of Country.
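The comparison described in Example 1 can be illustrated with a minimal Python sketch over a toy in-memory knowledge base. Everything in it is an assumption made for illustration: the individuals and their facts are invented, the class name Region is a hypothetical stand-in for the filler of the more general expression, and the standard F1 measure is used as the fitness score, which is not necessarily how ORE computes its confidence values. The stricter candidate follows the description in the text (City and isCapitalOf some Country).

```python
# Toy data, invented for illustration only (not taken from any real knowledge base).
# No reasoner is used, so every class membership is stated explicitly.
TYPES = {
    "London": {"Capital", "City"},
    "Paris": {"Capital", "City"},
    "Canberra": {"Capital", "City"},
    "Washington": {"Capital", "City"},   # its isCapitalOf link is missing below
    "Sacramento": {"City"},              # capital of a state, not typed as Capital
    "UK": {"Country", "Region"},
    "France": {"Country", "Region"},
    "Australia": {"Country", "Region"},
    "USA": {"Country", "Region"},
    "California": {"State", "Region"},
}

IS_CAPITAL_OF = {
    "London": {"UK"},
    "Paris": {"France"},
    "Canberra": {"Australia"},
    "Sacramento": {"California"},
}


def instances_of(cls):
    """Return all individuals explicitly typed with the given class."""
    return {ind for ind, classes in TYPES.items() if cls in classes}


def city_and_capital_of_some(filler_cls):
    """Individuals matching 'City and isCapitalOf some <filler_cls>'."""
    fillers = instances_of(filler_cls)
    return {
        ind
        for ind in instances_of("City")
        if IS_CAPITAL_OF.get(ind, set()) & fillers
    }


def f_score(retrieved, target):
    """Standard F1: harmonic mean of precision and recall."""
    true_positives = len(retrieved & target)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(target)
    return 2 * precision * recall / (precision + recall)


capitals = instances_of("Capital")
general = city_and_capital_of_some("Region")   # hypothetical general candidate
strict = city_and_capital_of_some("Country")   # stricter candidate from the text

for label, retrieved in [("general", general), ("strict", strict)]:
    uncovered = sorted(capitals - retrieved)   # Capital instances the candidate misses
    print(label, round(f_score(retrieved, capitals), 3), "uncovered:", uncovered)
```

On this toy data the stricter candidate scores higher, because the general one also retrieves Sacramento, which is not an instance of Capital. Both candidates leave Washington uncovered, which is the kind of instance a knowledge engineer would be asked to inspect.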
Fig. 7. Screenshot of the enrichment view for SPARQL knowledge bases.
Support in ORE
The enrichment view for SPARQL knowledge bases (see Fig. 7) can be subdivided into two main parts. The first part, on the left side (→1), allows the user to configure the enrichment process, i.e. to specify for which entity and for which axiom types ORE will search for schema axioms. The second part, on the right side (→2), shows the generated axiom suggestions together with their confidence scores, in one table per chosen axiom type. Additionally, more details about a confidence score can be obtained by clicking on the question mark symbol (?). This opens a new dialog, as shown in Fig. 8. The dialog gives a natural language explanation of the F-score, which depends on the axiom type. Moreover, positive and negative examples for the axiom (if any exist) are shown, giving more detailed insight into how well the axiom fits the data of the knowledge base.