Openness and interoperability with the web of data
The adoption of the W3C Semantic Web standards (RDF/RDFS, OWL, SPARQL) facilitates interoperability between the data produced locally by the graph editor and external knowledge bases, particularly those that are open on the Web, such as DBpedia, GeoNames, VIAF and MusicBrainz. This interoperability allows:
- the identification of locally created individuals by internationally recognized identifiers, such as the ISWC (International Standard Musical Work Code), a unique identification code assigned to musical works;
- a degree of quality assurance for the knowledge base, obtained by comparing local entities with entities identified on the Web and highlighting potential inconsistencies between them;
- the complementing of semantic searches on the local knowledge base with queries directed to the web of data, in order to compensate for incomplete data;
- the enrichment of local entities with external information, which is currently used for data publication;
- the annotation of audiovisual content with individuals or concepts available on the web of data;
- support for annotators in their documentation task, by suggesting rich, trustworthy information from web knowledge bases.
Access to these bases can be achieved via:
- SPARQL queries, if the knowledge base exposes a “SPARQL endpoint” on the Web;
- HTTP queries, if the knowledge base is reachable through a REST API;
- access to a local installation of the knowledge base, if a dump of that base is available on the Web and has been installed on a server connected to the company’s intranet or in the cloud.
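As an illustration of the first access mode, the following minimal Python sketch prepares a query for a SPARQL endpoint and flattens a response in the standard SPARQL JSON results format. It is not the system's actual code: the function names, the example query and the choice of DBpedia's public endpoint are assumptions made for the example, and the request is only built, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: DBpedia's public endpoint is used purely as an example target.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Hypothetical query: look up a person by English label and fetch a birth date.
EXAMPLE_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>
SELECT ?person ?birth WHERE {
  ?person rdfs:label "Claude Debussy"@en ;
          dbo:birthDate ?birth .
} LIMIT 5
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    params = urlencode({"query": query})
    return Request(
        f"{endpoint}?{params}",
        # Ask for the SPARQL JSON results format.
        headers={"Accept": "application/sparql-results+json"},
    )


def parse_bindings(payload: str) -> list:
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]
```

The request returned by `build_sparql_request` can be passed to `urllib.request.urlopen`, and the decoded response body to `parse_bindings`. A knowledge base exposed through a REST API or installed locally from a dump would only change the endpoint and the way the request is built.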
Entities coming from different sources can designate the same real-world entity. Such matches are detected by recursively comparing their key properties: a property, or combination of properties, that identifies a given entity almost with certainty. For example, the combination of first name, last name, date of birth and date of death is accepted as a reliable key for identifying a natural person. Matching entities can have identical, complementary or even contradictory properties. A data fusion task recursively merges the properties of these matching entities into a single entity. Various merging criteria can be applied: the precision of property values (a complete day/month/year date vs. the year alone), the number of occurrences of each value (a voting system), the quality of the knowledge bases concerned (reliability of the data), etc. This allows a compact and simplified view of the data to be presented to the user.
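The matching and fusion steps can be sketched in Python as follows. This is a simplified, non-recursive illustration under stated assumptions, not the implementation described here: the property names, the reliability weights, the reduction of dates to their year when building the key, and the use of string length as a proxy for date precision are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical key for a person, following the example given in the text.
KEY_PROPERTIES = ("first_name", "last_name", "birth_date", "death_date")

# Hypothetical reliability weights per knowledge base (higher is more trusted).
SOURCE_RELIABILITY = {"local": 3, "viaf": 2, "dbpedia": 1}


def identity_key(props: dict) -> tuple:
    """Build a comparison key; dates are reduced to their year (an assumption)
    so that "1862" and "1862-08-22" are treated as the same birth date."""
    first, last, birth, death = (props.get(name, "") for name in KEY_PROPERTIES)
    return (first.casefold(), last.casefold(), birth[:4], death[:4])


def group_by_identity(records: list) -> list:
    """Group (source, properties) records that share the same identity key."""
    groups = defaultdict(list)
    for source, props in records:
        groups[identity_key(props)].append((source, props))
    return list(groups.values())


def fuse(group: list) -> dict:
    """Merge one group of matching records into a single property dict.

    For each property, the retained value is the one with the best score:
    precision first (dates only), then number of votes, then source reliability.
    """
    fused = {}
    names = sorted({name for _, props in group for name in props})
    for name in names:
        candidates = [(props[name], src) for src, props in group if name in props]
        votes = Counter(value for value, _ in candidates)

        def score(candidate):
            value, src = candidate
            precision = len(value) if name.endswith("_date") else 0
            return (precision, votes[value], SOURCE_RELIABILITY.get(src, 0))

        fused[name] = max(candidates, key=score)[0]
    return fused


# Illustrative records, as they might be retrieved from two external bases.
SAMPLE_RECORDS = [
    ("dbpedia", {"first_name": "Claude", "last_name": "Debussy",
                 "birth_date": "1862-08-22", "death_date": "1918-03-25",
                 "occupation": "composer"}),
    ("viaf", {"first_name": "Claude", "last_name": "Debussy",
              "birth_date": "1862", "death_date": "1918"}),
    ("local", {"first_name": "Maurice", "last_name": "Ravel",
               "birth_date": "1875-03-07", "death_date": "1937-12-28"}),
]
```

Calling `group_by_identity(SAMPLE_RECORDS)` places the two Debussy records in one group, and `fuse` applied to that group keeps the complete dates and adds the complementary `occupation` property. A recursive version would apply the same comparison to property values that are themselves entities.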