In Linked Data, the publishing phase follows the identification of data sources and the modelling phase. It covers describing the data as RDF and storing and serving that data. Different vendors have created a variety of tools, with differing feature sets, to assist with the different aspects of this phase. Depending on the needs of each business case and the nature of the original enterprise data, shorter publishing patterns can be created.

Publishing Pattern for Relational Data

Relational databases (RDB) are the core asset in the current state of the art of data management and will remain a prevalent source of data in enterprises. The research community[1] [2] has therefore focused on developing approaches and techniques for mapping RDB data to RDF. These approaches will enable businesses to:

Integrate their RDB with another structured source (RDB, XLS, CSV, etc.) or unstructured source (HTML, PDF, etc.): they must convert the RDB to RDF and assume that any other structured (or unstructured) source can also be expressed in RDF.

Integrate their RDB with existing RDF on the web (Linked Data): they must convert the RDB to RDF and can then link and integrate.

Make their RDB data available for SPARQL or other RDF-based querying, and/or for others to integrate with other data sources (structured, RDF, unstructured).

Two key points should be taken into consideration and addressed within the enterprise (see Fig. 8):

Definition of the Mapping Language from RDB to RDF

Fig. 8. RDB2RDF publishing pattern

Automatic mappings provided by tools such as D2R[3] and Virtuoso RDF Views provide a good starting point, especially when there is no existing domain ontology to map the relational schema to. In most cases, however, the mappings must be defined manually so that users can declare domain semantics in the mapping configuration and take advantage of the integration and linking facilities of Linked Data. R2RML [4], a W3C Recommendation language for expressing such customized mappings, is supported by several tools including Virtuoso RDF Views and D2R.
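The difference between an automatic and a customized mapping can be illustrated with a minimal Python sketch. It is not R2RML and not the algorithm used by D2R or Virtuoso; it only mimics the general idea of deriving one subject per row and one predicate per column, and then overriding a generated predicate with a domain term. The base IRI `http://example.com/`, the `employee` table, and the `map_table` helper are assumptions made for illustration.

```python
import sqlite3

BASE = "http://example.com/"  # hypothetical base IRI, chosen for this sketch
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape(value):
    """Escape a value for use inside a double-quoted N-Triples literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def map_table(conn, table, pk, overrides=None):
    """Return N-Triples lines for every row of `table`.

    Without `overrides`, predicates are generated from column names
    (the "automatic" case). `overrides` maps a column name to a
    predicate IRI from a domain vocabulary (the "customized" case).
    `table` is interpolated into SQL, so it must be a trusted name.
    """
    overrides = overrides or {}
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[pk]}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{BASE}{table}> .")
        for col, value in record.items():
            if value is None:  # NULL columns produce no triple
                continue
            predicate = overrides.get(col, f"{BASE}{table}#{col}")
            triples.append(f'{subject} <{predicate}> "{escape(value)}" .')
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', NULL)")

automatic = map_table(conn, "employee", "id")
customized = map_table(conn, "employee", "id",
                       {"name": "http://xmlns.com/foaf/0.1/name"})
for line in customized:
    print(line)
```

In the customized run the `name` column is published with the FOAF `name` property instead of a schema-derived IRI, which is the kind of domain semantics that a manually defined mapping is meant to declare.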

Materializing the Data

A common feature of RDB2RDF tools is the ability to create a “semantic view” of the contents of the relational database. In this case, the tool exposes an RDF version of the database through a SPARQL endpoint and a Linked Data interface that work directly on top of the source relational database, creating a virtual “view” of the database. Such a “semantic view” guarantees up-to-date access to the source business data, which is particularly important when the data is frequently updated. In contrast, generating and storing RDF requires synchronization whenever the source data model, the target RDF model, or the mapping logic between them changes. However, if business decisions and planning require running complicated graph queries, maintaining a separate RDF store becomes more competitive and should be considered.
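The trade-off between a virtual view and a stored RDF copy can be seen in a small Python sketch. It assumes a hypothetical `product` table and a hypothetical `triples_for` helper, and it stands in for the far more capable query rewriting that real RDB2RDF tools perform: triples are either computed from the database at request time or computed once and kept.

```python
import sqlite3


def triples_for(conn):
    """Translate the current rows into (subject, predicate, object) tuples.

    Calling this at request time plays the role of the virtual view;
    keeping its result around plays the role of a materialized copy.
    """
    rows = conn.execute("SELECT id, price FROM product ORDER BY id")
    return [
        (f"http://example.com/product/{pid}",      # hypothetical IRIs
         "http://example.com/product#price",
         str(price))
        for pid, price in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 9.5)")

# Materialized: a snapshot taken once and stored separately.
materialized = triples_for(conn)

# The business data changes after the snapshot was taken.
conn.execute("UPDATE product SET price = 12.0 WHERE id = 1")

# Virtual: recomputed on demand, so it reflects the update.
virtual = triples_for(conn)

# The stored copy no longer matches the source until it is re-synchronized.
stale = materialized != virtual
print(stale)
```

The stored copy can be queried without touching the relational database, which is what makes it attractive for heavy graph queries, but it has to be regenerated before it reflects the change.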

  • [1]
  • [2]
  • [3]
  • [4]