Integrating Heterogeneous Tools into the LOD2 Stack
The LOD2 Stack serves two main purposes. First, it aims to ease the distribution and installation of tools and software components that support the Linked Data publication cycle. As a distribution platform, we have chosen the well-established Debian packaging format. Second, it aims to smooth the information flow between the different components in order to enhance the end-user experience through a more harmonized look-and-feel.
Deployment Management Leveraging Debian Packaging
In the Debian package management system, software is distributed in architecture-specific binary packages and architecture-independent source code packages. A Debian software package comprises two types of content: (1) control information (incl. metadata) of that package, and (2) the software itself.
The control information of a Debian package is indexed and merged with the control information of all other packages available for the system. This information consists of descriptions and attributes for:
(a) The software itself (e.g. licenses, repository links, name, tagline, ... ),
(b) Its relation to other packages (dependencies and recommendations),
(c) The authors of the software (name, email, home pages), and
(d) The deployment process (where to install, pre- and post-install instructions).
The most important part of this control information is the package's relations to other software. These relations allow a complete stack of software to be deployed with one action. The following dependency relations are commonly used in the control information:
Depends: This declares an absolute dependency. A package will not be configured unless all of the packages listed in its Depends field have been correctly configured. The Depends field should be used if the depended-on package is required for the depending package to provide a significant amount of functionality. The Depends field should also be used if the install instructions require the package to be present in order to run.
Recommends: This declares a strong, but not absolute, dependency. The Recommends field should list packages that would be found together with this one in all but unusual installations.
Suggests: This is used to declare that one package may be more useful with one or more others. Using this field tells the packaging system and the user that the listed packages are related to this one and can perhaps enhance its usefulness, but that installing this one without them is perfectly reasonable.
Enhances: This field is similar to Suggests but works in the opposite direction. It is used to declare that a package can enhance the functionality of another package.
Conflicts: When one binary package declares a conflict with another using a Conflicts field, dpkg will refuse to allow them to be installed on the system at the same time. If one package is to be installed, the other must be removed first.
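To make the role of these fields more concrete, the following minimal sketch reads a control stanza and collects its relation fields. The stanza, including every package name in it, is hypothetical and only serves as an illustration; it is not taken from the actual LOD2 packages, and the parser ignores details such as folded (multi-line) fields.

```python
# Hypothetical control stanza for illustration only; the package names
# and values are assumptions, not real LOD2 packaging metadata.
STANZA = """\
Package: example-tool
Depends: libexample (>= 1.0), example-backend
Recommends: example-docs
Suggests: example-extras
Conflicts: example-tool-legacy
"""

# The relation fields discussed in the text.
RELATION_FIELDS = ("Depends", "Recommends", "Suggests", "Enhances", "Conflicts")


def parse_stanza(text):
    """Return a dict mapping field names to their raw string values.

    Simplification: every field is assumed to fit on a single line.
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def relations(fields):
    """Split each relation field that is present into its comma-separated entries."""
    return {
        name: [entry.strip() for entry in fields[name].split(",")]
        for name in RELATION_FIELDS
        if name in fields
    }


parsed = relations(parse_stanza(STANZA))
for name, entries in parsed.items():
    print(name, entries)
```

Fields that are absent from the stanza (here Enhances) simply do not appear in the result.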
Fig. 4. Different RDF serializations of three triples from Fig. 3
Fig. 5. Example DEB-package dependency tree (OntoWiki). Boxes are part of the LOD2 Stack, ellipses are part of the Debian/Ubuntu base system, and dashed shapes are meta-packages. Relations: Depends (D), Depends alternative list (A), Conflicts (C) and Suggests (S).
All of these relations may restrict their applicability to particular versions of each named package (the relations allowed are <<, <=, =, >= and >>). This is useful for forcing the upgrade of a complete software stack. In addition, a dependency relation can be set to a list of alternative packages. In such a case, if any one of the alternative packages is installed, that part of the dependency is considered to be satisfied. This is useful if the software depends on a specific functionality on the system instead of a concrete package (e.g. a mail server or a web server). Another use case for alternative lists is meta-packages. A meta-package is a package which does not contain any files or data to be installed. Instead, it has dependencies on other (lists of) packages.
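The interplay of version restrictions and alternative lists can be sketched as follows. This is a simplified illustration and not how dpkg or APT are implemented: in particular, versions are compared as dotted integers only, which is an assumption that ignores epochs, revisions and other parts of real Debian version strings. The package names and version numbers in the usage lines are likewise illustrative.

```python
import operator
import re

# The five version relations mapped to comparison functions.
OPS = {
    "<<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">>": operator.gt,
}

# One entry: a package name, optionally followed by "(relation version)".
ENTRY = re.compile(
    r"^\s*([a-z0-9][a-z0-9+.-]*)\s*"
    r"(?:\(\s*(<<|<=|=|>=|>>)\s*([^)\s]+)\s*\))?\s*$"
)


def version_key(version):
    """Assumption: versions are plain dotted integers such as '5.3.10'."""
    return tuple(int(part) for part in version.split("."))


def entry_satisfied(entry, installed):
    """Check a single 'name (relation version)' entry against installed packages."""
    match = ENTRY.match(entry)
    if match is None:
        raise ValueError(f"cannot parse dependency entry: {entry!r}")
    name, relation, version = match.groups()
    if name not in installed:
        return False
    if relation is None:
        return True
    return OPS[relation](version_key(installed[name]), version_key(version))


def field_satisfied(field, installed):
    """Each comma-separated group needs at least one satisfied alternative."""
    return all(
        any(entry_satisfied(alt, installed) for alt in group.split("|"))
        for group in field.split(",")
    )


# Illustrative installed set: package name -> version.
installed = {"apache2": "2.2.22", "php5": "5.3.10"}
print(field_satisfied("php5 (>= 5.3), apache2 | lighttpd", installed))
print(field_satisfied("php5 (>> 5.4)", installed))
print(field_satisfied("exim4 | postfix", installed))
```

In this sketch the first field is satisfied because one alternative of each group is installed in a suitable version, whereas the other two fail on the version restriction and on the missing alternatives, respectively.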
Example of meta-packaging: OntoWiki.
To build an appropriate package structure, the first step is to inspect the manual deployment of the software, its variants and the dependencies of these variants. OntoWiki is a browser-based collaboration and exploration tool as well as an application for linked data publication. There are two clusters of dependencies: the runtime environment and the backend. Since OntoWiki is developed in the scripting language PHP, it is architecture-independent but needs a web server running PHP. More specifically, OntoWiki needs PHP5 running as an Apache 2 module. OntoWiki currently supports two different backends which can be used to store and query RDF data: Virtuoso and MySQL. Virtuoso is also part of the LOD2 Stack, while MySQL is a standard package in all Debian-based systems. In addition to OntoWiki, the user can use the OntoWiki command line client owcli and the DL-Learner from the LOD2 Stack to enhance its functionality.
Fig. 6. Basic architecture of a local LOD2 Stack.
The dependency tree (depicted in Fig. 5) is far from being complete, since every component also depends on libraries and additional software which are omitted here. Given this background information, we can start to plan the packaging. We assume that users either use MySQL or Virtuoso as a backend on a server, so the first decision is to split this functionality into two packages: ontowiki-mysql and ontowiki-virtuoso. These two packages are abstracted by the meta-package ontowiki, which requires either ontowiki-mysql or ontowiki-virtuoso, and which can be used by other LOD2 Stack packages to require OntoWiki. Since both the MySQL backend and the Virtuoso backend version use the same system resources, we need to declare them as conflicting packages.
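The resulting package structure can be illustrated with a small consistency check. The sketch below models only the relations named in the text (the alternative dependency of the meta-package and the mutual conflict of the two backend packages); the real control files carry further fields and dependencies that are left out here, and the helper names split_field and problems are hypothetical.

```python
# Only the relations described in the text are modelled; further
# dependencies (web server, PHP, database packages) are omitted.
PACKAGES = {
    "ontowiki": {"Depends": "ontowiki-mysql | ontowiki-virtuoso"},
    "ontowiki-mysql": {"Conflicts": "ontowiki-virtuoso"},
    "ontowiki-virtuoso": {"Conflicts": "ontowiki-mysql"},
}


def split_field(value):
    """Split a relation field into groups of alternatives."""
    return [
        [alt.strip() for alt in group.split("|")]
        for group in value.split(",")
        if group.strip()
    ]


def problems(selection):
    """Return the reasons why a set of packages cannot be installed together."""
    found = []
    for name in sorted(selection):
        control = PACKAGES[name]
        # Every dependency group needs at least one selected alternative.
        for group in split_field(control.get("Depends", "")):
            if not any(alt in selection for alt in group):
                found.append(f"{name}: unmet dependency {' | '.join(group)}")
        # No conflicting package may be selected at the same time.
        for group in split_field(control.get("Conflicts", "")):
            for other in group:
                if other in selection:
                    found.append(f"{name}: conflicts with {other}")
    return found


print(problems({"ontowiki", "ontowiki-virtuoso"}))
print(problems({"ontowiki"}))
print(problems({"ontowiki", "ontowiki-mysql", "ontowiki-virtuoso"}))
```

Selecting the meta-package together with exactly one backend package yields no problems, while selecting it alone or together with both backends is reported as inconsistent.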