Use Cases LOD2 for Media and Publishing
Abstract. It is the core business of the information industry, including traditional publishers and media agencies, to deal with content, data and information. Therefore, the development and adaptation of Linked Data and Linked Open Data technologies to this industry is a perfect fit. As a concrete example, the processing of legal information at Wolters Kluwer as a global legal publisher through the whole data life cycle is introduced. Further requirements, especially in the field of governance, maintenance and licensing of data are developed in detail. The partial implementation of this technology in the operational systems of Wolters Kluwer shows the relevance and usefulness of this technology.
Keywords: Data transformation • Data enrichment • Metadata management • Linked data visualization • Linked data licensing • IPR • Wolters Kluwer • Media • Publishing • Legal domain
Rationale for the Media and Publishing Use Case
The media and publishing use case within the LOD2 project aims at enabling large-scale interoperability of (legal) domain knowledge based on Linked Data. This is a necessary precondition in the media industry to profit from the benefits of distributed and heterogeneous information sources (DBpedia, EuroVoc) on the Semantic Web. Hence, this use case aims at improving access to high-quality, machine-readable datasets generated by publishing houses for their customers.
This attempt is accompanied by several challenges. Traditional official content, such as laws, regulations and court case proceedings, is increasingly publicly available on the web and is published directly by the respective issuing bodies. Social networks and platforms, such as Wikipedia, aggregate professional knowledge and publish it at no charge. At the same time, news media, for example, generate large amounts of relevant information about events and people that is complementary to the conventional content of specialized publishers but hardly integrated with it (one exception is the integration between the BBC and DBpedia). In addition, the amount of relevant information is still growing exponentially; this amount cannot be incorporated and structured using traditional manual annotation mechanisms. Finally, the customer increasingly expects exact, to-the-point information within her current professional workflow, covering individual interests and personal preferences and offering one central, trusted point of access to distributed data sources. A professional's interests and preferences can also change over time and with the tasks to be completed.
From the perspective of Wolters Kluwer, the relevance of using schema-free data models like RDF and SKOS as well as accessing external content for their data-driven business is obvious. By interlinking quality-approved proprietary data sources and “tapping” classification resources from the community and existing references in the LOD cloud, Wolters Kluwer is exploring diversification scenarios for existing assets as well as business opportunities under new licensing regimes. These efforts must lead to a win-win situation, where, on the one hand, additional revenues can be created by adding value to existing products and, on the other hand, customers of Wolters Kluwer and the public can benefit from well-licensed datasets, new tools and customized services to pursue their professional and personal goals.
The tasks within the use case can be organized according to three main areas:
• Making the Wolters Kluwer data available in a machine-readable form and then executing the interlinking and data enrichment tools of the LOD2 Stack on it.
• Creating a semantic knowledge layer based on this data and executing the editorial part of data management as well as general data visualization tools of the LOD2 Stack on it.
• Describing in more detail the business impact of this new kind of data in the media and publishing industry, especially with respect to expected hurdles in usage like governance and licensing issues.
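To make the first task area more concrete, the following minimal sketch shows what "machine-readable" and "interlinked" can mean for a single thesaurus concept: the concept is typed as a SKOS concept, given a preferred label, and linked to an external resource via skos:exactMatch, then serialized as N-Triples. It is an illustration only and not the tooling of the LOD2 Stack. The function names and both example.org URIs are hypothetical placeholders; only the SKOS and RDF namespace URIs are real.

```python
# Illustrative sketch only: hypothetical URIs and helper names, standard library only.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_triples(concept_uri, label, lang, external_uri):
    """Describe one concept and interlink it with an external resource."""
    subject = uri(concept_uri)
    return [
        (subject, uri(RDF_TYPE), uri(SKOS + "Concept")),
        (subject, uri(SKOS + "prefLabel"), literal(label, lang)),
        (subject, uri(SKOS + "exactMatch"), uri(external_uri)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical publisher concept and hypothetical external match.
triples = concept_triples(
    "http://example.org/thesaurus/labour-law",
    "labour law",
    "en",
    "http://example.org/external/concept/labour-law",
)
print(to_ntriples(triples))
```

In practice, the external URI would point to a resource in a dataset such as DBpedia or EuroVoc, and the link would be produced and reviewed by interlinking tools rather than written by hand.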