CubeViz – Exploration and Visualization of Statistical Linked Data

A large part of the existing Linked Data Web consists of statistics (cf. LODStats[1] [3]) represented according to the RDF Data Cube Vocabulary [2]. To hide the inherently complex, multidimensional structure of statistical data and to offer user-friendly exploration, the RDF Data Cube Explorer CubeViz [2] has been developed. In this chapter we showcase how large data cubes comprising statistical data from different domains can be analysed, explored and visualized. CubeViz is based on the OntoWiki Framework [7] and consists of the following OntoWiki extensions:

The Integrity Analysis Component (cf. Sect. 3.2) evaluates the existence and the quality of selected RDF graphs according to given integrity constraints.

The Facetted Data Selection Component (cf. Sect. 3.3) retrieves the structure of the selected Data Cube using SPARQL [5] in order to generate filter forms. These forms allow users to slice the data cube according to their interests.

The Chart Visualization Component (cf. Sect. 3.4) receives as input all observations that match the given filter conditions and generates a chart visualization from them.
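To illustrate the kind of structure retrieval the Facetted Data Selection Component performs, the following minimal sketch builds a SPARQL query that lists the dimension properties of one dataset. It is not CubeViz's actual query: the function name `build_dimension_query` and the example dataset URI are assumptions made for illustration. Only the `qb:structure`, `qb:component` and `qb:dimension` terms come from the RDF Data Cube Vocabulary.

```python
# Namespace of the RDF Data Cube Vocabulary (see footnote 6).
QB_PREFIX = "http://purl.org/linked-data/cube#"


def build_dimension_query(dataset_uri: str) -> str:
    """Return a SPARQL query listing the dimension properties of one dataset.

    Hypothetical helper for illustration; it only assembles the query text
    and does not send it to any endpoint.
    """
    return (
        f"PREFIX qb: <{QB_PREFIX}>\n"
        "SELECT DISTINCT ?dimension WHERE {\n"
        # The dataset points to its data structure definition (DSD) ...
        f"  <{dataset_uri}> qb:structure ?dsd .\n"
        # ... which lists component specifications ...
        "  ?dsd qb:component ?spec .\n"
        # ... and each specification may declare a dimension property.
        "  ?spec qb:dimension ?dimension .\n"
        "}"
    )


if __name__ == "__main__":
    # The dataset URI below is a placeholder, not a real resource.
    print(build_dimension_query("http://example.org/dataset/population"))
```

Each dimension returned by such a query could then back one filter form. Analogous queries using `qb:measure` and `qb:attribute` would retrieve the remaining component properties.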

All components support the comprehensive CubeViz GUI shown in Fig. 2. Before we introduce the three components in more detail, we give a brief introduction to the RDF Data Cube Vocabulary in the next section. We conclude the chapter with links to publicly available deployments and a list of upcoming features planned for the next release. Further information about CubeViz can be found in the repository wiki [3] or in a recorded webinar[4] that includes a comprehensive screencast.

The RDF Data Cube Vocabulary

The RDF Data Cube vocabulary is a W3C recommendation for representing statistical data in RDF. The vocabulary is compatible with the Statistical Data and Metadata eXchange XML format (SDMX) [4], which is defined by an initiative chartered in 2001 to support the exchange of statistical data. Sponsoring institutions[5] of SDMX include the Bank for International Settlements, the European Central Bank, Eurostat, the International Monetary Fund, the Organisation for Economic Co-operation and Development (OECD), the United Nations Statistics Division and the World Bank. Experiences with publishing statistical data on the Web using SDMX were summarized by the United Nations in [11] and by the OECD in [12].

Fig. 2. The CubeViz GUI visualizing a slice of a 2-dimensional RDF DataCube in a combined polar-column chart.

The core concept of the Data Cube vocabulary is the class qb:Observation[6], which is used to type all statistical observations that are part of a Data Cube. Every observation has to follow a specific structure that is defined using the class qb:DataStructureDefinition (DSD) and referenced by a dataset resource (DS) of type qb:DataSet. Since every observation should refer to one specific DS (which in turn refers to the corresponding DSD), the structure of the observation is fully specified. The components of a DSD are defined as a set of dimensions (qb:DimensionProperty), attributes (qb:AttributeProperty) and measures (qb:MeasureProperty) that encode the semantics of observations. These component properties link the corresponding dimension elements, measure values and units to the respective observation resource. Furthermore, it is possible to materialize groups and slices of observations as well as hierarchical orders of dimension elements using the respective concepts.
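The chain from observation to dataset to DSD can be made concrete with a small sketch. It models a hand-written graph as a plain set of (subject, predicate, object) tuples, follows qb:dataSet and qb:structure to collect the declared component properties, and reports dimensions for which an observation has no value. All ex: resources, all values and the helper names `objects`, `components_for` and `missing_dimensions` are hypothetical. A real deployment would query a triple store rather than a Python set.

```python
# Tiny hand-written graph; every ex: name and value is made up for illustration.
TRIPLES = {
    ("ex:dsd", "rdf:type", "qb:DataStructureDefinition"),
    ("ex:dsd", "qb:component", "ex:c1"),
    ("ex:c1", "qb:dimension", "ex:refArea"),
    ("ex:dsd", "qb:component", "ex:c2"),
    ("ex:c2", "qb:dimension", "ex:refPeriod"),
    ("ex:dsd", "qb:component", "ex:c3"),
    ("ex:c3", "qb:measure", "ex:value"),
    ("ex:dsd", "qb:component", "ex:c4"),
    ("ex:c4", "qb:attribute", "ex:unit"),
    ("ex:ds", "rdf:type", "qb:DataSet"),
    ("ex:ds", "qb:structure", "ex:dsd"),
    ("ex:obs1", "rdf:type", "qb:Observation"),
    ("ex:obs1", "qb:dataSet", "ex:ds"),
    ("ex:obs1", "ex:refArea", "ex:RegionA"),
    ("ex:obs1", "ex:refPeriod", "2010"),
    ("ex:obs1", "ex:value", "100"),
    ("ex:obs1", "ex:unit", "ex:persons"),
}


def objects(triples, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def components_for(triples, observation):
    """Follow observation -> DS -> DSD and group component properties by kind."""
    result = {"dimension": [], "measure": [], "attribute": []}
    for dataset in objects(triples, observation, "qb:dataSet"):
        for dsd in objects(triples, dataset, "qb:structure"):
            for spec in objects(triples, dsd, "qb:component"):
                for kind in result:
                    result[kind] += objects(triples, spec, "qb:" + kind)
    return result


def missing_dimensions(triples, observation):
    """Return declared dimensions for which the observation has no value."""
    declared = components_for(triples, observation)["dimension"]
    return [d for d in declared if not objects(triples, observation, d)]


if __name__ == "__main__":
    print(components_for(TRIPLES, "ex:obs1"))
    print(missing_dimensions(TRIPLES, "ex:obs1"))
```

The sketch checks only dimensions because attributes may be declared optional in a DSD. An observation for which `missing_dimensions` returns a non-empty list would not be fully specified by its structure definition.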

  • [1] stats.lod2.eu/rdf_classes?search=Observation
  • [2] aksw.org/Projects/CubeViz
  • [3] https://github.com/AKSW/cubeviz.ontowiki/wiki
  • [4] youtube.com/watch?v=ZQc5lk1ug3M#t=1510
  • [5] sdmx.org/?page_id=6
  • [6] Prefix qb: http://purl.org/linked-data/cube#
 