LOD2 Statistical Workbench in Use

This section introduces the basic concepts of the RDF Data Cube Vocabulary, explains how they were adapted in the Statistical Workbench, and then gives some examples of using the workbench.

The RDF Data Cube Vocabulary

A statistical data set comprises a collection of observations (see Fig. 3) made at some points across some logical space. Using the RDF Data Cube vocabulary, a resource representing the entire data set is created and typed as qb:DataSet[1] and linked to the corresponding data structure definition via the qb:structure property.

The collection must be characterized by a set of dimensions (qb:DimensionProperty) that define what each observation applies to (e.g. time rs:time, observed sector rs:obsSector, country rs:geo)[2], along with measures (qb:MeasureProperty) that describe what has been measured (e.g. economic activity, prices). Optionally, attributes (qb:AttributeProperty) can provide additional information on how the observations were measured and how their values are expressed (e.g. units, multipliers, status).
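To make the roles of the three component kinds concrete, here is a minimal sketch in plain Python (not RDF). The field names mirror rs:time, rs:obsSector and rs:geo from the text; the sector code "S1", the second year and all numeric values are invented for illustration only.

```python
from typing import NamedTuple


class Key(NamedTuple):
    """Dimension values: together they identify exactly one observation."""
    time: str        # cf. rs:time
    obs_sector: str  # cf. rs:obsSector
    geo: str         # cf. rs:geo


class Cell(NamedTuple):
    """What was measured, plus an attribute that qualifies the value."""
    obs_value: float   # the measure
    unit_measure: str  # an attribute: how the value is expressed


# Hypothetical cube: the sector code "S1" and all numbers are invented.
cube = {
    Key("Y2003", "S1", "RS"): Cell(100.0, "MIO_NAT_RSD"),
    Key("Y2004", "S1", "RS"): Cell(120.0, "MIO_NAT_RSD"),
}


def slice_by(cube, **fixed):
    """Return the observations whose dimensions match all fixed values."""
    return {
        key: cell
        for key, cell in cube.items()
        if all(getattr(key, dim) == value for dim, value in fixed.items())
    }


serbia_2003 = slice_by(cube, time="Y2003", geo="RS")
```

Fixing every dimension selects a single observation, while fixing only some of them (for example `slice_by(cube, geo="RS")`) selects a slice of the cube. Attributes such as the unit do not take part in this lookup; they only qualify the measured value.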

The qb:dataSet property (see Listing 1.1) indicates that a specific qb:Observation instance is part of a data set. In this example, the primary measure, i.e. the observation value (represented here via sdmx-measure:obsValue), is a plain decimal value. The unit in which the observation is measured is given by the sdmx-attribute:unitMeasure property, which corresponds to the SDMX-COG concept UNIT_MEASURE. In the example, the code MIO_NAT_RSD stands for millions of national currency (Serbian dinars). The values of the location and time dimensions (rs:geo and rs:time) indicate that the observation refers to the Republic of Serbia (geographic region code RS) and to 2003 (time code Y2003), respectively.

Listing 1.1: RDF representation of an observation
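As a rough illustration of the shape such an observation takes, the following Python sketch assembles the properties named above and prints them in Turtle syntax. Only the property names and the codes for Serbia, 2003 and the unit come from the text. The observation value, the `ex:` namespace, the identifiers `ex:obs1` and `ex:dataset1`, and the URIs used for the codes are hypothetical, and the `http://` scheme on the qb and rs prefixes is assumed.

```python
# Namespace URIs. qb and rs follow the footnotes (http:// scheme assumed);
# ex: is a hypothetical namespace used only for this sketch.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "sdmx-measure": "http://purl.org/linked-data/sdmx/2009/measure#",
    "sdmx-attribute": "http://purl.org/linked-data/sdmx/2009/attribute#",
    "rs": "http://elpo.stat.gov.rs/lod2/RS-DIC/rs/",
    "ex": "http://example.org/",
}


def to_turtle(subject, properties):
    """Render one subject and its (predicate, object) pairs as Turtle."""
    lines = [f"@prefix {name}: <{uri}> ." for name, uri in PREFIXES.items()]
    lines.append("")
    body = " ;\n".join(f"    {pred} {obj}" for pred, obj in properties)
    lines.append(f"{subject}\n{body} .")
    return "\n".join(lines)


observation = [
    ("a", "qb:Observation"),
    ("qb:dataSet", "ex:dataset1"),                     # hypothetical data set
    ("rs:geo", "ex:RS"),                               # code RS; URI assumed
    ("rs:time", "ex:Y2003"),                           # code Y2003; URI assumed
    ("sdmx-attribute:unitMeasure", "ex:MIO_NAT_RSD"),  # unit code; URI assumed
    ("sdmx-measure:obsValue", "123.4"),                # invented decimal value
]

turtle = to_turtle("ex:obs1", observation)
print(turtle)
```

A bare number with a decimal point, such as `123.4`, is read by Turtle parsers as an xsd:decimal literal, which matches the "plain decimal value" described in the text.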

Each data set has a set of structural metadata (see Table 3). In SDMX and the RDF Data Cube Vocabulary, these descriptions are called Data Structure Definitions (DSDs). A DSD describes how concepts are associated with the measures, dimensions, and attributes of a data cube, and how the data and related metadata, both identifying and descriptive (structural) in nature, are represented. A DSD also specifies which code lists provide the possible values for the dimensions, and whether the values of each attribute come from a code list or are free text. A DSD can describe time series, cross-sectional, and multidimensional table data. Because a DSD is specified independently of the actual data in the cube, it can often be reused across multiple data cubes.
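The idea that a DSD constrains observations, and can be shared by several cubes, can be sketched in plain Python as follows. This is not how the Statistical Workbench or any RDF library implements validation; the class name, the `problems` method, the code lists and both example cubes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataStructureDefinition:
    """Components of a cube plus the code lists that constrain them."""
    dimensions: tuple
    measure: str
    attributes: tuple = ()
    code_lists: dict = field(default_factory=dict)  # component -> allowed codes

    def problems(self, observation):
        """Return the reasons why an observation does not fit this DSD."""
        found = []
        # Every dimension and the measure are required; attributes are optional.
        for name in (*self.dimensions, self.measure):
            if name not in observation:
                found.append(f"missing {name}")
        known = {*self.dimensions, self.measure, *self.attributes}
        for name, value in observation.items():
            if name not in known:
                found.append(f"unknown component {name}")
            elif name in self.code_lists and value not in self.code_lists[name]:
                found.append(f"{value!r} not in code list for {name}")
        return found


dsd = DataStructureDefinition(
    dimensions=("rs:geo", "rs:time"),
    measure="sdmx-measure:obsValue",
    attributes=("sdmx-attribute:unitMeasure",),
    code_lists={"rs:geo": {"RS"}, "rs:time": {"Y2003", "Y2004"}},
)

# Two hypothetical cubes checked against the same DSD.
cube_a = [{"rs:geo": "RS", "rs:time": "Y2003", "sdmx-measure:obsValue": 1.0}]
cube_b = [{"rs:geo": "RS", "rs:time": "Y2005", "sdmx-measure:obsValue": 2.0}]

report = {
    name: [dsd.problems(obs) for obs in cube]
    for name, cube in (("cube_a", cube_a), ("cube_b", cube_b))
}
```

Because the definition holds no observation data itself, the same `dsd` object serves both cubes. In this sketch the observation in `cube_b` is flagged because `Y2005` is not in the assumed code list for rs:time.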

  • [1] qb is the prefix for purl.org/linked-data/cube#
  • [2] rs is the prefix for elpo.stat.gov.rs/lod2/RS-DIC/rs/