The ÆKOS Approach to Support the Intelligible Reuse of Ecological Data
Solving the Business and Information Challenges
ÆKOS was designed as a repository to support the publication of ecological data. We decided to focus on rich plot-based data because this was a recognized gap in Australia, even though it was arguably the most challenging space in which to achieve progress. To limit the scope, we also chose to largely ignore several other types of ecological data (at least in the initial prototype). First, biodiversity data (species-by-location observations) were deemed out of scope because this type of data is already aggregated in a national repository through the Atlas of Living Australia (ALA), a member node of GBIF. Second, spatial and gridded data are already available via several thematic national government and research repositories. This type of data is also suited to a different style of infrastructure, so combining them in a single system seemed counterproductive. Similar arguments were used to exclude time-series sensor data. In all cases, these exclusions allowed us to better focus our solution while avoiding duplication of effort with these other initiatives.
To build ÆKOS, we adopted an adaptive strategy, which was necessary at the time because we did not have a complete understanding of the scope of the problem or of how the resulting implementation would look. The overall approach was to identify design requirements based on the needs of the user community. To this end, we established user reference groups and solicited feedback through a range of other channels (including questionnaires, feedback buttons on the portal, and product demonstrations to research groups). Implementation and feedback formed an iterative process, and as requirements became clearer, so did the challenges described in the previous sections. Several innovative approaches were prototyped and tested by end users, with the most promising design elements incorporated into the emergent design. We took a risk-minimizing approach throughout, which meant avoiding unproven technologies that might not scale to production levels or might not be supported in future. We also chose flexible directions that kept as many options open as possible.
We adopted several additional fundamental principles to address the challenges of publishing reproducible ecological data, with the overall goal of facilitating its reuse. First, given the complex and context-sensitive nature of the data, publication was considered more of a knowledge transfer challenge than simply a data transfer challenge. While we expand upon this later, it fundamentally means that the data and important contextual information need to be coupled and thus considered together as complementary elements of knowledge. With this in mind, the second principle was to present all data and information as fully as possible, because every user will have different needs of the data set. While we can determine some of what would be considered important knowledge, we cannot predict every specific use case. Similarly, it was considered important not to change any of the underlying data or information and instead to preserve an exact copy of what was received from data creators. Thus, the third principle was that any manipulation of the data needed to happen in a way that was reproducible and hence transparent to the user community. The actual mechanism used to do this is described in the following text. Finally, to maximize usability, the tools we built needed to fit easily with users' scientific practice, focusing on generating efficiencies and benefits for them rather than expecting them to adapt to the system.
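As a purely illustrative aid, the following minimal Python sketch shows one way the third principle could be realized in general: the received data are kept as an untouched copy, and every manipulation is a named step that is replayed from that copy. This is not the mechanism the authors used. The class names (RawSubmission, DerivedView), the step names, and the sample record are all hypothetical and are introduced here only for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class RawSubmission:
    """Hypothetical holder for the exact bytes received from a data creator."""
    payload: bytes

    @property
    def checksum(self) -> str:
        # A SHA-256 digest lets anyone verify that the stored copy is unchanged.
        return hashlib.sha256(self.payload).hexdigest()


@dataclass
class DerivedView:
    """Hypothetical view built from the raw copy by an ordered list of named steps."""
    source: RawSubmission
    steps: list = field(default_factory=list)  # (name, function) pairs

    def add_step(self, name: str, func: Callable[[list], list]) -> None:
        self.steps.append((name, func))

    def build(self) -> list:
        # Always restart from the untouched raw bytes so the result is reproducible.
        records = json.loads(self.source.payload.decode("utf-8"))
        for _name, func in self.steps:
            records = func(records)
        return records

    def provenance(self) -> dict:
        # What a user would need in order to see how the view was produced.
        return {
            "source_sha256": self.source.checksum,
            "steps": [name for name, _func in self.steps],
        }


# Made-up sample record, not real survey data.
raw = RawSubmission(
    json.dumps([{"species": "Example species ", "cover_pct": "12"}]).encode("utf-8")
)
checksum_before = raw.checksum

view = DerivedView(raw)
view.add_step(
    "strip_species_whitespace",
    lambda rs: [{**r, "species": r["species"].strip()} for r in rs],
)
view.add_step(
    "cover_to_float",
    lambda rs: [{**r, "cover_pct": float(r["cover_pct"])} for r in rs],
)

cleaned = view.build()
```

In this sketch the raw payload is never edited: each call to build() replays the recorded steps from the original bytes, and provenance() reports the source checksum and the ordered step names so that a user can see exactly how a derived view was produced.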