Digital humanities and a new research culture: between promoting and practicing open research data

Urszula Pawlicka-Deger

Research questions and methodology

Research data are any information that has been collected, generated, or created to validate original research findings. “When you call something data, you imply that it exists in discrete, fungible units; that it is computationally tractable; that its meaningful qualities can be enumerated in a finite list; that someone else performing the same operations on the same data will come up with the same results. This is not how humanists think of the material they work with” (Posner 2015). Data, in common thinking, are associated with quantitative information stored in a spreadsheet. Therefore, humanists do not necessarily think about their source materials in terms of data, although they always work with data in the form of text, images, and videos. They collect cultural data which are now digitized and accessible through digital libraries such as Europeana Collections, Google Arts & Culture, and World Digital Library. Digital humanists, however, not only gather qualitative cultural data but also produce quantitative information in the form of counts or numbers, which can be used for statistical analysis. Unlike qualitative materials, these data can be verified and evaluated by others. Hence, data can be divided into two categories: collected and produced.

In the face of the European open data movement, universities, national research agencies, and funders require researchers to publish their work in open repositories or in journals, as well as to open their research data, defined as the “data underpinning scientific research results that has no restrictions on its access, enabling anyone to access it” (European Commission 2017). The open data movement is a new research culture that demands three things: proper research data management (RDM), that is, managing and storing data and related metadata for current and future research purposes and uses; producing data according to the FAIR principles (data that are Findable, Accessible, Interoperable, and Reusable); and taking a new attitude towards research practices. Nevertheless, there are many concerns related to opening research data: maintaining data integrity, identifying the data ownership rights, protecting sensitive data, and misunderstanding and misusing research data. Similar observations were presented during “Embedding Openness and Scholarship”, a workshop organized by the University of Helsinki in 2018 (University of Helsinki 2018). Per Oster, Director of the Research Infrastructures unit at the IT Center for Science (CSC), in Finland, stressed that there is a need to integrate services and infrastructures and engage with a broad range of stakeholders to build trust and skills in open science. Further, Oster argued that trust is required for the adoption of an open approach to scientific research. Open science requires restructuring research infrastructures and services in order to build a new model of scholarly interaction and collaboration. However, technical facilities and services alone are not enough to promote and advance open research data actions.

The hypothesis of the paper is that the open research data movement engages two cultures: (1) the administrative or managerial side of the institution that demands opening data on the basis of national top-down requirements and maintains research service infrastructure that provides support in managing and opening data (e.g. IT Services, Information Specialists, RDM services) and (2) academic researchers—academics who collect, generate, produce, and reuse data; this is the part of the institution that is required to properly manage and open research data. These two sides of the institution—administration and academics—are guided by different kinds of motives and goals. One challenge is thus to reconcile the institutional requirements with researchers’ experiences and motivations. How can one find a common language and build a bridge between the two cultures of open research data movement? What is the new research culture and how is it implemented? How does the new research culture change the landscape of the humanities discipline? What is the role of the digital humanities in applying open data practices and developing new study approaches?

The goal of this paper is to address the above research questions based on my own perspective on what it means to work with data as a humanist, and based on my experience as a Data Agent in the School of Arts, Design, and Architecture (ARTS), liaising between promoting institutional services and practicing open research data principles. The new culture of research, along with data management and openness, is a transformative force in the digital humanities. The RDM movement is about more than opening data itself: it is a new perspective on humanities research, seeing it as a process (documented work of data collection and analysis) rather than a product (published scholarly article). Thus, a research process that can be traced, validated, and reproduced is the main challenge and opportunity for the humanities.

The objectives of the paper are as follows: (1) to present the administrative and academic research cultures of the open research data movement; (2) to identify the gap between them in relation to open research data; (3) to introduce the role of a data agent as a bridge between institutional services and academics; and (4) to analyse the transforming impact of the emerging research culture on the humanities and highlight the role of the digital humanities in establishing this new approach. Therefore, in this essay, I reflect on data produced and published by scholars rather than data that is collected and already accessible (e.g. open cultural heritage data).

The analysis is based on a case study of Aalto University and Finnish open research data policy, actions, and digital humanities practices. I use the following materials, open data sets, and published reports: (1) a data set of structured, qualitative interviews which I conducted with five selected scholars from the ARTS faculty at Aalto University (a professor and PhD student from the Department of Media, a professor from the Department of Architecture, a professor from the Department of Art, and a PhD student from the Department of Design) in October 2017 (“Open Science and Research at Aalto University”; see Aalto University 2018a). The purpose of the interviews was to identify RDM practices and needs; (2) the open data set “Collaboration through big data and open science” created by Mona Roman from the Department of Industrial Engineering and Management at Aalto University (2018b). It is a set of notes from interviews with database owners, potential users, and open science experts. The data set relates to a case study of how a foundation could open its database for the benefit of open science and innovation. This data set provides an interesting view of the issue of reusing research data in the public arena; (3) the “Report on the DARIAH Digital Practices in the Arts and Humanities Web Survey 2016” by Ines Matres at the University of Helsinki. This survey was conducted among students and researchers at eight universities and several cultural heritage organizations in Finland. One purpose was to identify digital research practices and needs for digital infrastructures in Finland. The report provides vital information relating to access to digital materials and open data in the digital humanities; (4) the public presentation of “Barriers to Sharing Research Data: Interview Study Among University of Helsinki Researchers” by Ala-Kyyny et al. (2018), Information Specialists in Research Services from the University of Helsinki Library (2018). The interview results were presented during the workshop on open science and a data action plan for the Finnish research community entitled “Embedding Openness to Science & Scholarship—Avoimuus Osaksi Tieteen Arkea”, which took place in March 2018 in Helsinki. The goals of the interviews were to obtain an overview of the amount and quality of valuable research data at the University of Helsinki and to hear interviewees’ opinions about the selection process of valuable data at the university. This project has undertaken the crucial task of archiving data based on the selection criteria of valuable research data. In summary, this paper draws on the above data sets and reports in order to address the issue of the gap between administrative services and academic researchers regarding open research data practices.

The Finnish landscape of the open data movement

According to Christine L. Borgman, “The ability to release, share, and reuse data—or representations thereof—depends upon the availability of appropriate knowledge infrastructures to do so” (2015, p. 206). She goes on to say that knowledge infrastructure is a driving force behind implementing a new research culture. For the last several years, Finland has been developing and advancing open data services and practices through various national programs and initiatives (see Karvonen 2017); for example, the “Open data programme 2013-2015” led by the Ministry of Finance (2015); “Open science and research roadmap 2014-2017” by the Ministry of Education and Culture (2014); Open Knowledge Finland Foundation established in 2004; and “Data Citation Roadmap for Finland” produced by the Finnish Committee for Research Data (FCRD) in dialogue with other members of the Finnish research community (Laine 2018).

Finland is a great example of a country that provides a strong open data infrastructure and requires clear open data policies. It widely promotes the reuse of publicly open data through events such as the “Hack4FI—Hack your heritage hackathon”, which brings together artists, programmers, designers, humanists, educators, and others interested in digital cultural heritage and multi-professional collaboration. Finland is a dream place for hackathons, which have popped up in many places, from cultural heritage (the Hack4FI) to the digital humanities (the annual Digital Humanities Hackathon at the University of Helsinki) to the railways (the Door-to-Door Hackathon organized by the state-owned railway company VR to encourage the building of an app that would improve the travel options for Finnish commuters; see Nordlund 2016). The goals of these hackathons are to create usable software or hardware and to initiate social projects that benefit the Finnish community. The guiding lights of hackathons are the open data that are discovered, reused, and promoted by organizers and participants.

The European Union policies and initiatives towards opening data have contributed, certainly, to the advancement of this movement in Finland. However, the open data movement is driven, first of all, by the Constitution of Finland, which says that “Documents and recordings in the possession of the authorities are public, unless their publication has for compelling reasons been specifically restricted by an Act. Everyone has the right of access to public documents and recordings” (see Harju & Teiniranta 2016). Consequently, data management and open data practices became high-level cultural and political goals in Finland, where institutions make Finnish cultural heritage accessible to any citizen. Universities are evaluated on the basis of the amount of open research data at each institution (measured by the number of data sets published in the university’s research information system), and governmental funding bodies (e.g. the Academy of Finland) require RDM plans and open publications. The national open data policy, which aims to make all significant public data available to citizens, enterprises, and society, puts a lot of pressure on the universities, which are required to apply, advance, and promote open research data practices. Each university has built its own infrastructure to assist researchers in data stewardship, provide training regarding the management, use, and sharing of research data, and promote open data practices. It is sufficient to mention the Research Data Management Programme at Aalto University, DataSupport at the University of Helsinki, and Open Science Centre at Jyvaskyla University, a unique collaboration of Jyvaskyla University’s Library and University Museum established in 2017.

Researchers as data agents

“Properly managed research data creates a competitive edge and is an important part of a high-quality research process. Optimal use and reuse of research data is a strategic goal of Aalto University. The goal is to make the data compatible as well as easy to find, reach and understand” (Aalto University 2018b). Aalto University brought the idea of open science into existence by setting forth Aalto University’s Policy on Open Access in 2014. The policy’s main principles request that Aalto researchers publish their scientific articles and research results in an open-access format and provide a university-maintained publication archive called Aalto Current Research Information System (ACRIS) as a platform for open-access publication. Subsequently, rules were established for the Aalto University Research Data Management Policy of 2016, which provided information about managing data according to FAIR data principles (Findability, Accessibility, Interoperability, and Reusability). Throughout the last several years, Aalto University has invested in building its knowledge infrastructure through initiatives such as providing expert data management services (Research Data Management Programme, Open Science and ACRIS, and RDM Working Group), creating the Aalto ACRIS repository, organizing training in best practices for open science, and holding special events to raise awareness about data management, such as Aalto Data Day.

The Research Data Management Programme of Aalto University established several separate action points for RDM activities, including open-access publishing, implementation of a data management planning tool, opening research data, and the creation of a repository service for storage, back-up, and collaboration (Soderholm & Sunikka 2017). The Aalto RDM network consists of Research and Innovation Services (RIS), IT Services, and the Learning Centre (library and information services). Under external pressure from the government and funding bodies to create data infrastructure, accelerate open data availability, and report on RDM practices, RIS launched the Open Science and ACRIS team, RDM Working Group, and Data Agents (researchers who provide practical help with RDM in schools and departments). These, at first glance, complex networks function together as a platform for raising awareness of RDM among researchers, creating research data services, piloting services provided by national and international agents, and developing trends within the university. The Aalto platform represents a network-based collaboration model that aims to “meet the increasing expectations for research data services we face from the funders, governments, and researchers” (Soderholm & Sunikka 2017).

To foster internal collaboration among services at Aalto, Anne Sunikka, Head of Open Science and ACRIS at Aalto University, organized the “RIS Collaboration Workshop” in February 2018. The goal of this internal meeting was to develop a strategy to raise the awareness of researchers regarding RDM issues and identify schools’ needs regarding RDM and opening data. In her introductory speech, Sunikka presented the strategic leadership of Aalto University as helping to fulfil Finland’s aspiration to become “one of the leading countries for openness in science and research” (Ministry of Education 2014). In the next presentation, Richard Darst, staff scientist from the Department of Computer Science, switched the question around and addressed not the strategic move on the part of the university, but, instead, researchers’ challenges with regard to RDM practices.

Set next to each other, these two talks represent the different landscapes of open science and RDM: the administrative side of the institution and academic researchers. The first perspective is shaped by institutional and government policy and funders’ requirements. It consists of research services focused on providing knowledge infrastructure, promoting RDM, and enforcing open data at the university. The second view, in turn, sketches the scholarly landscape affected by RDM requirements and policy. It is made up of researchers who fall abruptly into a situation of having to implement a new research culture based on data management, openness, and transparency. They do not feel ready to change their research practices and are not motivated enough to apply open data principles. Wallis et al. (2013) listed many reasons for not making data available, such as a lack of appropriate infrastructure, concerns about protecting the researcher’s right to publish their results first, and difficulty in establishing trust in others’ data.

To bridge the service units and academics, Aalto University initiated a Data Agents group that consists of “researchers who work to improve data management in their department, school, or unit. They are a first, practical, hands-on resource to researchers in their department” (Aalto University 2018c). The mission of this initiative is to make data management better: “A cultural change is needed to maximize Aalto University’s research impact. We have recruited Data agents to ensure that data is managed properly and it is possible to open. Proper data management is essential in order to achieve open science goals”. The tasks of group members include helping their colleagues to manage data properly, fostering reuse of data, and driving the cultural change in their workplace. Data agents propel and support RDM practices through various initiatives, such as designing RDM materials for each department, identifying repositories for each discipline, training researchers on RDM practices, and raising awareness of RDM at the school level. As a data agent in ARTS, I first conducted interviews with researchers to identify the state of knowledge on RDM and recognize the needs of the scholarly community regarding RDM practices.

The interviews consisted of seven parts: awareness (e.g. What is your research data?), organization (e.g. How do you organize and document your data?), storage (e.g. Where are the data stored?), needs (e.g. How big are the data for a single project?), sharing (e.g. Do you share your data with collaborators?), archiving (e.g. How do you archive your data long term?), and opening (e.g. Do you share your data publicly? If not, would you want to start sharing your data publicly?). Based on the interviews, I divided the humanities and arts research data into three categories: created by researchers (images, videos, text-based data, 3D models, software), collected by researchers (digital and physical books, images, videos, maps, photographs), and obtained with permission. All respondents agreed that opening data is important for developing research, contributing critically to studies, and determining the validity and reliability of research. They want to share and open the data they produce, but they do not do so in practice. One reason is that there is still not enough knowledge and practical skill, from choosing the right license to preparing data for publication to finding a suitable repository. The second reason relates to the nature of humanities and arts data, which are extremely diverse, heterogeneous, and complex, making it hard to maintain data integrity. The last reason is that the data take on meaning only as an integrated whole; once the pieces of data are separated, they can lose their significance and features. I identified the following challenges that must be addressed in order to improve RDM in the arts and humanities: disseminating knowledge about RDM practices and tools for planning RDM (e.g. DMPTuuli, the Finnish data management planning tool); addressing ethical, legal, and technical issues regarding the process of collecting, archiving, and opening data; identifying repositories for publishing and archiving data; raising awareness about the citation of shared data; and disseminating knowledge about license options for open data. One of the most significant tasks, however, is to encourage people to search both repositories and the data of other scholars who came before them in their research group to find out if there is similar research from which they could benefit. Along with opening data, discovering and reusing data seem to be crucial challenges in improving research practices.

The role of data agents then is both about promoting a new research culture and helping in the practice. At present, researchers are willing to manage and share their data better but, in practice, they have not done so. As Borgman said, “willingness does not equal action” (2015, p. 205). Therefore, the goal of the data agents group is to bridge the gap between institutional requirements and researchers’ practices. It is thus a continuous movement between the two cultures.

The institution and the researcher: two perspectives on open research data

One interview question asked of a legal counsel at Aalto University by Mona Roman was, “What types of business models are there for researchers to publish their data sets?” The open science expert answered with: “Researchers are not primarily looking for a business model, they want to progress, e.g., in a tenure-track career. Thus, the incentives for data publication should be built to support researcher career advancement” (Roman 2018; Interview Notes, Open Science Expert 1, 4 April 2017). These words aptly capture the idea of the two perspectives on the open science movement. From the researcher’s point of view, RDM and open science practices should entail direct, visible, and individual benefits, such as receiving citations for shared data, increasing the impact of the research, receiving a grant by meeting the requirements of funding agencies in terms of a data management plan, and advancing academic careers. From the university’s perspective, in turn, the open science movement is a kind of business model. Open data entails financial support from the government, financial transactions related to accessing databases, and increasing collaborations with external data owners such as cultural institutions and foundations. These two entities—the managerial side of the institution and the researchers—are thus driven by different motivations, incentives, and practices. Let us now consider these two approaches to the following concepts which make up the new research culture: data management, data archiving, and open data.

Data, defined as “entities used as evidence of phenomena for the purposes of research or scholarship” (Borgman 2015, p. 28), are a prerequisite for researchers when conducting a study. Research projects depend on data; thus, the more disorganized the data are, the less research can be undertaken. Data are also academic outputs (scholarly papers, software, code) that are the visible results of a study and evidence of research success. It is therefore important that a data management plan (DMP) describes how a study will collect and use data and how the data are stored and made available to other researchers after the project has been completed (Finnish Social Science Data Archive 2017). A DMP is an essential part of research practice, and it should become a habit for scholars to decide in advance the methods of dealing with data during the study; however, writing a DMP is a new research practice. There is still a need for better understanding, motivation, and promotion of DMPs among scholars so that they might comprehend that a DMP entails better work organization and, along with that, better results, greater impacts, and broader dissemination of research.

Developing a new research habit is thus based on the carrot and stick strategy, that is, researchers are required to create a plan for data management since, without one, they will lose benefits and not achieve their goals. This strategy is made clear in “funder DMPs”. While the “practical DMP” is used in daily research practices, the “funder DMP” is required for funding purposes. For instance, the Academy of Finland, the main governmental funding body for scientific research in Finland, has required DMPs in grant applications since 2016. As data agents, we promote DMP tools and practices with regard to complying with funders’ and the university’s requirements rather than raising awareness about the very notion of DMP practices. We advise on writing DMPs and answer specific questions, including “How will you control access to keep the data secure?” and “How, when, where, and to whom will the data be made available?” Meanwhile, the university’s goal is to make sure that the research data will belong to the institution and be stored on its servers even when a scholar leaves the university. The challenge is thus to encourage researchers to develop the habit of planning data management rather than developing DMPs only to meet funding requirements.

One aspect of RDM is data archiving, that is, moving data that are no longer being used actively to a separate storage device for long-term preservation. Scholars have typically moved such data to external storage devices, kept them in the same file, or just deleted them. The survey results of “Report on the DARIAH Digital Practices in the Arts and Humanities Web Survey 2016” (Matres 2016) show that only three out of ten respondents have taken measures regarding data archiving, while one out of ten admits that parts of their research will be destroyed. “None of the respondents provide detail on special measures taken or tools used for preserving their research data, other than doing back-ups, in external and in multiple devices, as well as using space offered by their institution” (2016, p. 15). Data archiving is not a new routine in the academy, but, at present, it is promoted as a necessary research practice. It draws scholars’ attention to thinking about their work using a long-term perspective. In addition, nowadays, the practice of data archiving is not only an individual scholar’s decision regarding the selection of data for preservation purposes and the type of storage system but also a strategic action to be taken due to internal (research groups and the university) and external pressures (government policies, international standards, and public use). Hence, not all data will be archived, just the data that are valuable from the perspective of long-term preservation.

Interesting results about data archiving issues at the university were presented by Information Specialists in Research Services from the University of Helsinki during the “Embedding Openness to Science & Scholarship” event (University of Helsinki 2018). Ala-Kyyny et al. (2018) conducted many interviews with the deans, vice deans, and researchers at the University of Helsinki in order to obtain an overview of the value of a research database and the selection process in terms of valuable data. They indicated three criteria for assigning value to data: cultural/societal, academic/scientific, and economic. In addition, they identified types of valuable data, including long time-series data, large data sets, unique data, as well as expensive and multinational research data. Further, they asked the participants how to decide which research data were valuable enough to archive. Based on the interviews, they presented a dual board model, that is, suggestions came from the researchers and then a panel of experts made the final decisions. Selecting which research data should be slated for long-term preservation is a great challenge and responsibility. As one interviewee said: “Science progresses rapidly. I would not want to make that assessment myself”. Thus, in practice, data archiving takes place both at the individual and university level. While scholars make a decision about data archiving based on their own research needs, the university is required to take into consideration many criteria (e.g. the social, scientific, and economic impact of data and its uniqueness) in order to ensure that the data selected as valuable are well-managed and preserved.

The core practice of RDM is to make data available to others, not to keep them locked up for the privileged few. Data are released mostly via institutional repositories as open access. So far, much attention has been paid to attitudes towards sharing and opening data among researchers (Wallis et al. 2013; Van den Eynden & Bishop 2014; Van den Eynden et al. 2016; Curty et al. 2017; Pasquetto et al. 2017). Van den Eynden et al. stated that open data practices vary depending on research discipline, career stage, the location where a scholar is based or carries out research, and the kind of research methods used and data generated (2016, p. 3). Regardless of these criteria, opening research data sets is not a common habit. It depends heavily on motivations and incentives. An open science expert interviewed by Mona Roman acknowledged that researchers are not highly interested in opening data initially; the practices are driven mainly by funding organization requirements. The legal counsel stated that there is no amount of financial compensation and no career credit that would motivate researchers to open their data (Roman 2018, Interview Notes, Open Science Expert 1, 4 April 2017). Be that as it may, it is worth noting that measures are being taken gradually to reward researchers who open their data. A good example is an action taken by the Academy of Finland, the Finnish Advisory Board on Research Integrity, and Universities Finland (UNIFI), which jointly drafted a template for a researcher’s curriculum vitae which credits the production and distribution of data (included in the section on the scientific and societal impact of research; see Finnish Social Science Data Archive 2017).

Furthermore, one open science expert pointed out a crucial challenge for open research practices, that is, the “culturally tied issue” (Roman 2018, Interview Notes, Open Science Expert 1, 4 April 2017). This expression means that the existing research culture is not in accordance with open data rules. Another interviewee, a grant writer, also admitted that researchers perceive open data as a request not in line with the practices of their fields (Roman 2018, Interview Notes, Open Science Expert 2, 5 April 2017). Since scholars are strongly attached to their research habits, it is a great challenge to establish a new research culture. Nevertheless, both open science experts agreed that introducing new research practices requires strong support and competencies on the part of administrative staff. Practicing open data is contingent upon the support level at the university. The services play an important role in promoting and implementing a new research culture.

The main purpose of the services is to liaise between the researchers and the managerial side of the university to achieve mutual benefits. While scholars are motivated by funding organization requirements, the university’s open data approach is driven by a top-down governmental policy that evaluates the university’s outcomes by measuring the number of open data sets. These two entities, the researcher and the institution, are again driven by different demands and impulses. Open data practices are pushed by a top-down policy that, however, causes a short-term effect. A good example of this situation is writing a DMP for grant application purposes rather than for research practice. In order to establish an open data habit, the practice should be propelled by a bottom-up approach, meaning that the whole culture of research must be changed. The development of a new perspective on the scholarship can ensure that open data will become a research routine. One way of remodelling the research culture is by introducing a new practice resulting from opening data: data reuse, which is defined as “using data that were originally collected by someone else for some other purpose” (Finnish Social Science Data Archive 2017).

The administrative side of the institution and the researchers represent two perspectives on open data practices resulting from different impulses, benefits, and sources of pressure. However, these two entities influence each other directly and shape the local culture of open research data. The growth of interest in open data practices among scholars depends heavily on local services and their competencies in promoting and implementing new scholarly habits. Nevertheless, as the legal counsel interviewed by Roman rightly noted: “Motivation needs to come from the scientific community, create a cluster for this. Not pushed by administration” (Roman 2018, Interview Notes, Open Science Expert 1, 4 April 2017). Only in this way will open data practices be applied naturally by researchers.

Digital humanities and the implementation of open research culture

The divergence between the administration’s and researchers’ approaches towards open science and RDM is visible in the humanities. As Stefan Buddenbohm et al. noted, “Even when researchers in these fields publish their data in the European repositories and archives, the data is usually difficult to find and to access” (2016, p. 8). Humanists rely heavily on various research materials, such as articles, letters, photographs, diaries, and so on. Many of these sources are collected and accessible in archives at universities, libraries, museums, and public and private institutions. With the digitization of physical materials, humanists are increasingly in a position to benefit from open science and, consequently, more motivated to make their research data freely available. Therefore, owing to the use of digital sources in the humanities, data management practices are readily applicable in this discipline. As Miriam Posner observed, traditional humanists have pressing data management needs (e.g. curating and managing the collected cultural data) but “the need becomes even greater when you’re talking about people who consider themselves digital humanists—that is, people who use digital tools to explore humanities questions” (Posner 2015). Although both traditional and digital humanists rely on open data and encounter RDM requirements, only the latter group seems likely to perceive research materials as data and apply open science practices. This is because digital humanists both collect already accessible, digitized cultural data and produce data, such as statistical data and code. Thus, the digital humanities play a key role in transforming humanities scholarship into an open research culture.

Digital humanities have the potential to alter the whole culture of how humanities scholarship is conducted. The research practices initiated by the digital humanities are distinguished by the following aspects: collaboration, interdisciplinarity, the use of digital tools, the application of computational techniques, the utilization of open data, and both experimental and hands-on work. The digital humanities represent a wide range of practices and methods but, at the same time, face many challenges and barriers that hinder applying new open research approaches. One of the main problems is open access to digital publications and research data.

The need for open access was recognized in the 2016 report conducted by Matres at the University of Helsinki. The first survey question about the use of digital research sources by humanists showed that the researchers use digitally formatted materials and digital devices extensively. “Although books are still read as much in print as digitally, 70 percent of the respondents reported accessing archival holdings from a digital device, whereas only 36 percent relied on archive visits or printed material” (Matres et al. 2018, p. 40). The next question devoted to access to digital research materials revealed clear deficiencies and needs. “Most respondents (93.7%, r=224) considered better accessibility to resources the most important requirement for doing research in the digital age. They specifically mentioned the need for the further digitisation of research sources, and for them to be attached to open licenses and referencing guidelines as essential for their re-use in scientific contexts” (pp. 40-41). Respondents expressed the opinion that there are real needs for improvement in access to materials on the web (e.g. limited access to digitally published articles), knowledge of licenses, and the machine readability of materials. The survey showed that deficiencies are related to aspects that are fundamental to digital humanities practices, such as the access to and findability of digital resources. “In sum, on the one hand, digital practices in the humanities in Finland enable scholars to explore interdisciplinary interests, and to make use of diverse digital data: from digitised collections to born-digital data. On the other hand, digital data poses challenges related to access, data management, and ethical referencing and use” (p. 42).

The digital humanities have stayed at the forefront of the implementation of new research practices developed by the open science movement. Drawing on the description provided by the Open Science and the Humanities conference held at the University of Barcelona in 2018, we can identify the following benefits of open science for the humanities: sustainable projects and initiatives, valorization of Humanities knowledge and practices, greater research impacts, and the promotion of new discoveries and paradigms. However, the Finnish case study shows that crucial problems must still be addressed to make open science truly practicable and beneficial. The digital humanities can be a driving force for an open research approach, but in order to encourage advances in open science practices in the humanities, it is important both to improve open access and to develop a new perspective on the very notion of humanities research. Open data belongs to a new research culture and its implementation entails new study approaches and practices.

The new vision of the humanities resulting from the open science movement is based on the assumption that the research culture is moving from a final result-based approach towards a processual approach, in which each step of the study and data can be traced, validated, and reproduced. Let us now reflect upon the changes in the humanities research culture as a result of applying open science practices.

Humanities research as a management project

Looking at research through the lens of management is completely foreign to some humanists. However, conducting a study is more and more reminiscent of project management, in which we have to plan in advance file formats for the data, the methods used in data documentation, data storage, the methods for keeping the data secure, licensing data, and the methods for opening data and their long-term preservation. Many humanists are used neither to considering their research materials as data nor to perceiving research in terms of project management. Nevertheless, as research has become increasingly collaborative, performed by people who come from different disciplines and are active only at a particular stage of project development, data management seems more and more necessary and beneficial. RDM practices can help to control and validate any task in team-based work. A good example is a joint research project conducted by humanists and computer scientists. While the latter execute a digital project, the humanists analyse the results, write them up, and publish them as a scholarly paper. In the end, the work done by the computer scientists is untraceable. Documenting and opening data can make the research workflow more detectable and identifiable, so that all tasks and data produced by any contributor can be evaluated and credited. Consequently, humanities research is no longer perceived as a product in the form of a scholarly article but is seen as a workflow, in which each step of knowledge construction, from initiation to production, is visible, detectable, and validated.

The publication and evaluation of research data

Open research data practices will become more and more common in scholarly communications, exposing data biases and inequalities in research projects. Access to produced research data can help in the review and evaluation of a scholarly paper built upon these data. It can demonstrate how the knowledge was constructed, revealing its gaps, errors, data misuse, and any aestheticization of the data. Open data will become an indicator of paper quality: while the value of the paper will lie in the accessibility of the data, the quality of the data will lie in their reusability. Data will thus be reviewed by evaluating their relevance, clarity, and trustworthiness. Releasing data sets with each publication will become part of the research culture, changing the way of thinking about the study into one that sees it as an integrated process.

The implementation of data reuse culture

The most significant change resulting from the open science movement is the ability to reuse data that have been published to create a new project, analyse them in different contexts, or combine them with other data sets. Curty et al. (2017) describe several advantages of reusing existing data as opposed to collecting new data, including lessening expenses related to data collection, saving time for the research process, extending research beyond the temporal and spatial limitations of a single study, allowing trans-disciplinary applications, and enabling meta-analyses. The authors, however, also point out the risks that trigger negative attitudes towards data reuse, such as hidden errors in a data set, a lack of control over data quality, a lack of adequate metadata, wasting time aligning data sets for combination and reuse, and doubts about the trustworthiness of data. Reusing data produced by another scholar is not a part of the humanities research culture yet, but, along with the increase in open data, it could become a common practice. It will bring both benefits and risks, and the risks should be addressed in order to improve data reuse practices. Therefore, there are still many challenges and questions concerning open research data that need to be addressed in order to incorporate data reuse culture into the digital humanities. It is thus necessary to tackle the following problems: providing comprehensive metadata, which is a prerequisite for correctly interpreting the data, discerning data set content, and checking data suitability for analysis; ensuring interoperability of tools and enabling the reuse of code, software, and tools; and solving ethical and legal issues related to reusing data sets produced in a different research culture. The last challenge is particularly urgent in light of the rapidly developing global digital humanities, entailing international collaboration and global data exchange. Viewed in this context, the last need is related to facilitating data availability, discovery, and readability so that they can be understood easily and reused by others.

The reinforcement of research trust and responsibility

“Trust is the data reusers’ belief that the data will result in positive outcomes, leading to the reuse of such data in their research. Data reusers’ trust judgments can be understood as psychological processes, and whether they accept and use certain data can be seen as an indication of trusting behavior” (Yoon quoted in Frank et al. 2017, p. 4). Using Yoon’s definition, we can see that data reuse requires a great deal of trust in data producers and digital repositories. Open science requires that scholars have high levels of trust, engagement, and confidence in order to use others’ data as well as publish their own data sets in a particular archive. Trustworthiness becomes the main decision-making indicator of engagement in the open science movement. To enhance the trust needed to produce one’s own data set and, in consequence, increase the possibility of reuse by others, scholars will have to pay special attention to data collection methods and data analysis techniques. Since research data will be open and evaluated, plainly speaking, researchers will become more responsible for the methods of conducting research and its outcomes. As Borgman said, open science revives the discussion about the responsibility for ideas and documents in the research (2015, p. 255). Therefore, it will contribute to building and enhancing the culture of trust and responsibility in digital humanities and science.

A big picture approach to humanities research

One key talk at the Association of Commonwealth Universities (ACU) and the South African Research and Innovation Management Association (SARIMA) conference was focused on the theme “Research and innovation for global challenges” and pointed to “the need to open up scientific research at universities around the world, to encourage the sharing of information, which, in turn, will lead to greater progress in overcoming the challenges currently facing communities across the globe” (Zondi 2016). The open science and open information movements are seen as the answer for global challenges and development. Open science is the movement towards building a collaborative research environment in which scholars can share, reuse, and integrate data, methods, and materials. The goal of making data available is thus to accelerate scientific advancement and solve real-world problems. For humanists, engaging in the open science movement means perceiving research from the perspective of its application, reuse, and extension by others. Open research data can be reused by various stakeholders, including scholars, universities, and foundations. Therefore, before conducting a project, a scholar should pose the following important questions: How can research data and results be reused by others? How will the data be accessible in the future? How will digital data, projects, and tools, particular products of the digital humanities, be made sustainable? How can data be expanded for future purposes? How can the research participate in the global development of the discipline, society, and culture? Open science will make humanists adopt a big-picture approach to research, and such a long-term perspective will entail the above-mentioned research responsibilities towards academic institutions, learned societies, and the general public.


Open science is the ongoing movement towards building transparent, collaborative, traceable, and reusable scientific and humanities research. The implementation of open data practices will benefit the university and researchers, facilitate global collaboration, and accelerate scientific progress. However, many efforts and initiatives must be undertaken to encourage and motivate scholars to utilize open data principles. Moreover, the administrative side of the institution and researchers are driven by different purposes, motivations, and incentives. Therefore, it is essential to take the bottom-up approach and present open research data as a part of a new research culture, rather than something that is only necessary in order to fulfil institutional and funding requirements. Developing the new research culture based on open science principles will lead naturally to the implementation of data archiving procedures, open data, and data reuse practices. The open research culture, nevertheless, entails taking a new research perspective that differs significantly from the current approach, which is focused on research results. Hence, it represents the move towards a processual approach, in which each research task and piece of data can be traced, validated, and reproduced. When considering the digital humanities field, we can see that there are still a lot of challenges along the way, but, ultimately, the application of open science practices will essentially transform the entire ecosystem of digital humanities research.


1 See the Finnish Social Science Data Archive, IDA research data storage service, AVAA open data publishing portal offered by the Ministry of Education and Culture, Finnish Biobanks, and Finna service of the National Digital Library, combining material from Finnish archives, libraries, and museums. According to the European Commission (2017), only Finland and Austria within the European Union require clear open data policies.


Aalto University., (2018a). Aalto University wiki [online]. Aalto University. Available from: ience+and+Research+at+Aalto+University+-+Open+Publishing+and+Open+Data

Aalto University., (2018b). Research Data Management (RDM) and open science [online]. Aalto University. Available from: and-open-science

Aalto University., (2018c). Data agents and data advisor [online]. Aalto University. Available from:

Ala-Kyyny, J., Korhonen, T. and Roinila, M., (2018). Barriers to sharing research data: Interview study among University of Helsinki researchers. Embedding openness and scholarship, a workshop organized by the University of Helsinki, 14 March 2018 [online]. Available from:

Borgman, C. L., (2015). Big data, little data, no data: Scholarship in the networked world. Cambridge and London: The MIT Press.

Buddenbohm, S. et al., (2016). State of the art report on open access publishing of research data in the humanities: HaS deliverable 7.1. HaS-DARIAH [online]. Available from:

Curty, R. G., Crowston, K., Specht, A., Grant, B. W. and Dalton, E. D., (2017). Attitudes and norms affecting scientists’ data reuse. PLoS One. 12(12), e0189288. 10.1371/journal.pone.0189288

European Commission., (2017). Facts and figures for open research data. Open science Monitor: The European Commission [online]. Available from: open-science/open-science-monitor/facts-and-figures-open-research-data_en

Finnish Social Science Data Archive., (2017). Why are research data managed and reused? Finnish Social Science Data Archive: Data Management Guidelines [online]. Available from:

Frank, R. D., Chen, Z., Crawford, E., Suzuka, K. and Yakel, E., (2017). Trust in qualitative data repositories. Proceedings of the Association for Information Science and Technology. 54(1), 102-111.

Hackdash., (no date). Hack4FI-Hack your heritage hackathon. Hackdash. Available from:

Harju, K. and Teiniranta, R., (2016). Open data policy in Finland: With practical experiences from SYKE. Finnish Environment Institute SYKE [online], 19 October. Available from: A1F%7D/125507

Karvonen, M., (2017). Open data initiatives in Finland. Nordic CIO conference, Hanken, Helsinki, 31 March 2017: Ministry of Education and Culture. Available from: sites/default/files/atoms/files/minna_karvonen.pdf

Laine, H., ed., (2018). Tracing data: Data citation roadmap for Finland. Finnish Committee for Research Data. Helsinki, Finland. Available from: 04106446

Matres, I., (2016). Report on the DARIAH digital practices in the arts and humanities web survey 2016. Digital Research Infrastructure for the Arts and Humanities (DARIAH), University of Helsinki [online]. Available from: finnish-survey-on-digital-research-practice-in-the-arts-and-humanities

Matres, I., Oiva, M. and Tolonen, M., (2018). In between research cultures: The state of digital humanities in Finland. Informaatiotutkimus [online]. 37(2). 978/inf.71160

Ministry of Education and Culture, Finland., (2014). The open science and research roadmap 2014-2017. Ministry of Education and Culture, Finland [online]. [Viewed 10 November 2018]. Available from: roadmap-2014-2017

Ministry of Finance, Finland., (2015). Avoimesta datasta innovatiiviseen tiedon hyodyntamiseen. Avoimen tiedon ohjelman 2013-2015, loppuraportti. Ministry of Finance, Finland [online]. Available from:

Nordlund, H., (2016). Finland has hackathons for everything: From cultural heritage to state-run railway operators. Business Insider Nordic. 14 December. [Viewed 10 November 2018]. Available from: for-everything-from-cultural-heritage-to-state-run-railway-operators-2016-12

Open Science and the Humanities Conference 2018., (2020). Open science in the field of Humanities. Open Science & the Humanities Conference 2018 [online]. University of Barcelona. Available from:

Pasquetto, I. V., Randles, B. M. and Borgman, C. L., (2017). On the reuse of scientific data. Data Science Journal [online]. 16.

Posner, M., (2015). Humanities data: A necessary contradiction. Miriam Posner’s Blog [online]. Available from: contradiction/

Roman, M., (2018). Collaboration through big data and open science [online]. Aalto University, Research. Available from: big-data-and-open-science(80f6110a-818f-4a76-a991-24c34934c678).html

Soderholm, M. and Sunikka, A., (2017). Collaboration in RDM activities: Practices and development at Aalto University. The 12th Munin Conference on Scholarly Publishing 2017: UiT, The Arctic University of Norway, 22-23 November 2017: The Septentrio Conference Series, 1. https://doi.org/10.7557/5.4247

University of Helsinki., (2018). Embedding openness and scholarship: A workshop organized by the University of Helsinki [online], 14 March. Available from: hankkeet/avoin-tiede/open-science-workshop/

Van den Eynden, V. and Bishop, L., (2014). Sowing the seed: Incentives and motivations for sharing research data, a researcher’s perspective [online]. Bristol: Knowledge Exchange. Available from:

Van den Eynden, V. et al., (2016). Towards open research: Practices, experiences, barriers and opportunities [online]. London: Wellcome Trust, 4055448

Wallis, J. C., Rolando, E. and Borgman, C. L., (2013). If we share data, will anyone use them? Data sharing and reuse in the long tail of science and technology. PLoS One. 8(7), e67332.

Zondi, N., (2016). Open science is the answer for global challenges [online]. The Association of Commonwealth Universities. [Viewed 10 November 2018]. Available from: www.
