Publicly available research data (Open Data) is a main pillar of the Open Science movement. Open research data can accelerate innovation processes and create additional economic benefit. However, publishing research data brings challenges. Important practical questions are how research data is prepared, documented, published and archived, taking both technical and legal aspects into account.
To investigate these practical questions, 12 pilot projects were carried out within the DLCM 2 project. Research data was published in a variety of disciplines (architecture, life sciences, engineering, social work, applied psychology, applied linguistics, health, and management & law), and the related processes and questions were reflected on in workshops together with the DLCM consortium, which provides a research data repository and archiving solution.
The pilot projects provide insight into the characteristics of individual research data life cycles across the eight domains listed above. A key finding is that the path to open research data is highly domain-specific. This applies not only to the choice of the (domain-specific) data repository, but also to all data processing procedures involved. Good practices therefore need to be developed by the research communities themselves. A few general findings nevertheless hold across all domains: helpful instruments include the software and tools identified for active research data management, as well as frameworks for discipline-specific research data management (e.g. domain data protocols).
We consider the overarching success criteria for the re-use of research data to be well-curated data, high overall data quality and adherence to the FAIR principles when publishing the data. Based on the experience gained in the pilot projects, we recommend actively supporting researchers through a data stewardship model. In our model, Data Stewards (or simply experts in research data management) help researchers apply best practices in handling research data throughout the entire data life cycle.