The E-Science Days take place in Heidelberg every other year. The interdisciplinary conference series covers research data management and open science and offers a range of formats for professional exchange between science and technology. This year’s motto is “Empower Your Research – Preserve Your Data”.
On Friday, 3 March, Dorothea Iglezakis and Sibylle Hermann will present a TandemTalk on “Promoting research through quality-controlled data and code publications: Collaboration between research and infrastructure using the example of SimTech and FoKUS.”
FoKUS, the Competence Center for Research Data Management, is the contact for all questions concerning research data management (RDM) at the University of Stuttgart: from data management plans to storage space, administration, publication and archiving.
Research Data Management at SimTech develops data management strategies, guidelines and technical solutions. In 2021, SimTech together with the Cluster of Excellence IntCDC organized a two-day virtual workshop on “Data and software management in large research consortia” that attracted around 70 participants from across Germany and abroad and eventually led to a joint document on the importance of research data and software. This document was signed by major German research alliances and published in March 2022 in the journal “Bausteine Forschungsdatenmanagement”.
ABSTRACT: As a result of the digital transformation in science, the publication of research data and software, as well as their quality assurance, is increasingly coming into focus. Scientific performance should not be measured only by the number and citation of scientific articles; the work that goes into readily reusable datasets and research software should also be adequately recognized. At the same time, initiatives such as Make Data Count are working to create an infrastructure for citations of data publications. It can therefore be expected that high-quality, easily citable and reusable data and software publications will play an increasingly important role in the evaluation and funding of research, as well as in the filling of positions. The Cluster of Excellence SimTech at the University of Stuttgart has also committed itself to measures that ensure the quality of research data and software.
To take this development into account, new forms of quality assessment for data and code publications are emerging from the collaboration between SimTech and the Research Data Competence Center FoKUS at the University of Stuttgart. To this end, alternative systems of reputation attribution are being strengthened on SimTech's side and prepared and supported in terms of infrastructure on FoKUS's side. For example, research data and software can already be published in citable form in the institutional research data repository DaRUS, and these publications are automatically included in the university bibliography. The publication process is subject to a two-stage quality check. The datasets are first checked for content by SimTech's research data managers. In a second step, the FoKUS team formally optimizes the metadata and sends feedback to the researchers. The aim is to ensure that the data are relevant from a scientific point of view and, at the same time, prepared for linking in a PID graph via persistent identifiers for authors, keywords, projects and subject classifications.
To make the currently very time-consuming quality check scalable in the long term, we are working together to automate the quality check workflow as much as possible, to improve it further by adding content-related components, and to simplify it for the researchers. SimTech is serving as a pioneer in this effort. Based on the experience gained, the processes and tools developed will then be rolled out across the university. In the future, it is also planned to support individual quality criteria and make them visible with the help of badges. In addition, an integration of Make Data Count will provide metrics based on citations of published datasets. This will build up the technical capabilities for metrics around research data (and software) on the infrastructure side and, in turn, give these metrics a higher priority from a SimTech perspective.
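As an illustration of what an automated formal check of the kind described in the abstract might look like, the following minimal Python sketch flags missing metadata fields and malformed author identifiers. It is not the SimTech, FoKUS or DaRUS implementation. The field names (`title`, `authors`, `keywords`, `subject`) and the function `formal_check` are hypothetical, and the choice of ORCID iDs as the persistent identifier for authors is an assumption. The checksum follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
import re

# Syntactic shape of an ORCID iD: four blocks of four characters,
# where the final character may be the check character "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")

# Hypothetical list of required metadata fields (an assumption for this sketch).
REQUIRED_FIELDS = ("title", "authors", "keywords", "subject")


def orcid_is_valid(orcid: str) -> bool:
    """Check the format and the ISO 7064 MOD 11-2 check digit of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


def formal_check(metadata: dict) -> list[str]:
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not metadata.get(field):
            problems.append(f"missing field: {field}")
    for author in metadata.get("authors", []):
        name = author.get("name", "?")
        orcid = author.get("orcid")
        if not orcid:
            problems.append(f"author without ORCID: {name}")
        elif not orcid_is_valid(orcid):
            problems.append(f"invalid ORCID for {name}: {orcid}")
    return problems


if __name__ == "__main__":
    record = {
        "title": "Example dataset",
        "authors": [
            {"name": "A. Researcher", "orcid": "0000-0002-1825-0097"},
            {"name": "B. Researcher"},
        ],
        "keywords": [],
        "subject": "Engineering",
    }
    for problem in formal_check(record):
        print(problem)
```

A check like this could produce the feedback list that is sent back to researchers, while the content-related review would still be carried out by a person.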