Research Data Management at SimTech

Thinking research and infrastructure together

At SimTech, we provide and actively develop infrastructure and tools for managing research data. We integrate research data management into research and conduct research on research data management itself. Our vision is to make research data and software available for every published article, to make science reproducible, and to promote the transfer into practice. To that end, we supply the tools you need to use our research data and software in your own projects.

Our research

The Research Data Management team at SimTech initiates data and software management projects in close collaboration with researchers to enable individual RDM solutions as well as integration with DaRUS. Active and completed projects are listed below.

Highlights

Fair-FLEXI

Description

For reliable simulation of complex flow problems, comparative data from well-designed, simplified benchmark problems are essential for validating and verifying the computational program and the modeling. Only then can the simulation of complex flows be trusted.

Benchmark simulations and associated databases do exist for very basic flow problems. These are often well maintained, but they are far removed from real applications and most of them are dated. Beyond their use as validation data, such data are also in great demand for training ML models, owing to the strong research activity in machine learning.

In this PN-1 project, benchmark problems for turbulent flows and their modeling are designed and carried out using FLEXI. In close collaboration with the Institute of Aerodynamics and Gas Dynamics, the SimTech RDM team develops solutions to efficiently manage the large volumes of data generated by CFD simulations. The goal is to provide a comprehensive yet easily accessible resource for validating simulation runs and for training ML models.

Data Version Control for Simulation

Description

Logging simulation metadata requires a well-structured approach to ensure rich metadata descriptions and seamless workflows for our applications. At SimTech, we address this by applying Data Version Control in the simulation sciences and at their intersection with machine learning. The software provides a well-designed backend structure that combines and organizes input parameters, quality assessment metrics, and the model itself, while providing multiple interfaces for interaction.

We have already successfully applied this approach to molecular dynamics problems in chemistry and biochemistry, as well as in modeling biocatalytic systems. Our objective is to continue to apply this concept and software to our research and make it accessible across the simulation sciences community. With Data Version Control, we are confident that we can achieve our goals of streamlining workflows and producing rich metadata descriptions for our applications.
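The idea of keeping input parameters, quality assessment metrics, and the model together under one version can be illustrated with a minimal sketch. It is not the software used at SimTech: the class `SimulationRecord`, its fields, and the example parameter names are hypothetical, and only the Python standard library is used.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class SimulationRecord:
    """Hypothetical metadata record for a single simulation run."""

    parameters: dict  # input parameters of the run
    metrics: dict  # quality assessment metrics
    model_bytes: bytes = field(repr=False)  # serialized model (placeholder)

    def model_hash(self) -> str:
        # Content hash of the model, so the record points at exact bytes.
        return hashlib.sha256(self.model_bytes).hexdigest()

    def to_metadata(self) -> dict:
        # One structure holding parameters, metrics, and the model reference.
        return {
            "parameters": self.parameters,
            "metrics": self.metrics,
            "model_sha256": self.model_hash(),
        }

    def version_id(self) -> str:
        # Deterministic identifier: same inputs and model give the same id.
        payload = json.dumps(self.to_metadata(), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Example values are invented for illustration only.
record = SimulationRecord(
    parameters={"temperature_K": 300, "timestep_fs": 2.0},
    metrics={"energy_drift": 1.2e-4},
    model_bytes=b"serialized model placeholder",
)
print(json.dumps(record.to_metadata(), indent=2))
print(record.version_id())
```

Because the identifier is derived from the content, changing any parameter, metric, or the model itself yields a different version, which is the property that makes runs traceable.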

Project members
Fabian Zills - ICP Stuttgart
Marcelle Spera - IBTB Stuttgart
Jan Range - EXC2075 RDM-Team Stuttgart

 

Data Modelling

Description

Deep eutectic solvents (DES) are promising solvents in biocatalysis, because they are designable, renewable, biodegradable, and cheap, and thus are an alternative to organic solvents for enzyme-catalyzed reactions that involve hydrophobic substrates. Their thermophysical properties (density, viscosity, thermodynamic activities) are obtained by molecular simulation and by experiment and are stored in the standardized data exchange format ThermoML.

The goal of the project is to establish an API and a REST interface for writing ThermoML documents, either by converting existing CML documents or by reading from modelling tools. Next, the data model will be projected onto DaRUS to make use of the federated database system for sharing ThermoML across multiple installations, and to give researchers from various fields the opportunity to combine their metadata with ThermoML.

Publications

  1. 2024

    1. B. Flemisch et al., “Research Data Management in Simulation Science: Infrastructure, Tools, and Applications,” Datenbank-Spektrum, 2024.
  2. 2023

    1. S. Lauterbach et al., “EnzymeML: seamless data flow and modeling of enzymatic data,” Nature Methods, vol. 20, no. 3, 2023, doi: 10.1038/s41592-022-01763-1.
  3. 2022

    1. J. Range et al., “EnzymeML—a data exchange format for biocatalysis and enzymology,” The FEBS Journal, vol. 289, no. 19, Oct. 2022, doi: 10.1111/febs.16318.
    2. C. Brecher et al., “Commitment zu aktivem Daten- und -softwaremanagement in großen Forschungsverbünden,” Bausteine Forschungsdatenmanagement, vol. 1, 2022, doi: 10.17192/BFDM.2022.1.8412.
    3. S. Hermann and J. Fehr, “Documenting research software in engineering science,” Scientific Reports, vol. 12, no. 1, 2022.
    4. M. Gültig, J. P. Range, B. Schmitz, and J. Pleiss, “Integration of Simulated and Experimentally Determined Thermophysical Properties of Aqueous Mixtures by ThermoML,” Journal of Chemical & Engineering Data, vol. 67, no. 11, 2022, doi: 10.1021/acs.jced.2c00391.
  4. 2020

    1. X. Xu, J. Range, G. Gygli, and J. Pleiss, “Analysis of Thermophysical Properties of Deep Eutectic Solvents by Data Integration,” Journal of Chemical & Engineering Data, vol. 65, pp. 1172–1179, 2020, doi: 10.1021/acs.jced.9b00555.
    2. B. Flemisch et al., “Umgang mit Forschungssoftware an der Universität Stuttgart,” Universität Stuttgart, 2020. doi: 10.18419/OPUS-11178.

Empowering researchers

RDM at SimTech offers the opportunity to initiate data and software management projects in close collaboration with researchers. We are part of a large research data network, contribute to various working groups, and offer workshops and seminars. The team is also closely involved in FoKUS, the Competence Center for Research Data Management of the University of Stuttgart, to support researchers in publishing data.

Wouldn’t it be great if we would be able to combine any dataset with any other dataset we would want to?

European Commission, Reproducibility of scientific results in the EU

SimTech RDM committee

Research Data Management Team

 

Pfaffenwaldring 5a, 70569 Stuttgart
