The Research Data Management team at SimTech initiates data and software management projects in close collaboration with researchers to enable individual RDM solutions as well as integration with DaRUS. Active and finished projects are listed below.
Logging simulation meta-data remains a task the user has to take care of. Hence, log files vary in structure as well as depth, and the meta-data remains unstructured. On top of that, there is no efficient way to seamlessly deploy such meta-data from HPC to the database. Meanwhile, the machine learning community has already solved the problem of tracking model development and model deployment. The software MLFlow handles these tasks by combining input parameters, quality assessment metrics and the model itself in a well-designed web GUI, while providing many interfaces for interaction.
SimFlow is simulation science's answer to the problem of standardized logging. The software builds upon MLFlow and extends its capabilities to automatically track simulation input parameters as well as analysis results such as metrics, which are displayed in a web GUI. Simulation runs are grouped within an experiment, which provides an overview of
- Ongoing simulations
- Finished runs
- Analysis results
Furthermore, the GUI allows simple as well as conditional sorting of all runs, which can then be compared and visualized by parameters and metrics. In addition, SimFlow offers a seamless integration with DaRUS, the Dataverse installation of the University of Stuttgart, completing the data workflow from generation to deployment.
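The kind of overview described above can be illustrated with a minimal sketch in plain Python. This is not SimFlow's or MLFlow's API: the `Run` class, the `select_runs` helper, and all parameter names and values are hypothetical and made up for illustration. The sketch only shows how runs that carry structured parameters and metrics can be filtered by a condition and then sorted by a metric.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical stand-in for one tracked simulation run."""
    name: str
    status: str  # "RUNNING" or "FINISHED"
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


def select_runs(runs, condition, sort_key, descending=False):
    """Keep the runs that satisfy `condition`, then order them by `sort_key`."""
    return sorted((r for r in runs if condition(r)), key=sort_key, reverse=descending)


# Made-up example data: three finished runs and one ongoing simulation.
runs = [
    Run("run-a", "FINISHED", {"reynolds": 1000, "grid": 64}, {"l2_error": 0.031}),
    Run("run-b", "FINISHED", {"reynolds": 5000, "grid": 128}, {"l2_error": 0.012}),
    Run("run-c", "RUNNING", {"reynolds": 5000, "grid": 256}, {}),
    Run("run-d", "FINISHED", {"reynolds": 5000, "grid": 64}, {"l2_error": 0.048}),
]

# Conditional sorting: finished runs at one parameter value, lowest error first.
best_first = select_runs(
    runs,
    condition=lambda r: r.status == "FINISHED" and r.params["reynolds"] == 5000,
    sort_key=lambda r: r.metrics["l2_error"],
)
ongoing = [r.name for r in runs if r.status == "RUNNING"]

print([r.name for r in best_first])
print(ongoing)
```

Because every run stores its parameters and metrics as key-value pairs rather than free-form log text, the same filter and sort logic works across all runs of an experiment.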
- Dennis Gläser - IWS Stuttgart
- Benjamin Schmitz - IBTB Stuttgart
For reliable simulation of complex flow problems, comparative data from well-designed, simplified benchmark problems are essential for validating and verifying both the simulation code and the underlying models. Only in this way can the simulation of complex flows, as shown in Figure 1, be trusted at all.
Benchmark simulations and associated databases do exist for very basic flow problems. These are often well maintained, but they are far removed from real applications and most of them are dated. Beyond their use as validation data, there is also a strong demand for such data to train ML models, driven by the intense research activity in machine learning.
In this PN-1 project, benchmark problems for turbulent flows and their modeling are designed and carried out using FLEXI. In close collaboration with the Institute of Aerodynamics and Gas Dynamics, the SimTech RDM team develops solutions to efficiently manage the large volumes of data generated by CFD simulations. The goal is to provide a comprehensive yet easily accessible resource for validating simulation runs as well as for training ML models.
Deep eutectic solvents (DES) are promising solvents in biocatalysis, because they are designable, renewable, biodegradable, and cheap, and thus are an alternative to organic solvents for enzyme-catalyzed reactions that involve hydrophobic substrates. Their thermophysical properties (density, viscosity, thermodynamic activities) are obtained by molecular simulation and by experiment and are stored in the standardized data exchange format ThermoML.
The goal of the project is to establish an API as well as a REST interface for writing ThermoML documents, either by converting existing CML documents or by reading from modelling tools. Next, the data model will be mapped onto DaRUS to make use of the federated database system to share ThermoML across multiple installations, and to give researchers from various fields the opportunity to combine their metadata with ThermoML.
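As a rough illustration of what writing such a document programmatically could look like, the following sketch serializes a list of compounds to XML with Python's standard library. It is not the project's API: the function name `compounds_to_xml` and the input record layout are hypothetical. The element names (`DataReport`, `Compound`, `RegNum`, `nOrgNum`, `sCommonName`, `sFormulaMolec`) are believed to match the ThermoML schema but should be checked against it, and the namespace, citation and property data are omitted, so the output is a simplified fragment rather than a schema-valid ThermoML file.

```python
import xml.etree.ElementTree as ET


def compounds_to_xml(records):
    """Serialize compound records to a simplified ThermoML-like XML string.

    `records` is a hypothetical input format: a list of dicts with the
    keys "name" and "formula".
    """
    root = ET.Element("DataReport")
    for number, record in enumerate(records, start=1):
        compound = ET.SubElement(root, "Compound")
        # Each compound gets a running number that data sets could refer to.
        reg_num = ET.SubElement(compound, "RegNum")
        ET.SubElement(reg_num, "nOrgNum").text = str(number)
        ET.SubElement(compound, "sCommonName").text = record["name"]
        ET.SubElement(compound, "sFormulaMolec").text = record["formula"]
    return ET.tostring(root, encoding="unicode")


# Example input: the two components of a common deep eutectic solvent.
document = compounds_to_xml([
    {"name": "choline chloride", "formula": "C5H14ClNO"},
    {"name": "urea", "formula": "CH4N2O"},
])
print(document)
```

A converter from CML or a modelling tool would only need to produce the intermediate records, which keeps the reading side and the ThermoML writing side independent of each other.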