New bridges between research data and AI: SimTech presents MCP server at GAMM and deRSE

March 24, 2026 / ts

How can researchers identify the right mathematical model for a specific research problem without first having to work their way through large volumes of specialist literature? A new tool developed at SimTech addresses exactly this challenge: a Model Context Protocol (MCP) server that makes research data from the DaRUS repository accessible to AI language models. Jan Range, Research Data Software Engineer at SimTech, presented the tool at two specialist events.

At both the GAMM Annual Meeting and the first Stuttgart Research Software Engineer Day (deRSE), the presentation attracted considerable interest. The feedback was clear: the relevance of the tool was immediately recognized, including by Simone Rehm, CIO of the University of Stuttgart.

Embedded in SimTech’s long-term RDM strategy

The new tool is not a standalone project, but the latest step in a development that SimTech has been pursuing for years. The overall goal is to make research data and software available for every published article, to strengthen reproducibility in science, and to support transfer into practice.

This work is grounded in the FAIR principles: research data should be findable, accessible, interoperable, and reusable. With the DaRUS research data repository, its involvement in the National Research Data Infrastructure (NFDI), and active participation in the consortia MaRDI (Mathematical Research Data Initiative) and NFDI-MatWerk, SimTech has helped embed this work institutionally. A further key component is the MathModDB ontology, a structured knowledge graph for mathematical models that is also used by the MCP server. Prof. Dr. Dominik Göddeke, who is also involved in MaRDI, and his research group made major contributions to its development.

Jan Range is no newcomer to this ecosystem. He has worked at SimTech as a doctoral researcher and Research Data Software Engineer since 2021, initially contributing to the development of EasyDataverse and PyDataverse, two widely used open-source libraries for programmatic work with Dataverse repositories. PyDataverse is maintained and further developed in cooperation with the Institute for Quantitative Social Science (IQSS) at Harvard University. With the new MCP server, Range builds directly on this foundation and extends it into the field of AI.

What is new?

The MCP server acts as an interface between research data and AI language models such as Claude. Built on PyDataverse, it uses the SPARQL query language to search the growing MathModDB efficiently. In doing so, it extracts only the relevant structures from the knowledge graph, helping to reduce the risk of inaccurate or misleading responses from the AI model, often referred to as “hallucinations.”
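To give a rough idea of what a targeted knowledge-graph lookup can look like, here is a minimal sketch in Python. It is not taken from the SimTech server: the function name is hypothetical, and the query relies only on the standard rdfs:label property rather than on the actual MathModDB vocabulary, which is not described here. The point it illustrates is that a narrow SPARQL query returns a handful of matching entries instead of the whole graph.

```python
def build_label_search_query(keyword: str, limit: int = 10) -> str:
    """Build a SPARQL query for resources whose rdfs:label contains `keyword`.

    Hypothetical helper for illustration only; the real server's queries
    and the MathModDB class and property names may look quite different.
    """
    # Escape backslashes and double quotes so the keyword stays a valid
    # SPARQL string literal.
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?label WHERE {\n"
        "  ?item rdfs:label ?label .\n"
        # Case-insensitive substring match on the label text.
        f'  FILTER(CONTAINS(LCASE(STR(?label)), LCASE("{escaped}")))\n'
        "}\n"
        # LIMIT keeps the result small enough to hand to a language model.
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_label_search_query("heat equation", limit=5))
```

The resulting string could then be sent to whichever SPARQL endpoint hosts the knowledge graph; the endpoint address is deliberately left out because it is not given in this article.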

In addition, Range developed a docking tool that works like a translator, making existing code accessible to AI language models. The long-term vision is a chat button integrated into the DaRUS interface, enabling researchers to query datasets directly in natural language, without programming skills and without advanced training in mathematics.

The Dataverse MCP server was deliberately designed as a universal solution. It is not tied to any specific AI model and can be used with any Dataverse instance. All components are openly available.
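What "any Dataverse instance" means in practice can be sketched in a few lines of Python. The example assumes the standard Dataverse Search API path (/api/search with the q, type, and per_page parameters) and uses the public DaRUS address as an example base URL; the function name is hypothetical and the code is not part of the SimTech server. Only the base URL changes from one installation to the next.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, per_page: int = 10) -> str:
    """Build a dataset search URL for a Dataverse installation.

    Hypothetical helper for illustration; it assumes the standard
    Dataverse Search API endpoint and parameters.
    """
    params = {"q": query, "type": "dataset", "per_page": per_page}
    # Strip a trailing slash so the path is joined with exactly one "/".
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


if __name__ == "__main__":
    # The same function works for different installations.
    print(build_search_url("https://darus.uni-stuttgart.de", "heat equation", 5))
    print(build_search_url("https://demo.dataverse.org", "heat equation", 5))
```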

Why does this matter?

SimTech’s interdisciplinary profile makes the tool especially valuable:

“SimTech thrives on collaboration across disciplinary boundaries, but that strength also comes with a challenge: not everyone can engage deeply with the methods and data of other fields. With the MCP server, we want to change that. Anyone who wants to know which mathematical model fits their research problem, or how colleagues in other disciplines have worked with similar data, should in future be able to simply ask. We are building bridges where none existed before.”
Jan Range, Research Data Software Engineer, SimTech

Three key benefits

Efficiency: Fast and precise querying of large volumes of data from MathModDB and DaRUS, without requiring programming skills.

Inclusivity: Researchers without a mathematical background gain equal access to complex research data. In the future, questions about suitable models for individual research problems could be answered via chat across disciplinary boundaries.

Knowledge integration: Multiple MCP servers can be combined, up-to-date data can be integrated, and hallucinations by AI models can be reduced.

AI as part of research infrastructure is still not receiving enough attention in the German research landscape. SimTech is helping to move this discussion forward, not only within MaRDI but potentially across the entire NFDI, which also builds on knowledge graphs. The guiding idea is to look beyond classical infrastructure alone and consider the role AI can play within it, while making both infrastructure and AI-based access more widely available.

Try it out

The MCP server is freely available as a working prototype and is under active development. The next step is to host it at the University of Stuttgart so that other researchers and institutions can use the tool directly.

Anyone interested in seeing what such queries look like in practice can explore three example chats:
