Jan Range starts new working group to revamp pyDataverse

January 15, 2024

Jan Range, Research Data Software Engineer in SimTech and PhD Researcher at the Institute of Biochemistry and Technical Biochemistry, will start the new “pyDataverse Working Group”. He is leading the initiative, which aims to update, expand, and improve the pyDataverse software.

pyDataverse is a well-known library and repository within the Global Dataverse Community Consortium (GDCC) that provides a broad range of features for interacting with Dataverse. It allows users to upload and download datasets and lets admins configure Dataverse, making it a one-stop shop for Datanauts. However, because its development and maintenance have recently stalled, some functionality has become outdated or even broken. Moreover, other libraries such as EasyDataverse and DVCLI have since been introduced; they offer more options but also create confusion about which tool to use.

To address these concerns, the pyDataverse Re-Vamp initiative and working group were launched in late 2023. The working group aims to restore the library to a functional state, resolve the issues that hinder its use, and merge contributions that have gone stale. In addition, the group plans to incorporate concepts from other client libraries such as EasyDataverse and DVCLI into pyDataverse, both to enhance its usability and to focus efforts on a single Python library. The initiative also envisions new libraries and concepts that enable large language models to interact with Dataverse, for example by populating metadata from text.

Finally, to address the maintenance gaps common in open-source projects, the Re-Vamp initiative proposes to incorporate upstream changes to the native API dynamically, using OpenAPI specifications and code generation. Ultimately, the initiative seeks to re-establish pyDataverse as an invaluable tool for Datanauts exploring the vast depths of the Dataverse. Bi-weekly meetings are planned.
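To illustrate the idea of deriving a client from an OpenAPI specification, here is a minimal sketch in Python. It is not the working group's implementation: the trimmed `SPEC` fragment, the operation names, the example URL, and the helpers `build_client` and `_make_operation` are all assumptions made up for this illustration. It builds functions at runtime rather than generating source files, but the principle is the same: when the specification changes, the client follows without hand-written updates.

```python
import re

# Hypothetical, heavily trimmed OpenAPI-style fragment for illustration only;
# this is NOT the actual Dataverse specification.
SPEC = {
    "paths": {
        "/api/datasets/{id}": {
            "get": {"operationId": "get_dataset"},
        },
        "/api/collections/{alias}/contents": {
            "get": {"operationId": "list_collection_contents"},
        },
    }
}


def _make_operation(base_url, method, path, params):
    """Return a function that builds the (HTTP method, URL) pair for one operation."""

    def operation(**kwargs):
        # Refuse to build a URL if a path parameter is missing.
        missing = [p for p in params if p not in kwargs]
        if missing:
            raise TypeError(f"missing path parameters: {missing}")
        filled = path.format(**{p: kwargs[p] for p in params})
        return method.upper(), base_url.rstrip("/") + filled

    return operation


def build_client(spec, base_url):
    """Create one request-builder function per operation found in the spec."""
    client = {}
    for path, methods in spec["paths"].items():
        # Path parameters appear in curly braces, e.g. {id}.
        params = re.findall(r"\{(\w+)\}", path)
        for method, op in methods.items():
            client[op["operationId"]] = _make_operation(base_url, method, path, params)
    return client


# Usage: the client exposes whatever operations the spec declares.
client = build_client(SPEC, "https://dataverse.example.org/")
print(client["get_dataset"](id=42))
```

A real generator would also handle query parameters, request bodies, and authentication; the sketch only shows how operations declared in a specification can become callable functions.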

Contact

Jan Range

Research Data Software Engineer
