Core DTE Modules

yProv

Description

An open-source service to support provenance management within scientific workflows.

yProv builds on the W3C PROV family of standards, exposes a RESTful interface, and uses a Neo4j graph database as its back end. The web service is implemented in Python with the Flask micro-framework, which is itself built on the Jinja2 template engine and the Werkzeug WSGI toolkit. The service is domain-agnostic, though its primary case studies in the project come from the climate change domain (i.e., climate analytics workflows). It aims to implement the micro-provenance concept, which lets users navigate the provenance space along different dimensions (e.g., horizontal and vertical).
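To make the W3C PROV data model more concrete, the following is a minimal sketch of a provenance document in the PROV-JSON layout, built with plain Python dictionaries, together with a small helper that walks one step back along the lineage of a dataset. All identifiers (the `ex:` prefix, `ex:input_dataset`, `ex:analysis_run`, and so on) and both function names are illustrative assumptions, not names used by yProv, and the sketch does not reflect how yProv stores or queries provenance internally.

```python
import json


def build_prov_document():
    """Return a minimal PROV-JSON style document as a plain dict.

    Every identifier below is hypothetical and chosen for illustration only.
    """
    return {
        "prefix": {"ex": "http://example.org/"},
        "entity": {
            "ex:input_dataset": {"prov:label": "input dataset"},
            "ex:output_dataset": {"prov:label": "output dataset"},
        },
        "activity": {
            "ex:analysis_run": {"prov:label": "analytics workflow run"},
        },
        "agent": {
            "ex:researcher": {"prov:label": "scientific user"},
        },
        # The activity consumed the input dataset ...
        "used": {
            "_:u1": {
                "prov:activity": "ex:analysis_run",
                "prov:entity": "ex:input_dataset",
            }
        },
        # ... and produced the output dataset.
        "wasGeneratedBy": {
            "_:g1": {
                "prov:entity": "ex:output_dataset",
                "prov:activity": "ex:analysis_run",
            }
        },
        "wasAssociatedWith": {
            "_:a1": {
                "prov:activity": "ex:analysis_run",
                "prov:agent": "ex:researcher",
            }
        },
    }


def upstream_entities(doc, entity_id):
    """Return the entities used by the activities that generated entity_id."""
    generating = {
        rel["prov:activity"]
        for rel in doc.get("wasGeneratedBy", {}).values()
        if rel["prov:entity"] == entity_id
    }
    return sorted(
        rel["prov:entity"]
        for rel in doc.get("used", {}).values()
        if rel["prov:activity"] in generating
    )


if __name__ == "__main__":
    document = build_prov_document()
    print(json.dumps(document, indent=2))
    print(upstream_entities(document, "ex:output_dataset"))
```

A graph database is a natural fit for this kind of document because entities, activities, and agents map to nodes, while relations such as `used` and `wasGeneratedBy` map to edges that can be traversed in either direction.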

Users can use the yProv service to manage (i.e., store, retrieve, explore, and visualise) the provenance information associated with scientific datasets, and thereby gain a better understanding of those datasets. The value proposition is (i) stronger traceability, transparency, and trust through a richer set of metadata, and (ii) multi-level exploration and navigation of provenance metadata.

Release Notes

yProv has been adopted in interTwin to implement provenance support within scientific workflows, starting from case studies identified in the environmental domain (i.e., climate data analytics workflows). As this is a new effort, the first release is still pre-operational, though it already provides a preliminary set of core functionalities.

Future Plans

yProv will evolve during interTwin to accommodate additional requirements. A cloud-enabled, container-based version of the service will be implemented. Multi-dimensional support will also be integrated to enable the vertical exploration of provenance within complex scientific workflows, thus fully implementing the micro-provenance concept. Finally, a long-running service instance will be established at UNITN as a reference service for the community.

Target Audience

Users of the Component: Scientific users, both producers and consumers of datasets. End users can interact with the yProv RESTful API to manage the provenance information through CRUD (create, read, update, delete) operations.
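As a rough illustration of CRUD-style interaction with a RESTful provenance service, the sketch below builds (but does not send) one HTTP request per operation using only the Python standard library. The base URL, the `/documents/<id>` path, and the mapping of operations to HTTP methods are assumptions made for this example and are not taken from the yProv API documentation; consult the actual API reference for the real endpoints.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with the address of a real service instance.
BASE_URL = "http://localhost:3000/api"


def build_request(method, doc_id, payload=None):
    """Build an HTTP request for a provenance document without sending it.

    The '/documents/<doc_id>' path is an assumption made for illustration.
    """
    url = f"{BASE_URL}/documents/{doc_id}"
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def crud_requests(doc_id, payload):
    """Return one prepared request per CRUD operation (assumed method mapping)."""
    return {
        "create": build_request("POST", doc_id, payload),
        "read": build_request("GET", doc_id),
        "update": build_request("PUT", doc_id, payload),
        "delete": build_request("DELETE", doc_id),
    }


if __name__ == "__main__":
    requests = crud_requests("example-doc", {"entity": {}, "activity": {}})
    for name, request in requests.items():
        print(name, request.get_method(), request.full_url)
    # To actually send a request against a running service:
    #     with urllib.request.urlopen(requests["read"]) as response:
    #         print(response.status)
```

Keeping request construction separate from sending makes the operation-to-method mapping easy to inspect and adjust once the real endpoints are known.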

License

GPLv3

Created by