Abstract
Good scientific work requires comprehensible, transparent and reproducible research. One way to ensure this is to publish all data relevant to a study or evaluation together with the article. These data should be at least aggregated or anonymized, at best compact and complete, but always resilient.
In this paper we present ProSA, a system for calculating the minimal necessary data set, called the sub-database. To this end, we combine the Chase, a set of algorithms for transforming databases, with additional provenance information. We present the implementation of provenance, guided by the ProSA pipeline, and show how it is used to generate an optimized sub-database. Further, we demonstrate what the ProSA GUI looks like and present some applications and extensions.