Abstract
Multi-model data stores combine the benefits of different data stores to ensure optimal storage for various datasets.
To adequately connect data in a multi-model scenario, e.g., to join data across different data models and to evolve multi-model databases consistently, inclusion dependencies (INDs) must be known.
INDs are database integrity constraints that define subset relationships between attributes or attribute sets.
A classic use case for INDs is the definition of foreign keys in relational databases, i.e., single stores.
In multi-model systems, we cannot assume that all INDs are available; intra-model constraints in particular are often not defined in advance and have to be extracted from the data.
In this article, we present a data-driven reverse engineering approach for finding intra- and inter-model linkages between relational, document, and graph databases, and show a proof of concept. To do so, we present a flexible framework that can integrate existing algorithms as well as user-defined ones. The meta level at which data dependencies are detected is represented by a relational database, as most existing IND detection algorithms are based on relational data. Besides integrating the existing algorithms HOPF and IRIS-DS, we implemented a Hybrid approach that combines elements of these existing methods with a classical bottom-up IND search. The evaluation of all four algorithms available in our framework confirmed that the classical bottom-up IND search outperforms all other algorithms, which indicates a need for dedicated multi-model algorithms derived from classical IND search methods.
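
To make the notion of a unary IND concrete, the following is a minimal sketch of a naive pairwise subset check, assuming that the values of each relational column, document field, and graph property have already been mapped into one flat, relational-style representation. It is not HOPF, IRIS-DS, or the Hybrid approach; the function name `unary_inds`, the qualified column names, and the toy data are hypothetical and serve only as illustration.

```python
from itertools import permutations


def unary_inds(columns):
    """Return all pairs (dep, ref) where values(dep) is a subset of values(ref).

    `columns` maps a (hypothetical) qualified column name to its values.
    Nulls (None) are ignored, and empty columns are skipped because they
    would be trivially included in every other column.
    """
    value_sets = {
        name: {v for v in values if v is not None}
        for name, values in columns.items()
    }
    return [
        (dep, ref)
        for dep, ref in permutations(value_sets, 2)
        if value_sets[dep] and value_sets[dep] <= value_sets[ref]
    ]


# Hypothetical toy data: a relational column, a document field, a graph property.
columns = {
    "rel.customer.id": [1, 2, 3, 4],
    "doc.orders.customer_id": [1, 1, 3],
    "graph.Person.id": [1, 2, 3, 4, 5],
}

inds = unary_inds(columns)
for dep, ref in inds:
    print(f"{dep} is included in {ref}")
```

On this toy input the check reports, for example, that `doc.orders.customer_id` is included in `rel.customer.id`, an inter-model IND that could serve as a join candidate. The pairwise comparison is quadratic in the number of columns, which is one reason dedicated discovery algorithms exist.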