
Klettke, Meike ; Lutsch, Adrian ; Störl, Uta

Kurz erklärt: Measuring Data Changes in Data Engineering and their Impact on Explainability and Algorithm Fairness

Klettke, Meike, Lutsch, Adrian and Störl, Uta (2021) Kurz erklärt: Measuring Data Changes in Data Engineering and their Impact on Explainability and Algorithm Fairness. Datenbank-Spektrum 21 (3), pp. 245-249.

Publication date of this full text: 13 Aug 2025 06:57
Article
DOI for citing this document: 10.5283/epub.77290


Abstract

Data engineering is an integral part of any data science and ML process. It consists of several subtasks that are performed to improve data quality and to transform data into a target format suitable for analysis. The quality and correctness of the data engineering steps are therefore important to ensure the quality of the overall process.
In machine learning processes, requirements such as fairness and explainability are essential. The data engineering subtasks must also meet these requirements. In this article, we show how this can be achieved by logging, monitoring and controlling the data changes in order to evaluate their correctness. Since data preprocessing algorithms are part of any machine learning pipeline, they must also guarantee that they do not introduce data biases.
In this article, we briefly introduce three classes of methods for measuring data changes in data engineering and outline which research questions remain unanswered in this area.



Details

Document type: Article
Journal or periodical title: Datenbank-Spektrum
Publisher: Springer Nature
Volume: 21
Issue or chapter number: 3
Page range: pp. 245-249
Date: October 2021
Institutions: Informatik und Data Science > Allgemeine Informatik > Data Engineering (Prof. Dr.-Ing. Meike Klettke)
Identification number: 10.1007/s13222-021-00392-w (DOI)
Keywords: Data engineering pipelines · Reliability · Explainability · Data bias · Degree of data changes
Dewey Decimal Classification: 000 Computer science, information science, general works > 004 Computer science
Status: Published
Peer reviewed: Yes, this version has been peer reviewed
Created at the University of Regensburg: No
URN of the University Library of Regensburg: urn:nbn:de:bvb:355-epub-772900
Document ID: 77290
