Hagn, Michael; Heinrich, Bernd; Krapf, Thomas; Schiller, Alexander

Handling imperfection: A taxonomy for machine learning on data with data quality defects

Hagn, Michael, Heinrich, Bernd, Krapf, Thomas and Schiller, Alexander (2025) Handling imperfection: A taxonomy for machine learning on data with data quality defects. Decision Support Systems 196, p. 114493.

Publication date of this full text: 24 Jun 2025 05:57
Article
DOI for citing this document: 10.5283/epub.76908

Published version
Download (PDF | 2MB)

License: Creative Commons Attribution 4.0 International

Abstract

In recent years, machine learning (ML) has become ubiquitous in sectors including transportation, security, health, and finance to analyze large amounts of data and support decision-making. However, real-world datasets used in ML often exhibit various data quality (DQ) defects that can significantly impair the performance and validity of ML models and thus also the decisions derived from them. Therefore, a plethora of methods across various research strands have been proposed to address DQ defects and mitigate their negative impact on ML-based data analysis and decision support. This has resulted in a fragmented research landscape, where comparisons and classifications of methods dealing with ML on data with DQ defects are very challenging for both researchers and practitioners. Thus, based on a structured design process, we develop and present a taxonomy for this research field. The taxonomy serves as a systematic framework to classify and organize existing research and methods according to relevant dimensions and facilitates future work in this area. Its reliability, understandability, completeness, and usefulness are supported by an evaluation with external researchers and practitioners. Finally, we identify current trends and research gaps and derive challenges and directions for future research.

Participating institutions

Wirtschaftswissenschaften > Institut für Wirtschaftsinformatik > Lehrstuhl für Wirtschaftsinformatik II (Prof. Dr. Bernd Heinrich)
Informatik und Data Science > Fachbereich Wirtschaftsinformatik > Lehrstuhl für Wirtschaftsinformatik II (Prof. Dr. Bernd Heinrich)
Details

Document type: Article

Journal or periodical title: Decision Support Systems

Publisher: Elsevier

Open access type: DEAL (Elsevier)

Volume: 196

Page range: p. 114493

Date: 16 June 2025

Projects

Datenqualität bei textuellen, Nutzer-generierten Inhalten (Data quality in textual, user-generated content)

Funded by: Deutsche Forschungsgemeinschaft (DFG) (494840328)

Identification number

Value	Type
10.1016/j.dss.2025.114493	DOI

Keywords: Taxonomy, Machine learning, Data quality, Data uncertainty

Dewey Decimal Classification: 000 Computer science, information and general works > 004 Computer science

Status: Published

Peer-reviewed: Yes, this version has been peer-reviewed

Created at the University of Regensburg

URN of the UB Regensburg: urn:nbn:de:bvb:355-epub-769086

Document ID: 76908
