Hagn, Michael; Heinrich, Bernd; Krapf, Thomas; Schiller, Alexander

Handling imperfection: A taxonomy for machine learning on data with data quality defects

Hagn, Michael, Heinrich, Bernd, Krapf, Thomas and Schiller, Alexander (2025) Handling imperfection: A taxonomy for machine learning on data with data quality defects. Decision Support Systems 196, p. 114493.

Publication date of this full text: 24 Jun 2025 05:57
Article
DOI for citing this document: 10.5283/epub.76908


Abstract

In recent years, machine learning (ML) has become ubiquitous in sectors including transportation, security, health, and finance to analyze large amounts of data and support decision-making. However, real-world datasets used in ML often exhibit various data quality (DQ) defects that can significantly impair the performance and validity of ML models and thus also the decisions derived from them. Therefore, a plethora of methods across various research strands have been proposed to address DQ defects and mitigate their negative impact on ML-based data analysis and decision support. This has resulted in a fragmented research landscape, where comparisons and classifications of methods dealing with ML on data with DQ defects are very challenging for both researchers and practitioners. Thus, based on a structured design process, we develop and present a taxonomy for this research field. The taxonomy serves as a systematic framework to classify and organize existing research and methods according to relevant dimensions and facilitates future work in this area. Its reliability, understandability, completeness, and usefulness are supported by an evaluation with external researchers and practitioners. Finally, we identify current trends and research gaps and derive challenges and directions for future research.




Details

Document type: Article
Journal or periodical title: Decision Support Systems
Publisher: Elsevier
Volume: 196
Page range: p. 114493
Date: 16 June 2025
Institutions: Wirtschaftswissenschaften > Institut für Wirtschaftsinformatik > Lehrstuhl für Wirtschaftsinformatik II (Prof. Dr. Bernd Heinrich); Informatik und Data Science > Fachbereich Wirtschaftsinformatik > Lehrstuhl für Wirtschaftsinformatik II (Prof. Dr. Bernd Heinrich)
Projects: Funded by: Deutsche Forschungsgemeinschaft (DFG) (494840328)
Identification number: 10.1016/j.dss.2025.114493 (type: DOI)
Keywords: Taxonomy, Machine learning, Data quality, Data uncertainty
Dewey Decimal Classification: 000 Computer science, information science, general works > 004 Computer science
Status: Published
Peer-reviewed: Yes, this version has been peer-reviewed
Created at the University of Regensburg: Yes
URN of the UB Regensburg: urn:nbn:de:bvb:355-epub-769086
Document ID: 76908
