| Item type: | Article |
|---|---|
| Journal or Publication Title: | Metabolomics |
| Publisher: | SPRINGER |
| Place of Publication: | NEW YORK |
| Volume: | 8 |
| Number of Issue or Book Chapter: | Suppl |
| Page Range: | pp. 146-160 |
| Date: | June 2012 |
| Institutions: | Medicine > Institut für Funktionelle Genomik > Lehrstuhl für Funktionelle Genomik (Prof. Oefner); Medicine > Institut für Funktionelle Genomik > Lehrstuhl für Statistische Bioinformatik (Prof. Spang); Informatics and Data Science > Department Computational Life Science > Lehrstuhl für Statistische Bioinformatik (Prof. Spang) |
| Identification Number: | |
| Keywords: | DATA SETS; SPECTROSCOPY; METABONOMICS; PROTEOMICS; URINE; ALIGNMENT; VARIANCE; SPECTRA; DISEASE; BIAS; Metabolomics; NMR; Data normalization; Preprocessing; Classification |
| Dewey Decimal Classification: | 600 Technology > 610 Medical sciences Medicine; 500 Science > 570 Life sciences |
| Status: | Published |
| Refereed: | Yes, this version has been refereed |
| Created at the University of Regensburg: | Yes |
| Item ID: | 30587 |

Abstract
Extracting biomedical information from large metabolomic datasets by multivariate data analysis is of considerable complexity. Common challenges include, among others, screening for differentially produced metabolites, estimation of fold changes, and sample classification. Prior to these analysis steps, it is important to minimize contributions from unwanted biases and experimental variance. This is the goal of data preprocessing. In this work, different data normalization methods were compared systematically using two different datasets generated by means of nuclear magnetic resonance (NMR) spectroscopy. To this end, two different types of normalization methods were used: one aims to remove unwanted sample-to-sample variation, while the other adjusts the variance of the different metabolites by variable scaling and variance stabilization methods. The impact of all methods tested on sample classification was evaluated on urinary NMR fingerprints obtained from healthy volunteers and patients suffering from autosomal dominant polycystic kidney disease (ADPKD). Performance in terms of screening for differentially produced metabolites was investigated on a dataset following a Latin-square design, where varied amounts of 8 different metabolites were spiked into a human urine matrix while keeping the total spike-in amount constant. In addition, specific tests were conducted to systematically investigate the influence of the different preprocessing methods on the structure of the analyzed data. In conclusion, preprocessing methods originally developed for DNA microarray analysis, in particular Quantile and Cubic-Spline Normalization, performed best in reducing bias, accurately detecting fold changes, and classifying samples.
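
For readers unfamiliar with Quantile Normalization, the general idea can be sketched as follows. This is a minimal illustration of the textbook technique, not the implementation used in the article: the function name `quantile_normalize`, the list-of-lists input layout (one intensity vector per sample), and the simple handling of ties (broken by feature order) are all assumptions made for this example.

```python
def quantile_normalize(samples):
    """Force every sample to share the same intensity distribution.

    `samples` is a list of equal-length lists, one per sample
    (hypothetical layout chosen for this sketch).
    """
    n_features = len(samples[0])
    if any(len(s) != n_features for s in samples):
        raise ValueError("all samples must have the same number of features")

    # Sort each sample, then average across samples at every rank.
    # The result is the shared reference distribution.
    sorted_samples = [sorted(s) for s in samples]
    reference = [sum(col) / len(samples) for col in zip(*sorted_samples)]

    normalized = []
    for s in samples:
        # order[rank] is the index of the feature holding that rank.
        # Ties are broken by feature order (a simplification).
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


spectra = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(spectra)
```

After normalization, each sample contains the same set of values; only the assignment of those values to features differs, which preserves the within-sample ranking.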
Metadata last modified: 29 Sep 2021 07:40
