Abstract
We present a novel, cost-efficient two-phase design for predictive clinical gene expression studies: early marker panel determination (EMPD). In Phase-1, genome-wide microarrays are used for only a small number of individual patient samples, and a panel of marker genes is derived from these data. In Phase-2, the expression values of the panel genes are measured for a large group of patients, and a predictive classification model is learned from these measurements. Because Phase-2 does not require expensive whole-genome microarrays, EMPD is a cost-efficient alternative for current trials. We quantify the expected performance loss of EMPD relative to designs that use genome-wide microarrays for all patients, and we examine the trade-off between the number of patients included in Phase-1 and the number of marker genes required in Phase-2. Analyzing five published datasets, we find that as few as 16 patients per group in Phase-1 suffice to determine a suitable marker panel of 10 genes, and that this early decision compromises the final performance only marginally.
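The two-phase workflow can be illustrated with a minimal sketch on synthetic data. Everything beyond the design itself is an assumption made for illustration and is not taken from the study: the Welch t-statistic ranking used to choose the panel, the nearest-centroid classifier, the simulated Gaussian expression values, and all names and constants (`select_panel`, `fit_centroids`, `N_GENES`, `SHIFT`, and so on). Only the figures of 16 Phase-1 patients per group and a 10-gene panel come from the abstract.

```python
import random
import statistics


def welch_t(a, b):
    """Absolute Welch t-statistic for one gene measured in two groups."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def select_panel(group0, group1, panel_size):
    """Phase-1: rank all genes on genome-wide profiles, keep the top ones."""
    n_genes = len(group0[0])
    scores = [
        welch_t([p[g] for p in group0], [p[g] for p in group1])
        for g in range(n_genes)
    ]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:panel_size]


def fit_centroids(group0, group1, panel):
    """Phase-2: learn one centroid per class using only the panel genes."""
    return [
        [statistics.mean(p[g] for p in group) for g in panel]
        for group in (group0, group1)
    ]


def predict(centroids, panel, patient):
    """Assign the class whose centroid is nearest in squared distance."""
    x = [patient[g] for g in panel]
    dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids]
    return dists.index(min(dists))


def simulate(n, label, n_genes, informative, shift, rng):
    """Hypothetical data: informative genes are shifted in class 1."""
    return [
        [
            rng.gauss(shift if (label == 1 and g in informative) else 0.0, 1.0)
            for g in range(n_genes)
        ]
        for _ in range(n)
    ]


rng = random.Random(0)
N_GENES, INFORMATIVE, SHIFT = 500, set(range(20)), 2.0  # illustrative values

# Phase-1: genome-wide profiles for 16 patients per group.
p1 = [simulate(16, lab, N_GENES, INFORMATIVE, SHIFT, rng) for lab in (0, 1)]
panel = select_panel(p1[0], p1[1], panel_size=10)

# Phase-2: a larger cohort. Full profiles are simulated here for convenience,
# but only the panel genes are ever read, mimicking a panel-only assay.
p2 = [simulate(100, lab, N_GENES, INFORMATIVE, SHIFT, rng) for lab in (0, 1)]
centroids = fit_centroids(p2[0], p2[1], panel)

# Evaluate on fresh simulated patients.
test = [(pt, lab) for lab in (0, 1)
        for pt in simulate(100, lab, N_GENES, INFORMATIVE, SHIFT, rng)]
accuracy = sum(predict(centroids, panel, pt) == lab for pt, lab in test) / len(test)
print(f"panel genes: {sorted(panel)}")
print(f"held-out accuracy: {accuracy:.3f}")
```

Varying the Phase-1 group size and `panel_size` in this sketch gives a rough feel for the trade-off described above, although the simulated effect sizes are arbitrary and say nothing about the published datasets.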