
Borozan, Bartol ; Prusina, Tomislav ; Borozan, Luka ; Ševerdija, Domagoj ; Rojas Ringeling, Francisca ; Matijević, Domagoj ; Canzar, Stefan

Optimal marker genes for c-separated cell types with SepSolve

Borozan, Bartol, Prusina, Tomislav, Borozan, Luka, Ševerdija, Domagoj, Rojas Ringeling, Francisca, Matijević, Domagoj and Canzar, Stefan (2025) Optimal marker genes for c-separated cell types with SepSolve. Genome Research 35 (12), pp. 2770-2780.

Publication date of this full text: 17 Dec 2025 10:55
Article
DOI for citing this document: 10.5283/epub.78345


Abstract

The identification of cell types in single-cell RNA-seq studies relies on the distinct expression signature of marker genes. A small set of target genes is also needed to design probes for targeted spatial transcriptomic experiments and to target proteins in single-cell spatial proteomics or for cell sorting. Although traditional approaches have relied on testing one gene at a time for differential expression between a given cell type and the rest, more recent methods have highlighted the benefits of a joint selection of markers that together distinguish all pairs of cell types simultaneously. However, existing methods either consider all pairs of individual cells, which becomes intractable even for medium-sized data sets, or ignore intra-cell-type expression variation entirely by collapsing all cells of a given type to a single representative. Here, we address these limitations and propose to find a small set of genes such that cell types are c-separated in the selected dimensions, a notion introduced previously in learning a mixture of Gaussians. To this end, we formulate a linear program that naturally takes into account expression variation within cell types without including each pair of individual cells in the model, leading to a highly stable set of marker genes that allows accurate discrimination between cell types and that can be computed to optimality efficiently.
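
As an illustration of the c-separation notion mentioned in the abstract, the following minimal Python sketch checks whether two cell types are c-separated when restricted to a chosen set of genes. It assumes one common formulation from the mixture-of-Gaussians literature, in which the distance between the two means must be at least c times the square root of (number of dimensions times the largest variance); the exact definition and the linear program used by SepSolve are not given on this page. The function name `is_c_separated`, the diagonal-covariance simplification, and the toy data are all hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def is_c_separated(cells_a, cells_b, genes, c):
    """Check c-separation of two cell types restricted to `genes`.

    cells_a, cells_b: lists of expression vectors (one list of floats per cell).
    genes: indices of the selected genes (the "selected dimensions").
    Assumption: covariances are treated as diagonal, so the largest
    eigenvalue is approximated by the largest per-gene variance.
    """
    n = len(genes)
    # Mean expression of each cell type in the selected dimensions.
    mu_a = [mean(cell[g] for cell in cells_a) for g in genes]
    mu_b = [mean(cell[g] for cell in cells_b) for g in genes]
    # Euclidean distance between the two means.
    dist = sqrt(sum((x - y) ** 2 for x, y in zip(mu_a, mu_b)))
    # Largest per-gene variance within each cell type.
    var_a = max(pvariance([cell[g] for cell in cells_a]) for g in genes)
    var_b = max(pvariance([cell[g] for cell in cells_b]) for g in genes)
    return dist >= c * sqrt(n * max(var_a, var_b))


# Toy data: gene 0 differs strongly between the types, gene 1 does not.
type_a = [[0.0, 5.0], [1.0, 5.0], [0.0, 6.0], [1.0, 6.0]]
type_b = [[10.0, 5.0], [11.0, 5.0], [10.0, 6.0], [11.0, 6.0]]

separated_on_gene0 = is_c_separated(type_a, type_b, [0], 2.0)
separated_on_gene1 = is_c_separated(type_a, type_b, [1], 2.0)
```

In this toy example, selecting gene 0 separates the two types while selecting gene 1 does not, which is the kind of distinction a marker selection method has to make. The check uses only per-type means and variances, so it does not enumerate pairs of individual cells.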




Details

Document type: Article
Journal or periodical title: Genome Research
Publisher: Cold Spring Harbor
Volume: 35
Issue or chapter number: 12
Page range: pp. 2770-2780
Date: 9 October 2025
Additional information (public): This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
Institutions: Informatik und Data Science > Fachbereich Bioinformatik > Algorithmische Bioinformatik (Prof. Dr. Stefan Canzar)
Identification number (DOI): 10.1101/gr.280637.125
Dewey Decimal Classification: 000 Computer science, information science, general works > 004 Computer science
Status: Published
Peer reviewed: Yes, this version has been peer reviewed
Created at the University of Regensburg: In part
URN of the University Library of Regensburg: urn:nbn:de:bvb:355-epub-783458
Document ID: 78345
