Abstract:
Flow cytometry provides measurements of proteins for individual cells. Such measurements contain information on the cellular heterogeneity of biological systems, which is often of great interest. Although this technology has more than 30 years of history, the current standard analysis methods for single-cell data are still subjective and labor-intensive processes that require prior knowledge of the biology underlying the data. We present a novel analytical approach, Spanning-tree Progression Analysis of Density-normalized Events (SPADE), which uncovers an underlying cellular hierarchy from single-cell data without requiring prior knowledge. We applied SPADE to two single-cell datasets derived from mouse and human bone marrow. In both cases, SPADE detected a hierarchy that recapitulates well-described patterns of cellular differentiation. We also applied SPADE to study drug response, cancer classification, and rare cell identification. In the DREAM6/FlowCAP2 AML prediction challenge in 2011, we achieved 100% accuracy in classifying AML patients and healthy subjects, and were recognized as one of the two top performers. In the FlowCAP3 rare cell identification challenge in 2012, we achieved 80% accuracy and were ranked #1. Our predictions outperformed the second-place entry by a large margin, and were even significantly better than ensemble predictions built from the consensus of multiple participating teams.