Question: I have run my data with Cell Ranger 2 and Cell Ranger 3+. Why do I see more cells that are enriched in mitochondrial (MT) genes from the Cell Ranger 3+ results? Which result should I trust?
Answer: Cell Ranger 3 and later use a cell calling algorithm that is more sensitive than the one in version 2. The new algorithm is based on the EmptyDrops method (Lun et al., 2018). It may detect cells that were missed by the previous version of our cell calling algorithm, especially dead or dying cells, cells with naturally low RNA content, or cells in heterogeneous samples. See this page for more details on the new EmptyDrops cell calling algorithm.
The figure below shows an example barcode rank plot for the same data run with Cell Ranger 2.2 and 3.0. It shows that Cell Ranger 3.0 calls additional barcodes with lower UMI counts as cells (the blue shaded portion of the curve).
This increased sensitivity in Cell Ranger 3 may lead to the detection of cells that show signs of stress or that differentially express mitochondrial genes, which could be an indicator of:
- Poor sample quality, leading to a high fraction of apoptotic or lysed cells.
- The overall biology of the sample, for example, tumor biopsies may have increased mitochondrial gene expression due to metabolic activity and/or necrosis.
If only one or a few clusters show very few up-regulated genes other than mitochondrial genes (indicating low overall gene expression), those clusters most likely represent populations of cells that are about to enter, or are already going through, apoptosis or necrosis pathways.
For example, in the figure below, barcodes with low UMI counts cluster together (dark blue in the UMI plot on the lower left; blue and lime green in the t-SNE plot on the lower right). These two clusters, which contain the extra cells identified by Cell Ranger 3.0 onwards, are enriched for mitochondrial genes, which could indicate dying or stressed cells.
In summary, a potential side effect of the new cell calling algorithm is that, because of its higher sensitivity, it may include more poor-quality cells that differentially express mitochondrial genes. EmptyDrops is technically correct in retaining damaged cells as distinct from true empty droplets, even if these cells are not biologically interesting. You can choose to exclude these cells (as explained here) from downstream analysis.
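As a rough illustration of what excluding such cells can look like, the sketch below computes the fraction of each barcode's UMIs that come from mitochondrial genes and splits barcodes at a cutoff. It is not part of Cell Ranger and is not the linked procedure. The function names, the toy barcodes and counts, and the 20% cutoff are all assumptions made for this example, and the "MT-" prefix assumes human gene symbols (mouse symbols typically start with "mt-"). A suitable cutoff depends on the sample and should be chosen by inspecting your own data.

```python
# Hypothetical sketch: flag barcodes by their mitochondrial UMI fraction.
# All names and values here are illustrative assumptions, not Cell Ranger output.

MT_PREFIX = "MT-"        # human gene-symbol convention; mouse symbols use "mt-"
MAX_MT_FRACTION = 0.20   # illustrative cutoff only, not a recommended value


def mt_fraction(gene_counts, mt_prefix=MT_PREFIX):
    """Return the fraction of a barcode's UMIs assigned to mitochondrial genes."""
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    mt_total = sum(n for gene, n in gene_counts.items() if gene.startswith(mt_prefix))
    return mt_total / total


def split_by_mt_fraction(counts, max_fraction=MAX_MT_FRACTION):
    """Split barcodes into (kept, excluded) lists using the cutoff."""
    kept, excluded = [], []
    for barcode, gene_counts in counts.items():
        if mt_fraction(gene_counts) <= max_fraction:
            kept.append(barcode)
        else:
            excluded.append(barcode)
    return kept, excluded


# Toy data (made up): barcode -> {gene symbol: UMI count}
toy_counts = {
    "AAAC-1": {"MT-CO1": 5, "MT-ND1": 3, "ACTB": 60, "GAPDH": 32},    # 8 of 100 UMIs
    "AAAG-1": {"MT-CO1": 40, "MT-ND1": 25, "ACTB": 20, "GAPDH": 15},  # 65 of 100 UMIs
    "AAAT-1": {"MT-CO1": 2, "CD3E": 18},                              # 2 of 20 UMIs
}

kept, excluded = split_by_mt_fraction(toy_counts)
print("kept:", kept)
print("excluded:", excluded)
```

In practice the same calculation is usually done with a single-cell analysis toolkit on the filtered feature-barcode matrix. The point of the sketch is only that the filter is a per-barcode ratio compared against a threshold you choose.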
Reference articles:
Why do I see a high level of mitochondrial gene expression?
Products: Single Cell Gene Expression, Single Cell Immune Profiling