Question: Does Cell Ranger automatically exclude doublets or multiplets?
Answer: We currently don't have a method for computationally classifying whether a barcode contains more than one cell in single cell gene expression data from a single species.
At present, Cell Ranger software only detects doublets in the context of a barnyard or mixed-species experiment used for estimating multiplet rates.
In a barnyard experiment, a 10x library is made from a mixture of cells from different species and mapped to a custom multi-species reference which combines the corresponding reference genomes. Multiplet rates are then estimated by counting the number of cells where transcripts are detected from multiple genomes. For more information, please see the "Estimating Multiplets" section in the Algorithms overview.
https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/algorithms/overview
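As a rough illustration of the counting step described above, here is a minimal Python sketch. It is not Cell Ranger's actual classification rule: the function names, the fixed `min_umis` cutoff, and the toy UMI counts are all assumptions made for this example. It only counts barcodes with substantial UMI counts from both genomes, so multiplets made of two cells from the same species are invisible to it and the true multiplet rate will be higher than the observed one.

```python
import numpy as np

def classify_barcodes(human_umis, mouse_umis, min_umis=50):
    """Label each barcode by which genome(s) it has UMIs from.

    min_umis is a hypothetical cutoff chosen for this sketch; it is
    not the threshold Cell Ranger uses.
    """
    human_umis = np.asarray(human_umis)
    mouse_umis = np.asarray(mouse_umis)
    has_human = human_umis >= min_umis
    has_mouse = mouse_umis >= min_umis

    labels = np.full(human_umis.shape, "none", dtype="U9")
    labels[has_human & ~has_mouse] = "human"
    labels[~has_human & has_mouse] = "mouse"
    labels[has_human & has_mouse] = "multiplet"
    return labels

def observed_multiplet_rate(labels):
    """Fraction of called barcodes with transcripts from both genomes."""
    called = labels != "none"
    n_called = int(called.sum())
    if n_called == 0:
        return 0.0
    return float((labels == "multiplet").sum() / n_called)

# Toy per-barcode UMI counts (made up for illustration).
human = [500, 3, 400, 0, 250]
mouse = [2, 600, 300, 1, 4]

labels = classify_barcodes(human, mouse)
rate = observed_multiplet_rate(labels)
print(list(labels), rate)
```

With real data, the per-genome UMI counts would come from the feature-barcode matrix produced against the multi-species reference, and the cutoff would need to be chosen from the observed count distributions.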
While we currently do not have an official recommendation on this subject, a few approaches can be explored. Please note that some of these methods are manual/ad hoc or experimental and may filter out valid single cells.
- Infer doublets in a single-species experiment when cell-type-specific markers are known. For example, the presence of T cell and B cell specific markers coming from a single barcode may indicate a GEM with both T and B cells.
- Evaluate the suggested workflow in the R package Seurat, which has some functionality for flagging cells with a clear outlier number of UMIs or genes detected: http://satijalab.org/seurat/pbmc3k_tutorial.html ("QC and selecting cells for further analysis").
- Evaluate dedicated doublet identification methods, such as DoubletFinder. More details and comparisons of several computational methods can be found in "Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data."
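
The first two approaches in the list above can be sketched in Python as follows. This is an ad hoc illustration, not a Cell Ranger or Seurat feature: the function names, the marker genes (CD3E for T cells, CD79A and MS4A1 for B cells), the `min_count` and `max_genes` cutoffs, and the toy count matrix are all assumptions for the example. As noted above, filters like these can also remove valid single cells.

```python
import numpy as np

def flag_marker_coexpression(counts, gene_names, markers_a, markers_b, min_count=1):
    """Flag barcodes expressing markers from two different cell types.

    counts is a dense barcodes x genes array of UMI counts.
    min_count is a hypothetical per-gene detection cutoff.
    """
    counts = np.asarray(counts)
    gene_index = {gene: i for i, gene in enumerate(gene_names)}
    idx_a = [gene_index[g] for g in markers_a if g in gene_index]
    idx_b = [gene_index[g] for g in markers_b if g in gene_index]
    has_a = (counts[:, idx_a] >= min_count).any(axis=1)
    has_b = (counts[:, idx_b] >= min_count).any(axis=1)
    return has_a & has_b

def flag_high_gene_count(counts, max_genes):
    """Flag barcodes with more detected genes than a chosen cutoff.

    max_genes is a hypothetical threshold; pick it from the data.
    """
    genes_detected = (np.asarray(counts) > 0).sum(axis=1)
    return genes_detected > max_genes

# Toy matrix: 4 barcodes x 4 genes (values made up for illustration).
genes = ["CD3E", "CD79A", "MS4A1", "ACTB"]
counts = np.array([
    [5, 0, 0, 10],   # T cell markers only
    [0, 4, 3, 8],    # B cell markers only
    [3, 2, 0, 20],   # both T and B markers: candidate doublet
    [0, 0, 0, 1],    # neither
])

coexpressing = flag_marker_coexpression(counts, genes, ["CD3E"], ["CD79A", "MS4A1"])
high_genes = flag_high_gene_count(counts, max_genes=2)
print(coexpressing, high_genes)
```

Ambient RNA can produce low-level co-detection of markers in genuine single cells, so raising `min_count` or requiring several markers per cell type may reduce false positives.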