Question: How can I remove batch effects among samples with cellranger aggr?
Answer: If you are aggregating libraries generated by different chemistry versions of the Single Cell Gene Expression Reagents, you might observe systematic differences in gene expression profiles between libraries. In Cell Ranger v3, we introduced a new Chemistry Batch Correction algorithm to correct the batch effects between chemistries. The algorithm uses mutual nearest neighbors (MNN) to identify similar cell subpopulations between batches.
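To make the mutual nearest neighbors idea concrete, here is a minimal sketch in plain Python. It is not the Cell Ranger implementation: the function names, the toy 2-D coordinates, and the choice of Euclidean distance are all assumptions made for illustration. It only shows how MNN pairs are identified (cell i in batch A and cell j in batch B are paired when each is among the other's k nearest neighbors in the opposite batch), and it leaves out the correction step that a real algorithm would apply afterwards.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbors(query, reference, k):
    """For each point in `query`, return the set of indices of its
    k nearest points in `reference` (hypothetical helper)."""
    result = []
    for q in query:
        order = sorted(range(len(reference)), key=lambda j: dist(q, reference[j]))
        result.append(set(order[:k]))
    return result


def mutual_nearest_neighbors(batch_a, batch_b, k=1):
    """Return (i, j) pairs where cell i of batch_a and cell j of batch_b
    are each within the other's k nearest neighbors across batches."""
    a_to_b = nearest_neighbors(batch_a, batch_b, k)
    b_to_a = nearest_neighbors(batch_b, batch_a, k)
    pairs = []
    for i in range(len(batch_a)):
        for j in sorted(a_to_b[i]):
            if i in b_to_a[j]:
                pairs.append((i, j))
    return pairs


# Toy "cells" in a 2-D expression space (made-up values).
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(0.5, 0.2), (5.3, 4.8)]

pairs = mutual_nearest_neighbors(batch_a, batch_b, k=1)
print(pairs)
```

In this toy example the third cell of batch A has no mutual partner: its nearest neighbor in batch B is closer to another cell of batch A. This is the kind of cell an MNN-based approach leaves unpaired, for example a subpopulation present in only one batch.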
This Chemistry Batch Correction is specifically intended to correct for systematic variability in gene expression profiles caused by different versions of the Single Cell Gene Expression chemistry. 10x has tested and verified its effectiveness primarily on aggregating Single Cell Gene Expression 3' v2 and v3 chemistry with well-matched input material.
cellranger aggr and the Chemistry Batch Correction module can aggregate results for a combination of 5' and 3' v2 or 3' v3 Gene Expression data. Enabling Chemistry Batch Correction in this scenario improves the mixing of the batches in the t-SNE visualization and clustering results. However, residual batch effects may still be present, and we advise careful validation of the results.
Please see this page for guidance on how to implement batch effect correction in cellranger aggr: Aggregating Libraries With Different Chemistry Versions. For details on the algorithm and the citation, please see this page: Cell Ranger Algorithms Overview.
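As a rough illustration of what that setup can look like, the sketch below writes an aggregation CSV in which each library is tagged with a batch identifier. It rests on several assumptions: that Chemistry Batch Correction is enabled through a `batch` column in the aggregation CSV, that the other columns are named `sample_id` and `molecule_h5` (older Cell Ranger releases used `library_id` for the first column), and that the sample names and paths shown are hypothetical placeholders. Please check the linked page for the exact format expected by your Cell Ranger version.

```python
import csv
import io


def build_aggr_csv(libraries):
    """Return the text of an aggregation CSV with a batch column.

    Column names are assumptions; verify them against the documentation
    for the Cell Ranger version in use.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample_id", "molecule_h5", "batch"])
    for lib in libraries:
        writer.writerow([lib["sample_id"], lib["molecule_h5"], lib["batch"]])
    return buffer.getvalue()


# Hypothetical libraries: one 3' v2 run and one 3' v3 run.
libraries = [
    {"sample_id": "pbmc_v2",
     "molecule_h5": "/path/to/pbmc_v2/outs/molecule_info.h5",
     "batch": "v2_lib"},
    {"sample_id": "pbmc_v3",
     "molecule_h5": "/path/to/pbmc_v3/outs/molecule_info.h5",
     "batch": "v3_lib"},
]

aggr_csv = build_aggr_csv(libraries)
print(aggr_csv)
```

The printed text could be saved to a file and passed to cellranger aggr through its --csv argument. Libraries that share a batch identifier are treated as one batch, so libraries prepared with the same chemistry would normally get the same value.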
What about other kinds of batch effects?
Beyond the scope of 10x tools, there are a number of packages in R, such as Seurat (1), scran (2), and scone, which attempt to address various types of batch effects. The following tutorial shows an example using the RegressOut function: Seurat Batch Effect Correction. Alternatively, here is a tutorial for “aligning” data sets or samples which are expected to be similar (as in the case of biological replicates): Seurat Alignment Tutorial.
References: