Question: When I try to run the "cellranger aggr" pipeline with more than 128k cells, I get an error. Is there an upper limit on the number of cells that can be used with the "cellranger aggr" pipeline?
Answer: Cell Ranger does not have a maximum cell count when the "aggr" pipeline is run without batch correction. We have validated it for up to 250k cells when run on minimum compute resources. If you have larger compute resources, you can run larger cell counts in Cell Ranger when merging data without chemistry batch correction.
However, if you are using chemistry batch correction, there is a threshold of 128,000 cells. Chemistry batch correction is resource-intensive, so we set the 128k cell limit based on the minimum compute resources (64 GB RAM). If you have sufficiently high compute resources, you can manually change the hardcoded limit. Here are the instructions:
1. First, navigate to the folder where Cell Ranger is installed on your system.
2. Next, navigate to the folder containing the "constants.py" script by running:
Cell Ranger 4+:
Cell Ranger 3:
3. Open the constants.py script here using a text editor such as nano, emacs, or vim.
4. Within the script you will find this section:
5. Here, change the CBC_MAX_NCELLS parameter (arrow on the left) from 128,000 to the maximum number that you need.
6. Save this change in the script, reload Cell Ranger into your environment, and you will be able to run the "cellranger aggr" pipeline with a higher number of cells.
Please note that we have not validated the pipeline with cell counts beyond 128,000 and advise caution.
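Steps 3 to 6 can also be done with a short script instead of a text editor. The following is a minimal sketch, not an official tool: the function name `raise_cbc_limit` is hypothetical, and the script assumes the limit appears in constants.py as a plain assignment such as `CBC_MAX_NCELLS = 128000`. Check your copy of the file first, because the actual line may be formatted differently.

```python
import re
import shutil
from pathlib import Path

# ASSUMPTION: the limit is a simple integer assignment on its own line,
# e.g. "CBC_MAX_NCELLS = 128000". Verify this against your constants.py.
_PATTERN = re.compile(
    r"^([ \t]*CBC_MAX_NCELLS[ \t]*=[ \t]*)(\d[\d_]*)(.*)$",
    re.MULTILINE,
)


def raise_cbc_limit(constants_path, new_limit):
    """Replace the CBC_MAX_NCELLS value in place and return the old value.

    A backup copy named '<file>.bak' is written before the file is changed.
    """
    path = Path(constants_path)
    text = path.read_text()

    match = _PATTERN.search(text)
    if match is None:
        raise ValueError(f"CBC_MAX_NCELLS assignment not found in {path}")
    old_limit = int(match.group(2).replace("_", ""))

    # Keep the original file so the change can be reverted.
    shutil.copy2(path, path.with_name(path.name + ".bak"))

    # Swap only the number; keep the rest of the line (such as a comment).
    new_text = _PATTERN.sub(
        lambda m: f"{m.group(1)}{new_limit}{m.group(3)}", text, count=1
    )
    path.write_text(new_text)
    return old_limit
```

Run from the folder that contains constants.py, a call such as `raise_cbc_limit("constants.py", 300000)` would set the new limit (300000 is only an illustrative value) and leave the original file as constants.py.bak. Cell Ranger still needs to be reloaded into your environment afterwards, as in step 6.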
Also, please note that Loupe Cell Browser can load up to 1.3 million cells at this time. In addition, differential expression analysis in Loupe Cell Browser is limited to 100k cells.
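The cell-count thresholds mentioned in this article can be collected into a quick pre-flight check before you plan an aggregation. This is only an illustrative sketch: `check_cell_count` is a hypothetical helper, not part of Cell Ranger or Loupe Cell Browser, and treating each threshold as "more than N cells" is an assumption.

```python
# Thresholds quoted in this article.
CBC_LIMIT = 128_000         # default limit with chemistry batch correction
VALIDATED_NO_CBC = 250_000  # validated without batch correction (minimum compute)
LOUPE_MAX_LOAD = 1_300_000  # cells Loupe Cell Browser can load
LOUPE_MAX_DE = 100_000      # cells for differential expression in Loupe


def check_cell_count(n_cells, batch_correction):
    """Return a list of notes for a planned aggr run (hypothetical helper)."""
    notes = []
    if batch_correction and n_cells > CBC_LIMIT:
        notes.append("exceeds the default chemistry batch correction limit")
    if not batch_correction and n_cells > VALIDATED_NO_CBC:
        notes.append("beyond the cell count validated on minimum compute resources")
    if n_cells > LOUPE_MAX_LOAD:
        notes.append("too many cells to load in Loupe Cell Browser")
    if n_cells > LOUPE_MAX_DE:
        notes.append("too many cells for differential expression in Loupe Cell Browser")
    return notes


if __name__ == "__main__":
    for note in check_cell_count(150_000, batch_correction=True):
        print(note)
```

An empty list means none of the thresholds quoted here is exceeded; it does not guarantee that a run will fit in the memory available on your system.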
Disclaimer: This article and code snippet are provided for instructional purposes only. 10x Genomics does not support or guarantee modifications to the Cell Ranger code base.