Question: Why did my Cell Ranger job fail in stage code with exit status 255?
Answer: This failure with exit status 255, shown in the error message below, is usually caused by user limits (ulimits) that are set too low for the system.
ERROR; return code from pthread_create() is 11
Error detail: Resource temporarily unavailable
Job failed in stage code
exit status 255
You can use the sitecheck file to assess whether the ulimits need to change. Because cellranger spawns multiple processes per core, a job can sometimes exceed those limits. For this reason, the recommendation is to set ulimit -n (open files) to 16000. The max user processes limit, however, is a system setting that can differ between users; on some systems it defaults to 1024 or 4096.
Inspect the sitecheck file and check the open files and max user processes values in the User Limits section. For example:
=====================================================================
User Limits
bash -c 'ulimit -a'
---------------------------------------------------------------------
open files (-n) 4096
max user processes (-u) 1024
=====================================================================
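The same limits can also be read directly in the shell that launches the job. The following is a minimal bash sketch, not part of Cell Ranger: the helper name check_limit is hypothetical, and the only threshold it uses is the ulimit -n value of 16000 recommended above. The max user processes value is printed without a verdict because the appropriate number depends on your system.

```shell
#!/usr/bin/env bash
# Sketch: compare the current soft limits with the recommendation in the text.
# RECOMMENDED_NOFILE is the ulimit -n value recommended above.
RECOMMENDED_NOFILE=16000

# check_limit (hypothetical helper): print "ok" if the current value meets
# the wanted minimum, otherwise "low". "unlimited" always counts as ok.
check_limit() {
    local current="$1" wanted="$2"
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$wanted" ]; then
        echo "ok"
    else
        echo "low"
    fi
}

nofile="$(ulimit -n)"   # soft limit on open files
nproc="$(ulimit -u)"    # soft limit on user processes

echo "open files (-n): $nofile ($(check_limit "$nofile" "$RECOMMENDED_NOFILE"))"
echo "max user processes (-u): $nproc"
```

Run it in the same environment as the job (for example, inside the cluster job script), since limits in a login shell may differ from those applied to submitted jobs.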
If max user processes is limited to 1024 and the job tries to exceed that number, the system refuses to create new processes or threads, which is what the pthread_create() error above reports, and the job fails. With max user processes set to 1024, --localcores=16 can usually be used safely. You can therefore either throttle resource usage with the --localcores and --localmem options, or, under the guidance of your system administrator, set the ulimits to the values recommended for Cell Ranger.
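Both options can be sketched in a launch script. This is an illustration under stated assumptions rather than an official procedure: raise_nofile is a hypothetical helper that only raises the soft open files limit and never goes past the hard limit (raising the hard limit needs your admin), and the sample ID, paths, and --localmem value in the commented-out command are placeholders.

```shell
#!/usr/bin/env bash
# raise_nofile (hypothetical helper): raise the soft open-files limit toward
# a target, capped at the hard limit. Does nothing if already high enough.
raise_nofile() {
    local target="$1" soft hard
    soft="$(ulimit -S -n)"
    hard="$(ulimit -H -n)"
    # Already at or above the target: leave the limit alone.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
        return 0
    fi
    # The soft limit cannot exceed the hard limit without admin help.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$target" ]; then
        target="$hard"
    fi
    ulimit -S -n "$target"
}

raise_nofile 16000
echo "open files (soft) is now: $(ulimit -S -n)"

# Throttled run (placeholders: sample ID, paths, and memory in GB).
# Uncomment and adapt before use.
# cellranger count --id=sample1 \
#     --transcriptome=/path/to/refdata \
#     --fastqs=/path/to/fastqs \
#     --localcores=16 \
#     --localmem=64
```

If the printed value is still below 16000, the hard limit is the constraint, and only the throttling option is available until an administrator raises it.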