Question: How does Cell Ranger auto-detect the assay chemistry?
3' or 5' Single Cell Gene Expression
To auto-detect the assay chemistry (the default), Cell Ranger samples 100k reads (from the top 1M) in the FASTQ files and maps them to the provided reference. The 3' versus 5' assay configuration is inferred from the dominant orientation of the R2 read mapping. Assigning 3' versus 5' requires at least 1,000 confidently mapped reads and 2x as many reads in one orientation as in the other. For example, if the number of antisense-mapped R2 reads is 2x the number of sense-mapped R2 reads or more, the library is inferred to be a 5' gene expression assay.
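The decision rule described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not Cell Ranger's actual code: the function name infer_assay is hypothetical, the 1,000-read minimum is assumed to apply to the combined sense and antisense count, the 2x requirement is treated as "at least 2x", and the 3' call for sense-dominant R2 reads is the assumed mirror image of the 5' example in the text.

```python
from typing import Optional

MIN_CONFIDENT_READS = 1_000    # from the text: minimum confidently mapped reads
MIN_ORIENTATION_RATIO = 2.0    # from the text: 2x the reads in one orientation


def infer_assay(sense_reads: int, antisense_reads: int) -> Optional[str]:
    """Return 'threeprime', 'fiveprime', or None when no call can be made.

    Both arguments are counts of confidently mapped R2 reads.
    """
    total = sense_reads + antisense_reads
    if total < MIN_CONFIDENT_READS:
        return None  # assumption: the minimum applies to the combined count
    if antisense_reads >= MIN_ORIENTATION_RATIO * sense_reads:
        return "fiveprime"   # antisense-dominant R2, as in the text's example
    if sense_reads >= MIN_ORIENTATION_RATIO * antisense_reads:
        return "threeprime"  # assumed mirror case: sense-dominant R2
    return None  # neither orientation dominates by 2x


print(infer_assay(sense_reads=900, antisense_reads=40_000))
```

Returning None for the ambiguous cases keeps "not enough evidence" separate from a confident call. In that situation the chemistry would have to be specified explicitly.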
The distinction between Single Cell 3' v1, v2, v3, and LT chemistries is made based on the fraction of barcodes overlapping the whitelist for each specific chemistry. HT is identified by the throughput detection algorithm (which distinguishes HT from standard chemistries), a separate step from chemistry detection. Throughput detection is performed only for analysis of CellPlex data (Cell Ranger 6.1+).
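To make the whitelist comparison concrete, here is a minimal sketch that computes, for each candidate chemistry, the fraction of observed barcodes found in that chemistry's whitelist. The function names, the toy 4-base barcodes, and the tiny whitelists are all made up for illustration. Picking the chemistry with the highest fraction is an assumption, since the text does not state the exact criterion or any cutoff Cell Ranger applies.

```python
from typing import Dict, Iterable, Set


def whitelist_fractions(
    barcodes: Iterable[str], whitelists: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Fraction of observed barcodes present in each chemistry's whitelist."""
    observed = list(barcodes)
    if not observed:
        return {name: 0.0 for name in whitelists}
    return {
        name: sum(bc in whitelist for bc in observed) / len(observed)
        for name, whitelist in whitelists.items()
    }


def best_match(fractions: Dict[str, float]) -> str:
    """Assumption: choose the chemistry with the highest overlap fraction."""
    return max(fractions, key=fractions.get)


# Toy data only; real whitelists hold far more and much longer barcodes.
toy_whitelists = {
    "SC3Pv2": {"AAAC", "AAAG", "AACA"},
    "SC3Pv3": {"TTTG", "TTGA", "TGAC"},
}
toy_barcodes = ["TTTG", "TTGA", "AAAC", "TGAC"]

fractions = whitelist_fractions(toy_barcodes, toy_whitelists)
print(fractions, best_match(fractions))
```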
For 3' and 5' assays, the chemistry settings can be nested as shown below:
threeprime and fiveprime auto-detect the chemistry within the two possible assay configurations (3' or 5'), respectively. Within 3' or 5', specific assay versions and alignment options can also be specified directly.
threeprime for Single Cell 3':
  SC3Pv1 for Single Cell 3' v1
  SC3Pv2 for Single Cell 3' v2
  SC3Pv3 for Single Cell 3' v3
  SC3Pv3LT for Single Cell 3' v3 LT
  SC3Pv3HT for Single Cell 3' v3 HT
fiveprime for Single Cell 5':
  SC5P-PE for Single Cell 5' paired-end (where both R1 and R2 are used for alignment)
    - This can be used if you sequenced R1 longer than 81 bases
  SC5P-R2 for Single Cell 5' R2-only (where only R2 is used for alignment)
    - This assumes the recommended sequencing configuration
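The nesting of these options can be written down as a small lookup table. The sketch below is only an illustration of the hierarchy listed above. The helper nested_options is hypothetical, and the handling of auto assumes that the default auto-detect setting is spelled auto, as it is for the --chemistry option of cellranger count.

```python
from typing import Dict, List

# Option names copied from the list above, grouped by assay configuration.
CHEMISTRY_OPTIONS: Dict[str, List[str]] = {
    "threeprime": ["SC3Pv1", "SC3Pv2", "SC3Pv3", "SC3Pv3LT", "SC3Pv3HT"],
    "fiveprime": ["SC5P-PE", "SC5P-R2"],
}


def nested_options(option: str) -> List[str]:
    """List the explicit chemistry names that sit under a given setting."""
    if option == "auto":  # assumed spelling of the default setting
        return [name for names in CHEMISTRY_OPTIONS.values() for name in names]
    if option in CHEMISTRY_OPTIONS:
        return list(CHEMISTRY_OPTIONS[option])
    for names in CHEMISTRY_OPTIONS.values():
        if option in names:
            return [option]  # already a fully specified chemistry
    raise ValueError(f"unknown chemistry option: {option!r}")


print(nested_options("fiveprime"))
```

A table like this is convenient for validating a user-supplied value before launching a run, because a typo such as SC5P-R1 fails immediately instead of partway through a pipeline.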
For more information on the 3' and 5' chemistry options, please consult the Cell Ranger documentation.
Fixed RNA Profiling
Fixed RNA Profiling (FRP) chemistry is detected based on the fraction of barcodes overlapping the whitelist (737k-fixed-rna-profiling.txt.gz). There are two assay configurations for FRP: singleplex (only one probe barcode used in the assay) and multiplex (multiple probe barcodes used in the assay). By default, Cell Ranger auto-detects the configuration of the data based on the number of probe barcode sequences (one or more than one) in the library. To override the configuration detection, users may specify either of the following in the multi config CSV file under the [gene-expression] section:
SFRP for singleplex FRP
MFRP for multiplex FRP
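As a rough illustration of the singleplex versus multiplex rule, the sketch below picks SFRP or MFRP from the set of probe barcodes seen in a library and prints the matching override lines. The function names and the probe barcode IDs are illustrative only. The chemistry key under [gene-expression] is assumed from the Cell Ranger multi config format, and a real config file needs further entries that are omitted here.

```python
from typing import Iterable


def frp_chemistry(probe_barcodes: Iterable[str]) -> str:
    """One distinct probe barcode means singleplex; more than one, multiplex."""
    distinct = set(probe_barcodes)
    if not distinct:
        raise ValueError("no probe barcodes observed")
    return "SFRP" if len(distinct) == 1 else "MFRP"


def gene_expression_override(chemistry: str) -> str:
    """Render only the override lines; other required entries are left out."""
    return f"[gene-expression]\nchemistry,{chemistry}\n"


# Illustrative probe barcode IDs, not taken from a real library.
observed = ["BC001", "BC002", "BC001", "BC003"]
print(gene_expression_override(frp_chemistry(observed)))
```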
For more information on the FRP configurations, please see Fixed RNA Profiling with cellranger multi.