Question: How can I customize the parameters for Principal Components Analysis (PCA), and what are the default values?
Answer: You can modify the following PCA parameters by passing a CSV file to cellranger reanalyze with the --params option:
| Parameter | Type | Default | Recommended Range | Description |
| --- | --- | --- | --- | --- |
| num_pca_bcs | int | null | Cannot be set higher than the available number of cells. | Randomly subset data to N barcodes when computing PCA projection (the most memory-intensive step). The PCA projection will still be applied to the full dataset, i.e. your final results will still reflect all the data. Try reducing this parameter if your analysis is running out of memory. |
| num_pca_genes | int | null | Cannot be set higher than the number of genes in the reference transcriptome. | Subset data to the top N genes (ranked by normalized dispersion) when computing PCA. Differential expression will still reflect all genes. Try reducing this parameter if your analysis is running out of memory. |
| num_principal_comps | int | 10 | 10-100, depending on the number of cell populations / clusters you expect to see. | Compute N principal components for PCA. Setting this too high may cause spurious clusters to be called. |
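As a rough illustration of how the parameters above might be put into a CSV file, here is a minimal Python sketch. It assumes the parameters file is a headerless CSV with one `name,value` row per parameter, and that the command takes `--id`, `--matrix`, and `--params`; please check both against the reanalyze documentation for your Cell Ranger version. The helper names (`write_params_csv`, `reanalyze_command`) and the file names in the example are hypothetical, and the script only builds the command string without running it.

```python
import csv
import shlex
from pathlib import Path

# Parameter names taken from the table above; all three take integer values.
PCA_PARAMS = {"num_pca_bcs", "num_pca_genes", "num_principal_comps"}


def write_params_csv(path, params):
    """Write PCA parameters as headerless `name,value` rows (assumed layout)."""
    unknown = set(params) - PCA_PARAMS
    if unknown:
        raise ValueError(f"not a PCA parameter: {sorted(unknown)}")
    for name, value in params.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        for name, value in params.items():
            writer.writerow([name, value])
    return Path(path)


def reanalyze_command(run_id, matrix_h5, params_csv):
    """Build (but do not run) a cellranger reanalyze command line."""
    args = [
        "cellranger",
        "reanalyze",
        f"--id={run_id}",          # name of the new output directory
        f"--matrix={matrix_h5}",   # feature-barcode matrix from an earlier run
        f"--params={params_csv}",  # the CSV written by write_params_csv
    ]
    return " ".join(shlex.quote(arg) for arg in args)
```

For example, `write_params_csv("pca_params.csv", {"num_principal_comps": 20, "num_pca_genes": 2000})` would produce a two-line file, and `reanalyze_command("pca_rerun", "sample/outs/filtered_feature_bc_matrix.h5", "pca_params.csv")` returns the command string to run in a shell. Parameters left out of the CSV keep their defaults.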
For more information, please see Customized Secondary Analysis using cellranger reanalyze.