Question: How can I customize the parameters for k-means and graph-based clustering in Cell Ranger, and what are the default values?
Answer: You can modify the following k-means and graph-based clustering parameters by passing a CSV file to cellranger reanalyze:
Parameter | Type | Default Value | Recommended Range | Description |
graphclust_neighbors | int | 0 | 10-500, depending on desired granularity | Number of nearest neighbors to use in the graph-based clustering. Lower values result in higher-granularity clustering. The actual number of neighbors used is the maximum of this value and the value determined by neighbor_a and neighbor_b. Set this value to zero to use the neighbor_a and neighbor_b result instead. |
neighbor_a | float | -230 | Determines how clustering granularity scales with cell count. | The number of nearest neighbors, k, used in the graph-based clustering is computed as follows: k = neighbor_a + neighbor_b * log10(n_cells). The actual number of neighbors used is the maximum of this value and graphclust_neighbors. |
neighbor_b | float | 120 | Determines how clustering granularity scales with cell count. | The number of nearest neighbors, k, used in the graph-based clustering is computed as follows: k = neighbor_a + neighbor_b * log10(n_cells). The actual number of neighbors used is the maximum of this value and graphclust_neighbors. |
max_clusters | int | 10 | 10-50, depending on the number of cell populations / clusters you expect to see. | Compute K-means clustering for K values from 2 to N, where N is the value of max_clusters. Setting this too high may cause spurious clusters to be called. |
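
As a rough illustration of how the three graph-based parameters in the table interact, the following sketch evaluates k = neighbor_a + neighbor_b * log10(n_cells) and takes the maximum with graphclust_neighbors. The function name `effective_neighbors` is hypothetical, and the sketch does not model any rounding or lower bound that Cell Ranger may apply internally.

```python
import math


def effective_neighbors(n_cells, graphclust_neighbors=0,
                        neighbor_a=-230.0, neighbor_b=120.0):
    """Illustrative only: combine the parameters as the table describes.

    Defaults mirror the default values listed in the table. Any rounding
    or clamping done by Cell Ranger itself is NOT modelled here.
    """
    # k scales with the base-10 logarithm of the cell count.
    scaled_k = neighbor_a + neighbor_b * math.log10(n_cells)
    # The table states that the larger of the two values is used.
    return max(graphclust_neighbors, scaled_k)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, effective_neighbors(n))
```

With the defaults, 10,000 cells gives k = -230 + 120 * 4 = 250. Setting graphclust_neighbors to a value above that (for example 300) would take precedence, while a lower nonzero value would have no effect at that cell count.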
For more information, please see Customized Secondary Analysis using cellranger reanalyze.
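
A minimal sketch of generating such a parameter CSV from Python is shown below. It assumes the file holds one `name,value` pair per line with no header row; please confirm the expected layout against the reanalyze documentation. The helper name `write_reanalyze_params`, the file name `params.csv`, and the chosen values are illustrative only.

```python
import csv


def write_reanalyze_params(path, params):
    """Write parameter overrides as name,value rows (assumed layout)."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for name, value in params.items():
            # One row per parameter; no header row is written.
            writer.writerow([name, value])


if __name__ == "__main__":
    # Example values chosen from the recommended ranges in the table.
    write_reanalyze_params("params.csv", {
        "graphclust_neighbors": 50,
        "max_clusters": 15,
    })
```

The resulting file would then be supplied to cellranger reanalyze (typically through its `--params` option, alongside the run ID and input matrix). Parameters left out of the file keep their default values.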