Q: I have a large scRNA-seq dataset. How can I partition the dataset, based on manual annotations, to open smaller, more manageable Loupe files?
A: You can subset a .cloupe file with custom categories to make several smaller .cloupe files. The example below walks you through how to do this. The data for this example are available here.
1. Load the .cloupe file in the Loupe Browser.
2. Identify the barcodes you want to keep. In this example, Atp1a1 and Aqp4 are the genes of interest.
2.1. Click on the “Gene/Feature Expression” Tab.
2.2. Select for cells where Gene/Feature Expression for Atp1a1 > 0.
- Search for Atp1a1 in the “Search for a feature” text box.
- Enter > 0 for "Select by Count - Atp1a1"
- Click on the filter button, to the right of the 0 counts
- Create a Category name and Cluster name. In this example, we name the Category: “Interesting_genes” and Cluster: "Atp1a1"
2.3. Go back to “Gene/Feature Expression” Tab.
- Search for "Aqp4"
- Select by Count, > 0
- Click on the filter button and name the Category: “Interesting_genes” and Cluster: "Aqp4"
3. The Loupe Browser can import/export a table of barcodes with custom "Categories". For example, here are the cells in the “Interesting_genes” category. Note that there are 1339 cells expressing Atp1a1 and 64 cells expressing Aqp4.
Click on the three dots next to the pull-down menu for the custom category and select "Export", which will give you a CSV file to download with the barcodes associated with each category. That file will be formatted to look like this:
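The exact layout of the export may differ between Loupe Browser versions, so treat the following only as a sketch for sanity-checking the file before reanalysis. It assumes a header row, barcodes in the first column, and the cluster name in the second column. The function name and the barcodes in the sample text are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_barcodes_per_cluster(csv_text):
    """Return {cluster_name: barcode_count} for an exported category CSV.

    Assumption: header row, barcode in column 1, cluster label in column 2.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (does nothing on empty input)
    counts = Counter()
    for row in reader:
        # Ignore blank lines and barcodes without a cluster label.
        if len(row) < 2 or not row[1].strip():
            continue
        counts[row[1].strip()] += 1
    return dict(counts)


# Hypothetical content; real exports contain your own barcodes.
example = """Barcode,Interesting_genes
AAACCCAAGAAACACT-1,Atp1a1
AAACCCAAGAAACCAT-1,Atp1a1
AAACCCAAGAAACCCA-1,Aqp4
"""

print(count_barcodes_per_cluster(example))
```

Run against the real export, the counts should match the cell numbers shown in Loupe Browser for each cluster (1339 for Atp1a1 and 64 for Aqp4 in this example).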
4. Then you will use that CSV file to run the cellranger reanalyze pipeline with the larger .h5 file to generate new .cloupe files:
cellranger reanalyze --id=SC3_V3_NextGem_DI_Neurons_5k_reanalysis --matrix=SC3_v3_NextGem_DI_Neurons_5K_Multiplex_count_raw_feature_bc_matrix.h5 --barcodes=Interesting_genes.csv
Instructions for how to run the cellranger reanalyze pipeline are located here:
These steps will remove all barcodes not in the CSV file. In this specific example, the file size is reduced by about half. After loading the .cloupe file from the reanalyze pipeline, only these cells remain (1339 Atp1a1 and 64 Aqp4 cells):
Repeat steps 2-4 as needed to generate custom, smaller .cloupe files.
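If one exported category holds several clusters and you want a separate .cloupe file for each, the repetition can be scripted. The sketch below is one possible approach, not an official workflow: it splits the exported CSV text into one CSV per cluster and builds the matching cellranger reanalyze command using only the --id, --matrix, and --barcodes options shown above. It assumes the export has a header row, barcodes in the first column, and cluster names in the second. The function names, output file names, and sample barcodes are hypothetical.

```python
import csv
import io
import shlex
from collections import defaultdict


def split_by_cluster(csv_text):
    """Return {cluster_name: csv_text}, keeping the original header in each part.

    Assumption: header row, barcode in column 1, cluster label in column 2.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:  # empty input
        return {}
    groups = defaultdict(list)
    for row in reader:
        if len(row) >= 2 and row[1].strip():
            groups[row[1].strip()].append(row)
    parts = {}
    for cluster, rows in groups.items():
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
        parts[cluster] = buf.getvalue()
    return parts


def reanalyze_command(run_id, matrix_h5, barcodes_csv):
    """Build (but do not run) a cellranger reanalyze command line."""
    args = [
        "cellranger", "reanalyze",
        f"--id={run_id}",
        f"--matrix={matrix_h5}",
        f"--barcodes={barcodes_csv}",
    ]
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical export content and file names, for illustration only.
example = """Barcode,Interesting_genes
AAACCCAAGAAACACT-1,Atp1a1
AAACCCAAGAAACCCA-1,Aqp4
"""

for cluster, text in split_by_cluster(example).items():
    barcodes_file = f"{cluster}_barcodes.csv"  # save `text` under this name first
    print(reanalyze_command(f"{cluster}_reanalysis", "raw_feature_bc_matrix.h5", barcodes_file))
```

The script only prints the commands, so each one can be reviewed before it is run. Each per-cluster CSV needs to be saved under the printed file name beforehand.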