Question: What are the genes and criteria that were used for clustering the Lung Cancer (NSCLC) data in the 5' GEX and V(D)J integrated tutorial?
Answer: Below are the rules that were used to identify the cell types for the tutorial dataset. Please note that the order matters: a cell can currently belong to only one cluster, so it ends up in the last cluster it was assigned to.
1) T Cells: All barcodes that were identified as T-cells from the corresponding vloupe file for LungTumor T-cells (http://cf.10xgenomics.com/supp/cell-vdj/LungTumorT.vloupe) were assigned as T-cells.
2) B Cells: CD19 || MS4A1 || IGHG1 > 0 (Gene Exp Max of B cell list > 0)
3) T-Helper Cells: CD3E && CD4 > 0 (Gene Exp Min of CD4+ list > 0)
4) T-reg: FOXP3 > 1
5) Monocytes: CD14 && FCGR3A > 1 (Gene Exp Min of Monocyte list > 1)
6) Epithelial: CA9 || TNFRSF12A || KRT5 || KRT6A (Gene Exp Sum of Lung/Carcinoma List > 1)
7) Basal: KRT6A > 1
8) MAST: Manual selection using rectangle/lasso selection mode. The cluster was named after finding the significant genes in this group.
9) NK Cells: Manual selection of the entire region (including cells that are now marked as CD8+ cytotoxic T cells). The cluster was named after finding the significant genes in this group.
10) Cytotoxic T Cells: CD8A || CD8B > 0 (Gene Exp Max of CD8+ list > 0)
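To illustrate how the ordering affects the final labels, the rules above can be sketched in Python. This is only a rough illustration and not how Loupe Browser implements its filters: the function names, the dictionary layout, and the barcodes are hypothetical, the expression values are assumed to be on whatever scale the Loupe filters used, and the vloupe-derived T cells and the manual selections are assumed to be supplied as plain sets of barcodes.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not part of Loupe Browser or any 10x Genomics API.

def values(expr, genes):
    """Expression values for the listed genes; a gene absent from expr counts as 0."""
    return [expr.get(gene, 0) for gene in genes]


def annotate(expression, vdj_t_cells=frozenset(), mast=frozenset(), nk=frozenset()):
    """Apply the ten rules in order; a later match overwrites an earlier label."""
    rules = [
        # 1) Barcodes taken from the T-cell vloupe file (passed in as a set).
        ("T Cells", lambda bc, e: bc in vdj_t_cells),
        # 2) Max of the list > 0, i.e. any of the genes is expressed.
        ("B Cells", lambda bc, e: max(values(e, ["CD19", "MS4A1", "IGHG1"])) > 0),
        # 3) Min of the list > 0, i.e. both genes are expressed.
        ("T-Helper Cells", lambda bc, e: min(values(e, ["CD3E", "CD4"])) > 0),
        ("T-reg", lambda bc, e: e.get("FOXP3", 0) > 1),
        # 5) Min of the list > 1.
        ("Monocytes", lambda bc, e: min(values(e, ["CD14", "FCGR3A"])) > 1),
        # 6) Sum of the list > 1.
        ("Epithelial", lambda bc, e: sum(values(e, ["CA9", "TNFRSF12A", "KRT5", "KRT6A"])) > 1),
        ("Basal", lambda bc, e: e.get("KRT6A", 0) > 1),
        # 8) and 9) Manual selections, represented here as sets of barcodes.
        ("MAST", lambda bc, e: bc in mast),
        ("NK Cells", lambda bc, e: bc in nk),
        # 10) Max of the list > 0.
        ("Cytotoxic T Cells", lambda bc, e: max(values(e, ["CD8A", "CD8B"])) > 0),
    ]
    labels = {}
    for barcode, expr in expression.items():
        for name, matches in rules:
            if matches(barcode, expr):
                labels[barcode] = name  # the last matching rule wins
    return labels


# Made-up barcodes and values, for demonstration only.
expression = {
    "BC1": {"CD3E": 2, "CD4": 1, "FOXP3": 3},
    "BC2": {"KRT5": 1, "KRT6A": 2},
    "BC3": {"CD8A": 1},
    "BC4": {"MS4A1": 4},
    "BC5": {},
}
labels = annotate(expression, vdj_t_cells={"BC1"}, nk={"BC3"})
```

In this made-up example, BC1 matches rules 1, 3 and 4 and ends up as T-reg, BC2 matches rules 6 and 7 and ends up as Basal, and BC3 is part of the manual NK selection but is relabeled by rule 10. BC5 matches no rule and stays unlabeled.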
Please also note that the cell annotations in the tutorial (derived from the method above) are for demonstration purposes only. Therefore, any biological conclusions from the tutorial should be interpreted with caution.