Question: What are the genes and criteria that were used for clustering the Lung Cancer (NSCLC) data in the 5' gene expression and V(D)J integrated tutorial?
Answer: Here are the rules for identifying cell types in the lung cancer tutorial dataset. Please note that the order of the rules matters: a cell belongs only to the last cluster to which it was assigned, so a later rule overrides an earlier one.
1) T Cells: All barcodes that were identified as T cells in the corresponding .vloupe file for lung tumor T cells (http://cf.10xgenomics.com/supp/cell-vdj/LungTumorT.vloupe) were assigned as T cells.
2) B Cells: CD19 || MS4A1 || IGHG1 > 0 (Gene Exp Max of B cell list > 0).
3) T-Helper Cells: CD3E && CD4 > 0 (Gene Exp Min of CD4+ list > 0).
4) T-reg Cells: FOXP3 > 1
5) Monocytes: CD14 && FCGR3A > 1 (Gene Exp Min of Monocyte list > 1)
6) Epithelial Cells: CA9 || TNFRSF12A || KRT5 || KRT6A (Gene Exp Sum of Lung/Carcinoma List > 1)
7) Basal Cells: KRT6A > 1
8) Mast Cells: Manual selection using rectangle/lasso selection mode. The cluster was named after finding the significant genes in this group.
9) NK Cells: Manual selection of entire region (including cells marked as CD8+ cytotoxic cells now). The cluster was named after finding significant genes in this group.
10) Cytotoxic T Cells: CD8A || CD8B > 0 (Gene Exp Max of CD8+ list > 0)
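
The expression-based rules above (2 to 7 and 10) can be sketched in Python to show how the ordering works. This is only an illustration, not how Loupe applies the filters internally: the function name `assign_cell_type`, the `RULES` table, and the per-barcode dictionary of gene values are hypothetical, and the units of the expression values are an assumption since the rules do not state them. Rules 1, 8, and 9 are left out because they depend on a barcode list from the .vloupe file or on manual selection.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical input: one barcode's expression values, keyed by gene name.
# Genes missing from the dictionary are treated as 0.
Expression = Dict[str, float]


def values(expr: Expression, genes: Tuple[str, ...]) -> List[float]:
    """Look up each gene in the list, defaulting to 0 when absent."""
    return [expr.get(g, 0.0) for g in genes]


# Rules in the same order as the list above. "||" lists use the max
# (or the sum for rule 6), "&&" lists use the min, as stated in each rule.
RULES: List[Tuple[str, Callable[[Expression], bool]]] = [
    ("B Cells", lambda e: max(values(e, ("CD19", "MS4A1", "IGHG1"))) > 0),
    ("T-Helper Cells", lambda e: min(values(e, ("CD3E", "CD4"))) > 0),
    ("T-reg Cells", lambda e: e.get("FOXP3", 0.0) > 1),
    ("Monocytes", lambda e: min(values(e, ("CD14", "FCGR3A"))) > 1),
    ("Epithelial Cells",
     lambda e: sum(values(e, ("CA9", "TNFRSF12A", "KRT5", "KRT6A"))) > 1),
    ("Basal Cells", lambda e: e.get("KRT6A", 0.0) > 1),
    ("Cytotoxic T Cells", lambda e: max(values(e, ("CD8A", "CD8B"))) > 0),
]


def assign_cell_type(expr: Expression) -> Optional[str]:
    """Apply every rule in order; the last matching rule wins."""
    label = None
    for name, rule in RULES:
        if rule(expr):
            label = name  # a later match overwrites an earlier one
    return label
```

For example, a barcode whose only nonzero gene is KRT6A with a value of 2 matches the Epithelial rule first and then the Basal rule, so `assign_cell_type({"KRT6A": 2.0})` returns "Basal Cells". A barcode that matches none of these rules returns `None`.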
Please also note that the cell annotations in the tutorial (derived from the method above) are for demonstration purposes only. Therefore, any biological conclusions from the tutorial should be interpreted with caution.