Question: I need to upload my 10x Genomics data to the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO), Sequence Read Archive (SRA), or other similar database. Which file(s) should I upload to ensure reproducibility?
Answer: The files to submit to NCBI GEO or SRA depend on the 10x Genomics platform and product used to generate the data. Researchers should carefully consider which files are needed to reproduce all relevant results for their specific experiment. Researchers should also check with relevant funding agencies and journal editorial policies, as best practices will continue to evolve. For a recent example of a GEO/SRA submission that includes Chromium, Visium, and Xenium data, see the superset associated with Janesick et al. (2023), Nature Communications.
Below are our latest data archiving guidelines for the Chromium, Visium, and Xenium platforms.
Chromium
For Chromium data, 10x generally recommends archiving the R1 and R2 FASTQ files instead of the BAM file. In Cell Ranger v8.0 and later, `--create-bam` is a required parameter, so BAM generation must be explicitly enabled or disabled. 10x recommends not creating a BAM for Chromium V(D)J data, because the V(D)J BAM does not contain all the tags needed to revert it to the original FASTQ data, or for probe-based assays (Flex). However, the following exceptions apply:
- We recommend submitting the BAM file for Chromium CellPlex data and for Hashtag Multiplexing data. For more information, see Which CellPlex data files should be uploaded/downloaded to/from public repositories such as SRA/GEO?
- For multiplex Flex and On Chip Multiplexing (OCM) data, if submission of data from individual samples pooled in a library is required/preferred, one can convert the per-sample BAM files to per-sample FASTQ files by following the instructions in this article.
- For Epi ATAC and Epi Multiome ATAC data, the R1, R2, and R3 FASTQ files are all required to reproduce the analysis, and all relevant information is stored in the BAM file output from the `cellranger-atac` or `cellranger-arc` pipelines. Therefore, uploading the 10x BAM file is the best way to archive Epi ATAC and Epi Multiome ATAC data. Users can convert 10x BAM files back to FASTQ format using the 10x Genomics tool bamtofastq.
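The Chromium guidance above can be summarized as a small lookup. This is only a minimal sketch: the assay labels and the `recommended_archive` function are hypothetical names chosen for illustration, not 10x identifiers, and the per-sample case for multiplex Flex and OCM data is deliberately left out.

```python
# Illustrative summary of the Chromium archiving guidance.
# The assay labels below are hypothetical keys chosen for this sketch.

# Data types for which the BAM file is the recommended archive format.
BAM_EXCEPTIONS = {
    "cellplex",
    "hashtag_multiplexing",
    "epi_atac",
    "epi_multiome_atac",
}


def recommended_archive(assay: str) -> str:
    """Return "BAM" or "FASTQ" for a Chromium data type.

    Any label not listed in BAM_EXCEPTIONS falls back to the general
    recommendation (FASTQ), which also covers V(D)J and Flex.
    """
    # Normalize labels such as "Epi Multiome ATAC" to "epi_multiome_atac".
    key = assay.strip().lower().replace(" ", "_").replace("-", "_")
    return "BAM" if key in BAM_EXCEPTIONS else "FASTQ"


if __name__ == "__main__":
    for label in ("CellPlex", "Epi ATAC", "Flex", "V(D)J"):
        print(label, "->", recommended_archive(label))
```

Defaulting unknown labels to FASTQ mirrors the general recommendation, but it is a design choice for this sketch; a stricter version could raise an error for labels it does not recognize.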
Visium
For Visium spatial applications, 10x recommends archiving the R1 and R2 FASTQ files. As with Cell Ranger, Space Ranger no longer generates BAM files by default, and BAM files are not recommended for probe-based assays because they add unnecessary computational overhead. Otherwise, BAM files are an acceptable alternative for data archiving (use `--create-bam` in `spaceranger count`). In addition to the sequence data, it is necessary to make the microscope and CytAssist images available because they are required inputs to Space Ranger. It is also highly recommended to document the Visium slide serial number and capture area.
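As a rough illustration, the Visium requirements above can be expressed as a pre-submission checklist. The `VisiumSubmission` class, its field names, and `check_submission` are all hypothetical and exist only for this sketch; they are not part of Space Ranger or any repository's submission tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VisiumSubmission:
    """Hypothetical description of what a Visium archive contains."""
    fastq_files: List[str] = field(default_factory=list)
    bam_file: Optional[str] = None
    probe_based: bool = False
    image_files: List[str] = field(default_factory=list)  # microscope and/or CytAssist images
    slide_serial_number: Optional[str] = None
    capture_area: Optional[str] = None


def check_submission(sub: VisiumSubmission) -> List[str]:
    """Return a list of human-readable issues; an empty list means none found."""
    issues = []
    if not sub.fastq_files:
        if sub.bam_file is None:
            issues.append("missing sequence data: add R1 and R2 FASTQs")
        elif sub.probe_based:
            issues.append("BAM only: FASTQs are recommended for probe-based assays")
    if not sub.image_files:
        issues.append("missing images: Space Ranger requires them as inputs")
    if not sub.slide_serial_number:
        issues.append("slide serial number not documented")
    if not sub.capture_area:
        issues.append("capture area not documented")
    return issues


if __name__ == "__main__":
    # Placeholder file names and identifiers, for illustration only.
    sub = VisiumSubmission(fastq_files=["sample_R1.fastq.gz", "sample_R2.fastq.gz"])
    for issue in check_submission(sub):
        print(issue)
```

The check treats a BAM file as acceptable sequence data only when the assay is not probe-based, matching the recommendation in the text. The slide serial number and capture area are reported as issues even though the text describes them as highly recommended rather than required.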
Xenium
We recommend archiving Xenium raw data outputs, which consist of 1) decoded transcripts with assigned Phred-scaled Q-Scores and 2) high-resolution morphology images. For more details, see Archiving Xenium Data.