Question: Our sequencing core has given me the FASTQ files from a Chromium run (3' v2 sequencing chemistry) with 150 bp reads for R1 and R2. My question is whether there is a way to use these FASTQ files (trimmed or otherwise) to provide Cell Ranger with the appropriate I1, R1, and R2 FASTQ files? Or should I get the raw BCL output and use
Answer: If you sequenced more bases than necessary on any of your reads (R1, R2, or index), you have three basic options.
1. You can leave the sequences "as-is." It may be helpful to specify the chemistry (as of Cell Ranger 2.1.x) using the --chemistry option (e.g.
- For Single Cell 3' v1 samples, extra cycles on the i7 index read (cell barcode) and on R2 (UMI) will be ignored. By default, all of R1 (the RNA read) is used, and longer R1 lengths should improve mapping rates.
- For Single Cell 3' v2 samples, extra cycles on R1 (cell barcode and UMI) will be ignored. By default, all of R2 (the RNA read) is used, and longer R2 lengths should improve mapping rates.
2. You (or your sequencing core) can demultiplex your BCL data using the --use-bases-mask option in cellranger mkfastq or bcl2fastq to mask the extra bases so that they never appear in your FASTQ files. This will save some disk space, especially for Read 1, where the extra sequence would be ignored anyway.
3. Alternatively, if you are unwilling or unable to demultiplex again but still want to save as much disk space as possible, you could use a third-party tool to trim your FASTQs directly.
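
To make option 3 concrete, here is a minimal Python sketch of trimming a FASTQ file directly. It is not a 10x tool and only illustrates the idea. It assumes plain 4-line FASTQ records and a 3' v2 Read 1 layout of 26 bases (16 bp cell barcode plus 10 bp UMI); please confirm that length for your own chemistry before trimming. The function names and the file names in the comment are hypothetical.

```python
import gzip
from typing import Iterable, Iterator


def trim_records(lines: Iterable[str], keep: int) -> Iterator[str]:
    """Yield FASTQ lines with sequence and quality cut to `keep` bases.

    Assumes strict 4-line records: header, sequence, '+', qualities.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        # Within each record, line 1 is the sequence and line 3 the
        # qualities (0-based); both must be cut to the same length.
        if i % 4 in (1, 3):
            line = line[:keep]
        yield line + "\n"


def trim_fastq_gz(in_path: str, out_path: str, keep: int) -> None:
    """Trim every record of a gzipped FASTQ file and write a new file."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(trim_records(src, keep))


# Small in-memory demonstration with one made-up 40-base record.
record = ["@read1\n", "ACGT" * 10 + "\n", "+\n", "I" * 40 + "\n"]
trimmed = list(trim_records(record, keep=26))
print("".join(trimmed), end="")

# For real data (hypothetical file names):
# trim_fastq_gz("sample_S1_L001_R1_001.fastq.gz", "trimmed_R1.fastq.gz", 26)
```

Only Read 1 would be trimmed this way for a 3' v2 sample, since all of R2 is used by default. Header lines are left untouched, so read names still pair up with the untrimmed R2 file.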
For more information please see: Using bcl2fastq directly.
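
For option 2, the value passed to --use-bases-mask is a comma-separated string with one entry per read, where Y keeps cycles, I marks index cycles, and n masks cycles. The short Python sketch below only builds such a string. It assumes a 3' v2 run sequenced as 150 cycles for R1, 8 for the i7 index, and 150 for R2, keeping 26 bases of R1 and all of R2; these numbers are illustrative and should be checked against your own run. The helper names are hypothetical.

```python
def read_mask(kind: str, keep: int, sequenced: int) -> str:
    """Build the mask for one read: keep `keep` cycles, mask the rest."""
    if kind not in ("Y", "I"):
        raise ValueError("kind must be 'Y' (sequence read) or 'I' (index read)")
    if not 0 < keep <= sequenced:
        raise ValueError("keep must be between 1 and the sequenced cycle count")
    mask = f"{kind}{keep}"
    extra = sequenced - keep
    if extra:
        # Mask the remaining cycles explicitly so the entry covers the whole read.
        mask += f"n{extra}"
    return mask


def bases_mask(reads) -> str:
    """Join per-read masks in run order (for example R1, I1, R2)."""
    return ",".join(read_mask(kind, keep, seq) for kind, keep, seq in reads)


# Assumed layout: (kind, cycles to keep, cycles sequenced) for R1, I1, R2.
mask = bases_mask([("Y", 26, 150), ("I", 8, 8), ("Y", 150, 150)])
print(f"--use-bases-mask={mask}")
```

With these assumed cycle counts, the extra 124 cycles of Read 1 are masked, while R2 is kept in full because longer RNA reads should improve mapping rates.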