Question: I have sequencing data from a NovaSeq, and we have upgraded our control software to v1.8.0. For demultiplexing with mkfastq (cellranger, cellranger-arc, or spaceranger), I am using a simple sample sheet CSV and have confirmed that my 10x index name (e.g. SI-TT-G10) is correct, but reads were only assigned to the Undetermined FASTQ files. Why are my reads going into Undetermined?
Answer:
The Illumina NovaSeq control software v1.8 upgrade affected mkfastq's (cellranger, cellranger-arc, spaceranger) ability to autodetect the i5 (Index 2) orientation due to Illumina's reagent name changes in the recipe xml file.
NOTE: This issue has been fixed in Cell Ranger v7.0.1, Cell Ranger ARC v2.0.2, and Space Ranger v2.0.0. If you are experiencing this issue, please upgrade software to at least these versions.
If you are unable to upgrade to the above versions, use one of the workarounds described below.
This applies to all dual index 10x products.
You can identify whether your NovaSeq instrument software was upgraded to v1.8.0 by locating these lines in your RunParameters.xml file located within the instrument's run directory:
You may experience this issue if you used a simple sample sheet CSV with the 10x index plate name (e.g. SI-TT-G10) such as this:
Lane,Sample,Index
*,sample3,SI-TT-G10
To confirm the problem, open the laneBarcode.html demultiplexing report, which is located in the output directory at:
novaseq_mkfastq_output/outs/fastq_path/Reports/html/*/all/all/all/laneBarcode.html
In this report, the i5 indices (Index 2) for your sample appear in the forward (workflow_a) orientation, and no reads are assigned to the sample.
For example:
In the "Top Unknown Barcodes" (Undetermined files) section of the demultiplexing report, the i5 indices instead appear in the reverse complement (workflow_b) orientation. For example:
Article link: What is workflow_a and what is workflow_b?
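The comparison described above can be sketched in Python. This is a minimal illustration, not part of any 10x or Illumina tool: the function names are hypothetical, the report's barcodes are assumed to be written as `i7+i5`, and the entries in `top_unknown` are made-up stand-ins for rows copied from the "Top Unknown Barcodes" table. The SI-TT-G10 sequences are the ones shown in workaround 2 of this article, where AGGGCGTTCA is the workflow_b form, so its reverse complement TGAACGCCCT is taken as the forward form.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(complement)[::-1]


def i5_orientation_mismatch(expected: str, unknown_barcodes: list[str]) -> bool:
    """True if an unknown barcode equals the expected i7 plus the reverse-complemented i5."""
    i7, i5 = expected.split("+")
    flipped = f"{i7}+{reverse_complement(i5)}"
    return flipped in unknown_barcodes


# Barcode listed for the sample (i5 in forward, workflow_a orientation).
expected = "ACTTTACGTG+TGAACGCCCT"
# Hypothetical entries copied from the "Top Unknown Barcodes" table.
top_unknown = ["ACTTTACGTG+AGGGCGTTCA", "GGGGGGGGGG+AGATCTCGGT"]

print(i5_orientation_mismatch(expected, top_unknown))  # prints True
```

A result of True suggests that the reads carry the i5 index in the workflow_b orientation, which matches the symptom described in this article.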
How can I fix this issue?
There are three ways to work around the failed auto-detection of the i5 index orientation:
1. Run the mkfastq pipeline with the `--rc-i2-override` option set to true. For example, if you are demultiplexing single cell gene expression data, you would use a command like the following:
cellranger mkfastq --id=novaseq_override \
--rc-i2-override=true \
--run=/path/to/run_directory/your_data \
--csv=simple_sample_sheet.csv
2. Explicitly specify the Index2 sequence in the reverse complement (workflow_b) orientation in the sample sheet. You can look up the index sequences in the 10x index plate CSV for the kit you used. For example:
Lane,Sample,Index,Index2
*,sample3,ACTTTACGTG,AGGGCGTTCA
Then run the mkfastq command as follows:
cellranger mkfastq --id=novaseq_seqs \
--run=/path/to/run_directory/your_data \
--csv=sample_sheet_seqs.csv
3. Directly run Illumina's bcl2fastq or bcl-convert demultiplexing software with an IEM sample sheet, using the i5 workflow_b (reverse complement) orientation:
Article link: Is mkfastq really needed to demultiplex, or can we use bcl2fastq?
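For workaround 2, the explicit sample sheet can also be generated programmatically. The following Python sketch assumes you start from the forward (workflow_a) i5 sequence and need its reverse complement for the Index2 column; `build_sample_sheet` is a hypothetical helper, not a 10x utility, and the forward sequence TGAACGCCCT is simply the reverse complement of the Index2 value shown in the example above.

```python
import csv
import io


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(complement)[::-1]


def build_sample_sheet(samples) -> str:
    """Build a simple sample sheet with Index2 in workflow_b orientation.

    samples: iterable of (lane, sample_name, i7, i5_forward) tuples.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Lane", "Sample", "Index", "Index2"])
    for lane, name, i7, i5_forward in samples:
        # Reverse-complement the i5 so it matches what the instrument read.
        writer.writerow([lane, name, i7, reverse_complement(i5_forward)])
    return buffer.getvalue()


sheet = build_sample_sheet([("*", "sample3", "ACTTTACGTG", "TGAACGCCCT")])
print(sheet, end="")
```

The printed text can be saved as sample_sheet_seqs.csv and passed to mkfastq with the --csv option, as in the command shown for workaround 2.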
If you have any questions or concerns, please contact support@10xgenomics.com.
Products: Single Cell Gene Expression, Single Cell Immune Profiling, Visium for Fresh Frozen, Visium for FFPE, Multiome Gene Expression