Question: I have a dual indexed library that was also sequenced in dual indexed mode. But the quality of one of my index reads is low. Now, I want to demultiplex my data using only one of the indices. How can I do that?
Answer: If one index read of a dual-indexed library was sequenced at lower quality than expected, the other index can sometimes be used on its own for demultiplexing to recover the data.
When you demultiplex a flow cell with low-quality index reads, many reads may fail to match any of your samples and go into the 'Undetermined' files instead. You will then see a large percentage of reads in the 'Undetermined' files even though the indices in your sample sheet are correct.
It is recommended to set the --barcode-mismatches argument to 2, the maximum number of allowable mismatches, to minimize the effect of sequencing errors on demultiplexing.
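To get a rough sense of how many reads ended up unassigned, you can count the reads in a FASTQ file directly. The following is only a minimal sketch: the function name count_fastq_reads is hypothetical (it is not part of Cell Ranger), the file paths in the usage comments are placeholders, and it assumes gzip-compressed FASTQ files with exactly four lines per read.

```shell
# Hypothetical helper (not part of Cell Ranger): count reads in a gzipped FASTQ.
# Assumes the standard 4-lines-per-record FASTQ layout.
count_fastq_reads() {
  lines=$(gzip -dc "$1" | wc -l)
  echo $((lines / 4))
}

# Example usage (placeholder paths, adjust to your own output directory):
#   count_fastq_reads /path/to/fastqs/Undetermined_S0_L001_R1_001.fastq.gz
#   count_fastq_reads /path/to/fastqs/sample1_S1_L001_R1_001.fastq.gz
```

Comparing the count for the 'Undetermined' file against the counts for your sample files gives an approximate fraction of unassigned reads for that lane.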
Run Cell Ranger mkfastq to demultiplex using only the i7 index with 2 allowed mismatches:
cellranger mkfastq --id=FASTQ_output_directory \
--use-bases-mask=Y28n*,I10,N10,Y90n* \
--filter-single-index \
--barcode-mismatches=2 \
--run=/path/to/run/directory/ \
--csv=samplesheet.csv
Make sure that the samplesheet.csv file contains the nucleotide sequences for only the first (i7) index:
Lane,Sample,Index
*,sample1,ATGGCTTGTG
*,sample2,CCTTCTAGAG
Run Cell Ranger mkfastq to demultiplex using only the i5 index with 2 allowed mismatches:
cellranger mkfastq --id=FASTQ_output_directory \
--use-bases-mask=Y28n*,N10,I10,Y90n* \
--filter-single-index \
--barcode-mismatches=2 \
--run=/path/to/run/directory/ \
--csv=samplesheet.csv
Make sure that the samplesheet.csv file contains the nucleotide sequences for only the second (i5) index, and use the reverse complement sequences if you are working with data from workflow_b instruments:
Lane,Sample,Index
*,sample1,CACAACATTC
*,sample2,TCGTTGTATT
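If you need to work out a reverse complement yourself, a short shell sketch such as the one below may help. It assumes the rev and tr utilities are available on your system. The function name revcomp is hypothetical and not part of Cell Ranger, and the input sequence is an arbitrary illustration rather than a real 10x index.

```shell
# Hypothetical helper (not part of Cell Ranger): reverse-complement a DNA sequence.
# rev reverses the string; tr swaps each base for its complement.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Arbitrary example sequence, for illustration only.
revcomp AACCGGTTAC   # prints GTAACCGGTT
```

Check which orientation your instrument requires before editing the sample sheet, since applying the reverse complement when it is not needed will also leave reads unassigned.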
Please note that demultiplexing with only one index is not usually recommended except in special circumstances, such as a low-quality sequencing run. Demultiplexing with only one index will not help prevent index hopping.
If you are concerned about index hopping, the 10x index-hopping-filter software tool can be used to detect and remove likely index hopped reads from the demultiplexed FASTQs. This tool is available for download here: https://support.10xgenomics.com/docs/index-hopping-filter
The index-hopping-filter software tool will output new, filtered FASTQs that are suitable for use with cellranger.
Related articles:
How to demultiplex a single indexed library on a dual indexed flow cell?
How to use masking parameter while demultiplexing 10x sequencing data
How can I demultiplex my data if I sequenced 8bp of the index reads instead of 10bp?