Question: How can I demultiplex 10x multiome libraries sequenced on Illumina NextSeq 2000?
Answer: If you have version 2.x of Cell Ranger ARC, the pipeline supports FASTQ files generated onboard by these sequencers. However, if you use the older version 1.x of Cell Ranger ARC, then it is necessary to download Illumina's newer BCL Convert software and follow the steps described below.
To start, download and install BCL Convert from Illumina here. This example uses BCL Convert v3.7.5 for CentOS 6. You can use a command like this to extract the program from the rpm into the current directory, which does not require root access:
rpm2cpio ./BCLConvertv3.7.5forCentOS6.rpm | cpio -idmv
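As a quick sanity check after extraction, the sketch below looks for the executable under usr/bin relative to a given directory. The function name check_bcl_convert is hypothetical, and the usr/bin/bcl-convert location is an assumption based on the paths used in the commands later in this article.

```shell
# Hypothetical helper (not part of BCL Convert): confirm that the extracted
# binary exists and is executable under <prefix>/usr/bin.
# Assumption: the rpm was unpacked with rpm2cpio/cpio inside <prefix>.
check_bcl_convert() {
    prefix="${1:-$PWD}"
    bin="$prefix/usr/bin/bcl-convert"
    if [ -x "$bin" ]; then
        # Print the full path so it can be reused in a jobscript.
        echo "$bin"
    else
        echo "bcl-convert not found at $bin" >&2
        return 1
    fi
}
```

Calling check_bcl_convert with no argument checks the current directory. Pass the extraction directory as the first argument if you unpacked the rpm elsewhere.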
In this example, the data were sequenced on two flow cells: the gene expression (GEX) data on AAACJHTHV and the ATAC data on AAAF35NHV. BCL Convert must therefore be run twice, once for the GEX data and once for the ATAC data.
Here is an example CSV sample sheet for the GEX data:
[Header]
FileFormatVersion,2
RunName,MyRun
InstrumentPlatform,NextSeq
InstrumentType,NextSeq2000
[Reads]
Read1Cycles,28
Read2Cycles,90
Index1Cycles,10
Index2Cycles,10
[BCLConvert_Settings]
CreateFastqForIndexReads,0
[BCLConvert_Data]
Lane,Sample_ID,index,index2
1,OR337_PBMC_500_rep1_A,GTAGCCCTGT,GAGCATCTAT
2,OR337_PBMC_5k_rep1_A,CGCGGTAGGT,CAGGATGTTG
Here is an example CSV sample sheet for the ATAC data. Note that the reads and settings differ from those used for the GEX data:
[Header]
FileFormatVersion,2
RunName,MyRun
InstrumentPlatform,NextSeq
InstrumentType,NextSeq2000
[Reads]
Read1Cycles,50
Read2Cycles,49
Index1Cycles,8
Index2Cycles,24
[BCLConvert_Settings]
CreateFastqForIndexReads,1
TrimUMI,0
OverrideCycles,Y50;I8;U24;Y49
[BCLConvert_Data]
Lane,Sample_ID,index
1,OR373_G19_rep1_ATAC_NextSeq_2k_bclconvert,ATTGGGAA
1,OR373_G19_rep1_ATAC_NextSeq_2k_bclconvert,CAGTCTGG
1,OR373_G19_rep1_ATAC_NextSeq_2k_bclconvert,GGCATACT
1,OR373_G19_rep1_ATAC_NextSeq_2k_bclconvert,TCACACTC
Note that the same sample is listed 4 times on separate lines, once for each index. This is because 10x ATAC libraries are single-indexed with a mixture of 4 oligonucleotides per index. All 4 index sequences must be listed; otherwise, only some of the reads will be assigned to the FASTQ files.
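Because a missing index row silently drops reads, it can help to count the rows per sample before starting a run. The following is a minimal sketch, not an Illumina or 10x tool. The function name count_index_rows is hypothetical, and it assumes a comma-separated sample sheet with Unix line endings and a Sample_ID column in the [BCLConvert_Data] section.

```shell
# Hypothetical helper: for each Sample_ID in the [BCLConvert_Data] section,
# print "Sample_ID,number_of_rows". An ATAC sample should report 4.
count_index_rows() {
    awk -F',' '
        # A line starting with "[" opens a new section; track whether it is the data section.
        /^\[/ { in_data = ($1 == "[BCLConvert_Data]"); col = 0; next }
        in_data && NF > 0 {
            # The first non-empty line of the section is the column header:
            # locate the Sample_ID column, then skip the header itself.
            if (!col) { for (i = 1; i <= NF; i++) if ($i == "Sample_ID") col = i; next }
            count[$col]++
        }
        END { for (s in count) print s "," count[s] }
    ' "$1"
}
```

For the ATAC sample sheet above, count_index_rows /path/to/AAAF35NHV.csv should list the single sample with a count of 4. Any ATAC sample with a lower count is missing index rows.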
To run BCL Convert, use a command like this in your jobscript, first for the GEX data:
/path/to/usr/bin/bcl-convert --bcl-input-directory /path/to/nxseq2k001a/210208_VH00391_3_AAACJHTHV --output-directory outs_gex --sample-sheet /path/to/AAACJHTHV_GEX.csv
Run it again to produce the ATAC FASTQs:
/path/to/usr/bin/bcl-convert --bcl-input-directory /path/to/nxseq2k001a/210329_VH00391_19_AAAF35NHV --output-directory outs_atac --sample-sheet /path/to/AAAF35NHV.csv
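If you prefer to keep both runs in one jobscript, the two commands above can be wrapped in a small function. This is only a sketch: run_bcl_convert, BCL_CONVERT, and DRY_RUN are hypothetical names introduced here, and the function uses only the three options shown in the commands above.

```shell
# Hypothetical wrapper: build one bcl-convert command from a run folder,
# an output directory, and a sample sheet.
# BCL_CONVERT and DRY_RUN are illustrative variables, not BCL Convert settings.
run_bcl_convert() {
    run_dir="$1"; out_dir="$2"; sheet="$3"
    # Rebuild the positional parameters as the full command line.
    set -- "${BCL_CONVERT:-/path/to/usr/bin/bcl-convert}" \
        --bcl-input-directory "$run_dir" \
        --output-directory "$out_dir" \
        --sample-sheet "$sheet"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print the command so it can be reviewed first.
        echo "$@"
    else
        "$@"
    fi
}

# Example calls (commented out; paths are the placeholders used in this article):
# run_bcl_convert /path/to/nxseq2k001a/210208_VH00391_3_AAACJHTHV outs_gex /path/to/AAACJHTHV_GEX.csv
# run_bcl_convert /path/to/nxseq2k001a/210329_VH00391_19_AAAF35NHV outs_atac /path/to/AAAF35NHV.csv
```

With DRY_RUN unset or set to 1, the function only prints each command. Set DRY_RUN=0 to execute the conversions.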
Finally, once you have successfully run BCL Convert on both flow cells, follow these instructions to run cellranger-arc count on both sets of FASTQ files. No renaming of the FASTQs is necessary.
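cellranger-arc count takes a libraries CSV that points to the GEX and ATAC FASTQ directories. The sketch below writes such a file, assuming the fastqs,sample,library_type column layout described in the Cell Ranger ARC documentation. The function name make_libraries_csv is hypothetical, and which GEX sample pairs with which ATAC sample is not stated in this example, so the arguments are left for you to fill in.

```shell
# Hypothetical helper: print a libraries CSV for cellranger-arc count.
# Arguments: GEX FASTQ directory, GEX sample name,
#            ATAC FASTQ directory, ATAC sample name.
# Assumption: sample names match the Sample_ID values in the sample sheets.
make_libraries_csv() {
    printf 'fastqs,sample,library_type\n'
    printf '%s,%s,Gene Expression\n' "$1" "$2"
    printf '%s,%s,Chromatin Accessibility\n' "$3" "$4"
}
```

For example, make_libraries_csv /path/to/outs_gex GEX_SAMPLE /path/to/outs_atac ATAC_SAMPLE > libraries.csv, where GEX_SAMPLE and ATAC_SAMPLE are placeholders for your own Sample_ID values. Full paths to the FASTQ directories are the safer choice.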