Question: Is there a way to get the read counts for each barcode in addition to UMIs?
Answer: Most customers only need the UMI counts, because UMI counting corrects for amplification bias. If you are interested in the read counts, you can extract them from the
possorted_genome_bam.bam file with some custom coding. The Linux command shown below requires
samtools, a copy of which is included in your Cell Ranger installation.
Before starting, you should source the following file so Linux knows where to find
samtools. Please be sure to make changes to the bolded part below depending on where Cell Ranger is installed:
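Afterwards, the following command will compute reads per barcode. Only reads that carry a CB:Z: (corrected cell barcode) tag are counted, and each output line lists a read count followed by its barcode. Try it out with the first 1000 reads to see if the output matches your expectations. If the test run checks out, you can remove the "| head -n 1000" stage to process the entire BAM file.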
samtools view possorted_genome_bam.bam | head -n 1000 | grep CB:Z: | sed 's/.*CB:Z:\([ACGT]*\).*/\1/' | sort | uniq -c > reads_per_barcode
Please keep in mind that this command can take a while to run, since it must stream through the entire BAM file and sort millions of barcodes.
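If the sort step turns out to be the bottleneck, one possible alternative is to tally barcodes in memory with awk instead of sorting every read. The following is only a sketch under stated assumptions: count_reads_per_barcode is a hypothetical helper name (not part of Cell Ranger or samtools), it assumes tab-separated SAM records on standard input with optional tags starting at column 12, and it keeps only the leading A/C/G/T run of the barcode so that the result matches the sed expression above. Its output is tab-separated and unsorted.

```shell
# Hypothetical helper: count reads per cell barcode from SAM lines on stdin.
# Assumes tab-separated SAM records, where optional tags begin at column 12.
count_reads_per_barcode() {
    awk -F'\t' '
        {
            # Scan the optional tag columns for the CB:Z: tag.
            for (i = 12; i <= NF; i++) {
                if (substr($i, 1, 5) == "CB:Z:") {
                    bc = substr($i, 6)
                    # Drop everything after the leading A/C/G/T run (e.g. "-1").
                    sub(/[^ACGT].*$/, "", bc)
                    counts[bc]++
                    break
                }
            }
        }
        END {
            # Print "count<TAB>barcode"; the order of lines is not defined.
            for (bc in counts) print counts[bc] "\t" bc
        }
    '
}

# Example usage (assumes samtools is already on your PATH):
# samtools view possorted_genome_bam.bam | head -n 1000 | count_reads_per_barcode > reads_per_barcode
```

As with the command above, it is worth testing on the first 1000 reads before removing "| head -n 1000". This approach holds one counter per distinct barcode in memory rather than sorting one line per read, which may be faster on large files, but that has not been measured here.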
Disclaimer: This article and code-snippet are provided for instructional purposes only. 10x Genomics does not support or guarantee the code.