Question: Are there any parameters I can adjust to optimize the performance of Supernova?
Answer: There is only one assembly parameter that can be supplied to Supernova: the number of reads to use, which is specified with the --maxreads option. The optimal number of reads can be inferred from the genome size estimate. The recommendation is to target 56x raw coverage, calculated from the number of bases in the raw input reads (i.e., prior to any trimming).
The calculation is:
Number of Input Reads = (Estimated Genome Size*56)/(Read Length in bp)
Using 150 bp for the read length, this is where the 1.2 billion read recommendation for human comes from:
Number of Input Reads (Human) = (3,200,000,000*56)/150 = 1,194,666,667 reads
This will typically result in an effective coverage of approximately 42x in the final assembly.
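The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name reads_for_coverage and its parameter names are made up for this example and are not part of Supernova. The defaults (56x coverage, 150 bp reads) are the values quoted above.

```python
def reads_for_coverage(genome_size_bp, coverage=56.0, read_length_bp=150):
    """Return the number of input reads needed to reach a target raw coverage.

    Hypothetical helper (not part of Supernova). It implements:
        reads = (estimated genome size * coverage) / read length
    """
    if genome_size_bp <= 0 or coverage <= 0 or read_length_bp <= 0:
        raise ValueError("genome size, coverage and read length must be positive")
    # Round to the nearest whole read.
    return round(genome_size_bp * coverage / read_length_bp)


# Human example from the text: 3.2 Gb genome, 56x coverage, 150 bp reads.
human_reads = reads_for_coverage(3_200_000_000)
print(human_reads)  # 1194666667
```

The resulting number is the value that would be supplied to the --maxreads option.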
Note that not all genomes will necessarily have the same optimum coverage. There are examples in our test genomes where we used up to 70x raw input coverage. It may therefore be worthwhile to vary --maxreads to see how the assembly results change with coverage for your specific genome.
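As a rough sketch of how such a comparison might be planned, the Python snippet below lists the read counts for a few raw coverage targets. The 1 Gb genome size and the intermediate 63x step are arbitrary assumptions chosen for illustration, and maxreads_sweep is a hypothetical helper, not a Supernova function. Only the 56x and 70x figures come from the text above.

```python
def maxreads_sweep(genome_size_bp, coverages=(56, 63, 70), read_length_bp=150):
    """Map each raw coverage target to the matching number of input reads.

    Hypothetical helper: applies reads = genome size * coverage / read length
    to every coverage value and rounds to the nearest whole read.
    """
    return {
        cov: round(genome_size_bp * cov / read_length_bp)
        for cov in coverages
    }


# Assumed example genome of 1 Gb (illustrative only).
sweep = maxreads_sweep(1_000_000_000)
for cov, n_reads in sweep.items():
    print(f"{cov}x raw coverage -> {n_reads} reads")
```

Each read count would be passed to a separate assembly run through --maxreads, and the resulting assemblies compared.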
For more information please see Achieving Success with De Novo Assembly and Supernova 2.0 Performance.