Description
Hello everyone, I am using Canu to assemble reads that were sequenced by PacBio. I am running on an HPC cluster, and my command is as follows:
canu -p <project_name> genomeSize=900m -pacbio-raw corMhapFilterThreshold=0.0000000002 corMhapOptions="--threshold 0.80 --num-hashes 512 --num-min-matches 3 --ordered-sketch-size 1000 --ordered-kmer-size 14 --min-olap-length 2000 --repeat-idf-scale 50" mhapMemory=60g mhapBlockSize=500 ovlMerDistinct=0.975 gridEngineArrayMaxJobs=99 <sample1>.fastq.gz <sample2>.fastq.gz <sample3>.fastq.gz
Most of the settings above come from the FAQ on the Canu website (https://canu.readthedocs.io/en/latest/faq.html#my-assembly-is-running-out-of-space-is-too-slow), but even with all of them applied, my assembly has been running for days and is using 100 terabytes of disk space and counting. Is there any other way to speed up the process and reduce the storage it needs? Thank you all for your help.