QUAST_BINS can be very slow / Reconfigure BIN_SUMMARY table · Issue #789 · nf-core/mag · GitHub
Open
@prototaxites

Description of the bug

Creating this issue as requested. It was over a year ago now, but I had trouble with the QUAST_BINS process failing when run on very large numbers of bins.

My thought at the time was that this had something to do with the module being implemented as a loop over each input FASTA file (https://github.com/nf-core/mag/blob/master/modules/local/quast_bins.nf), and with it doing some kind of ORF calling for each bin.

edit: MetaQUAST calls genes using MetaGeneMark according to the paper: https://academic.oup.com/bioinformatics/article/32/7/1088/1743987

"In contrast with regular QUAST which uses GeneMarkS, MetaQUAST uses MetaGeneMark ([Zhu et al., 2010]) for gene prediction, which is developed specially for metagenomes."

edit 2:

The local module sets --rna-finding --gene-finding, so gene finding (which QUAST disables by default) is turned on. A quick fix (ignoring the SeqKit stuff) is to just remove these arguments, since the pipeline profiles genes using Prodigal and Prokka anyway.
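
To make the proposed fix concrete, here is a minimal Python sketch of the per-bin loop with the two flags made optional. It is not the module's actual code: the helper name `build_quast_commands`, the `metaquast.py` executable name, and the `--threads` / `-o` options are assumptions for illustration; only `--rna-finding` and `--gene-finding` come from the issue text.

```python
from pathlib import Path

# Flags named in the issue; removing them is the proposed quick fix.
GENE_FLAGS = ["--rna-finding", "--gene-finding"]


def build_quast_commands(bins, outdir="QUAST", threads=1, gene_finding=False):
    """Return one MetaQUAST command (as an argument list) per bin FASTA.

    Hypothetical helper: the executable name and the --threads / -o
    options are assumptions about how the module invokes MetaQUAST.
    """
    commands = []
    for fasta in bins:  # one invocation per bin, mirroring the loop in the module
        name = Path(fasta).name
        cmd = ["metaquast.py", "--threads", str(threads)]
        if gene_finding:
            # Only add the gene/rRNA prediction flags when explicitly requested.
            cmd += GENE_FLAGS
        cmd += ["-o", f"{outdir}/{name}", str(fasta)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Print the commands that would be run for two example bins.
    for cmd in build_quast_commands(["bin.1.fa", "bin.2.fa"]):
        print(" ".join(cmd))
```

With `gene_finding=False` each bin still gets its own invocation, but without the gene prediction step that is suspected of dominating the runtime.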

Labels: enhancement (New feature or request)
