USAGE

Nextflow workflows are generally portable because each process runs in a container. Grandeur therefore requires either Docker or Singularity.

Installation is covered on a different wiki page.

Grandeur can run as a standalone workflow starting from paired-end Illumina fastq files, or it can start from contig/fasta files from other sources, including:

fasta/contig files from PHOENIX
fasta files from Donut Falls
fasta files downloaded from NCBI

We have created some examples for typical use-cases on a different wiki page. We welcome additional suggestions to this page, and communications with us may be anonymized and used to augment this wiki.

Choosing the right profile

For simplicity, Grandeur has some profiles that should fit the majority of uses. The workflow is meant to work with containers and has basic profiles for both docker and singularity.

Choose a container manager

singularity : use singularity to manage containers
docker : use docker to manage containers

Test profiles

All test profiles download reads from SRA accessions with fasterq-dump. As such, if fasterq-dump fails due to authentication issues, the entire workflow will fail. Most information about the test subworkflow can be found on a different wiki page.

Selected test profiles:

test0 : default values for fastq files
test2 : default phylogenetic analysis

Phylogenetic analysis and other profiles

- msa                    : multiple sequence alignment of the input files with roary (all inputs should be related)
- just_msa               : same as msa, but also turns off processes that are not directly used for the alignment
- uphl                   : the profile used at UPHL (not intended to work on other systems)

WARNING: All input files for *msa* profiles must be somewhat related (i.e. same species) because they need to share enough genes in their core genome.
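
Nextflow accepts several profiles as a comma-separated list, so a container manager profile can be combined with one of the profiles above. The sketch below only prints the resulting command so it can be inspected before running; `grandeur_cmd` is a hypothetical helper name and is not part of Grandeur.

```shell
# Print (without running) a Grandeur command that combines a container
# manager profile with one optional extra profile such as msa or just_msa.
# grandeur_cmd is a hypothetical helper, not something Grandeur provides.
grandeur_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: grandeur_cmd <docker|singularity> <extra-profile-or-empty> [options]" >&2
    return 1
  fi
  local manager="$1" extra="$2"
  shift 2
  # Only the two container managers named in this page are accepted.
  case "$manager" in
    docker|singularity) ;;
    *) echo "container manager must be docker or singularity" >&2; return 1 ;;
  esac
  local profiles="$manager"
  # Append the extra profile after a comma when one was given.
  if [ -n "$extra" ]; then
    profiles="${profiles},${extra}"
  fi
  echo "nextflow run UPHL-BioNGS/Grandeur -profile ${profiles} $*"
}
```

For example, `grandeur_cmd singularity msa --fastas directory/fastas` prints `nextflow run UPHL-BioNGS/Grandeur -profile singularity,msa --fastas directory/fastas`.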

Setting where the files are copied.

For all settings and inputs, the results are copied to the directory specified with the 'outdir' param. The default is 'grandeur'.

Standalone

Paired-end fastq files (ending with 'fastq', 'fastq.gz', 'fq', or 'fq.gz') in a directory, here named 'directory/reads'

(can be set in a config file with 'params.reads')

directory
└── reads
     └── *fastq.gz

Usage:

nextflow run UPHL-BioNGS/Grandeur -profile docker --reads directory/reads

From a sample sheet

When using a sample sheet, Grandeur expects a CSV file with the columns 'sample', 'fastq_1', and 'fastq_2'.

sample : value used in Grandeur for filenames
fastq_1 : forward read or read 1 of a paired-end fastq file
fastq_2 : reverse read or read 2 of a paired-end fastq file

Example sample sheet:

sample,fastq_1,fastq_2
SRR11725329,/home/eriny/sandbox/test_files/grandeur/reads/SRR11725329_1.fastq.gz,/home/eriny/sandbox/test_files/grandeur/reads/SRR11725329_2.fastq.gz
SRR13643280,/home/eriny/sandbox/test_files/grandeur/reads/SRR13643280_1.fastq.gz,/home/eriny/sandbox/test_files/grandeur/reads/SRR13643280_2.fastq.gz
SRR14436834,/home/eriny/sandbox/test_files/grandeur/reads/SRR14436834_1.fastq.gz,/home/eriny/sandbox/test_files/grandeur/reads/SRR14436834_2.fastq.gz
SRR14634837,/home/eriny/sandbox/test_files/grandeur/reads/SRR14634837_1.fastq.gz,/home/eriny/sandbox/test_files/grandeur/reads/SRR14634837_2.fastq.gz
SRR7738178,/home/eriny/sandbox/test_files/grandeur/reads/SRR7738178_1.fastq.gz,/home/eriny/sandbox/test_files/grandeur/reads/SRR7738178_2.fastq.gz
SRR7889058,/home/eriny/sandbox/test_files/grandeur/reads/SRR7889058_1.fastq.gz,/home/eriny/sandbox/test_files/grandeur/reads/SRR7889058_2.fastq.gz

Usage:

nextflow run UPHL-BioNGS/Grandeur -profile docker --sample_sheet sample_sheet.csv
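
Writing a sample sheet by hand is error-prone for larger runs. A minimal sketch of generating one from a directory of reads follows, assuming read pairs are named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz` as in the example above. `make_sample_sheet` is a hypothetical helper and is not part of Grandeur.

```shell
# Print a sample sheet (sample,fastq_1,fastq_2) for the read pairs in a directory.
# Assumption: pairs are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.
# make_sample_sheet is a hypothetical helper, not something Grandeur provides.
make_sample_sheet() {
  local reads_dir r1 r2 sample
  # Resolve to an absolute path so the sheet works from any launch directory.
  reads_dir="$(cd "$1" && pwd)" || return 1
  echo "sample,fastq_1,fastq_2"
  for r1 in "$reads_dir"/*_1.fastq.gz; do
    # If nothing matched, the glob is left as a literal string; skip it.
    [ -e "$r1" ] || continue
    r2="${r1%_1.fastq.gz}_2.fastq.gz"
    if [ ! -e "$r2" ]; then
      echo "skipping $r1: no matching read 2 file" >&2
      continue
    fi
    sample="$(basename "$r1" _1.fastq.gz)"
    echo "${sample},${r1},${r2}"
  done
}
```

Redirecting the output, as in `make_sample_sheet directory/reads > sample_sheet.csv`, gives a file that can be passed to `--sample_sheet`. Reads without a mate are reported on stderr and left out of the sheet.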

Fasta files

Fasta files could be from prior versions of Grandeur, created by another workflow, or downloaded from NCBI. There are two ways to provide fasta files to Grandeur.

Option 1 : Put all the fasta files into a single directory (file names must end in '.fasta', '.fa', or '.fna')

directory
└── fastas
     └── *fasta

Then run the nextflow command:

nextflow run UPHL-BioNGS/Grandeur -profile docker --fastas directory/fastas

Option 2 : List the fasta files in a text file and pass that file with --fasta_list (which is similar to a sample sheet)

Example fasta list:

sample1.fasta
sample2.fasta
sample3.fasta

Then run the nextflow command:

nextflow run UPHL-BioNGS/Grandeur -profile docker --fasta_list fastas.txt
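
A fasta list can also be generated rather than typed. The sketch below writes one path per line for every file with an accepted extension in a directory. `make_fasta_list` is a hypothetical helper, and the use of absolute paths is an assumption: the example above shows bare file names, and this page does not state how listed paths are resolved.

```shell
# Print one fasta path per line for the files in a directory.
# Assumption: absolute paths are acceptable entries for --fasta_list.
# make_fasta_list is a hypothetical helper, not something Grandeur provides.
make_fasta_list() {
  local fasta_dir f
  fasta_dir="$(cd "$1" && pwd)" || return 1
  # Only the extensions named in Option 1 are listed.
  for f in "$fasta_dir"/*.fasta "$fasta_dir"/*.fa "$fasta_dir"/*.fna; do
    # Unmatched globs are left as literal strings; skip them.
    [ -e "$f" ] || continue
    echo "$f"
  done
}
```

For example, `make_fasta_list directory/fastas > fastas.txt` produces a file for `--fasta_list`.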

After PHOENIX

PHOENIX is a Nextflow workflow developed for the identification of known antimicrobial resistance (AMR) genes, and one of its core features is de novo assembly to generate contig files. The authors of Grandeur see no real benefit in running a new de novo assembly on the same reads, so the resultant contig/fasta files from PHOENIX can be used as input instead of fastq files.

Copy the PHOENIX-generated contig files to a directory, and then specify that directory with '--fastas ' or set 'params.fastas = ' in a config file.
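
The copy step can be sketched as below. This assumes the contig files are uncompressed and already end in one of the extensions Grandeur accepts; the layout of the PHOENIX output directory is not described on this page, so the search is simply recursive. `collect_fastas` is a hypothetical helper name.

```shell
# Copy every .fa, .fasta, or .fna file found under a source directory
# into one flat destination directory.
# Assumptions: files are uncompressed, and file names are unique across
# subdirectories (a later copy overwrites an earlier one with the same name).
# collect_fastas is a hypothetical helper, not part of Grandeur or PHOENIX.
collect_fastas() {
  local src="$1" dest="$2"
  if [ ! -d "$src" ]; then
    echo "not a directory: $src" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  # Keep the destination outside the source tree so files are not copied onto themselves.
  find "$src" -type f \( -name '*.fa' -o -name '*.fasta' -o -name '*.fna' \) \
    -exec cp {} "$dest"/ \;
}
```

After `collect_fastas <source directory> directory/fastas`, the destination can be passed to `--fastas`.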

Fasta files (ending with 'fa', 'fasta', or 'fna')

(can be set in a config file with 'params.fastas')

directory
└── fastas
     └── *fasta

Usage:

nextflow run UPHL-BioNGS/Grandeur -profile docker --fastas directory/fastas

In practice, this means Grandeur will attempt to run any fasta file it is given through its subworkflows and processes, so we request that users only post issues about using microbial sequence files. (Do not give Grandeur Candida auris files!)

Computational environments

Grandeur is a Nextflow workflow and should work on:

local linux instances (as long as the cpu and memory are sufficient for the tools used)
HPC environments
cloud-based systems that support nextflow workflows (such as AWS)

Each of these environments may need inputs from the user.

More information can be found in Nextflow's documentation. A highlighted list of pages that may be useful includes:

information about setting an executor : https://www.nextflow.io/docs/latest/executor.html
tips for hpc users : https://www.nextflow.io/blog/2021/5_tips_for_hpc_users.html
using nextflow in AWS cloud : https://www.nextflow.io/docs/latest/awscloud.html

A copy of an editable config file with many of the params needed for these options can be obtained with the following command:

nextflow run UPHL-BioNGS/Grandeur --config_file true

More information about config files can be found on a different page of this wiki.
