erblast/snakemake_minimal: minimal snakemake repo for mixing R and python scripts

snakemake minimal workflow

The Snakefile supplies a set of rules, each describing how the workflow produces its output files from its input files.

It is customary to start with rule all, an empty rule that lists all final output files as its input files. Because it is the first rule, snakemake treats it as the target, works through the remaining rules to build an execution sequence, and determines which steps can be executed in parallel.
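The idea of resolving an execution sequence from a final target can be illustrated with a small standalone sketch. This is a toy model using Python's standard graphlib module, not snakemake's actual scheduler, and the rule names are hypothetical rather than taken from this repo's Snakefile.

```python
from graphlib import TopologicalSorter

# Hypothetical rules: each rule name maps to the rules whose outputs it needs.
# `all` produces nothing itself; it only depends on the final outputs.
rules = {
    "all": {"plot_r", "plot_py"},
    "plot_r": {"clean_data"},
    "plot_py": {"clean_data"},
    "clean_data": set(),
}


def execution_batches(dependencies):
    """Group rules into batches; rules within one batch do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything whose dependencies have already finished is ready now.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(rules)
print(batches)
```

Each inner list is one batch: the two plotting rules land in the same batch, so they could run side by side when more than one core is available.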

Run in docker container

docker run -it --rm -v "$PWD":/app erblast/r_conda_snakemake_pkgs

Execute

snakemake

Dryrun

snakemake -n

Execute after code changes

snakemake -R `snakemake --list-code-changes`

Force re-execution

snakemake -F

Parallel Processing

snakemake --cores 3

Execute and build conda environment

The conda environment is reconstructed from the yml file and stored in ./.snakemake/conda. Each rule can define its own conda environment.

conda env export --name snakemake_minimal -f ./envs/snake_minimal_macos.yml
snakemake --use-conda

Bringing it all together

snakemake -R `snakemake --list-code-changes` --use-conda --cores 3

Visualize workflow

snakemake --dag | dot -Tpng > ./docs/wflow.png

Build Report

snakemake --report docs/index.html

YAML configuration file

config.yml

Shell vs Scripts

Scripts in R and Python have access to a snakemake object that carries the rule's parameters (input, output, params and so on) as attributes. However, when a step can be expressed as a shell command, snakemake's parallel processing and logging capabilities can be leveraged.
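A minimal sketch of a Python script that reads from the snakemake object is shown here. The function name, the file names, and the label parameter are hypothetical and not taken from this repo. When the file is run outside snakemake, a stand-in object is built so the sketch can be tried on its own.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def count_lines(smk):
    """Count the lines of the first input file and write the result to the first output file."""
    n = len(Path(smk.input[0]).read_text().splitlines())
    # `label` is a hypothetical entry in the rule's `params:` section.
    Path(smk.output[0]).write_text(f"{smk.params.label}: {n}\n")
    return n


try:
    # snakemake injects a global named `snakemake` when the file runs via `script:`.
    smk = snakemake
except NameError:
    # Stand-in used only when the file is run directly, outside snakemake.
    workdir = Path(tempfile.mkdtemp())
    (workdir / "in.txt").write_text("a\nb\nc\n")
    smk = SimpleNamespace(
        input=[str(workdir / "in.txt")],
        output=[str(workdir / "out.txt")],
        params=SimpleNamespace(label="rows"),
    )

n_lines = count_lines(smk)
```

Keeping the work in a function that takes the object as an argument makes the script easy to test without starting a workflow run.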

R Scripts and Markdown

R scripts can be added as .R or as .Rmd. When they are added as .Rmd, they can only produce a single html output file. A workaround is to render the .Rmd from an intermediate R script.

See rules plot_rmd_direct and plot_rmd_via_script in Snakefile.

Python Scripts and Jupyter Notebooks

Python scripts can be added as .py files. We can use papermill to execute parameterized Jupyter notebooks, which we can then render as html. html is preferred to notebooks because there is no doubt about the execution state.

See rules plot_execute_nb and plot_nb_2_html in Snakefile.

**The rules for rendering notebooks are not compatible with nb_conda as is.**
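The two-step notebook flow (execute with papermill, then render to html) can be sketched as a pair of command builders. The notebook paths and the n_rows parameter are hypothetical, and the actual rules in Snakefile may assemble their commands differently; the sketch only relies on papermill's -p option and on jupyter nbconvert --to html.

```python
import shlex


def papermill_command(notebook_in, notebook_out, parameters):
    """Build the papermill call that executes a notebook with injected parameters."""
    cmd = ["papermill", notebook_in, notebook_out]
    for name, value in parameters.items():
        # `-p NAME VALUE` passes one parameter to the notebook.
        cmd += ["-p", name, str(value)]
    return cmd


def nbconvert_command(notebook, output_dir):
    """Build the nbconvert call that renders an executed notebook as html."""
    return ["jupyter", "nbconvert", "--to", "html", "--output-dir", output_dir, notebook]


# Hypothetical paths and parameter, for illustration only.
execute = papermill_command("plot.ipynb", "executed/plot.ipynb", {"n_rows": 100})
render = nbconvert_command("executed/plot.ipynb", "docs")

print(shlex.join(execute))
print(shlex.join(render))
```

The printed strings could be used in a rule's shell section, or the lists could be passed to subprocess.run from a script.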

Testing

All common R functions are collected in an R package under utilR, which is checked and tested.

Benchmarking

The execution time of each rule is stored in ./benchmark. The benchmark file for a rule is defined in Snakefile.
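The stored execution times can be summarised with a short script. This sketch assumes that each benchmark file is tab-separated with a header row and that the column named s holds the running time in seconds; the demo file name and its values are made up.

```python
import csv
import tempfile
from pathlib import Path


def read_benchmark_seconds(benchmark_dir):
    """Map each benchmark file (by stem) to the mean of its `s` column."""
    runtimes = {}
    for path in sorted(Path(benchmark_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open(newline="") as handle:
            rows = list(csv.DictReader(handle, delimiter="\t"))
        # Assumption: the column `s` holds the wall-clock time in seconds.
        seconds = [float(row["s"]) for row in rows]
        if seconds:
            runtimes[path.stem] = sum(seconds) / len(seconds)
    return runtimes


# Demo with a made-up benchmark file holding two repeated measurements.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "plot_r.tsv").write_text(
    "s\th:m:s\tmax_rss\n1.5\t0:00:01\t120.0\n2.5\t0:00:02\t121.0\n"
)
runtimes = read_benchmark_seconds(demo_dir)
print(runtimes)
```

Pointing read_benchmark_seconds at ./benchmark after a run would give one mean runtime per benchmark file.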

Logging

Unfortunately, logging is not supported for scripts and thus needs to be set up for each script individually using script-language-specific tools. See https://bitbucket.org/snakemake/snakemake/issues/917/enable-stdout-and-stderr-redirection
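For Python scripts, one way to set this up is the standard logging module writing to the file named in the rule's log section. This is a sketch under assumptions: the logger name, the message format, and the placeholder work are hypothetical, and it expects the rule to define a log file so that snakemake.log[0] exists.

```python
import logging
import tempfile
from pathlib import Path


def setup_script_logger(log_path, name="script"):
    """Return a logger that writes timestamped messages to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, mode="w")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep messages out of the root logger
    return logger


try:
    # Inside snakemake: first entry of the rule's `log:` section.
    log_path = snakemake.log[0]
except NameError:
    # Outside snakemake: fall back to a temporary file so the sketch still runs.
    log_path = str(Path(tempfile.mkdtemp()) / "script.log")

logger = setup_script_logger(log_path)
logger.info("started")
try:
    result = 1 + 1  # placeholder for the script's real work
    logger.info("finished with result %s", result)
except Exception:
    # Record the traceback in the log file before the rule fails.
    logger.exception("script failed")
    raise
```

R scripts need their own equivalent, for example by redirecting output to the log file at the top of the script.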
