Merge the dev branch to master for the 2.0 version release #86
base: master
Conversation
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
update the zenodo doi.
This is a great effort, and it's obvious you've put a lot of time into this!
It seems there are a couple of recurring things that I'd like to see addressed (beyond what @mirpedrol and @SPPearce pointed out):
- Inline scripts in Nextflow files: those should live in `bin/` or in a custom module. It's just an nf-core thing; it makes the pipeline easier to maintain (see the sketch below this comment). I'm guessing you inlined them to be able to test the local Nextflow modules, but since you've written so many custom scripts, I think it might be better to test them with testthat or another CLI tester. They're really extensive, so it would be nice to have confidence in them!
- I'd like to see the hicexplorer, homer, etc. modules live in nf-core/modules.
- Make a container image for those scary Java tools 🙈
- The Homer modules need to use a fasta input rather than the outside-of-Nextflow genome install. I'm not sure it's working as intended as is; it would install the genome in the process' workdir.
- Clean up subworkflows that only output versions, and empty outputs. That will make them more maintainable.
So I don't think this is entirely up to nf-core standards of maintainability, but in the name of progress I think we merge this and maybe tag it as 2.0.0-rc or something if 2.0.0 is controversial. Also, it's up to @jianhong how they want to release it.
Before I would approve this PR, I'd like to see a roadmap, tracked in the milestones, of the issues from this PR that need to be addressed (maybe with a list of modules to move scripts out of, for example). I think those will be easier to address in separate, smaller PRs.
We should also make sure proper repo access is given to @jianhong, so they can work on branches in this repo and create smaller PRs, so the review process doesn't take multiple months, and so we can review the scientific ideas in the code changes.
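To illustrate the `bin/` convention mentioned in the first bullet above, here is a minimal sketch; the process name, script name, and arguments are hypothetical and not taken from this PR. Anything executable committed to `bin/` is put on the `PATH` of every task, so the module script reduces to a plain command-line call and the script itself can be unit-tested (e.g. with testthat) outside Nextflow.

```groovy
// Hypothetical module calling a script from bin/ (illustration only)
process COUNT_INTERACTIONS {
    tag "$meta.id"

    input:
    tuple val(meta), path(peaks)

    output:
    tuple val(meta), path("*.counts.csv"), emit: counts

    script:
    """
    # count_interactions.R would live in bin/ of the pipeline repo,
    # so Nextflow adds it to PATH for this task automatically.
    count_interactions.R --peaks $peaks --out ${meta.id}.counts.csv
    """
}
```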
CHANGELOG.md
Outdated
- update to nf-core-template-2.7.2
- replace the dots by '\_' in the samples name
- add support for MseI.
- add TAD, AB compartments, APA analysis (see available tools in usage documentation)
- add additional methods for interaction caller
- adjust default resource requirement
- update the CITAIONS
- re-arrange the output folder to meet the requirements of multiple analysis tools
- updated multipe documentations
- remove the parameter for java resources requirements
- update annotations from 5k,10k bins to nearest R2 peaks
- removed local_modules tests
- add scale factor to atac reads coverage calculation
- change the circos plot style
- update reads_summary table to include unmapped/multiple mapped info
- add the possibility to subsample to balance the input reads
- add parameter to let user input the 5 end sequence to cutadapt step
- change the cutadapt error tolerance from 0 to 0.15
- changes in hipeak caller:
  - add filter condition for lowest count number as 1.
  - change the cutoff value for hipeak type assignment step from 12 to 3.
- fix multiple bugs:
  - the sorting method for huge bed file;
  - the post count for hipeak when there is empty Interactions;
  - add totalLinks parameter for prepare_circos;
  - the issue if bplapply does not work in differential analysis for hipeak;
  - fix the space issue for enzyme_cut.nf;
  - fix the chromosome style for homer TFEA analysis and annotation by ChIPpeakAnno;
  - fix the duplicated imported modules;
  - fix multiple typos.
- use local SAMTOOLS_SORT module;
I was going to ask if there were any links to these. But then I remembered this was all done on a fork. Maybe link some commits?
I think we need to add something to the nf-core docs for people to reach out if they're developing on a pipeline.
WT,2,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T2/WT_rep2_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T2/WT_rep2_R2.fastq.gz,,
KD,1,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T2/KD_rep1_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T2/KD_rep1_R2.fastq.gz,,
KD,2,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T2/KD_rep2_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T2/KD_rep2_R2.fastq.gz,,
WT.dot,1,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T1/WT_rep1_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/hicar/data/genomics/homo_sapiens/T1/WT_rep1_R2.fastq.gz,,
I wonder if it would break stuff, or are the groups separated using `.`?
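For context, the changelog entry "replace the dots by '\_' in the samples name" suggests the fix is to sanitise the id when building `meta`; a rough Groovy sketch of that idea (the helper name and samplesheet fields are assumptions, not code from this branch):

```groovy
// Hypothetical helper: normalise dots so 'WT.dot' cannot clash with dot-based grouping
String sanitizeId(String sample, String replicate) {
    def safe = sample.replaceAll(/\./, '_')   // 'WT.dot' -> 'WT_dot'
    return "${safe}_REP${replicate}"
}

assert sanitizeId('WT.dot', '1') == 'WT_dot_REP1'
```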
conf/test.config
Outdated
// Output folder
outdir = 'test_results'
outdir = 'results'
I believe these were removed because a default value could cause people running it on the cloud to never write the results to a bucket. It is just the test config though, just a consideration.
conf/test_full.config
Outdated
@@ -18,7 +20,7 @@ params {
input = "${projectDir}/assets/test_full_samplesheet.csv"

// Output folder
outdir = 'full_test_results'
outdir = 'results'
See the above comment; this one probably doesn't need to be set by default either, it'll just result in a foot gun that's worse than having users specify --outdir results.
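For reference, the convention the suggestion further down restores looks roughly like this (a sketch, not the exact config in this branch): leave `outdir` unset so Nextflow fails fast instead of silently writing to a local `results` folder on cloud runs, and let the user choose the destination on the command line.

```groovy
// nextflow.config (sketch): no default output directory
params {
    outdir = null   // users must pass --outdir, e.g. --outdir s3://my-bucket/hicar-results
}
```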
Is there a reason this isn't a script in `bin/`?
input:
tuple val(meta), path(hic)
path hic_tools_jar
This should be in a container. I'm guessing it's https://github.com/aidenlab/HiCTools.
Never mind, I looked at it and that looks like a nightmare. I guess I'll get to the config, and this points at the jar download?
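If the jar really has to stay outside a container, one possibility (the parameter name and URL placeholder are assumptions, not code from this PR) is to expose the download location as a pipeline parameter; Nextflow stages a remote `path` input into the task work directory automatically.

```groovy
// Sketch only: hypothetical parameter feeding the 'path hic_tools_jar' input above
params.hic_tools_jar = 'https://github.com/aidenlab/HiCTools/releases/download/<tag>/<jar-name>.jar'

// In the workflow block:
ch_hic_tools_jar = Channel.value(file(params.hic_tools_jar))
```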
val long_bedpe_postfix
val short_bed_postfix
These could be `args` in a config.
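That is, something like the usual `ext.args` pattern in `conf/modules.config`, read inside the module via `task.ext.args` instead of dedicated `val` inputs; the selector and postfix values below are placeholders, not taken from this PR.

```groovy
// conf/modules.config (sketch; selector and values are placeholders)
process {
    withName: 'SOME_PAIRS_MODULE' {
        ext.args = '--long_bedpe_postfix long.bedpe --short_bed_postfix short.bed'
    }
}
```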
modules/local/re_cut.nf
Outdated
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(echo \$(python --version) | sed 's/Python //')
END_VERSIONS

restriction_enzyme_cutsite.py $enzyme
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(echo \$(python --version) | sed 's/Python //')
END_VERSIONS
restriction_enzyme_cutsite.py $enzyme
restriction_enzyme_cutsite.py $enzyme
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(echo \$(python --version) | sed 's/Python //')
END_VERSIONS
Could the reason this is a local module be documented? What's the reason for not using a patch and the nf-core module? That will help keep this maintainable.
nextflow.config
Outdated
// Boilerplate options
outdir = null
outdir = 'results'
outdir = 'results'
outdir = null
merge template 2.9
fix an issue when the id contains dots.
Dev remove output folder from config file
fix the meta id issue.
Huge work @jianhong! 😄
Apart from my suggestions in the code, I agree with @emiller88's summary.
README.md
Outdated
@@ -22,7 +19,7 @@ The pipeline can also handle the experiment of HiChIP, ChIA-PET, and PLAC-Seq. I

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/hicar/results).
On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources.The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/hicar/results).
On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources.The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/hicar/results).
On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/hicar/results).
process {
    publishDir = [
        path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" },
        mode: params.publish_dir_mode,
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
        enabled: false
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
Wouldn't it be better to set it back to `enabled: false` by default and not need to declare some processes just to avoid the publication of their files (e.g. GUNZIP)?
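In other words, something along these lines (a sketch with placeholder selectors, not the actual config): publishing disabled globally, with each module that produces user-facing results opting in, so helpers like GUNZIP never need an explicit opt-out.

```groovy
// conf/modules.config (sketch; selector names are placeholders)
process {
    publishDir = [
        path: { "${params.outdir}/${task.process.tokenize(':')[-1].toLowerCase()}" },
        mode: params.publish_dir_mode,
        enabled: false   // nothing is published unless a module opts in below
    ]

    withName: 'SOME_RESULTS_MODULE' {
        publishDir = [
            path: { "${params.outdir}/results_of_interest" },
            mode: params.publish_dir_mode,
            enabled: true
        ]
    }
}
```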
// Output folder
outdir = 'test_results'
input = "${projectDir}/assets/test_multi_samplesheet.csv"
Should this not be available in test-datasets instead?
I personally like to keep the test samplesheets in the repo as examples as well. That ensures they stay up to date! 😆
@@ -17,9 +17,6 @@ params {
// Input data for full size test
input = "${projectDir}/assets/test_full_samplesheet.csv"
Same as my comment above
@@ -20,17 +20,17 @@ params {
// Input data for full size test
input = "${projectDir}/assets/test_multi_samplesheet.csv"
Same here, and any other `input` instance in the configs.
genome).tads
ch_versions = HOMER_FINDTADSANDLOOPS_TADS.out.versions
HOMER_FINDMOTIFSGENOME(ch_new_bed, additional_param)
ch_versions = ch_versions.mix(HOMER_FINDMOTIFSGENOME.out.versions)
break
}
emit:
versions = ch_versions // channel: [ versions.yml ]
Also here only versions is emitted; is this correct?
versions = ch_versions // channel: [ versions.yml ]
v4c = VIRTUAL4C_BY_COOLTOOLS.out.v4c
versions = ch_versions // channel: [ versions.yml ]
v4c = VIRTUAL4C_BY_COOLTOOLS.out.v4c
v4c = VIRTUAL4C_BY_COOLTOOLS.out.v4c
versions = ch_versions // channel: [ versions.yml ]
Also add a channel content description.
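For example, an emit block along these lines (the channel content comments are illustrative assumptions, not taken from the actual subworkflow):

```groovy
emit:
v4c      = VIRTUAL4C_BY_COOLTOOLS.out.v4c   // channel: [ val(meta), path(v4c_signal) ] (contents assumed)
versions = ch_versions                      // channel: [ versions.yml ]
```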
Fix multiple minor typos for review comments.
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.
if: >
  contains(github.event.comment.html_url, '/pull/') &&
  contains(github.event.comment.body, '@nf-core-bot fix linting') &&
  github.repository == 'nf-core/hicar'
runs-on: ubuntu-latest
steps:
  # Use the @nf-core-bot token to check out so we can push later
  - uses: actions/checkout@v3
    with:
      token: ${{ secrets.nf_core_bot_auth_token }}

  # Action runs on the issue comment, so we don't get the PR by default
  # Use the gh cli to check out the PR
  - name: Checkout Pull Request
    run: gh pr checkout ${{ github.event.issue.number }}
    env:
      GITHUB_TOKEN: ${{ secrets.nf_core_bot_auth_token }}

  - uses: actions/setup-node@v3

  - name: Install Prettier
    run: npm install -g prettier @prettier/plugin-php

  # Check that we actually need to fix something
  - name: Run 'prettier --check'
    id: prettier_status
    run: |
      if prettier --check ${GITHUB_WORKSPACE}; then
        echo "result=pass" >> $GITHUB_OUTPUT
      else
        echo "result=fail" >> $GITHUB_OUTPUT
      fi

  - name: Run 'prettier --write'
    if: steps.prettier_status.outputs.result == 'fail'
    run: prettier --write ${GITHUB_WORKSPACE}

  - name: Commit & push changes
    if: steps.prettier_status.outputs.result == 'fail'
    run: |
      git config user.email "core@nf-co.re"
      git config user.name "nf-core-bot"
      git config push.default upstream
      git add .
      git status
      git commit -m "[automated] Fix linting with Prettier"
      git push
Check warning
Code scanning / CodeQL: Workflow does not contain permissions (Medium)

Copilot Autofix
To fix the issue, we need to add a `permissions` block to the workflow. Since the workflow involves reading repository contents, checking out pull requests, and pushing changes, we will explicitly define the required permissions. The minimal permissions required are:
- `contents: write` (to push changes to the repository)
- `pull-requests: write` (to interact with pull requests)

The `permissions` block should be added at the root level of the workflow to apply to all jobs unless overridden.
@@ -5,2 +5,6 @@

permissions:
  contents: write
  pull-requests: write

jobs:
- name: Install Prettier
  run: npm install -g prettier @prettier/plugin-php

# Check that we actually need to fix something
- name: Run 'prettier --check'
Check failure
Code scanning / CodeQL: Checkout of untrusted code in a privileged context (Critical), triggered on issue_comment
- name: Run 'prettier --write'
  if: steps.prettier_status.outputs.result == 'fail'
  run: prettier --write ${GITHUB_WORKSPACE}

- name: Commit & push changes
Check failure
Code scanning / CodeQL: Checkout of untrusted code in a privileged context (Critical)
PR checklist
- Make sure your code lints (`nf-core lint`).
- Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).