8000 Compressed waveforms bank workflow by GarethCabournDavies · Pull Request #4969 · gwastro/pycbc · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Compressed waveforms bank workflow #4969

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

GarethCabournDavies
Copy link
Contributor

In order to make the generation of the bank with compressed waveforms much easier, and making this more straightforward for future developers/users, I've put this into a workflow.

I've also made changes which mean that when running pycbc_compress_bank, we can exclude certain waveforms; e.g. SPAtmplt, and these will be ignored when making the compressed bank.

When fetching the compressed bank, the fact that the template hash key isn't present will then mean that the waveform is just generated as normal.

Standard information about the request

This is a new feature
This change affects the offline search
This change changes nothing for scientific output, just makes things easier
This change: has appropriate unit tests, follows style guidelines (See e.g. PEP8), has been proposed using the contribution guidelines

This change will require additional dependencies for the CI: sbank
sbank will also need updated from its curent form as it needs a tweak i made to the hdf5 bank combiner there (I will link PR when I have made it)

Contents

I'll split this into subsections:

Minor changes to compression

  • I've put in some changes to how the compression is done/used.
    • Dont make a compressed waveform group for certain waveforms according to a new command line option
    • Make the waveform as normal if the compressed waveform group isn't in the bank

Making a workflow to generated the compressed waveform bank(s)

So that we can parallelise nicely, I've put this into a workflow which:

  • Splits the bank
  • Adds compressed waveforms to each split bank
  • Recombines these banks into any number of banks
    • Note that the offline search only uses compressed waveforms for the inspiral jobs, i.e. where the bank is split anyway.
  • Plot the waveform compression in histograms / scatter points and add to a results page

The plots are similar to:

image

the number of / types of these plots can be chosen in the config file

Links to any issues or associated PRs

sbank PR:

Testing performed

The workflow was run, and the results page can be seen at https://ldas-jobs.ligo.caltech.edu/~gareth.cabourndavies/testoutput/compress_bank_workflow/testing/compress_bank_results_page_new/

I will add more tests (e.g. profiling) soon

  • The author of this pull request confirms they will adhere to the code of conduct

@@ -50,5 +50,8 @@ pip install -r companion.txt
echo -e ">> [`date`] installing mpi4py"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this script anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm - that might be why Im having trouble getting the CI to work! I'll have a look once i am back from holiday

@GarethCabournDavies
Copy link
Contributor Author

Note this needs gwastro/sbank#64

I'm not sure whether to temporarily point to my branch, or to wait for a release

@GarethCabournDavies GarethCabournDavies force-pushed the compress_bank_workflow branch 2 times, most recently from a725f05 to 867d6ff Compare December 16, 2024 16:37
Copy link
Contributor
@spxiwh spxiwh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One change requested.


try :
if not (self.has_compressed_waveforms and self.enable_compressed_waveforms):
raise ValueError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that we could have a situation where get_decompressed_waveform isn't working, but we don't know because we're always falling back on the slower method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a UserWarning so that if this happens, at least there is evidence, and the message of which includes the approximant as well, so that it is immediately obvious if something is wrong

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that as the message changes, it will not be supressed for different approximants when repeated, see https://docs.python.org/3/library/warnings.html#repeated-warning-suppression-criteria)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that we should distinguish between self.has_compressed_waveforms being false, where we should silently fall back, and a call to self.get_decompressed_waveform( failing, which would indicate a genuine problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not clear why self.get_decompressed_waveform failures would not be ValueErrors???)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. I've moved to a combined if/else with try/except which doesn't rely on the get_decompressed_waveform function not raising a ValueError

@@ -1122,12 +1122,13 @@ def __init__(self, ifos, exe_name, segs, file_url=None,
exe_name: string
A short description of the executable description, tagging
only the program that ran this job.
segs : igwn_segments.segment or igwn_segments.segmentlist
segs : igwn_segments.segment or igwn_segments.segmentlist or None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is rejected (or at least musn't be bundled in with this patch). Following other workflows, we provide a "dummy" segment if there is not a logical one to give, but a segment should still be given.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been reverted - it means that the files in the workflow are all named "...-0-0.ext" or similar, but that is a minor thing

Copy link
Contributor
@spxiwh spxiwh left a comment

Choose a reason for hiding this c 685C omment

The reason will be displayed to describe this comment to others. Learn more.

One change requested.

@GarethCabournDavies GarethCabournDavies changed the title Draft: Compressed waveforms bank workflow Compressed waveforms bank workflow Jan 17, 2025
Copy link
Contributor
@spxiwh spxiwh left a comment

Choose a reason for hiding this comment

The reason will be displayed to A36C describe this comment to others. Learn more.

I'm still a bit confused about the fallback logic around calling the compressed waveforms, but will approve at this stage anyway. I don't need to see this again before it's merged if you choose to make any further changes to that logic.

@GarethCabournDavies
Copy link
Contributor Author

CI failure looks to be a Segfault in the 3-IFO coinc jobs - I will double-check that the inspiral output is the same before and after this change before I merge

@GarethCabournDavies GarethCabournDavies merged commit 91db678 into gwastro:master Jan 22, 2025
30 of 31 checks passed
@GarethCabournDavies GarethCabournDavies added the v23_release_branch PRs applied to the v2.3.X release branch or to be cherry-picked if merging to master label Feb 12, 2025
@GarethCabournDavies
Copy link
Contributor Author

@spxiwh - I've just realised that this PR will need cherry-picked onto the v23 release branch so that the mixed-compressed banks (SPATmplt not compressed, SEOB... compressed) will work

Sorry for the confusion

spxiwh pushed a commit to spxiwh/pycbc that referenced this pull request Feb 18, 2025
* Add compressed waveforms to bank workflow

* Allolw plotting script to use any bank conversion parameter

* Some fixes to allow the joined bank to be plotted

* Use inference's parameter labels: they are available and mostly good

* Add mismatch to plotting, make some tweaks

* some tidying

* thinko

* Try to make the CI workflow run

* Fix do-not-compress default

* Use different examples in compress bank workflow

* Proper name for the github workflow

* Thinko

* python shebang in compression workflow script

* minor edits

* move to readily-available waveform

* TRy IMRPhenomD instead

* revert change to workflow.core

* Warn for KeyError in get_decompressed_waveform

* Fix issue with if get_decompressed_waveform raised a ValueError
< F4CA div > @spxiwh spxiwh mentioned this pull request Feb 18, 2025
spxiwh pushed a commit to spxiwh/pycbc that referenced this pull request Feb 27, 2025
* Add compressed waveforms to bank workflow

* Allolw plotting script to use any bank conversion parameter

* Some fixes to allow the joined bank to be plotted

* Use inference's parameter labels: they are available and mostly good

* Add mismatch to plotting, make some tweaks

* some tidying

* thinko

* Try to make the CI workflow run

* Fix do-not-compress default

* Use different examples in compress bank workflow

* Proper name for the github workflow

* Thinko

* python shebang in compression workflow script

* minor edits

* move to readily-available waveform

* TRy IMRPhenomD instead

* revert change to workflow.core

* Warn for KeyError in get_decompressed_waveform

* Fix issue with if get_decompressed_waveform raised a ValueError
spxiwh added a commit that referenced this pull request Feb 27, 2025
* Merge part of 4997 to avoid failure

* Bump to v2310

* Compressed waveforms bank workflow (#4969)

* Add compressed waveforms to bank workflow

* Allolw plotting script to use any bank conversion parameter

* Some fixes to allow the joined bank to be plotted

* Use inference's parameter labels: they are available and mostly good

* Add mismatch to plotting, make some tweaks

* some tidying

* thinko

* Try to make the CI workflow run

* Fix do-not-compress default

* Use different examples in compress bank workflow

* Proper name for the github workflow

* Thinko

* python shebang in compression workflow script

* minor edits

* move to readily-available waveform

* TRy IMRPhenomD instead

* revert change to workflow.core

* Warn for KeyError in get_decompressed_waveform

* Fix issue with if get_decompressed_waveform raised a ValueError

* Combined plotifar (#5034)

* added plot script

* cleanups

* remove now unused bits

* Generalize fit plotting

* rename script

* Added page_farstat in summary (#5052)

* Fix release naming

* Stat correction patch (#5061)

* Limit number of stage output jobs

* Reorganize FAR/stat plots on summary page (#5061)

* Get the dq files into a nice layout (#5064)

* Sphinx version CI fix (#5060)

* Update pycbc_page_fars_vs_stat (#5067)

---------

Co-authored-by: Gareth S Cabourn Davies <gareth.cabourndavies@ligo.org>
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Co-authored-by: Rahul Dhurkunde <rahul.dhurkunde@ldas-pcdev6.ligo.caltech.edu>
@GarethCabournDavies GarethCabournDavies removed the v23_release_branch PRs applied to the v2.3.X release branch or to be cherry-picked if merging to master label Mar 14, 2025
@GarethCabournDavies GarethCabournDavies deleted the compress_bank_workflow branch April 17, 2025 10:42
khunsang pushed a commit to khunsang/pycbc that referenced this pull request May 23, 2025
* Add compressed waveforms to bank workflow

* Allolw plotting script to use any bank conversion parameter

* Some fixes to allow the joined bank to be plotted

* Use inference's parameter labels: they are available and mostly good

* Add mismatch to plotting, make some tweaks

* some tidying

* thinko

* Try to make the CI workflow run

* Fix do-not-compress default

* Use different examples in compress bank workflow

* Proper name for the github workflow

* Thinko

* python shebang in compression workflow script

* minor edits

* move to readily-available waveform

* TRy IMRPhenomD instead

* revert change to workflow.core

* Warn for KeyError in get_decompressed_waveform

* Fix issue with if get_decompressed_waveform raised a ValueError
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0