8000 Add statistic correction option by spxiwh · Pull Request #5061 · gwastro/pycbc · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add statistic correction option #5061

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 27, 2025
Merged

Conversation

spxiwh
Copy link
Contributor
@spxiwh spxiwh commented Feb 25, 2025

This patch adds a statistic_correction "statistic keyword" in stat.py. This option will add a constant factor to all returned statistic values.

The immediate problem this patch sets out to fix is this plot (from https://inspirehep.net/literature/2053424)

image

This shows the desired behaviour using the state of the art statistic as of a few years ago. In particular HL doubles lie above other things.

When adding the new KDE term we see this relative behaviour change and H/L singles lie above the doubles line, causing singles to be weighted more than HL doubles when combining. Using this option we can downrank singles relative to doubles (and add appropriate factors when considering 3-ifo behaviour).

However, the patch is broader than that, it could be used (for e.g.) to try to make 0 correspond to a 50/50 chance of being real ... Or any other thing you could dream up where we just want to scale the statistic.

I've tested that this produces the desired effect.

@spxiwh spxiwh requested a review from tdent February 25, 2025 09:25
@spxiwh spxiwh force-pushed the pr_stat_correction branch from f60b9bb to 922a9af Compare February 25, 2025 10:37
@tdent
Copy link
Contributor
tdent commented Feb 25, 2025

How does this deal with different event types? Is that already in the stat parsing machinery, and if so can you remind me where?

Copy link
Contributor
@tdent tdent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See suggested comment changes & the question about event types.

@spxiwh
Copy link
Contributor Author
spxiwh commented Feb 25, 2025

Different event types are handled in the config file. SO one specifies something like:

[sngls-h1&sngls-l1]
statistic-keywords = correction_factor:-3.7
minimum-stat = -4

[sngls-v1]
statistic-keywords = correction_factor:-7.4
minimum-stat = -7

[coinc-h1l1]
statistic-keywords = correction_factor:-1

[coinc-l1v1]
statistic-keywords = correction_factor:-3.7

[coinc-h1v1]
statistic-keywords = correction_factor:-3.7


[coinc-h1l1v1]
statistic-keywords = correction_factor:0

(old naming for this argument, but same idea).

@tdent
Copy link
Contributor
tdent commented Feb 25, 2025

Right, that is what I suspected & seems the easiest way of doing this (modulo the case where we would want a different number for the same event type in different coinc times, but that seems not to be needed quite yet).

spxiwh and others added 4 commits February 25, 2025 12:45
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
@spxiwh
Copy link
Contributor Author
spxiwh commented Feb 25, 2025

So to me the only remaining concern is around benchmark_lograte and if this adds redundancy. Is there anything I should do about this, or are folks happy?

@GarethCabournDavies
Copy link
Contributor

So to me the only remaining concern is around benchmark_lograte and if this adds redundancy. Is there anything I should do about this, or are folks happy?

I am happy with the redundancy - all the benchmark_lograte means is that the numbers are a bit more natural this way

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
@tdent
Copy link
Contributor
tdent commented Feb 25, 2025

If both the benchmark lograte and this term are used in exactly the same ranking statistics, we ought to be able to get rid of one of them by defining a constant, eg BENCHMARK_LOGRATE = -14.6 at the module level to replace the class attribute in ln_noise_rate -= self.benchmark_lograte.

This leaves the output identical but takes out one redundant option which could confuse any users who tried to understand all the optional kwargs.
How do people feel about adding that 'trivial' change here?

@tdent tdent added offline search v23_release_branch PRs applied to the v2.3.X release branch or to be cherry-picked if merging to master labels Feb 25, 2025
@tdent tdent self-requested a review February 25, 2025 21:24
@tdent
Copy link
Contributor
tdent commented Feb 25, 2025

I guess alternatively, in the spirit of keeping the production branch changes as simple as possible, we could cherry pick this minimal change to the 2.3 branch and then clean up the benchmark redundancy on master with a separate PR.

Copy link
Contributor
@tdent tdent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve, if we agree to address the redundancy of options that just shift the statistic by a constant at least on master.

@spxiwh spxiwh merged commit 0a16d83 into gwastro:master Feb 27, 2025
31 checks passed
@spxiwh spxiwh deleted the pr_stat_correction branch February 27, 2025 12:52
spxiwh added a commit to spxiwh/pycbc that referenced this pull request Feb 27, 2025
spxiwh added a commit to spxiwh/pycbc that referenced this pull request Feb 27, 2025
spxiwh added a commit to spxiwh/pycbc that referenced this pull request Feb 27, 2025
spxiwh added a commit to spxiwh/pycbc that referenced this pull request Feb 27, 2025
spxiwh added a commit that referenced this pull request Feb 27, 2025
* Merge part of 4997 to avoid failure

* Bump to v2310

* Compressed waveforms bank workflow (#4969)

* Add compressed waveforms to bank workflow

* Allolw plotting script to use any bank conversion parameter

* Some fixes to allow the joined bank to be plotted

* Use inference's parameter labels: they are available and mostly good

* Add mismatch to plotting, make some tweaks

* some tidying

* thinko

* Try to make the CI workflow run

* Fix do-not-compress default

* Use different examples in compress bank workflow

* Proper name for the github workflow

* Thinko

* python shebang in compression workflow script

* minor edits

* move to readily-available waveform

* TRy IMRPhenomD instead

* revert change to workflow.core

* Warn for KeyError in get_decompressed_waveform

* Fix issue with if get_decompressed_waveform raised a ValueError

* Combined plotifar (#5034)

* added plot script

* cleanups

* remove now unused bits

* Generalize fit plotting

* rename script

* Added page_farstat in summary (#5052)

* Fix release naming

* Stat correction patch (#5061)

* Limit number of stage output jobs

* Reorganize FAR/stat plots on summary page (#5061)

* Get the dq files into a nice layout (#5064)

* Sphinx version CI fix (#5060)

* Update pycbc_page_fars_vs_stat (#5067)

---------

Co-authored-by: Gareth S Cabourn Davies <gareth.cabourndavies@ligo.org>
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Co-authored-by: Rahul Dhurkunde <rahul.dhurkunde@ldas-pcdev6.ligo.caltech.edu>
@GarethCabournDavies GarethCabournDavies removed the v23_release_branch PRs applied to the v2.3.X release branch or to be cherry-picked if merging to master label Mar 14, 2025
GarethCabournDavies pushed a commit to GarethCabournDavies/pycbc that referenced this pull request May 12, 2025
* Add statistic correction option

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
khunsang pushed a commit to khunsang/pycbc that referenced this pull request May 23, 2025
* Add statistic correction option

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Update pycbc/events/stat.py

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0