8000 crps ensembles is faster and more robust by nicholasloveday · Pull Request #694 · nci/scores · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

crps ensembles is faster and more robust #694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Oct 16, 2024

Conversation

nicholasloveday
Copy link
Collaborator

No description provided.

@nicholasloveday nicholasloveday force-pushed the 693-crps-fcst-spread-term branch from 8a6fc61 to f07ea42 Compare September 16, 2024 20:14
@nicholasloveday
Copy link
Collaborator Author

I've done some performance testing and this approach leads to some big speed-ups.

I also refactored the unit tests as I was adding more

@nicholasloveday nicholasloveday changed the title (under development) crps ensembles now has an unvectorised option (ready for review) crps ensembles now has an unvectorised option Sep 16, 2024
@nicholasloveday nicholasloveday changed the title (ready for review) crps ensembles now has an unvectorised option (under development) crps ensembles now has an unvectorised option Sep 16, 2024
@nicholasloveday nicholasloveday changed the title (under development) crps ensembles now has an unvectorised option (ready for review) crps ensembles now has an unvectorised option Sep 16, 2024
@nicholasloveday
Copy link
Collaborator Author

One additional change that I would like to make before this is merged is to add an additional arg to be able to choose where or not to use dask to parallelise the loop. It will need a little refactoring.

@tennlee
Copy link
Collaborator
tennlee commented Oct 12, 2024

I have started some investigation in response to this. I think more realistic testing is needed. So far on synthetic data, the new unvectorised option is faster than the original code. Should that bear out, then the original code should be replaced rather than having the flags added. Using dask didn't seem to change the execution time particularly, but didn't hurt either.

@nicholasloveday
Copy link
Collaborator Author

I have some some benchmarking. I don't think that the parallelisation of the loop with dask offers any significant benefit.

For 40x40 fcst/obs with 700 ensemble members:

  • Vectorised = 26.9 s ± 1 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Unvectorised = 5.93 s ± 36.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

For higher ensemble member counts memory errors start occurring. Even using dask is problematic as it has to do a extremely large broadcast

For 600x600 fcst/obs with 50 ensemble members:

  • Vectorised = 34.2 s ± 2.22 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Unvectorised = 5.38 s ± 6.98 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

For 1200x1200 fcst/obs with 50 ensemble members:

  • Vectorised = 15.5 s ± 396 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Unvectorised = 3.68 s ± 5.63 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I plan to simplify this to only do an unvectorised calculation with no dask option. I'll create a separate issue to explore if numba + xr.apply_ufunc can be used to speed up the for loop

@nicholasloveday nicholasloveday force-pushed the 693-crps-fcst-spread-term branch from 31e84f7 to 16f13ba Compare October 15, 2024 22:29
@nicholasloveday
Copy link
Collaborator Author

@tennlee I have simplified this pull request to only contain the simple loop calculation of the forecast spread term.

I left in the refactoring of the unit tests as you will see.

@tennlee
Copy link
Collaborator
tennlee commented Oct 16, 2024

Thanks, this looks great. I'll merge it shortly.

@tennlee tennlee changed the title (ready for review) crps ensembles now has an unvectorised option crps ensembles now has an unvectorised option Oct 16, 2024
@tennlee tennlee changed the title crps ensembles now has an unvectorised option crps ensembles is faster and more robust Oct 16, 2024
@tennlee tennlee merged commit 9504cb7 into nci:develop Oct 16, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

memory issues with fcst_spread term for many ensembles in crps_for_ensembles
2 participants
0