8000 Apply mask to cube by mspelman07 · Pull Request #2014 · metoppv/improver · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Apply mask to cube #2014

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 23, 2024
Merged

Apply mask to cube #2014

merged 4 commits into from
Jul 23, 2024

Conversation

mspelman07
Copy link
Contributor

Ticket: https://metoffice.atlassian.net/browse/EPPT-1305
Acceptance test data: metoppv/improver_test_data#50

This PR adds the ability to apply a mask to a cube

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

@mspelman07 mspelman07 added the EPP PR's related to Enhancing Post Processing team label Jul 15, 2024
Copy link
codecov bot commented Jul 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.32%. Comparing base (84a8944) to head (a6ed3d3).
Report is 18 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2014      +/-   ##
==========================================
- Coverage   98.39%   98.32%   -0.07%     
==========================================
  Files         124      130       +6     
  Lines       12212    12630     +418     
==========================================
+ Hits        12016    12419     +403     
- Misses        196      211      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor
@cpelley cpelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10000

We should ensure that we don't require the plugin to be provided a cubelist, only that we internally create a cubelist from what is provided (hence the use of the as_cubelist helper function).

Comment on lines +45 to +54
cube_names = [cube.name() for cube in cubes]
if len(cubes) != 2:
raise ValueError(
f"""Two cubes are required for masking, a mask and the cube to be masked.
Provided cubes are {cube_names}"""
)

mask = cubes.extract_cube(mask_name)
cubes.remove(mask)
cube = cubes[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems a bit arbitrary to limit this to 2 cube only 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that the inputs would be a mask and the cube you want to mask. In theory you could make the plugin work with multiple cube (i.e. you input a mask and a list of cubes you want to apply the mask to and apply the mask to each input cube) but then you'd have to output all the cubes that you have. As we (mostly) require plugins to only output one cube it would be odd to make an exceptionfor this plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we (mostly) require plugins to only output one cube it would be odd to make an exceptionfor this plugin.

Don't worry about it now. I think this is something worth considering in future. I took such an approach with some plugins as part of the CLI/plugin refactor work. Considering a plugin as being an operator, taking 1 or more cube to apply to that operator doesn't preclude it from being used with only 1 and extending its support would offer more options for graph construction.

cube = cubes[0]

# Ensure mask is in a boolean form and invert if requested
mask.data = mask.data.astype(np.bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you check whether this array is a masked array, you assume that it isn't I think??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sort of assuming that the mask isn't masked (it would be an odd thing to do) but if you did input a masked array I don't think it would break. The masked points would just end up with no mask is the new cube.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I typically add a masked array unittest for things myself as I find mask handling somewhat inconsistent across numpy. - up to you.

Copy link
Contributor
@Katie-Howard Katie-Howard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked through the code changes and the added tests, and have made sure the tests run through (unit and acceptance) which is fine. I just have a minor comment about a change that I think should be made, but am otherwise happy 👍

A cube with a mask applied to the data.

Raises:
ValueError: If the number of cubes provided is not equal to 2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you also mention the dimension_coordinate_mismatch valueError that gets raised (from line 68)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor
@Katie-Howard Katie-Howard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the change that has been made, thanks Marcus 👍

Copy link
Contributor
@cpelley cpelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you on whether you add a mask array unittest. I appreciate that such input is unlikely.
Cheers

Copy link
Contributor
@cpelley cpelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, a couple more things I didn't think of until now.

  1. What was your reasoning being not making apply_mask a Plugin?

From a technical point of view it doesn't have to be of course.
However, as a Plugin object, by convention, input data would be supplied in the callable and parameters provided in the class initialisation.

Illustration:

ApplyMask(mask_name=<mask-name>, invert_mask=False)(cube, mask_cube)

  1. Is parameter mask_name necessary?

I would generally minimise the number of parameters if at all possible and to me this parameters suggest filling in for metadata from the input masked cube no?

That is, can't you simply have mask_array check for which cube is a landsea mask or ocean mask (through their names)?

From above, if you know that it is a land mask or ocean mask, supplying an additional invert_mask parameter would only be necessary if we supplied land/ocean input alongside an ocean mask or land/ocean input alongside a land mask (as we can't so easily identify whether the input cube is a land or ocean field - the cube we want to apply the mask, not the mask cube).

@cpelley cpelley self-requested a review July 23, 2024 06:56
@mspelman07
Copy link
Contributor Author

@cpelley

  1. You could write this as a plugin rather than a function. I think of a class/Plugin as being a way to link multiple functions together in a single command to help clarify more complicated functionarlity. For this code it is relatively simple and short so it doesn't feel like it makes much sense to put it into an independent plugin.

  2. Its to help keep the plugin as agnostic as possible. mask_name allows someone to provide any mask without restricting it to a land sea mask. Also even a land sea mask could have a variety of names. We call ours "land_binary_mask" but it doesn't necessarilly mean that collaboraters or anyone else will call it this so rather than assume or require some naming convention its tidiest to let the user tell the plugin the name of the mask cube to help with any ambiguity.
    The invert_mask option is something we need for the EPP workflow where we need to mask the land points of one cube and the sea points of a different cube. It's nicer to provide the option to invert a mask before applying it here rather than having a separate ancillary file for masking sea points.

@cpelley
Copy link
Contributor
cpelley commented Jul 23, 2024

it doesn't feel like it makes much sense to put it into an independent plugin.

That's fine. I just wanted to double there wasn't some convention, implied or otherwise.

...provide any mask without restricting it to a land sea mask.

Fair enough 👍

Copy link
Contributor
@cpelley cpelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mspelman07

@mspelman07 mspelman07 merged commit 707436b into metoppv:master Jul 23, 2024
10 checks passed
MoseleyS added a commit to MoseleyS/improver that referenced this pull request Aug 23, 2024
…distance

* refs/heads/master:
  Height of max (metoppv#2020)
  Migrate CLI functionality: simple_bias_correction (metoppv#2018)
  BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (metoppv#2021)
  Plugin discovery (metoppv#2009)
  Adds masked_add to cube combiner (metoppv#2015)
  Apply mask to cube (metoppv#2014)
  Make difference module handle circular cubes (metoppv#2016)

# Conflicts:
#	improver/utilities/spatial.py
#	improver_tests/utilities/spatial/test_DifferenceBetweenAdjacentGridSquares.py
MoseleyS pushed a commit to MoseleyS/improver that referenced this pull request Aug 23, 2024
* Add apply_mask cli and function

* remove print statement

* Updates to make sure plugin doesn't require cubelist input

* Update Docstring

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>
MoseleyS added a commit to MoseleyS/improver that referenced this pull request Aug 23, 2024
…distance

* refs/heads/master:
  Height of max (metoppv#2020)
  Migrate CLI functionality: simple_bias_correction (metoppv#2018)
  BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (metoppv#2021)
  Plugin discovery (metoppv#2009)
  Adds masked_add to cube combiner (metoppv#2015)
  Apply mask to cube (metoppv#2014)
  Make difference module handle circular cubes (metoppv#2016)

# Conflicts:
#	improver/utilities/spatial.py
#	improver_tests/utilities/spatial/test_DifferenceBetweenAdjacentGridSquares.py
MoseleyS pushed a commit to MoseleyS/improver that referenced this pull request Aug 23, 2024
* Add apply_mask cli and function

* remove print statement

* Updates to make sure plugin doesn't require cubelist input

* Update Docstring

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>
MoseleyS added a commit to MoseleyS/improver that referenced this pull request Aug 23, 2024
…distance

* refs/heads/master:
  Height of max (metoppv#2020)
  Migrate CLI functionality: simple_bias_correction (metoppv#2018)
  BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (metoppv#2021)
  Plugin discovery (metoppv#2009)
  Adds masked_add to cube combiner (metoppv#2015)
  Apply mask to cube (metoppv#2014)
  Make difference module handle circular cubes (metoppv#2016)

# Conflicts:
#	improver/utilities/spatial.py
#	improver_tests/utilities/spatial/test_DifferenceBetweenAdjacentGridSquares.py
bayliffe pushed a commit that referenced this pull request Aug 27, 2024
* Added test for equal-area grids and got it passing

Made input test fixture configurable

Got test for gradient on lat/long grid failing for correct reasons (response is in m/s/degree; needs to be /s). Commented out regridding test for now.

Hacked about and more or less got latlon distance calculation working but made a lot of mess. Plan to abstract it to its own class so I can test it properly and allow it to be reused.

First test for distance class fails in the right way

Got happy path test passing

Added more tests. Did some tidying

Prepared to implement equalarea logic. Wrote failing test.

Test for equalarea passes

Tidying up

Gradient test passes

Improved tests. Nearly working. Problem with generating expected data for y when interpolated back out to original dims.

Added extra test cases and got all tests working. Reduced required absolute precision due to unexplained weirdness in how iris linear interpolation works when regridding with large gradients.

Tidied up tests

Ran black against test file then re-squared the arrays

Tidying

Got non-standard units working for equal area projections. Not going to implement for latlong projections, as anything other than degrees is rare, and will not fail silently in any case

Removed unused test

Moderately graceless exception handling. Open to suggestions for a better way to work out which coordinates represent latitude and longitude.

Fixed cube instantiation to allow better way of checking projection type

More tidying

Tidied imports in spatial utilities

Review comments - Tuple type hinting fix

Ran autoformatting tools

Added docstrings for some methods

Moved static method back to original location

Added docstrings to all of the new functions in spatial.py

Added some more docstrings to test helper functions

Review actions around error handling

Numpy efficiency improvement

Review actions and tidying

Added extra tests and made distance module handle all coordinate systems that provide distances between grid points

Added a todo comment. Will be actioned shortly

spelling correction

Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk>

more spelling corrections

Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk>

Yet more spelling corrections

Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk>

And another

Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk>

Spelling corrections: Meters -> Metres

Switched to using earth radius supplied with input cube rather than hard-coded constant

Added docstrings to distance tests

Review actions

Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk>

tidied up y latlon distances method

Tidied up x distances calculation

Arguably simplified latlon distance methods. Going to try to split out further using factory pattern. Will revert if it just makes things worse.

Tests passing after factory refactor. Still needs lots of tidying.

Finished tidying

A little more tidying

Review actoin

Updated dosctring and deleted corresponding one on child class as sphinx handles docstring inheritence

Fixed gradient module broken by refactor of distance module

Docstring change.

Ran autoformatting

Added appropriate decoration to distances base class

Added failing tests to gradient module for wrapping around the meridian - still need to move grid points used in tests to cover whole globe at 120 degree increments.

Made test cube generation work with non-equal x and y spacing

Update improver/utilities/spatial.py

Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk>

Docstring improvements

Working on gradient tests. They seem to only work for 10 degrees separations. Expected values are correct, actuals wrong.

Amended test data to give gradients large enough that no tests pass due to both expected and actual being too small to tell the difference between them

Fixed bug causing tests to fail for non wrap-around case.

Spelling corrections

Updated docstring

Switched to 120 degree longitude separations. Some tests now failing. Expected values correct; mid way through debugging actuals.

Got part way to testing wrap-around interpolaiton. Found a bug with non wrap-around calculations for large angles along the way, and got part way through fixing it

Added tests for different longitude separations to distance module and fixed some bugs. Added failing tests to Gradient module for circular latlong grid

More faffing with tests

Amalgamated some distance tests into single parameterised function. Fixed testing bug

More test tidying

Distance tests fail correctly for circular cubes

Started work to make distance caluclation work for circular coords

Removed dependency on diffs from projection cube subclass. Did some tidying.

Made gradient stop passing diffs to distances method

Added test for projected cubes which wrap around the meridian

Made difference module handle cubes that wrap around the meridian

Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance.

Driveby docstring fix

Removed some completed todos. Added a couple of new ones

Removed another completed tod

Got distance module passing test for circualar coords

Make difference module handle circular cubes (#2016)

* Resolved merge conflict

* Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance.

* Resolved merge conflicts

* Auto-formatting

* Import Bugfix

* Ran isort against spatial.py

* Added test for appropriate error for unhandled cubes and made code match it.

* iSort on spatial.py again

* Another attempt at autoformatting

* Tidied up

* Removed confusing variable name

* Autoformatting

* Tried manually sorting import order as isort still failing in github actions despite running locally and pushing

* And again

* And again

* Linting suggestion

* Review Actions

* Ran isort to fix gha

* Added test for cubes with flipped axes (lonitude increasing along columns not rows and vice versa for latitude). Fixed code to handle this case.

* black, isort, flake8

* Ran isort in isolation

* Fixed test failures resulting from breaking changes made by upstream PR

* Black

* updates as per review comments and adds test for 0-360 degree case

* update comment

* formatting

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

* Removes duplicate unit test that has been superseded.

* Fixes calculation of x coord for distances cube with circular coord.

* Updates licence in new file.

* Updates units from "metres" to "m".

* Refactors tests for utilities/spatial into a subdir as per convention.

* Improves doc-strings

* Deals with todos:

- circular projections raise error
- make_test_cube uses set_up_variable_cube
- black

* Deals with todos:

- Tests non-uniform lat-lon grids

* Deals with todos:

- No more "full haversine" refs to delete

* Deals with todos:

- Equalarea circular grid with DifferenceBetweenAdjacentGridSquares

* Deals with todos:

- Moves one into a doc-string
- Adds another about unit test complexity

* Style

* Style

* Style

* Codecov - removes `pass` from abstract methods as these can never be reached and are unnecessary.

* Adds test for uncovered raises statement

* Simplifies tests for GradientBetweenAdjacentGridSquares which led to:

- Simplification of calculation of coordinate points
- Always raise error for circular projections
- Output cubes inherit metadata from input cube

* Adds doc-string and strengthens a test

* Fixes handling of mandatory attributes

* Style

* Adds metadata and error tests for regrid=True mode

* Adds data tests for regrid=True mode

* Adjusts test data so that we can see obvious expected behaviour with circular data.

* Removes redundant precision specifier

* Corrects doc-string

* Corrects metadata expected outputs

* Fixes test that only fails in some environments.

* Moves cube argument to process method for consistency with other plugin behaviour.

* Adds doc-string

* Improves doc-string

* Removes todo comment

* Updates doc-strings following review.

* Updates tests following review

* Make difference module handle circular cubes (#2016)

* Resolved merge conflict

* Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance.

* Resolved merge conflicts

* Auto-formatting

* Import Bugfix

* Ran isort against spatial.py

* Added test for appropriate error for unhandled cubes and made code match it.

* iSort on spatial.py again

* Another attempt at autoformatting

* Tidied up

* Removed confusing variable name

* Autoformatting

* Tried manually sorting import order as isort still failing in github actions despite running locally and pushing

* And again

* And again

* Linting suggestion

* Review Actions

* Ran isort to fix gha

* Added test for cubes with flipped axes (lonitude increasing along columns not rows and vice versa for latitude). Fixed code to handle this case.

* black, isort, flake8

* Ran isort in isolation

* Fixed test failures resulting from breaking changes made by upstream PR

* Black

* updates as per review comments and adds test for 0-360 degree case

* update comment

* formatting

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

* Apply mask to cube (#2014)

* Add apply_mask cli and function

* remove print statement

* Updates to make sure plugin doesn't require cubelist input

* Update Docstring

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

* Adds masked_add to cube combiner (#2015)

* Adds masked_add to cube combiner

* Remove unnecessary import

* formatting

* add docstrings and simplifications

* formatting

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

* Plugin discovery (#2009)

* BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (#2021)

Pulling over pinned packages introduced to the `conda-forge.yml` into the `latest.yml` file.
#2011

* Migrate CLI functionality: simple_bias_correction (#2018)

* refactored simple_bias_correction cli

* docstring and contributing

* changes for actions

* sphinx building locally without error

* whitespace change

* bugfix: test upper_bound

* changes based on review

* review changes

* review - doc-strings

* Height of max (#2020)

* Plugin for calculating the height of the maximum vertical velocity once the maximum vertical velocity has been calculated.

* Minor code updates.

* Deleting or adding required spaces.

* Updating some comments.

* Adds masked_add to cube combiner

* Remove unnecessary import

* formatting

* Updating and improving unit tests, and adding cli.

* Isort changes.

* Formatting changes.

* Sorting out trailing whitespaces.

* Sorting out formatting.

* flake8 change.

* Further changes to get acceptance tests to work.

* Changes to get acceptance test data to run.

* add docstrings and simplifications

* formatting

* update checksums and correct acceptance test bug

* Update default name and change low_or_high to boolean

* updates acceptance tests

* formatting

---------

Co-authored-by: Kathryn.Howard <kathryn.howard@metoffice.gov.uk>
Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

* Removes differences in spatial.py w.r.t. reviewed version

* Resolves differences with reviewed version

* Make difference module handle circular cubes (#2016)

* Resolved merge conflict

* Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance.

* Resolved merge conflicts

* Auto-formatting

* Import Bugfix

* Ran isort against spatial.py

* Added test for appropriate error for unhandled cubes and made code match it.

* iSort on spatial.py again

* Another attempt at autoformatting

* Tidied up

* Removed confusing variable name

* Autoformatting

* Tried manually sorting import order as isort still failing in github actions despite running locally and pushing

* And again

* And again

* Linting suggestion

* Review Actions

* Ran isort to fix gha

* Added test for cubes with flipped axes (lonitude increasing along columns not rows and vice versa for latitude). Fixed code to handle this case.

* black, isort, flake8

* Ran isort in isolation

* Fixed test failures resulting from breaking changes made by upstream PR

* Black

* updates as per review comments and adds test for 0-360 degree case

* update comment

* formatting

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

* Resolves differences with reviewed version again

---------

Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>
Co-authored-by: mspelman07 <99179165+mspelman07@users.noreply.github.com>
Co-authored-by: Carwyn Pelley <carwyn.pelley@metoffice.gov.uk>
Co-authored-by: SamGriffithsMO <122271903+SamGriffithsMO@users.noreply.github.com>
Co-authored-by: Kathryn.Howard <kathryn.howard@metoffice.gov.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
EPP PR's related to Enhancing Post Processing team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0