Apply mask to cube #2014

mspelman07 · 2024-07-15T08:54:40Z

Ticket: https://metoffice.atlassian.net/browse/EPPT-1305
Acceptance test data: metoppv/improver_test_data#50

This PR adds the ability to apply a mask to a cube

Testing:

Ran tests and they passed OK
Added new tests for the new feature(s)

codecov · 2024-07-15T09:01:38Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.32%. Comparing base (84a8944) to head (a6ed3d3).
Report is 18 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2014      +/-   ##
==========================================
- Coverage   98.39%   98.32%   -0.07%     
==========================================
  Files         124      130       +6     
  Lines       12212    12630     +418     
==========================================
+ Hits        12016    12419     +403     
- Misses        196      211      +15

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

cpelley

10000

We should ensure that we don't require the plugin to be provided a cubelist, only that we internally create a cubelist from what is provided (hence the use of the as_cubelist helper function).

improver/cli/apply_mask.py

improver/utilities/mask.py

cpelley · 2024-07-15T13:45:06Z

improver/utilities/mask.py

+    cube_names = [cube.name() for cube in cubes]
+    if len(cubes) != 2:
+        raise ValueError(
+            f"""Two cubes are required for masking, a mask and the cube to be masked.
+                         Provided cubes are {cube_names}"""
+        )
+
+    mask = cubes.extract_cube(mask_name)
+    cubes.remove(mask)
+    cube = cubes[0]


seems a bit arbitrary to limit this to 2 cube only 🤷‍♂️

I was thinking that the inputs would be a mask and the cube you want to mask. In theory you could make the plugin work with multiple cube (i.e. you input a mask and a list of cubes you want to apply the mask to and apply the mask to each input cube) but then you'd have to output all the cubes that you have. As we (mostly) require plugins to only output one cube it would be odd to make an exceptionfor this plugin.

As we (mostly) require plugins to only output one cube it would be odd to make an exceptionfor this plugin.

Don't worry about it now. I think this is something worth considering in future. I took such an approach with some plugins as part of the CLI/plugin refactor work. Considering a plugin as being an operator, taking 1 or more cube to apply to that operator doesn't preclude it from being used with only 1 and extending its support would offer more options for graph construction.

cpelley · 2024-07-15T13:48:29Z

improver/utilities/mask.py

+    cube = cubes[0]
+
+    # Ensure mask is in a boolean form and invert if requested
+    mask.data = mask.data.astype(np.bool)


Should you check whether this array is a masked array, you assume that it isn't I think??

I am sort of assuming that the mask isn't masked (it would be an odd thing to do) but if you did input a masked array I don't think it would break. The masked points would just end up with no mask is the new cube.

Fair enough. I typically add a masked array unittest for things myself as I find mask handling somewhat inconsistent across numpy. - up to you.

Katie-Howard

I have looked through the code changes and the added tests, and have made sure the tests run through (unit and acceptance) which is fine. I just have a minor comment about a change that I think should be made, but am otherwise happy 👍

Katie-Howard · 2024-07-16T12:40:35Z

improver/utilities/mask.py

+        A cube with a mask applied to the data.
+
+    Raises:
+    ValueError: If the number of cubes provided is not equal to 2.


Should you also mention the dimension_coordinate_mismatch valueError that gets raised (from line 68)?

Katie-Howard

I'm happy with the change that has been made, thanks Marcus 👍

cpelley

Up to you on whether you add a mask array unittest. I appreciate that such input is unlikely.
Cheers

cpelley

Sorry, a couple more things I didn't think of until now.

What was your reasoning being not making apply_mask a Plugin?

From a technical point of view it doesn't have to be of course.
However, as a Plugin object, by convention, input data would be supplied in the callable and parameters provided in the class initialisation.

Illustration:

ApplyMask(mask_name=<mask-name>, invert_mask=False)(cube, mask_cube)

Is parameter mask_name necessary?

I would generally minimise the number of parameters if at all possible and to me this parameters suggest filling in for metadata from the input masked cube no?

That is, can't you simply have mask_array check for which cube is a landsea mask or ocean mask (through their names)?

From above, if you know that it is a land mask or ocean mask, supplying an additional invert_mask parameter would only be necessary if we supplied land/ocean input alongside an ocean mask or land/ocean input alongside a land mask (as we can't so easily identify whether the input cube is a land or ocean field - the cube we want to apply the mask, not the mask cube).

mspelman07 · 2024-07-23T10:50:34Z

@cpelley

You could write this as a plugin rather than a function. I think of a class/Plugin as being a way to link multiple functions together in a single command to help clarify more complicated functionarlity. For this code it is relatively simple and short so it doesn't feel like it makes much sense to put it into an independent plugin.
Its to help keep the plugin as agnostic as possible. mask_name allows someone to provide any mask without restricting it to a land sea mask. Also even a land sea mask could have a variety of names. We call ours "land_binary_mask" but it doesn't necessarilly mean that collaboraters or anyone else will call it this so rather than assume or require some naming convention its tidiest to let the user tell the plugin the name of the mask cube to help with any ambiguity.
The invert_mask option is something we need for the EPP workflow where we need to mask the land points of one cube and the sea points of a different cube. It's nicer to provide the option to invert a mask before applying it here rather than having a separate ancillary file for masking sea points.

cpelley · 2024-07-23T11:26:46Z

it doesn't feel like it makes much sense to put it into an independent plugin.

That's fine. I just wanted to double there wasn't some convention, implied or otherwise.

...provide any mask without restricting it to a land sea mask.

Fair enough 👍

cpelley

Thanks @mspelman07

…distance * refs/heads/master: Height of max (metoppv#2020) Migrate CLI functionality: simple_bias_correction (metoppv#2018) BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (metoppv#2021) Plugin discovery (metoppv#2009) Adds masked_add to cube combiner (metoppv#2015) Apply mask to cube (metoppv#2014) Make difference module handle circular cubes (metoppv#2016) # Conflicts: # improver/utilities/spatial.py # improver_tests/utilities/spatial/test_DifferenceBetweenAdjacentGridSquares.py

* Add apply_mask cli and function * remove print statement * Updates to make sure plugin doesn't require cubelist input * Update Docstring --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

…distance * refs/heads/master: Height of max (metoppv#2020) Migrate CLI functionality: simple_bias_correction (metoppv#2018) BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (metoppv#2021) Plugin discovery (metoppv#2009) Adds masked_add to cube combiner (metoppv#2015) Apply mask to cube (metoppv#2014) Make difference module handle circular cubes (metoppv#2016) # Conflicts: # improver/utilities/spatial.py # improver_tests/utilities/spatial/test_DifferenceBetweenAdjacentGridSquares.py

* Add apply_mask cli and function * remove print statement * Updates to make sure plugin doesn't require cubelist input * Update Docstring --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk>

…distance * refs/heads/master: Height of max (metoppv#2020) Migrate CLI functionality: simple_bias_correction (metoppv#2018) BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (metoppv#2021) Plugin discovery (metoppv#2009) Adds masked_add to cube combiner (metoppv#2015) Apply mask to cube (metoppv#2014) Make difference module handle circular cubes (metoppv#2016) # Conflicts: # improver/utilities/spatial.py # improver_tests/utilities/spatial/test_DifferenceBetweenAdjacentGridSquares.py

* Added test for equal-area grids and got it passing Made input test fixture configurable Got test for gradient on lat/long grid failing for correct reasons (response is in m/s/degree; needs to be /s). Commented out regridding test for now. Hacked about and more or less got latlon distance calculation working but made a lot of mess. Plan to abstract it to its own class so I can test it properly and allow it to be reused. First test for distance class fails in the right way Got happy path test passing Added more tests. Did some tidying Prepared to implement equalarea logic. Wrote failing test. Test for equalarea passes Tidying up Gradient test passes Improved tests. Nearly working. Problem with generating expected data for y when interpolated back out to original dims. Added extra test cases and got all tests working. Reduced required absolute precision due to unexplained weirdness in how iris linear interpolation works when regridding with large gradients. Tidied up tests Ran black against test file then re-squared the arrays Tidying Got non-standard units working for equal area projections. Not going to implement for latlong projections, as anything other than degrees is rare, and will not fail silently in any case Removed unused test Moderately graceless exception handling. Open to suggestions for a better way to work out which coordinates represent latitude and longitude. Fixed cube instantiation to allow better way of checking projection type More tidying Tidied imports in spatial utilities Review comments - Tuple type hinting fix Ran autoformatting tools Added docstrings for some methods Moved static method back to original location Added docstrings to all of the new functions in spatial.py Added some more docstrings to test helper functions Review actions around error handling Numpy efficiency improvement Review actions and tidying Added extra tests and made distance module handle all coordinate systems that provide distances between grid points Added a todo comment. Will be actioned shortly spelling correction Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> more spelling corrections Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> Yet more spelling corrections Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> And another Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> Spelling corrections: Meters -> Metres Switched to using earth radius supplied with input cube rather than hard-coded constant Added docstrings to distance tests Review actions Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> tidied up y latlon distances method Tidied up x distances calculation Arguably simplified latlon distance methods. Going to try to split out further using factory pattern. Will revert if it just makes things worse. Tests passing after factory refactor. Still needs lots of tidying. Finished tidying A little more tidying Review actoin Updated dosctring and deleted corresponding one on child class as sphinx handles docstring inheritence Fixed gradient module broken by refactor of distance module Docstring change. Ran autoformatting Added appropriate decoration to distances base class Added failing tests to gradient module for wrapping around the meridian - still need to move grid points used in tests to cover whole globe at 120 degree increments. Made test cube generation work with non-equal x and y spacing Update improver/utilities/spatial.py Co-authored-by: Stephen Moseley <stephen.moseley@metoffice.gov.uk> Docstring improvements Working on gradient tests. They seem to only work for 10 degrees separations. Expected values are correct, actuals wrong. Amended test data to give gradients large enough that no tests pass due to both expected and actual being too small to tell the difference between them Fixed bug causing tests to fail for non wrap-around case. Spelling corrections Updated docstring Switched to 120 degree longitude separations. Some tests now failing. Expected values correct; mid way through debugging actuals. Got part way to testing wrap-around interpolaiton. Found a bug with non wrap-around calculations for large angles along the way, and got part way through fixing it Added tests for different longitude separations to distance module and fixed some bugs. Added failing tests to Gradient module for circular latlong grid More faffing with tests Amalgamated some distance tests into single parameterised function. Fixed testing bug More test tidying Distance tests fail correctly for circular cubes Started work to make distance caluclation work for circular coords Removed dependency on diffs from projection cube subclass. Did some tidying. Made gradient stop passing diffs to distances method Added test for projected cubes which wrap around the meridian Made difference module handle cubes that wrap around the meridian Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance. Driveby docstring fix Removed some completed todos. Added a couple of new ones Removed another completed tod Got distance module passing test for circualar coords Make difference module handle circular cubes (#2016) * Resolved merge conflict * Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance. * Resolved merge conflicts * Auto-formatting * Import Bugfix * Ran isort against spatial.py * Added test for appropriate error for unhandled cubes and made code match it. * iSort on spatial.py again * Another attempt at autoformatting * Tidied up * Removed confusing variable name * Autoformatting * Tried manually sorting import order as isort still failing in github actions despite running locally and pushing * And again * And again * Linting suggestion * Review Actions * Ran isort to fix gha * Added test for cubes with flipped axes (lonitude increasing along columns not rows and vice versa for latitude). Fixed code to handle this case. * black, isort, flake8 * Ran isort in isolation * Fixed test failures resulting from breaking changes made by upstream PR * Black * updates as per review comments and adds test for 0-360 degree case * update comment * formatting --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> * Removes duplicate unit test that has been superseded. * Fixes calculation of x coord for distances cube with circular coord. * Updates licence in new file. * Updates units from "metres" to "m". * Refactors tests for utilities/spatial into a subdir as per convention. * Improves doc-strings * Deals with todos: - circular projections raise error - make_test_cube uses set_up_variable_cube - black * Deals with todos: - Tests non-uniform lat-lon grids * Deals with todos: - No more "full haversine" refs to delete * Deals with todos: - Equalarea circular grid with DifferenceBetweenAdjacentGridSquares * Deals with todos: - Moves one into a doc-string - Adds another about unit test complexity * Style * Style * Style * Codecov - removes `pass` from abstract methods as these can never be reached and are unnecessary. * Adds test for uncovered raises statement * Simplifies tests for GradientBetweenAdjacentGridSquares which led to: - Simplification of calculation of coordinate points - Always raise error for circular projections - Output cubes inherit metadata from input cube * Adds doc-string and strengthens a test * Fixes handling of mandatory attributes * Style * Adds metadata and error tests for regrid=True mode * Adds data tests for regrid=True mode * Adjusts test data so that we can see obvious expected behaviour with circular data. * Removes redundant precision specifier * Corrects doc-string * Corrects metadata expected outputs * Fixes test that only fails in some environments. * Moves cube argument to process method for consistency with other plugin behaviour. * Adds doc-string * Improves doc-string * Removes todo comment * Updates doc-strings following review. * Updates tests following review * Make difference module handle circular cubes (#2016) * Resolved merge conflict * Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance. * Resolved merge conflicts * Auto-formatting * Import Bugfix * Ran isort against spatial.py * Added test for appropriate error for unhandled cubes and made code match it. * iSort on spatial.py again * Another attempt at autoformatting * Tidied up * Removed confusing variable name * Autoformatting * Tried manually sorting import order as isort still failing in github actions despite running locally and pushing * And again * And again * Linting suggestion * Review Actions * Ran isort to fix gha * Added test for cubes with flipped axes (lonitude increasing along columns not rows and vice versa for latitude). Fixed code to handle this case. * black, isort, flake8 * Ran isort in isolation * Fixed test failures resulting from breaking changes made by upstream PR * Black * updates as per review comments and adds test for 0-360 degree case * update comment * formatting --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> * Apply mask to cube (#2014) * Add apply_mask cli and function * remove print statement * Updates to make sure plugin doesn't require cubelist input * Update Docstring --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> * Adds masked_add to cube combiner (#2015) * Adds masked_add to cube combiner * Remove unnecessary import * formatting * add docstrings and simplifications * formatting --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> * Plugin discovery (#2009) * BUG: Fix Scheduled Tests Sphinx-Pytest-Coverage (#2021) Pulling over pinned packages introduced to the `conda-forge.yml` into the `latest.yml` file. #2011 * Migrate CLI functionality: simple_bias_correction (#2018) * refactored simple_bias_correction cli * docstring and contributing * changes for actions * sphinx building locally without error * whitespace change * bugfix: test upper_bound * changes based on review * review changes * review - doc-strings * Height of max (#2020) * Plugin for calculating the height of the maximum vertical velocity once the maximum vertical velocity has been calculated. * Minor code updates. * Deleting or adding required spaces. * Updating some comments. * Adds masked_add to cube combiner * Remove unnecessary import * formatting * Updating and improving unit tests, and adding cli. * Isort changes. * Formatting changes. * Sorting out trailing whitespaces. * Sorting out formatting. * flake8 change. * Further changes to get acceptance tests to work. * Changes to get acceptance test data to run. * add docstrings and simplifications * formatting * update checksums and correct acceptance test bug * Update default name and change low_or_high to boolean * updates acceptance tests * formatting --------- Co-authored-by: Kathryn.Howard <kathryn.howard@metoffice.gov.uk> Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> * Removes differences in spatial.py w.r.t. reviewed version * Resolves differences with reviewed version * Make difference module handle circular cubes (#2016) * Resolved merge conflict * Added another failing test for difference module. Differences calculated correctly but need to add extra dimension for the wrap-around distance. * Resolved merge conflicts * Auto-formatting * Import Bugfix * Ran isort against spatial.py * Added test for appropriate error for unhandled cubes and made code match it. * iSort on spatial.py again * Another attempt at autoformatting * Tidied up * Removed confusing variable name * Autoformatting * Tried manually sorting import order as isort still failing in github actions despite running locally and pushing * And again * And again * Linting suggestion * Review Actions * Ran isort to fix gha * Added test for cubes with flipped axes (lonitude increasing along columns not rows and vice versa for latitude). Fixed code to handle this case. * black, isort, flake8 * Ran isort in isolation * Fixed test failures resulting from breaking changes made by upstream PR * Black * updates as per review comments and adds test for 0-360 degree case * update comment * formatting --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> * Resolves differences with reviewed version again --------- Co-authored-by: Marcus Spelman <marcus.spelman@metoffice.gov.uk> Co-authored-by: mspelman07 <99179165+mspelman07@users.noreply.github.com> Co-authored-by: Carwyn Pelley <carwyn.pelley@metoffice.gov.uk> Co-authored-by: SamGriffithsMO <122271903+SamGriffithsMO@users.noreply.github.com> Co-authored-by: Kathryn.Howard <kathryn.howard@metoffice.gov.uk>

mspelman07 added 2 commits July 12, 2024 16:53

Add apply_mask cli and function

9acf1cd

remove print statement

005075c

mspelman07 added the EPP PR's related to Enhancing Post Processing team label Jul 15, 2024

mspelman07 mentioned this pull request Jul 15, 2024

Apply-mask data metoppv/improver_test_data#50

Merged

cpelley requested changes Jul 15, 2024

View reviewed changes

improver/cli/apply_mask.py Outdated Show resolved Hide resolved

improver/utilities/mask.py Outdated Show resolved Hide resolved

improver/utilities/mask.py Outdated Show resolved Hide resolved

Updates to make sure plugin doesn't require cubelist input

f2d6551

cpelley reviewed Jul 15, 2024

View reviewed changes

Katie-Howard requested changes Jul 22, 2024

View reviewed changes

Update Docstring

a6ed3d3

Katie-Howard approved these changes Jul 22, 2024

View reviewed changes

cpelley approved these changes Jul 23, 2024

View reviewed changes

cpelley reviewed Jul 23, 2024

View reviewed changes

cpelley self-requested a review July 23, 2024 06:56

cpelley approved these changes Jul 23, 2024

View reviewed changes

mspelman07 merged commit 707436b into metoppv:master Jul 23, 2024
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Apply mask to cube #2014

Apply mask to cube #2014

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Apply mask to cube #2014

Apply mask to cube #2014

Uh oh!

Conversation

Uh oh!

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!