10000 Fix missing extent cache invalidation by brisvag · Pull Request #7015 · napari/napari · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix missing extent cache invalidation #7015

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Aug 14, 2024

Conversation

brisvag
Copy link
Contributor
@brisvag brisvag commented Jun 21, 2024

References and relevant issues

Started on zulip: https://napari.zulipchat.com/#narrow/stream/212875-general/topic/Layer.2Erefresh.28.29.20doesn't.20reset.20dims.20range

Description

The issue at the core is the fact that we cache extents, and are unable to invalidate this cache when layer data is changed inplace (since we have no events associated with the array inside).

Example failing code:

import napari
import numpy as np
viewer = napari.Viewer()
points = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5]])
actual_slices = np.array([4, 6, 8, 10])

layer = viewer.add_points(points)
viewer.layers['points'].data[:,0] = actual_slices

This should result in dims going from 0 to 6, with alternating empty slices. Instead, we get 0-3, with only empty slices, cause the points layer extents are outdated.

Adding the following lines will fix the issue:

layer._clear_extents_and_refresh()  # invalidates cached extents
viewer._on_layers_change()  # recomputes dims extents

For now, I only switched from refresh() to _clear_extents_and_refresh() where we do have knowledge of a change in data. However, to actually fix the above code, we need to somehow either:

  1. always invalidate the cache on refresh() for non array-like layers (might be a big hit on performance cause we refresh in a lot of places that don't really need this
  2. expose a "hard refresh" method
  3. change refresh to always invalidate cahce, but expose a "soft refresh" that does not, and use that throughout napari every time we don't need cache invalidation

I think 3. is the best, but it's also a bit more invasive and I'm not sure how many people use refresh and rely on it being performant outside of napari core.

@brisvag brisvag requested review from psobolewskiPhD and Czaki June 21, 2024 14:17
@Czaki
Copy link
Collaborator
Czaki commented Jun 21, 2024

I do not think, that we should support modification data in place.

I suggest that people should use

layer.data = layer.data to refresh.

All other solution is high risk for performance problem or inconsistency. I do not feel like current code of layers will allow for this.

Maby after big refactor.

@brisvag
Copy link
Contributor Author
brisvag commented Jun 21, 2024

I do not think, that we should support modification data in place.

I'm not saying we should be automatically firing events (I agree that's too much work we need to inject into arrays and risk having huge performance issues), but I definitely think we should support setting data inplace (and not straight up explode or fail) as long as there's a manual way of letting napari now "Data has changed!".

But yeah, we can also encourage people to use layer.data = data, but that's also a bigger performance hit than needed, in many cases.

Copy link
codecov bot commented Jun 21, 2024

Codecov Report

Attention: Patch coverage is 97.77778% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.68%. Comparing base (a8c822c) to head (4fb61f2).
Report is 1 commits behind head on main.

Files Patch % Lines
napari/layers/surface/surface.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7015      +/-   ##
==========================================
- Coverage   92.78%   92.68%   -0.10%     
==========================================
  Files         620      620              
  Lines       56779    56781       +2     
==========================================
- Hits        52680    52627      -53     
- Misses       4099     4154      +55     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Czaki
Copy link
Collaborator
Czaki commented Jun 21, 2024

But yeah, we can also encourage people to use layer.data = data, but that's also a bigger performance hit than needed, in many cases.

Why is it a bigger performance hit? If you allow modifying in place. Then how it will differ from setting, a new data?

@brisvag
Copy link
Contributor Author
brisvag commented Jun 21, 2024

It's a bigger performance hit than just doing layer.refresh(), because layer.refresh() does not invalidate the extent caches. There are many cases where people use refresh() where they don't need to recompute extents, and if they do so as part of a rapid-firing event it might be significantly slower.

@Czaki
Copy link
Collaborator
Czaki commented Jun 21, 2024

I know, that refresh is less expensive (it is still too expensive - for example trigger recomputation of thumbnail).

But I do not see how modify data in place could be less expensive than assign data.

@psobolewskiPhD
Copy link
Member

I kinda like the idea of the hard refresh that the user can do?
Basically take the layer.data and treat like I just added it as a new layer?
That's what I expected .refresh() to do frankly.
I don't think we need to support in-place edits with auto-magic, but making it so someone could do it would be nice.
The specific case that brought up this issue was someone who gets a layer generated by a plugin, but wants to fix the ordering of the slices. So doing it in-place is th 8000 e easiest, because they don't have any/easy access to the original array that was passed as data for the layer.

@jni
Copy link
Member
jni commented Jun 22, 2024

That's what I expected .refresh() to do frankly.

👆 I agree wholeheartedly with this. If something is not refreshed by layer.refresh, I consider that a bug. Obviously, in many places we need partial refresh — we can add support for that in other ways. e.g. layer.refresh(thumbnail=False, extent=False)?

@jni jni added the bugfix PR with bugfix label Jun 22, 2024
@jni jni added this to the 0.5 milestone Jun 22, 2024
@Czaki
Copy link
Collaborator
Czaki commented Jun 22, 2024

👆 I agree wholeheartedly with this. If something is not refreshed by layer.refresh, I consider that a bug. Obviously, in many places we need partial refresh — we can add support for that in other ways. e.g. layer.refresh(thumbnail=False, extent=False)?

You forget about colormap etc.
The general problem is that refresh is used in multiple places and even now, provides us significant negative impact on performance of layers.

@jni
Copy link
Member
jni commented Jun 22, 2024

The general problem is that refresh is used in multiple places and even now, provides us significant negative impact on performance of layers.

of course. That's a problem with our programming style, though. It does not change the point that as a user, I would definitely expect refresh() to actually refresh. It should be the last resort for forcing the viewer to sync changes.

Anyway, I wouldn't mind some middle ground, where all the attributes that can be refreshed are false by default, but we add refresh_all() that sets them all to true. Then we can audit our calls to refresh and set only necessary elements to True.

@brisvag
Copy link
Contributor Author
brisvag commented Jun 22, 2024

What about something like this in the last commit?

Then I would have to change every occurence of layer.refresh() to use False for the unneeded parts.

@psobolewskiPhD
Copy link
Member

Sorry to be so late to this.
I can appreciate not wanting to support/handle/encourge in place modification of data.
In the "trigger case" the user had a layer generated by a plugin, so they didn't have easy access to the "original" data. I warned that this wasn't a good way to do it!
I guess the solution should have been:

current_data = viewer.layers['the layer'].data
current_data[;,0] = actual_data
viewer.layers['the layer'].data = current_data

Still would be nice to do a "no seriously refresh everything like for real"

@jni
Copy link
Member
jni commented Jul 1, 2024

What about something like this in the last commit?

Then I would have to change every occurence of layer.refresh() to use False for the unneeded parts.

Personally I really like this approach!

@brisvag brisvag marked this pull request as ready for review July 4, 2024 10:07
@brisvag
Copy link
Contributor Author
brisvag commented Jul 4, 2024

Let me know what you think. I went through all the refresh calls and updated it accordingly... it's a bit tricky to understand the control flow of our refreshes sometimes, so it would be great to get other eyes on it. I also expect some tests to fail and let me know where I messed up :P

I also left some TODOs where I wasn't sure what to do.

@Czaki Czaki added the run-benchmarks Add this label to trigger a full benchmark run in a PR label Jul 4, 2024
@brisvag
Copy link
Contributor Author
brisvag commented Jul 4, 2024

I don't understand this mypy error. It points to clearly wrong lines, so I don't know how to debug it.

Ok now it's correct. Not sure what I did wrong ^^'

Copy link
Collaborator
@Czaki Czaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should fix mypy error.
If you commit it, please restart benchmarks

@jni jni modified the milestones: 0.5, 0.5.1 Jul 10, 2024
Copy link
Contributor Author
@brisvag brisvag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few TODOs in there, it would be good to get some feedback on them and address them before merging!

@brisvag brisvag requested a review from andy-sweet July 23, 2024 14:06
@jni jni modified the milestones: 0.5.1, 0.5.2 Jul 24, 2024
@jni
Copy link
Member
jni commented Jul 24, 2024

Discussed in community meeting: @Czaki pointed out (and I agree) that cache invalidation is hard 😂 and we would probably benefit with longer testing on main, so we have decided to move it to the next milestone.

@jni
Copy link
Member
jni commented Jul 24, 2024

(with an idea to merge immediately after the 0.5.1 release)

@brisvag brisvag added the performance Relates to performance label Aug 5, 2024
@brisvag
Copy link
Contributor Author
brisvag commented Aug 5, 2024

@jni we didn't merge immediately after 0.5.1.... shoudl we still get this in?

@Czaki
Copy link
Collaborator
Czaki commented Aug 5, 2024

@jni we didn't merge immediately after 0.5.1.... shoudl we still get this in?

I think that you should merge this just after 0.5.2 release

@jni jni modified the milestones: 0.5.2, 0.5.3 Aug 5, 2024
@jni
Copy link
Member
jni commented Aug 5, 2024

Ach. I'll try to remember... 😕

@brisvag
Copy link
Contributor Author
brisvag commented Aug 12, 2024

@jni reminder since you're in release mode and I thought about this!

@jni
Copy link
Member
jni commented Aug 12, 2024

haha I also thought about this (especially because I think #7146 should also get merged quickly so we can all play with it on main). But your warning comes a few hours too early, and is thus useless! 😂 I have set a reminder on my watch to merge this in the morning though. 😃

@brisvag brisvag added the ready to merge Last chance for comments! Will be merged in ~24h label Aug 14, 2024
Copy link
Collaborator
@Czaki Czaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's play with this

@brisvag brisvag merged commit e6a4d9b into napari:main Aug 14, 2024
36 checks passed
@brisvag brisvag deleted the fix/points-extent-refresh branch August 14, 2024 20:33
@github-actions github-actions bot removed the ready to merge Last chance for comments! Will be merged in ~24h label Aug 14, 2024
TimMonko pushed a commit that referenced this pull request Jun 13, 2025
# References and relevant issues


# Description

In #7015 the change from cleaning extent to refresh it miss that
`_block_refresh` prevents from invalidating cache when a callback
connected to event expects it.

It also updates pasting spacing information to reduce layer refresh.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix PR with bugfix performance Relates to performance
Projects
None yet
Development

Successfully merging this pull request may close these iss 3AC0 ues.

4 participants
0