-
-
Notifications
You must be signed in to change notification settings - Fork 445
Use blobs instead of random integers #6527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6527 +/- ##
==========================================
- Coverage 92.26% 92.21% -0.06%
==========================================
Files 601 603 +2
Lines 53427 53766 +339
==========================================
+ Hits 49294 49579 +285
- Misses 4133 4187 +54 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gen_blobs
seems generally useful as an alternative to skimage binary_blobs
.
In fact I really like it! I think it could be used to generate nice sample data using say np.uint8
as an alternative to binary blobs.
Is it worth moving to a more general location in the code base?
yes and no It may be a useful example, but without cherry-picking this to 0.4.19, we need to use only utils.py content in benchmark to be able to compare main against the latest release. |
Ok. Maybe we make it a sample or something in 0.5.0. Can make a task issue once merged. |
We could make it sample now. But it will require to have two copies of code. Or we agree on cherry pick it to 0.4.19. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this function. I've left some suggestions for the API. I also suggest that sizes should be configurable.
napari/benchmarks/utils.py
Outdated
|
||
|
||
@lru_cache | ||
def gen_blobs(shape, dtype, blob_count=144): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely need a docstring, including the limitation that this is 2D or 3D only. Personally, I feel like a huge value-add would be to make this n-dimensional, for example with 3D blobs moving about in the added dimensions with brownian motion. That could come in a later PR but just thinking out loud about the kinds of limitations I find when working with skimage binary blobs.
I would also suggest a more descriptive name. At a minimum, generate_blobs
, but, perhaps better, something like labeled_particles
?
Also, I think everything other than shape should be a kwarg, dtype should be optional and default to the smallest dtype for the given number of blobs. I would also rename blob_count
to just n=
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see below that the values are obtained from the dtype. That's also ok but then there should be a default dtype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check now?
It is not an issue. The background is visible only for 0 value. |
Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
napari/benchmarks/utils.py
Outdated
""" | ||
helper function to update data with structure at given coordinates | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is structure
? What is assign_operator
? Can you write a complete docstring here? Even though it's private, it will help others understand the code in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
for clarity
This gives a 10% speedup and is clearer.
This allows users to get back a specific configuration of blobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Czaki I've pushed a bunch of changes including a significant refactor. I think the code is cleaner (for example, it uses standard numpy reductions such as np.maximum), and it is a bit faster (about 10%) too.
I have also made the seed part of the sample layer metadata so that people could reproduce a specific configuration if needed.
I have added tests |
And found that the code was not python 3.8 compatible. |
@Czaki this is for 0.5.0 so 3.8 compatibility is not important? |
I see that the CI still uses 3.8, I guess because we wanted to keep it until 0.4.19. I'm a little shocked that the walrus operator there is reported as a syntax error, since it was introduced in 3.8? Is it because it's not allowed in square brackets due to implementation details? |
It looks like this. |
* main: Use blobs instead of random integers (napari#6527) Set default dtype for empty `_ImageSliceResponse` (napari#6552) Add creating image from clipboard (napari#6532) Fix check if plugin is available from conda (napari#6545) Fix generation of layer creation functions docstrings (napari#6558) Print a whole stack if throttler is not finished (napari#6549) Update `app-model`, `babel`, `coverage`, `dask`, `fsspec`, `hypothesis`, `imageio`, `ipython`, `lxml`, `magicgui`, `pandas`, `pint`, `psutil`, `pydantic`, `pyqt6`, `pytest-qt`, `tensorstore`, `tifffile`, `virtualenv`, `xarray` (napari#6478) Minimal changes for mlx (and jax) compatibility for Image layers (napari#6553) [pre-commit.ci] pre-commit autoupdate (napari#6528) Moving IntensityVisualizationMixin from _ImageBase to Image (napari#6548)
Description
During investigation of scikit-image/scikit-image#7262 I have created a blobs function that is quite fast and produces data more realistic than just
np.random.randint
.So I suggest adopting it in benchmarks.
It is slightly faster on my machine
main:
This branch:
It also adds Balls 2D and Balls 3D example data with importing from
napari.examples.utils