Optimize vindex #11625
Conversation
1. Tokenize before broadcasting.
2. Avoid copying indexer arrays as much as possible.
3. Use cached_max / cached_cumsum for determining chunk sizes and chunk bounds.
4. Broadcast as late as possible.
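For context, a minimal sketch of the idea behind points 3 and 4 (not the actual dask internals; plain `np.cumsum` stands in for the cached helpers, and all sizes are made up): chunk bounds let sorted indexer values be mapped to blocks in one vectorized pass, without copying the indexer per block.

```python
import numpy as np

chunks = (100,) * 100                      # chunk sizes along one axis
bounds = np.cumsum((0,) + chunks)          # chunk boundaries; this is what cached_cumsum memoizes
index = np.sort(np.random.randint(0, bounds[-1], 5_000))

# searchsorted assigns every indexer value to its block at once, so the
# sorted indexer never needs to be materialized per block up front
blocks = np.searchsorted(bounds, index, side="right") - 1
starts = np.searchsorted(blocks, np.arange(len(chunks)))  # first position of each block
```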
This reverts commit e56f72fafc07f7aceae5f05b9b69bb5b3a530a18.
Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

15 files ±0, 15 suites ±0, 3h 59m 27s ⏱️ +2m 50s
Results for commit e23eac5. ± Comparison against base commit fe465e0.

♻️ This comment has been updated with latest results.
967b129 to a4379f3 (compare)
```python
slicer = slice(start, stop)
key = sorted_keys[start]
# recover the output block and the per-axis input block ids from the flat key
outblock, *input_blocks = np.unravel_index(key, ravel_shape)
# the indexer values that fall into this block
inblock = [_[slicer] for _ in sorted_inblock_idxs]
```
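As a hypothetical standalone illustration of the key encoding used above (the `ravel_shape` values here are invented): block coordinates are flattened into one sortable integer and recovered with `np.unravel_index`.

```python
import numpy as np

ravel_shape = (4, 3, 3)   # hypothetical: (output blocks, input blocks axis 0, axis 1)
key = 17
outblock, *input_blocks = np.unravel_index(key, ravel_shape)
print(outblock, input_blocks)  # 1 [2, 2], since 17 == 1*9 + 2*3 + 2
```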
Perhaps we should just scatter out `sorted_inblock_idxs` as dask arrays, and embed a reference to a dask array sliced by these slicers in each task instead of the full array? The graph is quite big for the example in #11018.
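Something like this rough sketch of the idea, assuming a `distributed` client is available (names and sizes are illustrative, not the actual implementation):

```python
import numpy as np
from distributed import Client

client = Client(processes=False)                 # local cluster for illustration
sorted_inblock_idxs = [np.arange(10_000), np.arange(10_000)]

# scatter once: each task then carries small futures plus slice bounds
# instead of embedding full copies of the indexer arrays in the graph
idx_futures = client.scatter(sorted_inblock_idxs, broadcast=True)

def take(idxs, start, stop):
    return [a[start:stop] for a in idxs]

futures = [
    client.submit(take, idx_futures, start, start + 100)
    for start in range(0, 10_000, 100)
]
results = client.gather(futures)
```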
Yeah, I get `UserWarning: Sending large graph of size 738.91 MiB` for:
```python
import dask.array as da
import numpy as np
import scipy as sp

chunksize = 100
size = 10_000
n_points = 5000

X = da.random.poisson(15, (size, size), chunks=(chunksize, chunksize))
index_0 = np.random.randint(0, X.shape[0], n_points)
index_0.sort()
index_1 = np.random.randint(0, X.shape[1], n_points)
index_1.sort()

print('vindex timing:')
X.vindex[np.ix_(index_0, index_1)].compute()
```
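For reference, the same selection can also be written as two sequential orthogonal indexing steps, which makes a useful baseline when timing the vindex path (a minimal sketch, assuming `X`, `index_0`, and `index_1` from the snippet above):

```python
# equivalent result via plain outer indexing along each axis in turn
Y = X[index_0][:, index_1]
Y.compute()
```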
If `index_0` has shape (M, 1) and `index_1` has shape (1, N), that ships M+N values instead of the current M×N.
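A small illustration of that count with plain numpy arrays (hypothetical sizes):

```python
import numpy as np

M, N = 5_000, 5_000
rows = np.arange(M)[:, None]                 # shape (M, 1)
cols = np.arange(N)[None, :]                 # shape (1, N)

# broadcasting up front materializes two (M, N) index arrays
r_full, c_full = np.broadcast_arrays(rows, cols)
print(r_full.size + c_full.size)             # 2 * M * N = 50_000_000

# keeping the outer indexers separate ships only M + N values
print(rows.size + cols.size)                 # M + N = 10_000
```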
That said, I think this PR can be merged (barring any other comments). Such a graph takes so long to build today that no one is using it.
Yeah, I agree, we can merge this now; it already reduced graph size by 50%.

Thanks!
- `pre-commit run --all-files`
- vindex as outer indexer: memory and time performance #11018 (1.87s, 2.1GB memory)