Implement unify_chunks and Rechunk #11692
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
15 files +15, 15 suites +15, 4h 13m 9s ⏱️ +4h 13m 9s
Results for commit c1c37af. ± Comparison against base commit 430a951.
# Conflicts:
#   dask/array/__init__.py
#   dask/array/_array_expr/__init__.py
#   dask/array/_array_expr/_blockwise.py
#   dask/array/_array_expr/_collection.py
#   dask/array/_array_expr/_expr.py
#   dask/array/_array_expr/tests/test_collection.py
)
from dask.utils import apply, deepmap, derived_from

if da._array_expr_enabled():
This allows us to keep the actual implementations the same and just switch the imports around. It saves a lot of code duplication.
for _ in range(depth - 1):
    x = PartialReduce(
        x,
        func,
        split_every,
        True,
        dtype=dtype,
        name=(name or funcname(combine or aggregate)) + "-partial",
        reduced_meta=reduced_meta,
    )
func = partial(aggregate, axis=axis, keepdims=keepdims)
if concatenate:
    func = compose(func, partial(_concatenate2, axes=sorted(axis)))
return new_collection(
    PartialReduce(
        x,
        func,
        split_every,
        keepdims=keepdims,
        dtype=dtype,
        name=(name or funcname(aggregate)) + "-aggregate",
        reduced_meta=reduced_meta,
    )
)
This makes me wonder if it shouldn't be a single TreeReduce expression. Is there (from an expression POV) any value in using the PartialReduce?
I guess this is what you're calling out with the top level comment about ACA. Just want to double check
Yes, correct. This should be one but that makes migration harder. The whole reduction should probably be a single expression.
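For context, the shape of the reduction under discussion can be sketched in plain Python: several partial rounds that each combine at most `split_every` inputs, followed by one final aggregation. This is an illustration only, not dask's implementation; `tree_reduce` and the returned round count are hypothetical, and the chunks are plain Python values rather than array blocks.

```python
def tree_reduce(chunks, combine, aggregate, split_every=2):
    """Reduce per-chunk values in rounds of at most `split_every` inputs.

    Returns the final value and the number of rounds that were run.
    """
    if split_every < 2:
        raise ValueError("split_every must be at least 2")
    if not chunks:
        raise ValueError("need at least one chunk")

    level = list(chunks)
    rounds = 0
    # Partial rounds: keep combining groups until one final group remains.
    while len(level) > split_every:
        level = [
            combine(level[i : i + split_every])
            for i in range(0, len(level), split_every)
        ]
        rounds += 1
    # Final round: aggregate the last group into the result.
    return aggregate(level), rounds + 1


total, rounds = tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], sum, sum, split_every=2)
```

Expressed this way, the whole tree is one logical operation parameterised by `split_every`, which is the argument for representing it as a single expression rather than a chain of partial steps.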
I believe this entire module is just moved code, correct?
yeah, sorry for not highlighting this better
def _compute_rechunk(old_name, old_chunks, chunks, level, name):
    """Compute the rechunk of *x* to the given *chunks*."""
    # TODO: redo this logic
Do you mean to redo the entire function or just the commented-out code?
That's not new; it was copied from the legacy implementation.
def unify_chunks_expr(*args):
    # TODO(expr): This should probably be a dedicated expression
Any reason why you chose not to introduce the expression right away?
Unfortunately, I am not completely sure how this would interact with blockwise. I'd rather do this once we have a passing test suite.
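As background on what unifying chunks involves along one shared dimension, here is a rough pure-Python sketch: take every chunk boundary used by any input and derive the chunking that all inputs can be rechunked to. The function name `common_chunks` is hypothetical, and this is not claimed to be what `unify_chunks_expr` does internally.

```python
from itertools import accumulate


def common_chunks(*chunkings):
    """Return the coarsest chunking that refines every input chunking."""
    if not chunkings:
        raise ValueError("need at least one chunking")
    if len({sum(c) for c in chunkings}) != 1:
        raise ValueError("all chunkings must cover the same total length")

    # Collect every chunk boundary used by any of the inputs.
    boundaries = sorted({b for c in chunkings for b in accumulate(c)})
    # Differences between consecutive boundaries are the unified chunk sizes.
    return tuple(b - a for a, b in zip([0] + boundaries, boundaries))


unified = common_chunks((4, 4, 2), (5, 5))
```

Once every input shares the same boundaries, blocks line up one to one along that dimension, which is the property a blockwise step relies on.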
pre-commit run --all-files
sits on top of #11689