Allow mapping assets to partitions · Issue #29713 · dagster-io/dagster · GitHub

Allow mapping assets to partitions #29713

Open
danielgafni opened this issue May 1, 2025 · 0 comments
danielgafni commented May 1, 2025

What's the use case?

I have a set of heterogeneous unpartitioned assets which are normalized downstream into a single partitioned asset.

import dagster as dg

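# Every upstream asset (a, b, c) is loaded as an input on each partition run,
# but only the one whose name matches the partition key is returned.
@dg.asset(partitions_def=dg.StaticPartitionsDefinition(["a", "b", "c"]))
def merged_asset(context: dg.AssetExecutionContext, a, b, c):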
    if context.partition_key == "a":
        return a
    elif context.partition_key == "b":
        return b
    elif context.partition_key == "c":
        return c
    else:
        raise NotImplementedError(f"partition key {context.partition_key} is not supported")

I would like to be able to express that a single upstream asset corresponds to a single downstream partition. Think IdentityPartitionMapping, but instead of upstream partitions I have upstream assets.

This is important because, without such a mapping, a change to any one of the upstream assets marks all partitions of the downstream asset as stale. This can be worked around by specifying DataVersion manually, but that is still not ideal.

cc @cmpadden @schrockn as discussed on our call

Ideas of implementation

We could support a set of predefined strategies for this mapping. For example, one strategy could match the last element of the upstream asset key against the downstream partition keys:

["my_prefix", "a"] -> "a"
["my_prefix", "b"] -> "b"

I'm not sure whether this would require changes to the current mapping framework, since the entity being mapped would be an asset key rather than a partition key.
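The last-element strategy above can be sketched in plain Python. This is only an illustration of the matching rule, not an existing Dagster API: the function name `partition_key_for_asset` and its signature are hypothetical, and asset keys are modeled as plain lists of strings.

```python
from typing import Optional, Sequence


def partition_key_for_asset(
    asset_key_path: Sequence[str],
    partition_keys: Sequence[str],
) -> Optional[str]:
    """Hypothetical helper: map an upstream asset key to a downstream partition key.

    The last element of the asset key path is used as the candidate partition
    key. None is returned when the path is empty or the candidate is not a
    known partition key.
    """
    if not asset_key_path:
        return None
    candidate = asset_key_path[-1]
    return candidate if candidate in partition_keys else None


partition_keys = ["a", "b", "c"]
for path in (["my_prefix", "a"], ["my_prefix", "b"], ["my_prefix", "x"]):
    print(path, "->", partition_key_for_asset(path, partition_keys))
```

Returning None for an unmatched asset key is one possible design choice; a real implementation might prefer to raise an error at definition time instead.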

Additional information

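I can only get away with the current implementation because my assets are loaded as polars.LazyFrame, so the inputs that end up unused are never actually materialized. It won't scale well to many non-lazy input assets, since every input would be loaded on every partition run.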
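To illustrate why laziness matters here, the following plain-Python sketch models each upstream input as a zero-argument loader and records which loaders actually run. It uses neither Dagster nor Polars; `make_loader`, `merged`, and `loaded` are hypothetical names that stand in for lazily loaded inputs.

```python
from typing import Callable, Dict, List

# Records which hypothetical inputs were actually loaded.
loaded: List[str] = []


def make_loader(name: str) -> Callable[[], str]:
    """Build a loader that does its (pretend) expensive work only when called."""

    def load() -> str:
        loaded.append(name)
        return f"data for {name}"

    return load


def merged(partition_key: str, lazy_inputs: Dict[str, Callable[[], str]]) -> str:
    """Select the input matching the partition key and load only that one."""
    try:
        load = lazy_inputs[partition_key]
    except KeyError:
        raise NotImplementedError(
            f"partition key {partition_key} is not supported"
        ) from None
    return load()


lazy_inputs = {name: make_loader(name) for name in ["a", "b", "c"]}
result = merged("b", lazy_inputs)
print(result)
print(loaded)
```

With eager inputs, all three loads would have happened before the function body ran, which is the scaling concern described above.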

It would be great to discuss alternative approaches to this problem here.

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
