8000 DaskVine Dag Reconstruction, Visualization + other fixes by BarrySlyDelgado · Pull Request #4145 · cooperative-computing-lab/cctools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

DaskVine Dag Reconstruction, Visualization + other fixes #4145

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

BarrySlyDelgado
Copy link
Contributor

Proposed Changes

Adds some functionality to DaskVine with a couple of fixes as well

Task Graph Reconstruction

A user annotates a function with the @daskvine_merge decorator. When the DaskVineDag object is created and reconstruct=True, expand_merge will reconstruct the graph to perform a hierarchical merge/reduction based on merge_size, which is 2 by default. We replace the original node in the graph with the root node of the hierarchical tree

This turns a DAG like this
original.pdf
To one like this
reconstructed.pdf

Visualization

The visualize function in dask_dag.py allows users to visualize the graph. Setting visualize=True will create a dot graph. This is done after reconstruction.

other fixes

In set_result taskrefs, more specifically, dts.Alias is not handled properly.

Merge Checklist

The following items must be completed before PRs can be merged.
Check these off to verify you have completed all steps.

  • make test Run local tests prior to pushing.
  • make format Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python)
  • make lint Run lint on source code prior to pushing.
  • Manual Update: Update the manual to reflect user-visible changes.
  • Type Labels: Select a github label for the type: bugfix, enhancement, etc.
  • Product Labels: Select a github label for the product: TaskVine, Makeflow, etc.
  • PR RTM: Mark your PR as ready to merge.

self.set_result(p, self.get_result(node.key))
) # case e.g, "x": "y", and we just set the value of "y"
x = self.set_result(p, self.get_result(node.target))
# rs.update(x) # case e.g, "x": "y", and we just set the value of "y"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct?

@@ -162,10 +234,14 @@ def set_result(self, key, value):
else:
self._depth_of[r] = 0

for c in self._dependencies_of[key]:
self._pending_needed_by[c].discard(key)
if not DaskVineDag.taskref(self._working_graph[key]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this if needed?


return rs.values()
if DaskVineDag.taskref(self._working_graph[key]):
return rs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this mixing old and new graph styles? If so, this file should only have the new style, and the one in compat the old style.

@@ -211,6 +290,33 @@ def set_targets(self, keys):
def get_targets(self):
return self._targets

def visualize(self, name='DaskVine_DAG'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be different from dask.visualize?

dag = DaskVineDag(dsk)
dag = DaskVineDag(dsk, reconstruct=self.reconstruct, merge_size=self.merge_size)

if self.visualize:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to leave the visualization to dask proper.

@@ -56,7 +57,7 @@ def containerp(s):
def symbolp(s):
return isinstance(s, dts.DataNode)

def __init__(self, dsk):
def __init__(self, dsk, reconstruct=False, merge_size=2):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add detailed comments of what these options do.

@btovar btovar added this to the 7.15.4 milestone May 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0