Add Runtime Dataflow Viewer #1023

saulshanabrook · 2021-05-27T01:49:09Z

This pull request adds a visual viewer/debugger for the runtime dataflow.

Motivation

Vega (Lite) provides a wonderful declarative mechanism for specifying visualizations, but by default it requires all data to be loaded into memory in the client's browser. It is often impractical or impossible to let Vega handle all of the data transforms, and instead we wish to "push down" the queries that a Vega visualization needs to some other data system. This could be another system in your browser, like Arquero, or a remote database.

Previously, I achieved this by transforming the Vega spec itself: extracting the existing transforms and replacing them with a combined transform that executes on a remote database (ibis-vega-transform). This approach was able to move some interactive visualizations to SQL, but it was tied to Python. When we wanted to explore a strictly client-side version that could work without a Python kernel, @domoritz from the Vega team recommended that we look into operating at the Vega runtime dataflow level, instead of the Vega spec level, in order to more accurately capture all of the data transformations.

In order to move forward with this, I wanted to have a better understanding of how this runtime dataflow operated. I also wanted to be able to debug how each node in it functioned, which is vital to being able to modify the graph and the nodes.

There are also a number of related issues others have opened on debugging Vega (Lite): vega/vega-lite#4134 vega/vega#407 vega/vega#1879

Background

There was an existing article, by @jheer, called "How Vega Works" that showed a visual representation of the dataflow and let you see how different "pulses" of the graph executed it. @chengluyu then took that code and created an interactive vega-inspector to add more information and support more node types.

They helped for smaller graphs, but I knew that I needed to be able to inspect graphs like the "Interactive Layered Crossfilter" example, which became very hard to read and interact with, using those tools.

Features

So in this pull request, I have added a visual runtime dataflow viewer. This video walks through the data pipeline of the Interactive Layered Crossfilter example, showing what changes as you interact with the diagram and which nodes are executed:

Untitled.mp4

It currently includes:

Zooming, with mouse wheel
Selecting a node or edge to filter to the related nodes (ancestors and children)
Selecting a pulse to filter to those nodes touched in that pulse
Hovering over a node, to see the parameters used to instantiate it
If a pulse is selected, on hover the current value of the node will be shown as well
Filtering by type of node. We add nodes to the graph for all operators in the graph, as well as the updates, bindings, streams, and data.
Background processing of the node layout in a webworker, to allow for continued interaction
Caching of layouts to speed up switching to an existing one
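
As an illustration of the "related nodes" filter above, here is a minimal sketch of collecting a selected node together with its ancestors and descendants. It is not the code in this PR: the `Edge` shape and the `reachable`, `adjacency`, and `relatedNodes` helpers are hypothetical names, and it assumes "children" means every node reachable downstream, not only direct children.

```typescript
// Hypothetical edge shape; the viewer's real graph types may differ.
type Edge = { source: string; target: string };

// Build an adjacency map that follows edges from `from` to `to`.
function adjacency(
  edges: Edge[],
  from: keyof Edge,
  to: keyof Edge
): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const edge of edges) {
    const list = result.get(edge[from]) ?? [];
    list.push(edge[to]);
    result.set(edge[from], list);
  }
  return result;
}

// Collect every node reachable from `start` by following `next`.
// The `seen` set also guards against cycles.
function reachable(start: string, next: Map<string, string[]>): Set<string> {
  const seen = new Set<string>();
  const stack = [start];
  while (stack.length > 0) {
    const id = stack.pop()!;
    for (const neighbor of next.get(id) ?? []) {
      if (!seen.has(neighbor)) {
        seen.add(neighbor);
        stack.push(neighbor);
      }
    }
  }
  return seen;
}

// The selected node plus everything upstream and downstream of it.
function relatedNodes(selected: string, edges: Edge[]): Set<string> {
  const result = new Set<string>([selected]);
  const upstream = reachable(selected, adjacency(edges, 'target', 'source'));
  const downstream = reachable(selected, adjacency(edges, 'source', 'target'));
  upstream.forEach((id) => result.add(id));
  downstream.forEach((id) => result.add(id));
  return result;
}
```

Nodes that only share a descendant with the selection (siblings feeding the same downstream operator) are left out, which is what keeps the filtered graph small.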

Details

To render the graph, I use Cytoscape JS, which is a popular canvas-based graph renderer. To lay out the nodes, I used the Eclipse Layout Kernel. Originally, I used the Cytoscape ELK layout plugin, but switched to using elkjs directly, in order to have greater control of the layout timing and caching.
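
To make the caching idea concrete, here is a rough sketch of a layout cache keyed by the set of visible nodes and edges. It is an assumption-laden illustration, not the implementation in this PR: `Graph`, `Positions`, `LayoutFn`, `graphKey`, and `LayoutCache` are hypothetical names, and the injected `layout` function stands in for whatever actually calls the layout engine (for example, in a web worker).

```typescript
// Hypothetical shapes; the real viewer passes richer data to the layout engine.
type Graph = { nodes: string[]; edges: Array<[string, string]> };
type Positions = Record<string, { x: number; y: number }>;
type LayoutFn = (graph: Graph) => Promise<Positions>;

// Order-independent key, so the same filtered graph always hits the cache.
function graphKey(graph: Graph): string {
  const nodes = graph.nodes.slice().sort();
  const edges = graph.edges.map(([source, target]) => source + '->' + target).sort();
  return JSON.stringify({ nodes, edges });
}

class LayoutCache {
  private cache = new Map<string, Promise<Positions>>();
  private layout: LayoutFn;

  constructor(layout: LayoutFn) {
    this.layout = layout;
  }

  // Returns the cached (possibly still pending) layout, or starts a new one.
  get(graph: Graph): Promise<Positions> {
    const key = graphKey(graph);
    let result = this.cache.get(key);
    if (result === undefined) {
      result = this.layout(graph);
      // Drop failed layouts so a later request can retry.
      result.catch(() => {
        this.cache.delete(key);
      });
      this.cache.set(key, result);
    }
    return result;
  }
}
```

Caching the promise instead of the resolved positions means that two requests for the same graph made while a layout is still running share one computation.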

This PR depends on a corresponding PR in the Vega main repo to add typing for the runtime: vega/vega#3237, which it uses to properly type the function that turns the runtime into a graph.

The state management is a bit complex in this code, unfortunately, primarily due to the need to interface with an async layout engine (ELK) and an imperative view layer (Cytoscape). I have gone through a number of different iterations on how to synchronize all of this properly (component state, React's useReducer, Cytoscape state), and have currently settled on moving as much state to Redux as possible, since the application is already using Redux for the rest of its state.
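
One of the synchronization problems with an async layout engine can be sketched as a plain reducer: a layout result may arrive after the user has already selected something else, so stale results need to be ignored. This is only an illustration under assumed names (`LayoutState`, `LayoutAction`, `layoutReducer`, and the action types are all hypothetical), written without Redux Toolkit so it stands alone; it is not the actual slice in this PR.

```typescript
type NodePositions = Record<string, { x: number; y: number }>;

type LayoutState = {
  requestedKey: string | null; // identifies the layout the UI currently wants
  status: 'idle' | 'loading' | 'done';
  positions: NodePositions | null;
};

type LayoutAction =
  | { type: 'layoutRequested'; key: string }
  | { type: 'layoutFinished'; key: string; positions: NodePositions };

const initialLayoutState: LayoutState = {
  requestedKey: null,
  status: 'idle',
  positions: null,
};

function layoutReducer(
  state: LayoutState = initialLayoutState,
  action: LayoutAction
): LayoutState {
  switch (action.type) {
    case 'layoutRequested':
      return { requestedKey: action.key, status: 'loading', positions: null };
    case 'layoutFinished':
      // Ignore results for a layout that is no longer the requested one.
      if (action.key !== state.requestedKey) {
        return state;
      }
      return { ...state, status: 'done', positions: action.positions };
    default:
      return state;
  }
}
```

The imperative view layer then only needs to react to `positions` changing, instead of tracking in-flight layouts itself.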

I tried not to disturb any of the existing application code, but I did create all of the necessary state management code for this viewer in a "feature" subfolder, instead of following the existing pattern in the code base of keeping all reducers in the same file. I did this because this functionality is tightly coupled, and splitting it off into its own folder made it much easier to iterate on and add to gradually. I tried to follow Redux best practices and pulled in the Redux toolkit to help implement those.

Future work

I have found this debugger useful for getting a better grasp of the Vega runtime dataflow through particular examples, but there are a number of areas for follow-up work. Since this PR is already quite large (too large?), I hope that any additional features could be added afterward. A few I have collected are:

Add more nuanced time profiling for each pulse, to understand how much time is spent on each node. We could then size the nodes by time.
For any action that would change the selection, on hover grey out all the nodes that wouldn't be selected. This would apply to the nodes themselves, to pulses, and to types.
Try filtering out axis and legends to reduce graph size
Improve styles of side panel, to make them more consistent and in line with the application
Only record pulses when dataflow panel is open, to reduce memory consumption and CPU usage normally
Move node parameter details and values to side panel from popup
Auto select first pulse when loading, to show those values by default
When selecting a pulse, also show streams that caused that pulse to run
Add details to the graph to show semantics of nested nodes better, by showing what the special parent signal is for and the root node.

TODO

Now if you hover any node in the scene graph tree, the corresponding element will be highlighted.

domoritz · 2021-09-29T21:30:45Z

You can click on a row in the table, and that selects a pulse. I should make it more clear somehow what it's doing! Let me know if you have suggestions.

Oh, I got confused since my chart doesn't have pulses so the table is empty. You should hide the table when it has no rows.

saulshanabrook · 2021-09-30T17:05:13Z

Oh, I got confused since my chart doesn't have pulses so the table is empty. You should hide the table when it has no rows.

Huh, I think it should always have one pulse? The initial one? If you select that one, then at least you can see the initial values when you hover over each node.

saulshanabrook · 2021-09-30T17:12:35Z

Also, I noticed that the cursor switches to a pointer when I hover over the table header. That should not be the case, no?

The current behavior was reversed, so that the header had a pointer and the rows did not. I fixed it so the rows have a pointer and the header does not.

lgtm-com · 2021-09-30T17:16:59Z

This pull request fixes 1 alert when merging 6818e2a into ff322c4 - view on LGTM.com

fixed alerts:

1 for Unused variable, import, function or class

domoritz · 2021-10-01T00:02:34Z

Thank you. Any idea why the CI doesn't run on your fork?

saulshanabrook · 2021-10-01T17:54:01Z

It looks like CI isn't run on this PR either (#591). I just tried editing the GitHub Actions config to run on PRs, and that seems to make it work now.

Also I believe vega/vega#3237 will need to be released before this passes?

I am not sure why I have to add this, but was also getting the same error locally that we are getting on CI about it not being installed

lgtm-com · 2021-10-01T18:03:47Z

This pull request fixes 1 alert when merging 658098e into ff322c4 - view on LGTM.com

fixed alerts:

1 for Unused variable, import, function or class

.github/workflows/test.yml

lgtm-com · 2021-10-02T16:41:53Z

This pull request fixes 1 alert when merging dab3c61 into 8635307 - view on LGTM.com

fixed alerts:

1 for Unused variable, import, function or class

saulshanabrook · 2021-10-02T17:20:55Z

Yeah we are now getting a test failure since the upstream vega typings PR isn't merged:

Error: src/features/dataflow/utils/runtimeToGraph.ts(17,8): error TS2307: Cannot find module 'vega-typings/types/runtime/runtime' or its corresponding type declarations.

domoritz · 2021-10-14T13:41:46Z

Thank you for building out this feature. I'll merge this pull request when we have a typings release.

saulshanabrook · 2021-10-21T16:02:22Z

@domoritz if we wanted to get this in before the typings are merged and released, I could comment out the typings import and alias them to any for now? Thoughts?

domoritz · 2021-10-22T19:20:54Z

I released vega-typings@0.22.1.

saulshanabrook · 2021-10-22T20:37:42Z

@domoritz sweet, thank you! I will update this PR to include that release.

lgtm-com · 2021-10-22T23:56:04Z

This pull request fixes 1 alert when merging 5577257 into 8635307 - view on LGTM.com

fixed alerts:

1 for Unused variable, import, function or class

* refactor: simplify button code by removing editor-button class

* chore(deps-dev): bump postcss from 8.3.6 to 8.3.8 (#1085)

Bumps [postcss](https://github.com/postcss/postcss) from 8.3.6 to 8.3.8.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](postcss/postcss@8.3.6...8.3.8)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump webpack from 5.53.0 to 5.56.0 (#1084)

Bumps [webpack](https://github.com/webpack/webpack) from 5.53.0 to 5.56.0.
- [Release notes](https://github.com/webpack/webpack/releases)
- [Commits](webpack/webpack@v5.53.0...v5.56.0)

---
updated-dependencies:
- dependency-name: webpack
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump monaco-editor-webpack-plugin from 4.1.2 to 4.2.0 (#1082)

Bumps [monaco-editor-webpack-plugin](https://github.com/Microsoft/monaco-editor-webpack-plugin) from 4.1.2 to 4.2.0.
- [Release notes](https://github.com/Microsoft/monaco-editor-webpack-plugin/releases)
- [Commits](microsoft/monaco-editor-webpack-plugin@v4.1.2...v4.2.0)

---
updated-dependencies:
- dependency-name: monaco-editor-webpack-plugin
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump d3-scale from 4.0.1 to 4.0.2 (#1081)

Bumps [d3-scale](https://github.com/d3/d3-scale) from 4.0.1 to 4.0.2.
- [Release notes](https://github.com/d3/d3-scale/releases)
- [Commits](d3/d3-scale@v4.0.1...v4.0.2)

---
updated-dependencies:
- dependency-name: d3-scale
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump webpack-dev-server from 4.2.1 to 4.3.0 (#1079)

Bumps [webpack-dev-server](https://github.com/webpack/webpack-dev-server) from 4.2.1 to 4.3.0.
- [Release notes](https://github.com/webpack/webpack-dev-server/releases)
- [Changelog](https://github.com/webpack/webpack-dev-server/blob/master/CHANGELOG.md)
- [Commits](webpack/webpack-dev-server@v4.2.1...v4.3.0)

---
updated-dependencies:
- dependency-name: webpack-dev-server
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump d3-array from 3.0.4 to 3.1.0 (#1077)

Bumps [d3-array](https://github.com/d3/d3-array) from 3.0.4 to 3.1.0.
- [Release notes](https://github.com/d3/d3-array/releases)
- [Commits](d3/d3-array@v3.0.4...v3.1.0)

---
updated-dependencies:
- dependency-name: d3-array
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump @types/react from 17.0.24 to 17.0.26 (#1076)

Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 17.0.24 to 17.0.26.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump autoprefixer from 10.3.4 to 10.3.6 (#1080)

Bumps [autoprefixer](https://github.com/postcss/autoprefixer) from 10.3.4 to 10.3.6.
- [Release notes](https://github.com/postcss/autoprefixer/releases)
- [Changelog](https://github.com/postcss/autoprefixer/blob/main/CHANGELOG.md)
- [Commits](postcss/autoprefixer@10.3.4...10.3.6)

---
updated-dependencies:
- dependency-name: autoprefixer
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): bump @types/react-select from 4.0.17 to 5.0.1 (#1078)

* chore(deps): bump actions/setup-node from 2.4.0 to 2.4.1 (#1075)

* Add Runtime Dataflow Viewer (#1023)

Co-authored-by: chengluyu <chengluyu@live.cn>
Co-authored-by: JackZ <emailjiong@126.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Saul Shanabrook <s.shanabrook@gmail.com>
Co-authored-by: chengluyu <chengluyu@live.cn>
Co-authored-by: JackZ <emailjiong@126.com>

domoritz · 2021-10-23T00:06:00Z

Thank you @saulshanabrook for the dataflow viewer!

declann · 2021-10-25T13:10:23Z

Great job @saulshanabrook , this is an awesome feature!

chengluyu and others added 30 commits September 20, 2019 10:33

Initial commit from Create React App

e0f9b10

Make linter happy

2ea76c6

Add chart and scene graph display

f672b6f

Add data flow graph display

41dcca9

Fix: remove comment typo

b35c855

Keep react-vega from update, complete scene graph display

f9f0add

Extract graphviz rendering as a component

ff32f1f

Use react-inspector instead of react-json-view

5117aa0

Remove unused function to make compiler happy

874750e

Customize the inspector and add highlight feature

35ce8dd

Now if you hover any node in the scene graph tree, the corresponding element will be highlighted.

Add simple styles to UI

29cd24a

Re-organize files and remove unless stuff

53c9acc

Fix runtime warnings caused by mistypings in inspector"

51f10ed

SVG is parsed to React rather than set by innerHTML

5c22b8e

Extract highlight routines to a class

ea45ab7

Add data flow graph highlight feature

0099386

Hide SVG element in scene graph nodes

7ab5494

Fix transformation of SVGTitleElements

57b6e06

chore(bundler): use webpack instead of react-scripts

9fbad4c

refactor(files): re-organize files and remove service worker code

1072db2

fix(style): use sans-serif as default font-family

4f1412e

style: use default config and run format

ec030db

make subcontext nodes display

74eacef

chore(deps): introduce Tailwind

ac9bf62

refactor(title): change title and navbar brand

3553cd9

feat: add source editor and adjust layout and styles

5eb767f

feat(logo): add favicon and navbar logo

5c4e5d8

feat(ui): add icons to buttons

3d56777

feat(dataflow): support display dot source

0733f9f

feat(dataflow): use react-svg-pan-zoom to display data flow graph

abc5029

saulshanabrook added 2 commits September 30, 2021 10:11

Show data and signals by default

141f293

Fix pointer on sidebar table

6818e2a

Try running action on PR

b8dbde7

Add eslint plugin

658098e

I am not sure why I have to add this, but was also getting the same error locally that we are getting on CI about it not being installed

domoritz requested changes Oct 1, 2021

.github/workflows/test.yml (outdated)

saulshanabrook added 3 commits October 2, 2021 09:29

Run CI only once on internal branch PRs

dd4bf94

Fix lint errors

50a153d

Merge upstream/master into dataflow

dab3c61

Bump to latest typings

5577257

domoritz merged commit c0e5120 into vega:master Oct 23, 2021

saulshanabrook deleted the dataflow branch October 25, 2021 21:12

saulshanabrook mentioned this pull request May 8, 2023

Visualization of EGraph State egraphs-good/egglog#144

Closed

saulshanabrook mentioned this pull request Aug 2, 2024

Eclipse Layout Kernel Layout Support plotly/dash-cytoscape#222

Open
