Initial implementation of DaskCloudProviderEnvironment by joeschmid · Pull Request #2360 · PrefectHQ/prefect · GitHub

Initial implementation of DaskCloudProviderEnvironment #2360

Merged: 49 commits merged into PrefectHQ:master on May 4, 2020

Conversation

@joeschmid commented Apr 20, 2020

The Dask Cloud Provider project aims to make it easy to bring up Dask clusters using services from cloud providers, e.g. AWS Fargate. This Prefect environment uses Dask Cloud Provider to dynamically create a Dask cluster, run the Flow on it, and tear down the cluster after use. This environment is meant to provide an easy path to Dask scalability for users of cloud platforms, like AWS.
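The create, run, tear down lifecycle described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `FakeCluster`, `temporary_cluster`, and `run_on_temporary_cluster` are hypothetical names, and the scheduler address is made up. A real setup would pass a cluster class from Dask Cloud Provider instead of the stand-in.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator, List


class FakeCluster:
    """Hypothetical stand-in for a cloud provider cluster class; not a real Dask API."""

    # Track created clusters so the teardown can be inspected afterwards.
    instances: List["FakeCluster"] = []

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs
        self.running = True
        self.scheduler_address = "tcp://fake-scheduler:8786"  # made-up address
        FakeCluster.instances.append(self)

    def close(self) -> None:
        self.running = False


@contextmanager
def temporary_cluster(
    provider_class: Callable[..., Any], **provider_kwargs: Any
) -> Iterator[Any]:
    """Create a cluster, hand it to the caller, and always close it afterwards."""
    cluster = provider_class(**provider_kwargs)
    try:
        yield cluster
    finally:
        # Runs even if the flow raises, so no cluster is left behind.
        cluster.close()


def run_on_temporary_cluster(
    flow_fn: Callable[[str], Any],
    provider_class: Callable[..., Any],
    **provider_kwargs: Any,
) -> Any:
    """Run flow_fn against a cluster that exists only for this one run."""
    with temporary_cluster(provider_class, **provider_kwargs) as cluster:
        return flow_fn(cluster.scheduler_address)


if __name__ == "__main__":
    message = run_on_temporary_cluster(
        lambda address: f"ran against {address}", FakeCluster, n_workers=2
    )
    print(message)
```

The context manager is one way to guarantee teardown on failure; the actual environment may structure this differently.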

  • adds new tests (if appropriate)
  • updates CHANGELOG.md (if appropriate)
  • updates docstrings for any new functions or function arguments, including docs/outline.toml for API reference docs (if appropriate)

What does this PR change?

This adds a new execution environment, DaskCloudProviderEnvironment.

Why is this PR important?

Prefect offers excellent scalability by running Flows on Dask clusters. However, configuring a truly distributed Dask cluster with multiple workers on separate compute resources can be complicated. Kubernetes can be used for this, but Prefect users who aren't familiar with it (and Dask) may find it complex to configure. Even for experienced users currently using Prefect and Dask on k8s, cloud services like Fargate may make integration with CI/CD pipelines and automated deployment of Flows easier, e.g. Docker image versions can be updated in Fargate task definitions after an automated build completes.

@codecov bot commented Apr 20, 2020

Codecov Report

Merging #2360 into master will decrease coverage by 3.04%.
The diff coverage is 20.00%.

@joshmeek

Thanks for the PR @joeschmid! Will have a look!

@joeschmid (Author)

Thanks @joshmeek! I happened to talk to @lauralorenz and @jlowin this morning on the PyCon contributors video conf and they suggested improving auth along the lines of the recent Better secrets PR from @cicdw. When I get a chance I'll update this to follow those guidelines, but also interested in any other thoughts and feedback. No rush from my side, this is mostly a "nights and weekends" project for me. (Though I can say that we're already using this successfully in our test environment.)

@joshmeek left a comment

Looking really good so far!

@joeschmid changed the title from "[WIP] Initial implementation of DaskCloudProviderEnvironment" to "Initial implementation of DaskCloudProviderEnvironment" on Apr 26, 2020

@joeschmid (Author)

@joshmeek and team, this is finally ready for review. I'd love some feedback, especially on:

  • The on_start() callback. I decided to extend this to take 2 arguments, but it breaks the convention of on_start() taking no arguments. I could name it something else (maybe on_execute()?) and leave on_start alone.
  • The docs. I tried to include a fair bit of detail and code examples, but they could also use an explanation of the AWS requirements. @lauralorenz, it would be great to get your thoughts. Maybe better to address the AWS stuff separately?
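To make the callback question concrete, here is a rough sketch of a two-argument hook under the proposed `on_execute` name. The comment above does not say what the two arguments are, so this sketch assumes they are the flow run parameters and the keyword arguments for the cluster class. `prepare_provider_kwargs`, `scale_with_parameters`, and the `big_run` and `n_workers` keys are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict, Optional

# Assumed shape of the hook: (flow run parameters, provider kwargs) -> None.
OnExecute = Callable[[Dict[str, Any], Dict[str, Any]], None]


def prepare_provider_kwargs(
    parameters: Dict[str, Any],
    provider_kwargs: Dict[str, Any],
    on_execute: Optional[OnExecute] = None,
) -> Dict[str, Any]:
    """Return the kwargs to build the cluster with, after the optional hook has run."""
    # Work on a copy so the environment's defaults are not changed between runs.
    kwargs = dict(provider_kwargs)
    if on_execute is not None:
        # The hook may modify kwargs in place based on the run's parameters.
        on_execute(parameters, kwargs)
    return kwargs


def scale_with_parameters(
    parameters: Dict[str, Any], provider_kwargs: Dict[str, Any]
) -> None:
    """Example hook: request more workers when a (hypothetical) big_run flag is set."""
    if parameters.get("big_run"):
        provider_kwargs["n_workers"] = 10


if __name__ == "__main__":
    defaults = {"n_workers": 2}
    print(prepare_provider_kwargs({"big_run": True}, defaults, scale_with_parameters))
    print(prepare_provider_kwargs({}, defaults))
```

Using a separate name for the two-argument hook would leave the zero-argument `on_start()` convention untouched, which is the trade-off raised in the first bullet.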

@joshmeek left a comment

Small doc nits

joeschmid and others added 6 commits May 4, 2020 09:45
Co-authored-by: Josh Meek <40716964+joshmeek@users.noreply.github.com>
@joshmeek left a comment

🚀

@joshmeek merged commit 3d9058a into PrefectHQ:master on May 4, 2020
zanieb added a commit that referenced this pull request Jul 19, 2022
Change the flow/task `run` method to private