A command-line utility to provision infrastructure for ML workflows
Documentation | Issues | Twitter | Slack
`dstack` is a lightweight command-line utility to provision infrastructure for ML workflows.
- Define your ML workflows declaratively, incl. their dependencies, environment, and required compute resources
- Run workflows via the `dstack` CLI, and have infrastructure provisioned automatically in a configured cloud account
- Save output artifacts, such as data and models, and reuse them in other ML workflows
- Use `dstack` to process data, train models, host apps, and launch dev environments
Use pip to install `dstack` locally:
pip install dstack
The `dstack` CLI needs your AWS account credentials to be configured locally (e.g. in `~/.aws/credentials`, or via the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables).
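For instance, credentials can be supplied through environment variables; the values below are placeholders to substitute with your own keys:

```shell
# Placeholder credentials — replace with your actual AWS access keys.
# These only need to be set in the shell where you invoke dstack.
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
```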
Before you can use the `dstack` CLI, you need to configure it:
dstack config
It will prompt you to select the AWS region where `dstack` will provision compute resources and the S3 bucket where `dstack` will save data.
Region name (eu-west-1):
S3 bucket name (dstack-142421590066-eu-west-1):
Support for GCP and Azure is on the roadmap.
- Install `dstack` locally
- Define ML workflows in `.dstack/workflows.yaml` (within your existing Git repository)
- Run ML workflows via the `dstack run` CLI command
- Use other `dstack` CLI commands to manage runs, artifacts, etc.
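As an illustration, a minimal `.dstack/workflows.yaml` might look like the sketch below. The workflow name, scripts, and paths are hypothetical, and the exact schema (e.g. the `provider`, `artifacts`, and `resources` keys) is an assumption that may differ between `dstack` versions — consult the documentation for the authoritative reference:

```yaml
workflows:
  - name: train               # hypothetical workflow name
    provider: bash
    commands:
      - pip install -r requirements.txt
      - python train.py       # hypothetical training script
    artifacts:
      - path: ./checkpoints   # saved after the run and reusable by other workflows
    resources:
      gpu: 1                  # request one GPU for this workflow
```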
When you run an ML workflow via the `dstack` CLI, it provisions the required compute resources (in a configured cloud account), sets up the environment (such as Python, Conda, CUDA, etc.), fetches your code, downloads dependencies, saves artifacts, and tears down the compute resources.