TensorPool is the easiest way to use GPUs, at a fraction of the cost of traditional cloud providers.
- Zero Infra Setup: No GCP, no AWS, no Docker, no Kubernetes, no cloud configuration, no cloud accounts required
- Optimization Priorities: optimize your job for price or time; we'll find the best GPU for it
- \>50% cheaper than traditional cloud providers: TensorPool aggregates GPUs from multiple cloud providers to get you the best price possible. We also use spot node recovery technology to reliably run all jobs on spot instances
- Create an account at tensorpool.dev
- Get your API key from the dashboard
- Install the CLI:

```bash
pip install tensorpool
```
TensorPool offers two ways to get started:
- Manual Configuration: `tp config new`
- Autogenerated Configuration: `tp config train my model for 100 epochs on a T4`
Both approaches generate a `tp.config.toml`, which you can review and modify before running your job.
Adjust the `tp.config.toml` to match your project. For example, a simple Python project would look like:
```toml
commands = [
    "pip install -r requirements.txt",
    "python main.py --epochs 100",
]
optimization_priority = "PRICE"
gpu = "T4"
```
See Configuration for more details.
Now run your job!
```bash
tp run tp.config.toml
```
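The example config above runs `python main.py --epochs 100`. A minimal hypothetical `main.py` that fits this setup might look like the following; the `--epochs` flag and the `outputs/` directory are illustrative choices, not TensorPool requirements:

```python
# main.py - hypothetical training entry point matching the example config.
import argparse
from pathlib import Path

def main(argv=None):
    parser = argparse.ArgumentParser(description="Toy training job")
    parser.add_argument("--epochs", type=int, default=10)
    args = parser.parse_args(argv)

    # Stand-in for a real training loop.
    losses = [1.0 / (epoch + 1) for epoch in range(args.epochs)]

    # Save results inside the project directory so the job returns them.
    out_dir = Path("outputs")
    out_dir.mkdir(exist_ok=True)
    (out_dir / "losses.txt").write_text("\n".join(f"{l:.4f}" for l in losses))

if __name__ == "__main__":
    main()
```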
More examples can be found in tensorpool/examples
`tp run`
- Execute a job

`tp init`
- Create a new empty TensorPool configuration file

`tp init <prompt>`
- Autogenerate a TensorPool configuration file from a natural language prompt

`tp listen <job_id> [--pull] [--overwrite]`
- Attach to a running job to see its output
- You can leave & reattach at any time!
- Use `--pull` to download all outputs/changed files
- Use `--overwrite` to force replace existing files on pull

`tp pull <job_id> [files...] [--overwrite]`
- Download outputs/changed files from your job
- Optionally provide a list of files to download
- Use `--overwrite` to force replace existing files

`tp cancel <job_id(s)>`
- Cancel running job(s)

`tp dashboard`
- View all your jobs and their outputs
The heart of TensorPool is the `tp.config.toml`, which defines your job. This file is generated by the `tp config` command and can be executed by the `tp run` command.
Here's a complete list of all fields supported in the `tp.config.toml`:
```toml
# List of commands to run, as if you were starting from a fresh virtual environment
commands = [
    # For example:
    # "pip install -r requirements.txt",
    # "python main.py",
]

# The optimization priority for the job
optimization_priority = "PRICE" # Either "PRICE" or "TIME"

# List of files to ignore sending with your project
ignore = [
    # For example:
    # ".git",
    # ".DS_Store",
]

# The GPU you'd like to use
gpu = "auto" # Either "auto", "P4", "P100", "V100", "T4", "L4", "A100", or "A100-80GB"
# Defaults to "auto", where TensorPool will select the best GPU based on your optimization priority

# Instance specification (optional)
# gpu_count = n # Number of GPUs to use
# vcpus = x     # Number of vCPUs to use
# memory = y    # Amount of memory in GB to use
# See https://github.com/tensorpool/tensorpool/blob/main/docs/instances.md for all supported configurations
```
For all supported `gpu_count`, `vcpus`, and `memory` configurations, see docs/instances.
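As a sketch, a config that pins the instance shape explicitly might look like the following. The specific counts are illustrative, not a verified valid combination; check docs/instances for supported configurations.

```toml
commands = [
    "pip install -r requirements.txt",
    "python train.py",
]
optimization_priority = "TIME"
gpu = "A100"
gpu_count = 2 # illustrative values; must match a supported configuration
vcpus = 24
memory = 96   # in GB
```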
The beauty of the `tp.config.toml` is its simplicity and flexibility. This allows you to:
- Run your job with a single command
- Kick off several experiments with different configurations
- Version control it with your code
- Reuse it for future runs
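Because the config is plain TOML, kicking off several experiments can be as simple as generating one config file per variant. A minimal sketch, where the field names follow the example config above and the learning-rate sweep values are arbitrary illustrations:

```python
# Generate one TensorPool config per learning rate for a hyperparameter sweep.
# The config fields mirror the example tp.config.toml; the --lr flag and the
# sweep values are hypothetical choices for this sketch.
from pathlib import Path

CONFIG_TEMPLATE = """\
commands = [
    "pip install -r requirements.txt",
    "python main.py --epochs 100 --lr {lr}",
]
optimization_priority = "PRICE"
gpu = "T4"
"""

def write_configs(learning_rates, out_dir="experiments"):
    """Write one config file per learning rate; return the created paths."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    paths = []
    for lr in learning_rates:
        path = out / f"tp.config.lr{lr}.toml"
        path.write_text(CONFIG_TEMPLATE.format(lr=lr))
        paths.append(path)
    return paths
```

Each generated file can then be launched separately, e.g. `tp run experiments/tp.config.lr0.01.toml`.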
What does `optimization_priority` mean?
`optimization_priority = "PRICE"` means that TensorPool will execute your job for the lowest price possible.
This doesn't always mean the cheapest GPU, but the best value (typically $/performance) GPU for your job.
`optimization_priority = "TIME"` means that TensorPool will search for the fastest instance types (best GPU) across all cloud providers.
TensorPool uses heuristics to find the best GPU for your job based on the optimization priority you set.
What cloud providers are supported?
Currently GCP, AWS, Azure, and Runpod are supported. More cloud providers are coming soon!
- Run commands like you're starting from scratch
- TensorPool executes your job in a fresh environment - include all setup commands like installing dependencies, downloading data, etc
- Save your outputs
- Always save your model weights and outputs to disk; you'll get them back at the end of the job!
- Don't save files outside of your project directory, or you won't be able to get them back
- Download datasets and big files within your script
- All TensorPool machines are equipped with 10+Gb/s networking, so large files can be downloaded faster if done within your script
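For example, a small standard-library helper that streams a download to disk from inside your script, so the transfer happens over the GPU machine's fast network link. The URL, destination path, and chunk size are placeholders:

```python
# Stream a (potentially large) file to disk in fixed-size chunks, so the
# download runs on the job machine itself rather than your laptop.
import urllib.request
from pathlib import Path

def download(url: str, dest: str, chunk_size: int = 1 << 20) -> Path:
    """Download `url` to `dest`, streaming in `chunk_size`-byte chunks."""
    dest_path = Path(dest)
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as f:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)
    return dest_path

# e.g. download("https://example.com/dataset.tar.gz", "data/dataset.tar.gz")
```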
- Run from the root of your project
- TensorPool will send your project directory to the cloud, so make sure you're in the right directory
- Don't run from your home directory or a subdirectory!
- Don't require stdin
- TensorPool runs your job in the background, so don't require any input from stdin (like `input()` in Python)
- If you need to pass arguments to your script, use command line arguments or environment variables
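For instance, a prompt like `input("epochs? ")` would hang a background job; the same value can come from a flag with an environment-variable fallback. A minimal sketch, where the flag and variable names are illustrative:

```python
# Read settings from CLI flags or environment variables instead of stdin,
# since a background job has no terminal to answer input() prompts.
# The --epochs flag and EPOCHS variable are hypothetical names.
import argparse
import os

def parse_settings(argv=None):
    parser = argparse.ArgumentParser()
    # Flag wins; the EPOCHS environment variable is the fallback.
    parser.add_argument("--epochs", type=int,
                        default=int(os.environ.get("EPOCHS", "10")))
    return parser.parse_args(argv)

# In the job's commands list this would be driven by e.g.:
#   "python main.py --epochs 25"   or   "EPOCHS=25 python main.py"
```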
- Simplicity: Use GPUs without the hassle of cloud setup, machine configuration, or quotas
- Flexibility: Change your job configuration on the fly, run multiple experiments, and version control your job configurations
- Cost Effective: By aggregating excess GPU capacity from multiple cloud providers, TensorPool offers GPUs at a fraction of the cost
Get started today at tensorpool.dev!