# rclone-airflow

Rclone + Airflow for scheduled, easily configurable backups & syncs.

## Getting Started
- Have Docker & docker-compose installed
- Clone this repository:

  ```shell
  git clone https://github.com/chenseanxy/rclone-airflow.git && cd rclone-airflow
  ```
- Rclone config: either
  - Copy your current config file `~/.config/rclone/rclone.conf` to `conf/rclone`, or
  - Generate a new one with:

    ```shell
    docker run --rm -it --user $UID:$GID --volume $PWD/conf/rclone:/config/rclone rclone/rclone config
    ```
- Jobs config: See "Configuration"
- Docker-compose:
  - Choose a compose file:
    - Use the prebuilt image with `docker-compose.yml`, or
    - Use a locally built image with `docker-compose.local.yml`, add your own DAGs & plugins, and build with `AIRFLOW_UID=$UID docker-compose -f docker-compose.local.yml build`
  - Add your data volumes in `x-rclone-conf.volumes`, preferably using the `:ro` flag if you're just using this for backups (see the sketch after this list)
  - Change `TZ` to your local timezone, using the TZ database name format
- Start the service stack:
  - Prebuilt image: `AIRFLOW_UID=$UID docker-compose up -d`
  - Local image: `AIRFLOW_UID=$UID docker-compose -f docker-compose.local.yml up -d`
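As referenced above, data volumes go under the `x-rclone-conf` block. A minimal sketch of what that might look like, assuming `x-rclone-conf` is a YAML extension block in the compose file; the host paths, anchor name, and `TZ` placement here are illustrative and may differ from the repo's actual compose file:

```yaml
# docker-compose.yml (sketch): mount data for backup and set the timezone
x-rclone-conf: &rclone-conf
  volumes:
    - ./conf/rclone:/config/rclone       # rclone config from the step above
    - /mnt/storage/isos:/data/isos:ro    # your data; :ro suffices for backups
  environment:
    TZ: Europe/Helsinki                  # TZ database name format
```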
## Configuration

Jobs config: `/conf/jobs.yml`

Config is automatically refreshed when Airflow refreshes DAGs (default: every 60 seconds).
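If you want a different refresh cadence, Airflow's file-parse interval can be tuned through its standard environment-variable config. A sketch, assuming an `environment` block shared by the Airflow services (the value is in seconds, and which scheduler knob governs the 60-second default here is an assumption):

```yaml
# Compose environment sketch: how often the scheduler re-parses DAG files
AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: "60"
```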
Example `jobs.yml`:

```yaml
isos:                        # Job name
  cron: 0 0 * * *            # Schedule; see crontab.guru, required
  source: /data/isos         # Source location, required
  target: remote:isos        # Target location, required
  # See backup-dir in the rclone docs; the value here is a prefix.
  # The actual backup-dir will be `<value>/<date-time>`,
  # like `remote:backup/isos/2021-10-24-11-45`.
  # Not required; leave out to disable backup-dir.
  backup: remote:backup/isos
```
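For intuition, a job like the one above maps to roughly this rclone invocation (a sketch, not necessarily the exact command the generated DAG issues; the timestamp suffix is illustrative):

```shell
rclone sync /data/isos remote:isos \
  --backup-dir remote:backup/isos/2021-10-24-11-45
```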
Each job will have one DAG (workflow) generated; you'll have to manually enable each one in the Airflow UI.

- Alternatively, you could set `AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION` to `false` in the compose file, to automatically enable these DAGs upon creation.
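A sketch of where that setting might live, assuming the compose file shares an Airflow `environment` block the way the stock Airflow compose files do (the `x-airflow-common` name is an assumption):

```yaml
# docker-compose.yml (sketch): auto-enable newly generated DAGs
x-airflow-common:
  environment:
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: "false"
```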
Each configured job will only have one instance running at a time; later instances will queue behind running ones.
## Monitoring

TODO: via rclone's Prometheus metrics endpoint.
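rclone can expose Prometheus metrics when `rclone rcd` is started with `--rc-enable-metrics`; once that's wired up, a scrape config might look like this (a sketch: the `rclone:5572` target assumes the default rc port and a service named `rclone`):

```yaml
# prometheus.yml (sketch): scrape rclone's /metrics endpoint
scrape_configs:
  - job_name: rclone
    static_configs:
      - targets: ["rclone:5572"]
```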
## How it works

Runs `rclone rcd` as the rclone server, with Airflow + rclonerc controlling rclone over its HTTP API.
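To illustrate the control path, the rc API can be driven with plain HTTP. A sketch of the API shape, not necessarily the exact calls rclonerc makes; it assumes the server is reachable as `rclone:5572` and that auth is handled per your rcd flags:

```shell
# Trigger a sync through rclone's remote-control API
curl -s -X POST http://rclone:5572/sync/sync \
  -H 'Content-Type: application/json' \
  -d '{"srcFs": "/data/isos", "dstFs": "remote:isos"}'
```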
Versions:

- Rclone: pinned to 1.56.2 via the compose file, though upgrading shouldn't pose many problems
- Airflow: pinned to 2.2.0; upgrading might need further changes to the compose file
- rclonerc: installed via the Dockerfile, not pinned
## Contributing

This is a POC at the moment, and all issues & pull requests are welcome!

Current ideas:

- Configurable success & error hooks, for notifications, etc.
- Allowing generic rclone flags, filters, options, etc.
- A runnable DAG for a global bandwidth limit, etc.
Dev environment: use `docker-compose -f docker-compose.local.yml up -d` & `pipenv install --dev`