
restic repo initialised twice when starting up docker with both backup/prune #48


Closed
sumnerboy12 opened this issue Jun 18, 2020 · 31 comments

@sumnerboy12

I noticed that when I started my docker swarm stack, which included both backup and prune, it attempted to (and succeeded in) initialising the same repo twice.

This put the repo in a bad state, and any attempt to run a backup/prune resulted in:

Fatal: config cannot be loaded: ciphertext verification failed

Could we only attempt to initialise during backup? And in prune, could we check whether the repo is initialised and, if not, exit early (since there is clearly nothing to do)?
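A minimal sketch of what that early exit could look like in the prune container's startup, assuming restic's repository and credential environment variables are already set (this is not the actual resticker entrypoint):

```bash
# Bail out early if the repository has not been initialised yet; `restic cat
# config` fails when no readable repo config exists at the destination.
if ! restic cat config >/dev/null 2>&1; then
    echo "Repository not initialised yet - nothing to prune." >&2
    exit 0
fi

restic prune
```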

@ThomDietrich
Contributor
ThomDietrich commented Jun 18, 2020

How is this even possible?

Initialization happens outside of the backup or prune code, inside the entrypoint - could this error be related to another aspect of your setup?

Also, whatever led to the issue might be resolved after PR #47 is merged. Would you be able to test the new logic with your setup?

(Btw. I recognize your profile pic from some years ago, not sure from where 😄)

@sumnerboy12
Author

I have a docker stack with two services, backup and prune (i.e. one with BACKUP_CRON and one with PRUNE_CRON). When I deployed the stack they both started up simultaneously on the same node against an uninitialised repo (on Backblaze).

They both managed to initialise the repo at the same time and put it into an invalid state. This has been reported here.
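For illustration, the race essentially boils down to two containers initialising the same, still empty backend at the same time, roughly like this (repository URL and password are placeholders):

```bash
# Two containers starting at the same time each end up initialising the same,
# still empty repository (placeholder values).
export RESTIC_REPOSITORY="b2:example-bucket:example-host"
export RESTIC_PASSWORD="example"

restic init &   # backup container starting up
restic init &   # prune container starting up
wait

# Afterwards the repository can contain conflicting config/key data, and any
# later access fails with the ciphertext verification error quoted above.
restic snapshots
```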

(BTW - I think I recognise your handle also - openhab perhaps?!)

@ThomDietrich
Contributor
ThomDietrich commented Jun 20, 2020

Hey!

Which part of the issue behind your link talks about initialization? The main thing I get from the issue is that "this kind of issue" is normally linked to the storage rather than to restic. Can you verify this happens with the new logic and with simultaneous startup?
Regarding simultaneous: Why did you set it up like this? Wouldn't it then be better to add --prune to the forget command?

Let's first clarify that this is indeed a real issue; if so, we need to discuss whether initialization should only happen with BACKUP_CRON.

Right. openHAB, of course! :)

@sumnerboy12
Author
sumnerboy12 commented Jun 21, 2020

Just the comment that talked about that error being caused by the repo being initialised twice (somehow). When I got this error I was searching for relevant posts and came across that one. When checking the docker logs for my djmaze/resticker services, I noticed that both the backup and prune services had started up simultaneously and that both had log output saying they had initialised the repo.

The next time either of them attempted to do anything, or I tried to access the repo manually (via restic snapshots), I got the ciphertext verification error.

I was following the example in https://github.com/djmaze/resticker/blob/master/docker-swarm.example.yml.

@zoispag
zoispag commented Jun 21, 2020

Hey!

Which part of the issue behind your link talks about initialization? The main thing I get from the issue is that "this kind of issue" is normally linked to the storage rather than to restic. Can you verify this happens with the new logic and with simultaneous startup?

Regarding simultaneous: Why did you set it up like this? Wouldn't it then be better to add --prune to the forget command?

Let's first clarify that this is indeed a real issue; if so, we need to discuss whether initialization should only happen with BACKUP_CRON.

Right. openHAB, of course! :)

@ThomDietrich the --prune option is not realistic for big repos with multiple hosts running backups, since prune is a blocking action. That’s why @djmaze created the second container for pruning in #36, to run it on a different interval.

Two different containers (backup & prune) starting at the same time would indeed cause the repo to be initialized twice. When I tried it, all my repos were already initialized.
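For reference, the single-command alternative discussed above combines both steps into one blocking operation that takes an exclusive lock on the repository (the retention flags here are just an example):

```bash
# Forget old snapshots and prune unreferenced data in one go; while prune
# runs, the repository is locked exclusively and other hosts' backups wait.
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
```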

@ThomDietrich
Contributor
ThomDietrich commented Jun 21, 2020

@zoispag that makes sense. Thanks.
Got it. I guess we can safely assume that a repo will already be initialized in the prune use case of the image. Shall we separate the check in #47 then? Check whether the repository is reachable and initialized, but only initialize in the backup case? @djmaze

Also: in the example yml files the containers start with a 30 min difference. For the purpose of an example I would suggest making it 12 hours.
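For example, the two schedules could simply be set half a day apart (illustrative values for the existing BACKUP_CRON / PRUNE_CRON variables):

```bash
# Illustrative cron schedules for the two services, 12 hours apart.
BACKUP_CRON="0 2 * * *"    # run backups daily at 02:00
PRUNE_CRON="0 14 * * *"    # run forget/prune daily at 14:00
```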

@zoispag
zoispag commented Jun 21, 2020

@zoispag that makes sense. Thanks.

Got it. I guess we can safely assume that a repo will already be initialized in the prune use case of the image. Shall we separate the check in #47 then? Check whether the repository is reachable and initialized, but only initialize in the backup case? @djmaze

Also: in the example yml files the containers start with a 30 min difference. For the purpose of an example I would suggest making it 12 hours.

Pruning will start with a 30 min difference. The containers will start at the same time if they are in the same docker-compose file.

@ThomDietrich
Contributor

Of course. Obviously I was talking about the cron job definition :)

@djmaze
Owner
djmaze commented Jun 21, 2020

Check whether the repository is reachable and initialized, but only initialize in the backup case? @djmaze

That sounds reasonable. Let's do it.

@djmaze
Owner
djmaze commented Jun 21, 2020

I just realized that, at least in a swarm deployment, multiple initializations can also occur when the backup service is deployed to multiple machines. So just skipping the initialization in the prune service does not completely alleviate this problem.

In my Docker swarm deployments, I sometimes make use of separate "one-shot" initialization services. They are just started once and then exit. On swarm, this is possible using a restart policy with condition: on-failure. When using Docker compose, it should be possible to use restart: on-failure instead.

So we could have a separate init service which is just used for initializing the repository on first use. Maybe this would be the way to go. What do you think?
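A rough sketch of such a one-shot init service on swarm, using the official restic image as a stand-in (the service name, secret and repository URL are illustrative assumptions):

```bash
# One-shot init service: with --restart-condition on-failure the task is
# retried until it exits successfully and is then never restarted again.
# The shell wrapper keeps init idempotent if the repository already exists.
docker service create \
  --name backup_init \
  --restart-condition on-failure \
  --env RESTIC_REPOSITORY="b2:example-bucket:example-host" \
  --env RESTIC_PASSWORD_FILE=/run/secrets/restic_password \
  --secret restic_password \
  --entrypoint sh \
  restic/restic -c 'restic cat config || restic init'
```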

@djmaze
Owner
djmaze commented Jun 21, 2020

Of course, when using Docker compose, it would be enough to add an init command for the entrypoint so you would just need to call something like docker-compose run backup init once.
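With such an init command, first-time setup would become a single explicit step (service name backup as in the example compose file):

```bash
# One-off initialisation of the repository; the temporary container is
# removed again afterwards.
docker-compose run --rm backup init
```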

@ThomDietrich
Contributor

Before we discuss this any further: Is there any reason why restic should even allow this? Instead of building a workaround here maybe we can help come up with an improvement in restic itself!?

@ThomDietrich
Contributor

Also: in the example yml files the containers start with a 30 min difference. For the purpose of an example I would suggest making it 12 hours.

Moved to #49

@djmaze
Owner
djmaze commented Jun 21, 2020

Before we discuss this any further: Is there any reason why restic should even allow this? Instead of building a workaround here maybe we can help come up with an improvement in restic itself!?

Would be okay for me. Then we should at least give a hint about this in the documentation (as you thankfully already began in #49).

@sumnerboy12
Author

What about removing the auto-initialisation and requiring the user to do this step manually (with suitable documentation/instructions)?

@zoispag
zoispag commented Jun 22, 2020

What about removing the auto-initialisation and requiring the user to do this step manually (with suitable documentation/instructions)?

I don’t think that’s a good idea. I do rely on resticker initializing the repos for me.

On the other hand, we could add a short delay to the prune container on startup, so it waits a minute before attempting initialization.

@djmaze
Owner
djmaze commented Jun 22, 2020

I don't really like either idea. Initializing the repo manually yourself means you also have to set up all the SSH keys, credentials etc. locally, which is prone to errors. The 1-minute wait, on the other hand, would be a rather ugly workaround and does not protect against deploying multiple backup containers simultaneously.

Ideally, this should be solved on the restic side. Or, as a more short-term solution, we could have a separate init command as proposed before.

@ThomDietrich
Contributor
ThomDietrich commented Jun 22, 2020

I like the carved-out init command (which is, process-wise, what @sumnerboy12 said) and am not a fan of the wait solution because of the unclear new issues it could introduce.

Let's talk init. In my opinion it is better anyway to leave this one-time command to an intentional action by the user - even though I can't currently think of a scenario in which auto-initialization might lead to unexpected behavior or a security risk.

To summarize the solution:

  1. In backup and prune "mode" the entrypoint script would check the connection to the repository and exit with an error if it is not accessible or not initialized.
  2. Initialization has to be executed manually with e.g. docker-compose run backup init
  3. An optional environment variable (default off) allows the user to re-enable current auto-initialize behavior

In other words: The solution is a PR that is 10% code and 90% docu changes.

What do you think?
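A minimal sketch of that entrypoint logic (the AUTO_INIT variable name and the messages are illustrative assumptions, and the backup/prune distinction is left out for brevity):

```bash
# Check that the repository is reachable and initialised; only fall back to
# automatic initialisation if the user explicitly opted in via AUTO_INIT.
if restic cat config >/dev/null 2>&1; then
    echo "Repository is reachable and initialised."
elif [ "${AUTO_INIT:-false}" = "true" ]; then
    echo "Repository not initialised and AUTO_INIT is set - initialising."
    restic init
else
    echo "Repository is not accessible or not initialised." >&2
    echo "Initialise it manually, e.g.: docker-compose run backup init" >&2
    exit 1
fi
```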

@zoispag
zoispag commented Jun 22, 2020

If init is removed from both the backup and prune containers, I would definitely need an ENV variable to turn auto-init on. I am using this as part of my automation.

On the other hand, does it make sense to init a repo to do prune? It should already exist. We can remove init from prune altogether. (This still does not solve the issue of multiple backup containers on swarm)

@stale
stale bot commented Aug 21, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Aug 21, 2020
@ThomDietrich
Contributor

Not stale, I just didn't have the time to work on it yet.

@stale stale bot removed the wontfix label Aug 21, 2020
@stale
stale bot commented Oct 21, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Oct 21, 2020
@stale stale bot closed this as completed Oct 28, 2020
@camo-f
camo-f commented Nov 3, 2020

Hello,

init is still an issue; I got the previously reported error with the current docker-compose example.

I think what @ThomDietrich summed up here is a good approach.

To summarize the solution:

  1. In backup and prune "mode" the entrypoint script would check the connection to the repository and exit with an error if it is not accessible or not initialized.

  2. Initialization has to be executed manually with e.g. docker-compose run backup init

  3. An optional environment variable (default off) allows the user to re-enable current auto-initialize behavior

I also agree with @zoispag that init shouldn't be done when pruning, as the repo should already be initialised. So the optional env variable would only have an effect in backup containers.

For Docker Swarm, @djmaze mentioned an initialisation service, which seems to me the cleanest way to handle multiple backup services. It could also be used with Docker Compose to keep everything in a single file, which can be better for a quickstart.

@ThomDietrich Have you already worked on this issue? It doesn't seem too complex to me, so I can help if needed.

@ThomDietrich
Contributor
ThomDietrich commented Nov 3, 2020

Triggered by a linked project, I will most probably take some time to work on this issue this week. I will respond to your thoughts and suggestions later. Thanks for the push!

@djmaze could you please reopen the issue? Thanks

@varac
varac commented Feb 14, 2021

Yes, please reopen this bug - I hit it every time I want to init a repo for a new host where both backup and prune containers are initializing at the same time.

@djmaze
Owner
djmaze commented Feb 14, 2021

Oh, well, seems I overlooked the previous comment. Sorry

@djmaze djmaze reopened this Feb 14, 2021
@stale stale bot removed the wontfix label Feb 14, 2021
@varac
varac commented Apr 7, 2021

Any news on this? Right now, every new host added to our backup needs manual intervention; it would be great to have this fixed.

@djmaze
Owner
djmaze commented Apr 18, 2021

Sorry for the late answer. If no one else is on this, I will try to find some time to implement it in the upcoming days.

@stale
stale bot commented Jun 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jun 18, 2021
@stale stale bot closed this as completed Jun 26, 2021
@MRezaNasirloo

I faced this issue while trying to set up restic on a new host with an existing repo; now my repo is broken and not usable anymore:

Fatal: config or key 6a88fbbb5b0fa776b220d4cff42dd5716a3f691b550ffbec5a5fb15f15b2153c is damaged: ciphertext verification failed. Try again

@pquerner
pquerner commented Jun 3, 2023

I just had the same (running 1.7.0) and temporarily added a profiles entry to the prune and check services, as described here.

After starting (docker compose up ...), only the backup service ran; it found no repository and therefore issued the init command on restic.
Then I stopped the container, removed the profiles config from the .yml file and started it again - this time it did not create multiple keys (etc.) and it ran flawlessly.
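A roughly equivalent sequence without editing the compose file, assuming the service names from the example file: start only the backup service first so a single container performs the initial repository setup, then bring up the rest of the stack.

```bash
# Start only the backup service (plus its dependencies) so exactly one
# container initialises the repository, then start everything else.
docker compose up -d backup
docker compose up -d
```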

Maybe this needs an init service that all other services wait for?
