proposal: Fetchers refactor #2964
base: master
Conversation
This proposal starts from the distribution concept in #2953 to provide a fetchers refactor adapted to that concept. Additionally it provides a complete workflow and ideas for transport level caching to fit different needs:

* Distribution handlers
* Transport plugin handler
* Transport plugins
* Transport level caching
* Complete separation between image store and transport level caching
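To make the intended separation concrete, here is a minimal sketch of the two concepts listed above, assuming hypothetical interface names (`DistributionHandler`, `TransportPlugin`); this is not the actual rkt API, just an illustration of where the boundary would sit.

```go
package fetchers

import (
	"io"
	"net/url"
)

// TransportPlugin performs the raw network request for a concrete URL
// (e.g. https, file). It knows nothing about image formats or the image
// store, so it can sit behind the transport plugin handler and use a
// transport-level cache.
type TransportPlugin interface {
	Fetch(u *url.URL) (io.ReadCloser, error)
}

// DistributionHandler resolves a user-provided image reference (an appc
// name, a docker reference, an archive URL) into blobs and imports them
// into the image store, delegating all network I/O to transport plugins.
type DistributionHandler interface {
	// CanHandle reports whether this handler understands the given reference.
	CanHandle(ref string) bool
	// Fetch resolves the reference, fetches its blobs via transport plugins
	// and imports them into the image store, returning the image ID.
	Fetch(ref string) (imageID string, err error)
}
```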
Force-pushed from 6d45a2e to 239fdff
After seeing this I'm pretty sure we're on the exact same page; I just may not have articulated it clearly enough, and you seem to have done the opposite. Also, by actually providing definitions for "distribution" and "transport" I can better see the separation you want. :)

If you ignore the docker portion of my plugins proposal, what I've implemented is pretty close to this; it'll just need some refactoring to have a clearer distinction between the distribution handlers and the transport handlers. When the fetch function gets called it'll either A: directly hand it off to the transport handler (which uses plugins to perform the actual network requests) or B: hand it to a distribution handler (only AppC discovery atm) which will do some distribution logic to produce concrete URLs (AppC discovery) and then use the transport handlers to fetch them.

A big obvious difference between this and my current PR is that my PR is geared for the old store, so I'll need to do some work to make my suggestion adhere to this.

Also a question: transport handlers cover things like … For what it's worth, my answer in my fetching proposal is that the distribution handlers aren't called in this case.
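A self-contained sketch of the A/B dispatch described above; all names here (`Transport`, `Distribution`, `fetchImage`) are illustrative placeholders, not the actual code in either PR.

```go
package fetchers

import (
	"fmt"
	"net/url"
	"strings"
)

// Transport performs the actual network request for a concrete URL.
type Transport interface {
	Fetch(u *url.URL) error
}

// Distribution turns an image reference into concrete URLs, e.g. via appc
// discovery, which can then be handed to a Transport.
type Distribution interface {
	Resolve(ref string) ([]*url.URL, error)
}

func fetchImage(ref string, t Transport, d Distribution) error {
	// Case A: the reference is already a concrete URL, so hand it straight
	// to the transport handler.
	if strings.HasPrefix(ref, "http://") || strings.HasPrefix(ref, "https://") {
		u, err := url.Parse(ref)
		if err != nil {
			return err
		}
		return t.Fetch(u)
	}
	// Case B: let a distribution handler do its distribution logic first
	// (e.g. appc discovery) to produce concrete URLs, then fetch each one
	// through the transport handler.
	urls, err := d.Resolve(ref)
	if err != nil {
		return fmt.Errorf("resolving %q: %v", ref, err)
	}
	for _, u := range urls {
		if err := t.Fetch(u); err != nil {
			return err
		}
	}
	return nil
}
```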
Oh, never mind that question; I just reread and noticed the archive distribution. As an addition to your comment on how the AppC distribution will internally use the archive distribution, I expect it to be the same if we implement an OCI distribution, since it could use the logic in the archive distribution to fetch the blobs it needs over https.
Does this mean that the archive distribution is the only distribution that ever calls out to the transport handlers?
## Distribution Handlers plugins
With this approach the distribution handlers will live in the stage0. If there's a need to make them plugins, the main tricky point is that they need to talk with an image store, so it'll be difficult to code them in a language other than Go.
What I wanted in my plugins proposal was to enable support for alternate transports, so I'm fine keeping this logic inside the stage0.
Aaaaand never mind that question too. Now, after reading more, I see that the docker/OCI distribution may call them directly. At first my opinion was that distributions should always call the transport handler directly, as otherwise things seemed a little murkier to me (for AppC, which distribution would do the image import?), but after dwelling on it a bit I think the way you outlined it here is best. After all, the AppC distribution is exactly the same as the archive distribution, just with 1-2 additional steps it needs to do first (perform discovery, maybe fetch/trust public keys). I'd be happy to start refactoring my plugins PR to align with this, once the others here have had a little more time for feedback on this.
GC is the part of this proposal that I like the least. I'm not sure if I'm missing any details, but it seems avoidable. I guess we can augment the current CAS with a cacheManifest linked to an already downloaded blob, containing all distribution entries it has been fetched/discovered from; on removal, the cacheManifest is inspected and all cache entries evicted. This is already backward compatible, as a blob in CAS without a cacheManifest has no cache entries to be purged. Am I missing some caveats?
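For concreteness, one possible shape for such a cacheManifest; the struct and field names below are hypothetical, not existing rkt types.

```go
package store

// cacheManifest would be stored in the CAS next to a downloaded blob and
// would list every distribution entry the blob has been fetched/discovered
// from, so that removing the blob can also evict the matching cache entries.
// A blob without a cacheManifest simply has no cache entries to purge,
// which keeps the scheme backward compatible.
type cacheManifest struct {
	// BlobDigest identifies the CAS blob this manifest belongs to.
	BlobDigest string `json:"blobDigest"`
	// Entries lists the distribution/transport cache entries to evict on
	// removal (e.g. discovery results, HTTP validators such as ETags).
	Entries []cacheEntry `json:"entries"`
}

type cacheEntry struct {
	DistributionURI string `json:"distributionURI"`
	ETag            string `json:"etag,omitempty"`
}
```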
@lucab I see that in the case of transport caching, GC could be overkill. I'll try to explain my thoughts:
Anyway, I don't think transport cache GC is a big issue; it could also just remove entries older than N days without reading cache data, blob digests, etc. BTW, there are other places where a GC or a cleanup will be useful/required (to avoid complex logic). I will open an issue to clarify this and find how to expose/run GC functions.
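A rough sketch of the "remove entries older than N days" idea, assuming a hypothetical on-disk cache directory with one file per entry; it only looks at modification times and never reads cache data or blob digests.

```go
package cache

import (
	"os"
	"path/filepath"
	"time"
)

// gcTransportCache removes every cache entry in cacheDir older than maxAge,
// e.g. gcTransportCache(dir, 30*24*time.Hour) for a 30-day policy.
func gcTransportCache(cacheDir string, maxAge time.Duration) error {
	cutoff := time.Now().Add(-maxAge)
	return filepath.Walk(cacheDir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		if info.ModTime().Before(cutoff) {
			return os.Remove(path)
		}
		return nil
	})
}
```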
The oci-image-layout spec states that this layout should be a basis for distributing OCI images via an archive, http, or other means. Thinking in distribution terms, I tend to see different distribution types:
The docker registry is not related to the oci image layout but is just another distribution type that can be used to fetch oci images. Thoughts?
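Purely as an illustration, the distribution types mentioned so far in this thread could be enumerated like this; the names are placeholders, not an agreed-upon list.

```go
package distribution

// DistributionType identifies how an image can be obtained.
type DistributionType string

const (
	DistAppc           DistributionType = "appc"             // appc discovery
	DistArchive        DistributionType = "archive"          // direct image archive URL
	DistDockerRegistry DistributionType = "docker-registry"  // docker registry API
	DistOCILayout      DistributionType = "oci-image-layout" // OCI image layout via archive or http
)
```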
I think that, if the distribution idea is considered good and passes, this work on the fetchers refactor can go ahead without having to wait for the new image store. The new image store changes will just change the distribution handler parts that import images into the store. Having this in place will instead ease the creation of the new oci image layout archive and docker registry distributions that are needed to really start using the store with OCI images.
@dgonyeo I opened this trying to define my view, starting with all the various requirements that came to my mind and trying to find a common denominator. This is obviously my view and I'd like to hear your thoughts on it and see how it'll fit with your proposed "fetcher" plugins in #2830.