8000 proposal: Fetchers refactor by sgotti · Pull Request #2964 · rkt/rkt · GitHub
This repository was archived by the owner on Feb 24, 2020. It is now read-only.

proposal: Fetchers refactor #2964

Open — wants to merge 1 commit into master

Conversation

sgotti
Contributor
@sgotti sgotti commented Jul 22, 2016

This proposal starts from the distribution concept in #2953 and provides a
fetchers refactor adapted to that concept.

Additionally, it provides a complete workflow and ideas for
transport-level caching to fit different needs:

  • Distribution handlers
  • Transport plugin handler
  • Transport plugins
  • Transport level caching
  • Complete separation between image store and transport level caching

@dgonyeo I opened this trying to define my view starting will all the various requirements that came to my mind and trying find a common denominator. This is obviously my view and I'd like to hear your thoughts on it and see how it'll fit with your proposed "fetcher" plugins in #2830.

@sgotti sgotti changed the title proposal: Fetcher refactors proposal: Fetchers refactor Jul 22, 2016
@sgotti sgotti force-pushed the fetchers_refactor branch from 6d45a2e to 239fdff on July 22, 2016 07:12
@cgonyeo
Member
cgonyeo commented Jul 22, 2016

After seeing this I'm pretty sure we're on the exact same page, I just may have not articulated it clearly enough and you seem to have done the opposite. Also by actually providing definitions for "distribution" and "transport" I can better see the separation you want. :)

If you ignore the docker portion of my plugins proposal, what I've implemented is pretty close to this; it'll just need some refactoring to have a clearer distinction between the distribution handlers and the transport handlers. When the fetch function gets called it'll either (A) directly hand it off to the transport handler (which uses plugins to perform the actual network requests) or (B) hand it to a distribution handler (only AppC discovery atm) which will do some distribution logic to produce concrete URLs (AppC discovery) and then use the transport handlers to fetch them.

A big obvious difference between this and my current PR is that my PR is geared for the old store, so I'll need to do some work to make my suggestion adhere to this.

Also a question: transport handlers cover things like https:// resources. If I provide rkt a URL corresponding to a transport handler, are the distribution handlers still used?

For what it's worth, my answer in my fetching proposal is that the distribution handlers aren't called in this case.

@cgonyeo
Member
cgonyeo commented Jul 22, 2016

Oh nevermind that question, just reread and noticed the archive distribution. As an addition to your comment on how the AppC distribution will internally use the archive distribution, I expect it to be the same if we implement an OCI distribution, since it could use the logic in the archive distribution to fetch the blobs over https it needs.

@cgonyeo
Member
cgonyeo commented Jul 22, 2016

Does this mean that the archive distribution is the only distribution that ever calls out to the transport handlers?


## Distribution Handlers plugins

With this approach the distribution handlers will live in the stage0. If there's a need to make them plugins, the main tricky point is that they need to talk to an image store, so it'll be difficult to code them in a language other than Go.
What I wanted in my plugins proposal was to enable support for alternate transports, so I'm fine keeping this logic inside the stage0.

@cgonyeo
Member
cgonyeo commented Jul 22, 2016

Aaaaand nevermind that question too. Now after reading more I see that the docker/OCI distribution may call them directly.

At first my opinion was that distributions should always call the transport handler directly, as otherwise things seemed a little murkier to me (for AppC, which distribution would do the image import?) but after dwelling on it a bit I think the way you outlined it here is best. After all the AppC distribution is the exact same as the archive distribution, just with 1-2 additional steps it needs to do first (perform discovery, maybe fetch/trust public keys).

I'd be happy to start refactoring my plugins PR to align with this, once the others here have a little more time for feedback on this.

@lucab
Member
lucab commented Jul 22, 2016

GC is the part of this proposal that I like the least. I'm not sure if I'm missing any details, but it seems avoidable.

I guess we can augment the current CAS with a cacheManifest linked to an already downloaded blob, containing all the distribution entries it has been fetched/discovered from; on removal, the cacheManifest is inspected and all its cache entries evicted. This is already backward compatible, as a blob in the CAS without a cacheManifest has no cache entries to be purged.

Am I missing some caveats?

@sgotti
Contributor Author
sgotti commented Jul 25, 2016

@lucab I see that in the case of transport caching GC could be overkill. I'll try to explain my thoughts:

  • The main intent was to keep the different layers (in this case image store and transports) decoupled (also for reusability in other projects). The image store should only save image-related information.
  • To avoid transport cache GC we can just use the transport cache to know the blob digest. An upper layer (image rm) should just clean transport entries when a blob is removed. But this won't work if the image store does blob GC as I was thinking (see below).
    • Since this isn't atomic, if something fails (error, process killed etc...) we are left with stale entries that will require a GC function to clean them.

Anyway, I don't think transport cache GC is a big issue; it could just remove entries older than N days without reading cache data, blob digests etc...

BTW, there are other places where a GC or a cleanup will be useful/required (to avoid complex logic). I will open an issue to clarify this and figure out how to expose/run GC functions.

@sgotti
Contributor Author
sgotti commented Jul 25, 2016

> As an addition to your comment on how the AppC distribution will internally use the archive distribution, I expect it to be the same if we implement an OCI distribution, since it could use the logic in the archive distribution to fetch the blobs over https it needs.

The oci-image-layout states that this layout should be a basis for distributing OCI images via an archive, http or other ways. Thinking in distribution terms, I tend to see different distribution types:

  • oci image layout archive: will fetch a single file via any transport (file, http etc...), then the distribution handler will read the archive, take the ref and use the descriptor to fetch the required blobs from the archive.
  • oci image layout via an http server: (I'm just imagining since nothing has been defined) an http server will expose a directory layout like the oci image layout and the distribution will call the transport handler multiple times to fetch the ref and the blobs.

The docker registry is not related to oci image layout but is just another distribution type that can be used to fetch oci images.

Thoughts?

> A big obvious difference between this and my current PR is that my PR is geared for the old store, so I'll need to do some work to make my suggestion adhere to this.

I think that, if the distribution idea is considered good and passes, this work on the fetchers refactor can go ahead without having to wait for the new image store.

The new image store changes will only affect the distribution handler parts that import images into the store. Meanwhile, having this in place will ease the creation of the new oci image layout archive and docker registry distributions that are needed to really start using the store with OCI images.

4 participants