8000 Writable API proposal · Issue #2879 · rkt/rkt · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Feb 24, 2020. It is now read-only.

Writable API proposal #2879

Open
yifan-gu opened this issue Jul 7, 2016 · 16 comments
Open

Writable API proposal #2879

yifan-gu opened this issue Jul 7, 2016 · 16 comments

Comments

@yifan-gu
Copy link
Contributor
yifan-gu commented Jul 7, 2016

Goals

  • To remove all rkt shell-outs in kubernetes.
  • Much cleaner integration points, which can make kubelet_wrapper easier to run rktnetes.
  • Make it easier for performance monitoring.

Non-goals

Implement container runtime level interface

Design

First, take a look at the rkt shell-outs we currently have:

rkt fetch
rkt prepare --pod-manifest
rkt rm
rkt image rm
rkt enter

These are the commands we want in short term, other APIs we probably want to have are:
rkt gc
rkt run
rkt trust

Besides, since we would want systemd to parent the rkt process (by rkt run-prepared today), so rkt run-prepared can be left out in the first pass as well.
Also for rkt prepare, we could just implement the --pod-manifest for now.

Here is a draft for the protobuf definition we will add.

// ImageType defines the supported image types.
enum ImageType {
    IMAGE_TYPE_UNDEFINED = 0;
    IMAGE_TYPE_APPC      = 1;
    IMAGE_TYPE_DOCKER    = 2;
    IMAGE_TYPE_OCI       = 3;
}


enum InsecureOption {
    INSECURE_OPTION_NONE = 0;
    INSECURE_OPTION_ALL = 1;
    INSECURE_OPTION_IMAGE = 2;
    INSECURE_OPTION_TLS = 4;
    INSECURE_OPTION_ON_DISK = 8;
    INSECURE_OPTION_HTTP = 16;
    INSECURE_OPTION_PUBKEY = 32;
}

enum StoreOption {
    STORE_OPTION_NONE = 0;
    STORE_OPTION_NO_STORE = 1;
    STORE_OPTION_ONLY_STORE = 2;
}

message Config {
    // Data dir.
    string data_dir = 1;

    // Auth credentials to use.
    bytes auth_credentials = 2; // Use TLS for the connection even it’s locally?
}

message FetchImageRequest {
    // Type of the image, e.g. aci, docker, oci, etc.
    ImageType type = 1;

    // Resolvable Name of the image, used for discovery, can be string, hash, URL, file path.
    string name = 2;

    // Insecure options for fetch, it’s a bitflag.
    InsecureOption insecure_option = 3;

    // Whether to fetch from the store, or skip the store.
    StoreOption store_option = 4;

    // Whether to fetch the depedencies
    bool WithDeps = 5;

    // Config to use for the command, including docker auth, stage1s, dirs.
    // If not empty, this will override the config used by API service.
    Config config = 3;
}

message FetchImageResponse {
    // ID of the fetched image, which can be used to uniquely identify the image.
    string id = 1;
}

message PreparePodRequest {
    // The pod manifest to use for preparing.
    bytes pod_manifest = 1;

    // Config to use for the command, including docker auth, stage1s, dirs.
    // If not empty, this will override the config used by API service.
    Config config = 2;
}

message PreparePodResponse {
    // ID of the prepared rkt pod, which can be used to invoke `rkt run-prepared`.
    string id = 1;
}

message RemovePodRequest {
    // ID of the pod to remove, when specified, at most one image will be removed.
    string id = 1;

    // Config to use for the command, including docker auth, stage1s, dirs.
    // If not empty, this will override the config used by API service.
    Config config = 2;
 }

message RemovePodResponse {}

message RemoveImageRequest {
    // ID of the image to remove, when specified, at most one image will be removed.
    string id = 1;

    // Name of the image to remove, if multiple images have the same name, 
    // all of them will be removed.
    string name = 2;

    // Config to use for the command, including docker auth, stage1s, dirs.
    // If not empty, this will override the config used by API service.
    Config config = 3;
}

message RemoveImageResponse {}

message EnterPodRequest {
    // ID of the pod to enter
    string id = 1;

    // Name of the app inside the pod to enter. 
    // This must be non-empty if there are multiple apps in the pod.
    string app_name = 2;

    // The command to run for enter, optional.
    repeated string command = 3;

    // The stdin to for the enter, optional
    // string stdin = 4;

    // Config to use for the command, including docker auth, stage1s, dirs.
    // If not empty, this will override the config used by API service.
    Config config = 5;
}

message EnterPodReponse {
    // The stdout to for the enter, optional
    string stdout = 1;

    // The stderr to for the enter, optional
    string stderr = 2;
}


service PublicAPI {
    // old APIs.rpc FetchImage (FetchImageRequest) returns (FetchImageResponse) {}
    rpc PreparePod (PreparePodRequest) returns (PreparePodResponse) {}
    rpc RemovePod (RemovePodRequest) returns (RemovePodResponse) {}
    rpc RemoveImage (RemoveImageRequest) returns (RemoveImageResponse) {}
    rpc EnterPod (EnterPodRequest) returns (stream EnterPodResponse) {} // This returns a bi-directional stream, where client can send stdin and server can send stdout/stderr.
}

Consideration

  • Writable API will require rkt api service to run as root for those operations.
  • Since we are moving towards container level APIs (), in the end of the day, it seems that these API calls will be dropped in future releases.
  • I am not confident we will have container level interface for 1.4, so it’s still worth for us to implement the write API, especially the benefits it will bring is visible (see goals above).

Plan

  • Implement the currently being used rkt shell-outs.
  • Cleanup rktnetes’s shell-outs
  • Implement any other necessary writable APIs.

cc @coreos/rkt-maintainers @coreos/rkt-engineering

@chancez
Copy link
Contributor
chancez commented Jul 7, 2016

Something like this would make implementing something like docker for mac or boot2docker but for rkt much easier!

@euank
Copy link
Member
euank commented Jul 7, 2016

I'm not sure that a global config object should accompany each request.

For this like auth for fetch and stage1 for prepare, it does make sense to provide those optoins, but I don't think they have to be part of the same larger config struct.

Some options, like data dir, I don't think there's really a need to vary per api-service instance.

It also might be nice to have some discussion about how the service will behave with respect to privileges; could I run it with user privileges and a data-dir I own, and then only be able to fetch, but not run?

@sgotti
Copy link
Contributor
sgotti commented Jul 7, 2016

// Type of the image, e.g. aci, docker, oci, etc.
ImageType type = 1;

This will probably deserve its own issue but I'd like to use this as a starting point.

I think this will actually map to the current rkt command line in this way:

  • aci: accept an image name string, aci hash, URL, file
  • docker: use docker:// scheme with docker image string
  • oci: same as the above

I think that in future the current semantic will change (how to handle backward compatibility?), some examples:

In future the idea will be to directly fetch oci refs and blobs (and docker v2.1 and v2.2 formats can be converted to this) in a dedicated oci local store (that's is quite different than the current aci store given the differences between the two specs), render to an oci runtime bundle and execute (or as a first step convert it to a rendered aci converting the oci config to an aci manifest) (#2541), this won't probably work with v1 registries (so docker2aci will be needed for these) since they aren't content addressable but uses imageID for every layer.

The main question is that oci image spec doesn't cover distribution but only image format since in future there may be different kind of distributions (not only the docker registry) for oci images.

I'm not sure if using specific shemes (docker:// oci://) covering both image type and distribution concepts is correct or there's the need for two distinct options (or a scheme like format+distribution:// like oci+registry://)

This choice will reflect on the rkt command line and this API.
Additionally, probably the k8s image spec definition (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/container/runtime.go#L45 now just a string) looks like the point where a similar logic should be implemented.

Since we are moving towards container level APIs (), in the end of the day, it seems that these API calls will be dropped in future releases.

That's a good point. I can't say if the required work is worth the effort but at least it can be useful to clarify some questions like the above.

@lucab
Copy link
Member
lucab commented Jul 8, 2016

Writable API will require rkt api service to run as root for those operations.

This looks like a major design point, departing quite a bit from the current status quo.

Where do you plan to expose this interface? I see a comment related to TLS, so I guess some kind of network socket; if so how do you plan to do authn/authz there? And how should we map it to the current POSIX permissions model we are using?

@jonboulle
Copy link
Contributor

Since we are moving towards container level APIs (), in the end of the day, it seems that these API calls will be dropped in future releases.

This really troubles me - I don't understand the win here. We shouldn't design and implement an API knowing it's imminently going to be deprecated.

To remove all rkt shell-outs in kubernetes.

Is there evidence that this is causing issues today?

Much cleaner integration points, which can make kubelet_wrapper easier to run rktnetes.

Can you explain how this would simplify kubelet_wrapper?

Make it easier for performance monitoring.

How does a writeable API help with this?

@yifan-gu
Copy link
Contributor Author
yifan-gu commented Jul 8, 2016

This really troubles me - I don't understand the win here. We shouldn't design and implement an API knowing it's imminently going to be deprecated.

I agree, the argument is those APIs can be usable by other systems as well (true?), although in this proposal, I suggest we implement the onces used by k8s first.

Is there evidence that this is causing issues today?

See below.
Also as I mentioned, it causes profiling harder IMO.

Can you explain how this would simplify kubelet_wrapper?

See #2878

Can you explain how this would simplify kubelet_wrapper?

If all the commands go through API service, then we can just profile the API service to get the CPU / memory metrics. Otherwise, say 100 pods is launched, then it will at least generate 100 profiles.

@yifan-gu
Copy link
Contributor Author
yifan-gu commented Jul 8, 2016

Where do you plan to expose this interface? I see a comment related to TLS, so I guess some kind of network socket; if so how do you plan to do authn/authz there? And how should we map it to the current POSIX permissions model we are using?

This is a good concern, we will need to face it anyway when a writable API (either pod level or container level) is going to be implemented.

On the realistic side, the kubelet is running as root today anyway, so in practice we don't lose much, and I am not sure we should dig into the authn/authz rabbit hole at this point?

@yifan-gu
Copy link
Contributor Author
yifan-gu commented Jul 8, 2016

@sgotti Thanks for the input! So my understand of your concern is that only ImageType is not sufficient, right? I can replace that with ImageFormat which is another protobuf message itself, and we can add things we need later, like distribution, etc.

@sgotti
Copy link
Contributor
sgotti commented Jul 11, 2016

@sgotti Thanks for the input! So my understand of your concern is that only ImageType is not sufficient, right? I can replace that with ImageFormat which is another protobuf message itself, and we can add things we need later, like distribution, etc.

Yeah, basically I'm not sure which is the best way to define what image format/type and distribution to use, so I'm not sure if just a string (with specific scheme) will be enough for both. Will open a new issue on this.

IMHO ImageFormat sound strange to define both format and distribution but I'm not sure about a better name.

@yifan-gu
Copy link
Contributor Author

@sgotti Or ImageSpec. But that will break current API.

@euank
Copy link
Member
euank commented Jul 11, 2016

Since we are moving towards container level APIs (), in the end of the day, it seems that these API calls will be dropped in future releases.

If we expect this API to be used by anything other than rktnetes, we shouldn't add and deprecate things based on k8s needs.

However, if we expect this API to be a rktnetes specific api, we should develop it in a separate repository (probably a k8s one) and feel free to do whatever adding/removing as is needed in the k8s code, and tie it closely to k8s releases.

Are we trying to make:

  1. A general purpose api for rkt (usable by arbitrary other projects as well; rktnetes as the first user)
  2. An api explicitly to support rktnetes

Based on the goals, it looks more like 2, and it's possible this should live as part of the k8s project instead.
If it's instead 1, that's fine as well, but we should be clear about that.

I'd personally prefer to do 1 if we can, though 1 I think also implies we need to consider what this API looks like a little bit more.

Opinions?

@alban alban removed this from the v1.11.0 milestone Jul 20, 2016
@alban
Copy link
Member
alban commented Jul 20, 2016

This needs more discussions.
@euank @yifan-gu @tmrts : could you reassess the milestone?

@philips
Copy link
Contributor
philips commented Jul 20, 2016

For more context: The goal of having this API is to implement the Kubernetes CRI. Essentially the Kubelet wants a way of separating itself from the details of the container fetching and execution and instead focus on a handful of things like: log curation, pod cgroup setup, managing volumes, and managing networking (maybe). This is in-line with the overall goal of making Kubernetes less of a monolith and easier to hack on.

Some alternative ways of tackling this that could be considered:

  • Creating some sort of rkt library for the Kubelet to call
  • Breaking rkt into smaller pieces that could map to the CRI

Overall, rkt is doing two things:

  1. It is working very well as the system level container engine for applications like the kubelet itself, docker engine, etc
  2. It is working as a Kubernetes container engine but Kubernetes node team is trying to improve parts of the abstraction and add new features like init containers, dynamic pods, etc which necessitates this refactor.

Hope that helps set the stage.

8000
@yifan-gu
Copy link
Contributor Author

@philips Sorry for not being clear. The writable API proposed here (which is still on pod level that aligns with the rkt cli) is not targeted for the k8s CRI (operated on container level).
And this is part of the reason we are arguing whether we should put effort implement and maintain it, given that we are moving towards container level APIs.
The other reason is the rpc style writable API seems to break the POSIX permission model as @lucab mentioned #2879 (comment) , but we maybe could solve this by using unix socket.

The benefits for this pod level API would be:

  • Standardized integration point (everything goes through gRPC)
    • Easier for testing.
    • Easier for running rkt in rkt fly.
  • Make other people's life easier if they want to integrate rkt.

@yifan-gu
Copy link
Contributor Author
yifan-gu commented Jul 21, 2016

IMHO ImageFormat sound strange to define both format and distribution but I'm not sure about a better name.

Can we add some field in the FetchImageRequest later for describing the discovery scheme? @sgotti

@lucab
Copy link
Member
lucab commented Jul 22, 2016

@yifan-gu regarding format/distribution, there are some sketches for a unified scheme at #2953

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants
0