Design document for the Network Policies based isolation by roytman · Pull Request #1944 · fybrik/fybrik · GitHub

Design document for the Network Policies based isolation #1944

Merged: 14 commits, Mar 21, 2023

@roytman (Collaborator) commented Jan 24, 2023

Design document for the Network Policies based isolation

/fixes #1962

Signed-off-by: Alexey Roytman <roytman@il.ibm.com>

@roytman changed the title from "initial version" to "Design document for the Network Policies based isolation" on Jan 24, 2023
@simanadler (Member) left a comment

See my questions / comments


## Network Policies

Fybrik is a Kubernetes application, therefore, the simplest isolation method can be based on
Member

What are the other options, other than network policies? Why were they ruled out?

Collaborator Author

I plan to work incrementally; the other options will be added later, in parallel with the NP implementation.

Member

If there is another design doc mentioning all the approaches please link to it.

Collaborator Author

It is in a box under development now. After that it will be a separate document or extension of this one.

- deprecate the “workloadSelector” element, but continue to support it for backward compatibility. Process it as an
additional “podSelector” entry in the “NetworkPoliciesPeers” array

- terminate the backward compatibility assumption that all Read flow should have at least a single workload label.
Member

See previous question about workload label


- add the relevant ports to the NetworkPolicyIngressRules.
- in future Fybrik releases, if the same module serves several workloads or other modules, we will create an
independent NP instance per source. That will help us to manage the NP instances.
Member

Can you elaborate? Who/what will manage the NPs across workloads? Who will decide when to delete them, etc?

Collaborator Author

If we support shared modules, where a single module can be used by different Plotters/blueprints, each blueprint will deploy and manage its own independent NP instance.
For example:
Fybrik application A creates a Plotter/blueprint, which deploys a module M. Together with the module, the blueprint deploys an NP instance, NP1.
Now another Fybrik application, B, creates another Plotter/blueprint, but the new blueprint discovers that the required module M has already been deployed. It reuses the module and, instead of deploying a new module instance, creates only a new NP instance, NP2.

I know that this scenario is not supported; I just mentioned it here to show that the NP mechanism is very flexible and allows it.
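
The NP1/NP2 scenario above can be sketched as follows. This is only an illustration under stated assumptions, not Fybrik code: the function name `build_source_policy`, the NP naming scheme, the namespace, and all label values are hypothetical. Only the `networking.k8s.io/v1` NetworkPolicy manifest fields and the `app.kubernetes.io/instance` label key are standard Kubernetes.

```python
def build_source_policy(module_instance, namespace, source_app, source_peers, ports):
    """Build one NetworkPolicy manifest per (module, source application) pair."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            # Hypothetical naming scheme: one NP instance per source application.
            "name": f"{module_instance}-{source_app}",
            "namespace": namespace,
        },
        "spec": {
            # Selects the pods of the shared module (assumes the label is present).
            "podSelector": {
                "matchLabels": {"app.kubernetes.io/instance": module_instance}
            },
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": list(source_peers),
                    "ports": [{"protocol": "TCP", "port": p} for p in ports],
                }
            ],
        },
    }


# Application A deploys module M together with NP1.
np1 = build_source_policy(
    "module-m", "fybrik-blueprints", "app-a",
    [{"podSelector": {"matchLabels": {"app": "workload-a"}}}], [80],
)
# Application B reuses module M and creates only NP2.
np2 = build_source_policy(
    "module-m", "fybrik-blueprints", "app-b",
    [{"podSelector": {"matchLabels": {"app": "workload-b"}}}], [80],
)
```

Because Kubernetes NetworkPolicies are additive, NP1 and NP2 together allow the union of both sources, and deleting NP2 removes only application B's access while leaving NP1 untouched.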

@roytman (Collaborator Author) Jan 25, 2023

Actually, you gave me another use case.
A user created a Fybrik application and there is a data plane, but now the user wants to change the isolation by adding or removing workload sources, without any changes in the data plane. We have to check whether it is possible to update the Fybrik application spec (the selector only) without redeploying the data plane, updating only the NP instance.

@simanadler (Member) left a comment

See comments


## Requirements

- Only predefined users/workloads should be able to access the data plane.
- We do not assume that client workloads will be collocated on the same Kubernetes cluster with the data module,
Member

Currently this is the assumption. Fybrik does not currently work if the workload is in a different cluster than the control plane. It is something we want to change in the future.

The current “FybrikApplicationSpec” has the “[selector](https://fybrik.io/v1.2/reference/crds/#fybrikapplicationspecselector)”
element, which is a combination of “clusterName” and “workloadSelector”. Unfortunately, based on this information we can
implement only a single label-based “podSelector”. Furthermore, for backward compatibility, the current implementation
assumes that a Read scenario has at least one label, which is not always true.
Member

See comment above

- extend the “FybrikApplicationSpec.selector” element with an array of
[NetworkPolicyPeers](https://github.com/kubernetes/api/blob/59fcd23597fd090dba6b7e903eb0a8c9e8efb0a6/networking/v1/types.go#L183).
This “NetworkPoliciesPeers” array will be the input for the NP `from` element. This will restrict possible incoming
connections to the relevant Fybrik module.
Member

This section is unclear to me. Can you give some examples? It's really important that we provide something as simple as possible for the user who is creating the FybrikApplication to indicate the workload. Let Fybrik discover as much as possible on its own.
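
As a hedged illustration of the proposal under discussion, the sketch below turns a selector into the NP ingress rules. The field name `networkPoliciesPeers` and the helpers `selector_to_peers` and `ingress_rules` are assumptions made for this example and are not taken from the Fybrik CRD. The peer entries follow the Kubernetes `NetworkPolicyPeer` shape (`podSelector`, `namespaceSelector`, `ipBlock`), and the deprecated “workloadSelector” is treated as one more `podSelector` peer, as the design text describes.

```python
def selector_to_peers(selector):
    """Collect NetworkPolicyPeer entries from a FybrikApplication selector."""
    # Assumed field name for the proposed peers array.
    peers = list(selector.get("networkPoliciesPeers", []))
    workload = selector.get("workloadSelector")
    if workload:
        # Deprecated element: processed as an additional podSelector peer.
        peers.append({"podSelector": workload})
    return peers


def ingress_rules(selector, ports):
    """Return the NP ingress rule list for the module's ports."""
    peers = selector_to_peers(selector)
    if not peers:
        # A rule without `from` would match every source, so emit no rule:
        # with policyTypes ["Ingress"] an empty list denies all ingress.
        return []
    return [{
        "from": peers,
        "ports": [{"protocol": "TCP", "port": p} for p in ports],
    }]


selector = {
    "clusterName": "cluster1",
    "workloadSelector": {"matchLabels": {"app": "notebook"}},
    "networkPoliciesPeers": [{"ipBlock": {"cidr": "10.0.0.0/24"}}],
}
rules = ingress_rules(selector, [80])
```

With this input the single rule admits traffic from the `10.0.0.0/24` block and from pods labeled `app: notebook` in the policy's own namespace, on TCP port 80. A user who sets only “workloadSelector” gets the same result as before the extension.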

connect to only predefined data sets.
- Currently, Fybrik assumes co-location of client workloads with data plane entry point modules. Future implementations
might support deploying workloads on different clusters or running them outside Kubernetes as standalone applications.
- We check the co-location of a Fybrik module and a relevant workload in the same cluster as a specific use case.
Contributor

How do we check the co-location? We may enforce it using IT config policies.

Collaborator Author

Yes, plus it can be seen from the input that users provide in FybrikApplication.spec.selector.

Signed-off-by: Alexey Roytman <roytman@il.ibm.com>
Fybrik will not support different network restrictions for different module ports (NP allows it).

The egress element will be a combination of destinations, such as, the next module (`2a`), data source (`5`), Fybrik (`3`)
or module required (`4`) services. The connection types are taken from the [network topologies](FybrikNetworkTopologiesAndRequirements.md)
Contributor

How are these destinations formed? Specifically:
2a - Fybrik knows where the module is deployed (namespace) but not the labels of the deployed pods which is internal to the module chart implementation. Unless I am missing some assumptions on module implementation.
3 - any change to the existing deployment of Vault and the relevant policies?
5 - Fybrik is not aware of the data store destination. The connection details are passed as-is to the access module. How do you see this policy implemented?

Collaborator Author

2a - This is a regular chaining convention: Fybrik adds some of the labels, and for NP I actually need only a single one, app.kubernetes.io/instance. I can demonstrate how it works.
3 - I don't see any changes.
5 - I plan to get the destination details from the connection details.
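
The answer to 2a can be illustrated with a small sketch of an egress rule toward the next module. It assumes that the next module's pods actually carry the `app.kubernetes.io/instance` label and that the namespace is matched through the `kubernetes.io/metadata.name` label that Kubernetes sets on namespaces. The function name `egress_to_next_module` and the example values are hypothetical.

```python
def egress_to_next_module(next_instance, next_namespace, port):
    """Build an NP egress rule that allows traffic to the next module only."""
    return {
        "to": [
            {
                # Both selectors sit in ONE peer entry, so a destination pod
                # must match the label AND live in the given namespace.
                "namespaceSelector": {
                    "matchLabels": {"kubernetes.io/metadata.name": next_namespace}
                },
                "podSelector": {
                    "matchLabels": {"app.kubernetes.io/instance": next_instance}
                },
            }
        ],
        "ports": [{"protocol": "TCP", "port": port}],
    }


rule = egress_to_next_module("next-module", "fybrik-blueprints", 8080)
```

Putting the two selectors in separate entries of `to` would instead allow either match, which is broader than intended here.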

PlotterController, which has the entire picture, should provide this information.

NetworkPolicies select pods according to labels. In order to separate policies for 2 different modules, each module should
have a unique label, in addition to the common labels defined by the Fybrik application.
Contributor

Fybrik passes a set of labels inside Helm values but does not enforce these labels on the deployed resources. No one guarantees that a 3rd party module will propagate the labels to its pods.

@roytman roytman merged commit f6f63a6 into fybrik:master Mar 21, 2023
aradhalevy pushed a commit to aradhalevy/fybrik that referenced this pull request Apr 3, 2023
* initial version

* Add more explanations
---------
Signed-off-by: Alexey Roytman <roytman@il.ibm.com>
Successfully merging this pull request may close these issues.

Provide a design of isolation based on K8s NetworkPolicies