Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(MADR): policy on namespace #10148

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

docs(MADR): policy on namespace #10148

wants to merge 13 commits into from

Conversation

lobkovilya
Copy link
Contributor

@lobkovilya lobkovilya commented Apr 30, 2024

Checklist prior to review

  • Link to relevant issue as well as docs and UI issues -- Fix MADR for applying new policies on different namespaces #7269
  • This will not break child repos: it doesn't hardcode values (.e.g "kumahq" as a image registry) and it will work on Windows, system specific functions like syscall.Mkfifo have equivalent implementation on the other OS --
  • Tests (Unit test, E2E tests, manual test on universal and k8s) --
    • Don't forget ci/ labels to run additional/fewer tests
  • Do you need to update UPGRADE.md? --
  • Does it need to be backported according to the backporting policy? (this GH action will add "backport" label based on these file globs, if you want to prevent it from adding the "backport" label use no-backport-autolabel label) --

lobkovilya and others added 3 commits April 30, 2024 16:28
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Marcin Skalski <skalskimarcin33@gmail.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
@lobkovilya lobkovilya added the ci/skip-test PR: Don't run unit and e2e tests (maybe this is just a doc change) label Apr 30, 2024
@lobkovilya lobkovilya requested a review from a team as a code owner April 30, 2024 20:01
@lobkovilya lobkovilya requested review from jijiechen, jakubdyszkiewicz and lukidzi and removed request for a team April 30, 2024 20:01
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
@lobkovilya lobkovilya changed the title docs(MADR): Policy On Namespace docs(MADR): policy on namespace May 6, 2024
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
Co-authored-by: Charly Molter <charly.molter@konghq.com>
Signed-off-by: Ilya Lobkov <lobkovilya@yandex.ru>
Automaat and others added 4 commits May 9, 2024 11:57
Signed-off-by: Marcin Skalski <skalskimarcin33@gmail.com>
Signed-off-by: Marcin Skalski <skalskimarcin33@gmail.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
@lobkovilya lobkovilya force-pushed the docs/madr-046 branch 2 times, most recently from 6b7bf7a to db02685 Compare May 15, 2024 08:13
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Copy link
Contributor

@lahabana lahabana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MADR is very hard to follow for the following reasons:

  1. It should probably start by introducing Consumer and Producer policies which I think are paramount to understand what this is about.
  2. It starts by explaining the implementation which is confusing as you haven't explained what you want to do.

docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved
docs/madr/decisions/046-policy-on-namespace.md Outdated Show resolved Hide resolved

Applying a policy on the custom namespace **affects requests to the pods in the custom namespace**.
Since some policies are applied on the client-side, the scope of policy's effect is not limited to the namespace
where it was applied.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2 sentences contradict each other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The first sentence says policy in the custom namespaces affects requests to the custom namespace (not pods in the custom namespace). And the second sentence just elaborates – since some policies are applied on the client-side the policy scope is no longer limited to policy's namespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I didn't see requests. I had to reread this 3 times, I think it could be made more clear

At this moment, it works only between MeshTimeout and MeshHTTPRoute, but there are plans to support it for other
policies, i.e. [#6645](https://github.com/kumahq/kuma/issues/6645).
Referencing a policy from another namespace or another zone should be done in a similar way MeshService is referenced.
Fields `name/namespace` and `labels` are mutually exclusive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this: #10247 suggests an alternative more straightforward approach in this case that may simplify a lot of things

Copy link
Contributor

@lahabana lahabana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While thinking about: #10247 and thinking about syncing to zones with producer/consumer policies it struck me that you could have multiple to entries with different kinds (one could be producer, the other consumer). What would be the syncing behaviour in this case?

@lobkovilya
Copy link
Contributor Author

While thinking about: #10247 and thinking about syncing to zones with producer/consumer policies it struck me that you could have multiple to entries with different kinds (one could be producer, the other consumer). What would be the syncing behaviour in this case?

The new validation rules are going to limit what policies you can create. If top-level targetRef doesn't specify kuma.io/zone and k8s.kuma.io/namespace then each item in to[] has to specify both kuma.io/zone and k8s.kuma.io/namespace. That's kind of a downside we're having with new design – consumer policies must have kuma.io/zone and k8s.kuma.io/namespace tags in top-level targetRef, i.e Frontend Owner wants to change how frontend communicates with others:

metadata:
  namespace: frontend-ns
  labels:
    kuma.io/origin: zone
spec:
  targetRef:
    kind: MeshSubset
    tags:
      kuma.io/zone: i-have-to-know-the-name-of-my-zone
      k8s.kuma.io/namespace: frontend-ns
  to:
    - targetRef:
        kind: Mesh
      default: {}

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
### Namespace-scoped policies and multizone

Namespace-scoped policies can be applied to a custom namespace only on zones as there are simply no DPPs and custom
namespaces on Global.
Copy link
Contributor

@lahabana lahabana May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be enforced by validation webhook

After a user applied a policy on zone's custom namespace, the webhook automatically adds a `k8s.kuma.io/namespace` label to the policy.
The label is part of the hash suffix when policy is synced to global.

As mentioned previously, a policy applied in producer namespace can affect envoy configs of consumers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first ocurence of the work "producer" so it's not "mentioned previously"

With the increased blast radius of the policies comes great responsibility,
we don't want small mistake in the policy to break traffic on the other side of the world.
A few things could be done to protect users from impactful mistakes:
* Enhanced validation. If policy is created in `backend-ns` of the `zone-1` and the top-level targetRef is `Mesh`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm why? I'm confused I think I need an example to understand this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next section #### Enhanced validation provides more details and also has some examples.

A few things could be done to protect users from impactful mistakes:
* Enhanced validation. If policy is created in `backend-ns` of the `zone-1` and the top-level targetRef is `Mesh`,
then `to[].targetRef` should have both `k8s.kuma.io/namespace: backend-ns` and `kuma.io/zone: zone-1`.
* Ability to opt-in or opt-out from syncing the policy to other zones.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's necessary you can always use a top spec.tagerRef.kind: MeshSubset to reduce blast radius no? That's what it is for no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably wasn't clear that these points:

* Enhanced validation...
* Ability to opt-in or opt-out...

are followed by the separate sections to elaborate on each point:

#### Enhanced validation
...
#### Considered options on opting-in(out) to cross-zone policy syncing
...

and the decision for opting-in(out) is exactly what you said – we can always use a top spec.targetRef.kind: MeshSubset to reduce blast radius.


The validation rules for namespace-scoped policy are the following:
* to-policy when placed into the custom namespace must have the namespace and zone either in the top-level targetRef or in each to-item
* from-policy when placed into the custom namespace must always have the namespace and zone in the top-level targetRef
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems unnecessarily annoying. Again UX is important here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't from policy on custom namespace having an "implicit MeshSubset" a better behaviour? Especially if we go for #10247 top level targetRef slowly disappears.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also with the current zone originated policies is similar to having an implicit MeshSubset no?

they have to put the policy into `kuma-system` namespace by checking with Mesh Operator first.

The validation rules for namespace-scoped policy are the following:
* to-policy when placed into the custom namespace must have the namespace and zone either in the top-level targetRef or in each to-item
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoying!

Zone-originated policies in `kuma-system` won't be synced to other zones.
This eliminated the problem of backwards compatibility.
Also, with the introduction of namespace-originated policies, applying policies to `kuma-system` becomes an edge use case.
The only reason to apply a policy on zone's `kuma-system` is Zone Cluster Operator stepping in to fix the undesired behaviour.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only undesired? Let's say 1 zone is very far away maybe the operator will add increased default timeouts.

When policies have the same zone, go to the next step.
5. **[new step]** Policy from local namespace is more specific than a policy from another namespace.
When namespaces are equal, go to the next step.
6. The policy is more specific if it's name is lexicographically less than other policy's name ("aaa" < "bbb" so that "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just write pseudo code? This is hard to process

`MeshGateway` at this moment is not namespace-scoped. It can change in the future with
the [MeshBuiltinGateway resource](https://github.com/kumahq/kuma/issues/10014).

For policies in custom namespaces, we've decided that we would allow targeting `MeshGateway` in `MeshHTTPRoute` and `MeshTCPRoute`, and policies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

targeting where? to? top level?

backendRefs:
- kind: MeshService
name: frontend
namespace: frontend-ns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This namespace can be implicit right?
Also, if this namespace were different from frontend-ns would this be rejected?

to:
- targetRef:
kind: MeshService # you need to specify MeshService in to[].targetRef.kind
name: frontend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent to having the implicit namespace: frontend-ns ?
What would happen if I had namespace: backend-ns

Finally what happens with:

type: MeshCircuitBreaker
mesh: default
name: circuit-breaker
namespace: frontend-ns
spec:
  targetRef:
    kind: MeshGateway
    name: edge-gateway
  to:
  - targetRef:
      kind: MeshService # you need to specify MeshService in to[].targetRef.kind
      name: frontend
     default: {}
  - targetRef:
      kind: MeshService
      name: backend
      namespace: backend-ns
    default: {}

My gut feeling is that this should rejected (entries in the to must have the same namespace). We should likely suggest users to not set multiple to entries

Comment on lines +131 to +165
metadata:
namespace: backend-ns
labels:
kuma.io/origin: zone
kuma.io/zone: producer-zone
spec:
targetRef:
kind: MeshSubset
tags:
k8s.kuma.io/namespace: backend-ns
to:
- targetRef:
kind: MeshService
labels:
kuma.io/display-name: redis
kuma.io/zone: producer-zone
---
metadata:
namespace: backend-ns
labels:
kuma.io/origin: zone
kuma.io/zone: producer-zone
spec:
targetRef:
kind: MeshSubset
tags:
k8s.kuma.io/namespace: backend-ns
kuma.io/zone: producer-zone
to:
- targetRef:
kind: MeshService
name: redis
namespace: redis-ns
- targetRef:
kind: Mesh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just get confused with the example, it looks like a cross namespace control?
So, can we define our top TargetRef MeshSubset with another NS front-end if our policy is backend-ns?

Actually, I feel it's pretty complicated in the cross-namespace use cases definition :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I just get understand with this.
Maybe we should also paste the link here: https://gateway-api.sigs.k8s.io/mesh/#gateway-api-for-service-mesh

For those people who are unfamiliar with the context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/skip-test PR: Don't run unit and e2e tests (maybe this is just a doc change)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MADR for applying new policies on different namespaces
6 participants