Migrate index path id annotations #4190

natasha41575 · 2021-09-18T00:00:09Z

This PR is step 1 of #4024.

Annotation changes:

config.k8s.io/id -> internal.config.kubernetes.io/id
config.kubernetes.io/path -> internal.config.kubernetes.io/path
config.kubernetes.io/index -> internal.config.kubernetes.io/index

For a period of time, we must support both the new internal annotations as well as the legacy ones. This means:

The orchestrator, when setting annotations, will set both new (internal.config.kubernetes.io/*) and legacy annotations.
When it encounters one (either the internal or legacy annotation), it will copy that annotation over to the other to ensure that it supports functions that consume or modify either.
All functions that consume these annotations will consume both. The orchestrator first checks for the new internal annotation, and falls back to the legacy annotation if it is missing.
After a function is run, these annotations need to be reconciled. That is, if a function modifies either the legacy or new internal annotation, the orchestrator must update the other to reflect the change. TestLegacyAnnotationReconciliation provide some examples, such as:
- There is a generator function that creates a new resource and sets its path or index via the legacy annotations. The orchestrator must ensure that it copies this over to the new internal annotations.
- There is a pipeline of functions, one of which modifies the legacy path annotation, and a later one does some operation on the new internal annotation. The orchestrator must ensure that the latter function consumes the modified path annotation by copying the legacy path annotation to the new internal annotation.

In general, while we are supporting both the legacy and new annotations, we must output both in all scenarios, even if the input resources start with one. This will ensure a pipeline of functions will still work, even if some are consuming the old annotations and some are consuming the new.

Reconciliation Algorithm

The main problems to solve were:

Each function receives a ResourceList and returns a ResourceList. There is no guarantee of the resources being in any particular order, so we must have a reliable method of determining which inputs map to which outputs.
Even if we know which inputs map to which outputs, we need a reliable way of determining which annotation has changed if they differ. If internal.config.kubernets.io/path is foo and config.kubernetes.io/path is bar, we need to determine which one the function changed and update the other accordingly.
There is an "id" annotation - internal.config.kubernetes.io/id - that is intended to resolve problem 1, but it is one of the annotations being migrated. If internal.config.kubernetes.io/id and config.k8s.io/id differ, we need to know which one to use. I wouldn't mind changing this PR to just use internal.config.kubernetes.io/id to identify the input->output mapping since theoretically a function is not supposed to change it, but I believe the implementation I opted for accounts for functions that change it anyways.

The solution I opted for here was to store the annotations for each resource in a map structure map[string]map[string]string that maps path -> index -> id. The orchestrator stores this information for each resource before running each function, so that each resource has one unique entry in the map prior to the function being run.

After the function is run, we iterate through all the resources again. If the legacy annotations are identical to the new internal annotations, there is no reconciliation needed.

If they differ, we check the map. If we find the legacy path/index/id entry in the map, that means it's the internal annotations that have changed. If we find the internal path/index/id entry in the map, that means it's the legacy annotations that have changed. Then we update the other annotations to reflect the same change that was made by the function.

Cases not handled

This does not handle the case that a function, say, modifies config.k8s.io/id and internal.config.kubernetes.io/path - where it has modified one internal annotation and one legacy annotation. I am not sure why a function author would choose to do this, and such a case may be considered invalid. With this implementation of the migration, the behavior would be undefined. If somehow at the end of the reconciliation algorithm, the annotations still differ, the orchestrator favors the new internal annotation when writing output.

ALLOW_MODULE_SPAN

cc @yuwenma

k8s-ci-robot · 2021-09-18T00:00:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: natasha41575

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [natasha41575]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

natasha41575 · 2021-09-18T00:00:20Z

/hold

natasha41575 · 2021-09-18T00:00:42Z

/retest

monopole · 2021-09-24T14:59:16Z

/lgtm

monopole · 2021-09-24T15:00:10Z

Might want to file an issue to clean up after this, or at least record what cleanup could be done - while the context is still fresh in mind

mengqiy · 2021-09-24T18:49:23Z

@natasha41575 Please help me understand how it work in this scenario.
I have a function that modify path annotation. The input resourceList has both annotations that points to the same file. After processing by the function, the output resourceList will still have both annotations but points to different files.
When the orchestrator (e.g. kpt or kustomize) looks at the output. It's clear what is the users' intend by only looking at the output.
A potential solution is to compare the input and output and then infer the user intend.
Thoughts?

natasha41575 · 2021-09-24T19:40:10Z

I have a function that modify path annotation. The input resourceList has both annotations that points to the same file. After processing by the function, the output resourceList will still have both annotations but points to different files.
When the orchestrator (e.g. kpt or kustomize) looks at the output. It's clear what is the users' intend by only looking at the output.
A potential solution is to compare the input and output and then infer the user intend.

If I understand you correctly, that's exactly what this PR does. Say the input resource has the following:

config.kubernetes.io/path: foo
internal.config.kubernetes.io/path: foo

and the output resource has the following:

config.kubernetes.io/path: foo
internal.config.kubernetes.io/path: bar

This PR creates a map structure that will store the original path value "foo" (along with the index and id annotations, which are empty here). Then, when processing the output, it will look in the map for "bar". It doesn't find it, so it then looks for "foo". (It will actually look for something in the map with matching path, index, and id but I am ignoring that for the simplicity of this example). Because it finds it in the stored map of original values, it infers the intent and changes the output resource to the following:

config.kubernetes.io/path: bar
internal.config.kubernetes.io/path: bar

See the test here: https://github.com/natasha41575/kustomize/blob/67a5f6d68f9f5f21f59b7a939bb4d66c2c779aef/kyaml/kio/kio_test.go#L349 and all the tests after it.

which I believe test exactly the scenario you've described. Please let me know if that answers your question.

The reason that I am storing the original annotations in a map, rather than checking the input directly against the output, is that the order of the resources may change, some resources may be generated or removed, etc. There is no guarantee and so we have to have another mechanism of looking up what the original annotations were.

mengqiy · 2021-09-24T22:02:14Z

Thanks for your detailed explanation!

natasha41575 requested review from droot, mengqiy, monopole and KnVerey September 18, 2021 00:00

k8s-ci-robot requested a review from mortent September 18, 2021 00:00

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 18, 2021

k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 18, 2021

natasha41575 mentioned this pull request Sep 18, 2021

Kpt should use internal.config.kubernetes.io/* annotations kptdev/kpt#2478

Closed

support krm spec v1 and legacy path, index, and id annotations

67a5f6d

natasha41575 force-pushed the MigrateIndexPathIdAnnotations branch from 590983c to 67a5f6d Compare September 18, 2021 00:10

k8s-ci-robot assigned monopole Sep 24, 2021

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 24, 2021

monopole merged commit 22b7358 into kubernetes-sigs:master Sep 24, 2021

natasha41575 mentioned this pull request Sep 28, 2021

update kyaml to v0.12 and cli-utils to latest kptdev/kpt#2523

Merged

natasha41575 mentioned this pull request Oct 12, 2021

fix bug with migrating kyaml reader path, index, and id annotations #4227

Merged

natasha41575 mentioned this pull request Oct 20, 2021

kyaml should implement KRM Functions Spec v1 #4024

Open

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migrate index path id annotations #4190

Migrate index path id annotations #4190

natasha41575 commented Sep 18, 2021 •

edited

k8s-ci-robot commented Sep 18, 2021

natasha41575 commented Sep 18, 2021

natasha41575 commented Sep 18, 2021

monopole commented Sep 24, 2021

monopole commented Sep 24, 2021

mengqiy commented Sep 24, 2021

natasha41575 commented Sep 24, 2021 •

edited

mengqiy commented Sep 24, 2021

Migrate index path id annotations #4190

Migrate index path id annotations #4190

Conversation

natasha41575 commented Sep 18, 2021 • edited

Reconciliation Algorithm

Cases not handled

k8s-ci-robot commented Sep 18, 2021

natasha41575 commented Sep 18, 2021

natasha41575 commented Sep 18, 2021

monopole commented Sep 24, 2021

monopole commented Sep 24, 2021

mengqiy commented Sep 24, 2021

natasha41575 commented Sep 24, 2021 • edited

mengqiy commented Sep 24, 2021

natasha41575 commented Sep 18, 2021 •

edited

natasha41575 commented Sep 24, 2021 •

edited