Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate index path id annotations #4190

Merged

Conversation

natasha41575
Copy link
Contributor

@natasha41575 natasha41575 commented Sep 18, 2021

This PR is step 1 of #4024.

Annotation changes:

  • config.k8s.io/id -> internal.config.kubernetes.io/id
  • config.kubernetes.io/path -> internal.config.kubernetes.io/path
  • config.kubernetes.io/index -> internal.config.kubernetes.io/index

For a period of time, we must support both the new internal annotations as well as the legacy ones. This means:

  • The orchestrator, when setting annotations, will set both new (internal.config.kubernetes.io/*) and legacy annotations.
  • When it encounters one (either the internal or legacy annotation), it will copy that annotation over to the other to ensure that it supports functions that consume or modify either.
  • All functions that consume these annotations will consume both. The orchestrator first checks for the new internal annotation, and falls back to the legacy annotation if it is missing.
  • After a function is run, these annotations need to be reconciled. That is, if a function modifies either the legacy or new internal annotation, the orchestrator must update the other to reflect the change. TestLegacyAnnotationReconciliation provide some examples, such as:
    • There is a generator function that creates a new resource and sets its path or index via the legacy annotations. The orchestrator must ensure that it copies this over to the new internal annotations.
    • There is a pipeline of functions, one of which modifies the legacy path annotation, and a later one does some operation on the new internal annotation. The orchestrator must ensure that the latter function consumes the modified path annotation by copying the legacy path annotation to the new internal annotation.

In general, while we are supporting both the legacy and new annotations, we must output both in all scenarios, even if the input resources start with one. This will ensure a pipeline of functions will still work, even if some are consuming the old annotations and some are consuming the new.

Reconciliation Algorithm

The main problems to solve were:

  1. Each function receives a ResourceList and returns a ResourceList. There is no guarantee of the resources being in any particular order, so we must have a reliable method of determining which inputs map to which outputs.

  2. Even if we know which inputs map to which outputs, we need a reliable way of determining which annotation has changed if they differ. If internal.config.kubernets.io/path is foo and config.kubernetes.io/path is bar, we need to determine which one the function changed and update the other accordingly.

  3. There is an "id" annotation - internal.config.kubernetes.io/id - that is intended to resolve problem 1, but it is one of the annotations being migrated. If internal.config.kubernetes.io/id and config.k8s.io/id differ, we need to know which one to use. I wouldn't mind changing this PR to just use internal.config.kubernetes.io/id to identify the input->output mapping since theoretically a function is not supposed to change it, but I believe the implementation I opted for accounts for functions that change it anyways.

The solution I opted for here was to store the annotations for each resource in a map structure map[string]map[string]string that maps path -> index -> id. The orchestrator stores this information for each resource before running each function, so that each resource has one unique entry in the map prior to the function being run.

After the function is run, we iterate through all the resources again. If the legacy annotations are identical to the new internal annotations, there is no reconciliation needed.

If they differ, we check the map. If we find the legacy path/index/id entry in the map, that means it's the internal annotations that have changed. If we find the internal path/index/id entry in the map, that means it's the legacy annotations that have changed. Then we update the other annotations to reflect the same change that was made by the function.

Cases not handled

This does not handle the case that a function, say, modifies config.k8s.io/id and internal.config.kubernetes.io/path - where it has modified one internal annotation and one legacy annotation. I am not sure why a function author would choose to do this, and such a case may be considered invalid. With this implementation of the migration, the behavior would be undefined. If somehow at the end of the reconciliation algorithm, the annotations still differ, the orchestrator favors the new internal annotation when writing output.

ALLOW_MODULE_SPAN

cc @yuwenma

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: natasha41575

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 18, 2021
@natasha41575
Copy link
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 18, 2021
@natasha41575
Copy link
Contributor Author

/retest

@monopole
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 24, 2021
@monopole
Copy link
Contributor

Might want to file an issue to clean up after this, or at least record what cleanup could be done - while the context is still fresh in mind

@monopole monopole merged commit 22b7358 into kubernetes-sigs:master Sep 24, 2021
@mengqiy
Copy link
Member

mengqiy commented Sep 24, 2021

@natasha41575 Please help me understand how it work in this scenario.
I have a function that modify path annotation. The input resourceList has both annotations that points to the same file. After processing by the function, the output resourceList will still have both annotations but points to different files.
When the orchestrator (e.g. kpt or kustomize) looks at the output. It's clear what is the users' intend by only looking at the output.
A potential solution is to compare the input and output and then infer the user intend.
Thoughts?

@natasha41575
Copy link
Contributor Author

natasha41575 commented Sep 24, 2021

I have a function that modify path annotation. The input resourceList has both annotations that points to the same file. After processing by the function, the output resourceList will still have both annotations but points to different files.
When the orchestrator (e.g. kpt or kustomize) looks at the output. It's clear what is the users' intend by only looking at the output.
A potential solution is to compare the input and output and then infer the user intend.

If I understand you correctly, that's exactly what this PR does. Say the input resource has the following:

config.kubernetes.io/path: foo
internal.config.kubernetes.io/path: foo

and the output resource has the following:

config.kubernetes.io/path: foo
internal.config.kubernetes.io/path: bar

This PR creates a map structure that will store the original path value "foo" (along with the index and id annotations, which are empty here). Then, when processing the output, it will look in the map for "bar". It doesn't find it, so it then looks for "foo". (It will actually look for something in the map with matching path, index, and id but I am ignoring that for the simplicity of this example). Because it finds it in the stored map of original values, it infers the intent and changes the output resource to the following:

config.kubernetes.io/path: bar
internal.config.kubernetes.io/path: bar

See the test here: https://github.com/natasha41575/kustomize/blob/67a5f6d68f9f5f21f59b7a939bb4d66c2c779aef/kyaml/kio/kio_test.go#L349 and all the tests after it.

which I believe test exactly the scenario you've described. Please let me know if that answers your question.

The reason that I am storing the original annotations in a map, rather than checking the input directly against the output, is that the order of the resources may change, some resources may be generated or removed, etc. There is no guarantee and so we have to have another mechanism of looking up what the original annotations were.

@mengqiy
Copy link
Member

mengqiy commented Sep 24, 2021

Thanks for your detailed explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants