Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add cache option to filter by field #1404

Closed

Conversation

qinqon
Copy link
Contributor

@qinqon qinqon commented Mar 1, 2021

All instance for a same resources are being cached by
controller-runtime, for some use cases this consumes a lot of memory and
CPU. This change add a option to the cache so resources can be filtered
by field.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 1, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @qinqon. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 1, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: qinqon
To complete the pull request process, please assign mengqiy after the PR has been reviewed.
You can assign the PR to them by writing /assign @mengqiy in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 1, 2021
@qinqon qinqon changed the title Add cache option to filter by field 🐛 🐛 Add cache option to filter by field 🐛 Mar 1, 2021
@qinqon qinqon force-pushed the add-cache-filter-option branch 3 times, most recently from 250a741 to c1948c8 Compare March 1, 2021 13:53
Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @qinqon, thanks for the contribution. Left a few comments, let's please reduce the PR to the absolute minimum set of changes needed for the requested feature addition. Also this doesn't seem to be a 🐛, but rather a new feature ✨

Comment on lines 79 to 80
// Scheme maps runtime.Objects to GroupVersionKinds
Scheme *runtime.Scheme
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems outside the scope of this PR?

Copy link
Contributor Author

@qinqon qinqon Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 85 to 87
// mapper maps GroupVersionKinds to Resources
mapper meta.RESTMapper

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, could we keep these?

Copy link
Contributor Author

@qinqon qinqon Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 100 to 104
// resync is the base frequency the informers are resynced
// a 10 percent jitter will be added to the resync period between informers
// so that all informers will not send list requests simultaneously.
resync time.Duration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, could we keep these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 120 to 122
// namespace is the namespace that all ListWatches are restricted to
// default or empty string means all namespaces
namespace string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, could we keep these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// FieldSelectorByResource restricts the cache's ListWatch to the desired
// fields per resource
FieldSelectorByResource map[string]string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the key and value expected to look like here? Would it better to use a GroupKind as key instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or GroupResource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used GroupResource so we can match all the versions.

Comment on lines 10 to 17
// Options are the optional arguments for creating a new InformersMap object
type Options struct {
Scheme *runtime.Scheme
Mapper meta.RESTMapper
Resync *time.Duration
Namespace string
FieldSelectorByResource map[string]string
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is all internal, how are users supposed to use this new feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the internal Options

@vincepri
Copy link
Member

vincepri commented Mar 1, 2021

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 1, 2021
@qinqon qinqon changed the title 🐛 Add cache option to filter by field 🐛 ✨ Add cache option to filter by field 🐛 Mar 2, 2021
@qinqon qinqon force-pushed the add-cache-filter-option branch 4 times, most recently from f5be2b0 to 2dd1373 Compare March 2, 2021 15:15
@qinqon qinqon requested a review from vincepri March 2, 2021 15:16
@qinqon
Copy link
Contributor Author

qinqon commented Mar 2, 2021

Remove the commit that encapsulate some internal attributes to an internal Options struct and create a small helper to find the fieldselector on the map using the resource from mapper, @vincepri you can take a look now

@qinqon
Copy link
Contributor Author

qinqon commented Mar 2, 2021

/test pull-controller-runtime-test-master


// FieldSelectorByResource restricts the cache's ListWatch to the desired
// fields per resource
FieldSelectorByResource map[schema.GroupResource]string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a sample field selector here and link to the kube docs about field selectors, many ppl are not aware that they only work on a very limited set of fields and not at all for CRDs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its medadata.name, not meta.name or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, correct value is metadata.name.

@alvaroaleman
Copy link
Member

@qinqon this doesn't close #244 as it doesn't allow filtering by labels (which I would assume to be more common), so please remove that from the PR body

@@ -231,6 +235,15 @@ func (ip *specificInformersMap) addInformerToMap(gvk schema.GroupVersionKind, ob
return i, ip.started, nil
}

func (ip *specificInformersMap) findFieldSelectorByGVR(gvr schema.GroupVersionResource) string {
gr := schema.GroupResource{Group: gvr.Group, Resource: gvr.Resource}
fieldSelector, ok := ip.fieldSelectorByResource[gr]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This is an empty string if there isn't a match in the map, so you do not need the ok check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@qinqon
Copy link
Contributor Author

qinqon commented Mar 3, 2021

@qinqon this doesn't close #244 as it doesn't allow filtering by labels (which I would assume to be more common), so please remove that from the PR body

I can try to add labels filtering at a follow up PR to keep this small.

Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except for the doc sample.

I can try to add labels filtering at a follow up PR to keep this small.

If you have the time for that, that would be great but I didn't mean to say that you are in any way obligated to that, just that it would be premature to close the issue


// FieldSelectorByResource restricts the cache's ListWatch to the desired
// fields per resource
FieldSelectorByResource map[schema.GroupResource]string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its medadata.name, not meta.name or not?

@qinqon
Copy link
Contributor Author

qinqon commented Mar 8, 2021

@vincepri do you know if Is the issue at pull-controller-runtime-apidiff-master important ?

@qinqon
Copy link
Contributor Author

qinqon commented Mar 8, 2021

force-push: Rebased

@qinqon qinqon changed the title ✨ Add cache option to filter by field 🐛 ✨ Add cache option to filter by field Mar 8, 2021
@alvaroaleman
Copy link
Member

@qinqon The one thing this change ignores entirely is what impact this has for the cache-backed client. Could you write a small design doc in a different PR that:

  • Outlines the motivation for this and possible alternatives
  • What this means for the cache-backed client
  • Also: Are there ways to defined a fieldSelector other than via a plain string?

Doesn't have to be anything big, but we asked for that multiple times in #244 (comment) and declined other PRs because of that. I personally like the simplicity of just saying "We allow this on the cache and if you use a custom cache on your manager, you imply that you know better" but its a bigger design question and I want to be sure there is agreement about that.

@qinqon
Copy link
Contributor Author

qinqon commented Mar 9, 2021

@qinqon The one thing this change ignores entirely is what impact this has for the cache-backed client. Could you write a small design doc in a different PR

@alvaroaleman done a PR with the proposal #1419

@qinqon
Copy link
Contributor Author

qinqon commented Mar 16, 2021

force-push: Rebased and converting the map value to fields.Selector

qinqon added a commit to qinqon/controller-runtime that referenced this pull request Mar 16, 2021
As a follow up of [1] this change instruct the ListWatcher from the
informers to select by label/field but instead of using a manager/cache
Options it pass the select using the builder.

[1] kubernetes-sigs#1404

Signed-off-by: Quique Llorente <ellorent@redhat.com>
All instance for a same resources are being cached by
controller-runtime, for some use cases this consumes a lot of memory and
CPU. This change add a option to the cache so resources can be filtered
by field.

Signed-off-by: Quique Llorente <ellorent@redhat.com>
@k8s-ci-robot
Copy link
Contributor

@qinqon: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-controller-runtime-apidiff-master ff97b8f link /test pull-controller-runtime-apidiff-master

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@qinqon qinqon closed this Apr 6, 2021
@qinqon qinqon deleted the add-cache-filter-option branch April 6, 2021 07:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants