
✨ multiNamespaceCache: support custom newCache funcs per namespace #1962

Closed

Conversation

joelanford
Member

This PR adds more flexibility to multi-namespace caches by introducing a new cache builder function (cache.ByNamespaceBuilder) that enables callers to specify individual cache builder functions on a per-namespace basis, with additional support for specifying a catch-all cache builder and a cluster-scoped cache builder.

The primary use case is to enable caching the same GVK with a different set of selectors/transformers/DeepCopy configuration depending on which namespace the objects are in.

Concretely, imagine a controller that needs to read configmaps in its own namespace and also manage configmaps during reconciliation of its primary object, where each set has effectively nothing to do with the other (i.e. a shared label selector is not feasible). To cache both sets efficiently, one might need to (a sketch of this configuration follows the list):

  • Cache all configmaps in <controller>-system namespace
  • In all other namespaces, cache only configmaps with label managed-by: <controller>
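
For illustration, here is a rough sketch of how that configuration could be expressed with the builder this PR proposes. The option names (MultiNamespacedCacheWithOptionsBuilder, WithNamespaceCache, WithDefaultNamespacedCache) are taken from later revisions in this thread; BuilderWithOptions, Options.DefaultSelector, and ObjectSelector are existing cache package APIs of this era, and the namespace and label values are placeholders:

    import (
        "k8s.io/apimachinery/pkg/labels"
        "sigs.k8s.io/controller-runtime/pkg/cache"
    )

    func buildCacheFunc() cache.NewCacheFunc {
        return cache.MultiNamespacedCacheWithOptionsBuilder(
            // Cache all namespaced objects in the controller's own namespace.
            cache.WithNamespaceCache("example-controller-system", cache.New),
            // In all other namespaces, cache only objects carrying the
            // managed-by label.
            cache.WithDefaultNamespacedCache(cache.BuilderWithOptions(cache.Options{
                DefaultSelector: cache.ObjectSelector{
                    Label: labels.SelectorFromSet(labels.Set{"managed-by": "example-controller"}),
                },
            })),
        )
    }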

I originally considered implementing this out-of-tree, but because it so closely resembles and builds upon the existing in-tree multiNamespaceCache, I thought it made more sense to include here.

Signed-off-by: Joe Lanford <joe.lanford@gmail.com>

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 28, 2022
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2022
@joelanford joelanford removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2022
@joelanford joelanford requested review from vincepri and alvaroaleman and removed request for gerred August 8, 2022 14:53
Comment on lines 116 to 189
func MultiNamespacedCacheBuilder(namespaces []string) NewCacheFunc {
    byNamespaceOpts := ByNamespaceOptions{NewNamespaceCaches: map[string]NewCacheFunc{}}
    for _, ns := range namespaces {
        byNamespaceOpts.NewNamespaceCaches[ns] = New
    }
    return BuilderByNamespace(byNamespaceOpts)
}
Member Author
@joelanford joelanford Aug 8, 2022

Now MultiNamespacedCacheBuilder is simply a specific configuration of BuilderByNamespace where the catch-all namespace-scoped cache is left as nil (to avoid caching namespace-scoped objects outside the specified namespaces).

Member

I agree with @alvaroaleman on making the functionality look similar across both options. But the question is: if BuilderByNamespace and MultiNamespaceCacheBuilder both support specifying cache funcs per namespace, per cluster scope, and for all other namespaces, what would be the underlying difference between them? Can we embed byNamespaceOpts directly into MultiNamespacedCacheBuilder? That way, the user can either provide the list of namespaces they want to cache or cache funcs for each of them. One option would be to expose these options through WithNamespace or WithCacheOptions methods. This is definitely a major breaking change, but I guess it's worth exploring instead of having two methods do the same thing with the only difference being the parameters passed.

Member Author

I just felt it didn't make sense to break MultiNamespacedCacheBuilder for no good reason. I'd be on board with deprecating it though.

Member

Instead of having BuilderByNamespace and MultiNamespaceCacheBuilder as they are currently, can we introduce MultiNamespaceCacheBuilderWithOptions (or some better name) that allows both options (providing a list of namespaces, and/or cache funcs with builder opts) and then deprecate, and eventually remove, MultiNamespacedCacheBuilder in subsequent releases?

Member Author

OK, updated to:

  • Call the new function MultiNamespacedCacheWithOptionsBuilder
  • Use functional options instead of a struct
  • Deprecate MultiNamespacedCacheBuilder
  • Include an example in the MultiNamespacedCacheBuilder deprecation message that explains how to achieve the existing functionality with the new builder function (see the sketch below).

Does this get closer to what y'all are imagining?
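
As a sketch of the deprecation example mentioned in the last bullet (names per this PR; treat this as illustrative, not the final API), the old builder call

    cache.MultiNamespacedCacheBuilder([]string{"ns-a", "ns-b"})

would become

    cache.MultiNamespacedCacheWithOptionsBuilder(
        cache.WithNamespaceCaches([]string{"ns-a", "ns-b"}, cache.New),
    )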

if err != nil {
    return nil, err
}
nsToCache[corev1.NamespaceAll] = defaultNamespaceCache
Member

I don't think I understand this. The clusterCache is for cluster-scoped objects, so what is this for? I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?

Member

It looks like we fall through to this cache if no other cache matches, but is that new behavior or not? I don't think it is very expected that a multi_namespace_cache creates a global cache when used to request an object that is not in one of the configured namespaces.

Member Author
@joelanford joelanford Aug 8, 2022

I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?

That's correct.

It looks like we fall through to this cache if no other cache matches, but is that new behavior or not?

That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.

Member

That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.

IMHO it is extremely unintuitive that this happens in one builder and not the other, and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt in to this in both builders.

Also, IMHO, rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as the value that optionally takes a selector, plus a DefaultSelector option that, if set, will be used when a namespace does not have a selector.

Member Author

IMHO it is extremely unintuitive that this happens in one builder and not the other, and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt in to this in both builders.

  • In the BuilderByNamespace, it is opt-in. You have to specifically set NewDefaultNamespaceCache.
  • In MultiNamespacedCacheBuilder there has never been a fall-back option, and my thought was to not make a breaking change to that builder whatsoever, meaning there's no way to opt in.

Also, IMHO, rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as the value that optionally takes a selector, plus a DefaultSelector option that, if set, will be used when a namespace does not have a selector.

I'm not sure I'm following this. There's nothing in this layer that deals directly with selectors. This is just "I want different NewCacheFuncs per specific namespace/catch-all namespaces/cluster-scope."
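
For context, a sketch of those three knobs as they appeared in the original ByNamespaceOptions revision of this PR. NewNamespaceCaches and NewDefaultNamespaceCache appear earlier in this thread; the name of the cluster-scoped field is a guess:

    opts := cache.ByNamespaceOptions{
        // Specific namespaces, each with its own NewCacheFunc.
        NewNamespaceCaches: map[string]cache.NewCacheFunc{
            "example-system": cache.New,
        },
        // Catch-all for namespaced objects in every other namespace;
        // leaving this nil means such objects are not cached at all.
        NewDefaultNamespaceCache: cache.New,
        // Builder for cluster-scoped objects (field name hypothetical).
        NewClusterScopedCache: cache.New,
    }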

pkg/cache/cache.go: outdated review thread (resolved)
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 18, 2022
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 25, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 29, 2022
@joelanford joelanford force-pushed the cache-by-namespace-builder branch 3 times, most recently from c579070 to bec1314, August 31, 2022 19:03
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 31, 2022
@joelanford
Member Author

joelanford commented Aug 31, 2022

#1980 merged, so I rebased to latest master. This PR is ready for review again.

@joelanford
Member Author

/retest

1 similar comment
@joelanford
Member Author

/retest

@joelanford joelanford force-pushed the cache-by-namespace-builder branch 2 times, most recently from c5fee54 to 60dddc0, September 2, 2022 14:14

// WithNamespaceCache configures MultiNamespacedCacheWithOptionsBuilder
// with a namespace cache that uses the provided NewCacheFunc.
func WithNamespaceCache(namespace string, f NewCacheFunc) MultiNamespacedOption {
Member

MultiNamespacedCacheWithOptionsBuilder is 👍. Just one question: since we are exposing both WithNamespaceCaches and WithNamespaceCache, what if a user does:

    cache.WithNamespaceCache(testNamespaceOne, cache.BuilderWithOptions(....)),
    cache.WithNamespaceCaches([]string{testNamespaceOne, testNamespaceTwo, testNamespaceThree}, cache.New),

In this case, it looks like we will simply override the custom cache func which the user provided for testNamespaceOne. Is this something we want to address by validating the inputs, or do we leave it as the user's responsibility to not give us duplicates?

Member Author

Correct, that would result in testNamespaceOne: cache.New

I could document that if the same namespace is provided multiple times, the last applied NewCacheFunc for that namespace wins.
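
A minimal sketch of why last-one-wins falls out naturally, assuming the options mutate a shared map in application order (the internal struct below is hypothetical; only the WithNamespaceCache signature comes from the PR):

    type multiNamespacedOptions struct {
        newNamespaceCaches map[string]NewCacheFunc
    }

    type MultiNamespacedOption func(*multiNamespacedOptions)

    func WithNamespaceCache(namespace string, f NewCacheFunc) MultiNamespacedOption {
        return func(o *multiNamespacedOptions) {
            // A plain map write: when options are applied in order, the last
            // NewCacheFunc registered for a namespace overwrites earlier ones.
            o.newNamespaceCaches[namespace] = f
        }
    }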

Member Author

The last commit adds a line in the godoc stating that the last setting for a particular namespace wins.

Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
Member
@varshaprasad96 varshaprasad96 left a comment

/hold
/lgtm
Looks good from my end. Placing it on hold for @alvaroaleman's reviews.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 2, 2022
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 2, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: joelanford, varshaprasad96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member
@sbueringer sbueringer left a comment

Just a few nits

pkg/cache/multi_namespace_cache.go: 4 outdated review threads (resolved)
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 6, 2022
Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
@sbueringer
Member

Thx!
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 6, 2022
    }
}

func setDefaultNamespacedCacheOpts(opts Options, newObjectCaches map[string]NewCacheFunc) Options {
Member

So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?

If yes, the problem with that is that it will fail if you don't have cluster-wide RBAC. Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?

Member Author

So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?

It depends on the NewCacheFunc defined by WithDefaultNamespacedCache. But yeah, I'd generally imagine that would be a cluster-wide cache.
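
For illustration, the kind of namespace-excluding field selector being discussed could be assembled with apimachinery's fields package. This is a sketch under the assumption that the catch-all cache is cluster-wide and should skip the namespaces that already have dedicated caches:

    import "k8s.io/apimachinery/pkg/fields"

    // Exclude every namespace that already has a dedicated cache so the
    // catch-all cache does not duplicate their contents.
    excludeDedicated := fields.AndSelectors(
        fields.OneTermNotEqualSelector("metadata.namespace", "ns-a"),
        fields.OneTermNotEqualSelector("metadata.namespace", "ns-b"),
    )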

Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?

Two thoughts:

  1. If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.
  2. There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and opening a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could put quite a load on the apiserver.

Thoughts? Perhaps we could make this configurable?

Member

If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.

I think the expectation there would be that you get whatever is in the cache, which is a subset of what is actually in the cluster

There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and opening a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could put quite a load on the apiserver.

But what would be the reason for using this cache in the first place if you have cluster-wide rbac? Just getting a selector for one namespace only? Or is the actual, hidden use-case behind this "namespaced cache for object type X, global cache for everything else"?

I feel like the more we start supporting such rather exotic use-cases, the more people will just come up with even more of them. What if the next person wants different cache configurations based on object type, or a combination of object type and namespace? Wouldn't that be the point where it would be easier to have some kind of CacheRouter that allows freely configuring something that inspects requests and then forwards them to the appropriate, configurable cache?

Sorry for being a bit negative here; it is just that this is IMHO both complicated enough to not be obvious to reason about, and it doesn't solve the problem of "people might want an arbitrary cache depending on any possible combination of object type and selectors".

Member Author
@joelanford joelanford Sep 9, 2022

But what would be the reason for using this cache in the first place if you have cluster-wide rbac?

The specific motivation I have for this cache is memory optimization. I want to cache all configmaps in my controller's namespace; but in every other namespace, I only want to cache configmaps that match a certain label selector.

@joelanford
Member Author

@alvaroaleman I'm still interested in getting this merged. My overall feeling is that this is just a refactoring of the existing MultiNamespaceCacheBuilder that gives users more flexibility: it allows the NewCache functions to be provided by the user rather than hard-coded (and it adds the optional catch-all cache for namespaced objects).

From my PR description:

I originally considered implementing this out-of-tree, but because it so closely resembles and builds upon the existing in-tree multiNamespaceCache, I thought it made more sense to include here.

I did look into implementing this out-of-tree, but IIRC it would have meant one of:

  • exporting the multiNamespaceCache struct and its fields to enable an out-of-tree caller to manipulate them to support different cache building functions
  • copying basically all of the multiNamespaceCache implementation out-of-tree AND exporting some of the internal cache code that it uses.

I completely understand your point about not wanting controller-runtime to be a dumping ground for exotic solutions to niche problems.

  1. I don't think this is that. I think the use case I'm solving for is probably quite common, but I'd imagine that lots of controllers just end up using a cluster-wide cache and unknowingly consuming more memory than necessary.
  2. We might need to think about ways to make it easier to build caches like this out-of-tree. It seems to me that there's a good chunk of internal code that would be really useful for building other cache implementations.

@joelanford
Member Author

For what it's worth, I've been able to work around not having this PR merged by making use of a separate Cluster, which is configured with the same apiserver credentials but with a different namespace scoping.

This workaround means there are two separate caches:

  • The "main" cache, which is scoped to all namespaces but is configured with a label selector for objects.
  • A cache for the operator's namespace, which is scoped to just the operator's namespace but without any selectors.

As a result of the separate caches, I have to be careful in how I set up my watches. Specifically, I need to use source.NewKindWithCache for watches of objects in my operator's namespace.

Nonetheless, IMO this workaround is non-intuitive and isn't really an intended use of the Cluster object, so I still believe this PR should be accepted and merged.
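
A minimal sketch of that workaround, assuming controller-runtime v0.13-era APIs (cluster.Options.Namespace and source.NewKindWithCache existed then; the primary type, namespace name, and reconciler below are placeholders):

    import (
        corev1 "k8s.io/api/core/v1"
        ctrl "sigs.k8s.io/controller-runtime"
        "sigs.k8s.io/controller-runtime/pkg/cluster"
        "sigs.k8s.io/controller-runtime/pkg/handler"
        "sigs.k8s.io/controller-runtime/pkg/source"
    )

    // Second cluster scoped to the operator's namespace, reusing the
    // manager's rest.Config (same apiserver credentials).
    operatorCluster, err := cluster.New(mgr.GetConfig(), func(o *cluster.Options) {
        o.Namespace = "example-operator-system"
    })
    if err != nil {
        return err
    }
    // Let the manager start and stop the extra cluster (and its cache).
    if err := mgr.Add(operatorCluster); err != nil {
        return err
    }

    // Watches on objects in the operator's namespace must go through the
    // second cache explicitly.
    err = ctrl.NewControllerManagedBy(mgr).
        For(&examplev1.Primary{}). // placeholder primary type
        Watches(
            source.NewKindWithCache(&corev1.ConfigMap{}, operatorCluster.GetCache()),
            &handler.EnqueueRequestForObject{},
        ).
        Complete(reconciler)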

@alvaroaleman
Member

alvaroaleman commented Nov 25, 2022

Okay, I am trying to reason through all the possible use-cases here:

  1. Limit the namespaces that can be used globally (implemented in multi-ns cache today)
  2. Configure a PerNS+PerObject selector (not available today)
  3. Configure a per-object selector (available today; if we implement 2, this would only be used for all namespaces that do not have a perNS+perObject selector)
  4. Limit namespaces per object; this isn't really possible today (there is no "or" field selector from what I can see; might be possible to propose upstream? Not sure)
  5. Configure a per-namespace selector; this would be used for all namespace+object combinations where we don't have a namespace+object or object-level selector
  6. Configure a global default for everything else (available today)

Is the above approximately correct, or did I miss anything? If yes, I would probably suggest that we end up extending cache.Options with:

  1. Adding a NamespacesByObject option that limits the namespaces by object, probably with a useSingleInformer setting if it is somehow possible to do that
  2. Adding a Namespaces option that globally limits the namespaces and makes the constructor use the multi-namespace cache underneath for objects that do not have a NamespacesByObject configuration
  3. If this is possible add a UseSingleInformer option that will be used in the multi ns cache
  4. Adding a SelectorByNamespaceAndObject option. This has precedence over the existing SelectorByObject for the configured namespaces. This is basically what you want with this change, IIUIC.
  5. Adding a SelectorByNamespace option that we will use unless there is a SelectorByNamespaceAndObject or SelectorByObject config.
  6. Finally, we will fall through to DefaultSelector if nothing more specific exists
  7. Eventually someone will likely ask for DefaultSelectorExclude lol

What do you think, did I miss anything? I am btw not asking you to implement this; I would just like to have a coherent design of where we want to get to in the end, rather than a series of ad-hoc changes that don't really fit together and are confusing to use. (A rough sketch of what such extended options could look like follows.)
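
To make that enumeration concrete, a purely illustrative sketch of the extended cache.Options (field names follow the list above; the types are guesses, and nothing here is a merged API):

    type Options struct {
        // ...existing fields (Scheme, Mapper, SelectorsByObject,
        // DefaultSelector, ...) elided...

        // (2) Globally limit which namespaces are cached.
        Namespaces []string
        // (1) Limit namespaces per object type.
        NamespacesByObject map[client.Object][]string
        // (4) Most specific: selectors per namespace+object combination.
        SelectorByNamespaceAndObject map[string]SelectorsByObject
        // (5) Per-namespace selectors, used when no namespace+object or
        // object-level selector matches.
        SelectorByNamespace map[string]ObjectSelector
        // (6) Otherwise fall through to the existing DefaultSelector.
    }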

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 1, 2023
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

@joelanford: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-controller-runtime-test
Commit: 40e2e1f
Required: true
Rerun command: /test pull-controller-runtime-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@vincepri
Member

@joelanford We should chat about a design document on how the above can become a result of reworking the options; the multi-namespace constructor got deprecated on main, and it'll be removed later on. We should work on an overall design that makes sense around cache.Options.

@vincepri
Member

Closing this for now, let's rally behind a feature design in a hackmd of sorts, or a different PR based on the latest changes, once ready

/close

@k8s-ci-robot
Contributor

@vincepri: Closed this PR.

In response to this:

Closing this for now, let's rally behind a feature design in a hackmd of sorts, or a different PR based on the latest changes, once ready

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
