Manage resourceVersion to allow resilient restart of watch method #2223

Open
alexisdondon opened this issue May 2, 2024 · 2 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature.

Comments

@alexisdondon

What is the feature and why do you need it:

We are using the stream method of the Watch object:

def stream(self, func, *args, **kwargs):

Say I use this method with v1.list_namespace and no timeout specified (see https://github.com/kubernetes-client/python/blob/master/examples/watch/timeout-settings.md). We then observe the following:

  • With no resourceVersion and no timeout specified, the stream first lists all namespaces as 'ADDED' events, in alphabetical order.
  • The stream then waits for events using a self.resource_version that is probably quite old.
  • We then hit the default Kubernetes server-side timeout described in the link above (between 30 minutes and 1 hour), after which the watch hits a 410 (Gone).
  • We then need to restart the stream.
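The restart step in the last bullet can be sketched roughly as follows. This is only an illustration, not something the client library provides: stream_with_restart, is_http_gone, watch_namespaces and max_restarts are hypothetical names, and the sketch assumes the 410 surfaces as an exception carrying status == 410 (as the client's ApiException does).

```python
from typing import Any, Callable, Iterable, Iterator


def stream_with_restart(
    open_stream: Callable[[], Iterable[Any]],
    is_gone: Callable[[Exception], bool],
    max_restarts: int = 3,
) -> Iterator[Any]:
    """Yield events from open_stream(), reopening it when a 410 is raised."""
    restarts = 0  # total restarts so far; never reset in this sketch
    while True:
        try:
            for event in open_stream():
                yield event
            return  # the stream ended without an error
        except Exception as exc:
            # Anything other than a 410, or too many restarts: give up.
            if not is_gone(exc) or restarts >= max_restarts:
                raise
            restarts += 1  # 410 Gone: loop around and open a fresh stream


def is_http_gone(exc: Exception) -> bool:
    # Assumption: the exception exposes the HTTP code as .status.
    return getattr(exc, "status", None) == 410


def watch_namespaces() -> Iterator[Any]:
    """Hypothetical wiring; needs the kubernetes package and a kubeconfig."""
    from kubernetes import client, config, watch

    config.load_kube_config()
    v1 = client.CoreV1Api()
    # A new Watch per attempt, so no stale resource_version is carried over.
    return stream_with_restart(
        lambda: watch.Watch().stream(v1.list_namespace),
        is_http_gone,
    )
```

Because each restart opens a stream with no resourceVersion, the namespaces are replayed as 'ADDED' events again, so whatever consumes the events has to tolerate duplicates.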

If a namespace is created during that 30-minute to 1-hour period, the watch stores a more recent resourceVersion, and the 410 is then reached much later (probably depending on the history or activity on the cluster).

Describe the solution you'd like to see:

From our tests, the right resourceVersion for planning a restart is not the resourceVersion of the last event seen but the resourceVersion available in the metadata of the response returned by the func argument of the stream method.

That response contains a metadata.resourceVersion, provided by Kubernetes, and restarting the stream from this resourceVersion generates no error.

We are not sure whether this metadata is available for every func method.
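A minimal sketch of that idea, assuming the list call returns an object exposing metadata.resource_version and items (as v1.list_namespace does in the Python client). The helper names list_then_watch and namespaces_from_cluster are hypothetical, and this does not claim that a 410 can never occur afterwards.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def list_then_watch(
    list_func: Callable[..., Any],
    stream_func: Callable[..., Iterable[Any]],
) -> Tuple[List[Any], Iterator[Any]]:
    """List once, then watch starting from the list's own resourceVersion."""
    listing = list_func()
    # resourceVersion of the list response as a whole,
    # not that of any single item or event.
    start_version = listing.metadata.resource_version
    events = stream_func(list_func, resource_version=start_version)
    return listing.items, iter(events)


def namespaces_from_cluster() -> Tuple[List[Any], Iterator[Any]]:
    """Hypothetical wiring; needs the kubernetes package and a kubeconfig."""
    from kubernetes import client, config, watch

    config.load_kube_config()
    v1 = client.CoreV1Api()
    return list_then_watch(v1.list_namespace, watch.Watch().stream)
```

The same list-then-watch step could be repeated after a 410 to obtain a fresh starting point.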

It is quite hard to understand how to use the watch method of the API if we want to maintain a daemon program with no errors.
With no resourceVersion and no timeout specified, everyone should know that this kind of problem exists because of how self.resource_version is stored.

@alexisdondon alexisdondon added the kind/feature Categorizes issue or PR as related to a new feature. label May 2, 2024
@roycaihw
Member

roycaihw commented May 8, 2024

/help

@k8s-ci-robot
Contributor

@roycaihw:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 8, 2024