tests/node_labeller.go: Fix test_id: 6247 #11848
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
Thank you for the PR @Barakmor1.
Could you please give a few more words about the current code (what's wrong with it) and how the change fixes it?
In my opinion, the description is clear enough. Please let me know if there's anything that isn't clear to you.
What was the reason to explicitly set the KubeVirt CR?
Hey, I modified the PR description. I hope it is clearer now.
Thank you for adding additional context.
```go
node, err = virtClient.CoreV1().Nodes().Get(context.Background(), node.Name, metav1.GetOptions{})
Expect(err).ToNot(HaveOccurred())
```
AFAIU, the test will immediately fail in case there was an error fetching the node.
Please consider returning an error, so there will be a retry.
Also, please consider giving the fetched node another name, so the details of the original node will not be lost in case of an error.
The Get API call shouldn't fail.
There could be a temporary network hiccup.
In that case, I don't think we should retry.
Why not?
Failing immediately due to an API call failure could make this test flaky.
In any other place in the functional tests, I never saw that we retry if the get API call fails. I prefer to keep it consistent.
I tend to agree with @orelmisan!
This is an `Eventually` block and we should satisfy the "eventually" behavior.
As is, if an error occurs the block is never retried.
I suggest you switch to:

```go
Eventually(func(g Gomega) {
    ...
    g.Expect(err).ToNot(HaveOccurred())
    ...
})
```

Or, as suggested, return the error, so that it will be wrapped by the `Eventually`.
IMHO if there are other places where things are done like this, we should fix them instead of spreading it. :)
Thank you!
@fossedihelm I disagree.
`Eventually` retries don't fit all cases. For instance, in the scenario of a connectivity issue causing the Get API call to fail, I would prefer to know about it as soon as possible. Such failures might suggest other issues due to instability. `Eventually` retries should be used in cases where we expect components to eventually become consistent. In the mentioned scenario, the `Eventually` retries fit perfectly because we change the configuration and wait for the node labeller to propagate the new configuration and label the nodes accordingly. Additionally, if we retried every function that returns an error, the code would include a significant amount of boilerplate.
All over the e2e test suite, we do API calls like `Get()` or even `Create()` and expect no error to occur, without giving them a chance to try again.
I think I'm with Barak here and don't think we should necessarily keep looping if an API call failed...
As Orel mentioned, returning the error would make this test slightly more robust against flaky infra, but I don't think tests are responsible for doing that.
This is an interesting debate though. Once we reach an agreement, we should document the decision as a KubeVirt coding guideline (in that doc maybe? #11456)
Sure! Agree with you that we should reach an agreement and align all the cases to it.
Thank you
We are setting kubevirt.configuration.ObsoleteCPUModels to nil to align with test expectations. The test assumes that kubevirt.configuration.ObsoleteCPUModels is not set.

Before this PR, if kubevirt.configuration.ObsoleteCPUModels was set before the tests started running, then the test would fail. After this PR, if kubevirt.configuration.ObsoleteCPUModels was set before the tests started running, it will be set to nil to prevent failure.

Additionally, we will wait for 30 seconds before failing to give the node labeller enough time to set the new labels on the nodes.

Signed-off-by: bmordeha <bmordeha@redhat.com>
@Barakmor1: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Thank you @Barakmor1
What this PR does

We are setting `kubevirt.configuration.ObsoleteCPUModels` to nil to align with test expectations. The test assumes that `kubevirt.configuration.ObsoleteCPUModels` is not set.

Before this PR, if `kubevirt.configuration.ObsoleteCPUModels` was set before the tests started running, then the test would fail.

After this PR, if `kubevirt.configuration.ObsoleteCPUModels` was set before the tests started running, it will be set to nil to prevent failure.

Additionally, we will wait for 30 seconds before failing to give the node labeller enough time to set the new labels on the nodes.
Fixes #
Why we need it and why it was done in this way
The following tradeoffs were made:
The following alternatives were considered:
Links to places where the discussion took place:
Special notes for your reviewer
Checklist
This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.
Release note