
tests/node_labeller.go: Fix test_id: 6247 #11848

Open
Barakmor1 wants to merge 1 commit into base: main
Conversation

Barakmor1
Member

@Barakmor1 Barakmor1 commented May 5, 2024

What this PR does

We are setting kubevirt.configuration.ObsoleteCPUModels to nil to align with test expectations. The test assumes that kubevirt.configuration.ObsoleteCPUModels is not set.
Before this PR, if kubevirt.configuration.ObsoleteCPUModels was set before the tests started running, then the test would fail.
After this PR, if kubevirt.configuration.ObsoleteCPUModels was set before the tests started running, it will be set to nil to prevent failure.

Additionally, we will wait for 30 seconds before failing to give the node labeller enough time to set the new labels on the nodes.

Fixes #

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note

NONE

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels May 5, 2024
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign enp0s3 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member

@orelmisan orelmisan left a comment


Thank you for the PR @Barakmor1.

Could you please give a few more words about the current code (what's wrong with it) and how the change fixes it?

tests/infrastructure/node-labeller.go (review thread, outdated, resolved)
@Barakmor1
Member Author

Thank you for the PR @Barakmor1.

Could you please give a few more words about the current code (what's wrong with it) and how the change fixes it?

In my opinion, the description is clear enough. Please let me know if there's anything that isn't clear to you.

@orelmisan
Member

Thank you for the PR @Barakmor1.
Could you please give a few more words about the current code (what's wrong with it) and how the change fixes it?

In my opinion, the description is clear enough. Please let me know if there's anything that isn't clear to you.

What was the reason to explicitly set the KubeVirt CR?
What is the reason node is assigned nodesWithKVM[0] and is then overridden?
What was the reason to use Eventually?

@Barakmor1
Member Author

Thank you for the PR @Barakmor1.
Could you please give a few more words about the current code (what's wrong with it) and how the change fixes it?

In my opinion, the description is clear enough. Please let me know if there's anything that isn't clear to you.

What was the reason to explicitly set the KubeVirt CR? What is the reason node is assigned nodesWithKVM[0] and is then overridden? What was the reason to use Eventually?

Hey, I modified the PR description. I hope it is clearer now.

@orelmisan
Member

Thank you for adding additional context.
Could you please update the commit message as well?

Comment on lines +210 to +212
node, err = virtClient.CoreV1().Nodes().Get(context.Background(), node.Name, metav1.GetOptions{})
Expect(err).ToNot(HaveOccurred())
Member


AFAIU, the test will immediately fail in case there was an error fetching the node.
Please consider returning an error, so there will be a retry.
Also, please consider giving the fetched node another name, so the details of the original node will not be lost in case of an error.

Member Author


The Get API call shouldn't fail.

Member


There could be a temporary network hiccup.

Member Author


In that case, I don't think we should retry.

Member


Why not?
Failing immediately due to an API call failure could make this test flaky.

Member Author

@Barakmor1 Barakmor1 May 5, 2024


Anywhere else in the functional tests, I have never seen us retry when a Get API call fails. I prefer to keep it consistent.

Contributor


I tend to agree with @orelmisan!
This is an Eventually block and we should satisfy the "eventually" behavior.
As is, if an error occurs the block is never retried.
I suggest you switch to:

Eventually(func(g Gomega) {
	...
	g.Expect(err).ToNot(HaveOccurred())
	...
}).Should(Succeed())

Or, as suggested, return the error, so that it will be handled by the Eventually retry.
IMHO if there are other places where things are done like this, we should fix them instead of spreading it. :)
Thank you!

Member Author

@Barakmor1 Barakmor1 May 5, 2024


@fossedihelm I disagree.
Eventually retries doesn't fit all cases. For instance, in the scenario of a connectivity issue causing the get API call to fail, I would prefer to know about it as soon as possible. Such failures might suggest other issues due to instability. The Eventually retries should be used in cases where we expect components to eventually become consistent. In the mentioned scenario, the Eventually retries fits perfectly because we change the configuration and wait for the node labeller to propagate the new configuration and label the nodes accordingly. Additionally, if we would retry every function that returns an error, the code would include significant amount boilerplate.

Contributor


All over the e2e test suite, we make API calls like Get() or even Create() and expect that no error occurred, without giving them a chance to try again.
I think I'm with Barak here and don't think we should necessarily keep looping if an API call failed...
As Orel mentioned, returning the error would make this test slightly more robust against flaky infra, but I don't think tests are responsible for that.
This is an interesting debate though. Once we reach an agreement, we should document the decision as a KubeVirt coding guideline (in that doc maybe? #11456)

Contributor


Sure! I agree that we should reach an agreement and align all the cases with it.
Thank you

We are setting kubevirt.configuration.ObsoleteCPUModels
to nil to align with test expectations.
The test assumes that kubevirt.configuration.ObsoleteCPUModels
is not set.

Before this PR, if kubevirt.configuration.ObsoleteCPUModels
was set before the tests started running,
then the test would fail.

After this PR, if kubevirt.configuration.ObsoleteCPUModels
was set before the tests started running, it will be set
to nil to prevent failure.

Additionally, we will wait for 30 seconds before failing
to give the node labeller enough time to set the new
labels on the nodes.

Signed-off-by: bmordeha <bmordeha@redhat.com>
@kubevirt-bot
Contributor

kubevirt-bot commented May 5, 2024

@Barakmor1: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-kubevirt-e2e-kind-1.27-vgpu
Commit: fa6953d
Required: false
Rerun command: /test pull-kubevirt-e2e-kind-1.27-vgpu

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Member

@orelmisan orelmisan left a comment


Thank you @Barakmor1

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label May 6, 2024