Automated upgrade test fails #1891

Closed
bjee19 opened this issue Apr 26, 2024 · 5 comments · Fixed by #1994

@bjee19
Contributor

bjee19 commented Apr 26, 2024

Describe the bug
The automated NFR test in upgrade_test.go fails.

To Reproduce

  1. Run the automated upgrade test.

Expected behavior
The test should pass and its results should match the 1.2 results.

Your environment

  • Version of the NGINX Gateway Fabric - edge
  • Version of Kubernetes - 1.28.7-gke.1026000
  • Kubernetes platform (e.g. Mini-kube or GCP) - GKE
  • Details on how you expose the NGINX Gateway Fabric Pod (e.g. Service of type LoadBalancer or port-forward) - LoadBalancer

Additional context

[FAILED] [159.897 seconds]
Upgrade testing [It] upgrades NGF with zero downtime [nfr, upgrade]
/home/username/nginx-gateway-fabric/tests/suite/upgrade_test.go:83

  [FAILED] Expected success, but got an error:
      <*fmt.wrapError | 0xc00102f360>: 
      client rate limiter Wait returned an error: context deadline exceeded
      {
          msg: "client rate limiter Wait returned an error: context deadline exceeded",
          err: <context.deadlineExceededError>{},
      }
  In [It] at: /home/username/nginx-gateway-fabric/tests/suite/upgrade_test.go:210 @ 04/29/24 20:15:02.521

  Full Stack Trace
    github.com/nginxinc/nginx-gateway-fabric/tests/suite.init.func8.3.2({0x1dcd6500?, 0xc00006ea80?})
        /home/username/nginx-gateway-fabric/tests/suite/upgrade_test.go:210 +0xb6
    k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2(0xc0004c7a88?, {0x1e61310?, 0xc00104ea80?})
        /home/username/go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/loop.go:87 +0x52
    k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext({0x1e61310, 0xc00104ea80}, {0x1e56418, 0xc00102f0a0}, 0x1, 0x0, 0xc0004c7e90)
        /home/username/go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/loop.go:88 +0x24d
    k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel({0x1e61310, 0xc00104ea80}, 0xdf8475800?, 0x1, 0xc0009e3e90)
        /home/username/go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/util/wait/poll.go:33 +0x56
    github.com/nginxinc/nginx-gateway-fabric/tests/suite.init.func8.3()
        /home/username/nginx-gateway-fabric/tests/suite/upgrade_test.go:205 +0x97e


This chunk of code in upgrade_test.go:

var lease coordination.Lease
key := types.NamespacedName{Name: "ngf-test-nginx-gateway-fabric-leader-election", Namespace: ngfNamespace}
Expect(wait.PollUntilContextCancel(
	leaseCtx,
	500*time.Millisecond,
	true, /* poll immediately */
	func(_ context.Context) (bool, error) {
		Expect(k8sClient.Get(leaseCtx, key, &lease)).To(Succeed())

		if lease.Spec.HolderIdentity != nil {
			for _, podName := range podNames {
				if podName == *lease.Spec.HolderIdentity {
					return true, nil
				}
			}
		}

		return false, nil
	},
)).To(Succeed())

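This will fail because *lease.Spec.HolderIdentity always contains a hash after the podName, separated by an underscore, so the exact comparison with podName never matches and the poll keeps running until leaseCtx expires. e.g. my-release-nginx-gateway-fabric-fd4bc4cb6-dx7dp_9e75f1dc-3ed4-4ff5-969a-0c98940a7721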
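
One possible way to make the comparison tolerate that suffix, as a minimal sketch only. It assumes the holder identity always has the form `<podName>_<suffix>`, as in the example above. `holderMatchesPod` is a hypothetical helper name, not something that exists in the repo, and this is not necessarily how the eventual fix works.

```go
package main

import (
	"fmt"
	"strings"
)

// holderMatchesPod reports whether a lease holder identity belongs to one of
// the given pods. Assumption: the identity is "<podName>_<suffix>". An exact
// match is accepted too, in case no suffix is present.
func holderMatchesPod(holderIdentity string, podNames []string) bool {
	for _, podName := range podNames {
		// Including the "_" separator in the prefix avoids matching a
		// different pod whose name merely starts with podName.
		if holderIdentity == podName || strings.HasPrefix(holderIdentity, podName+"_") {
			return true
		}
	}
	return false
}

func main() {
	holder := "my-release-nginx-gateway-fabric-fd4bc4cb6-dx7dp_9e75f1dc-3ed4-4ff5-969a-0c98940a7721"
	podNames := []string{"my-release-nginx-gateway-fabric-fd4bc4cb6-dx7dp"}
	fmt.Println(holderMatchesPod(holder, podNames)) // prints "true"
}
```

In the test, the inner loop over podNames could then be replaced with a single call such as `holderMatchesPod(*lease.Spec.HolderIdentity, podNames)`. Matching on the prefix plus separator, rather than using `strings.Contains`, keeps the check strict enough that a pod named `foo` does not match a holder belonging to `foo-bar`.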

@bjee19
Contributor Author

bjee19 commented Apr 26, 2024

We may also want to run the other automated tests to check that they still work correctly.

Contributor

This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale Pull requests/issues with no activity label May 14, 2024
@sjberman sjberman removed the stale Pull requests/issues with no activity label May 14, 2024
@sjberman
Contributor

@bjee19 Is this still an issue? If so, we should prioritize this so it doesn't fail when we do release testing.

@bjee19
Contributor Author

bjee19 commented May 14, 2024

@sjberman yep, just re-ran the test on main and got the same error.

@sjberman sjberman added this to the v1.3.0 milestone May 14, 2024
@sjberman sjberman added the tests Pull requests that update tests label May 14, 2024
@ciarams87 ciarams87 self-assigned this May 20, 2024
@mpstefan mpstefan added the refined Requirements are refined and the issue is ready to be implemented. label May 20, 2024
@ciarams87
Member

Re-ran the rest of the automated tests with no issues found. Also re-ran the longevity test with the time set to 5m, just to ensure the automation around it was still working correctly. I didn't do any analysis of the results; my only focus was to confirm that the automation works.
