New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add non graceful shutdown integration test #119478
base: master
Are you sure you want to change the base?
add non graceful shutdown integration test #119478
Conversation
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
a4e7b44
to
898a7a1
Compare
t.Fatalf("error in deleting pod: %v", err) | ||
} | ||
waitForPodDeletionTimeStampToSet(t, testClient, pod.Name, namespaceName) | ||
waitForMetric(t, metrics.ForceDetachMetricCounter.WithLabelValues(metrics.ForceDetachReasonOutOfService), 1, "detach-metrics") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metrics are not thread safe, if there are other tests in the same binary they will use the same global registry, please check there is no possibility of test pollution, I can't remember if the metrics registry is already initialized when the apiserver starts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up re the metrics.
I have seen existing tests using metrics with gotCountofEvent>=expectedCountOfEvent
for similar reason and waitForMetric
function does the same.
cc @xing-yang
pod, err := testingClient.CoreV1().Pods(podNamespace).Get(context.TODO(), podName, metav1.GetOptions{}) | ||
if err != nil { | ||
t.Fatal(err) | ||
} | ||
if pod.DeletionTimestamp != nil { | ||
return true, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will be racy if the pod does not have a finalizer because can be deleted and return a 404 Not found error, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pod in question has been set a grace time period of 300 seconds in this test to allow enough time. So this should be fine. Let me know if you still have a different opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, thanks for clarifying
func waitForMetric(t *testing.T, m basemetric.CounterMetric, expectedCount float64, identifier string) { | ||
if err := wait.Poll(100*time.Millisecond, 60*time.Second, func() (bool, error) { | ||
gotCount, err := metricstestutil.GetCounterMetricValue(m) | ||
fmt.Println(gotCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or this slipped from your local test or use t.Logf with a more meaninful message so it is debuggable
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
6081207
to
9b63711
Compare
/retest |
1 similar comment
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your PR. Test scenario itself looks good. I've added some comments so please check them.
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sonasingh46 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@sonasingh46 Thank you for updating. Unfortunately, this failed on my local env.
|
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
Thank you for updating. Looks good for me. |
|
||
ctx, cancel := context.WithCancel(context.Background()) | ||
defer cancel() | ||
go informers.Core().V1().Nodes().Informer().Run(ctx.Done()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that informers.Start() already starts this informer, is not this duplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove this line.
// wait for volume to be attached | ||
for i := 0; i < 10; i++ { | ||
node, err = testClient.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{}) | ||
if err != nil { | ||
t.Fatalf("Failed to get the node : %v", err) | ||
} | ||
if len(node.Status.VolumesAttached) > 1 { | ||
break | ||
} | ||
time.Sleep(1 * time.Second) | ||
} | ||
if len(node.Status.VolumesAttached) < 1 { | ||
t.Logf("failed to attach volume for pod %s on node %s", pod.Name, node.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have already libraries for doing these async checks
kubernetes/test/integration/apiserver/watchcache_test.go
Lines 78 to 83 in 854d0e7
if err := wait.Poll(100*time.Millisecond, wait.ForeverTestTimeout, func() (bool, error) { | |
_, err := clientSet.CoreV1().Services("default").Get(ctx, "kubernetes", metav1.GetOptions{}) | |
return err == nil, nil | |
}); err != nil { | |
t.Fatalf("Failed to wait for kubernetes service: %v:", err) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, you already using them below, why do you implement the loop to check here differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it. Missed this one. Thanks
Signed-off-by: Ashutosh Kumar <sonasingh46@gmail.com>
/retest |
looking into the test failures |
/retest |
@sonasingh46: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Hi @sonasingh46, can you please rebase and address the CI failures? Thanks. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds integration test for node out of service detach feature.
Ref feature PR: #108486
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: